Keshav Santhanam

ORCID: 0000-0001-5939-7944

According to our database, Keshav Santhanam authored at least 14 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ALTO: An Efficient Network Orchestrator for Compound AI Systems.
Proceedings of the 4th Workshop on Machine Learning and Systems, 2024

2023
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines.
CoRR, 2023

Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs.
CoRR, 2023

Cheaply Estimating Inference Efficiency Metrics for Autoregressive Transformer Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP.
CoRR, 2022

Holistic Evaluation of Language Models.
CoRR, 2022

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

PLAID: An Efficient Engine for Late Interaction Retrieval.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
DistIR: An Intermediate Representation and Simulator for Efficient Neural Network Distribution.
CoRR, 2021

DistIR: An Intermediate Representation for Optimizing Distributed Neural Networks.
Proceedings of the EuroMLSys@EuroSys 2021, 2021

2020
Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

2018
ROLA: A New Distributed Transaction Protocol and Its Formal Analysis.
Proceedings of the Fundamental Approaches to Software Engineering, 2018

