Banghua Zhu

CoRR, 2024

Fairness in Serving Large Language Models.

Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

SLoRA: Scalable Serving of Thousands of LoRA Adapters.

Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF.

Anastasios Nikolas Angelopoulos

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference.

Wei-Lin Chiang

Lianmin Zheng

Ying Sheng

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards the Fundamental Limits of Knowledge Transfer over Finite Domains.

Qingyue Zhao

Proceedings of the Twelfth International Conference on Learning Representations, 2024

The Effective Horizon Explains Deep RL Performance in Stochastic Environments.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Towards Optimal Statistical Watermarking.

CoRR, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.

CoRR, 2023

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources.

CoRR, 2023

Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment.

CoRR, 2023

Fine-Tuning Language Models with Advantage-Induced Policy Alignment.

Hiteshi Sharma

Felipe Vieira Frujeri

CoRR, 2023

On Optimal Caching and Model Multiplexing for Large Model Inference.

CoRR, 2023

Online Learning in a Creator Economy.

Sai Praneeth Karimireddy

CoRR, 2023

The Sample Complexity of Online Contract Design.

Proceedings of the 24th ACM Conference on Economics and Computation, 2023

Doubly-Robust Self-Training.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Optimal Caching and Model Selection for Large Model Inference.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On the Optimal Bounds for Noisy Computing.

Proceedings of the IEEE International Symposium on Information Theory, 2023

Variable-Length Insertion-Based Noisy Sorting.

Proceedings of the IEEE International Symposium on Information Theory, 2023

Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons.

Proceedings of the International Conference on Machine Learning, 2023

Online Learning in Stackelberg Games with an Omniscient Follower.

Proceedings of the International Conference on Machine Learning, 2023

Jump-Start Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2023

Byzantine-Robust Federated Learning with Optimal Statistical Rates.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Minimax Off-Policy Evaluation for Multi-Armed Bandits.

IEEE Trans. Inf. Theory, 2022

Byzantine-Robust Federated Learning with Optimal Statistical Rates and Privacy Guarantees.

CoRR, 2022

Robust Estimation for Nonparametric Families via Generative Adversarial Networks.

CoRR, 2022

Robust Estimation for Non-parametric Families via Generative Adversarial Networks.

Proceedings of the IEEE International Symposium on Information Theory, 2022

2021

Linear Representation Meta-Reinforcement Learning for Instant Adaptation.

Matt Peng

CoRR, 2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Deconstructing Generative Adversarial Networks.

David Tse

IEEE Trans. Inf. Theory, 2020

Robust Estimation via Generalized Quasi-Gradients.

Jacob Steinhardt

CoRR, 2020

When Does the Tukey Median Work?

Jacob Steinhardt

Proceedings of the IEEE International Symposium on Information Theory, 2020

2019

Joint Transceiver Optimization for Wireless Communication PHY Using Neural Network.

IEEE J. Sel. Areas Commun., 2019

Generalized Resilience and Robust Statistics.

Jacob Steinhardt

CoRR, 2019

2017

Sparse Tensor Decomposition for Haplotype Assembly of Diploids and Polyploids.

Abolfazl Hashemi