Hao Zhang

CoRR, October, 2025

lmgame-Bench: How Good are LLMs at Playing Games?

CoRR, May, 2025

VSA: Faster Video Diffusion with Trainable Sparse Attention.

CoRR, May, 2025

Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile.

CoRR, February, 2025

Fast Video Generation with Sliding Tile Attention.

CoRR, February, 2025

Fast Video Generation with Sliding Tile Attention.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

GameArena: Evaluating LLM Reasoning through Live Computer Games.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Scaling Long Context Training Data by Long-Distance Referrals.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs.

Lanxiang Hu

Tajana Rosing

Anastasios Nikolas Angelopoulos

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Efficiently Serving LLM Reasoning Programs with Certaindex.

CoRR, 2024

Specifications: The missing link to making the development of LLM systems an engineering discipline.

Anastasios Angelopoulos

CoRR, 2024

MPC-Minimized Secure LLM Inference.

CoRR, 2024

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput.

CoRR, 2024

Toward Inference-optimal Mixture-of-Expert Large Language Models.

CoRR, 2024

MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving.

CoRR, 2024

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.

Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Efficient LLM Scheduling by Learning to Rank.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Online Speculative Decoding.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference.

Wei-Lin Chiang

Lianmin Zheng

Ying Sheng

Proceedings of the Forty-first International Conference on Machine Learning, 2024

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023

LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers.

CoRR, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On Optimizing the Communication of Model Parallelism.

Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

MPCFormer: Fast, Performant and Private Transformer Inference with MPC.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

DNB: A Joint Learning Framework for Deep Bayesian Nonparametric Clustering.

IEEE Trans. Neural Networks Learn. Syst., 2022

MPCFormer: fast, performant and private Transformer inference with MPC.

CoRR, 2022

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning.

CoRR, 2022

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning.

Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021

Machine Learning Parallelism Could Be Adaptive, Composable and Automated.

PhD thesis, 2021

Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning.

Aurick Qiao

Sang Keun Choe

Suhas Jayaram Subramanya

Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021

Simple and Automatic Distributed Machine Learning on Ray.

Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models.

Proceedings of the 38th International Conference on Machine Learning, 2021

Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

BayesAdapter: Being Bayesian, Inexpensively and Robustly, via Bayesian Fine-tuning.

CoRR, 2020

Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning.

CoRR, 2020

AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

AutoLoss: Learning Discrete Schedule for Alternate Optimization.

Proceedings of the 7th International Conference on Learning Representations, 2019

Toward Understanding the Impact of Staleness in Distributed Machine Learning.

Proceedings of the 7th International Conference on Learning Representations, 2019

2018

AutoLoss: Learning Discrete Schedules for Alternate Optimization.

CoRR, 2018

Cavs: An Efficient Runtime System for Dynamic Neural Networks.

Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Symbolic Graph Reasoning Meets Convolutions.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SCAN: Structure Correcting Adversarial Network for Organ Segmentation in Chest X-Rays.

Proceedings of the Deep Learning in Medical Image Analysis - and - Multimodal Learning for Clinical Decision Support, 2018

Generative Semantic Manipulation with Mask-Contrasting GAN.

Proceedings of the Computer Vision - ECCV 2018, 2018

2017

Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks.

CoRR, 2017

Generative Semantic Manipulation with Contrasting GAN.

Xiaodan Liang

Eric P. Xing

CoRR, 2017

SCAN: Structure Correcting Adversarial Network for Chest X-rays Organ Segmentation.

CoRR, 2017

ZM-Net: Real-time Zero-shot Image Manipulation Network.

CoRR, 2017

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters.

Proceedings of the 2017 USENIX Annual Technical Conference, 2017

Structured Generative Adversarial Networks.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Recurrent Topic-Transition GAN for Visual Paragraph Generation.

Proceedings of the IEEE International Conference on Computer Vision, 2017

2016

Automatic Photo Adjustment Using Deep Neural Networks.

ACM Trans. Graph., 2016

Combining the Best of Convolutional Layers and Recurrent Layers: A Hybrid Network for Semantic Segmentation.

CoRR, 2016

GeePS: scalable deep learning on distributed GPUs with a GPU-specialized parameter server.

Proceedings of the Eleventh European Conference on Computer Systems, 2016

Learning Concept Taxonomies from Multi-modal Data.

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015

Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines.

CoRR, 2015

Dynamic Topic Modeling for Monitoring Market Competition from Online Text and Image Data.

Gunhee Kim

Eric P. Xing

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Regularizing DNN acoustic models with Gaussian stochastic neurons.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Semi-supervised training in low-resource ASR and KWS.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014

Automatic Photo Adjustment Using Deep Learning.

CoRR, 2014

Improvements to speaker adaptive training of deep neural networks.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Towards speaker adaptive training of deep neural network acoustic models.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Distributed learning of multilingual DNN feature extractors using GPUs.
