Aurick Qiao

ORCID: 0009-0004-9119-8696

According to our database, Aurick Qiao authored at least 23 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

MoE-Prefill: Zero Redundancy Overheads in MoE Prefill Serving.

CoRR, May, 2026

TAGQuant: Token-Aware Clustering for Group-Wise Quantization.

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

Shift Parallelism: Low-Latency, High-Throughput LLM Inference for Dynamic Workloads.

Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025

OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs.

CoRR, October, 2025

Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI.

CoRR, July, 2025

Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences.

CoRR, June, 2025

TALE: Token-Adaptive Low-Rank KVCache Approximation with Reconstruction Elimination.

Trans. Assoc. Comput. Linguistics, 2025

SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Efficiently Scaling LLM Reasoning Programs with Certaindex.

Tajana Simunic Rosing

Ion Stoica

Hao Zhang

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Efficiently Serving LLM Reasoning Programs with Certaindex.

CoRR, 2024

SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference.

CoRR, 2024

Efficient LLM Scheduling by Learning to Rank.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

LLM360: Towards Fully Transparent Open-Source LLMs.

CoRR, 2023

Sia: Heterogeneity-aware, goodput-optimized ML-cluster scheduling.

Suhas Jayaram Subramanya

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

2021

Elastic Machine Learning Systems with Co-adaptation.

Aurick Qiao

PhD thesis, 2021

Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning.

Aurick Qiao

Sang Keun Choe

Suhas Jayaram Subramanya

Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021

2020

Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning.

CoRR, 2020

2019

Fault Tolerance in Iterative-Convergent Machine Learning.

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Litz: Elastic Framework for High-Performance Distributed Machine Learning.

Proceedings of the 2018 USENIX Annual Technical Conference, 2018

2015

Managed communication and consistency for fast data-parallel iterative analytics.

Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

2014

Multi-Pivot Quicksort: Theory and Experiments.

Shrinu Kushagra

Alejandro López-Ortiz

Aurick Qiao

J. Ian Munro

Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments, 2014
