Qiuli Mao
ORCID: 0009-0004-8777-2579
According to our database, Qiuli Mao authored at least 8 papers between 2023 and 2026.
Bibliography
2026
Efficient and Adaptable Overlapping for Computation and Communication via Signaling and Reordering.
Proceedings of the 21st European Conference on Computer Systems, 2026
2025
FlashDecoding++Next: High Throughput LLM Inference With Latency and Memory Optimization.
IEEE Trans. Computers, October, 2025
semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage.
CoRR, April, 2025
FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation.
CoRR, April, 2025
SOLA: Optimizing SLO Attainment for Large Language Model Serving with State-Aware Scheduling.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025
2024
FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024
FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023