Xin Liu
ORCID: 0009-0004-0341-3860
Affiliations:
- East China Normal University, Shanghai, China
- Shanghai AI Laboratory, Shanghai, China
According to our database, Xin Liu authored at least 49 papers between 2022 and 2026.
Bibliography
2026
DASH: Deterministic Attention Scheduling for High-throughput Reproducible LLM Training.
CoRR, January, 2026
MegaScale-Omni: A Hyper-Scale, Workload-Resilient System for MultiModal LLM Training in Production.
Proceedings of the 21st European Conference on Computer Systems, 2026
Proceedings of the 21st European Conference on Computer Systems, 2026
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.
Proceedings of the 21st European Conference on Computer Systems, 2026
SwiftSpec: Disaggregated Speculative Decoding and Fused Kernels for Low-Latency LLM Inference.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026
OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
IEEE Trans. Parallel Distributed Syst., September, 2025
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning.
CoRR, September, 2025
Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution.
CoRR, September, 2025
CoRR, September, 2025
CoRR, August, 2025
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo.
CoRR, August, 2025
SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding.
CoRR, June, 2025
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.
CoRR, May, 2025
Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler.
CoRR, April, 2025
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training.
CoRR, April, 2025
MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism.
CoRR, April, 2025
ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs.
CoRR, February, 2025
Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025
MegaScale-Infer: Efficient Mixture-of-Experts Model Serving with Disaggregated Expert Parallelism.
Proceedings of the ACM SIGCOMM 2025 Conference, 2025
ByteScale: Communication-Efficient Scaling of LLM Training with a 2048K Context Length on 16384 GPUs.
Proceedings of the ACM SIGCOMM 2025 Conference, 2025
Proceedings of the International Conference for High Performance Computing, 2025
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025
ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development.
Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025
TileLink: Generating Efficient Compute-Communication Overlapping Kernels using Tile-Centric Primitives.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025
Proceedings of the Forty-second International Conference on Machine Learning, 2025
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
CoRR, 2024
CoRR, 2024
MuxFlow: efficient GPU sharing in production-level clusters with more than 10000 GPUs.
Sci. China Inf. Sci., 2024
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters.
CoRR, 2023
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023
Not All Tasks Are Equal: A Parameter-Efficient Task Reweighting Method for Few-Shot Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023
Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
2022
Inf. Sci., 2022
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022