Xin Liu

Orcid: 0009-0004-0341-3860

Affiliations:

East China Normal University, Shanghai, China
Shanghai AI Laboratory, Shanghai, China

According to our database¹, Xin Liu authored at least 51 papers between 2022 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

DITRON: Distributed Multi-level Tiling Compiler for Parallel Tensor Programs.

CoRR, May, 2026

veScale-FSDP: Flexible and High-Performance FSDP at Scale.

CoRR, February, 2026

DASH: Deterministic Attention Scheduling for High-throughput Reproducible LLM Training.

CoRR, January, 2026

MegaScale-Omni: A Hyper-Scale, Workload-Resilient System for MultiModal LLM Training in Production.

Proceedings of the 21st European Conference on Computer Systems, 2026

Laminar: A Scalable Asynchronous RL Post-Training Framework.

Proceedings of the 21st European Conference on Computer Systems, 2026

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.

Proceedings of the 21st European Conference on Computer Systems, 2026

SwiftSpec: Disaggregated Speculative Decoding and Fused Kernels for Low-Latency LLM Inference.

Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Cannikin: No Lagger of SLO in Concurrent Multiple LoRA LLM Serving.

IEEE Trans. Parallel Distributed Syst., September, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning.

CoRR, September, 2025

Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution.

CoRR, September, 2025

LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving.

CoRR, September, 2025

A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understanding.

CoRR, August, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo.

CoRR, August, 2025

SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding.

CoRR, June, 2025

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.

CoRR, May, 2025

Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler.

CoRR, April, 2025

OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training.

CoRR, April, 2025

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism.

CoRR, April, 2025

ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs.

CoRR, February, 2025

Robust LLM Training Infrastructure at ByteDance.

Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025

MegaScale-Infer: Efficient Mixture-of-Experts Model Serving with Disaggregated Expert Parallelism.

Proceedings of the ACM SIGCOMM 2025 Conference, 2025

ByteScale: Communication-Efficient Scaling of LLM Training with a 2048K Context Length on 16384 GPUs.

Proceedings of the ACM SIGCOMM 2025 Conference, 2025

LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving.

Proceedings of the International Conference for High Performance Computing, 2025

Understanding Stragglers in Large Model Training Using What-if Analysis.

Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development.

Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025

DUO: No Compromise to Accuracy Degradation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

COMET: Fine-grained Computation-communication Overlapping for Mixture-of-Experts.

Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

TileLink: Generating Efficient Compute-Communication Overlapping Kernels using Tile-Centric Primitives.

Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

ByteCheckpoint: A Unified Checkpointing System for LLM Development.

CoRR, 2024

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

CoRR, 2024

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.

CoRR, 2024

RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation.

CoRR, 2024

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.

CoRR, 2024

MuxFlow: efficient GPU sharing in production-level clusters with more than 10000 GPUs.

Sci. China Inf. Sci., 2024

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.

Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Safety of Multimodal Large Language Models on Images and Text.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Query-Relevant Images Jailbreak Large Multi-Modal Models.

CoRR, 2023

MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters.

CoRR, 2023

vMF Loss: Exploring a Scattered Intra-class Hypersphere for Few-Shot Learning.

Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Not All Tasks Are Equal: A Parameter-Efficient Task Reweighting Method for Few-Shot Learning.

Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Recognizable Information Bottleneck.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

2022

Adaptive distribution calibration for few-shot learning via optimal transport.

Inf. Sci., 2022

BaGuaLu: targeting brain scale pretrained models with over 37 million cores.

Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

Teach Less, Learn More: On the Undistillable Classes in Knowledge Distillation.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
