Xin Liu

ORCID: 0009-0004-0341-3860

Affiliations:
  • East China Normal University, Shanghai, China
  • Shanghai AI Laboratory, Shanghai, China


According to our database, Xin Liu authored at least 28 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2025
Cannikin: No Lagger of SLO in Concurrent Multiple LoRA LLM Serving.
IEEE Trans. Parallel Distributed Syst., September, 2025

SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding.
CoRR, June, 2025

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.
CoRR, May, 2025

Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler.
CoRR, April, 2025

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism.
CoRR, April, 2025

TileLink: Generating Efficient Compute-Communication Overlapping Kernels using Tile-Centric Primitives.
CoRR, March, 2025

Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts.
CoRR, February, 2025

Understanding Stragglers in Large Model Training Using What-if Analysis.
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-25), 2025

2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.
CoRR, 2024

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
CoRR, 2024

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.
CoRR, 2024

RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation.
CoRR, 2024

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.
CoRR, 2024

MuxFlow: efficient GPU sharing in production-level clusters with more than 10000 GPUs.
Sci. China Inf. Sci., 2024

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

Safety of Multimodal Large Language Models on Images and Text.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Query-Relevant Images Jailbreak Large Multi-Modal Models.
CoRR, 2023

MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters.
CoRR, 2023

vMF Loss: Exploring a Scattered Intra-class Hypersphere for Few-Shot Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Not All Tasks Are Equal: A Parameter-Efficient Task Reweighting Method for Few-Shot Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Recognizable Information Bottleneck.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

2022
Adaptive distribution calibration for few-shot learning via optimal transport.
Inf. Sci., 2022

BaGuaLu: targeting brain scale pretrained models with over 37 million cores.
Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '22), 2022

Teach Less, Learn More: On the Undistillable Classes in Knowledge Distillation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
