Xuanlei Zhao

Orcid: 0009-0000-4877-3115

According to our database, Xuanlei Zhao authored at least 17 papers between 2024 and 2025.


Bibliography

2025
HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism.
CoRR, July, 2025

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights.
CoRR, June, 2025

REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training.
CoRR, May, 2025

DD-Ranking: Rethinking the Evaluation of Dataset Distillation.
CoRR, May, 2025

Dynamic Vision Mamba.
CoRR, April, 2025

Enhance-A-Video: Better Generated Video for Free.
CoRR, February, 2025

CDIO: Cross-Domain Inference Optimization with Resource Preference Prediction for Edge-Cloud Collaboration.
CoRR, February, 2025

Real-Time Video Generation with Pyramid Attention Broadcast.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training.
CoRR, 2024

WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem.
CoRR, 2024

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers.
CoRR, 2024

HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
CoRR, 2024

AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference.
CoRR, 2024

FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

AutoChunk: Automated Activation Chunk for Memory-Efficient Deep Learning Inference.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
