Youhui Bai
Orcid: 0009-0007-6073-7011
According to our database, Youhui Bai authored at least 17 papers between 2017 and 2026.
Bibliography
2026
nnScaler-M: Constraint-Guided and Placement-Aware Parallelization Plan Generation for Deep Learning Training.
IEEE Trans. Parallel Distributed Syst., July, 2026
Accelerating Long-Tail Generation in Synchronous RLHF Training via Adaptive Tensor Parallelism.
CoRR, May, 2026
CoRR, April, 2026
Lagom: Unleashing the Power of Communication and Computation Overlapping for Distributed LLM Training.
CoRR, February, 2026
SMIDT: High-Performance Inference Framework for MoE Models with Dynamic Top-K Routing.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CLO: Efficient LLM Inference System with CPU-Light KVCache Offloading via Algorithm-System Co-Design.
CoRR, November, 2025
A Generic, High-Performance, Compression-Aware Framework for Data Parallel DNN Training.
IEEE Trans. Parallel Distributed Syst., July, 2025
BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference.
CoRR, February, 2025
HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference.
CoRR, 2024
2023
IEEE Trans. Parallel Distributed Syst., August, 2023
MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
2021
IEEE Trans. Parallel Distributed Syst., 2021
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021
2017
Proceedings of the 46th International Conference on Parallel Processing, 2017