Yinmin Zhong

Orcid: 0000-0002-2504-7652

According to our database, Yinmin Zhong authored at least 10 papers between 2023 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DistMind: Efficient Resource Disaggregation for Deep Learning Workloads.
IEEE/ACM Trans. Netw., June, 2024

RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion.
CoRR, 2024

DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models.
CoRR, 2024

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.
CoRR, 2024

LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

2023
Fast Distributed Inference Serving for Large Language Models.
CoRR, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

