Yong Li
Orcid: 0000-0001-9072-3170
Affiliations:
- Alibaba Group, Beijing, China
According to our database, Yong Li authored at least 32 papers between 2019 and 2025.
Bibliography
2025
CoRR, May, 2025
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025
Proceedings of the Forty-second International Conference on Machine Learning, 2025
CaliEX: A Disk-Based Large-Scale GNN Training System with Joint Design of Caching and Execution.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025
Spindle: Efficient Distributed Training of Multi-Task Large Models via Wavefront Scheduling.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025
2024
ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.
IEEE Trans. Parallel Distributed Syst., 2024
Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management.
CoRR, 2024
CoRR, 2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
CoRR, 2024
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023
GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning.
Proc. ACM Manag. Data, 2023
ROAM: memory-efficient large DNN training via optimized operator ordering and memory layout.
CoRR, 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023
CoRR, 2023
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, 2023
uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining.
CoRR, 2021
2020
CoRR, 2020
EasyTransfer - A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
CoRR, 2020
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020
2019