Yong Li

ORCID: 0000-0001-9072-3170

Affiliations:
  • Alibaba Group, Beijing, China


According to our database, Yong Li authored at least 32 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling.
CoRR, May 2025

Wan: Open and Advanced Large-Scale Video Generative Models.
CoRR, March 2025

Helios: Efficient Distributed Dynamic Graph Sampling for Online GNN Inference.
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

Efficient Long Context Fine-tuning with Chunk Flow.
Proceedings of the 42nd International Conference on Machine Learning, 2025

CaliEX: A Disk-Based Large-Scale GNN Training System with Joint Design of Caching and Execution.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

Spindle: Efficient Distributed Training of Multi-Task Large Models via Wavefront Scheduling.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.
IEEE Trans. Parallel Distributed Syst., 2024

BladeDISC++: Memory Optimizations Based On Symbolic Shape.
CoRR, 2024

Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management.
CoRR, 2024

Rubick: Exploiting Job Reconfigurability for Deep Learning Cluster Scheduling.
CoRR, 2024

Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
CoRR, 2024

Llumnix: Dynamic Scheduling for Large Language Model Serving.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023

GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning.
Proc. ACM Manag. Data, 2023

ROAM: memory-efficient large DNN training via optimized operator ordering and memory layout.
CoRR, 2023

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation.
CoRR, 2023

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023

uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
EasyScale: Accuracy-consistent Elastic Training for Deep Learning.
CoRR, 2022

Structure Enhanced Graph Neural Networks for Link Prediction.
CoRR, 2022

Whale: Efficient Giant Model Training over Heterogeneous GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining.
CoRR, 2021

Exploring Sparse Expert Models and Beyond.
CoRR, 2021

M6: A Chinese Multimodal Pretrainer.
CoRR, 2021

2020
Focusing More on Conflicts with Mis-Predictions Helps Language Pre-Training.
CoRR, 2020

EasyTransfer - A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
CoRR, 2020

Whale: A Unified Distributed Training Framework.
CoRR, 2020

AntMan: Dynamic Scaling on GPU Clusters for Deep Learning.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

2019
AliGraph: A Comprehensive Graph Neural Network Platform.
Proc. VLDB Endow., 2019

