Yong Li
ORCID: 0000-0001-9072-3170
Affiliations:
- Jiangsu University, Automotive Engineering Research Institute, Zhenjiang, China
- University of Science and Technology Beijing, China (PhD 2015)
According to our database, Yong Li authored at least 34 papers between 2018 and 2025.
Bibliography
2025
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025
2024
ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.
IEEE Trans. Parallel Distributed Syst., 2024
CoRR, 2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
CoRR, 2024
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023
GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning.
Proc. ACM Manag. Data, 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023
CoRR, 2023
Legion: Automatically Pushing the Envelope of Multi-GPU System for Billion-Scale GNN Training.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, 2023
uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Revisiting and Advancing Chinese Natural Language Understanding with Accelerated Heterogeneous Knowledge Pre-training.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: EMNLP 2022 - Industry Track, Abu Dhabi, UAE, December 7, 2022
2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining.
CoRR, 2021
MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
2020
CoRR, 2020
EasyTransfer - A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
CoRR, 2020
MicroRec: Accelerating Deep Recommendation Systems to Microseconds by Hardware and Data Structure Solutions.
CoRR, 2020
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020
2019
2018
GA-BPNN Based Hybrid Steering Control Approach for Unmanned Driving Electric Vehicle with In-Wheel Motors.
Complex., 2018
A Nonlinear Decoupling Control Approach Using RBFNNI-Based Robust Pole Placement for a Permanent Magnet In-Wheel Motor.
IEEE Access, 2018