Cheng Li

Orcid: 0000-0002-9991-4472

Affiliations:
  • University of Illinois Urbana-Champaign, IL, USA


According to our database, Cheng Li authored at least 20 papers between 2016 and 2020.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2020
Performance benchmarking, analysis, and optimization of deep learning inference.
PhD thesis, 2020

MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale.
CoRR, 2020

DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs.
Proceedings of the ICPE '20: ACM/SPEC International Conference on Performance Engineering, 2020

DLSpec: A Deep Learning Task Exchange Specification.
Proceedings of the 2020 USENIX Conference on Operational Machine Learning, 2020

Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

The Design and Implementation of a Scalable Deep Learning Benchmarking Platform.
Proceedings of the 13th IEEE International Conference on Cloud Computing, 2020

2019
The Design and Implementation of a Scalable DL Benchmarking Platform.
CoRR, 2019

AI Matrix: A Deep Learning Benchmark for Alibaba Data Centers.
CoRR, 2019

Across-Stack Profiling and Characterization of Machine Learning Models on GPUs.
CoRR, 2019

Challenges and Pitfalls of Reproducing Machine Learning Artifacts.
CoRR, 2019

Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

MLModelScope: Evaluate and Introspect Cognitive Pipelines.
Proceedings of the 2019 IEEE World Congress on Services, 2019

Accelerating reduction and scan using tensor core units.
Proceedings of the ACM International Conference on Supercomputing, 2019

TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function-as-a-Service.
Proceedings of the 12th IEEE International Conference on Cloud Computing, 2019

2018
MLModelScope: Evaluate and Measure ML Models within AI Pipelines.
CoRR, 2018

TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments.
CoRR, 2018

SCOPE: C3SR Systems Characterization and Benchmarking Framework.
CoRR, 2018

2017
RAI: A Scalable Project Submission System for Parallel Programming Courses.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

2016
KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

