Cheng Li

Orcid: 0000-0001-7064-6120

Affiliations:

University of Science and Technology of China (USTC), China
Max Planck Institute for Software Systems, Kaiserslautern / Saarbrücken, Germany (former)

According to our database, Cheng Li authored at least 51 papers between 2010 and 2024.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2024

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Noctua: Towards Automated and Practical Fine-grained Consistency Analysis.

Proceedings of the Nineteenth European Conference on Computer Systems, 2024

Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

A Comprehensive Study on Post-Training Quantization for Large Language Models.

CoRR, 2023

Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases.

Xiaoxia Wu

Cheng Li

Reza Yazdani Aminabadi

Zhewei Yao

Yuxiong He

CoRR, 2023

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction.

CoRR, 2023

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

gSampler: General and Efficient GPU-based Graph Sampling for Graph Learning.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

MUSE: A Programmable Metadata Load Estimation Interface for Ceph File System.

Xinyang Shao

Cheng Li

Yinlong Xu

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases.

Xiaoxia Wu

Cheng Li

Reza Yazdani Aminabadi

Zhewei Yao

Yuxiong He

Proceedings of the International Conference on Machine Learning, 2023

DySR: Adaptive Super-Resolution via Algorithm and System Co-design.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Revitalizing the Forgotten On-Chip DMA to Expedite Data Movement in NVM-based Storage Systems.

Proceedings of the 21st USENIX Conference on File and Storage Technologies, 2023

CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections.

Proceedings of the Eighteenth European Conference on Computer Systems, 2023

2022

vPipe: A Virtualized Acceleration System for Achieving Efficient and Scalable Pipeline Parallel DNN Training.

IEEE Trans. Parallel Distributed Syst., 2022

SelectiveEC: Towards Balanced Recovery Load on Erasure-Coded Storage Systems.

IEEE Trans. Parallel Distributed Syst., 2022

A Data Layout and Fast Failure Recovery Scheme for Distributed Storage Systems With Mixed Erasure Codes.

IEEE Trans. Computers, 2022

Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers.

CoRR, 2022

BiFeat: Supercharge GNN Training via Graph Feature Quantization.

CoRR, 2022

DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale.

Reza Yazdani Aminabadi

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Repair-Optimal Data Placement for Locally Repairable Codes with Optimal Minimum Hamming Distance.

Proceedings of the 51st International Conference on Parallel Processing, 2022

2021

Efficient Data Loader for Fast Sampling-Based GNN Training on Large Graphs.

IEEE Trans. Parallel Distributed Syst., 2021

Leveraging NVMe SSDs for Building a Fast, Cost-effective, LSM-tree-based KV Store.

ACM Trans. Storage, 2021

MTFC: A Multi-GPU Training Framework for Cube-CNN-Based Hyperspectral Image Classification.

IEEE Trans. Emerg. Top. Comput., 2021

AutoGR: Automated Geo-Replication with Fast System Performance and Preserved Application Semantics.

Proc. VLDB Endow., 2021

ECR: Eviction-cost-aware cache management policy for page-level flash-based SSDs.

Concurr. Comput. Pract. Exp., 2021

Gradient Compression Supercharged High-Performance Data Parallel DNN Training.

Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Lunule: an agile and judicious metadata load balancer for CephFS.

Proceedings of the International Conference for High Performance Computing, 2021

SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage.

Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

Lessons learned from migrating complex stateful applications onto serverless platforms.

Proceedings of the APSys '21: 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 2021

2020

Not All Explorations Are Equal: Harnessing Heterogeneous Profiling Cost for Efficient MLaaS Training.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

PDL: A Data Layout towards Fast Failure Recovery for Erasure-coded Distributed Storage Systems.

Proceedings of the 39th IEEE Conference on Computer Communications, 2020

PaGraph: Scaling GNN training on large graphs via computation-aware caching.

Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

2019

BiloKey: A Scalable Bi-Index Locality-Aware In-Memory Key-Value Store.

IEEE Trans. Parallel Distributed Syst., 2019

Explicit Data Correlations-Directed Metadata Prefetching Method in Distributed File Systems.

IEEE Trans. Parallel Distributed Syst., 2019

ElasticBF: Elastic Bloom Filter with Hotness Awareness for Boosting Read Performance in Large Key-Value Stores.

Proceedings of the 2019 USENIX Annual Technical Conference, 2019

HCFTL: A Locality-Aware Page-Level Flash Translation Layer.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Bayesian Optimisation for Objective Functions with Varying Smoothness.

Proceedings of the AI 2019: Advances in Artificial Intelligence, 2019

2018

Fine-grained consistency for geo-replicated systems.

Cheng Li

Nuno M. Preguiça

Rodrigo Rodrigues

Proceedings of the 2018 USENIX Annual Technical Conference, 2018

LCR: Load-Aware Cache Replacement Algorithm for Flash-Based SSDs.

Proceedings of the 2018 IEEE International Conference on Networking, 2018

A Flexible Method for Time-of-Flight Camera Calibration Using Random Forest.

Chi Xu

Cheng Li

Proceedings of the Smart Multimedia - First International Conference, 2018

ElasticBF: Fine-grained and Elastic Bloom Filter Towards Efficient Read for LSM-tree-based KV Stores.

Proceedings of the 10th USENIX Workshop on Hot Topics in Storage and File Systems, 2018

2016

Building fast and consistent (geo-)replicated systems: from principles to practice.

Cheng Li

PhD thesis, 2016

Geo-Replication: Fast If Possible, Consistent If Necessary.

IEEE Data Eng. Bull., 2016

2015

Visigoth fault tolerance.

Flavio Paiva Junqueira

Rodrigo Rodrigues

Proceedings of the Tenth European Conference on Computer Systems, 2015

Minimizing coordination in replicated systems.

Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data, 2015

2014

Automating the Choice of Consistency Levels in Replicated Systems.

Proceedings of the 2014 USENIX Annual Technical Conference, 2014

2012

Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary.

Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012

2011

Finding complex concurrency bugs in large multi-threaded applications.

Pedro Fonseca

Cheng Li

Rodrigo Rodrigues

Proceedings of the European Conference on Computer Systems, 2011

2010

A study of the internal and external effects of concurrency bugs.

Proceedings of the 2010 IEEE/IFIP International Conference on Dependable Systems and Networks, 2010
