Mingzhen Li

Orcid: 0000-0002-4115-9072

Affiliations:

Chinese Academy of Sciences, Institute of Computing Technology, State Key Lab of Processors (SKLP), Beijing, China
Beihang University, Beijing, China (PhD 2023)

According to our database, Mingzhen Li authored at least 31 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

29-Billion Atoms Molecular Dynamics Simulation With Ab Initio Accuracy on 35 Million Cores of New Sunway Supercomputer.

IEEE Trans. Computers, May, 2025

An interpretable DeePMD-kit performance model for emerging supercomputers.

CCF Trans. High Perform. Comput., April, 2025

Efficient Long Context Fine-tuning with Chunk Flow.

CoRR, March, 2025

Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism.

Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

INSPIRIT: Adaptive Priority-based Task Scheduling for Heterogeneous Hardware.

Proceedings of the 2025 IEEE International Parallel and Distributed Processing Symposium, 2025

FastSpMM: Leveraging Tensor Cores for Sparse Matrix Multiplication.

Proceedings of the 22nd ACM International Conference on Computing Frontiers, 2025

2024

Towards optimized tensor code generation for deep learning on Sunway many-core processor.

Frontiers Comput. Sci., April, 2024

ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.

IEEE Trans. Parallel Distributed Syst., 2024

Building a domain-specific compiler for emerging processors with a reusable approach.

Sci. China Inf. Sci., 2024

Retrospection on the Performance Analysis Tools for Large-Scale HPC Programs.

Proceedings of the 31st IEEE International Conference on High Performance Computing, 2024

Accelerating Large-Scale Sparse LU Factorization for RF Circuit Simulation.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023

Adapting combined tiling to stencil optimizations on Sunway processor.

CCF Trans. High Perform. Comput., September, 2023

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.

Proceedings of the International Conference for High Performance Computing, 2023

Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs.

Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022

QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU.

Parallel Comput., 2022

Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU.

CoRR, 2022

EasyScale: Accuracy-consistent Elastic Training for Deep Learning.

CoRR, 2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity.

CoRR, 2022

CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Toward accelerated stencil computation by adapting tensor core unit on GPU.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021

The Deep Learning Compiler: A Comprehensive Survey.

IEEE Trans. Parallel Distributed Syst., 2021

swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight.

IEEE Trans. Emerg. Top. Comput., 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

PriPro: Towards Effective Privacy Protection on Edge-Cloud System running DNN Inference.

Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020

Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.

IEEE Trans. Parallel Distributed Syst., 2020

The Deep Learning Compiler: A Comprehensive Survey.

CoRR, 2020

Real-Time Polyp Detection for Colonoscopy Video on CPU.

Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020

swRodinia: A Benchmark Suite for Exploiting Architecture Properties of Sunway Processor.

Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2018

Block-Checksum-Based Fault Tolerance for Matrix Multiplication on Large-Scale Parallel Systems.

Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Multi-role SpTRSV on Sunway Many-Core Architecture.
