Mingzhen Li

Orcid: 0000-0002-4115-9072

Affiliations:

Chinese Academy of Sciences, Institute of Computing Technology, State Key Lab of Processors (SKLP), Beijing, China
Beihang University, Beijing, China (PhD 2023)

According to our database¹, Mingzhen Li authored at least 40 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

JanusPipe: Efficient Pipeline Parallel Training for Machine Learning Interatomic Potentials.

CoRR, May, 2026

Breaking the Training Barrier of Billion-Parameter Universal Machine Learning Interatomic Potentials.

CoRR, April, 2026

Efficient large-scale sparse LU factorization for fast radio frequency circuit simulation.

Int. J. High Perform. Comput. Appl., 2026

2025

Large-scale Neural Network Quantum States for ab initio Quantum Chemistry Simulations on Fugaku.

CoRR, June, 2025

29-Billion Atoms Molecular Dynamics Simulation With Ab Initio Accuracy on 35 Million Cores of New Sunway Supercomputer.

IEEE Trans. Computers, May, 2025

Scaling Neural-Network-Based Molecular Dynamics with Long-Range Electrostatic Interactions to 51 Nanoseconds per Day.

CoRR, April, 2025

An interpretable DeePMD-kit performance model for emerging supercomputers.

CCF Trans. High Perform. Comput., April, 2025

Scaling Neural-Network-Based Molecular Dynamics with Long-Range Electrostatic Interactions to 51 Nanoseconds per Day.

Dataset, April, 2025

Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism.

Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

INSPIRIT: Adaptive Priority-based Task Scheduling for Heterogeneous Hardware.

Proceedings of the 2025 IEEE International Parallel and Distributed Processing Symposium, 2025

ESC: Effective Submanifold Convolution using Tensor Cores.

Proceedings of the 54th International Conference on Parallel Processing, 2025

Efficient Long Context Fine-tuning with Chunk Flow.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

FastSpMM: Leveraging Tensor Cores for Sparse Matrix Multiplication.

Proceedings of the 22nd ACM International Conference on Computing Frontiers, 2025

2024

Towards optimized tensor code generation for deep learning on sunway many-core processor.

Frontiers Comput. Sci., April, 2024

ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG.

IEEE Trans. Parallel Distributed Syst., 2024

Building a domain-specific compiler for emerging processors with a reusable approach.

Sci. China Inf. Sci., 2024

Scaling Molecular Dynamics with ab initio Accuracy to 149 Nanoseconds per Day.

Proceedings of the International Conference for High Performance Computing, 2024

Retrospection on the Performance Analysis Tools for Large-Scale HPC Programs.

Proceedings of the 31st IEEE International Conference on High Performance Computing, 2024

Accelerating Large-Scale Sparse LU Factorization for RF Circuit Simulation.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023

Adapting combined tiling to stencil optimizations on sunway processor.

CCF Trans. High Perform. Comput., September, 2023

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.

Proceedings of the International Conference for High Performance Computing, 2023

Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs.

Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022

QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU.

Parallel Comput., 2022

Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU.

CoRR, 2022

EasyScale: Accuracy-consistent Elastic Training for Deep Learning.

CoRR, 2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity.

CoRR, 2022

CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Toward accelerated stencil computation by adapting tensor core unit on GPU.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021

The Deep Learning Compiler: A Comprehensive Survey.

IEEE Trans. Parallel Distributed Syst., 2021

swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight.

IEEE Trans. Emerg. Top. Comput., 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

PriPro: Towards Effective Privacy Protection on Edge-Cloud System running DNN Inference.

Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020

Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.

IEEE Trans. Parallel Distributed Syst., 2020

The Deep Learning Compiler: A Comprehensive Survey.

CoRR, 2020

Real-Time Polyp Detection for Colonoscopy Video on CPU.

Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020

swRodinia: A Benchmark Suite for Exploiting Architecture Properties of Sunway Processor.

Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2018

Block-Checksum-Based Fault Tolerance for Matrix Multiplication on Large-Scale Parallel Systems.

Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Multi-role SpTRSV on Sunway Many-Core Architecture.
