Junshi Chen

Orcid: 0000-0002-6487-3658

Affiliations:
  • University of Science and Technology of China, Hefei, Anhui, China


According to our database, Junshi Chen authored at least 34 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
PWDFT-SW: Extending the Limit of Plane-Wave DFT Calculations to 16K Atoms on the New Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., July, 2025

SparStencil: Retargeting Sparse Tensor Cores to Scientific Stencil Computations via Structured Sparsity Transformation.
Proceedings of the International Conference for High Performance Computing, 2025

Million-Atom Ab Initio Electron Dynamics: Discontinuous Galerkin Real-Time Time-Dependent Density Functional Theory.
Proceedings of the International Conference for High Performance Computing, 2025

AMALI: An Analytical Model for Accurately Modeling LLM Inference on Modern GPUs.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

CIExplorer: Microarchitecture-Aware Exploration for Tightly Integrated Custom Instruction.
Proceedings of the 39th ACM International Conference on Supercomputing, 2025

WinRS: Accelerate Winograd Backward-Filter Convolution with Tiny Workspace.
Proceedings of the 54th International Conference on Parallel Processing, 2025

PromptSeg: Learning to Segment Medical Image via Visual Prompts.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Carver: Learning to Reconstruct Right Ventricle from Sparse Multi-View 2D Echocardiograms.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Uniform Dense Blocking for Efficient Sparse LU Factorization in First-Principles Materials Simulation.
Proceedings of the Euro-Par 2025: Parallel Processing, 2025

Pruner: A Draft-then-Verify Exploration Mechanism to Accelerate Tensor Program Tuning.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
SWattention: designing fast and memory-efficient attention for a new Sunway Supercomputer.
J. Supercomput., July, 2024

Extending the limit of LR-TDDFT on two different approaches: Numerical algorithms and new Sunway heterogeneous supercomputer.
Parallel Comput., 2024

Pruner: An Efficient Cross-Platform Tensor Compiler with Dual Awareness.
CoRR, 2024

Enabling 13K-Atom Excited-State GW Calculations via Low-Rank Approximations and HPC on the New Sunway Supercomputer.
Proceedings of the International Conference for High Performance Computing, 2024

DB-SpGEMM: A Massively Distributed Block-Sparse Matrix-Matrix Multiplication for Linear-Scaling DFT Calculations.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

Multi-level Load Balancing Strategies for Massively Parallel Smoothed Particle Hydrodynamics Simulation.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

2023
High performance computing for first-principles Kohn-Sham density functional theory towards exascale supercomputers.
CCF Trans. High Perform. Comput., March, 2023

swMPAS-A: Scaling MPAS-A to 39 Million Heterogeneous Cores on the New Generation Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2023

Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics.
Proceedings of the International Conference for High Performance Computing, 2023

SWSPH: A Massively Parallel SPH Implementation for Hundred-Billion-Particle Simulation on New Sunway Supercomputer.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

2022
Bridging the Gap between Deep Learning and Frustrated Quantum Spin System for Extreme-Scale Simulations on New Generation of Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2022

2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Accelerating Parallel First-Principles Excited-State Calculation by Low-Rank Approximation with K-Means Clustering.
Proceedings of the 51st International Conference on Parallel Processing, 2022

2021
Towards Efficient Short-Range Pair Interaction on Sunway Many-Core Architecture.
J. Comput. Sci. Technol., 2021

RDMA-Based Apache Storm for High-Performance Stream Data Processing.
Int. J. Parallel Program., 2021

Symplectic structure-preserving particle-in-cell whole-volume simulation of tokamak plasmas to 111.3 trillion particles and 25.7 billion grids.
Proceedings of the International Conference for High Performance Computing, 2021

2020
RDMA-Based Apache Storm for High-Performance Stream Data Processing.
Proceedings of the Network and Parallel Computing, 2020

An Efficient Multi-GPU Implementation for Linear-Response Time-Dependent Density Functional Theory.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Optimizing Astrophysical Simulation Software on Sunway Heterogeneous Manycore Architecture.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

2019
Improving the Performance of MongoDB with RDMA.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Redesign NAMD Molecular Dynamics Non-Bonded Force-Field on Sunway Manycore Processor.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2017
A Dataflow-Based Runtime Support on a 100P Actual System.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Refactoring the Molecular Docking Simulation for Heterogeneous, Manycore Processors Systems.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Pipelining Computation and Optimization Strategies for Scaling GROMACS on the Sunway Many-Core Processor.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2017

