Lin Gan

Orcid: 0000-0003-1297-4462

Affiliations:
  • Tsinghua University, Beijing, China


According to our database, Lin Gan authored at least 84 papers between 2013 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Accelerating Molecular Dynamics Simulations on ARM Multi-Core Processors.
IEEE Trans. Parallel Distributed Syst., April, 2026

RabbitVar: Ultra-fast and accurate somatic small-variant calling on multi-core architectures.
Future Gener. Comput. Syst., 2026

HierCut: Enabling 16-bit Format Mixed Precision for Molecular Dynamics through Hierarchical Cutoff.
Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

2025
FastMPS: Revisit Data Parallel in Large-scale Matrix Product State Sampling.
CoRR, December, 2025

MMStencil: Optimizing High-order Stencils on Multicore CPU using Matrix Unit.
CoRR, July, 2025

GenTT: Generate Vectorized Codes for General Tensor Permutation.
CoRR, June, 2025

SW-TNC: Reaching the Most Complex Random Quantum Circuit via Tensor Network Contraction.
CoRR, April, 2025

Leveraging the Hardware Resources to Accelerate cryo-EM Reconstruction of RELION on the New Sunway Supercomputer.
ACM Trans. Archit. Code Optim., March, 2025

T2-RELION: Task Parallelism, Tensor Core Accelerated RELION for Cryo-EM 3D Reconstruction.
Proceedings of the International Conference for High Performance Computing, 2025

Trillion Ligands per Day: Performance-Portable Virtual Screening via Compound Database Optimization and Multi-Target Docking.
Proceedings of the International Conference for High Performance Computing, 2025

Auto-Stencil: Performance-Driven Stencil Optimization with Hardware Feedback for LLMs.
Proceedings of the 54th International Conference on Parallel Processing, 2025

SCINet: a high-performance computing infrastructure prototype based on supercomputers and high-speed Internet.
Proceedings of the 2025 9th International Conference on High Performance Compilation, 2025

2024
O2ath: an OpenMP offloading toolkit for the sunway heterogeneous manycore platform.
CCF Trans. High Perform. Comput., June, 2024

Towards optimized tensor code generation for deep learning on sunway many-core processor.
Frontiers Comput. Sci., April, 2024

ESFLOW: Mapping Large-Scale Earthquake Simulation to Spatial Computing Systems.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

Enabling High-Performance Physical Based Rendering on New Sunway Supercomputer.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023
Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
Computer, August, 2023

Bio-ESMD: A Data Centric Implementation for Large-Scale Biological System Simulation on Sunway TaihuLight Supercomputer.
IEEE Trans. Parallel Distributed Syst., March, 2023

69.7-PFlops Extreme Scale Earthquake Simulation with Crossing Multi-faults and Topography on Sunway.
Proceedings of the International Conference for High Performance Computing, 2023

Enabling Real World Scale Structural Superlubricity All-Atom Simulation on the Next-Generation Sunway Supercomputer.
Proceedings of the International Conference for High Performance Computing, 2023

Lifetime-Based Optimization for Simulating Quantum Circuits on a New Sunway Supercomputer.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

2022
Detection of Burst Users and Symbols for Grant-Free Communication in the Presence of Massive Connected Users.
IEEE Trans. Veh. Technol., 2022

Critique of "MemXCT: Memory-Centric X-Ray CT Reconstruction With Massive Parallelization" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2022

Benchmarking 50-Photon Gaussian Boson Sampling on the Sunway TaihuLight.
IEEE Trans. Parallel Distributed Syst., 2022

Optimization of Reactive Force Field Simulation: Refactor, Parallelization, and Vectorization for Interactions.
IEEE Trans. Parallel Distributed Syst., 2022

High-Resolution Land Cover Mapping Through Learning With Noise Correction.
IEEE Trans. Geosci. Remote. Sens., 2022

Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
IEEE Trans. Computers, 2022

Enabling Large-Scale Simulation of CAM on the Sunway TaihuLight Supercomputer.
IEEE Trans. Computers, 2022

Validating quantum-supremacy experiments with exact and fast tensor network contraction.
CoRR, 2022

Lifetime-based Method for Quantum Simulation on a New Sunway Supercomputer.
CoRR, 2022

Accelerating cryo-EM Reconstruction of RELION on the New Sunway Supercomputer.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2022

2021
The Deep Learning Compiler: A Comprehensive Survey.
IEEE Trans. Parallel Distributed Syst., 2021

Towards efficient tile low-rank GEMM computation on sunway many-core processors.
J. Supercomput., 2021

Translating novel HPC techniques into efficient geoscience solutions.
J. Comput. Sci., 2021

Towards efficient canonical polyadic decomposition on sunway many-core processor.
Inf. Sci., 2021

Highly scalable parallel genetic algorithm on Sunway many-core processors.
Future Gener. Comput. Syst., 2021

2020
Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.
IEEE Trans. Parallel Distributed Syst., 2020

Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2020

Millimeter-Scale and Billion-Atom Reactive Force Field Simulation on Sunway Taihulight.
IEEE Trans. Parallel Distributed Syst., 2020

Improving 3-m Resolution Land Cover Mapping through Efficient Learning from an Imperfect 10-m Resolution Map.
Remote. Sens., 2020

Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach.
J. Parallel Distributed Comput., 2020

High performance reconfigurable computing for numerical simulation and deep learning.
CCF Trans. High Perform. Comput., 2020

Tuning a general purpose software cache library for TaihuLight's SW26010 processor.
CCF Trans. High Perform. Comput., 2020

SpTFS: sparse tensor format selection for MTTKRP via deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

Cell-list based molecular dynamics on many-core processors: a case study on sunway TaihuLight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2020

Neighbor-list-free molecular dynamics on sunway TaihuLight supercomputer.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

swRodinia: A Benchmark Suite for Exploiting Architecture Properties of Sunway Processor.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2019
Optimizing Finite Volume Method Solvers on Nvidia GPUs.
IEEE Trans. Parallel Distributed Syst., 2019

Performance Tuning and Analysis for Stencil-Based Applications on POWER8 Processor.
ACM Trans. Archit. Code Optim., 2019

swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture.
CoRR, 2019

swTensor: accelerating tensor decomposition on Sunway architecture.
CCF Trans. High Perform. Comput., 2019

SW_GROMACS: accelerate GROMACS on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2019

swATOP: Automatically Optimizing Deep Learning Operators on SW26010 Many-Core Processor.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Scaling the Training of Recurrent Neural Networks on Sunway TaihuLight Supercomputer.
Proceedings of the Computational Science - ICCS 2019, 2019

Million-Core-Scalable Simulation of the Elastic Migration Algorithm on Sunway TaihuLight Supercomputer.
Proceedings of the 19th IEEE/ACM International Symposium on Cluster, 2019

2018
Optimizing Convolutional Neural Networks on the Sunway TaihuLight Supercomputer.
ACM Trans. Archit. Code Optim., 2018

Redesigning LAMMPS for peta-scale and hundred-billion-atom simulation on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2018

Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2018

A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010.
Proceedings of the 47th International Conference on Parallel Processing, 2018

PLZMA: A Parallel Data Compression Method for Cloud Computing.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

2017
Solving Mesoscale Atmospheric Dynamics Using a Reconfigurable Dataflow Architecture.
IEEE Micro, 2017

Chapter Four - Data Flow Computing in Geoscience Applications.
Adv. Comput., 2017

Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2017

SW-AES: Accelerating AES Algorithm on the Sunway TaihuLight.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016
Evaluating the POWER8 Architecture through Optimizing Stencil-Based Algorithms.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics.
Proceedings of the International Conference for High Performance Computing, 2016

Accelerating the 3D euler atmospheric solver through heterogeneous CPU-GPU platforms.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Generalized GPU Acceleration for Applications Employing Finite-Volume Methods.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Performance optimization of Jacobi stencil algorithms based on POWER8 architecture.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

Unleashing the performance potential of CPU-GPU platforms for the 3D atmospheric Euler solver.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

2015
Solving the Global Atmospheric Equations through Heterogeneous Reconfigurable Platforms.
ACM Trans. Reconfigurable Technol. Syst., 2015

Ultra-Scalable CPU-MIC Acceleration of Mesoscale Atmospheric Modeling on Tianhe-2.
IEEE Trans. Computers, 2015

Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Optimizing Residue Number Reverse Converters through Bitwise Arithmetic on FPGAs.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

2014
Scaling Reverse Time Migration Performance through Reconfigurable Dataflow Engines.
IEEE Micro, 2014

Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax-Wendroff correction stencil.
Int. J. High Perform. Comput. Appl., 2014

Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Scaling and analyzing the stencil performance on multi-core and many-core architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

A highly-efficient and green data flow engine for solving euler atmospheric equations.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

2013
A peta-scalable CPU-GPU algorithm for global atmospheric simulations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Accelerating the 3D Elastic Wave Forward Modeling on GPU and MIC.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Accelerating solvers for global atmospheric equations through mixed-precision data flow engine.
Proceedings of the 23rd International Conference on Field Programmable Logic and Applications, 2013

Global Atmospheric Simulation on a Reconfigurable Platform.
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

