Kun Li

ORCID: 0000-0002-1013-1325

According to our database, Kun Li authored at least 26 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SparStencil: Retargeting Sparse Tensor Cores to Scientific Stencil Computations via Structured Sparsity Transformation.
CoRR, June, 2025

SwarmThinkers: Learning Physically Consistent Atomic KMC Transitions at Scale.
CoRR, May, 2025

LeMo: Enabling LEss Token Involvement for MOre Context Fine-tuning.
CoRR, January, 2025

JENGA: Enhancing LLM Long-Context Fine-tuning with Contextual Token Sparsity.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers.
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units.
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025

2024
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation.
CoRR, 2024

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management.
CoRR, 2024

LoRAStencil: Low-Rank Adaptation of Stencil Computation on Tensor Cores.
Proceedings of the International Conference for High Performance Computing, 2024

Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity.
Proceedings of the International Conference for High Performance Computing, 2024

ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

2023
AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format.
IEEE Trans. Parallel Distributed Syst., March, 2023

Gamify Stencil Dwarf on Cloud for Democratizing Scientific Computing.
CoRR, 2023

OpenFFT: An Adaptive Tuning Framework for 3D FFT on ARM Multicore CPUs.
Proceedings of the 37th International Conference on Supercomputing, 2023

2022
An Accurate and Efficient Large-Scale Regression Method Through Best Friend Clustering.
IEEE Trans. Parallel Distributed Syst., 2022

An Efficient Vectorization Scheme for Stencil Computation.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

LBBGEMM: A Load-balanced Batch GEMM Framework on ARM CPUs.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

EgpuIP: An Embedded GPU Accelerated Library for Image Processing.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
Temporal vectorization for stencils.
Proceedings of the International Conference for High Performance Computing, 2021

Reducing redundancy in data organization and arithmetic calculation for stencil computations.
Proceedings of the International Conference for High Performance Computing, 2021

2020
likundec/stencil: First Release.
Dataset, June, 2020

FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2020

2019
Correction to: FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2019

OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight.
Proceedings of the International Conference for High Performance Computing, 2019

swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway Taihulight.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

2018
Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model.
Proceedings of the 47th International Conference on Parallel Processing, 2018

