Gwangsun Kim

Orcid: 0000-0001-5749-5794

According to our database, Gwangsun Kim authored at least 30 papers between 2011 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
A Programming Model for Efficient Inter-Kernel Control-Flow on Memory-Mapped Near-Data Processing Architecture (WIP).
Proceedings of the 27th ACM SIGPLAN/SIGBED International Conference on Languages, 2026

2025
Cost-Effective Extension of DRAM-PIM for Group-Wise LLM Quantization.
IEEE Comput. Archit. Lett., 2025

PyTorchSim: A Comprehensive, Fast, and Accurate NPU Simulation Framework.
Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

LibraPIM: Dynamic Load Rebalancing to Maximize Utilization in PIM-Assisted LLM Inference Systems.
Proceedings of the 34th International Conference on Parallel Architectures and Compilation Techniques, 2025

2024
Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory.
CoRR, 2024

ONNXim: A Fast, Cycle-Level Multi-Core NPU Simulator.
IEEE Comput. Archit. Lett., 2024

Non-Invasive, Memory Access-Triggered Near-Data Processing for DNN Training Acceleration on GPUs.
IEEE Access, 2024

Low-Overhead General-Purpose Near-Data Processing in CXL Memory Expanders.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2022
Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack.
IEEE Comput. Archit. Lett., 2022

Dynamic global adaptive routing in high-radix networks.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

2021
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs.
IEEE Comput. Archit. Lett., 2021

2018
TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

2017
Toward standardized near-data processing with unrestricted data placement for GPUs.
Proceedings of the International Conference for High Performance Computing, 2017

History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Design and Analysis of Hybrid Flow Control for Hierarchical Ring Network-on-Chip.
IEEE Trans. Computers, 2016

Contention-based congestion management in large-scale networks.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

iPAWS: Instruction-issue pattern-based adaptive warp scheduling for GPGPUs.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

Accelerating Linked-list Traversal Through Near-Data Processing.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Overcoming far-end congestion in large-scale networks.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

2014
Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement.
IEEE Trans. Computers, 2014

Multi-GPU System Design with Memory Networks.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Transportation-network-inspired network-on-chip.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
Memory-centric system interconnect design with Hybrid Memory Cubes.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Scalable on-chip network in power constrained manycore processors.
Proceedings of the 2012 International Green Computing Conference, 2012

2011
FlexiBuffer: reducing leakage power in on-chip network routers.
Proceedings of the 48th Design Automation Conference, 2011

