Youngsok Kim

ORCID: 0000-0002-1015-9969

According to our database, Youngsok Kim authored at least 55 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2026

DFLOP: A Data-driven Framework for Multimodal LLM Training Pipeline Optimization.

CoRR, March, 2026

DANCE++: Differentiable Accelerator/Network Co-Exploration With Hard Constraints and Data-Free Training for Real-World Scenarios.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2026

Peak-memory-aware partitioning and scheduling for multi-tenant DNN model inference.

J. Syst. Archit., 2026

SMTcheck: Accurate SMT Interference Prediction to Improve Scheduling Efficiency in Datacenters.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2026

LoCaLUT: Harnessing Capacity-Computation Tradeoffs for LUT-Based Inference in DRAM-PIM.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2026

2025

IntervalSim++: Enhanced Interval Simulation for Unbalanced Processor Designs.

IEEE Comput. Archit. Lett., 2025

LATPC: Accelerating GPU Address Translation Using Locality-Aware TLB Prefetching and MSHR Compression.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

GCStack+GCScaler: Fast and Accurate GPU Performance Analyses Using Fine-Grained Stall Cycle Accounting and Interval Analysis.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

PIM-CARE: A Compiler-Assisted Dynamic Resource Allocation Framework for Real-world DRAM PIM.

Proceedings of the 39th ACM International Conference on Supercomputing, 2025

DMO-DB: Mitigating the Data Movement Bottlenecks of GPU-Accelerated Relational OLAP.

Proceedings of the 34th International Conference on Parallel Architectures and Compilation Techniques, 2025

2024

SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs.

Proc. ACM Manag. Data, December, 2024

AIS-SNU/GraNNDis_Artifact: Artifact Evaluation Submission.

Dataset, July, 2024

GCStack: A GPU Cycle Accounting Mechanism for Providing Accurate Insight Into GPU Performance.

IEEE Comput. Archit. Lett., 2024

Gem5-AVX: Extension of the Gem5 Simulator to Support AVX Instruction Sets.

IEEE Access, 2024

AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping.

Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Orchestrating Multiple Mixed Precision Models on a Shared Precision-Scalable NPU.

Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, 2024

PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU.

Proceedings of the 53rd International Conference on Parallel Processing, 2024

Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MPC-Wrapper: Fully Harnessing the Potential of Samsung Aquabolt-XL HBM2-PIM on FPGAs.

Proceedings of the 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2024

GraNNDis: Fast Distributed Graph Neural Network Training Framework for Multi-Server Clusters.

Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023

Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring.

IEEE Trans. Computers, December, 2023

Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs.

Proc. ACM Manag. Data, 2023

GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters.

CoRR, 2023

McCore: A Holistic Management of High-Performance Heterogeneous Multicores.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Pipe-BD: Pipelined Parallel Blockwise Distillation.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Occamy: Memory-efficient GPU Compiler for DNN Inference.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Virtual PIM: Resource-Aware Dynamic DPU Allocation and Workload Scheduling Framework for Multi-DPU PIM Architecture.

Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022

GuardiaNN: Fast and Secure On-Device Inference in TrustZone Using Embedded SRAM and Cryptographic Hardware.

Proceedings of the Middleware '22: 23rd International Middleware Conference, Quebec, QC, Canada, November 7, 2022

GCoM: a detailed GPU core model for accurate analytical modeling of modern GPUs.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

SALoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Enabling hard constraints in differentiable neural network and accelerator co-exploration.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

Decoupling Schedule, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing.

IEEE Comput. Archit. Lett., 2021

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs.

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

DANCE: Differentiable Accelerator/Network Co-Exploration.

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Thread-Aware Area-Efficient High-Level Synthesis Compiler for Embedded Devices.

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

2020

Real-Time Object Detection System with Multi-Path Neural Networks.

Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, 2020

2019

FlexLearn: Fast and Highly Efficient Brain Simulations Using Flexible On-Chip Learning.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization.

Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019

2018

Flexon: A Flexible Digital Neuron for Efficient Spiking Neural Network Simulations.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks.

Amirali Boroumand, Saugata Ghose, Youngsok Kim, Rachata Ausavarungnirun, Parthasarathy Ranganathan, Onur Mutlu

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017

GPUpd: a fast and scalable multi-GPU architecture using cooperative projection and distribution.

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

2016

Efficient footprint caching for Tagless DRAM Caches.

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

CloudSwap: A Cloud-Assisted Swap Mechanism for Mobile Devices.

Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

2015

DCS: a fast and scalable device-centric server architecture.

Proceedings of the 48th International Symposium on Microarchitecture, 2015

2014

ScaleGPU: GPU Architecture for Memory-Unaware GPU Programming.

IEEE Comput. Archit. Lett., 2014

Stealing Webpages Rendered on Your Browser by Exploiting GPU Vulnerabilities.

Proceedings of the 2014 IEEE Symposium on Security and Privacy, 2014

GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management.

Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
