Myung Kuk Yoon

Orcid: 0000-0002-9332-0251

According to our database, Myung Kuk Yoon authored at least 34 papers between 2013 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
TLP Balancer: Predictive Thread Allocation for Multitenant Inference in Embedded GPUs.
IEEE Embed. Syst. Lett., June, 2025

Beyond VABlock: Improving Transformer workloads through aggressive prefetching.
J. Syst. Archit., 2025

MOST: Memory Oversubscription-Aware Scheduling for Tensor Migration on GPU Unified Storage.
IEEE Comput. Archit. Lett., 2025

SSFFT: Energy-Efficient Selective Scaling for Fast Fourier Transform in Embedded GPUs.
Proceedings of the 26th ACM SIGPLAN/SIGBED International Conference on Languages, 2025

Hierarchical Traversal Stack Design Using Shared Memory for GPU Ray Tracing.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2025

Avant-Garde: Empowering GPUs with Scaled Numeric Formats.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

Marching Page Walks: Batching and Concurrent Page Table Walks for Enhancing GPU Throughput.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

Warped-Compaction: Maximizing GPU Register File Bandwidth Utilization via Operand Compaction.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

HyMM: A Hybrid Sparse-Dense Matrix Multiplication Accelerator for GCNs.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

2024
Adaptive Kernel Merge and Fusion for Multi-Tenant Inference in Embedded GPUs.
IEEE Embed. Syst. Lett., December, 2024

Triple-A: Early Operand Collector Allocation for Maximizing GPU Register Bank Utilization.
IEEE Embed. Syst. Lett., June, 2024

Conflict-aware compiler for hierarchical register file on GPUs.
J. Syst. Archit., 2024

SAVector: Vectored Systolic Arrays.
IEEE Access, 2024

DEPrune: Depth-wise Separable Convolution Pruning for Maximizing GPU Parallelism.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

2023
Early-Adaptor: An Adaptive Framework for Proactive UVM Memory Management.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Balanced Column-Wise Block Pruning for Maximizing GPU Parallelism.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022
CASH-RF: A Compiler-Assisted Hierarchical Register File in GPUs.
IEEE Embed. Syst. Lett., 2022

GhostLeg: Selective Memory Coalescing for Secure GPU Architecture.
IEEE Access, 2022

Analyzing GCN Aggregation on GPU.
IEEE Access, 2022

TEA-RC: Thread Context-Aware Register Cache for GPUs.
IEEE Access, 2022

Reconstructing Out-of-Order Issue Queue.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

2020
REACT: Scalable and High-Performance Regular Expression Pattern Matching Accelerator for In-Storage Processing.
IEEE Trans. Parallel Distributed Syst., 2020

2019
Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs.
IEEE Trans. Computers, 2019

2018
WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs.
IEEE Trans. Computers, 2018

FineReg: Fine-Grained Register File Management for Augmenting GPU Throughput.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

2017
Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs.
IEEE Trans. Parallel Distributed Syst., 2017

2016
Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Warped-preexecution: A GPU pre-execution approach for improving latency hiding.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
DRAW: investigating benefits of adaptive fetch group size on GPU.
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

2013
A Distributed Signature Detection Method for Detecting Intrusions in Sensor Systems.
Sensors, 2013
