Won Woo Ro

Murali Annavaram

CoRR, June, 2025

q-Point: A Numeric Format for Quantum Circuit Simulation Using Polar Form Complex Numbers.

IEEE Trans. Emerg. Top. Comput., 2025

Perspective Shifts: Cultivating Teacher Diversity in Online Knowledge Distillation.

Knowl. Based Syst., 2025

REC: Enhancing fine-grained cache coherence protocol in multi-GPU systems.

J. Syst. Archit., 2025

REDIT: Redirection-Enabled Memory-Side Directory Architecture for CXL Memory Fabric.

IEEE Comput. Archit. Lett., 2025

Deep Reinforcement Learning-Based Combinatorial Optimization Solver to Address Wireless Resource Allocation Problem.

Raihan Muhammad Syahran

Kae Won Choi

Proceedings of the 102nd IEEE Vehicular Technology Conference, 2025

BitL: A Hybrid Bit-Serial and Parallel Deep Learning Accelerator for Critical Path Reduction.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

LATPC: Accelerating GPU Address Translation Using Locality-Aware TLB Prefetching and MSHR Compression.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

COSMOS: An LLC Contention Slowdown Model for Heterogeneous Multi-Core Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2025

Garibaldi: A Pairwise Instruction-Data Management for Enhancing Shared Last-Level Cache Performance in Server Workloads.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

QR-Map: A Map-Based Approach to Quantum Circuit Abstraction for Qubit Reuse Optimization.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

Avant-Garde: Empowering GPUs with Scaled Numeric Formats.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

Adversarial Purification via Super-Resolution and Diffusion.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

WINS: Winograd Structured Pruning for Fast Winograd Convolution.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

PIMFY: Eliminating Remote Page Walks in MCM GPUs.

Proceedings of the 43rd IEEE International Conference on Computer Design, 2025

Marching Page Walks: Batching and Concurrent Page Table Walks for Enhancing GPU Throughput.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

Ditto: Accelerating Diffusion Model via Temporal Value Similarity.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

Enhancing IOMMU Efficiency in Heterogeneous SoCs: A Study on Cache Policy Impacts.

Won Hur

Proceedings of the International Conference on Electronics, Information, and Communication, 2025

CVMAX: Accelerator Architecture with Polar Form Multiplication for Complex-Valued Neural Networks.

Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

Qubit Movement-Optimized Program Generation on Zoned Neutral Atom Processors.

Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, 2025

PIMutation: Exploring the Potential of Real PIM Architecture for Quantum Circuit Simulation.

Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024

SHREG: Mitigating register redundancy in GPUs.

J. Syst. Archit., 2024

M3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs.

Dongho Ha

Yunan Zhang

Chen-Chien Kao

Christopher J. Hughes

Hung-Wei Tseng

Proceedings of the International Conference for High Performance Computing, 2024

DEPrune: Depth-wise Separable Convolution Pruning for Maximizing GPU Parallelism.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs.

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

GUMSO: Gating Unnecessary On-Chip Memory Slices for Power Optimization on GPUs.

Seunghyun Jin

Hyunwuk Lee

Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

AirGun: Adaptive Granularity Quantization for Accelerating Large Language Models.

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

MOSQ: Accelerating Classical Simulation of UCCSD Ansatz Circuits using Merged Operation.

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

Barber: Balancing Thermal Relaxation Deviations of NISQ Programs by Exploiting Bit-Inverted Circuits.

Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024

Geneva: A Dynamic Confluence of Speculative Execution and In-Order Commitment Windows.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

REPrune: Channel Pruning via Kernel Representative Selection.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Recompiling QAOA Circuits on Various Rotational Directions.

Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023

A convertible neural processor supporting adaptive quantization for real-time neural networks.

J. Syst. Archit., December, 2023

FLIXR: Embedding Index Into Flash Translation Layer in SSDs.

IEEE Trans. Computers, 2023

MAD MAcce: Supporting Multiply-Add Operations for Democratizing Matrix-Multiplication Accelerators.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Exploiting Inherent Properties of Complex Numbers for Accelerating Complex Valued Neural Networks.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

McCore: A Holistic Management of High-Performance Heterogeneous Multicores.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

AESPA: Asynchronous Execution Scheme to Exploit Bank-Level Parallelism of Processing-in-Memory.

Hongju Kal

Chanyoung Yoo

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Early-Adaptor: An Adaptive Framework for Proactive UVM Memory Management.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors.

Dongho Ha

Hung-Wei Tseng

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs.

Dongho Ha

Yunho Oh

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Lightning Talk: Efficiency and Programmability of DNN Accelerators and GPUs.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Quixote: Improving Fidelity of Quantum Program by Independent Execution of Controlled Gates.

Enhyeok Jang

Seungwoo Choi

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Context Swap: Multi-PIM System Preventing Remote Memory Access for Large Embedding Model Acceleration.

Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

Balanced Column-Wise Block Pruning for Maximizing GPU Parallelism.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core.

Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022

CASH-RF: A Compiler-Assisted Hierarchical Register File in GPUs.

IEEE Embed. Syst. Lett., 2022

TEA-RC: Thread Context-Aware Register Cache for GPUs.

IEEE Access, 2022

Reconstructing Out-of-Order Issue Queue.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

2021

Two-Stage In-Storage Processing and Scheduling for Pattern Matching Applications.

IEEE Access, 2021

PIMCaffe: Functional Evaluation of a Machine Learning Framework for In-Memory Neural Processing Unit.

IEEE Access, 2021

Chapter Six - Deep learning with GPUs.

Adv. Comput., 2021

QoS-Aware Scheduling for Cellular Networks Using Deep Reinforcement Learning.

Jonathan Robert Malin

Gun Ko

Proceedings of the Network and Parallel Computing, 2021

SPACE: Locality-Aware Processing in Heterogeneous Memory for Personalized Recommendations.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

2020

REACT: Scalable and High-Performance Regular Expression Pattern Matching Accelerator for In-Storage Processing.

IEEE Trans. Parallel Distributed Syst., 2020

Hi-End: Hierarchical, Endurance-Aware STT-MRAM-Based Register File for Energy-Efficient GPUs.

IEEE Access, 2020

Duplo: Lifting Redundant Memory Accesses of Deep Neural Networks for GPU Tensor Cores.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Check-In: In-Storage Checkpointing for Key-Value Store System Leveraging Flash-Based SSDs.

Joohyeong Yoon

Won Seob Jeong

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

CASINO Core Microarchitecture: Generating Out-of-Order Schedules Using Cascaded In-Order Scheduling Windows.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019

Fast CU Depth Decision for HEVC Using Neural Networks.

Kyungah Kim

IEEE Trans. Circuits Syst. Video Technol., 2019

Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs.

IEEE Trans. Computers, 2019

OverCome: Coarse-Grained Instruction Commit with Handover Register Renaming.

IEEE Trans. Computers, 2019

Contents-aware partitioning algorithm for parallel high efficiency video coding.

Kyungah Kim

Multim. Tools Appl., 2019

Linebacker: preserving victim cache lines in idle register files of GPUs.

Proceedings of the 46th International Symposium on Computer Architecture, 2019

Efficient Dilated-Winograd Convolutional Neural Networks.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Access Characteristic-based Cache Replacement Policy in an SSD.

Joohyeong Yoon

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

2018

Exploiting Pseudo-Quadtree Structure for Accelerating HEVC Spatial Resolution Downscaling Transcoder.

IEEE Trans. Multim., 2018

Architectural Protection of Application Privacy against Software and Physical Attacks in Untrusted Cloud Environment.

IEEE Trans. Cloud Comput., 2018

WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs.

IEEE Trans. Computers, 2018

Simultaneous and Speculative Thread Migration for Improving Energy Efficiency of Heterogeneous Core Architectures.

IEEE Trans. Computers, 2018

A semantic sensor mashup platform for Internet of Things.

Sungkwang Eom

Kyong-Ho Lee

Proceedings of the 4th IEEE World Forum on Internet of Things, 2018

FineReg: Fine-Grained Register File Management for Augmenting GPU Throughput.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs.

Keunsoo Kim

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs.

IEEE Trans. Parallel Distributed Syst., 2017

Improving Energy Efficiency of GPUs through Data Compression and Compressed Execution.

IEEE Trans. Computers, 2017

Dynamic Load Balancing of Dispatch Scheduling for Solid State Disks.

Myung Hyun Jo

IEEE Trans. Computers, 2017

An adaptive plan-based approach to integrating semantic streams with remote RDF data.

J. Inf. Sci., 2017

Parallel in-order execution architecture for low-power processor.

Kyungmin Lee

Ipoom Jeong

Proceedings of the International SoC Design Conference, 2017

Characterizing convolutional neural network workloads on a detailed GPU simulator.

Proceedings of the International SoC Design Conference, 2017

Access Pattern-Aware Cache Management for Improving Data Utilization in GPU.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016

Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph.

IEEE Trans. Circuits Syst. Video Technol., 2016

Parallel GPU Architecture Simulation Framework Exploiting Architectural-Level Parallelism with Timing Error Prediction.

Sangpil Lee

IEEE Trans. Computers, 2016

Server side, play buffer based quality control for adaptive media streaming.

Keunsoo Kim

Benjamin Y. Cho

Multim. Tools Appl., 2016

Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit.

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Warped-Slicer: Efficient Intra-SM Slicing through Dynamic Resource Partitioning for GPU Multiprogramming.

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs.

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Warped-preexecution: A GPU pre-execution approach for improving latency hiding.

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015

Dynamic Load Balancing of Parallel SURF with Vertical Partitioning.

IEEE Trans. Parallel Distributed Syst., 2015

Network Variation and Fault Tolerant Performance Acceleration in Mobile Devices with Simultaneous Remote Execution.

IEEE Trans. Computers, 2015

A Performance-Energy Model to Evaluate Single Thread Execution Acceleration.

IEEE Comput. Archit. Lett., 2015

Proactive Plan-Based Continuous Query Processing over Diverse SPARQL Endpoints.

Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2015

DRAW: investigating benefits of adaptive fetch group size on GPU.

Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

A frequency scaling model for energy efficient DVFS designs based on circuit delay optimization.

Ki Bum Chun

Proceedings of the International Symposium on Consumer Electronics, 2015

Warped-compression: enabling power efficient GPUs through register compression.

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Complex Sensor Mashups for Linking Sensors and Formula-Based Knowledge Bases.

Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration, 2015

An accelerated separable median filter with sorting networks.

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

True motion compensation with feature detection for frame rate up-conversion.

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Integrity Protection for Big Data Processing with Dynamic Redundancy Computation.

Proceedings of the 2015 IEEE International Conference on Autonomic Computing, 2015

Contention-Free Fair Queuing for High-Speed Storage with RAID-0 Architecture.

Myung Hyun Jo

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Enhancing Software Dependability and Security with Hardware Supported Instruction Address Space Randomization.

Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

Another Look at Secure Big Data Processing: Formal Framework and a Potential Approach.

Proceedings of the 8th IEEE International Conference on Cloud Computing, 2015

2014

C-Lock: Energy Efficient Synchronization for Embedded Multicore Systems.

IEEE Trans. Computers, 2014

Complexity-Effective Contention Management with Dynamic Backoff for Transactional Memory Systems.

IEEE Trans. Computers, 2014

Exploiting Implementation Diversity and Partial Connection of Routers in Application-Specific Network-on-Chip Topology Synthesis.

Minje Jun

Eui-Young Chung

IEEE Trans. Computers, 2014

A Malicious Pattern Detection Engine for Embedded Security Systems in the Internet of Things.

Deokho Kim

Sensors, 2014

Boosting CUDA Applications with CPU-GPU Hybrid Computing.

Int. J. Parallel Program., 2014

Swarm Processor System: hardware process scheduler based energy efficient multi-core system.

IEICE Electron. Express, 2014

Architectural investigation of matrix data layout on multicore processors.

Minwoo Kim

Future Gener. Comput. Syst., 2014

Accelerating MapReduce framework on multi-GPU systems.

Clust. Comput., 2014

LUT based secure cloud computing - An implementation using FPGAs.

Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs, 2014

Workload synthesis: Generating benchmark workloads from statistical execution profile.

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Accelerating gesture recognition algorithm using coarse grained reconfigurable architectures.

Proceedings of the International Conference on Audio, 2014

Hyper threading-aware Virtual Machine migration.

Chungmu Oh

Proceedings of the International Conference on Electronics, Information and Communications, 2014

Development of efficient VCPU pinning mechanism in Xen.

Kyung Yoon Min

Seung Hun Kim

Proceedings of the International Conference on Electronics, Information and Communications, 2014

Multicore speedup models using frequency scaling with fixed power budget.

Seungwon Lee

Seung Hun Kim

Proceedings of the International Conference on Electronics, Information and Communications, 2014

2013

Design and evaluation of random linear network coding Accelerators on FPGAs.

ACM Trans. Embed. Comput. Syst., 2013

Importance of Coherence Protocols with Network Applications on Multicore Processors.

Kyueun Yi

IEEE Trans. Computers, 2013

A Distributed Signature Detection Method for Detecting Intrusions in Sensor Systems.

Sensors, 2013

Parallelized sub-resource loading for web rendering engine.

J. Syst. Archit., 2013

Benefits of using parallelized non-progressive network coding.

Minwoo Kim

J. Netw. Comput. Appl., 2013

GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table.

Yunho Oh

Int. J. Parallel Program., 2013

Exploiting SIMD parallelism on dynamically partitioned parallel network coding for P2P systems.

Deokho Kim

Comput. Electr. Eng., 2013

Parallel GPU architecture simulation framework exploiting work allocation unit parallelism.

Sangpil Lee

Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Mark-Sharing: A Parallel Garbage Collection Algorithm for Low Synchronization Overhead.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

MGMR: Multi-GPU Based MapReduce.

Proceedings of the Grid and Pervasive Computing - 8th International Conference, 2013

2012

Offloading of media transcoding for high-quality multimedia services.

IEEE Trans. Consumer Electron., 2012

Reconfigurable and parallelized network coding decoder for VANETs.

Sunwoo Kim

Mob. Inf. Syst., 2012

Introducing the Extremely Heterogeneous Architecture.

Shaoshan Liu

Chen Liu

Alfredo Cristóbal-Salas

Christophe Cérin

Jian-Jun Han

J. Interconnect. Networks, 2012

An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units.

J. Inf. Process. Syst., 2012

Multi-Threading and Suffix Grouping on Massive Multiple Pattern Matching Algorithm.

Comput. J., 2012

Accelerated Network Coding with Dynamic Stream Decomposition on Graphics Processing Unit.

Sangpil Lee

Comput. J., 2012

Conflict Avoidance Scheduling Using Grouping List for Transactional Memory.

Dongmin Choi

Seung-Hun Kim

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Cooperative heterogeneous computing for parallel processing on CPU/GPU hybrids.

Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures, 2012

2011

Network Coding on Heterogeneous Multi-Core Processors for Wireless Sensor Networks.

Deokho Kim

Sensors, 2011

A Novel Sequential Tree Algorithm Based on Scoreboard for MPI Broadcast Communication.

IEICE Trans. Inf. Syst., 2011

A Low-Cost Standard Mode MPI Hardware Unit for Embedded MPSoC.

IEICE Trans. Inf. Syst., 2011

2010

On Improving Parallelized Network Coding with Dynamic Partitioning.

Joon-Sang Park

IEEE Trans. Parallel Distributed Syst., 2010

Multithreaded pattern matching algorithm with data rearrangement.

Seung-Hun Kim

IEICE Electron. Express, 2010

Hardware implementation of a tessellation accelerator for the OpenVG standard.

IEICE Electron. Express, 2010

Implementing FFT using SPMD style of OpenMP.

Proceedings of the International Conference on Networked Computing and Advanced Information Management, 2010

FPGA implementation of highly parallelized decoder logic for network coding (abstract only).

Sunwoo Kim

Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

2009

A complexity-effective microprocessor design with decoupled dispatch queues and prefetching.

Parallel Comput., 2009

Efficient Parallelized Network Coding for P2P File Sharing Applications.

Joon-Sang Park

Proceedings of the Advances in Grid and Pervasive Computing, 4th International Conference, 2009

Fully Pipelined Hardware Implementation of 128-Bit SEED Block Cipher Algorithm.

Proceedings of the Reconfigurable Computing: Architectures, 2009

2008

A low-complexity microprocessor design with speculative pre-execution.

J. Syst. Archit., 2008

Efficient peer-to-peer file sharing using network coding in MANET.

J. Commun. Networks, 2008

Delay Analysis of Car-to-Car Reliable Data Delivery Strategies Based on Data Mulling with Network Coding.

IEICE Trans. Inf. Syst., 2008

Simultaneous thin-thread processors for low-power embedded systems.

IEICE Electron. Express, 2008

2006

Design and evaluation of a hierarchical decoupled architecture.

J. Supercomput., 2006

Speculative pre-execution assisted by compiler (SPEAR).

J. Parallel Distributed Comput., 2006

Design and Effectiveness of Small-Sized Decoupled Dispatch Queues.

Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

2005

Techniques to Improve Performance Beyond Pipelining: Superpipelining, Superscalar, and VLIW.

Jung-Yup Kang

Adv. Comput., 2005

A Low-Complexity Issue Queue Design with Speculative Pre-execution.

Proceedings of the High Performance Computing, 2005

2004

SPEAR: A Hybrid Model for Speculative Pre-Execution.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003

HiDISC: A Decoupled Architecture for Data-Intensive Application.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Compiler Support for Dynamic Speculative Pre-Execution.
