Won Woo Ro

According to our database, Won Woo Ro
  • authored at least 73 papers between 2003 and 2017.
  • has a "Dijkstra number" of four.

Bibliography

2017
Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs.
IEEE Trans. Parallel Distrib. Syst., 2017

Improving Energy Efficiency of GPUs through Data Compression and Compressed Execution.
IEEE Trans. Computers, 2017

Dynamic Load Balancing of Dispatch Scheduling for Solid State Disks.
IEEE Trans. Computers, 2017

Access Pattern-Aware Cache Management for Improving Data Utilization in GPU.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016
Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph.
IEEE Trans. Circuits Syst. Video Techn., 2016

Parallel GPU Architecture Simulation Framework Exploiting Architectural-Level Parallelism with Timing Error Prediction.
IEEE Trans. Computers, 2016

Server side, play buffer based quality control for adaptive media streaming.
Multimedia Tools Appl., 2016

Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Warped-Slicer: Efficient Intra-SM Slicing through Dynamic Resource Partitioning for GPU Multiprogramming.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Warped-preexecution: A GPU pre-execution approach for improving latency hiding.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Dynamic Load Balancing of Parallel SURF with Vertical Partitioning.
IEEE Trans. Parallel Distrib. Syst., 2015

Network Variation and Fault Tolerant Performance Acceleration in Mobile Devices with Simultaneous Remote Execution.
IEEE Trans. Computers, 2015

A Performance-Energy Model to Evaluate Single Thread Execution Acceleration.
Computer Architecture Letters, 2015

DRAW: investigating benefits of adaptive fetch group size on GPU.
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

Warped-compression: enabling power efficient GPUs through register compression.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

An accelerated separable median filter with sorting networks.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

True motion compensation with feature detection for frame rate up-conversion.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Integrity Protection for Big Data Processing with Dynamic Redundancy Computation.
Proceedings of the 2015 IEEE International Conference on Autonomic Computing, 2015

Contention-Free Fair Queuing for High-Speed Storage with RAID-0 Architecture.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Enhancing Software Dependability and Security with Hardware Supported Instruction Address Space Randomization.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

Another Look at Secure Big Data Processing: Formal Framework and a Potential Approach.
Proceedings of the 8th IEEE International Conference on Cloud Computing, 2015

2014
C-Lock: Energy Efficient Synchronization for Embedded Multicore Systems.
IEEE Trans. Computers, 2014

Complexity-Effective Contention Management with Dynamic Backoff for Transactional Memory Systems.
IEEE Trans. Computers, 2014

Exploiting Implementation Diversity and Partial Connection of Routers in Application-Specific Network-on-Chip Topology Synthesis.
IEEE Trans. Computers, 2014

A Malicious Pattern Detection Engine for Embedded Security Systems in the Internet of Things.
Sensors, 2014

Boosting CUDA Applications with CPU-GPU Hybrid Computing.
International Journal of Parallel Programming, 2014

Swarm Processor System: hardware process scheduler based energy efficient multi-core system.
IEICE Electronic Express, 2014

Architectural investigation of matrix data layout on multicore processors.
Future Generation Comp. Syst., 2014

Accelerating MapReduce framework on multi-GPU systems.
Cluster Computing, 2014

LUT based secure cloud computing - An implementation using FPGAs.
Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs, 2014

Workload synthesis: Generating benchmark workloads from statistical execution profile.
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

2013
Design and evaluation of random linear network coding Accelerators on FPGAs.
ACM Trans. Embedded Comput. Syst., 2013

Importance of Coherence Protocols with Network Applications on Multicore Processors.
IEEE Trans. Computers, 2013

A Distributed Signature Detection Method for Detecting Intrusions in Sensor Systems.
Sensors, 2013

Parallelized sub-resource loading for web rendering engine.
Journal of Systems Architecture - Embedded Systems Design, 2013

Benefits of using parallelized non-progressive network coding.
J. Network and Computer Applications, 2013

GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table.
International Journal of Parallel Programming, 2013

Exploiting SIMD parallelism on dynamically partitioned parallel network coding for P2P systems.
Computers & Electrical Engineering, 2013

Parallel GPU architecture simulation framework exploiting work allocation unit parallelism.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Mark-Sharing: A Parallel Garbage Collection Algorithm for Low Synchronization Overhead.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

MGMR: Multi-GPU Based MapReduce.
Proceedings of the Grid and Pervasive Computing - 8th International Conference, 2013

2012
Offloading of media transcoding for high-quality multimedia services.
IEEE Trans. Consumer Electronics, 2012

Reconfigurable and parallelized network coding decoder for VANETs.
Mobile Information Systems, 2012

Introducing the Extremely Heterogeneous Architecture.
Journal of Interconnection Networks, 2012

An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units.
JIPS, 2012

Multi-Threading and Suffix Grouping on Massive Multiple Pattern Matching Algorithm.
Comput. J., 2012

Accelerated Network Coding with Dynamic Stream Decomposition on Graphics Processing Unit.
Comput. J., 2012

Conflict Avoidance Scheduling Using Grouping List for Transactional Memory.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Cooperative heterogeneous computing for parallel processing on CPU/GPU hybrids.
Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures, 2012

2011
Network Coding on Heterogeneous Multi-Core Processors for Wireless Sensor Networks.
Sensors, 2011

A Novel Sequential Tree Algorithm Based on Scoreboard for MPI Broadcast Communication.
IEICE Transactions, 2011

A Low-Cost Standard Mode MPI Hardware Unit for Embedded MPSoC.
IEICE Transactions, 2011

2010
On Improving Parallelized Network Coding with Dynamic Partitioning.
IEEE Trans. Parallel Distrib. Syst., 2010

Multithreaded pattern matching algorithm with data rearrangement.
IEICE Electronic Express, 2010

Hardware implementation of a tessellation accelerator for the OpenVG standard.
IEICE Electronic Express, 2010

Implementing FFT using SPMD style of OpenMP.
Proceedings of the International Conference on Networked Computing and Advanced Information Management, 2010

FPGA implementation of highly parallelized decoder logic for network coding (abstract only).
Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, 2010

2009
A complexity-effective microprocessor design with decoupled dispatch queues and prefetching.
Parallel Computing, 2009

Efficient Parallelized Network Coding for P2P File Sharing Applications.
Proceedings of the Advances in Grid and Pervasive Computing, 4th International Conference, 2009

Fully Pipelined Hardware Implementation of 128-Bit SEED Block Cipher Algorithm.
Proceedings of the Reconfigurable Computing: Architectures, 2009

2008
A low-complexity microprocessor design with speculative pre-execution.
Journal of Systems Architecture - Embedded Systems Design, 2008

Efficient peer-to-peer file sharing using network coding in MANET.
Journal of Communications and Networks, 2008

Delay Analysis of Car-to-Car Reliable Data Delivery Strategies Based on Data Mulling with Network Coding.
IEICE Transactions, 2008

Simultaneous thin-thread processors for low-power embedded systems.
IEICE Electronic Express, 2008

2006
Design and evaluation of a hierarchical decoupled architecture.
The Journal of Supercomputing, 2006

Speculative pre-execution assisted by compiler (SPEAR).
J. Parallel Distrib. Comput., 2006

Design and Effectiveness of Small-Sized Decoupled Dispatch Queues.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

2005
Techniques to Improve Performance Beyond Pipelining: Superpipelining, Superscalar, and VLIW.
Advances in Computers, 2005

A Low-Complexity Issue Queue Design with Speculative Pre-execution.
Proceedings of the High Performance Computing, 2005

2004
SPEAR: A Hybrid Model for Speculative Pre-Execution.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003
HiDISC: A Decoupled Architecture for Data-Intensive Application.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Compiler Support for Dynamic Speculative Pre-Execution.
Proceedings of the 7th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-7 2003), 2003

