Seon Wook Kim

Orcid: 0000-0001-6555-1741

According to our database, Seon Wook Kim authored at least 87 papers between 1993 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Optimal Model Partitioning with Low-Overhead Profiling on the PIM-based Platform for Deep Learning Inference.
ACM Trans. Design Autom. Electr. Syst., March, 2024

Low Overhead PIM-to-PIM Communication on PCIe-based Multi-PIM Platforms for Executing Large-Scale AI Models.
Proceedings of the International Conference on Electronics, Information, and Communication, 2024

Supporting Multi-Channels to DRAM-based PIM Execution for Boosting the Performance.
Proceedings of the International Conference on Electronics, Information, and Communication, 2024

2023
PISA-DMA: Processing-in-Memory Instruction Set Architecture Using DMA.
IEEE Access, 2023

BL-PIM: Varying the Burst Length to Realize the All-Bank Performance and Minimize the Multi-Workload Interference for in-DRAM PIM.
IEEE Access, 2023

2022
Low-overhead inverted LUT design for bounded DNN activation functions on floating-point vector ALUs.
Microprocess. Microsystems, September, 2022

Silent-PIM: Realizing the Processing-in-Memory Computing With Standard Memory Requests.
IEEE Trans. Parallel Distributed Syst., 2022

Achieving the Performance of All-Bank In-DRAM PIM With Standard Memory Interface: Memory-Computation Decoupling.
IEEE Access, 2022

Extending the ONNX Runtime Framework for the Processing-in-Memory Execution.
Proceedings of the International Conference on Electronics, Information, and Communication, 2022

2021
Monolithic 3D stacked multiply-accumulate units.
Integr., 2021

Tile-based Code Generation for Efficiently Accessing to Scratchpad Memory.
Proceedings of the International Conference on Electronics, Information, and Communication, 2021

Applying Piecewise Linear Approximation for DNN Non-Linear Activation Functions to Bfloat16 MACs.
Proceedings of the International Conference on Electronics, Information, and Communication, 2021

2020
Generating Representative Test Sequences from Real Workload for Minimizing DRAM Verification Overhead.
ACM Trans. Design Autom. Electr. Syst., 2020

2019
Fault Tolerance Technique Offlining Faulty Blocks by Heap Memory Management.
ACM Trans. Design Autom. Electr. Syst., 2019

Reducing DRAM Refresh Rate Using Retention Time Aware Universal Hashing Redundancy Repair.
ACM Trans. Design Autom. Electr. Syst., 2019

Design and Implementation of Display Stream Compression Decoder With Line Buffer Optimization.
IEEE Trans. Consumer Electron., 2019

Design of Processing-"Inside"-Memory Optimized for DRAM Behaviors.
IEEE Access, 2019

Epsim: A Scalable and Parallel Marssx86 Simulator With Exploiting Epoch-Based Execution.
IEEE Access, 2019

Exploring the Relation between Monolithic 3D L1 GPU Cache Capacity and Warp Scheduling Efficiency.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

2018
Recovering from Biased Distribution of Faulty Cells in Memory by Reorganizing Replacement Regions through Universal Hashing.
ACM Trans. Design Autom. Electr. Syst., 2018

Energy-Efficient DRAM Selective Refresh Technique with Page Residence in a Memory Hierarchy of Hardware-Managed TLB.
IEICE Trans. Electron., 2018

2017
Content-Aware Bit Shuffling for Maximizing PCM Endurance.
ACM Trans. Design Autom. Electr. Syst., 2017

P-DRAMSim2: Exploiting thread-level parallelism in DRAMSim2.
IEICE Electron. Express, 2017

A decoupled bit shifting technique using data encoding/decoding for DRAM redundancy repair.
IEICE Electron. Express, 2017

2016
JavaScript Parallelizing Compiler for Exploiting Parallelism from Data-Parallel HTML5 Applications.
ACM Trans. Archit. Code Optim., 2016

High-throughput low-area design of AES using constant binary matrix-vector multiplication.
Microprocess. Microsystems, 2016

2015
Lowering Minimum Supply Voltage for Power-Efficient Cache Design by Exploiting Data Redundancy.
ACM Trans. Design Autom. Electr. Syst., 2015

O2WebCL: an automatic OpenCL-to-WebCL translator for high performance web computing.
J. Supercomput., 2015

D²ART: Direct Data Accessing from Passive RFID Tag for infra-less, contact-less, and battery-less pervasive computing.
Microprocess. Microsystems, 2015

2014
Web-based image processing using JavaScript and WebCL.
Proceedings of the IEEE International Conference on Consumer Electronics, 2014

Performance comparison of GCC and LLVM on the EISC processor.
Proceedings of the International Conference on Electronics, Information and Communications, 2014

Performance evaluation of GCC 4.7.1 on EISC.
Proceedings of the International Conference on Electronics, Information and Communications, 2014

2013
A Self-Calibrated DLL-Based Clock Generator for an Energy-Aware EISC Processor.
IEEE Trans. Very Large Scale Integr. Syst., 2013

DiSCo: Distributed Scalable Compilation Tool for Heavy Compilation Workload.
IEICE Trans. Inf. Syst., 2013

2012
Resource Efficient Implementation of Low Power MB-OFDM PHY Baseband Modem With Highly Parallel Architecture.
IEEE Trans. Very Large Scale Integr. Syst., 2012

AndroScope for detailed performance study of the android platform and its applications.
Proceedings of the IEEE International Conference on Consumer Electronics, 2012

2011
A Reconfigurable FIR Filter Architecture to Trade Off Filter Performance for Dynamic Power Consumption.
IEEE Trans. Very Large Scale Integr. Syst., 2011

A processor-based decoupled timing controller for flexible and low-cost 2D/3D plasma display panel design.
IEEE Trans. Consumer Electron., 2011

Applying frame layout to hardware design in FPGA for seamless support of cross calls in CPU-FPGA coupling architecture.
Microprocess. Microsystems, 2011

Runtime parallelization of legacy code on a transactional memory system.
Proceedings of the High Performance Embedded Architectures and Compilers, 2011

2010
A Novel Architecture for Block Interleaving Algorithm in MB-OFDM Using Mixed Radix System.
IEEE Trans. Very Large Scale Integr. Syst., 2010

Hierarchical data structure-based timing controller design for plasma display panels.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Support of cross calls between a microprocessor and FPGA in CPU-FPGA coupling architecture.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Design issues and optimization in DisplayPort link layer implementation.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2010

Design of ultra low power stream data receiver based on UHF passive RFID tag system.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2010

Implementation of x86 Binary-to-C Translator by Using GNU Tools.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

2009
A low-power baseband modem architecture for a mobile RFID reader.
J. Embed. Comput., 2009

2008
Virtual Memory and Buffer Storage.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

A Reconfigurable Processor Infrastructure for Accelerating Java Applications.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008

Applying passive RFID system to wireless headphones for extreme low power consumption.
Proceedings of the 45th Design Automation Conference, 2008

A DC-DC converter with a dual VCDL-based ADC and a self-calibrated DLL-based clock generator for an energy-aware EISC processor.
Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

2007
Performance Study of Anti-collision Algorithms for EPC-C1 Gen2 RFID Protocol.
Proceedings of the Information Networking. Towards Ubiquitous Networking and Services, 2007

Compiler Construction for Lockstep Execution of Multithreaded Processors.
Proceedings of the Seventh International Conference on Computer and Information Technology (CIT 2007), 2007

A Dataflow Analysis for Mode Set Optimization in DSP Instruction Sets.
Proceedings of the Seventh International Conference on Computer and Information Technology (CIT 2007), 2007

2006
Exploiting reference idempotency to reduce speculative storage overflow.
ACM Trans. Program. Lang. Syst., 2006

A New Energy x Delay-Aware Flip-Flop.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2006

Implementation of H.264/AVC decoder for mobile video applications.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Jaguar: a compiler infrastructure for Java reconfigurable computing.
Proceedings of the ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, 2006

A Multi-protocol Baseband Modem Processor for a Mobile RFID Reader.
Proceedings of the Embedded and Ubiquitous Computing, International Conference, 2006

Code Generation and Optimization for Java-to-C Compilers.
Proceedings of the Emerging Directions in Embedded and Ubiquitous Computing, 2006

Implementation of H.264/AVC decoder for mobile video applications.
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006

OpenMP Directive Extension for BlackFin 561 Dual Core Processor.
Proceedings of the Sixth International Conference on Computer and Information Technology (CIT 2006), 2006

2005
Study of OpenMP applications on the InfiniBand-based software distributed shared-memory system.
Parallel Comput., 2005

The distributed virtual shared-memory system based on the InfiniBand architecture.
J. Parallel Distributed Comput., 2005

Implementation of H.264/AVC baseline profile decoder for mobile video applications.
Proceedings of the 12th IEEE International Conference on Electronics, 2005

2004
Charge-Sharing-Problem Reduced Split-Path Domino Logic.
Proceedings of the 17th International Conference on VLSI Design (VLSI Design 2004), 2004

A Distributed-Request-Based DiffServ CAC for Seamless Fast-Handoff in Mobile Internet.
Proceedings of the Quality of Service in the Emerging Networking Panorama: Fifth International Workshop on Quality of Future Internet Services, 2004

Implementation of the Software Distributed Shared-Memory System on the InfiniBand.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

Implementation of a low power motion detection camera processor using a CMOS Image Sensor.
Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

Characterization of OpenMP Applications on the InfiniBand-Based Distributed Virtual Shared Memory System.
Proceedings of the High Performance Computing, 2004

2003
OpenMP and Compilation Issue in Embedded Applications.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

Parallelizing Parallel Rollout Algorithm for Solving Markov Decision Processes.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

Dynamic Instrumentation of Large-Scale MPI and OpenMP Applications.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002
VGV: Supporting Performance Analysis of Object-Oriented Mixed MPI/OpenMP Parallel Applications.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Parallel programming environment for OpenMP.
Sci. Program., 2001

Portable Compilers for OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Reference idempotency analysis: a framework for optimizing speculative execution.
Proceedings of the 2001 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'01), 2001

The Structure of a Compiler for Explicit and Implicit Parallelism.
Proceedings of the Languages and Compilers for Parallel Computing, 2001

Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor.
Proceedings of the 15th international conference on Supercomputing, 2001

2000
Where Does the Speedup Go: Quantitative Modeling of Performance Losses in Shared-Memory Programs.
Parallel Process. Lett., 2000

A Performance Advisor Tool for Shared-Memory Parallel Programming.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

Quantifying Differences between OpenMP and MPI Using a Large-Scale Application Suite.
Proceedings of the High Performance Computing, Third International Symposium, 2000

Compiler Techniques for Energy Saving in Instruction Caches of Speculative Parallel Microarchitectures.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

1999
Compiling for Speculative Architectures.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

1998
Compiler-Based Tools for Analyzing Parallel Programs.
Parallel Comput., 1998

1995
An Extended Fuzzy Clustering Algorithm and its Application.
J. Circuits Syst. Comput., 1995

1993
Full Adder-based Inner Product Step Processors for Residue and Quadratic Residue Number Systems.
Proceedings of the 1993 IEEE International Symposium on Circuits and Systems, 1993

