Jaejin Lee

Orcid: 0000-0003-4638-8170

According to our database, Jaejin Lee authored at least 145 papers between 1995 and 2024.

Bibliography

2024
Attention-based Reinforcement Learning for Combinatorial Optimization: Application to Job Shop Scheduling Problem.
CoRR, 2024

2023
Sparse Code With Minimum Hamming Distance of Three for Spin-Torque Transfer Magnetic Random Access Memory.
IEEE Access, 2023

Improving Bit-Error-Rate Performance Using Modulation Coding Techniques for Spin-Torque Transfer Magnetic Random Access Memory.
IEEE Access, 2023

UDC-SIT: A Real-World Dataset for Under-Display Cameras.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Hybrid CUDA Unified Memory Management in Fully Homomorphic Encryption Workloads.
Proceedings of the 30th IEEE International Conference on High Performance Computing, 2023

DeepUM: Tensor Migration and Prefetching in Unified Memory.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
SIRTEM: Spatially Informed Rapid Testing for Epidemic Modeling and Response to COVID-19.
ACM Trans. Spatial Algorithms Syst., 2022

FARNN: FPGA-GPU Hybrid Acceleration Platform for Recurrent Neural Networks.
IEEE Trans. Parallel Distributed Syst., 2022

Lightweight Soft Error Resilience for In-Order Cores.
CoRR, 2022

Parallel Detection Based on a Generalized Partial Response Target for Staggered Bit-Patterned Media Recording Systems.
IEEE Access, 2022

SnuQS: scaling quantum circuit simulation using storage devices.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

SnuHPL: high performance LINPACK for heterogeneous GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021
Photovoltaic Cell With Built-In Antenna for Internet of Things Applications.
IEEE Access, 2021

DeepCuts: a deep learning optimization framework for versatile GPU workloads.
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021

Turnpike: Lightweight Soft Error Resilience for In-Order Cores.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

SnuRHAC: A Runtime for Heterogeneous Accelerator Clusters with CUDA Unified Memory.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

2020
Introduction to the Special Issue on PPoPP 2017 (Part 2).
ACM Trans. Parallel Comput., 2020

A 1.1-V 10-nm Class 6.4-Gb/s/Pin 16-Gb DDR5 SDRAM With a Phase Rotator-ILO DLL, High-Speed SerDes, and DFE/FFE Equalization Scheme for Rx/Tx.
IEEE J. Solid State Circuits, 2020

4-ary 14/16 modulation code for reducing two-dimensional inter-symbol interference.
IET Commun., 2020

CyCNN: A Rotation Invariant CNN using Polar Mapping and Cylindrical Convolution Layers.
CoRR, 2020

Overlapping host-to-device copy and computation using hidden unified memory.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Compiler-directed soft error resilience for lightweight GPU register file protection.
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

SOFF: An OpenCL High-Level Synthesis Framework for FPGAs.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
Introduction to the Special Issue on PPoPP 2017 (Part 1).
ACM Trans. Parallel Comput., 2019

HIPS 2019 Keynote.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

SNU-NPB 2019: Parallelizing and Optimizing NPB in OpenCL and CUDA for Modern GPUs.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

A Practical Model for Optimal Placement of Virtual Network Functions.
Proceedings of the 33rd International Conference on Information Networking, 2019

FA3C: FPGA-Accelerated Deep Reinforcement Learning.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
An Auto-Tuner for OpenCL Work-Group Size on GPUs.
IEEE Trans. Parallel Distributed Syst., 2018

Measurement and Analysis of Electric Signal Transmission Using Human Body as Medium for WBAN Applications.
IEEE Trans. Instrum. Meas., 2018

Transparent GPU memory management for DNNs.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Interpixel interference mitigation using differential coding in vehicular visible light communication based image sensor.
Proceedings of the 2018 International Conference on Information Networking, 2018

2017
High-Bandwidth Memory (HBM) Test Challenges and Solutions.
IEEE Des. Test, 2017

POSTER: MAPA: An Automatic Memory Access Pattern Analyzer for GPU Applications.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Memory-Access-Pattern Analysis Techniques for OpenCL Kernels.
Proceedings of the Languages and Compilers for Parallel Computing, 2017

Performance analysis of CNN frameworks for GPUs.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

2016
What We Can Learn from the Data: A Multiple-Case Study Examining Behavior Patterns by Students with Different Characteristics in Using a Serious Game.
Technol. Knowl. Learn., 2016

Efficient Checkpointing of Live Virtual Machines.
IEEE Trans. Computers, 2016

Elimination of two-dimensional intersymbol interference through the use of a 9/12 two-dimensional modulation code.
IET Commun., 2016

Translating OpenMP device constructs to OpenCL using unnecessary data transfer elimination.
Proceedings of the International Conference for High Performance Computing, 2016

A distributed OpenCL framework using redundant computation and data replication.
Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2016

Burst error sensing scheme for page-oriented data.
Proceedings of the International Conference on Information and Communication Technology Convergence, 2016

PIPSEA: A Practical IPsec Gateway on Embedded APUs.
Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016

2015
Accelerating LINPACK with MPI-OpenCL on Clusters of Multi-GPU Nodes.
IEEE Trans. Parallel Distributed Syst., 2015

A Performance Model for GPUs with Caches.
IEEE Trans. Parallel Distributed Syst., 2015

A 1.2 V 8 Gb 8-Channel 128 GB/s High-Bandwidth Memory (HBM) Stacked DRAM With Effective I/O Test Circuits.
IEEE J. Solid State Circuits, 2015

Bridging OpenCL and CUDA: a comparative analysis and translation.
Proceedings of the International Conference for High Performance Computing, 2015

Scheduling for Better Energy Efficiency on Many-Core Chips.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2015

An indoor environment VLC-based localization algorithm for handset devices.
Proceedings of the Seventh International Conference on Ubiquitous and Future Networks, 2015

Power allocation scheme for D2D communications in an OFDM-based cellular system.
Proceedings of the 2015 International Conference on Information Networking, 2015

Inter-symbol interference compensation for bit patterned media recording storage.
Proceedings of the 2015 International Conference on Information Networking, 2015

Improving SOVA output using extrinsic informations for bit patterned media recording.
Proceedings of the IEEE International Conference on Consumer Electronics, 2015

Design considerations of HBM stacked DRAM and the memory architecture extension.
Proceedings of the 2015 IEEE Custom Integrated Circuits Conference, 2015

2014
An exact measurement and repair circuit of TSV connections for 128GB/s high-bandwidth memory (HBM) stacked DRAM.
Proceedings of the Symposium on VLSI Circuits, 2014

OpenCL framework for ARM processors with NEON support.
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, 2014

Lightweight and block-level concurrent sweeping for JavaScript garbage collection.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2014

25.3 A 1.35V 5.0Gb/s/pin GDDR5M with 5.4mW standby power and an error-adaptive duty-cycle corrector.
Proceedings of the 2014 IEEE International Conference on Solid-State Circuits Conference, 2014

25.2 A 1.2V 8Gb 8-channel 128GB/s high-bandwidth memory (HBM) stacked DRAM with effective microbump I/O test methods using 29nm process and TSV.
Proceedings of the 2014 IEEE International Conference on Solid-State Circuits Conference, 2014

23Gbps 9.4pJ/bit 80/100GHz band CMOS transceiver with on-board antenna for short-range communication.
Proceedings of the IEEE Asian Solid-State Circuits Conference, 2014

Versatile and scalable parallel histogram construction.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
Modulation coding for flash memories.
Proceedings of the International Conference on Computing, Networking and Communications, 2013

An OpenCL optimizing compiler for reconfigurable processors.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

Automatic OpenCL work-group size selection for multicore CPUs.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Automatic code overlay generation and partially redundant code fetch elimination.
ACM Trans. Archit. Code Optim., 2012

OpenCL as a unified programming model for heterogeneous CPU/GPU clusters.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

High-efficiency ferrite meander antenna (HEMA) for LTE applications.
Proceedings of the 31st IEEE Military Communications Conference, 2012

SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters.
Proceedings of the International Conference on Supercomputing, 2012

An automatic code overlaying technique for multicores with explicitly-managed memory hierarchies.
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

2011
Demand Paging Techniques for Flash Memory Using Compiler Post-Pass Optimizations.
ACM Trans. Embed. Comput. Syst., 2011

Coupling canceller maximum-likelihood (CCML) detection for multi-level cell NAND flash memory.
IEEE Trans. Consumer Electron., 2011

The role of tasks and epistemological beliefs in online peer questioning.
Comput. Educ., 2011

Fast and space-efficient virtual machine checkpointing.
Proceedings of the 7th International Conference on Virtual Execution Environments, 2011

Achieving a single compute device image in OpenCL for multiple GPUs.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

An instruction-scheduling-aware data partitioning technique for coarse-grained reconfigurable architectures.
Proceedings of the ACM SIGPLAN/SIGBED 2011 conference on Languages, 2011

OpenCL as a Programming Model for GPU Clusters.
Proceedings of the Languages and Compilers for Parallel Computing, 2011

Performance characterization of the NAS Parallel Benchmarks in OpenCL.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

A low-power small-area open loop digital DLL for 2.2Gb/s/pin 2Gb DDR3 SDRAM.
Proceedings of the IEEE Asian Solid-State Circuits Conference, 2011

An efficient software shared virtual memory for the single-chip cloud computer.
Proceedings of the APSys '11 Asia Pacific Workshop on Systems, 2011

SFMalloc: A Lock-Free and Mostly Synchronization-Free Dynamic Memory Allocator for Manycores.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

A Software-Managed Coherent Memory Architecture for Manycores.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

An OpenCL Framework for Homogeneous Manycores with No Hardware Cache Coherence.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Hydra: A Block-Mapped Parallel Flash Memory Solid-State Disk Architecture.
IEEE Trans. Computers, 2010

Scratchpad Memory Management Techniques for Code in Embedded Systems without an MMU.
IEEE Trans. Computers, 2010

Adaptive execution techniques of parallel programs for multiprocessors.
J. Parallel Distributed Comput., 2010

COMIC++: A software SVM system for heterogeneous multicore accelerator clusters.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Parallelizing the H.264 decoder on the Cell BE architecture.
Proceedings of the 10th International conference on Embedded software, 2010

A software-SVM-based transactional memory for multicore accelerator architectures with local memory.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

An OpenCL framework for heterogeneous multicores with local memory.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Prefetching with Helper Threads for Loosely Coupled Multiprocessor Systems.
IEEE Trans. Parallel Distributed Syst., 2009

High recording density hard disk channel equalization using a bilinear recursive polynomial model.
IEICE Electron. Express, 2009

Design and implementation of software-managed caches for multicores with local memory.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

2008
Dynamic scratchpad memory management for code in portable systems with an MMU.
ACM Trans. Embed. Comput. Syst., 2008

Message-passing iterative decoding between detector and RSC code decoder for PMR channel.
IEEE Trans. Consumer Electron., 2008

A Practical Improvement to the Partial Redundancy Elimination in SSA Form.
J. Comput. Sci. Eng., 2008

FaCSim: a fast and cycle-accurate architecture simulator for embedded systems.
Proceedings of the 2008 ACM SIGPLAN/SIGBED Conference on Languages, 2008

Scratchpad memory management in a multitasking environment.
Proceedings of the 8th ACM & IEEE International conference on Embedded software, 2008

COMIC: a coherent shared memory interface for Cell BE.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
Selective code transformation for dual instruction set processors.
ACM Trans. Embed. Comput. Syst., 2007

Dynamic data scratchpad memory management for a memory subsystem with an MMU.
Proceedings of the 2007 ACM SIGPLAN/SIGBED Conference on Languages, 2007

2006
An algorithmic sign-reversing involution for special rim-hook tableaux.
J. Algorithms, 2006

Helper thread prefetching for loosely-coupled multiprocessor systems.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Scratchpad memory management for portable systems with a memory management unit.
Proceedings of the 6th ACM & IEEE International conference on Embedded software, 2006

A dynamic code placement technique for scratchpad memory using postpass optimization.
Proceedings of the 2006 International Conference on Compilers, 2006

2005
Error control scheme for high-speed DVD systems.
IEEE Trans. Consumer Electron., 2005

Eliminating Conflict Misses Using Prime Number-Based Cache Indexing.
IEEE Trans. Computers, 2005

Compiler techniques for high performance sequentially consistent Java programs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

Adaptive execution techniques for SMT multiprocessor architectures.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

Evaluating the Impact of Thread Escape Analysis on a Memory Consistency Model-Aware Compiler.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

2004
A compiler for multiple memory models.
Concurr. Comput. Pract. Exp., 2004

A Flexible Tradeoff Between Code Size and WCET Using a Dual Instruction Set Processor.
Proceedings of the Software and Compilers for Embedded Systems, 8th International Workshop, 2004

Using Prime Numbers for Cache Indexing to Eliminate Conflict Misses.
Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), 2004

Compiler-assisted demand paging for embedded systems with flash memory.
Proceedings of the EMSOFT 2004, 2004

2003
Correlation Prefetching with a User-Level Memory Thread.
IEEE Trans. Parallel Distributed Syst., 2003

A Flexible Tradeoff between Code Size and WCET Employing Dual Instruction Set Processors.
Proceedings of the 3rd International Workshop on Worst-Case Execution Time Analysis, 2003

Code Generation for a Dual Instruction Set Processor Based on Selective Code Transformation.
Proceedings of the Software and Compilers for Embedded Systems, 7th International Workshop, 2003

Automatic fence insertion for shared memory multiprocessing.
Proceedings of the 17th Annual International Conference on Supercomputing, 2003

Design of VDSL Networks for the High Speed Internet Services.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

2002
Modified maximum a posteriori decoding algorithm using a priori probability ratio as threshold.
Proceedings of the 2002 IEEE Wireless Communications and Networking Conference Record, 2002

Radio location using decision feedback method.
Proceedings of the 2002 IEEE Wireless Communications and Networking Conference Record, 2002

Optimizing the Java Piped I/O Stream Library for Performance.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

Automatic Implementation of Programming Language Consistency Models.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

Adaptively Increasing Performance and Scalability of Automatically Parallelized Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

The Pensieve Project: A Compiler Infrastructure for Memory Models.
Proceedings of the International Symposium on Parallel Architectures, 2002

Using a User-Level Memory Thread for Correlation Prefetching.
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002

2001
Automatic Code Mapping on an Intelligent Memory Architecture.
IEEE Trans. Computers, 2001

Hiding Relaxed Memory Consistency with a Compiler.
IEEE Trans. Computers, 2001

A convergence of geometric mean for T-related fuzzy numbers.
Fuzzy Sets Syst., 2001

Information extraction method without original image using turbo code.
Proceedings of the 2001 International Conference on Image Processing, 2001

Automatically Mapping Code on an Intelligent Memory Architecture.
Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

2000
A watermarking sequence using parities of error control coding for image authentication and correction.
IEEE Trans. Consumer Electron., 2000

Iterative equalization-detection algorithm using trellis-based equalizer and RSC code for high density optical recording systems.
IEEE Trans. Consumer Electron., 2000

On the law of large numbers for mutually T-related L-R fuzzy numbers.
Fuzzy Sets Syst., 2000

Adaptively Mapping Code in an Intelligent Memory Architecture.
Proceedings of the Intelligent Memory Systems, Second International Workshop, 2000

Image Integrity and Correction using Parities of Error Control Coding.
Proceedings of the 2000 IEEE International Conference on Multimedia and Expo, 2000

Hiding Relaxed Memory Consistency with Compilers.
Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00), 2000

1999
Compilation Techniques for Explicitly Parallel Programs
PhD thesis, 1999

Basic Compiler Algorithms for Parallel Programs.
Proceedings of the 1999 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'99), 1999

1998
A Constant Propagation Algorithm for Explicitly Parallel Programs.
Int. J. Parallel Program., 1998

Capacity of M-ary multitrack RLL codes for storage channels.
Proceedings of the 1998 IEEE International Conference on Communications, 1998

1997
Concurrent Static Single Assignment Form and Constant Propagation for Explicitly Parallel Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

1996
Parallel Programming with Polaris.
Computer, 1996

Restructuring Programs for High-Speed Computers with Polaris.
Proceedings of the 1996 International Conference on Parallel Processing Workshop, 1996

1995
STeP: The Stanford Temporal Prover.
Proceedings of the TAPSOFT'95: Theory and Practice of Software Development, 1995

