Jangwoo Kim

Orcid: 0000-0003-2193-5748

According to our database, Jangwoo Kim authored at least 70 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Fault-Tolerant Million Qubit-Scale Distributed Quantum Computer.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
STfusion: Fast and Flexible Multi-NN Execution Using Spatio-Temporal Block Fusion and Memory Management.
IEEE Trans. Computers, April, 2023

A Fast and Flexible FPGA-based Accelerator for Natural Language Processing Neural Networks.
ACM Trans. Archit. Code Optim., March, 2023

Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors.
Proceedings of the 24th International Middleware Conference, 2023

QIsim: Architecting 10+K Qubit QC Interfaces Toward Quantum Supremacy.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

F4T: A Fast and Flexible FPGA-based Full-stack TCP Acceleration Framework.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
DLS: A Fast and Flexible Neural Network Training System With Fine-grained Heterogeneous Device Orchestration.
IEEE Trans. Parallel Distributed Syst., 2022

SmartFVM: A Fast, Flexible, and Scalable Hardware-based Virtualization for Commodity Storage Devices.
ACM Trans. Storage, 2022

LSim: Fine-Grained Simulation Framework for Large-Scale Performance Evaluation.
IEEE Comput. Archit. Lett., 2022

3D-FPIM: An Extreme Energy-Efficient DNN Acceleration System Using 3D NAND Flash-Based In-Situ PIM Unit.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

XQsim: modeling cross-technology control processors for 10+K qubit quantum computers.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

NeuroSync: A Scalable and Accurate Brain Simulator Using Safe and Efficient Speculation.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

CryoWire: wire-driven microarchitecture designs for cryogenic computing.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022

2021
Performance Modeling and Practical Use Cases for Black-Box SSDs.
ACM Trans. Storage, 2021

Superconductor Computing for Neural Networks.
IEEE Micro, 2021

A Next-Generation Cryogenic Processor Architecture.
IEEE Micro, 2021

An accurate and fair evaluation methodology for SNN-based inferencing with full-stack hardware design space explorations.
Neurocomputing, 2021

A Fast and Flexible Hardware-based Virtualization Mechanism for Computational Storage Devices.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

DifuzzRTL: Differential Fuzz Testing to Find CPU Bugs.
Proceedings of the 42nd IEEE Symposium on Security and Privacy, 2021

UC-Check: Characterizing Micro-operation Caches in x86 Processors and Implications in Security and Performance.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

CryoGuard: A Near Refresh-Free Robust DRAM Design for Cryogenic Computing.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

NeuroEngine: a hardware-based event-driven simulation system for advanced brain-inspired computing.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
FVM: FPGA-assisted Virtual Device Emulation for Fast, Scalable, and Flexible Storage Virtualization.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

TrainBox: An Extreme-Scale Neural Network Training Server Architecture by Systematically Balancing Operations.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

SuperNPU: An Extremely Fast Neural Processing Unit Using Superconducting Logic Devices.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

CryoCore: A Fast and Dense Processor Architecture for Cryogenic Computing.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

A Multi-Neural Network Acceleration Architecture.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Scalable Multi-FPGA Acceleration for Large RNNs with Full Parallelism Levels.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

CryoCache: A Fast, Large, and Cost-Effective Cache Architecture for Cryogenic Computing.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
FlexLearn: Fast and Highly Efficient Brain Simulations Using Flexible On-Chip Learning.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

FIDR: A Scalable Storage System for Fine-Grain Inline Data Reduction with Efficient Memory Handling.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Cryogenic computer architecture modeling with memory-side case studies.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

MnnFast: a fast and scalable system architecture for memory-augmented neural networks.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

CIDR: A Cost-Effective In-Line Data Reduction System for Terabit-Per-Second Scale SSD Arrays.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019

Enforcing Last-Level Cache Partitioning through Memory Virtual Channels.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
DiagSim: Systematically Diagnosing Simulators for Healthy Simulations.
ACM Trans. Archit. Code Optim., 2018

SSD Performance Modeling Using Bottleneck Analysis.
IEEE Comput. Archit. Lett., 2018

A Scalable HW-Based Inline Deduplication for SSD Arrays.
IEEE Comput. Archit. Lett., 2018

DynaMix: Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

SSDcheck: Timely and Accurate Prediction of Irregular Behaviors in Black-Box SSDs.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

RpStacks-MT: A High-Throughput Design Evaluation Methodology for Multi-Core Processors.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Flexon: A Flexible Digital Neuron for Efficient Spiking Neural Network Simulations.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

WSMeter: A Performance Evaluation Methodology for Google's Production Warehouse-Scale Computers.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
GPUpd: a fast and scalable multi-GPU architecture using cooperative projection and distribution.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

StressRight: Finding the right stress for accurate in-development system evaluation.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

2016
Efficient footprint caching for Tagless DRAM Caches.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

DTStorage: Dynamic Tape-Based Storage for Cost-Effective and Highly-Available Streaming Service.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

CloudSwap: A Cloud-Assisted Swap Mechanism for Mobile Devices.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

2015
DCS: a fast and scalable device-centric server architecture.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

A fully associative, tagless DRAM cache.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
ScaleGPU: GPU Architecture for Memory-Unaware GPU Programming.
IEEE Comput. Archit. Lett., 2014

Stealing Webpages Rendered on Your Browser by Exploiting GPU Vulnerabilities.
Proceedings of the 2014 IEEE Symposium on Security and Privacy, 2014

Microbank: Architecting Through-Silicon Interposer-Based Main Memory Systems.
Proceedings of the International Conference for High Performance Computing, 2014

RpStacks: Fast and Accurate Processor Design Space Exploration Using Representative Stall-Event Stacks.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

CMcloud: Cloud Platform for Cost-Effective Offloading of Mobile Applications.
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

2013
Guide-copy: fast and silent migration of virtual machine for datacenters.
Proceedings of the International Conference for High Performance Computing, 2013

Building Fast, Dense, Low-Power Caches Using Erasure-Based Inline Multi-bit ECC.
Proceedings of the IEEE 19th Pacific Rim International Symposium on Dependable Computing, 2013

2007
PAI: A Lightweight Mechanism for Single-Node Memory Recovery in DSM Servers.
Proceedings of the 13th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2007), 2007

Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding.
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

2005
TRUSS: A Reliable, Scalable Server Architecture.
IEEE Micro, 2005

Temporal Streaming of Shared Memory.
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Store-Ordered Streaming of Shared Memory.
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004
SimFlex: a fast, accurate, flexible full-system simulation framework for performance evaluation of server architecture.
SIGMETRICS Perform. Evaluation Rev., 2004

Fingerprinting: Bounding Soft-Error-Detection Latency and Bandwidth.
IEEE Micro, 2004

Memory coherence activity prediction in commercial workloads.
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004

Efficient Resource Sharing in Concurrent Error Detecting Superscalar Microarchitectures.
Proceedings of the 37th Annual International Symposium on Microarchitecture (MICRO-37 2004), 2004

