Nam Sung Kim
Orcid: 0000-0002-0442-5634
Affiliations:
- University of Illinois, Urbana-Champaign, IL, USA
According to our database, Nam Sung Kim authored at least 231 papers between 2002 and 2024.
Awards
ACM Fellow
ACM Fellow 2020, "For contributions to design and modeling of power-efficient computer architectures".
IEEE Fellow
IEEE Fellow 2016, "For contribution to circuits and architectures for power-efficient microprocessors".
Bibliography
2024
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
A Quantitative Analysis and Guidelines of Data Streaming Accelerator in Modern Intel Xeon Scalable Processors.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
AttAcc! Unleashing the Power of PIM for Batched Transformer-based Generative Model Inference.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
A Quantitative Analysis and Guideline of Data Streaming Accelerator in Intel 4th Gen Xeon Scalable Processors.
CoRR, 2023
CoRR, 2023
X-ray: Discovering DRAM Internal Structure and Error Characteristics by Issuing Memory Commands.
IEEE Comput. Archit. Lett., 2023
IEEE Comput. Archit. Lett., 2023
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models.
IEEE Comput. Archit. Lett., 2023
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Making Sense of Using a SmartNIC to Reduce Datacenter Tax from SLO and TCO Perspectives.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023
Rambda: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Proceedings of the 19th Workshop on Hot Topics in Operating Systems, 2023
2022
DML: Dynamic Partial Reconfiguration With Scalable Task Scheduling for Multi-Applications on FPGAs.
IEEE Trans. Computers, 2022
IEEE Trans. Computers, 2022
Harmony: Overcoming the hurdles of GPU memory capacity to train massive DNN models on commodity servers.
Proc. VLDB Endow., 2022
Near-Memory Processing in Action: Accelerating Personalized Recommendation With AxDIMM.
IEEE Micro, 2022
Aquabolt-XL HBM2-PIM, LPDDR5-PIM With In-Memory Processing, and AXDIMM With Acceleration Buffer.
IEEE Micro, 2022
CoRR, 2022
ORCA: A Network and Architecture Co-design for Offloading us-scale Datacenter Applications.
CoRR, 2022
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling.
Proceedings of Machine Learning and Systems 2022, 2022
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022
2021
Virtual-Cache: A cache-line borrowing technique for efficient GPU cache architectures.
Microprocess. Microsystems, September, 2021
An 8.5-Gb/s/Pin 12-Gb LPDDR5 SDRAM With a Hybrid-Bank Architecture, Low Power, and Speed-Boosting Techniques.
IEEE J. Solid State Circuits, 2021
A 16-GB 640-GB/s HBM2E DRAM With a Data-Bus Window Extension Technique and a Synergetic On-Die ECC Scheme.
IEEE J. Solid State Circuits, 2021
IEEE Comput. Archit. Lett., 2021
GreenDIMM: OS-assisted DRAM Power Management for DRAM with a Sub-array Granularity Power-Down State.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
NMAP: Power Management Based on Network Packet Processing Mode Transition for Latency-Critical Workloads.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
25.4 A 20nm 6GB Function-In-Memory DRAM, Based on HBM2 with a 1.2TFLOPS Programmable Computing Unit Using Bank-Level Parallelism, for Machine Learning Applications.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021
25.2 A 16Gb Sub-1V 7.14Gb/s/pin LPDDR5 SDRAM Applying a Mosaic Architecture with a Short-Feedback 1-Tap DFE, an FSS Bus with Low-Level Swing and an Adaptively Controlled Body Biasing in a 3rd-Generation 10nm DRAM.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Proceedings of the HotOS '21: Workshop on Hot Topics in Operating Systems, 2021
Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond.
Proceedings of the IEEE Hot Chips 33 Symposium, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
Errata to "Exploring Fault-Tolerant Erasure Codes for Scalable All-Flash Array Clusters".
IEEE Trans. Parallel Distributed Syst., 2020
IEEE Trans. Knowl. Data Eng., 2020
CoRR, 2020
IEEE Comput. Archit. Lett., 2020
IEEE Comput. Archit. Lett., 2020
Proceedings of the 33rd International Conference on VLSI Design and 19th International Conference on Embedded Systems, 2020
Planaria: Dynamic Architecture Fission for Spatial Multi-Tenant Acceleration of Deep Neural Networks.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
22.1 A 1.1V 16GB 640GB/s HBM2E DRAM with a Data-Bus Window-Extension Technique and a Synergetic On-Die ECC Scheme.
Proceedings of the 2020 IEEE International Solid-State Circuits Conference, 2020
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
An Efficient GPU Cache Architecture for Applications with Irregular Memory Access Patterns.
ACM Trans. Archit. Code Optim., 2019
An Energy-Efficient Programmable Mixed-Signal Accelerator for Machine Learning Algorithms.
IEEE Micro, 2019
Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic.
CoRR, 2019
IEEE Comput. Archit. Lett., 2019
Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip, 2019
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Proceedings of the International Symposium on Memory Systems, 2019
Near-Memory and In-Storage FPGA Acceleration for Emerging Cognitive Computing Workloads.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019
Proceedings of the 46th International Symposium on Computer Architecture, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Practical Near-Data Processing to Evolve Memory and Storage Devices into Mainstream Heterogeneous Computing Systems.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
FlatFlash: Exploiting the Byte-Accessibility of SSDs within a Unified Memory-Storage Hierarchy.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
Proceedings of the Approximate Circuits, Methodologies and CAD, 2019
2018
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018
IEEE Micro, 2018
CoRR, 2018
IEEE Comput. Archit. Lett., 2018
IEEE Comput. Archit. Lett., 2018
IEEE Access, 2018
FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018
GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Amber*: Enabling Precise Full-System Simulation with Detailed Modeling of All SSD Resources.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Application-Transparent Near-Memory Processing Architecture with Memory Channel Network.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the International Symposium on Memory Systems, 2018
Proceedings of the International Symposium on Low Power Electronics and Design, 2018
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
PROMISE: An End-to-End Design of a Programmable Mixed-Signal Accelerator for Machine-Learning Algorithms.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018
FlexiGAN: An End-to-End Solution for FPGA Acceleration of Generative Adversarial Networks.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018
Proceedings of the ACM Symposium on Cloud Computing, 2018
Proceedings of the IEEE Asian Solid-State Circuits Conference, 2018
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018
3D-Xpath: high-density managed DRAM architecture with cost-effective alternative paths for memory transactions.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018
2017
IEEE Trans. Biomed. Eng., 2017
Heterogeneous Computing Meets Near-Memory Acceleration and High-Level Synthesis in the Post-Moore Era.
IEEE Micro, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017
Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017
Elastic-Cache: GPU Cache Architecture for Efficient Fine- and Coarse-Grained Cache-Line Management.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
Understanding power-performance relationship of energy-efficient modern DRAM devices.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Understanding system characteristics of online erasure coding on scalable, distributed and large-scale SSD array systems.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Proceedings of the IEEE International Conference on Rebooting Computing, 2017
Collaborative (CPU + GPU) algorithms for triangle counting and truss decomposition on the Minsky architecture: Static graph challenge: Subgraph isomorphism.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017
G-Scalar: Cost-Effective Generalized Scalar Execution Architecture for Power-Efficient GPUs.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
NCAP: Network-Driven, Packet Context-Aware Power Management for Client-Server Architecture.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
2016
IEEE Trans. Parallel Distributed Syst., 2016
SIGARCH Comput. Archit. News, 2016
Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules.
IEEE Micro, 2016
IEICE Electron. Express, 2016
IEEE Comput. Archit. Lett., 2016
Snatch: Opportunistically reassigning power allocation between processor and memory in 3D stacks.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Proceedings of the 34th IEEE International Conference on Computer Design, 2016
VARIUS-TC: A modular architecture-level model of parametric variation for thin-channel switches.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
VR-scale: runtime dynamic phase scaling of processor voltage regulators for improving power efficiency.
Proceedings of the 53rd Annual Design Automation Conference, 2016
2015
Energy-Efficient Approximate Multiplication for Digital Signal Processing and Classification Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2015
Decoupled Control and Data Processing for Approximate Near-Threshold Voltage Computing.
IEEE Micro, 2015
IEEE Comput. Archit. Lett., 2015
Proceedings of the 2015 USENIX Annual Technical Conference, 2015
vCache: architectural support for transparent and isolated virtual LLCs in virtualized environments.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Comparison of single-ISA heterogeneous versus wide dynamic range processors for mobile applications.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015
Alloy: Parallel-serial memory channel architecture for single-chip heterogeneous processor systems.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, GLVLSI 2015, Pittsburgh, PA, USA, May 20, 2015
2014
Low-Cost Per-Core Voltage Domain Support for Power-Constrained High-Performance Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2014
Optimization of a Cell Counting Algorithm for Mobile Point-of-Care Testing Platforms.
Sensors, 2014
Low-cost scratchpad memory organizations using heterogeneous cell sizes for low-voltage operations.
Microprocess. Microsystems, 2014
Maximizing throughput of power/thermal-constrained processors by balancing power consumption of cores.
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014
Energy-efficient reconfigurable cache architectures for accelerator-enabled embedded systems.
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014
Quantitative comparison of the power reduction techniques for Samsung reconfigurable processor.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
SleepScale: Runtime joint speed scaling and sleep states management for power efficient data centers.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Process variation-aware workload partitioning algorithms for GPUs supporting spatial-multitasking.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
2013
Clamping Virtual Supply Voltage of Power-Gated Circuits for Active Leakage Reduction and Gate-Oxide Reliability.
IEEE Trans. Very Large Scale Integr. Syst., 2013
Improving Throughput of Power-Constrained Many-Core Processors Based on Unreliable Devices.
IEEE Micro, 2013
Queuing Theoretic Analysis of Power-performance Tradeoff in Power-efficient Computing.
CoRR, 2013
Exploiting GPU peak-power and performance tradeoffs through reduced effective pipeline latency.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Improving platform energy: chip area trade-off in near-threshold computing environment.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013
Proceedings of the 18th Asia and South Pacific Design Automation Conference, 2013
2012
Analyzing Potential Throughput Improvement of Power- and Thermal-Constrained Multicore Processors by Exploiting DVFS and PCPG.
IEEE Trans. Very Large Scale Integr. Syst., 2012
Maximizing Frequency and Yield of Power-Constrained Designs Using Programmable Power-Gating.
IEEE Trans. Very Large Scale Integr. Syst., 2012
Analyzing the Impact of Joint Optimization of Cell Size, Redundancy, and ECC on Low-Voltage SRAM Array Total Area.
IEEE Trans. Very Large Scale Integr. Syst., 2012
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012
Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks, 2012
VARIUS-NTV: A microarchitectural model to capture the increased sensitivity of manycores to process variations at near-threshold voltages.
Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks, 2012
Workload-aware voltage regulator optimization for power efficient multi-core processors.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
Cost-effective power delivery to support per-core voltage domains for power-constrained processors.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Lossless and lossy memory I/O link compression for improving performance of GPGPU workloads.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
Analyzing the performance and energy impact of 3D memory integration on embedded DSPs.
Proceedings of the 2011 International Conference on Embedded Computer Systems: Architectures, 2011
Proceedings of the 12th International Symposium on Quality Electronic Design, 2011
Analyzing throughput of GPGPUs exploiting within-die core-to-core frequency variation.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011
Low-voltage on-chip cache architecture using heterogeneous cell sizes for high-performance processors.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
Proceedings of the Design, Automation and Test in Europe, 2011
Proceedings of the Design, Automation and Test in Europe, 2011
AVS-aware power-gate sizing for maximum performance and power efficiency of power-constrained processors.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011
Proceedings of the 22nd IEEE International Conference on Application-specific Systems, 2011
Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011
Improving Throughput of Power-Constrained GPUs Using Dynamic Voltage/Frequency and Core Scaling.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
2010
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Analyzing and minimizing effects of temperature variation and NBTI on active leakage power of power-gated circuits.
Proceedings of the 11th International Symposium on Quality of Electronic Design (ISQED 2010), 2010
Proceedings of the 11th International Symposium on Quality of Electronic Design (ISQED 2010), 2010
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010
Minimizing total area of low-voltage SRAM arrays through joint optimization of cell size, redundancy, and ECC.
Proceedings of the 28th International Conference on Computer Design, 2010
Optimal algorithm for profile-based power gating: A compiler technique for reducing leakage on execution units in microprocessors.
Proceedings of the 2010 International Conference on Computer-Aided Design, 2010
Runtime temperature-based power estimation for optimizing throughput of thermal-constrained multi-core processors.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010
Analyzing impact of multiple ABB and AVS domains on throughput of power and thermal-constrained multi-core processors.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010
2009
Analyzing potential power reduction with adaptive voltage positioning optimized for multicore processors.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
Optimizing total power of many-core processors considering voltage scaling limit and process variations.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
Statistical static timing analysis considering leakage variability in power gated designs.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009
Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating.
Proceedings of the 46th Design Automation Conference, 2009
2008
On-chip cache device scaling limits and effective fault repair techniques in future nanoscale technology.
Microprocess. Microsystems, 2008
2007
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007
2005
IEEE Trans. Very Large Scale Integr. Syst., 2005
Proceedings of the 15th ACM Great Lakes Symposium on VLSI 2005, 2005
Power-Performance Trade-Offs in Nanometer-Scale Multi-Level Caches Considering Total Leakage.
Proceedings of the 2005 Design, 2005
2004
IEEE Trans. Very Large Scale Integr. Syst., 2004
IEEE Micro, 2004
Proceedings of the 2004 International Symposium on Low Power Electronics and Design, 2004
2003
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003
Proceedings of the 17th Annual International Conference on Supercomputing, 2003
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003
Proceedings of the IEEE Custom Integrated Circuits Conference, 2003
2002
Drowsy instruction caches: leakage power reduction using dynamic voltage scaling and cache sub-bank prediction.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002