Jason Clemons

Proceedings of the IEEE International Solid-State Circuits Conference, 2026

2025

HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving.

CoRR, April, 2025

2024

Vision Transformer Computation and Resilience for Dynamic Inference.

Kavya Sreedhar

Mark Horowitz

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

2023

Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing.

ACM Trans. Comput. Syst., 2023

2022

Enabling and Accelerating Dynamic Vision Transformer Inference for Real-Time Applications.

Kavya Sreedhar

Mark Horowitz

CoRR, 2022

Augmenting Legacy Networks for Flexible Inference.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

2021

Simba: scaling deep-learning inference with chiplet-based architecture.

Yakun Sophia Shao

Commun. ACM, 2021

2020

A 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.

Brian Zimmer

IEEE J. Solid State Circuits, 2020

2019

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm.

Brian Zimmer

Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019

Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture.

Yakun Sophia Shao

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

MAGNet: A Modular Accelerator Generator for Neural Networks.

Proceedings of the International Conference on Computer-Aided Design, 2019

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.

Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019

Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration.

Christopher W. Fletcher

Joel S. Emer

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018

Automotive Computing.

Hsien-Hsin Sean Lee

IEEE Micro, 2018

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training.

CoRR, 2018

A modular digital VLSI flow for high-productivity SoC design.

Brucek Khailany

Evgeni Krimer

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU.

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Virtualizing Deep Neural Networks for Memory-Efficient Neural Network Design.

CoRR, 2016

GA3C: GPU-based A3C for Deep Reinforcement Learning.

CoRR, 2016

vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

A patch memory system for image processing and computer vision.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

A real-time energy-efficient superpixel hardware accelerator for mobile computer vision applications.

Injoon Hong

Iuri Frosio

Brucek Khailany

Proceedings of the 53rd Annual Design Automation Conference, 2016

2013

Computer Architectures for Mobile Computer Vision Systems.
