Michael Pellauer

Angshuman Parashar

Tushar Krishna

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Self adaptive reconfigurable arrays (SARA): learning flexible GEMM accelerator configuration and mapping-space using ML.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

Self-Adaptive Reconfigurable Arrays (SARA): Using ML to Assist Scaling GEMM Acceleration.

Ananda Samajdar

Tushar Krishna

CoRR, 2021

Flexion: A Quantitative Metric for Flexibility in DNN Accelerators.

IEEE Comput. Archit. Lett., 2021

Heterogeneous Dataflow Accelerators for Multi-DNN Workloads.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020

Data Orchestration in Deep Learning Accelerators

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01767-4, 2020

MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings.

IEEE Micro, 2020

2019

Understanding Reuse, Performance, and Hardware Cost of DNN Dataflow: A Data-Centric Approach.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

ExTensor: An Accelerator for Sparse Tensor Algebra.

Kartik Hegde

Hadi Asghari Moghaddam

Christopher W. Fletcher

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration.

Rangharajan Venkatesan

Stephen W. Keckler

Christopher W. Fletcher

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018

Full-Stack Memory Model Verification with TriCheck.

IEEE Micro, 2018

MAESTRO: An Open-source Infrastructure for Modeling Dataflows within Deep Learning Accelerators.

Hyoukjun Kwon

Tushar Krishna

CoRR, 2018

UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition.

Christopher W. Fletcher

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

A modular digital VLSI flow for high-productivity SoC design.

Brucek Khailany

Evgeni Krimer

Rangharajan Venkatesan

Nathaniel Ross Pinckney

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

RTLcheck: verifying the memory consistency of RTL designs.

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA.

Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016

Exploring the Trisection of Software, Hardware, and ISA in Memory Model Design.

CoRR, 2016

Counterexamples and Proof Loophole for the C/C++ to POWER and ARMv7 Trailing-Sync Compiler Mappings.

CoRR, 2016

2015

Efficient Control and Communication Paradigms for Coarse-Grained Spatial Architectures.

ACM Trans. Comput. Syst., 2015

Verifying Correct Microarchitectural Enforcement of Memory Consistency Models.

Daniel Lustig

Margaret Martonosi

IEEE Micro, 2015

CCICheck: using µhb graphs to verify the coherence-consistency interface.

Proceedings of the 48th International Symposium on Microarchitecture, 2015

ArMOR: defending against memory consistency model mismatches in heterogeneous architectures.

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014

Efficient Spatial Processing Element Control via Triggered Instructions.

IEEE Micro, 2014

PipeCheck: Specifying and Verifying Microarchitectural Enforcement of Memory Consistency Models.

Daniel Lustig

Margaret Martonosi

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

2013

Triggered instructions: a control paradigm for spatially-programmed architectures.

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

A Hierarchical Architectural Framework for Reconfigurable Logic Computing.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Heracles: a tool for fast RTL-based design space exploration of multicore processors.

Michel A. Kinsy

Srinivas Devadas

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

2012

Leveraging latency-insensitivity to ease multiple FPGA design.

Kermin Elliott Fleming

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

2011

Cycle-accurate multicore performance models on FPGAs.

PhD thesis, 2011

HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing.

Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Heracles: Fully Synthesizable Parameterized MIPS-Based Multicore System.

Michel A. Kinsy

Srinivas Devadas

Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

Leap scratchpads: automatic memory and cache management for reconfigurable logic.

Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

2010

Design contest overview: Combined architecture for network stream categorization and intrusion detection (CANSCID).

Forrest Brewer

Proceedings of the 8th ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE 2010), 2010

A design flow based on modular refinement.

Proceedings of the 8th ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE 2010), 2010

2009

A-Port Networks: Preserving the Timed Behavior of Synchronous Systems for Modeling on FPGAs.

Michael Adler

ACM Trans. Reconfigurable Technol. Syst., 2009

Soft connections: addressing the hardware-design modularity problem.

Proceedings of the 46th Design Automation Conference, 2009

2008

Quick Performance Models Quickly: Closely-Coupled Partitioned Simulation on FPGAs.

Michael Adler

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008

A-Ports: an efficient abstraction for cycle-accurate performance models on FPGAs.

Michael Adler

Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

2007

Hardware Acceleration of Matrix Multiplication on a Xilinx FPGA.

Proceedings of the 5th ACM & IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE 2007), May 30, 2007

Scheduling as Rule Composition.

Nirav Dave