Tony Nowatzki

Orcid: 0000-0001-8483-3824

According to our database, Tony Nowatzki authored at least 48 papers between 2012 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation.
CoRR, 2023

Affinity Alloc: Taming Not-So Near-Data Computing.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Infinity Stream: Portable and Programmer-Friendly In-/Near-Memory Fusion.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Explainable-DSE: An Agile and Explainable Exploration of Efficient HW/SW Codesigns of Deep Learning Accelerators Using Bottleneck Analysis.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Unifying Spatial Accelerator Compilation With Idiomatic and Modular Transformations.
IEEE Micro, 2022

Infinity Stream: Enabling Transparent and Automated In-Memory Computing.
IEEE Comput. Archit. Lett., 2022

OverGen: Improving FPGA Usability through Domain-specific Overlay Generation.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

A programmable, energy-minimal dataflow compiler and architecture.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

The Mozart reuse exposed dataflow processor for AI and beyond: industrial product.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Near-Stream Computing: General and Transparent Near-Cache Acceleration.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

TaskStream: accelerating task-parallel workloads by recovering program structure.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights.
Proc. IEEE, 2021

PolyGraph: Exposing the Value of Flexibility for Graph Processing Accelerators.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Stream Floating: Enabling Proactive and Decentralized Cache Optimizations.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Mozart: Designing for Software Maturity and the Next Paradigm for Chip Architectures.
Proceedings of the IEEE Hot Chips 33 Symposium, 2021

UNIT: Unifying Tensorized Instruction Compilation.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

2020
Towards General-Purpose Acceleration: Finding Structure in Irregularity.
IEEE Micro, 2020

DSAGEN: Synthesizing Programmable Spatial Accelerators.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

A Hybrid Systolic-Dataflow Architecture for Inductive Matrix Algorithms.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019
Exploiting Fine-Grain Ordered Parallelism in Dense Matrix Algorithms.
CoRR, 2019

DAEGEN: A Modular Compiler for Exploring Decoupled Spatial Accelerators.
IEEE Comput. Archit. Lett., 2019

Heterogeneous Von Neumann/dataflow microprocessors.
Commun. ACM, 2019

μIR - An intermediate representation for transforming and optimizing the microarchitecture of application accelerators.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Towards General Purpose Acceleration by Exploiting Common Data-Dependence Forms.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Stream-based memory access specialization for general purpose processors.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

2018
Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Domain Specialization Is Generally Unnecessary for Accelerators.
IEEE Micro, 2017

Kickstarting Semiconductor Innovation with Open Source Hardware.
Computer, 2017

Stream-Dataflow Acceleration.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016
A Heterogeneous Von Neumann/Explicit Dataflow Processor.
IEEE Micro, 2016

Open-source Hardware: Opportunities and Challenges.
CoRR, 2016

Software transparent dynamic binary translation for coarse-grain reconfigurable architectures.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Pushing the limits of accelerator efficiency while retaining programmability.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Modularizing the microprocessor core to outperform traditional out-of-order.
Proceedings of the 2016 IEEE Hot Chips 28 Symposium (HCS), 2016

Analyzing Behavior Specialized Acceleration.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015
Comprehensive Circuit Failure Prediction for Logic and SRAM Using Virtual Aging.
IEEE Micro, 2015

Architectural Simulators Considered Harmful.
IEEE Micro, 2015

A Graph-Based Program Representation for Analyzing Hardware Specialization Approaches.
IEEE Comput. Archit. Lett., 2015

Performance evaluation of a DySER FPGA prototype system spanning the compiler, microarchitecture, and hardware implementation.
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

Exploring the potential of heterogeneous von neumann/dataflow execution models.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
A Scheduling Framework for Spatial Architectures Across Multiple Constraint-Solving Theories.
ACM Trans. Program. Lang. Syst., 2014

2013
Optimization and Mathematical Modeling in Computer Architecture.
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01773-5, 2013

Constraint centric scheduling guide.
SIGARCH Comput. Archit. News, 2013

A general constraint-centric scheduling framework for spatial architectures.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013

Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing.
IEEE Micro, 2012

Design, integration and implementation of the DySER hardware accelerator into OpenSPARC.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

Prototyping the DySER specialization architecture with OpenSPARC.
Proceedings of the 2012 IEEE Hot Chips 24 Symposium (HCS), 2012

