Paul N. Whatmough

According to our database, Paul N. Whatmough authored at least 58 papers between 2009 and 2020.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2020
SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads.
ACM Trans. Archit. Code Optim., 2020

CHIPKIT: An Agile, Reusable Open-Source Framework for Rapid Test Chip Development.
IEEE Micro, 2020

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers.
CoRR, 2020

Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration.
CoRR, 2020

Compressing Language Models using Doped Kronecker Products.
CoRR, 2020

Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation.
CoRR, 2020

CHIPKIT: An agile, reusable open-source framework for rapid test chip development.
CoRR, 2020

Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference.
IEEE Comput. Archit. Lett., 2020

The Sky Is Not the Limit: A Visual Performance Model for Cyber-Physical Co-Design in Autonomous Machines.
IEEE Comput. Archit. Lett., 2020

A 3mm² Programmable Bayesian Inference Accelerator for Unsupervised Machine Perception using Parallel Gibbs Sampling in 16nm.
Proceedings of the IEEE Symposium on VLSI Circuits, 2020

Searching for Winograd-aware Quantized Networks.
Proceedings of Machine Learning and Systems 2020, 2020

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

A Systematic Methodology for Characterizing Scalability of DNN Accelerators using SCALE-Sim.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids.
Proceedings of the Interspeech 2020, 2020

A Scalable Bayesian Inference Accelerator for Unsupervised Learning.
Proceedings of the IEEE Hot Chips 32 Symposium, 2020

2019
A 16-nm Always-On DNN Processor With Adaptive Clocking and Multi-Cycle Banked SRAMs.
IEEE J. Solid State Circuits, 2019

Guest Editors' Introduction: Hardware and Algorithms for Energy-Constrained On-Chip Machine Learning (Part 2).
ACM J. Emerg. Technol. Comput. Syst., 2019

Guest Editors' Introduction to the Special Section on Hardware and Algorithms for Energy-Constrained On-chip Machine Learning.
ACM J. Emerg. Technol. Comput. Syst., 2019

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads.
CoRR, 2019

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems.
CoRR, 2019

FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning.
CoRR, 2019

A 16nm 25mm² SoC with a 54.5x Flexibility-Efficiency Range from Dual-Core Arm Cortex-A53 to eFPGA and Cache-Coherent Accelerators.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

FixyNN: Energy-Efficient Real-Time Mobile Computer Vision Hardware Acceleration via Transfer Learning.
Proceedings of Machine Learning and Systems 2019, 2019

ASV: Accelerated Stereo Vision System.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

GeST: An Automatic Framework For Generating CPU Stress-Tests.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

On-Chip Memory Technology Design Space Explorations for Mobile Deep Neural Network Accelerators.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications.
IEEE J. Solid State Circuits, 2018

Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning.
CoRR, 2018

SCALE-Sim: Systolic CNN Accelerator.
CoRR, 2018

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective.
CoRR, 2018

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

A Wide Dynamic Range Sparse FC-DNN Processor with Multi-Cycle Banked SRAM Read and Adaptive Clocking in 16nm FinFET.
Proceedings of the 44th IEEE European Solid State Circuits Conference, 2018

Ares: a framework for quantifying the resilience of deep neural networks.
Proceedings of the 55th Annual Design Automation Conference, 2018

2017
Deep Learning for Computer Architects
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2017

Power Integrity Analysis of a 28 nm Dual-Core ARM Cortex-A57 Cluster Using an All-Digital Power Delivery Monitor.
IEEE J. Solid State Circuits, 2017

14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications.
Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

A case for efficient accelerator design space exploration via Bayesian optimization.
Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

Applications of Deep Neural Networks for Ultra Low Power IoT.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Sub-uJ deep neural networks for embedded applications.
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016
Sequence-Aware Watermark Design for Soft IP Embedded Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2016

A low-power correlator for wakeup receivers with algorithm pruning through early termination.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
A 0.6V all-digital body-coupled wakeup transceiver for IoT applications.
Proceedings of the Symposium on VLSI Circuits, 2015

14.6 An all-digital power-delivery monitor for analysis of a 28nm dual-core ARM Cortex-A57 cluster.
Proceedings of the 2015 IEEE International Solid-State Circuits Conference, 2015

Analysis of adaptive clocking technique for resonant supply voltage noise mitigation.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Modeling and characterization of the system-level Power Delivery Network for a dual-core ARM Cortex-A57 cluster in 28nm CMOS.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

2014
Precision-Energy-Throughput Scaling of Generic Matrix Multiplication and Convolution Kernels via Linear Projections.
IEEE Trans. Circuits Syst. Video Technol., 2014

A Low-Power 1-GHz Razor FIR Accelerator With Time-Borrow Tracking Pipeline and Approximate Error Correction in 65-nm CMOS.
IEEE J. Solid State Circuits, 2014

Clock-modulation based watermark for protection of embedded processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2013
Circuit-Level Timing Error Tolerance for Low-Power DSP Filters and Transforms.
IEEE Trans. Very Large Scale Integr. Syst., 2013

A low-power 1GHz razor FIR accelerator with time-borrow tracking pipeline and approximate error correction in 65nm CMOS.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

Precision-energy-throughput scaling of generic matrix multiplication and discrete convolution kernels via linear projections.
Proceedings of the 11th IEEE Symposium on Embedded Systems for Real-time Multimedia, 2013

2012
VLSI Architecture for a Reconfigurable Spectrally Efficient FDM Baseband Transmitter.
IEEE Trans. Circuits Syst. I Regul. Pap., 2012

Selective time borrowing for DSP pipelines with hybrid voltage control loop.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

2011
Error-resilient low-power DSP via path-delay shaping.
Proceedings of the 48th Design Automation Conference, 2011

2010
A robust FIR filter with in situ error detection.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

2009
System-Efficiency Analysis of Power Amplifier Supply-Tracking Regimes in Mobile Transmitters.
IEEE Trans. Circuits Syst. I Regul. Pap., 2009

