Jeremy Fowers

According to our database, Jeremy Fowers authored at least 20 papers between 2012 and 2022.

Bibliography

2022
A software-defined tensor streaming multiprocessor for large-scale machine learning.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

2020
Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Inside Project Brainwave's Cloud-Scale, Real-Time AI Processor.
IEEE Micro, 2019

2018
Serving DNNs in Real Time at Datacenter Scale with Project Brainwave.
IEEE Micro, 2018

A Configurable Cloud-Scale DNN Processor for Real-Time AI.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

2017
Configurable Clouds.
IEEE Micro, 2017

2016
A cloud-scale acceleration architecture.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

2015
A Tradeoff Analysis of FPGAs, GPUs, and Multicores for Sliding-Window Applications.
ACM Trans. Reconfigurable Technol. Syst., 2015

A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services.
IEEE Micro, 2015

Toward accelerating deep learning at scale using specialized hardware in the datacenter.
Proceedings of the 2015 IEEE Hot Chips 27 Symposium (HCS), 2015

A Scalable High-Bandwidth Architecture for Lossless Compression on FPGAs.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

2014
A framework for dynamic parallelization of FPGA-accelerated applications.
Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, 2014

A High Memory Bandwidth FPGA Accelerator for Sparse Matrix-Vector Multiplication.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

2013
A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors.
ACM Trans. Archit. Code Optim., 2013

Dynafuse: dynamic dependence analysis for FPGA pipeline fusion and locality optimizations.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

A high-performance, low-energy FPGA accelerator for correntropy-based feature tracking (abstract only).
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

A comparison of correntropy-based feature tracking on FPGAs and GPUs.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013

2012
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications.
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

The RACECAR heuristic for automatic function specialization on multi-core heterogeneous systems.
Proceedings of the 15th International Conference on Compilers, 2012

