Ritchie Zhao

Orcid: 0000-0003-1656-9165

According to our database, Ritchie Zhao authored at least 21 papers between 2015 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Microscaling Data Formats for Deep Learning.
CoRR, 2023

Shared Microexponents: A Little Shifting Goes a Long Way.
CoRR, 2023


2020
Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators.
CoRR, 2019

A 1.4 GHz 695 Giga RISC-V Inst/s 496-Core Manycore Processor With Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting.
Proceedings of the 36th International Conference on Machine Learning, 2019

Building Efficient Deep Neural Networks With Unitary Group Convolutions.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips.
IEEE Micro, 2018

Serving DNNs in Real Time at Datacenter Scale with Project Brainwave.
IEEE Micro, 2018

Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs.
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

2017
Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

A Parallel Bandit-Based Approach for Autotuning FPGA Compilation.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Enabling adaptive loop pipelining in high-level synthesis.
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016
Improving high-level synthesis with decoupled data structure optimization.
Proceedings of the 53rd Annual Design Automation Conference, 2016

2015
ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Area-efficient pipelining for FPGA-targeted high-level synthesis.
Proceedings of the 52nd Annual Design Automation Conference, 2015

