Yufei Ma

ORCID: 0000-0002-2670-524X

Affiliations:
  • Peking University, Beijing, China
  • Arizona State University, Tempe, AZ, USA (former)


According to our database, Yufei Ma authored at least 31 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
30.2 A 22nm 0.26nW/Synapse Spike-Driven Spiking Neural Network Processing Unit Using Time-Step-First Dataflow and Sparsity-Adaptive In-Memory Computing.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024

2023
Research progress on low-power artificial intelligence of things (AIoT) chip design.
Sci. China Inf. Sci., October, 2023

An 82-nW 0.53-pJ/SOP Clock-Free Spiking Neural Network With 40-μs Latency for AIoT Wake-Up Functions Using a Multilevel-Event-Driven Bionic Architecture and Computing-in-Memory Technique.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

A 22nm Delta-Sigma Computing-In-Memory (Δ∑CIM) SRAM Macro with Near-Zero-Mean Outputs and LSB-First ADCs Achieving 21.38TOPS/W for 8b-MAC Edge AI Processing.
Proceedings of the IEEE International Solid-State Circuits Conference, 2023

DCIM-3DRec: A 3D Reconstruction Accelerator with Digital Computing-in-Memory and Octree-Based Scheduler.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

A Model-Specific End-to-End Design Methodology for Resource-Constrained TinyML Hardware.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

A 22nm 0.43pJ/SOP Sparsity-Aware In-Memory Neuromorphic Computing System with Hybrid Spiking and Artificial Neural Network and Configurable Topology.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2023

RIMAC: An Array-Level ADC/DAC-Free ReRAM-Based in-Memory DNN Processor with Analog Cache and Computation.
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

2022
A Flexible and Efficient FPGA Accelerator for Various Large-Scale and Lightweight CNNs.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Hybrid Stochastic-Binary Computing for Low-Latency and High-Precision Inference of CNNs.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

DCIM-GCN: Digital Computing-in-Memory to Efficiently Accelerate Graph Convolutional Networks.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

2021
SWIFT: Small-World-based Structural Pruning to Accelerate DNN Inference on FPGA.
Proceedings of the 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2021

2020
Performance Modeling for CNN Inference Accelerators on FPGA.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Automatic Compilation of Diverse CNNs Onto High-Performance FPGA Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Efficient Inference of Large-Scale and Lightweight Convolutional Neural Networks on FPGA.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

Efficient Hardware Post Processing of Anchor-Based Object Detection on FPGA.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

Optimizing Stochastic Computing for Low Latency Inference of Convolutional Neural Networks.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

In-Memory Computing: The Next-Generation AI Computing Paradigm.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

An Efficient FPGA Accelerator Optimized for High Throughput Sparse CNN Inference.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

2019
Efficient Network Construction Through Structural Plasticity.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

Automatic Compiler Based FPGA Accelerator for CNN Training.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

2018
Hardware Acceleration of Deep Convolutional Neural Networks on FPGA.
PhD thesis, 2018

Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA.
IEEE Trans. Very Large Scale Integr. Syst., 2018

ALAMO: FPGA acceleration of deep learning algorithms with a modularized RTL compiler.
Integr., 2018

Algorithm-hardware co-design of single shot detector for fast object detection on FPGAs.
Proceedings of the International Conference on Computer-Aided Design, 2018

2017
End-to-end scalable FPGA accelerator for deep residual networks.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

2016
Scalable and modularized RTL compilation of Convolutional Neural Networks onto FPGA.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks.
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

2015
Energy-efficient reconstruction of compressively sensed bioelectrical signals with stochastic computing circuits.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015
