Fengbin Tu

Orcid: 0000-0003-2228-8829

According to our database1, Fengbin Tu authored at least 51 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
HDSuper: High-Quality and High Computational Utilization Edge Super-Resolution Accelerator With Hardware-Algorithm Co-Design Techniques.
IEEE Trans. Circuits Syst. I Regul. Pap., April, 2024

MulTCIM: Digital Computing-in-Memory-Based Multimodal Transformer Accelerator With Attention-Token-Bit Hybrid Sparsity.
IEEE J. Solid State Circuits, January, 2024

15.1 A 0.795fJ/bit Physically-Unclonable Function-Protected TCAM for a Software-Defined Networking Switch.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024

20.2 A 28nm 74.34TFLOPS/W BF16 Heterogenous CIM-Based Accelerator Exploiting Denoising-Similarity for Diffusion Models.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024

2023
Reconfigurability, Why It Matters in AI Tasks Processing: A Survey of Reconfigurable AI Chips.
IEEE Trans. Circuits Syst. I Regul. Pap., March, 2023

SPCIM: Sparsity-Balanced Practical CIM Accelerator With Optimized Spatial-Temporal Multi-Macro Utilization.
IEEE Trans. Circuits Syst. I Regul. Pap., January, 2023

STAR: An STGCN ARchitecture for Skeleton-Based Human Action Recognition.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

SDP: Co-Designing Algorithm, Dataflow, and Architecture for In-SRAM Sparse NN Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

SPG: Structure-Private Graph Database via SqueezePIR.
Proc. VLDB Endow., 2023

ReDCIM: Reconfigurable Digital Computing- In -Memory Processor With Unified FP/INT Pipeline for Cloud AI Acceleration.
IEEE J. Solid State Circuits, 2023

TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator With Pipeline/Parallel Reconfigurable Modes.
IEEE J. Solid State Circuits, 2023

A 137.5 TOPS/W SRAM Compute-in-Memory Macro with 9-b Memory Cell-Embedded ADCs and Signal Margin Enhancement Techniques for AI Edge Applications.
CoRR, 2023

Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane.
CoRR, 2023

DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference.
CoRR, 2023

Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

TensorCIM: A 28nm 3.7nJ/Gather and 8.3TFLOPS/W FP32 Digital-CIM Tensor Processor for MCM-CIM-Based Beyond-NN Acceleration.
Proceedings of the IEEE International Solid- State Circuits Conference, 2023

MuITCIM: A 28nm $2.24 \mu\mathrm{J}$/Token Attention-Token-Bit Hybrid Sparse Digital CIM-Based Accelerator for Multimodal Transformers.
Proceedings of the IEEE International Solid- State Circuits Conference, 2023

ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

BIOS: A 40nm Bionic Sensor-defined 0.47pJ/SOP, 268.7TSOPs/W Configurable Spiking Neuron-in-Memory Processor for Wearable Healthcare.
Proceedings of the 49th IEEE European Solid State Circuits Conference, 2023

PIM-HLS: An Automatic Hardware Generation Tool for Heterogeneous Processing-In-Memory-based Neural Network Accelerators.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

AutoDCIM: An Automated Digital CIM Compiler.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
GQNA: Generic Quantized DNN Accelerator With Weight-Repetition-Aware Activation Aggregating.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

H2Learn: High-Efficiency Learning Accelerator for High-Accuracy Spiking Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Dynamic Sparse Attention for Scalable Transformer Acceleration.
IEEE Trans. Computers, 2022

A 28nm 15.59µJ/Token Full-Digital Bitline-Transpose CIM-Based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise In-Memory Booth Multiplication for Cloud Deep Learning Acceleration.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

INSPIRE: in-storage private information retrieval via protocol and architecture co-design.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Accelerating Spatiotemporal Supervised Training of Large-Scale Spiking Neural Networks on GPU.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Alleviating datapath conflicts and design centralization in graph analytics acceleration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

DOTA: detect and omit weak attentions for scalable transformer acceleration.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Erratum to "Evolver: a Deep Learning Processor With On-Device Quantization-Voltage-Frequency Tuning".
IEEE J. Solid State Circuits, 2021

Evolver: A Deep Learning Processor With On-Device Quantization-Voltage-Frequency Tuning.
IEEE J. Solid State Circuits, 2021

Brain-Inspired Computing: Adventure from Beyond CMOS Technologies to Beyond von Neumann Architectures ICCAD Special Session Paper.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

ADROIT: An Adaptive Dynamic Refresh Optimization Framework for DRAM Energy Saving In DNN Training.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

STC: Significance-aware Transform-based Codec Framework for External Memory Access Reduction.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
Parana: A Parallel Neural Architecture Considering Thermal Problem of 3D Stacked Memory.
IEEE Trans. Parallel Distributed Syst., 2019

Reconfigurable Architecture for Neural Approximation in Multimedia Computing.
IEEE Trans. Circuits Syst. Video Technol., 2019

A High Throughput Acceleration for Hybrid Neural Networks With Efficient Resource Management on FPGA.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

MoNA: Mobile Neural Architecture with Reconfigurable Parallel Dimensions.
Proceedings of the 17th IEEE International New Circuits and Systems Conference, 2019

Towards Efficient Compact Network Training on Edge-Devices.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

2018
GNA: Reconfigurable and Efficient Architecture for Generative Network Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

A High Energy Efficient Reconfigurable Hybrid Neural Network Processor for Deep Learning Applications.
IEEE J. Solid State Circuits, 2018

An Energy Efficient JPEG Encoder with Neural Network Based Approximation and Near-Threshold Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Bit-width Adaptive Accelerator Design for Convolution Neural Network.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

RANA: Towards Efficient Neural Acceleration with Refresh-Optimized Embedded DRAM.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

LCP: a layer clusters paralleling mapping method for accelerating inception and residual networks on FPGA.
Proceedings of the 55th Annual Design Automation Conference, 2018

2017
Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns.
IEEE Trans. Very Large Scale Integr. Syst., 2017

AEPE: An area and power efficient RRAM crossbar-based accelerator for deep CNNs.
Proceedings of the IEEE 6th Non-Volatile Memory Systems and Applications Symposium, 2017

2015
Neural approximating architecture targeting multiple application domains.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

RNA: a reconfigurable architecture for hardware neural acceleration.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015


  Loading...