Zhenhua Zhu

ORCID: 0009-0007-9259-7180

Affiliations:

Hong Kong University of Science and Technology (HKUST), Hong Kong
Tsinghua University, Department of Electrical Engineering, BNRist, Beijing, China (PhD 2024)

According to our database, Zhenhua Zhu authored at least 75 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

ReNN-RV: Run-Time PE Reconfiguration for DNN Inference Acceleration With Custom RISC-V ISA.

IEEE Trans. Computers, May, 2026

CD-LLM: A Heterogeneous Multi-FPGA System for Batched Decoding of 70B+ LLMs Using a Compute-Dedicated Architecture.

ACM Trans. Reconfigurable Technol. Syst., March, 2026

Towards Floating Point-Based AI Acceleration: Hybrid PIM with Non-Uniform Data Format and Reduced Multiplications.

ACM Trans. Design Autom. Electr. Syst., January, 2026

STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning.

Proceedings of the 21st European Conference on Computer Systems, 2026

Efficient and Adaptable Overlapping for Computation and Communication via Signaling and Reordering.

Proceedings of the 21st European Conference on Computer Systems, 2026

Endor: Exploit Nearly-Decode-Only Opportunities of LLM Reasoning on Near-Memory Architecture.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

FlashGEMM: Mesh-Aware Efficient GEMM for 3D-Stacked LLM Accelerators.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

PICoSNN: Partially Incoherent Configurable Optical Computing Architecture for SNN Acceleration.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

CIM-Tuner: Balancing the Compute and Storage Capacity of SRAM-CIM Accelerator via Hardware-mapping Co-exploration.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

SpAct-NDP: Efficient LLM Inference via Sparse Activation on NDP-GPU Heterogeneous Architecture.

Proceedings of the 31st Asia and South Pacific Design Automation Conference, 2026

2025

Cross-Layer Design and Design Automation for In-Memory Computing Based on Nonvolatile Memory Technologies.

Xiaobo Sharon Hu

Mingyen Lee

Mengyuan Li

João Paulo Cardoso de Lima

IEEE Des. Test, December, 2025

Exploiting the Memory-Compute-Coupling Feature for CIM Accelerator Design Optimization.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2025

db-SP: Accelerating Sparse Attention for Visual Generative Models with Dual-Balanced Sequence Parallelism.

CoRR, November, 2025

Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design.

CoRR, November, 2025

Reducing GPU Memory Fragmentation via Spatio-Temporal Planning for Efficient Large-Scale Model Training.

CoRR, July, 2025

HyCTor: A Hybrid CNN-Transformer Network Accelerator With Flexible Weight/Output Stationary Dataflow and Multicore Extension.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2025

FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation.

CoRR, April, 2025

REACT3D: Real-time Edge Accelerator for Incremental Training in 3D Gaussian Splatting based SLAM Systems.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

Deep Neural Network Inference Partitioning in Embedded Hybrid Analog-Digital Systems.

Proceedings of the 26th International Symposium on Quality Electronic Design, 2025

How Do Errors Impact NN Accuracy on Non-Ideal Analog PIM? Fast Evaluation via an Error-Injected Robustness Metric.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2025

UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

TB-STC: Transposable Block-wise N:M Structured Sparse Tensor Core.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

FMC-LLM: Enabling FPGAs for Efficient Batched Decoding of 70B+ LLMs with a Memory-Centric Streaming Architecture.

Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

HPIM-NoC: A Priori-Knowledge-Based Optimization Framework for Heterogeneous PIM-Based NoCs.

Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

PARO: Hardware-Software Co-design with Pattern-aware Reorder-based Attention Quantization in Video Generation Models.

Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

2024

Toward High-Accuracy and Real-Time Two-Stage Small Object Detection on FPGA.

IEEE Trans. Circuits Syst. Video Technol., September, 2024

TDPP: 2-D Permutation-Based Protection of Memristive Deep Neural Networks.

Minhui Zou

Zhenhua Zhu

Tzofnat Greenberg-Toledo

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., March, 2024

LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization.

CoRR, 2024

Efficient and Effective Retrieval of Dense-Sparse Hybrid Vectors using Graph-based Approximate Nearest Neighbor Search.

CoRR, 2024

Efficient Deployment of Large Language Model across Cloud-Device Systems.

Proceedings of the 37th IEEE International System-on-Chip Conference, 2024

MOTPE/D: Hardware and Algorithm Co-design for Reconfigurable Neuromorphic Processor.

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

Towards Floating Point-Based Attention-Free LLM: Hybrid PIM with Non-Uniform Data Format and Reduced Multiplications.

Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024

GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneous System.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2024

DyPIM: Dynamic-Inference-Enabled Processing-In-Memory Accelerator.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

DySpMM: From Fix to Dynamic for Sparse Matrix-Matrix Multiplication Accelerators.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

EPIM: Efficient Processing-In-Memory Accelerators based on Epitome.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

CoGNN: An Algorithm-Hardware Co-Design Approach to Accelerate GNN Inference With Minibatch Sampling.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023

MNSIM 2.0: A Behavior-Level Modeling Tool for Processing-In-Memory Architectures.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

Gibbon: An Efficient Co-Exploration Framework of NN Model and Processing-In-Memory Architecture.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

Serving Multi-DNN Workloads on FPGAs: A Coordinated Architecture, Scheduling, and Mapping Perspective.

IEEE Trans. Computers, May, 2023

TDPP: Two-Dimensional Permutation-Based Protection of Memristive Deep Neural Networks.

Minhui Zou

Zhenhua Zhu

Tzofnat Greenberg-Toledo

CoRR, 2023

DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Realizing Extreme Endurance Through Fault-aware Wear Leveling and Improved Tolerance.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

SRAM-Based Processing-In-Memory Design with Kullback-Leibler Divergence-Based Dynamic Precision Quantization.

Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

Minimizing Communication Conflicts in Network-On-Chip Based Processing-In-Memory Architecture.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

PIM-HLS: An Automatic Hardware Generation Tool for Heterogeneous Processing-In-Memory-based Neural Network Accelerators.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Processing-In-Hierarchical-Memory Architecture for Billion-Scale Approximate Nearest Neighbor Search.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Memory-Efficient and Real-Time SPAD-based dToF Depth Sensor with Spatial and Statistical Correlation.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022

Exploring the Potential of Low-Bit Training of Convolutional Neural Networks.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Optimizing Graph-based Approximate Nearest Neighbor Search: Stronger and Smarter.

Proceedings of the 23rd IEEE International Conference on Mobile Data Management, 2022

WESCO: Weight-encoded Reliability and Security Co-design for In-memory Computing Systems.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

DIMMining: pruning-efficient and parallel graph mining on near-memory-computing.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Exploiting Parallelism with Vertex-Clustering in Processing-In-Memory-based GCN Accelerators.

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Gibbon: Efficient Co-Exploration of NN Model and Processing-In-Memory Architecture.

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

2021

FTT-NAS: Discovering Fault-tolerant Convolutional Neural Architecture.

ACM Trans. Design Autom. Electr. Syst., 2021

Enabling Lower-Power Charge-Domain Nonvolatile In-Memory Computing With Ferroelectric FETs.

IEEE Trans. Circuits Syst. II Express Briefs, 2021

Rerec: In-ReRAM Acceleration with Access-Aware Mapping for Personalized Recommendation.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Reliability-Aware Training and Performance Modeling for Processing-In-Memory Systems.

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

MNSIM-TIME: Performance Modeling Framework for Training-In-Memory Architectures.

Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020

FTT-NAS: Discovering Fault-Tolerant Neural Architecture.

CoRR, 2020

Efficient 16 Boolean logic and arithmetic based on bipolar oxide memristors.

Sci. China Inf. Sci., 2020

MNSIM 2.0: A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Security Enhancement for RRAM Computing System through Obfuscating Crossbar Row Connections.

Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

An Energy-Efficient Quantized and Regularized Training Framework For Processing-In-Memory Accelerators.

Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019

TIME: A Training-in-Memory Architecture for RRAM-Based Deep Neural Networks.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

HDC-IM: Hyperdimensional Computing In-Memory Architecture based on RRAM.

Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

A General Logic Synthesis Framework for Memristor-based Logic Design.

Proceedings of the International Conference on Computer-Aided Design, 2019

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Learning the sparsity for ReRAM: mapping and pruning sparse neural network for ReRAM based accelerator.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018

Mixed size crossbar based RRAM CNN accelerator with overlapped mapping method.

Proceedings of the International Conference on Computer-Aided Design, 2018

Rescuing memristor-based computing with non-linear resistance levels.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Training low bitwidth convolutional neural network on RRAM.

Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017

TIME: A Training-in-memory Architecture for Memristor-based Deep Neural Networks.

Proceedings of the 54th Annual Design Automation Conference, 2017
