Zhenman Fang

Orcid: 0000-0003-0603-9697

According to our database, Zhenman Fang authored at least 72 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
E4SA: An Ultra-Efficient Systolic Array Architecture for 4-Bit Convolutional Neural Networks.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

HiSpMV: Hybrid Row Distribution and Vector Buffering for Imbalanced SpMV Acceleration on FPGAs.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023
CHIP-KNNv2: A Configurable and High-Performance K-Nearest Neighbors Accelerator on HBM-based FPGAs.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

TAPA: A Scalable Task-parallel Dataflow Programming Framework for Modern FPGAs with Co-optimization of HLS and Physical Design.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

SASA: A Scalable and Automatic Stencil Acceleration Framework for Optimized Hybrid Spatial and Temporal Parallelism on HBM-based FPGAs.
ACM Trans. Reconfigurable Technol. Syst., June, 2023

SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery.
IEEE Trans. Geosci. Remote. Sens., 2023

TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-Based FPGAs.
IEEE Trans. Emerg. Top. Comput., 2023

A Cycle-Accurate Soft Error Vulnerability Analysis Framework for FPGA-based Designs.
CoRR, 2023

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Journal Track Paper ICFPT 2023: HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks.
Proceedings of the International Conference on Field Programmable Technology, 2023

HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks.
Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

SQL2FPGA: Automatic Acceleration of SQL Query Processing on Modern CPU-FPGA Platforms.
Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs.
Proceedings of the 31st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2023

ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
SyncNN: Evaluating and Accelerating Spiking Neural Networks on FPGAs.
ACM Trans. Reconfigurable Technol. Syst., 2022

Quick-Div: Rethinking Integer Divider Design for FPGA-based Soft-processors.
ACM Trans. Reconfigurable Technol. Syst., 2022

Demystifying the Soft and Hardened Memory Systems of Modern FPGAs for Software Programmers through Microbenchmarking.
ACM Trans. Reconfigurable Technol. Syst., 2022

Introduction to the Special Section on High-level Synthesis for FPGA: Next-generation Technologies and Applications.
ACM Trans. Design Autom. Electr. Syst., 2022

Algorithm/Hardware Codesign for Real-Time On-Satellite CNN-Based Ship Detection in SAR Imagery.
IEEE Trans. Geosci. Remote. Sens., 2022

Stealthy Attack on Algorithmic-Protected DNNs via Smart Bit Flipping.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

Blind Data Adversarial Bit-flip Attack against Deep Neural Networks.
Proceedings of the 25th Euromicro Conference on Digital System Design, 2022

A Majority-based Approximate Adder for FPGAs.
Proceedings of the 25th Euromicro Conference on Digital System Design, 2022

FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

FPGA-aware automatic acceleration framework for vision transformer with mixed-scheme quantization: late breaking results.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Hardware-efficient stochastic rounding unit design for DNN training: late breaking results.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Programming and Synthesis for Software-defined FPGA Acceleration: Status and Future Prospects.
ACM Trans. Reconfigurable Technol. Syst., 2021

SeaPlace: Process Variation Aware Placement for Reliable Combinational Circuits against SETs and METs.
CoRR, 2021

BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks.
CoRR, 2021

MAPLE: A Machine Learning based Aging-Aware FPGA Architecture Exploration Framework.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers through Microbenchmarking.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

LEAP: A Deep Learning based Aging-Aware Architecture Exploration Framework for FPGAs.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

2020
FPGA-based Near Data Processing Platform Selection Using Fast Performance Modeling (WiP Paper).
Proceedings of the 21st ACM SIGPLAN/SIGBED International Conference on Languages, 2020

Reconfigurable Accelerator Compute Hierarchy: A Case Study using Content-Based Image Retrieval.
Proceedings of the IEEE International Symposium on Workload Characterization, 2020

CHIP-KNN: A Configurable and High-Performance K-Nearest Neighbors Accelerator on Cloud FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2020

Aadam: A Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model using Feed-Forward Neural Network.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

EASpiNN: Effective Automated Spiking Neural Network Evaluation on FPGA.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

Algorithm-Hardware Co-design for BQSR Acceleration in Genome Analysis ToolKit.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019
In-Depth Analysis on Microarchitectures of Modern Heterogeneous CPU-FPGA Platforms.
ACM Trans. Reconfigurable Technol. Syst., 2019

Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Customizable Computing - From Single Chip to Datacenters.
Proc. IEEE, 2019

An FPGA-Based BWT Accelerator for Bzip2 Data Compression.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

Rethinking Integer Divider Design for FPGA-Based Soft-Processors.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

Understanding Performance Gains of Accelerator-Rich Architectures.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018
CPU-FPGA Coscheduling for Big Data Applications.
IEEE Des. Test, 2018

Best-Effort FPGA Programming: A Few Steps Can Go a Long Way.
CoRR, 2018

Doppio: I/O-Aware Performance Analysis, Modeling and Optimization for In-memory Computing Framework.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018

High-Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

Understanding Performance Differences of FPGAs and GPUs: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

K-Flow: A Programming and Scheduling Framework to Optimize Dataflow Execution on CPU-FPGA Platforms: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

High-Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Understanding Performance Differences of FPGAs and GPUs.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017
AIM: accelerating computational genomics through scalable and noninvasive accelerator-interposed memory.
Proceedings of the International Symposium on Memory Systems, 2017

Supporting Address Translation for Accelerator-Centric Architectures.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

CPU-FPGA Co-Optimization for Big Data Applications: A Case Study of In-Memory Samtool Sorting (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

2016
Revisiting FPGA Acceleration of Molecular Dynamics Simulation with Dynamic Data Flow Behavior in High-Level Synthesis.
CoRR, 2016

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architectures.
CoRR, 2016

Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

When Spark Meets FPGAs: A Case Study for Next-Generation DNA Sequencing Acceleration.
Proceedings of the 8th USENIX Workshop on Hot Topics in Cloud Computing, 2016

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architecture (Abstract Only).
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Energy Efficiency of Full Pipelining: A Case Study for Matrix Multiplication.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

A quantitative analysis on microarchitectures of modern CPU-FPGA platforms.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

2015
PARADE: A Cycle-Accurate Full-System Simulation Platform for Accelerator-Rich Architectural Design and Exploration.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

2014
Measuring Microarchitectural Details of Multi- and Many-Core Memory Systems through Microbenchmarking.
ACM Trans. Archit. Code Optim., 2014

Multi-stage coordinated prefetching for present-day processors.
Proceedings of the 2014 International Conference on Supercomputing, 2014

2012
Improving dynamic prediction accuracy through multi-level phase analysis.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2012

Transformer: a functional-driven cycle-accurate multicore simulator.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

2011
A comprehensive analysis and parallelization of an image retrieval algorithm.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011

