Peipei Zhou

Orcid: 0000-0002-0493-1844

Affiliations:
  • Brown University, RI, USA
  • University of Pittsburgh, PA, USA (former)
  • University of California, Los Angeles, CA, USA (Ph.D.)


According to our database, Peipei Zhou authored at least 48 papers between 2014 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
MTrain: Enable Efficient CNN Training on Heterogeneous FPGA-Based Edge Servers.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., September, 2025

AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration.
CoRR, April, 2025

ART: Customizing Accelerators for DNN-Enabled Real-Time Safety-Critical Systems.
Proceedings of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025, New Orleans, LA, USA, 30 June 2025, 2025

ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

Towards Accelerator Customization in Real-time Safety-critical Systems.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

Ph.D. Project ARIES: Efficient Mapping and Automated Compilation for AMD Versal Devices.
Proceedings of the 33rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2025

Ph.D. Project AIM: Accelerating Arbitrary-Precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP.
Proceedings of the 33rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2025

2024
FiberFlex: Real-time FPGA-based Intelligent and Distributed Fiber Sensor System for Pedestrian Recognition.
ACM Trans. Reconfigurable Technol. Syst., December, 2024

CHEF: A Framework for Deploying Heterogeneous Models on Clusters With Heterogeneous FPGAs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2024

EQ-ViT: Algorithm-Hardware Co-Design for End-to-End Acceleration of Real-Time Vision Transformer Inference on Versal ACAP Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2024

CHARM 2.0: Composing Heterogeneous Accelerators for Deep Learning on Versal ACAP Architecture.
ACM Trans. Reconfigurable Technol. Syst., September, 2024

Towards Error Correction for Computing in Racetrack Memory.
CoRR, 2024

Towards Data-center Level Carbon Modeling and Optimization for Deep Learning Inference.
CoRR, 2024

Towards Carbon Modeling of Cloud Servers with Accelerators.
CoRR, 2024

SCARIF: Towards Carbon Modeling of Cloud Servers with Accelerators.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2024

Amortizing Embodied Carbon Across Generations.
Proceedings of the 15th IEEE International Green and Sustainable Computing Conference, 2024

SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

Enabling On-Device Large Language Model Personalization with Self-Supervised Data Selection and Synthesis.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Reducing Smart Phone Environmental Footprints with In-Memory Processing.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2024

Challenges and Opportunities to Enable Large-Scale Computing via Heterogeneous Chiplets.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023
Sustainable AI Processing at the Edge.
IEEE Micro, 2023

AutoMM: Energy-Efficient Multi-Data-Type Matrix Multiply Design on Heterogeneous Programmable System-on-Chip.
CoRR, 2023

REFRESH FPGAs: Sustainable FPGA Chiplet Architectures.
Proceedings of the 14th International Green and Sustainable Computing Conference, 2023

AIM: Accelerating Arbitrary-Precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

High Performance, Low Power Matrix Multiply Design on ACAP: from Architecture, Design Challenges and DSE Perspectives.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
EF-Train: Enable Efficient On-device CNN Training on FPGA through Data Reshaping for Online Adaptation or Personalization.
ACM Trans. Design Autom. Electr. Syst., 2022

Enabling Weakly Supervised Temporal Action Localization From On-Device Learning of the Video Stream.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Sustainable AI Processing at the Edge.
CoRR, 2022

H2H: heterogeneous model to heterogeneous system mapping with computation and communication awareness.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices.
ACM Trans. Embed. Comput. Syst., 2021

MOCHA: Multinode Cost Optimization in Heterogeneous Clouds with Accelerators.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

2020
Algorithm-Hardware Co-design for BQSR Acceleration in Genome Analysis ToolKit.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019
Modeling and Optimization for Customized Computing: Performance, Energy and Cost Perspective.
PhD thesis, 2019

Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

2018
Best-Effort FPGA Programming: A Few Steps Can Go a Long Way.
CoRR, 2018

Doppio: I/O-Aware Performance Analysis, Modeling and Optimization for In-memory Computing Framework.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018

SODA: stencil with optimized dataflow architecture.
Proceedings of the International Conference on Computer-Aided Design, 2018

An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

ST-Accel: A High-Level Programming Platform for Streaming Applications on FPGA.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Latte: Locality Aware Transformation for High-Level Synthesis.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017
Bandwidth Optimization Through On-Chip Memory Restructuring for HLS.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architectures.
CoRR, 2016

Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architecture (Abstract Only).
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Energy Efficiency of Full Pipelining: A Case Study for Matrix Multiplication.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2014
A Fully Pipelined and Dynamically Composable Architecture of CGRA.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

