Peipei Zhou

Orcid: 0000-0002-0493-1844

Affiliations:
  • University of Pittsburgh, PA, USA
  • University of California, Los Angeles, CA, USA (Ph.D.)


According to our database1, Peipei Zhou authored at least 33 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
Towards Data-center Level Carbon Modeling and Optimization for Deep Learning Inference.
CoRR, 2024

Towards Carbon Modeling of Cloud Servers with Accelerators.
CoRR, 2024

SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023
Sustainable AI Processing at the Edge.
IEEE Micro, 2023

REFRESH FPGAs: Sustainable FPGA Chiplet Architectures.
CoRR, 2023

Challenges and Opportunities to Enable Large-Scale Computing via Heterogeneous Chiplets.
CoRR, 2023

Enabling On-Device Large Language Model Personalization with Self-Supervised Data Selection and Synthesis.
CoRR, 2023

AutoMM: Energy-Efficient Multi-Data-Type Matrix Multiply Design on Heterogeneous Programmable System-on-Chip.
CoRR, 2023

AIM: Accelerating Arbitrary-Precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

High Performance, Low Power Matrix Multiply Design on ACAP: from Architecture, Design Challenges and DSE Perspectives.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
EF-Train: Enable Efficient On-device CNN Training on FPGA through Data Reshaping for Online Adaptation or Personalization.
ACM Trans. Design Autom. Electr. Syst., 2022

Enabling Weakly Supervised Temporal Action Localization From On-Device Learning of the Video Stream.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Sustainable AI Processing at the Edge.
CoRR, 2022

H2H: heterogeneous model to heterogeneous system mapping with computation and communication awareness.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Algorithm-hardware Co-design of Attention Mechanism on FPGA Devices.
ACM Trans. Embed. Comput. Syst., 2021

MOCHA: Multinode Cost Optimization in Heterogeneous Clouds with Accelerators.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

2020
Algorithm-Hardware Co-design for BQSR Acceleration in Genome Analysis ToolKit.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019
Modeling and Optimization for Customized Computing: Performance, Energy and Cost Perspective.
PhD thesis, 2019

Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

2018
Best-Effort FPGA Programming: A Few Steps Can Go a Long Way.
CoRR, 2018

Doppio: I/O-Aware Performance Analysis, Modeling and Optimization for In-memory Computing Framework.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018

SODA: stencil with optimized dataflow architecture.
Proceedings of the International Conference on Computer-Aided Design, 2018

An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

ST-Accel: A High-Level Programming Platform for Streaming Applications on FPGA.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Latte: Locality Aware Transformation for High-Level Synthesis.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017
Bandwidth Optimization Through On-Chip Memory Restructuring for HLS.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architectures.
CoRR, 2016

Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architecture (Abstact Only).
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Energy Efficiency of Full Pipelining: A Case Study for Matrix Multiplication.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2014
A Fully Pipelined and Dynamically Composable Architecture of CGRA.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014


  Loading...