Peng Zhang

Affiliations:

Peking University, Advanced Institute of Information Technology, Hangzhou, China
Falcon Computing Solutions, Inc., Los Angeles, CA, USA
University of California, Los Angeles, CA, USA (former)

According to our database, Peng Zhang authored at least 40 papers between 2011 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

A hardware-friendly algorithm for LCU-level pipe-lined integer motion estimation.

Multim. Tools Appl., January, 2024

2023

A Reconfigurable Multiple Transform Selection Architecture for VVC.

IEEE Trans. Very Large Scale Integr. Syst., May, 2023

An Efficient Real-Time Hardware Architecture for Deblocking Filter in AVS3.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Scanline-based fast algorithm and pipelined hardware design of rate-distortion optimized quantization for AVS3.

Proceedings of the IEEE International Conference on Consumer Electronics, 2023

Architecture Design of AVS3 Fractional Motion Estimation for 4K UHD Video Coding.

Proceedings of the IEEE International Conference on Consumer Electronics, 2023

Fast Algorithm and VLSI Architecture Design of Rough Mode Decision for AVS3.

Proceedings of the IEEE International Conference on Consumer Electronics, 2023

An Improved Hardware Architecture for Integer-Pixel Motion Estimation in AVS3.

Proceedings of the IEEE International Conference on Consumer Electronics, 2023

2022

An Area-efficient Unified Transform Architecture for VVC.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A 3.1 Gbin/s advanced entropy coding hardware design for AVS3.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Efficient Algorithm and Hardware Architecture for Rate Estimation in Mode Decision of AVS3.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

A Parallel and Pipelined Hardware Architecture for Fractional-Pixel Motion Estimation in AVS3.

Proceedings of the IEEE International Conference on Consumer Electronics, 2022

A Fast CU Partition Decision Strategy for AVS3 Intra Coding.

Proceedings of the IEEE International Conference on Consumer Electronics, 2022

2021

A Multiplier-less Transform Architecture with the Diagonal Data Mapping Transpose Memory for The AVS3 Standard.

Proceedings of the 14th IEEE International Conference on ASIC, 2021

2019

Overcoming Data Transfer Bottlenecks in DNN Accelerators via Layer-Conscious Memory Management.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

2018

AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture.

CoRR, 2018

TGPA: tile-grained pipeline architecture for low latency CNN inference.

Proceedings of the International Conference on Computer-Aided Design, 2018

S2FA: an accelerator automation framework for heterogeneous computing in datacenters.

Proceedings of the 55th Annual Design Automation Conference, 2018

Automated accelerator generation and optimization with composable, parallel and pipeline architecture.

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

HLScope+: Fast and accurate performance estimation for FPGA HLS.

Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs.

Proceedings of the 54th Annual Design Automation Conference, 2017

2016

An Optimal Microarchitecture for Stencil Computation Acceleration Based on Nonuniform Partitioning of Data Reuse Buffers.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Software Infrastructure for Enabling FPGA-Based Accelerations in Data Centers: Invited Paper.

Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

Source-to-Source Optimization for HLS.

Proceedings of the FPGAs for Software Programmers, 2016

2015

High efficiency VLSI implementation of an edge-directed video up-scaler using high level synthesis.

Proceedings of the IEEE International Conference on Consumer Electronics, 2015

Resource-Aware Throughput Optimization for High-Level Synthesis.

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

CMOST: a system-level FPGA compilation framework.

Proceedings of the 52nd Annual Design Automation Conference, 2015

2014

Combining computation and communication optimizations in system synthesis for streaming applications.

Jason Cong, Muhuan Huang, Peng Zhang

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

FPGA Acceleration for Simultaneous Medical Image Reconstruction and Segmentation.

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

An Optimal Microarchitecture for Stencil Computation Acceleration Based on Non-Uniform Partitioning of Data Reuse Buffers.

Proceedings of the 51st Annual Design Automation Conference 2014, 2014

2013

Automatic multidimensional memory partitioning for FPGA-based accelerators (abstract only).

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Polyhedral-based data reuse optimization for configurable computing.

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Efficient system-level mapping from streaming applications to FPGAs (abstract only).

Jason Cong, Muhuan Huang, Peng Zhang

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Memory partitioning for multidimensional arrays in high-level synthesis.

Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012

Task-Level Data Model for Hardware Synthesis Based on Concurrent Collections.

J. Electr. Comput. Eng., 2012

A Study on the Impact of Compiler Optimizations on High-Level Synthesis.

Proceedings of the Languages and Compilers for Parallel Computing, 2012

Memory partitioning and scheduling co-optimization in behavioral synthesis.

Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Combining module selection and replication for throughput-driven streaming programs.

Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Optimizing memory hierarchy allocation with loop transformations for high-level synthesis.

Jason Cong, Peng Zhang, Yi Zou

Proceedings of the 49th Annual Design Automation Conference 2012, 2012

An integrated and automated memory optimization flow for FPGA behavioral synthesis.

Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

2011

Combined loop transformation and hierarchy allocation for data reuse optimization.

Jason Cong, Peng Zhang, Yi Zou

Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
