Shulin Zeng

Orcid: 0000-0002-1030-3748

According to our database, Shulin Zeng authored at least 40 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

CD-LLM: A Heterogeneous Multi-FPGA System for Batched Decoding of 70B+ LLMs Using a Compute-Dedicated Architecture.

ACM Trans. Reconfigurable Technol. Syst., March, 2026

2025

Robustness Against Faults in Configuration Memories of FPGA-Based LLMs.

IEEE Trans. Circuits Syst. Artif. Intell., June, 2025

ShiftQuant: Toward Accurate and Efficient Sub-8-bit Integer Training.

IEEE Trans. Circuits Syst. Video Technol., 2025

TB-STC: Transposable Block-wise N:M Structured Sparse Tensor Core.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

FMC-LLM: Enabling FPGAs for Efficient Batched Decoding of 70B+ LLMs with a Memory-Centric Streaming Architecture.

Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

FlightVGM: Efficient Video Generation Model Inference with Online Sparsification and Hybrid Precision on FPGAs.

Jun Liu

Shulin Zeng

Li Ding

Widyadewi Soedarmadji

Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

PARO: Hardware-Software Co-design with Pattern-aware Reorder-based Attention Quantization in Video Generation Models.

Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

2024

An Efficient Flood Detection Method With Satellite Images Based on Algorithm-Hardware Co-Design.

IEEE Geosci. Remote. Sens. Lett., 2024

Towards Accurate and Efficient Sub-8-Bit Integer Training.

CoRR, 2024

Efficient and Effective Retrieval of Dense-Sparse Hybrid Vectors using Graph-based Approximate Nearest Neighbor Search.

CoRR, 2024

GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneous System.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2024

FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs.

Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

DySpMM: From Fix to Dynamic for Sparse Matrix-Matrix Multiplication Accelerators.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

CoGNN: An Algorithm-Hardware Co-Design Approach to Accelerate GNN Inference With Minibatch Sampling.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023

Serving Multi-DNN Workloads on FPGAs: A Coordinated Architecture, Scheduling, and Mapping Perspective.

IEEE Trans. Computers, May, 2023

DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Processing-In-Hierarchical-Memory Architecture for Billion-Scale Approximate Nearest Neighbor Search.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

An Efficient Accelerator for Point-based and Voxel-based Point Cloud Neural Networks.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring.

Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

2022

Soft Error Tolerant Convolutional Neural Networks on FPGAs With Ensemble Learning.

IEEE Trans. Very Large Scale Integr. Syst., 2022

A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud.

ACM Trans. Reconfigurable Technol. Syst., 2022

Exploring the Potential of Low-Bit Training of Convolutional Neural Networks.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

INCAME: Interruptible CNN Accelerator for Multirobot Exploration.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Efficient Autonomous Driving System Design: From Software to Hardware.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

2021

3M-AI: A Multi-task and Multi-core Virtualization Framework for Multi-FPGA AI Systems in the Cloud.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

Reliability-Aware Training and Performance Modeling for Processing-In-Memory Systems.

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

Efficient Computing Platform Design for Autonomous Driving Systems.

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

Ensemble of Pruned Networks for Reliable Classifiers.

Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020

Towards Lower Bit Multiplication for Convolutional Neural Network Training.

CoRR, 2020

Optimizing CNN Accelerator With Improved Roofline Model.

Shaoxia Fang

Shulin Zeng

Yu Wang

Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

Enable Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

INCAME: INterruptible CNN Accelerator for Multi-robot Exploration.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud.

Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

INCA: INterruptible CNN Accelerator for Multi-tasking in Embedded Robots.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Black Box Search Space Profiling for Accelerator-Aware Neural Architecture Search.

Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019

A Survey of FPGA-based Neural Network Inference Accelerators.

ACM Trans. Reconfigurable Technol. Syst., 2019

A Fine-Grained Sparse Accelerator for Multi-Precision DNN.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

2018

An Efficient Reconfigurable Framework for General Purpose CNN-RNN Models on FPGAs.

Proceedings of the 23rd IEEE International Conference on Digital Signal Processing, 2018

2017

A Survey of FPGA Based Neural Network Accelerator.

CoRR, 2017
