Yao Chen

ORCID: 0000-0002-5798-2282

Affiliations:

Advanced Digital Sciences Center, Illinois at Singapore, Singapore

According to our database, Yao Chen authored at least 62 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Geometric Partition for Billion-Scale Approximate Nearest Neighbor Search.

IEEE Trans. Knowl. Data Eng., May, 2026

QoS Awareness and Improved Throughput of Point Cloud Services With Dynamic Workloads.

IEEE Trans. Computers, March, 2026

Efficient KV Cache Spillover Management on Memory-Constrained GPU for LLM Inference.

IEEE Trans. Parallel Distributed Syst., January, 2026

RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2026

2025

Scalable and Load-Balanced Full-Graph GNN Training on Multiple GPUs.

IEEE Trans. Knowl. Data Eng., July, 2025

HLStrans: Dataset for LLM-Driven C-to-HLS Hardware Code Synthesis.

CoRR, July, 2025

Clementi: Efficient Load Balancing and Communication Overlap for Multi-FPGA Graph Processing.

Proc. ACM Manag. Data, June, 2025

Revisiting the Design of In-Memory Dynamic Graph Storage.

Proc. ACM Manag. Data, February, 2025

Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

HiPACK: Efficient Sub-8-Bit Direct Convolution with SIMD and Bitwise Management.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

Blurred Encoding for Trajectory Representation Learning.

Co-authors include: Ryosuke Shibasaki

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, 2025

ScalaGBM: Memory Efficient GBDT Training for High-Dimensional Data on GPU.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Vista: Vector Indexing and Search for Large-Scale Imbalanced Datasets.

Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

Configurable DSP-Based CAM Architecture for Data-Intensive Applications on FPGAs.

Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

2024

Introduction to the Special Issue on FPGA-based Embedded Systems for Industrial and IoT Applications.

Co-authors include: Carlos Enrique Montenegro-Marín, Yun (Eric) Liang, Raymond Nijssen

ACM Trans. Reconfigurable Technol. Syst., December, 2024

Winols: A Large-Tiling Sparse Winograd CNN Accelerator on FPGAs.

ACM Trans. Archit. Code Optim., June, 2024

Aggressive Post-Training Compression on Extremely Large Language Models.

CoRR, 2024

Deep Feature Surgery: Towards Accurate and Efficient Multi-exit Networks.

Proceedings of the Computer Vision - ECCV 2024, 2024

T-Edge: Trusted Heterogeneous Edge Computing.

Proceedings of the Annual Computer Security Applications Conference, 2024

2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs.

Proc. ACM Manag. Data, December, 2023

NIOT: A Novel Inference Optimization of Transformers on Modern CPUs.

IEEE Trans. Parallel Distributed Syst., June, 2023

LightRW: FPGA Accelerated Graph Dynamic Random Walks.

Proc. ACM Manag. Data, 2023

Cybersecurity for Modern Smart Grid Against Emerging Threats.

Co-authors include: Daisuke Mashima, Muhammad M. Roomi, Subhash Lakshminarayana

Found. Trends Priv. Secur., 2023

HongTu: Scalable Full-Graph GNN Training on Multiple GPUs (via communication-optimized CPU data offloading).

CoRR, 2023

2022

ThunderGP: Resource-Efficient Graph Processing Framework on FPGAs with HLS.

ACM Trans. Reconfigurable Technol. Syst., 2022

HiKonv: Maximizing the Throughput of Quantized Convolution With Novel Bit-wise Management and Computation.

CoRR, 2022

Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems.

CoRR, 2022

YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs.

Co-authors include: Marianne Winslett

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

ReGraph: Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

HiKonv: High Throughput Quantized Convolution With Novel Bit-wise Management and Computation.

Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

2021

Learning-Based Simultaneous Detection and Characterization of Time Delay Attack in Cyber-Physical Systems.

Co-authors include: David K. Y. Yau, Marianne Winslett

IEEE Trans. Smart Grid, 2021

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization.

IEEE Trans. Computers, 2021

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT.

Co-authors include: Mohammad Ali Khan, Marianne Winslett

Trans. Assoc. Comput. Linguistics, 2021

Free Lunch for Co-Saliency Detection: Context Adjustment.

CoRR, 2021

3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration.

CoRR, 2021

ThundeRiNG: generating multiple independent random number sequences on FPGAs.

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration.

Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

ThunderGP: HLS-based Graph Processing Framework on FPGAs.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

Skew-Oblivious Data Routing for Data Intensive Applications on FPGAs with HLS.

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

MELOPPR: Software/Hardware Co-design for Memory-efficient Low-latency Personalized PageRank.

Co-authors include: Zacharie Zirnheld

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

WinoCNN: Kernel Sharing Winograd Systolic Array for Efficient Convolutional Neural Network Acceleration on FPGAs.

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy.

Co-authors include: Zhenxiang Zhang, Marianne Winslett

Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Is FPGA Useful for Hash Joins?

Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

TAG : Type Auxiliary Guiding for Code Comment Generation.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices.

CoRR, 2019

T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA.

Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

Pico-Ampere Voltage References for IoT Systems.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

µL2Q: An Ultra-Low Loss Quantization Method for DNN Compression.

Proceedings of the International Joint Conference on Neural Networks, 2019

NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving.

Proceedings of the International Conference on Computer-Aided Design, 2019

On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL-Based FPGAs.

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

2018

A locality-aware shuffle optimization on fat-tree data centers.

Future Gener. Comput. Syst., 2018

HaaS: Cloud-Based Real-Time Data Analytics with Heterogeneity-Aware Scheduling.

Co-authors include: Marianne Winslett

Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

HASS: High Accuracy Spike Sorting with Wavelet Package Decomposition and Mutual Information.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018

2016

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow.

Co-authors include: Swathi T. Gurumani

IEEE Trans. Very Large Scale Integr. Syst., 2016

FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow.

Co-authors include: Swathi T. Gurumani

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

SoC, NoC and Hierarchical Bus Implementations of Applications on FPGAs Using the FCUDA Flow.

Co-authors include: Swathi T. Gurumani

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

High Level Synthesis of Complex Applications: An H.264 Video Decoder.

Co-authors include: Swathi T. Gurumani

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

2015

System-level design solutions: Enabling the IoT explosion.

Co-authors include: Swathi T. Gurumani

Proceedings of the 2015 IEEE 11th International Conference on ASIC, 2015

2014

Integrated CUDA-to-FPGA Synthesis with Network-on-Chip.

Co-authors include: Swathi T. Gurumani

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014
