Guihai Yan

Orcid: 0000-0002-1254-3278

According to our database, Guihai Yan authored at least 82 papers between 2008 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

DPU for Cybersecurity: Enabling Inline Defense and Self-Protection.

Xiaowei Li

Yunkun Liao

Guihai Yan

J. Comput. Sci. Technol., January, 2026

RAPID: Accelerating Point Cloud Diffusion Models via Space-Aware Mix-Precision Quantization.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

2025

GRACE: An End-to-End Graph Processing Accelerator on FPGA With Graph Reordering Engine.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., October, 2025

Co-ViSu: Accelerating Video Super-Resolution With Codec Information Reuse.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., September, 2025

KPU: Kernel Processing Unit for in-Memory Analytical Query Processing.

IEEE Trans. Computers, August, 2025

FUS: FPGA-based Universal Sketch with homogeneous and heterogeneous memory architectures.

CCF Trans. High Perform. Comput., June, 2025

Hermes: Accelerating Packet Processing in DPU with Neural Network.

Proceedings of the 43rd IEEE International Conference on Computer Design, 2025

Flame: A Multiplier-Free LLM Accelerator with Dynamic Block Floating Point.

Ao Lyu

Haishuang Fan

Guihai Yan

Proceedings of the 43rd IEEE International Conference on Computer Design, 2025

SNO: Securing Network Function Offloading on FPGA-based SmartNICs in Untrusted Clouds.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2025

APTO: Accelerating Serialization-Based Point Cloud Transformers with Position-Aware Pruning.

Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024

Satisfying Energy-Efficiency Constraints for Mobile Systems.

IEEE Trans. Mob. Comput., December, 2024

Monocular 3D Multi-Person Pose Estimation for On-Site Joint Flexion Assessment: A Case of Extreme Knee Flexion Detection.

Sensors, October, 2024

DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters.

IEEE Trans. Computers, August, 2024

AMST: Accelerating Large-Scale Graph Minimum Spanning Tree Computation on FPGA.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Efficient RNIC Cache Side-Channel Attack Detection Through DPU-Driven Architecture.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Athena: Add More Intelligence to RMT-Based Network Data Plane with Low-Bit Quantization.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

PHD: Parallel Huffman Decoder on FPGA for Extreme Performance and Energy Efficiency.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Co-Via: A Video Frame Interpolation Accelerator Exploiting Codec Information Reuse.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

TianMen: a DPU-based storage network offloading structure for disaggregated datacenters.

Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024

2023

DOE: database offloading engine for accelerating SQL processing.

Distributed Parallel Databases, September, 2023

FlatProxy: A DPU-centric Service Mesh Architecture for Hyperscale Cloud-native Application.

CoRR, 2023

BitColor: Accelerating Large-Scale Graph Coloring on FPGA with Parallel Bit-Wise Engines.

Proceedings of the 52nd International Conference on Parallel Processing, 2023

Optimize the TX Architecture of RDMA NIC for Performance Isolation in the Cloud Environment.

Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

KPU-SQL: Kernel Processing Unit for High-Performance SQL Acceleration.

Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

M2VT: A Multi-Output Encoder Accelerator for Multiple-Way Video Transcoding.

Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

Co-ViSu: a Video Super-Resolution Accelerator Exploiting Codec Information Reuse.

Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design - A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach

Xiaowei Li

Guihai Yan

Cheng Liu

Springer, ISBN: 978-981-19-8550-8, 2023

2022

Portrait: A holistic computation and bandwidth balanced performance evaluation model for heterogeneous systems.

Sustain. Comput. Informatics Syst., 2022

DOE: Database Offloading Engine for Accelerating SQL Processing.

Proceedings of the 38th IEEE International Conference on Data Engineering Workshops, 2022

Using Psychophysics to Guide Power Adaptation for Input Methods on Mobile Architectures.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021

ShuntFlowPlus: An Efficient and Scalable Dataflow Accelerator Architecture for Stream Applications.

ACM J. Emerg. Technol. Comput. Syst., 2021

2020

A Quantitative Exploration of Collaborative Pruning and Approximation Computing Towards Energy Efficient Neural Networks.

IEEE Des. Test, 2020

2019

SynergyFlow: An Elastic Accelerator Architecture Supporting Batch Processing of Large-Scale Deep Neural Networks.

ACM Trans. Design Autom. Electr. Syst., 2019

ShuttleNoC: Power-Adaptable Communication Infrastructure for Many-Core Processors.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Promoting the Harmony between Sparsity and Regularity: A Relaxed Synchronous Architecture for Convolutional Neural Networks.

IEEE Trans. Computers, 2019

SqueezeFlow: A Sparse CNN Accelerator Exploiting Concise Convolution Rules.

IEEE Trans. Computers, 2019

BZIP: A Compact Data Memory System for UTXO-based Blockchains.

Proceedings of the 15th IEEE International Conference on Embedded Software and Systems, 2019

MLA: Machine Learning Adaptation for Realtime Streaming Financial Applications.

Proceedings of the Tenth International Green and Sustainable Computing Conference, 2019

ShuntFlow: An Efficient and Scalable Dataflow Accelerator Architecture for Streaming Applications.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

TNPU: an efficient accelerator architecture for training convolutional neural networks.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018

AdaFlow: Aggressive Convolutional Neural Networks Approximation by Leveraging the Input Variability.

Wenyan Lu

Guihai Yan

Xiaowei Li

J. Low Power Electron., 2018

Optimizing Memory Efficiency for Deep Convolutional Neural Network Accelerators.

Xiaowei Li

Jiajun Li

Guihai Yan

J. Low Power Electron., 2018

CPicker: Leveraging Performance-Equivalent Configurations to Improve Data Center Energy Efficiency.

J. Comput. Sci. Technol., 2018

Joint Design of Training and Hardware Towards Efficient and Accuracy-Scalable Neural Network Inference.

IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

Fault tolerance on-chip: a reliable computing paradigm using self-test, self-diagnosis, and self-repair (3S) approach.

Sci. China Inf. Sci., 2018

AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference.

Proceedings of the International Symposium on Low Power Electronics and Design, 2018

Tetris: re-architecting convolutional neural network computation for machine learning accelerators.

Proceedings of the International Conference on Computer-Aided Design, 2018

SmartShuttle: Optimizing off-chip memory accesses for deep learning accelerators.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

CCR: A concise convolution rule for sparse neural network accelerators.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

RiskCap: Minimizing Effort of Error Regulation for Approximate Computing.

Proceedings of the 27th IEEE Asian Test Symposium, 2018

2017

PowerTrader: Enforcing Autonomous Power Management for Future Large-Scale Many-Core Processors.

IEEE Trans. Multi Scale Comput. Syst., 2017

Exploiting the Potential of Computation Reuse Through Approximate Computing.

IEEE Trans. Multi Scale Comput. Syst., 2017

FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks.

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

ApproxEye: Enabling approximate computation reuse for microrobotic computer vision.

Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016

EcoUp: Towards Economical Datacenter Upgrading.

IEEE Trans. Parallel Distributed Syst., 2016

CoreRank: Redeeming "Sick Silicon" by Dynamically Quantifying Core-Level Healthy Condition.

IEEE Trans. Computers, 2016

An Analytical Framework for Estimating Scale-Out and Scale-Up Power Efficiency of Heterogeneous Manycores.

IEEE Trans. Computers, 2016

Wide Operational Range Processor Power Delivery Design for Both Super-Threshold Voltage and Near-Threshold Voltage Computing.

J. Comput. Sci. Technol., 2016

PowerCap: Leverage Performance-Equivalent Resource Configurations for power capping.

Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

ACR: Enabling computation reuse for approximate computing.

Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015

RISO: Enforce Noninterfered Performance With Relaxed Network-on-Chip Isolation in Many-Core Cloud Processors.

IEEE Trans. Very Large Scale Integr. Syst., 2015

ShuttleNoC: Boosting on-chip communication efficiency by enabling localized power adaptation.

Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014

Orchestrator: Guarding Against Voltage Emergencies in Multithreaded Applications.

IEEE Trans. Very Large Scale Integr. Syst., 2014

SmartCap: Using Machine Learning for Power Adaptation of Smartphone's Application Processor.

ACM Trans. Design Autom. Electr. Syst., 2014

SuperRange: Wide operational range power delivery design for both STV and NTV computing.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

On-Chip Delay Sensor for Environments with Large Temperature Fluctuations.

Jibing Qiu

Guihai Yan

Xiaowei Li

Proceedings of the 23rd IEEE Asian Test Symposium, 2014

Amphisbaena: Modeling two orthogonal ways to hunt on heterogeneous many-cores.

Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013

SmartCap: user experience-oriented power adaptation for smartphone's application processor.

Proceedings of the Design, Automation and Test in Europe, 2013

Orchestrator: a low-cost solution to reduce voltage emergencies for multi-threaded applications.

Proceedings of the Design, Automation and Test in Europe, 2013

RISO: relaxed network-on-chip isolation for cloud processors.

Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012

AgileRegulator: A hybrid voltage regulator scheme redeeming dark silicon for power efficiency in a multicore architecture.

Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011

SVFD: A Versatile Online Fault Detection Scheme via Checking of Stability Violation.

Guihai Yan

Yinhe Han

Xiaowei Li

IEEE Trans. Very Large Scale Integr. Syst., 2011

MicroFix: Using timing interpolation and delay sensors for power reduction.

ACM Trans. Design Autom. Electr. Syst., 2011

ReviveNet: A Self-Adaptive Architecture for Improving Lifetime Reliability via Localized Timing Adaptation.

Guihai Yan

Yinhe Han

Xiaowei Li

IEEE Trans. Computers, 2011

Online timing variation tolerance for digital integrated circuits.

Guihai Yan

Xiaowei Li

Proceedings of the 2011 IEEE International Test Conference, 2011

2010

Performance-asymmetry-aware scheduling for Chip Multiprocessors with static core coupling.

J. Syst. Archit., 2010

Leveraging the core-level complementary effects of PVT variations to reduce timing emergencies in multi-core processors.

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

2009

Variation-Aware Scheduling for Chip Multiprocessors with Thread Level Redundancy.

Proceedings of the 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, 2009

MicroFix: exploiting path-grained timing adaptability for improving power-performance efficiency.

Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

A unified online Fault Detection scheme via checking of Stability Violation.

Guihai Yan

Yinhe Han

Xiaowei Li

Proceedings of the Design, Automation and Test in Europe, 2009

M-IVC: Using Multiple Input Vectors to Minimize Aging-Induced Delay.

Proceedings of the Eighteenth Asian Test Symposium, 2009

2008

BAT: Performance-Driven Crosstalk Mitigation Based on Bus-Grouping Asynchronous Transmission.

IEICE Trans. Electron., 2008
