Shaoyi Huang

Orcid: 0000-0001-6093-9798

According to our database, Shaoyi Huang authored at least 47 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Different Prompts, Different Ranks: Prompt-aware Dynamic Rank Selection for SVD-based LLM Compression.

CoRR, May, 2026

Distributed Interpretability and Control for Large Language Models.

Dev Arpan Desai

Shaoyi Huang

Zining Zhu

CoRR, April, 2026

GSR-GNN: Training Acceleration and Memory-Saving Framework of Deep GNNs on Circuit Graph.

CoRR, March, 2026

Roots Beneath the Cut: Uncovering the Risk of Concept Revival in Pruning-Based Unlearning for Diffusion Models.

CoRR, March, 2026

Effective MoE-based LLM Compression by Exploiting Heterogeneous Inter-Group Experts Routing Frequency and Information Density.

CoRR, February, 2026

Rethinking the Potential of Layer Freezing for DNN Training Efficiency.

Proceedings of the Great Lakes Symposium on VLSI 2026, 2026

Late Breaking Results: Conversion of Neural Networks into Logic Flows for Edge Computing.

Proceedings of the Design, Automation & Test in Europe Conference, 2026

2025

Layer-wise Weight Selection for Power-Efficient Neural Network Acceleration.

Jiaxun Fang

Grace Li Zhang

Shaoyi Huang

CoRR, November, 2025

Towards Fast LLM Fine-tuning through Zeroth-Order Optimization with Projected Gradient-Aligned Perturbations.

CoRR, October, 2025

PEL-NAS: Search Space Partitioned Architecture Prompt Co-Evolutionary LLM-driven Hardware-Aware Neural Architecture Search.

Hengyi Zhu

Grace Li Zhang

Shaoyi Huang

CoRR, October, 2025

Layer-wise dynamic rank for compressing large language models.

CoRR, September, 2025

End-to-End On-Device Quantization-Aware Training for LLMs at Inference Cost.

CoRR, September, 2025

Rethinking the Potential of Layer Freezing for Efficient DNN Training.

CoRR, August, 2025

ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning.

CoRR, May, 2025

KerZOO: Kernel Function Informed Zeroth-Order Optimization for Accurate and Accelerated LLM Fine-Tuning.

CoRR, May, 2025

DR-CircuitGNN: Training Acceleration of Heterogeneous Circuit Graph Neural Network on GPUs.

Proceedings of the 39th ACM International Conference on Supercomputing, 2025

GROOT: Graph Edge Re-growth and Partitioning for the Verification of Large Designs in Logic Synthesis.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2025

Efficient Context Propagating Perceiver Architectures for Auto-Regressive Language Modeling.

Kaleel Mahmood

Shaoyi Huang

Proceedings of the ECAI 2025 - 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy, 2025

2024

ASPLOS 2024 Artifact for "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training".

Dataset, February, 2024

ASPLOS 2024 Artifact for "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training".

Dataset, February, 2024

PromptV: Leveraging LLM-powered Multi-Agent Prompting for High-quality Verilog Generation.

CoRR, 2024

Enhanced Computationally Efficient Long LoRA Inspired Perceiver Architectures for Auto-Regressive Language Modeling.

Kaleel Mahmood

Shaoyi Huang

CoRR, 2024

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM.

CoRR, 2024

PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training.

CoRR, 2023

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference.

CoRR, 2023

LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Lossless Head Pruning through Automatic Peer Distillation for Language Models.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022

Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach.

CoRR, 2022

An Automatic and Efficient BERT Pruning for Edge AI Systems.

Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Towards Sparsification of Graph Neural Networks.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

A length adaptive algorithm-hardware co-design of transformer on FPGA through sparse attention and dynamic pipelining.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Analyzing and Defending against Membership Inference Attacks in Natural Language Processing Classification.

Sanguthevar Rajasekaran

Proceedings of the IEEE International Conference on Big Data, 2022

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm.

Sanguthevar Rajasekaran

Hang Liu

Caiwen Ding

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm.

CoRR, 2021

E.T.: re-thinking self-attention for transformer models on GPUs.

Proceedings of the International Conference for High Performance Computing, 2021

Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning.

Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Accommodating Transformer onto FPGA: Coupling the Balanced Model Compression and FPGA-Implementation Optimization.

Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Co-Exploration of Graph Neural Network and Network-on-Chip Design Using AutoML.

Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.

Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021
