Shaoyi Huang

According to our database, Shaoyi Huang authored at least 25 papers between 2021 and 2024.

Bibliography

2024
Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM.
CoRR, 2024

PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training.
CoRR, 2023

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference.
CoRR, 2023

LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Lossless Head Pruning through Automatic Peer Distillation for Language Models.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach.
CoRR, 2022

An Automatic and Efficient BERT Pruning for Edge AI Systems.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Towards Sparsification of Graph Neural Networks.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

A length adaptive algorithm-hardware co-design of transformer on FPGA through sparse attention and dynamic pipelining.
Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022

Analyzing and Defending against Membership Inference Attacks in Natural Language Processing Classification.
Proceedings of the IEEE International Conference on Big Data, 2022

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm.
CoRR, 2021

E.T.: re-thinking self-attention for transformer models on GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Accommodating Transformer onto FPGA: Coupling the Balanced Model Compression and FPGA-Implementation Optimization.
Proceedings of the Great Lakes Symposium on VLSI, 2021

Co-Exploration of Graph Neural Network and Network-on-Chip Design Using AutoML.
Proceedings of the Great Lakes Symposium on VLSI, 2021

HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.
Proceedings of the Great Lakes Symposium on VLSI, 2021
