Naigang Wang

ORCID: 0000-0001-7664-0061

According to our database, Naigang Wang authored at least 40 papers between 2012 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System.
CoRR, August, 2025

Generative AI Through CAS Lens: An Integrated Overview of Algorithmic Optimizations, Architectural Advances, and Automated Designs.
IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2025

Guest Editorial Generative Artificial Intelligence Compute: Algorithms, Implementations, and Applications to CAS.
IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2025

DiaBlo: Diagonal Blocks Are Sufficient For Finetuning.
CoRR, June, 2025

EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media.
CoRR, May, 2025

Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization.
Trans. Mach. Learn. Res., 2025

CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization.
Trans. Mach. Learn. Res., 2025

COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization.
IEEE Access, 2025

No Time to Lose: Enabling Real-Time Fluorescence Lifetime Imaging on Resource-constrained FPGAs Through Efficient Scheduling.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

2024
Unlocking Real-Time Fluorescence Lifetime Imaging: Multi-Pixel Parallelism for FPGA-Accelerated Processing.
CoRR, 2024

Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging.
CoRR, 2024

MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization.
CoRR, 2024

Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization.
CoRR, 2024

COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization.
CoRR, 2024

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2022
A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling.
IEEE J. Solid State Circuits, 2022

Deep Compression of Pre-trained Transformer Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
All at Once Network Quantization via Collaborative Knowledge Transfer.
CoRR, 2021

A Comprehensive Survey on Hardware-Aware Neural Architecture Search.
CoRR, 2021

4-Bit Quantization of LSTM-Based Speech Recognition Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Hardware-Aware Neural Architecture Search: Survey and Taxonomy.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020
Efficient AI System Design With Cross-Layer Approximate Computing.
Proc. IEEE, 2020

Ultra-Low Precision 4-bit Training of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Innovate Practices on CyberSecurity of Hardware Semiconductor Devices.
Proceedings of the 37th IEEE VLSI Test Symposium, 2019

Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

DLFloat: A 16-b Floating Point Format Designed for Deep Learning Training and Inference.
Proceedings of the 26th IEEE Symposium on Computer Arithmetic, 2019

2018
Training Deep Neural Networks with 8-bit Floating Point Numbers.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Novel IC Sub-Threshold IDDQ Signature And Its Relationship To Aging During High Voltage Stress.
Proceedings of the 48th European Solid-State Device Research Conference, 2018

2015
An 82%-efficient multiphase voltage-regulator 3D interposer with on-chip magnetic inductors.
Proceedings of the Symposium on VLSI Circuits, 2015

2013
A 2.5D Integrated Voltage Regulator Using Coupled-Magnetic-Core Inductors on Silicon Interposer.
IEEE J. Solid State Circuits, 2013

2012
A 2.5D integrated voltage regulator using coupled-magnetic-core inductors on silicon interposer delivering 10.8A/mm².
Proceedings of the 2012 IEEE International Solid-State Circuits Conference, 2012

