Zongwu Wang

Orcid: 0009-0003-2157-4927

According to our database, Zongwu Wang authored at least 46 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization.
CoRR, June, 2025

MILLION: Mastering Long-Context LLM Inference Via Outlier-Immunized KV Product Quantization.
CoRR, April, 2025

ALLMod: Exploring Area-Efficiency of LUT-based Large Number Modular Reduction via Hybrid Workloads.
CoRR, March, 2025

SpMMPlu-Pro: An Enhanced Compiler Plug-In for Efficient SpMM and Sparsity Propagation Algorithm.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2025

STCO: Enhancing Training Efficiency via Structured Sparse Tensor Compilation Optimization.
ACM Trans. Design Autom. Electr. Syst., 2025

FATE: Boosting the Performance of Hyper-Dimensional Computing Intelligence with Flexible Numerical DAta TypE.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

CROSS: Compiler-Driven Optimization of Sparse DNNs Using Sparse/Dense Computation Kernels.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

EVASION: Efficient KV CAche CompreSsion vIa PrOduct QuaNtization.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

OPS: Outlier-Aware Precision-Slice Framework for LLM Acceleration.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

HyperDyn: Dynamic Dimensional Masking for Efficient Hyper-Dimensional Computing.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

TAIL: Exploiting Temporal Asynchronous Execution for Efficient Spiking Neural Networks with Inter-Layer Parallelism.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

Exploiting Differential-Based Data Encoding for Enhanced Query Efficiency.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024
ERA-BS: Boosting the Efficiency of ReRAM-Based PIM Accelerator With Fine-Grained Bit-Level Sparsity.
IEEE Trans. Computers, September, 2024

TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication.
CoRR, 2024

COMPASS: SRAM-Based Computing-in-Memory SNN Accelerator with Adaptive Spike Speculation.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

LowPASS: A Low power PIM-based accelerator with Speculative Scheme for SNNs.
Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Ninja: A Hardware Assisted System for Accelerating Nested Address Translation.
Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

T-BUS: Taming Bipartite Unstructured Sparsity for Energy-Efficient DNN Acceleration.
Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

PS4: A Low Power SNN Accelerator with Spike Speculative Scheme.
Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

HOLES: Boosting Large Language Models Efficiency with Hardware-Friendly Lossless Encoding.
Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

SPARK: Scalable and Precision-Aware Acceleration of Neural Networks via Efficient Encoding.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

EOS: An Energy-Oriented Attack Framework for Spiking Neural Networks.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

INSPIRE: Accelerating Deep Neural Networks via Hardware-friendly Index-Pair Encoding.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

TEAS: Exploiting Spiking Activity for Temporal-wise Adaptive Spiking Neural Networks.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

PAAP-HD: PIM-Assisted Approximation for Efficient Hyper-Dimensional Computing.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

TSTC: Enabling Efficient Training via Structured Sparse Tensor Compilation.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023
SoBS-X: Squeeze-Out Bit Sparsity for ReRAM-Crossbar-Based Neural Network Accelerator.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

SIMSnn: A Weight-Agnostic ReRAM-based Search-In-Memory Engine for SNN Acceleration.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

2022
IVQ: In-Memory Acceleration of DNN Inference Exploiting Varied Quantization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Cross-layer Designs against Non-ideal Effects in ReRAM-based Processing-in-Memory System.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Randomize and Match: Exploiting Irregular Sparsity for Energy Efficient Processing in SNNs.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

DynSNN: A Dynamic Approach to Reduce Redundancy in Spiking Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

DTQAtten: Leveraging Dynamic Token-based Quantization for Efficient Attention Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2022

Self-Terminating Write of Multi-Level Cell ReRAM for Efficient Neuromorphic Computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2022

SATO: spiking neural network acceleration via temporal-oriented dataflow and architecture.
Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022

EBSP: evolving bit sparsity patterns for hardware-friendly inference of quantized deep neural networks.
Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022

PIM-DH: ReRAM-based processing-in-memory architecture for deep hashing acceleration.
Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022

HAWIS: Hardware-Aware Automated WIdth Search for Accurate, Energy-Efficient and Robust Binary Neural Network on ReRAM Dot-Product Engine.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

SpikeConverter: An Efficient Conversion Framework Zipping the Gap between Artificial Neural Networks and Spiking Neural Networks.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network.
CoRR, 2021

Improving Neural Network Efficiency via Post-training Quantization with Adaptive Floating-Point.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Bit-Transformer: Transforming Bit-level Sparsity into Higher Performance in ReRAM-based Accelerator.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

IM3A: Boosting Deep Neural Network Efficiency via In-Memory Addressing-Assisted Acceleration.
Proceedings of the Great Lakes Symposium on VLSI, 2021

