Zongle Huang

Orcid: 0009-0002-4557-3163

According to our database, Zongle Huang authored at least 13 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
An Area-Efficient Lookup-Table-Based eDRAM Digital CIM Macro for Neural Network Inference.
IEEE J. Solid State Circuits, April, 2026

CCE: A Content Creation Engine With Outlier and Mixed-Representation Computing, Semantic-Defined Instruction Generation for Video Diffusion.
IEEE J. Solid State Circuits, March, 2026

Scope: A Scalable Merged Pipeline Framework for Multi-Chip-Module NN Accelerators.
Proceedings of the 31st Asia and South Pacific Design Automation Conference, 2026

2025
SASDenSebLE: A Compact Vision Transformer Inference Architecture With Saturation-Approximate Softmax Dataflow Enabling Sequence-Parallelism Boosted Layer-Fusion Execution.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2025

Enhancing Memory Efficiency in Large Language Model Training Through Chronos-aware Pipeline Parallelism.
CoRR, March, 2025

MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

HyC-LoRA: Memory Efficient LoRA Fine-tuning with Hybrid Activation Compression.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

CCE: A 28nm Content Creation Engine with Asymmetric Computing, Semantic-Driven Instruction Generation and Collision-Free Outlier Mapper for Video Generation.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2025

Pro-Cache-CIM: A 28nm 69.4TOPS/W Product-Cache-based Digital-Compute-in-Memory Macro Leveraging Data Locality Pattern in Vision AI Tasks.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2025

2024
Hecaton: Training and Finetuning Large Language Models with Scalable Chiplet Systems.
CoRR, 2024

A 28nm 4.35TOPS/mm² Transformer Accelerator with Basis-vector Based Ultra Storage Compression, Decomposed Computation and Unified LUT-Assisted Cores.
Proceedings of the IEEE Symposium on VLSI Technology and Circuits 2024, 2024

Exploring Approximation and Dataflow Co-Optimization for Scalable Transformer Inference Architecture on the Edge.
Proceedings of the 37th IEEE International System-on-Chip Conference, 2024

34.7 A 28nm 2.4Mb/mm² 6.9-16.3TOPS/mm² eDRAM-LUT-Based Digital-Computing-in-Memory Macro with In-Memory Encoding and Refreshing.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024

