Zunhai Su

According to our database, Zunhai Su authored at least 13 papers between 2025 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond.
CoRR, May, 2026

CktFormalizer: Autoformalization of Natural Language into Circuit Representations.
CoRR, May, 2026

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation.
CoRR, April, 2026

Beyond Outliers: A Data-Free Layer-wise Mixed-Precision Quantization Approach Driven by Numerical and Structural Dual-Sensitivity.
CoRR, March, 2026

XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression.
CoRR, February, 2026

SnapMLA: Efficient Long-Context MLA Decoding via Hardware-Aware FP8 Quantized Pipelining.
CoRR, February, 2026

SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration.
CoRR, February, 2026

Locate, steer, and improve: A practical survey of actionable mechanistic interpretability in large language models.
Comput. Sci. Rev., 2026

2025
DoPE: Denoising Rotary Position Embedding.
CoRR, November, 2025

KVSink: Understanding and Enhancing the Preservation of Attention Sinks in KV Cache Quantization for LLMs.
CoRR, August, 2025

Unveiling Super Experts in Mixture-of-Experts Large Language Models.
CoRR, July, 2025

RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

AKVQ-VL: Attention-Aware KV Cache Adaptive 2-Bit Quantization for Vision-Language Models.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

