Zunhai Su
According to our database, Zunhai Su authored at least 13 papers between 2025 and 2026.
Bibliography
2026
CoRR, May, 2026
CoRR, May, 2026
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation.
CoRR, April, 2026
Beyond Outliers: A Data-Free Layer-wise Mixed-Precision Quantization Approach Driven by Numerical and Structural Dual-Sensitivity.
CoRR, March, 2026
XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression.
CoRR, February, 2026
SnapMLA: Efficient Long-Context MLA Decoding via Hardware-Aware FP8 Quantized Pipelining.
CoRR, February, 2026
SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration.
CoRR, February, 2026
Locate, steer, and improve: A practical survey of actionable mechanistic interpretability in large language models.
Comput. Sci. Rev., 2026
2025
KVSink: Understanding and Enhancing the Preservation of Attention Sinks in KV Cache Quantization for LLMs.
CoRR, August, 2025
CoRR, July, 2025
RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025
AKVQ-VL: Attention-Aware KV Cache Adaptive 2-Bit Quantization for Vision-Language Models.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025