Dingyu Yao
Bibliography
2025
VecInfer: Efficient LLM Inference with Low-Bit KV Cache via Outlier-Suppressed Vector Quantization.
CoRR, October 2025
TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization.
Findings of the Association for Computational Linguistics, 2025