Zhihao Zhang
ORCID: 0009-0002-8409-2717
Affiliations:
- Carnegie Mellon University, Pittsburgh, PA, USA
According to our database, Zhihao Zhang authored at least 13 papers between 2020 and 2025.
Bibliography
2025
CoRR, April, 2025
CoRR, January, 2025
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023
2022
Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking.
IEEE Trans. Intell. Transp. Syst., 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
2020
Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network.
CoRR, 2020