Zhuomin He

Orcid: 0000-0002-3477-0607

According to our database, Zhuomin He authored at least 5 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2025
Online Context Caching for Distributed Large Language Models Serving.
Proceedings of the IEEE INFOCOM 2025, 2025

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-25), 2025

2024
AttentionStore: Cost-effective Attention Reuse across Multi-turn Conversations in Large Language Model Serving.
CoRR, 2024

Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

IMI: In-memory Multi-job Inference Acceleration for Large Language Models.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
