Gabriele Oliaro
ORCID: 0000-0001-5406-0736
According to our database, Gabriele Oliaro authored at least 20 papers between 2021 and 2026.
Bibliography
2026
CoRR, April, 2026
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
ACM Comput. Surv., January, 2026
Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026
AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding.
Proceedings of the 21st European Conference on Computer Systems, 2026
2025
OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs.
CoRR, October, 2025
CoRR, April, 2025
CoRR, January, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
2024
CoRR, 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
CoRR, 2024
Reproducibility Report for ACM SIGMOD 2024 Paper: 'Hierarchical Cut Labelling - Scaling Up Distance Queries on Road Networks'.
Proceedings of the Reproducibility Reports of the 2024 International Conference on Management of Data, 2024
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023
2022
2021
Proceedings of the HotNets '21: The 20th ACM Workshop on Hot Topics in Networks, 2021