Yitao Hu

Orcid: 0009-0004-0458-0900

According to our database, Yitao Hu authored at least 21 papers between 2014 and 2025.

Bibliography

2025
TightLLM: Maximizing Throughput for LLM Inference via Adaptive Offloading Policy.
IEEE Trans. Computers, July, 2025

ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs.
CoRR, May, 2025

SLOpt: Serving Real-Time Inference Pipeline With Strict Latency Constraint.
IEEE Trans. Computers, April, 2025

Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting.
Proceedings of the IEEE INFOCOM 2025, 2025

Lark: A Buffer-aware Building Block for Programmable Packet Scheduling in Datacenters.
Proceedings of the IEEE INFOCOM 2025, 2025

2024
An API Recommendation Method for Querying Mobile Computing Problems.
Int. J. Cogn. Informatics Nat. Intell., 2024

Skill-Adpative Imitation Learning for UI Test Reuse.
CoRR, 2024

PPT: A Pragmatic Transport for Datacenters.
Proceedings of the ACM SIGCOMM 2024 Conference, 2024

Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading.
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024

FUYAO: DPU-enabled Direct Data Transfer for Serverless Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Accelerating Data Delivery of Latency-Sensitive Applications in Container Overlay Network.
IEEE Trans. Parallel Distributed Syst., December, 2023

High-throughput Sampling, Communicating and Training for Reinforcement Learning Systems.
Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

DeepLat: Achieving Minimum Worst Case Latency for DNN Inference with Batch-Aware Dispatching.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

2021
API Recommendation Based on WII-WMD.
Int. J. Cogn. Informatics Nat. Intell., 2021

Rim: Offloading Inference to the Edge.
Proceedings of the IoTDI '21: International Conference on Internet-of-Things Design and Implementation, 2021

Scrooge: A Cost-Effective Deep Learning Inference System.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

2018
Case studies and comparison between two models for assessing library service quality.
Electron. Libr., 2018

Olympian: Scheduling GPU Usage in a Deep Neural Network Model Serving System.
Proceedings of the 19th International Middleware Conference, 2018

2016
ALPS: accurate landmark positioning at city scales.
Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2016

2015
Data Acquisition for Real-Time Decision-Making under Freshness Constraints.
Proceedings of the 2015 IEEE Real-Time Systems Symposium, 2015

2014
Critical sensing range for mobile heterogeneous camera sensor networks.
Proceedings of the 2014 IEEE Conference on Computer Communications, 2014

