Liping Zhang
ORCID: 0000-0003-2334-3471
Affiliations:
- Alibaba Group, Hangzhou, China
According to our database, Liping Zhang authored at least 33 papers between 2018 and 2026.
Links
Online presence:
- on orcid.org
Bibliography
2026
CoRR, April, 2026
CoRR, January, 2026
Attack of the Bubbles: Straggler-Resilient Pipeline Parallelism for Large Model Training.
Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026
AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2026
Proceedings of the 21st European Conference on Computer Systems, 2026
GFS: A Preemption-aware Scheduling Framework for GPU Clusters with Predictive Spot Instance Management.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026
2025
CoRR, November, 2025
InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling.
CoRR, May, 2025
CoRR, April, 2025
Proceedings of the 2025 USENIX Annual Technical Conference, 2025
Proceedings of the 2025 USENIX Annual Technical Conference, 2025
Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2025
WDP: Mitigating Interference in CPU Sharing Through Wake-up Delay Driven Preemption for QoS-aware Co-location.
Proceedings of the 2025 ACM Symposium on Cloud Computing, 2025
EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025
2024
ACM Trans. Comput. Syst., May, 2024
FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training.
CoRR, 2024
Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024
2023
Practice of Alibaba Cloud on Elastic Resource Provisioning for Large-scale Microservices Cluster.
CoRR, 2023
Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023
Understanding and Optimizing Workloads for Unified Resource Management in Large Cloud Platforms.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
IEEE Trans. Parallel Distributed Syst., 2022
MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
Proceedings of the IEEE International Symposium on Software Reliability Engineering Workshops, 2022
Proceedings of the 51st International Conference on Parallel Processing, 2022
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022
2021
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021
2018
All-Spark: Using Simulation Tests Directly in Production Environments to Detect System Bottlenecks in Large-Scale Systems.
Proceedings of the 19th International Middleware Conference, 2018