Liping Zhang

Orcid: 0000-0003-2334-3471

Affiliations:
  • Alibaba Group, Hangzhou, China


According to our database, Liping Zhang authored at least 33 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
WaveTune: Wave-aware Bilinear Modeling for Efficient GPU Kernel Auto-tuning.
CoRR, April, 2026

LegoDiffusion: Micro-Serving Text-to-Image Diffusion Workflows.
CoRR, April, 2026

SynPerf: A Hybrid Analytical-ML Framework for GPU Performance Prediction.
CoRR, January, 2026

Attack of the Bubbles: Straggler-Resilient Pipeline Parallelism for Large Model Training.
Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026

AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2026

FlashPS: Efficient Generative Image Editing with Mask-aware Caching and Scheduling.
Proceedings of the 21st European Conference on Computer Systems, 2026

GFS: A Preemption-aware Scheduling Framework for GPU Clusters with Predictive Spot Instance Management.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
EDGC: Entropy-driven Dynamic Gradient Compression for Efficient LLM Training.
CoRR, November, 2025

InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling.
CoRR, May, 2025

Adaptra: Straggler-Resilient Hybrid-Parallel Training with Pipeline Adaptation.
CoRR, April, 2025

GREYHOUND: Hunting Fail-Slows in Hybrid-Parallel Training at Scale.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Katz: Efficient Workflow Serving for Diffusion Models with Many Adapters.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

GPU-Disaggregated Serving for Deep Learning Recommendation Models at Scale.
Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025

Reducing the End-to-End Latency of DNN-Based Recommendation Systems in GPU Pools.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2025

WDP: Mitigating Interference in CPU Sharing Through Wake-up Delay Driven Preemption for QoS-aware Co-location.
Proceedings of the 2025 ACM Symposium on Cloud Computing, 2025

EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
Optimizing Resource Management for Shared Microservices: A Scalable System Design.
ACM Trans. Comput. Syst., May, 2024

FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training.
CoRR, 2024

SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules.
CoRR, 2024

DeployFix: Dynamic Repair of Software Deployment Failures via Constraint Solving.
Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024

2023
Practice of Alibaba Cloud on Elastic Resource Provisioning for Large-scale Microservices Cluster.
CoRR, 2023

Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Understanding and Optimizing Workloads for Unified Resource Management in Large Cloud Platforms.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

Erms: Efficient Resource Management for Shared Microservices with SLA Guarantees.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
An In-Depth Study of Microservice Call Graph and Runtime Performance.
IEEE Trans. Parallel Distributed Syst., 2022

MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

Cache Antagonists Identification: A Practice from Alibaba Colocation Datacenter.
Proceedings of the IEEE International Symposium on Software Reliability Engineering Workshops, 2022

Characterizing Job Microarchitectural Profiles at Scale: Dataset and Analysis.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Workload consolidation in Alibaba clusters: the good, the bad, and the ugly.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

The power of prediction: microservice auto scaling via workload learning.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

2021
Morphling: Fast, Near-Optimal Auto-Configuration for Cloud-Native Model Serving.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

Characterizing Microservice Dependency and Performance: Alibaba Trace Analysis.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

2018
All-Spark: Using Simulation Tests Directly in Production Environments to Detect System Bottlenecks in Large-Scale Systems.
Proceedings of the 19th International Middleware Conference, 2018

