Quanlu Zhang

ORCID: 0000-0003-0557-1104

According to our database, Quanlu Zhang authored at least 46 papers between 2015 and 2026.

Bibliography

2026
DynaTrain: Fast Online Parallelism Switching for Elastic LLM Training.
CoRR, May, 2026

WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL.
CoRR, February, 2026

RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI.
CoRR, February, 2026

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning.
CoRR, February, 2026

STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning.
Proceedings of the 21st European Conference on Computer Systems, 2026

2025
FUSCO: High-Performance Distributed Data Shuffling via Transformation-Communication Fusion.
CoRR, December, 2025

Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design.
CoRR, November, 2025

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models.
CoRR, October, 2025

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training.
CoRR, October, 2025

RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation.
CoRR, September, 2025

Reducing GPU Memory Fragmentation via Spatio-Temporal Planning for Efficient Large-Scale Model Training.
CoRR, July, 2025

2024
Automating Cloud Deployment for Real-Time Online Foundation Model Inference.
IEEE/ACM Trans. Netw., April, 2024

Efficient Large Language Models: A Survey.
Trans. Mach. Learn. Res., 2024

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

You Only Cache Once: Decoder-Decoder Architectures for Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
AutoTaskFormer: Searching Vision Transformers for Multi-task Learning.
CoRR, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation.
CoRR, 2023

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction.
CoRR, 2023

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation.
Proceedings of the 29th Symposium on Operating Systems Principles, 2023

Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SiloD: A Co-design of Caching and Scheduling for Deep Learning Clusters.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

2022
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Privacy-preserving Online AutoML for Domain-Specific Face Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing.
CoRR, 2021

2020
How Does Supernet Help in Neural Architecture Search?
CoRR, 2020

Deeper Insights into Weight Sharing in Neural Architecture Search.
CoRR, 2020

A Novel Hybrid Active Contour Model for Intracranial Tuberculosis MRI Segmentation Applications.
IEEE Access, 2020

AutoSys: The Design and Operation of Learning-Augmented Systems.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

HiveD: Sharing a GPU Cluster for Deep Learning with Guarantees.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Retiarii: A Deep Learning Exploratory-Training Framework.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Automating Cloud Deployment for Deep Learning Inference of Real-time Online Services.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2018
Gandiva: Introspective Cluster Scheduling for Deep Learning.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Building efficient and available distributed transaction with Paxos-based coding consensus.
Proceedings of the IEEE INFOCOM 2018, 2018

Towards Web-based Delta Synchronization for Cloud Storage Services.
Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Scheduling CPU for GPU-based Deep Learning Jobs.
Proceedings of the ACM Symposium on Cloud Computing, 2018

2017
DeltaCFS: Boosting Delta Sync for Cloud Storage Services by Learning from NFS.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2015
CHARM: A Cost-Efficient Multi-Cloud Data Hosting Scheme with High Availability.
IEEE Trans. Cloud Comput., 2015

UStore: A Low Cost Cold and Archival Data Storage System for Data Centers.
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015

Understanding and Surpassing Dropbox: Efficient Incremental Synchronization in Cloud Storage Services.
Proceedings of the 2015 IEEE Global Communications Conference, 2015

DSwitch: a dual mode direct and network attached disk.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

