Xupeng Miao

ORCID: 0000-0002-9371-8358

According to our database, Xupeng Miao authored at least 47 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
CoRR, 2024

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

Generative Dense Retrieval: Memory Can Be a Burden.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression.
Proc. VLDB Endow., December, 2023

P²CG: a privacy preserving collaborative graph neural network training framework.
VLDB J., July, 2023

Hetu: a highly efficient automatic parallel distributed deep learning system.
Sci. China Inf. Sci., January, 2023

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-Aware Deep Architecture.
IEEE Trans. Knowl. Data Eng., 2023

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent.
Proc. VLDB Endow., 2023

SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training.
Proc. VLDB Endow., 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement.
Proc. ACM Manag. Data, 2023

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023

SpotServe: Serving Generative Large Language Models on Preemptible Instances.
CoRR, 2023

Model-enhanced Vector Index.
CoRR, 2023

Improving Automatic Parallel Training via Balanced Memory Workload Optimization.
CoRR, 2023

FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference.
CoRR, 2023

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023

EINNET: Optimizing Tensor Programs with Derivation-Based Transformations.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Model-enhanced Vector Index.
Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems, 2023

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
CuWide: Towards Efficient Flow-Based Training for Sparse Wide Models on GPUs.
IEEE Trans. Knowl. Data Eng., 2022

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Proc. VLDB Endow., 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Update.
Proc. VLDB Endow., 2022

Distributed Graph Neural Network Training: A Survey.
CoRR, 2022

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
CoRR, 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates.
CoRR, 2022

HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System.
CoRR, 2022

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training.
Proceedings of the SIGMOD '22: International Conference on Management of Data, 2022

TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-aware Deep Architecture (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Zoomer: Boosting Retrieval on Web-scale Graphs by Regions of Interest.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

PointCLIP: Point Cloud Understanding by CLIP.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scalable Graph Sampling on GPUs with Compressed Graph.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs.
VLDB J., 2021

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.
Proc. VLDB Endow., 2021

Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

ROD: Reception-aware Online Distillation for Sparse Graphs.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

DeGNN: Improving Graph Neural Networks with Graph Decomposition.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

CuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs (Extended Abstract).
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
Reliable Data Distillation on Graph Convolutional Network.
Proceedings of the 2020 International Conference on Management of Data, 2020

Memory-Aware Framework for Efficient Second-Order Random Walk on Large Graphs.
Proceedings of the 2020 International Conference on Management of Data, 2020

PSGraph: How Tencent trains extremely large-scale graphs with Spark?
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

2019
PS2: Parameter Server on Spark.
Proceedings of the 2019 International Conference on Management of Data, 2019
