Wei Wang

ORCID: 0000-0002-4585-4152

Affiliations:
  • Hong Kong University of Science and Technology, Department of Computer Science and Engineering, Hong Kong (since 2015)
  • University of Toronto, Canada (2009 - 2015)


According to our database, Wei Wang authored at least 100 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference.
CoRR, 2024

Lotto: Secure Participant Selection against Adversarial Servers in Federated Learning.
CoRR, 2024

Dordis: Efficient Federated Learning with Dropout-Resilient Differential Privacy.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023
LB-Chain: Load-Balanced and Low-Latency Blockchain Sharding via Account Migration.
IEEE Trans. Parallel Distributed Syst., October, 2023

Towards Efficient Synchronous Federated Training: A Survey on System Optimization Strategies.
IEEE Trans. Big Data, April, 2023

GIFT: Toward Accurate and Efficient Federated Learning With Gradient-Instructed Frequency Tuning.
IEEE J. Sel. Areas Commun., April, 2023

Monocular 3-D Object Detection Based on Depth-Guided Local Convolution for Smart Payment in D2D Systems.
IEEE Internet Things J., February, 2023

Scalable K-FAC Training for Deep Neural Networks With Distributed Preconditioning.
IEEE Trans. Cloud Comput., 2023

Accelerating Distributed Learning in Non-Dedicated Environments.
IEEE Trans. Cloud Comput., 2023

LMSFC: A Novel Multidimensional Index based on Learned Monotonic Space Filling Curves.
Proc. VLDB Endow., 2023

FaaSwap: SLO-Aware, GPU-Efficient Serverless Inference via Model Swapping.
CoRR, 2023

Decoupling the All-Reduce Primitive for Accelerating Distributed Deep Learning.
CoRR, 2023

Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Following the Data, Not the Function: Rethinking Function Orchestration in Serverless Computing.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023

Fast Sparse GPU Kernels for Accelerated Training of Graph Neural Networks.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

CoChain: High Concurrency Blockchain Sharding via Consensus on Consensus.
Proceedings of the IEEE INFOCOM 2023, 2023

DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining.
Proceedings of the 43rd IEEE International Conference on Distributed Computing Systems, 2023

Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

2022
Enabling Cost-Effective, SLO-Aware Machine Learning Inference Serving on Public Cloud.
IEEE Trans. Cloud Comput., 2022

Towards Dependency-Aware Cache Management for Data Analytics Applications.
IEEE Trans. Cloud Comput., 2022

Editorial: Advances in Mobile, Edge and Cloud Computing.
Mob. Networks Appl., 2022

Incentivizing WiFi-Based Multilateration Location Verification.
IEEE Internet Things J., 2022

Feature Reconstruction Attacks and Countermeasures of DNN training in Vertical Federated Learning.
CoRR, 2022

Taming Client Dropout for Distributed Differential Privacy in Federated Learning.
CoRR, 2022

An LLVM-based open-source compiler for NVIDIA GPUs.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

Jenga: Orchestrating Smart Contracts in Sharding-Based Blockchain for Efficient Processing.
Proceedings of the 42nd IEEE International Conference on Distributed Computing Systems, 2022

Workload consolidation in alibaba clusters: the good, the bad, and the ugly.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

Owl: performance-aware scheduling for resource-efficient function-as-a-service cloud.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

Pisces: efficient federated learning via guided asynchronous training.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

2021
A Quantitative Survey of Communication Optimizations in Distributed Deep Learning.
IEEE Netw., 2021

Guest Editorial: Interplay between Machine Learning and Networking Systems.
IEEE Netw., 2021

Toward Privacy-Preserving Task Assignment for Fully Distributed Spatial Crowdsourcing.
IEEE Internet Things J., 2021

Restructuring Serverless Computing with Data-Centric Function Orchestration.
CoRR, 2021

System Optimization in Synchronous Federated Training: A Survey.
CoRR, 2021

FLASHE: Additively Symmetric Homomorphic Encryption for Cross-Silo Federated Learning.
CoRR, 2021

Citadel: Protecting Data Privacy and Model Confidentiality for Collaborative Learning with SGX.
CoRR, 2021

CrystalPerf: Learning to Characterize the Performance of Dataflow Computation through Code Analysis.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

Simplifying low-level GPU programming with GAS.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Gillis: Serving Large Neural Networks in Serverless Functions with Automatic Model Partitioning.
Proceedings of the 41st IEEE International Conference on Distributed Computing Systems, 2021

Communication-Efficient Federated Learning with Adaptive Parameter Freezing.
Proceedings of the 41st IEEE International Conference on Distributed Computing Systems, 2021

Citadel: Protecting Data Privacy and Model Confidentiality for Collaborative Learning.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

Morphling: Fast, Near-Optimal Auto-Configuration for Cloud-Native Model Serving.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

George: Learning to Place Long-Lived Containers in Large Clusters with Operation Constraints.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

2020
Achieving Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition.
IEEE Trans. Parallel Distributed Syst., 2020

Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges.
CoRR, 2020

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey.
CoRR, 2020

Learning to Detect Malicious Clients for Robust Federated Learning.
CoRR, 2020

BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Metis: learning to schedule long-running applications in shared container clusters at scale.
Proceedings of the International Conference for High Performance Computing, 2020

Optimizing batched Winograd convolution on GPUs.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Not All Explorations Are Equal: Harnessing Heterogeneous Profiling Cost for Efficient MLaaS Training.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Demystifying Tensor Cores to Optimize Half-Precision Matrix Multiply.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

RepBun: Load-Balanced, Shuffle-Free Cluster Caching for Structured Data.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020

Semi-dynamic load balancing: efficient distributed learning in non-dedicated environments.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

2019
Abnormal Client Behavior Detection in Federated Learning.
CoRR, 2019

MArk: Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Round-Robin Synchronization: Mitigating Communication Bottlenecks in Parameter Servers.
Proceedings of the 2019 IEEE Conference on Computer Communications, 2019

LACS: Load-Aware Cache Sharing with Isolation Guarantee.
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

CMFL: Mitigating Communication Overhead for Federated Learning.
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud.
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019

Towards Framework-Independent, Non-Intrusive Performance Characterization for Dataflow Computation.
Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, 2019

2018
SP-cache: load-balanced, redundancy-free cluster caching with selective partition.
Proceedings of the International Conference for High Performance Computing, 2018

Utopia: Near-optimal Coflow Scheduling with Isolation Guarantee.
Proceedings of the 2018 IEEE Conference on Computer Communications, 2018

Performance-Aware Fair Scheduling: Exploiting Demand Elasticity of Data Analytics Jobs.
Proceedings of the 2018 IEEE Conference on Computer Communications, 2018

Stay Fresh: Speculative Synchronization for Fast Distributed Machine Learning.
Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

OpuS: Fair and Efficient Cache Sharing for In-Memory Data Analytics.
Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

Fair Coflow Scheduling without Prior Knowledge.
Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

Unraveling the RTT-fairness Problem for BBR: A Queueing Model.
Proceedings of the IEEE Global Communications Conference, 2018

Continuum: A Platform for Cost-Aware, Low-Latency Continual Learning.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Fast Distributed Deep Learning via Worker-adaptive Batch Sizing.
Proceedings of the ACM Symposium on Cloud Computing, 2018

2017
Towards RTT Fairness of Congestion-Based Congestion Control.
CoRR, 2017

LRC: Dependency-aware cache management for data analytics clusters.
Proceedings of the 2017 IEEE Conference on Computer Communications, 2017

Coflex: Navigating the fairness-efficiency tradeoff for coflow scheduling.
Proceedings of the 2017 IEEE Conference on Computer Communications, 2017

Cluster fair queueing: Speeding up data-parallel jobs with delay guarantees.
Proceedings of the 2017 IEEE Conference on Computer Communications, 2017

Speculative Slot Reservation: Enforcing Service Isolation for Dependent Data-Parallel Computations.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

LERC: Coordinated Cache Management for Data-Parallel Systems.
Proceedings of the 2017 IEEE Global Communications Conference, 2017

Towards Online Checkpointing Mechanism for Cloud Transient Servers.
Proceedings of the 2017 IEEE Global Communications Conference, 2017

2016
Towards Multi-Resource Fair Allocation with Placement Constraints.
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, 2016

Multi-resource fair sharing for datacenter jobs with placement constraints.
Proceedings of the International Conference for High Performance Computing, 2016

Friends or foes: Revisiting strategy-proofness in cloud network sharing.
Proceedings of the 24th IEEE International Conference on Network Protocols, 2016

2015
Toward long-term quality of protection in mobile networks: a context-aware perspective.
IEEE Wirel. Commun., 2015

Dynamic Cloud Instance Acquisition via IaaS Cloud Brokerage.
IEEE Trans. Parallel Distributed Syst., 2015

Optimal Online Multi-Instance Acquisition in IaaS Clouds.
IEEE Trans. Parallel Distributed Syst., 2015

Multi-Resource Fair Allocation in Heterogeneous Cloud Computing Systems.
IEEE Trans. Parallel Distributed Syst., 2015

TiSA: Time-dependent social network advertising.
Proceedings of the 2015 IEEE International Conference on Communications, 2015

2014
Designing Truthful Spectrum Double Auctions with Local Markets.
IEEE Trans. Mob. Comput., 2014

Low complexity multi-resource fair queueing with bounded delay.
Proceedings of the 2014 IEEE Conference on Computer Communications, 2014

Dominant resource fairness in cloud computing systems with heterogeneous servers.
Proceedings of the 2014 IEEE Conference on Computer Communications, 2014

On the Fairness-Efficiency Tradeoff for Packet Processing with Multiple Resources.
Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies, 2014

2013
Multi-resource generalized processor sharing for packet processing.
Proceedings of the 21st IEEE/ACM International Symposium on Quality of Service, 2013

Revenue maximization with dynamic auctions in IaaS cloud markets.
Proceedings of the 21st IEEE/ACM International Symposium on Quality of Service, 2013

Multi-Resource Round Robin: A low complexity packet scheduler with Dominant Resource Fairness.
Proceedings of the 2013 21st IEEE International Conference on Network Protocols, 2013

On Fairness-Efficiency Tradeoffs for Multi-resource Packet Processing.
Proceedings of the 33rd International Conference on Distributed Computing Systems Workshops (ICDCS 2013 Workshops), 2013

Dynamic Cloud Resource Reservation via Cloud Brokerage.
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

To Reserve or Not to Reserve: Optimal Online Multi-Instance Acquisition in IaaS Clouds.
Proceedings of the 10th International Conference on Autonomic Computing, 2013

2012
Towards Optimal Capacity Segmentation with Hybrid Cloud Pricing.
Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012

2011
District: Embracing local markets in truthful spectrum double auctions.
Proceedings of the 8th Annual IEEE Communications Society Conference on Sensor, 2011

2009
A Noncooperative Spectrum Sensing Game with Maximum Network Throughput.
Proceedings of the Global Communications Conference, 2009. GLOBECOM 2009, Honolulu, Hawaii, USA, 30 November, 2009

Sequential Greedy Localization in Wireless Sensor Networks With Inaccurate Anchor Positions.
Proceedings of the Global Communications Conference, 2009. GLOBECOM 2009, Honolulu, Hawaii, USA, 30 November, 2009

