Yibo Zhu

ORCID: 0000-0002-9113-2660

According to our database, Yibo Zhu authored at least 90 papers between 2010 and 2024.


DistMind: Efficient Resource Disaggregation for Deep Learning Workloads.
IEEE/ACM Trans. Netw., June, 2024

RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion.
CoRR, 2024

DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models.
CoRR, 2024

Hashing-Based Multi-Modal Semantic Communication.
Proceedings of the IEEE Wireless Communications and Networking Conference, 2024

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

SP-GNN: Learning structure and position information from graphs.
Neural Networks, April, 2023

MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters.
CoRR, 2023

Discrete Cosin TransFormer: Image Modeling From Frequency Domain.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Accelerating Distributed MoE Training and Inference with Lina.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

Lyra: Elastic Scheduling for Deep Learning Clusters.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

DeepCC: Bridging the Gap Between Congestion Control and Applications via Multiobjective Optimization.
IEEE/ACM Trans. Netw., 2022

Congestion Control for Cross-Datacenter Networks.
IEEE/ACM Trans. Netw., 2022

Lita: Accelerating Distributed Training of Sparsely Activated Models.
CoRR, 2022

Espresso: Revisiting Gradient Compression from the System Perspective.
CoRR, 2022

dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training.
CoRR, 2022

Aryl: An Elastic Cluster Scheduler for Deep Learning.
CoRR, 2022

Multi-resource interleaving for deep learning training.
Proceedings of the SIGCOMM '22: ACM SIGCOMM 2022 Conference, Amsterdam, The Netherlands, August 22, 2022

Collie: Finding Performance Anomalies in RDMA Subsystems.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

SAPipe: Staleness-Aware Pipeline for Data Parallel DNN Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem.
CoRR, 2021

DeepCC: Bridging the Gap Between Congestion Control and Applications via Multi-Objective Optimization.
CoRR, 2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly.
Proceedings of the 9th International Conference on Learning Representations, 2021

Towards timeout-less transport in commodity datacenter networks.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021

Building verified neural networks with specifications for systems.
Proceedings of the APSys '21: 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 2021

Classification of Fatigue Phases in Healthy and Diabetic Adults Using Wearable Sensor.
Sensors, 2020

Methodological Approaches and Recommendations for Functional Near-Infrared Spectroscopy Applications in HF/E Research.
Hum. Factors, 2020

A neurophysiological approach to assess training outcome under stress: A virtual reality experiment of industrial shutdown maintenance using Functional Near-Infrared Spectroscopy (fNIRS).
Adv. Eng. Informatics, 2020

Neuroergonomics Metrics to evaluate Exoskeleton based Gait Rehabilitation.
Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, 2020

TEA: Enabling State-Intensive Network Functions on Programmable Switches.
Proceedings of the SIGCOMM '20: Proceedings of the 2020 Annual Conference of the ACM Special Interest Group on Data Communication, 2020

A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Elastic parameter server load distribution in deep learning clusters.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

Tagger: Practical PFC Deadlock Prevention in Data Center Networks.
IEEE/ACM Trans. Netw., 2019

A generic communication scheduler for distributed DNN training acceleration.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

Slim: OS Kernel Support for a Low-Overhead Container Overlay Network.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

dShark: A General, Easy to Program and Scalable Framework for Analyzing In-network Packet Traces.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

Tiresias: A GPU Cluster Manager for Distributed Deep Learning.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

Features of Physiological Tremor in Diabetic Patients.
Proceedings of the 2019 IEEE International Smart Cities Conference, 2019

Spectral Analysis of Hand Tremors Induced During a Fatigue Test.
Proceedings of the 32nd IEEE International Symposium on Computer-Based Medical Systems, 2019

Hyperloop: group-based NIC-offloading to accelerate replicated transactions in multi-tenant storage systems.
Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, 2018

007: Democratically Finding the Cause of Packet Drops.
Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation, 2018

Generic External Memory for Switch Data Planes.
Proceedings of the 17th ACM Workshop on Hot Topics in Networks, 2018

Receiver-Initiated Spectrum Management for Underwater Cognitive Acoustic Network.
IEEE Trans. Mob. Comput., 2017

CrystalNet: Faithfully Emulating Large Production Networks.
Proceedings of the 26th Symposium on Operating Systems Principles, 2017

Closing the Network Diagnostics Gap with Vigil.
Posters and Demos Proceedings of the Conference of the ACM Special Interest Group on Data Communication, 2017

Combining ECN and RTT for Datacenter Transport.
Proceedings of the First Asia-Pacific Workshop on Networking, 2017

Empirical Validation of Commodity Spectrum Monitoring.
Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems, SenSys 2016, 2016

Trimming the Smartphone Network Stack.
Proceedings of the 15th ACM Workshop on Hot Topics in Networks, 2016

Deadlocks in Datacenter Networks: Why Do They Form, and How to Avoid Them.
Proceedings of the 15th ACM Workshop on Hot Topics in Networks, 2016

ECN or Delay: Lessons Learnt from Analysis of DCQCN and TIMELY.
Proceedings of the 12th International on Conference on emerging Networking EXperiments and Technologies, 2016

Toward Practical MAC Design for Underwater Acoustic Networks.
IEEE Trans. Mob. Comput., 2015

An adaptive power controlled routing protocol for underwater sensor network.
Int. J. Sens. Networks, 2015

A joint power control and rate adaptation MAC protocol for underwater sensor networks.
Ad Hoc Networks, 2015

Energy and Performance of Smartphone Radio Bundling in Outdoor Environments.
Proceedings of the 24th International Conference on World Wide Web, 2015

Aqua-Sim Next Generation: A NS-3 Based Simulator for Underwater Sensor Networks.
Proceedings of the 10th International Conference on Underwater Networks & Systems, 2015

60GHz Mobile Imaging Radar.
Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications, 2015

Packet-Level Telemetry in Large Datacenter Networks.
Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

Congestion Control for Large-Scale RDMA Deployments.
Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

Reusing 60GHz Radios for Mobile Radar Imaging.
Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, 2015

Busy Terminal Problem and Implications for MAC Protocols in Underwater Acoustic Networks.
Proceedings of the International Conference on Underwater Networks & Systems, Rome, Italy, November 12, 2014

RISM: An efficient spectrum management system for underwater cognitive acoustic networks.
Proceedings of the Eleventh Annual IEEE International Conference on Sensing, 2014

Cutting the cord: a robust wireless facilities network for data centers.
Proceedings of the 20th Annual International Conference on Mobile Computing and Networking, 2014

Demystifying 60GHz outdoor picocells.
Proceedings of the 20th Annual International Conference on Mobile Computing and Networking, 2014

Distributed on-demand MAC scheduling for underwater acoustic networks.
Proceedings of the IEEE Global Communications Conference, 2014

Datacast: A Scalable and Efficient Reliable Group Data Delivery Service for Data Centers.
IEEE J. Sel. Areas Commun., 2013

PMAC: a real-world case study of underwater MAC.
Proceedings of the Conference on Underwater Networks and Systems, 2013

UPC-MAC: A Power Control MAC Protocol for Underwater Sensor Networks.
Proceedings of the Wireless Algorithms, Systems, and Applications, 2013

Evaluating Selective ARQ and Slotted Handshake Based Access in Real World Underwater Networks.
Proceedings of the Wireless Algorithms, Systems, and Applications, 2013

An adaptive surface sink redeployment strategy for Underwater Sensor Networks.
Proceedings of the 2013 IEEE Symposium on Computers and Communications, 2013

Toward practical MAC design for underwater acoustic networks.
Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14-19, 2013

Serf and turf: crowdturfing for fun and profit.
Proceedings of the 21st World Wide Web Conference 2012, 2012

"Busy terminal problem" and implications in underwater acoustic networks.
Proceedings of the Conference on Under Water Networks, 2012

Adaptive Power Controlled Routing for Underwater Sensor Networks.
Proceedings of the Wireless Algorithms, Systems, and Applications, 2012

Mirror mirror on the ceiling: flexible wireless links for data centers.
Proceedings of the ACM SIGCOMM 2012 Conference, 2012

Fountain code based Adaptive multi-hop Reliable data transfer for underwater acoustic networks.
Proceedings of IEEE International Conference on Communications, 2012

Enforcing dynamic spectrum access with spectrum permits.
Proceedings of the IEEE International Symposium on Dynamic Spectrum Access Networks, 2012

Datacast: a scalable and efficient reliable group data delivery service for data centers.
Proceedings of the Conference on emerging Networking Experiments and Technologies, 2012

An efficient geo-routing aware MAC protocol for underwater acoustic networks.
EAI Endorsed Trans. Mob. Commun. Appl., 2011

Tarantula: Towards an Accurate Network Coordinate System by Handling Major Portion of TIVs.
Proceedings of the Global Communications Conference, 2011

Taming the triangle inequality violations with network coordinate system on real internet.
Proceedings of the Re-Architecting the Internet Workshop, 2010

Reducing TIV interference in network coordinate systems.
Proceedings of the ACM CoNEXT Student Workshop, 2010

An Efficient Geo-Routing Aware MAC Protocol for Underwater Acoustic Networks - (Invited Paper).
Proceedings of the Ad Hoc Networks - Second International Conference, 2010
