Lidong Zhou

Orcid: 0000-0002-7258-3116

Affiliations:
  • Microsoft Research


According to our database, Lidong Zhou authored at least 93 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of two.
  • Erdős number of three.

Awards

ACM Fellow 2019, "For contributions to trustworthy distributed computing and to systems research and education in China".

Bibliography

2024
Anubis: Towards Reliable Cloud AI Infrastructure via Proactive Validation.
CoRR, 2024

Amanda: Unified Instrumentation Framework for Deep Neural Networks.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation.
CoRR, 2023

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction.
CoRR, 2023

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation.
Proceedings of the 29th Symposium on Operating Systems Principles, 2023

VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Optimizing Dynamic Neural Networks with Brainstorm.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Welder: Scheduling Deep Learning Memory Access via Tile-graph.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

On Modular Learning of Distributed Systems for Predicting End-to-End Latency.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023

SiloD: A Co-design of Caching and Scheduling for Deep Learning Clusters.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

2022
ROLLER: Fast and Efficient Tensor Compilation for Deep Learning.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

2021
Argus: A Fully Transparent Incentive System for Anti-Piracy Campaigns (Extended Version).
CoRR, 2021

Agatha: Smart Contract for DNN Computation.
CoRR, 2021

Geometric Partitioning: Explore the Boundary of Optimal Erasure Code Repair.
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Forerunner: Constraint-based Speculative Transaction Execution for Ethereum.
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Distributed Graph Computation Meets Machine Learning.
IEEE Trans. Parallel Distributed Syst., 2020

Byzantine Ordered Consensus without Byzantine Oligarchy.
IACR Cryptol. ePrint Arch., 2020

AutoSys: The Design and Operation of Learning-Augmented Systems.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

HiveD: Sharing a GPU Cluster for Deep Learning with Guarantees.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Retiarii: A Deep Learning Exploratory-Training Framework.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

TextNAS: A Neural Architecture Search Space Tailored for Text Representation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
The Case for Learning-and-System Co-design.
ACM SIGOPS Oper. Syst. Rev., 2019

NeuGraph: Parallel Deep Neural Network Computation on Large Graphs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Fast Distributed Deep Learning over RDMA.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019, 2019

Astra: Exploiting Predictability to Optimize Deep Learning.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Towards Efficient Large-Scale Graph Neural Network Computing.
CoRR, 2018

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA.
CoRR, 2018

TerseCades: Efficient Data Compression in Stream Processing.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Gandiva: Introspective Cluster Scheduling for Deep Learning.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Capturing and Enhancing In Situ System Observability for Failure Detection.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Seasonal classification and RBF adaptive weight based parallel combined method for day-ahead electricity price forecasting.
Proceedings of the 2018 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, 2018

Scheduling CPU for GPU-based Deep Learning Jobs.
Proceedings of the ACM Symposium on Cloud Computing, 2018

TEE-KV: Secure Immutable Key-Value Store for Trusted Execution Environments.
Proceedings of the ACM Symposium on Cloud Computing, 2018

2017
Tux²: Distributed Graph Computation for Machine Learning.
Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017

Inverse-time protection scheme for active distribution network based on user-defined characteristics.
Proceedings of the 2017 IEEE PES Innovative Smart Grid Technologies Conference Europe, 2017

Multi-objective optimization model of source-load-storage synergetic dispatch for building energy system based on TOU price demand response.
Proceedings of the 2017 IEEE Industry Applications Society Annual Meeting, 2017

Gray Failure: The Achilles' Heel of Cloud-Scale Systems.
Proceedings of the 16th Workshop on Hot Topics in Operating Systems, 2017

2016
Realizing the Fault-Tolerance Promise of Cloud Storage Using Locks with Intent.
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016

StreamScope: Continuous Reliable Distributed Processing of Big Data Streams.
Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation, 2016

2015
Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE.
IEEE Trans. Parallel Distributed Syst., 2015

ImmortalGraph: A System for Storage and Analysis of Temporal Graphs.
ACM Trans. Storage, 2015

GraM: scaling graph computation to the trillions.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

2014
Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing.
Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, 2014

Cybertron: pushing the limit on I/O reduction in data-parallel programs.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

Nondeterminism in MapReduce considered harmful? an empirical study on non-commutative aggregators in MapReduce programs.
Proceedings of the 36th International Conference on Software Engineering, 2014

Chronos: a graph engine for temporal graph analysis.
Proceedings of the Ninth Eurosys Conference 2014, 2014

Rex: replication at the speed of multi-core.
Proceedings of the Ninth Eurosys Conference 2014, 2014

2013
The technical security issues in cloud computing.
Int. J. Inf. Commun. Technol., 2013

KuaFu: Closing the parallelism gap in database replication.
Proceedings of the 29th IEEE International Conference on Data Engineering, 2013

Failure Recovery: When the Cure Is Worse Than the Disease.
Proceedings of the 14th Workshop on Hot Topics in Operating Systems, 2013

TimeStream: reliable stream computation in the cloud.
Proceedings of the Eighth Eurosys Conference 2013, 2013

2012
Managing Large Graphs on Multi-Cores with Graph Awareness.
Proceedings of the 2012 USENIX Annual Technical Conference, 2012

Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE.
Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012

Optimizing Data Shuffling in Data-Parallel Computation by Understanding User-Defined Functions.
Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, 2012

Kineograph: taking the pulse of a fast-changing and connected world.
Proceedings of the European Conference on Computer Systems, 2012

2011
G2: A Graph Processing System for Diagnosing Distributed Systems.
Proceedings of the 2011 USENIX Annual Technical Conference, 2011

Practical software model checking via dynamic interface reduction.
Proceedings of the 23rd ACM Symposium on Operating Systems Principles 2011, 2011

2010
Reconfiguring a state machine.
SIGACT News, 2010

Language-based replay via data flow cut.
Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2010

Distributed Systems Meet Economics: Pricing in the Cloud.
Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing, 2010

Comet: batched stream processing for data intensive distributed computing.
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010

2009
Kinesis: A new approach to replica placement in distributed storage systems.
ACM Trans. Storage, 2009

Chasing the Weakest System Model for Implementing Ω and Consensus.
IEEE Trans. Dependable Secur. Comput., 2009

Building reliable large-scale distributed systems: when theory meets practice.
SIGACT News, 2009

Vertical paxos and primary-backup replication.
Proceedings of the 28th Annual ACM Symposium on Principles of Distributed Computing, 2009

MODIST: Transparent Model Checking of Unmodified Distributed Systems.
Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation, 2009

Wave Computing in the Cloud.
Proceedings of HotOS'09: 12th Workshop on Hot Topics in Operating Systems, 2009

2008
Niobe: A practical replication protocol.
ACM Trans. Storage, 2008

Vigilante: End-to-end containment of Internet worm epidemics.
ACM Trans. Comput. Syst., 2008

Transactional Flash.
Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, 2008

2007
Bouncer: securing software by blocking bad input.
Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007, 2007

Graceful degradation via versions: specifications and implementations.
Proceedings of the Twenty-Sixth Annual ACM Symposium on Principles of Distributed Computing, 2007

Peer-to-Peer Rating.
Proceedings of the Seventh IEEE International Conference on Peer-to-Peer Computing (P2P 2007), 2007

2006
Troubleshooting wireless mesh networks.
Comput. Commun. Rev., 2006

Brief Announcement: Chasing the Weakest System Model for Implementing Omega and Consensus.
Proceedings of the Stabilization, 2006

2005
APSS: proactive secret sharing in asynchronous systems.
ACM Trans. Inf. Syst. Secur., 2005

Implementing Trustworthy Services Using Replicated State Machines.
IEEE Secur. Priv., 2005

Omega Meets Paxos: Leader Election and Stability Without Eventual Timely Links.
Proceedings of the Distributed Computing, 19th International Conference, 2005

Vigilante: end-to-end containment of internet worms.
Proceedings of the 20th ACM Symposium on Operating Systems Principles 2005, 2005

Troubleshooting multihop wireless networks.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2005

A First Look at Peer-to-Peer Worms: Threats and Defenses.
Proceedings of the Peer-to-Peer Systems IV, 4th International Workshop, 2005

Distributed Blinding for Distributed ElGamal Re-Encryption.
Proceedings of the 25th International Conference on Distributed Computing Systems (ICDCS 2005), 2005

2004
Boxwood: Abstractions as the Foundation for Storage Infrastructure.
Proceedings of the 6th Symposium on Operating System Design and Implementation (OSDI 2004), 2004

P6P: A Peer-to-Peer Approach to Internet Infrastructure.
Proceedings of the Peer-to-Peer Systems III, Third International Workshop, 2004

A Multi-Radio Unification Protocol for IEEE 802.11 Wireless Networks.
Proceedings of the 1st International Conference on Broadband Networks (BROADNETS 2004), 2004

2002
COCA: A secure distributed online certification authority.
ACM Trans. Comput. Syst., 2002

Implementing IPv6 as a Peer-to-Peer Overlay Network.
Proceedings of the 21st Symposium on Reliable Distributed Systems (SRDS 2002), 2002

2001
Towards Fault-Tolerant and Secure On-Line Services.
PhD thesis, 2001

1999
Securing ad hoc networks.
IEEE Netw., 1999

