Zheng Cao

Orcid: 0000-0002-1565-3683

According to our database, Zheng Cao authored at least 63 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Kspeed: Beating I/O Bottlenecks of Data Provisioning for RDMA Training Clusters.
Proceedings of the 32nd IEEE International Conference on Network Protocols, 2024

2023
Flor: An Open High Performance RDMA Framework Over Heterogeneous RNICs.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

More Than Capacity: Performance-oriented Evolution of Pangu in Alibaba.
Proceedings of the 21st USENIX Conference on File and Storage Technologies, 2023


2022
Lamda: The Last Mile of the Datacenter Network Does Matter.
CoRR, 2022

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
CoRR, 2022

Libra: In-network Gradient Aggregation for Speeding up Distributed Sparse Deep Training.
CoRR, 2022

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-supervised Learning and Explicit Policy Injection.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
A New Optoelectronic Hybrid Network Based on Scheduling Optimization of Optical Links.
IEEE Trans. Computers, 2021

ACCL: Architecting Highly Scalable Distributed Training Systems With Highly Efficient Collective Communication Library.
IEEE Micro, 2021

SDCUP: Schema Dependency-Enhanced Curriculum Pre-Training for Table Semantic Parsing.
CoRR, 2021

Achieving Human Parity on Visual Question Answering.
CoRR, 2021

When Cloud Storage Meets RDMA.
Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021

Reducing BERT Computation by Padding Removal and Curriculum Learning.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

AIBench Training: Balanced Industry-Standard AI Training Benchmarking.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

AIBench Scenario: Scenario-Distilling AI Benchmarking.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
AIBench: Scenario-distilling AI Benchmarking.
CoRR, 2020

AIBench: An Industry Standard AI Benchmark Suite from Internet Services.
CoRR, 2020

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite.
CoRR, 2020

CETUS: Towards Proportional Capacity Provisioning and Cost-Effectiveness in Frontend Servers.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Dissecting the Communication Latency in Distributed Deep Sparse Learning.
Proceedings of the IMC '20: ACM Internet Measurement Conference, 2020

EFLOPS: Algorithm and System Co-Design for a High Performance Distributed Training Platform.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

DWT: Decoupled Workload Tracing for Data Centers.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019
AIBench: An Industry Standard Internet Service AI Benchmark Suite.
CoRR, 2019

Priority-Based PCIe Scheduling for Multi-Tenant Multi-GPU Systems.
IEEE Comput. Archit. Lett., 2019

HPCC: high precision congestion control.
Proceedings of the ACM Special Interest Group on Data Communication, 2019

Anomaly Analysis and Diagnosis for Co-located Datacenter Workloads in the Alibaba Cluster.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2019

2018
Anomaly Analysis for Co-located Datacenter Workloads in the Alibaba Cluster.
CoRR, 2018

BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite.
CoRR, 2018

AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

2017
HyperFatTree: A Large-Scale Tree-Based Network with Low-Radix Switches.
Int. J. Parallel Program., 2017

Regional Congestion Mitigation in Lossless Datacenter Networks.
Proceedings of the Network and Parallel Computing, 2017

Regional Congestion Control in Datacenter Networks.
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

Fast and Hitless Topology Management of AWGR-Based Optical Networking for Data Centers.
Proceedings of the European Conference on Optical Communication, 2017

2016
Experimental demonstration of heterogeneous cross stratum broker for scientific applications.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2016

Modeling Traffic of Big Data Platform for Large Scale Datacenter Networks.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

2015
Hi-LION: Hierarchical Large-Scale Interconnection Optical Network With AWGRs [Invited].
JOCN, 2015

PROP: Using PCIe-Based RDMA to Accelerate Rack-Scale Communications in Data Centers.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

2014
Decentralized NIC-Switching Architecture Using SR-IOV PCI Express Network Device.
IEEE Micro, 2014

An Intra-Server Interconnect Fabric for Heterogeneous Computing.
J. Comput. Sci. Technol., 2014

Scalable and distributed optical interconnect architecture based on AWGR for HPC and data centers.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2014

HiNetSim: A Parallel Simulator for Large-Scale Hierarchical Direct Networks.
Proceedings of the Network and Parallel Computing, 2014

Building a large-scale direct network with low-radix routers.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014


Demonstration of scalable, flat, and high-throughput data center architecture based on arrayed waveguide grating routers.
Proceedings of the European Conference on Optical Communication, 2014

Experimental demonstration of dynamic flexible bandwidth optical data center network with all-to-all interconnectivity.
Proceedings of the European Conference on Optical Communication, 2014

Accelerating synchronization communications for high-density blade enclosure.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
cHPP controller: A High Performance Hyper-node Hardware Accelerator.
Proceedings of the International Conference on Parallel and Distributed Computing, 2013

Accelerating Allreduce Operation: A Switch-Based Solution.
Proceedings of the 22nd International Conference on Computer Communication and Networks, 2013

2012
Design of Hardware-Based Communication Performance Measurement Tool.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
Design of HPC Node with Heterogeneous Processors.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
HPP controller: a system controller for high performance computing.
Frontiers Comput. Sci. China, 2010

HPP Controller: A System Controller Dedicated for Message Passing.
Proceedings of the 2010 International Conference on Parallel and Distributed Computing, 2010

Adding an Expressway to Accelerate the Neighborhood Communication.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

2009
SimK: A Large-Scale Parallel Simulation Engine.
J. Comput. Sci. Technol., 2009

HPPNetSim: a parallel simulation of large-scale interconnection networks.
Proceedings of the 2009 Spring Simulation Multiconference, SpringSim 2009, 2009

2008
Design and Evaluation of Optical Bus in High Performance Computer.
Proceedings of the 9th International Conference for Young Computer Scientists, 2008

HPP Switch: A Novel High Performance Switch for HPC.
Proceedings of the 16th Annual IEEE Symposium on High Performance Interconnects (HOTI 2008), 2008

2005
A Reconfigurable Optical Interconnect System for DSAG.
Proceedings of the Sixth International Conference on Parallel and Distributed Computing, 2005

