Jianxi Ye

ORCID: 0009-0007-3395-3624

According to our database¹, Jianxi Ye authored at least 15 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

BURST: Seeking High-performance, Interoperability and Scalability in Soft-RDMA.

Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026

Handling Network Faults in Distributed AI Training: Failover is Now an Option.

Proceedings of the 21st European Conference on Computer Systems, 2026

2025

Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler.

CoRR, April, 2025

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism.

CoRR, April, 2025

Barre: Empowering Simplified and Versatile Programmable Congestion Control in High-Speed AI Clusters.

Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training.

Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025

MegaScale-Infer: Efficient Mixture-of-Experts Model Serving with Disaggregated Expert Parallelism.

Proceedings of the ACM SIGCOMM 2025 Conference, 2025

From ATOP to ZCube: Automated Topology Optimization Pipeline and A Highly Cost-Effective Network Topology for Large Model Training.

Proceedings of the ACM SIGCOMM 2025 Conference, 2025

Minder: Faulty Machine Detection for Large-scale Distributed Model Training.

Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025

TileLink: Generating Efficient Compute-Communication Overlapping Kernels using Tile-Centric Primitives.

Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

UCM: Fast and Maintainable User-space RDMA Connection Setup.

Proceedings of the 9th Asia-Pacific Workshop on Networking, 2025

2024

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs.

Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

2022

Collie: Finding Performance Anomalies in RDMA Subsystems.

Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

2020

EFLOPS: Algorithm and System Co-Design for a High Performance Distributed Training Platform.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2016

RDMA over Commodity Ethernet at Scale.

Proceedings of the ACM SIGCOMM 2016 Conference, Florianopolis, Brazil, August 22-26, 2016
