Xuegui Zheng

Orcid: 0009-0005-4635-7309

According to our database, Xuegui Zheng authored at least 7 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
DITRON: Distributed Multi-level Tiling Compiler for Parallel Tensor Programs.
CoRR, May, 2026

UniEP: Unified Expert-Parallel MoE MegaKernel for LLM Training.
CoRR, April, 2026

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production.
Proceedings of the 21st European Conference on Computer Systems, 2026

2025
Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler.
CoRR, April, 2025

TileLink: Generating Efficient Compute-Communication Overlapping Kernels using Tile-Centric Primitives.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

2024
Towards optimized tensor code generation for deep learning on sunway many-core processor.
Frontiers Comput. Sci., April, 2024

Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

