Zijie Yan

According to our database, Zijie Yan authored at least 9 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2026
Scalable Training of Mixture-of-Experts Models with Megatron Core.
CoRR, March, 2026

HYDRA: Unearthing "Black Swan" Vulnerabilities in LEO Satellite Networks.
CoRR, February, 2026

2025
MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core.
CoRR, April, 2025

2024
Llama 3 Meets MoE: Efficient Upcycling.
CoRR, 2024

Upcycling Large Language Models into Mixture of Experts.
CoRR, 2024

2020
Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Standard Deviation Based Adaptive Gradient Compression For Distributed Deep Learning.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019
Gradient Sparification for Asynchronous Distributed Training.
CoRR, 2019

2018
More Effective Distributed Deep Learning Using Staleness Based Parameter Updating.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

