Zijie Yan

According to our database¹, Zijie Yan authored at least 9 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Scalable Training of Mixture-of-Experts Models with Megatron Core.

Vijay Anand Korthikanti

CoRR, March, 2026

HYDRA: Unearthing "Black Swan" Vulnerabilities in LEO Satellite Networks.

CoRR, February, 2026

2025

MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core.

CoRR, April, 2025

2024

Llama 3 Meets MoE: Efficient Upcycling.

CoRR, 2024

Upcycling Large Language Models into Mixture of Experts.

CoRR, 2024

2020

Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning.

Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Standard Deviation Based Adaptive Gradient Compression For Distributed Deep Learning.

Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019

Gradient Sparification for Asynchronous Distributed Training.

Zijie Yan

CoRR, 2019

2018

More Effective Distributed Deep Learning Using Staleness Based Parameter Updating.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2018
