Shengwei Li

ORCID: 0000-0002-7419-1511

According to our database, Shengwei Li authored at least 18 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
A Memory-Efficient Hybrid Parallel Framework for Deep Neural Network Training.
IEEE Trans. Parallel Distributed Syst., April, 2024

2023
Merak: An Efficient Distributed DNN Training Framework With Automated 3D Parallelism for Giant Foundation Models.
IEEE Trans. Parallel Distributed Syst., May, 2023

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent.
CoRR, 2023

Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training.
CoRR, 2023

Improved Accuracy of XCO2 Retrieval Based on OCO-2 RtRetrieval Framework Model.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2023

Observing Anthropogenic CO2 Emissions with TanSat in Northeast China.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2023

Monitoring of Xch4 Changes and Anomaly in Hebei Province, China Based on Tropomi.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2023

Communication Analysis for Multidimensional Parallel Training of Large-scale DNN Models.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

Prophet: Fine-grained Load Balancing for Parallel Training of Large-scale MoE Models.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

Stability-Based Generalization Analysis of the Asynchronous Decentralized SGD.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Keyword Spotting Based on CTC and Similarity Matching for Chinese Speech.
Proceedings of the 23rd IEEE/ACIS International Conference on Computer and Information Science, 2023

2022
EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks.
Proceedings of the 51st International Conference on Parallel Processing, 2022

AutoPipe: A Fast Pipeline Parallelism Approach with Balanced Partitioning and Micro-batch Slicing.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
EmbRace: Accelerating Sparse Communication for Distributed Training of NLP Neural Networks.
CoRR, 2021

Hippie: A Data-Paralleled Pipeline Approach to Improve Memory-Efficiency and Scalability for Large DNN Training.
Proceedings of the 50th International Conference on Parallel Processing, 2021

2PGraph: Accelerating GNN Training over Large Graphs on GPU Clusters.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2009
Mining closed frequent itemset based on FP-Tree.
Proceedings of the 2009 IEEE International Conference on Granular Computing, 2009

