Teng Su

Orcid: 0009-0005-9517-2845

According to our database, Teng Su authored at least 19 papers between 2012 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
HyperParallel: A Supernode-Affinity AI Framework.
CoRR, March, 2026

2025
Training Report of TeleChat3-MoE.
CoRR, December, 2025

Cross-Search With Improved Multi-Dimensional Dichotomy-Based Joint Optimization for Distributed Parallel Training of DNN.
IEEE Trans. Parallel Distributed Syst., July, 2025

EfficientMoE: Optimizing Mixture-of-Experts Model Training With Adaptive Load Balance.
IEEE Trans. Parallel Distributed Syst., April, 2025

The knowledge distillation-assisted multimodal model for osteoporosis screening.
Comput. Methods Programs Biomed., 2025

Accelerating Model Training on Ascend Chips: An Industrial System for Profiling, Analysis and Optimization.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Efficient and Automatic 3D Parallelism Strategies Search via Contrastive Reinforcement Learning Pretrained Neural Networks.
Proceedings of the 12th IEEE International Conference on Data Science and Advanced Analytics, 2025

BMPipe: Bubble-Memory Co-Optimization Strategy Planner for Very-Large DNN Training.
Proceedings of the IEEE International Conference on Cluster Computing, 2025

2024
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences.
CoRR, 2024

CSIMD: Cross-Search Algorithm with Improved Multi-dimensional Dichotomy for Micro-Batch-Based Pipeline Parallel Training in DNN.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023
A Survey on Auto-Parallelism of Large-Scale Deep Learning Training.
IEEE Trans. Parallel Distributed Syst., August, 2023

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X.
CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.
CoRR, 2023

TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks.
CoRR, 2023

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

2022
TensorOpt: Exploring the Tradeoffs in Distributed DNN Training With Auto-Parallelism.
IEEE Trans. Parallel Distributed Syst., 2022

2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.
CoRR, 2021

2015
Task-D: A Task Based Programming Framework for Distributed System.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2012
A Family of Fast Hadamard-Fourier Transform Algorithms.
IEEE Signal Process. Lett., 2012
