Teng Su

Orcid: 0009-0005-9517-2845

According to our database, Teng Su authored at least 20 papers between 2012 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

HyperParallel-MoE: Multi-Core Interleaved Scheduling for Fast MoE Training on Ascend NPUs.

CoRR, May, 2026

HyperParallel: A Supernode-Affinity AI Framework.

CoRR, March, 2026

2025

Training Report of TeleChat3-MoE.

CoRR, December, 2025

Cross-Search With Improved Multi-Dimensional Dichotomy-Based Joint Optimization for Distributed Parallel Training of DNN.

IEEE Trans. Parallel Distributed Syst., July, 2025

EfficientMoE: Optimizing Mixture-of-Experts Model Training With Adaptive Load Balance.

IEEE Trans. Parallel Distributed Syst., April, 2025

The knowledge distillation-assisted multimodal model for osteoporosis screening.

Comput. Methods Programs Biomed., 2025

Accelerating Model Training on Ascend Chips: An Industrial System for Profiling, Analysis and Optimization.

Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Efficient and Automatic 3D Parallelism Strategies Search via Contrastive Reinforcement Learning Pretrained Neural Networks.

Proceedings of the 12th IEEE International Conference on Data Science and Advanced Analytics, 2025

BMPipe: Bubble-Memory Co-Optimization Strategy Planner for Very-Large DNN Training.

Proceedings of the IEEE International Conference on Cluster Computing, 2025

2024

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences.

CoRR, 2024

CSIMD: Cross-Search Algorithm with Improved Multi-dimensional Dichotomy for Micro-Batch-Based Pipeline Parallel Training in DNN.

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023

A Survey on Auto-Parallelism of Large-Scale Deep Learning Training.

IEEE Trans. Parallel Distributed Syst., August, 2023

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X.

CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.

CoRR, 2023

TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks.

CoRR, 2023

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X.

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

2022

TensorOpt: Exploring the Tradeoffs in Distributed DNN Training With Auto-Parallelism.

IEEE Trans. Parallel Distributed Syst., 2022

2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.

CoRR, 2021

2015

Task-D: A Task Based Programming Framework for Distributed System.

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2012

A Family of Fast Hadamard-Fourier Transform Algorithms.

Teng Su

Feng Yu

IEEE Signal Process. Lett., 2012
