Jiangsu Du
ORCID: 0000-0003-4707-9492
Affiliations:
- Sun Yat-Sen University, Guangzhou, China
According to our database, Jiangsu Du authored at least 34 papers between 2019 and 2025.
Bibliography
2025
CoRR, August, 2025
TD-Pipe: Temporally-Disaggregated Pipeline Parallelism Architecture for High-Throughput LLM Inference.
CoRR, June, 2025
Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism.
CoRR, May, 2025
SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation.
CoRR, May, 2025
EcoServe: Enabling Cost-effective LLM Serving with Proactive Intra- and Inter-Instance Orchestration.
CoRR, April, 2025
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling.
CoRR, April, 2025
Proceedings of the ACM on Web Conference 2025, 2025
Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025
2024
IncrCP: Decomposing and Orchestrating Incremental Checkpoints for Effective Recommendation Model Training.
Proc. VLDB Endow., December, 2024
IEEE Trans. Parallel Distributed Syst., November, 2024
SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems.
J. Comput. Sci. Technol., March, 2024
APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes.
Proceedings of the International Conference for High Performance Computing, 2024
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the Network and Parallel Computing, 2024
Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference.
Proceedings of the IEEE INFOCOM 2024, 2024
Proceedings of the Euro-Par 2024: Parallel Processing, 2024
Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
2023
Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
ACM Trans. Archit. Code Optim., December, 2023
Hierarchical Model Parallelism for Optimizing Inference on Many-core Processor via Decoupled 3D-CNN Structure.
ACM Trans. Archit. Code Optim., September, 2023
Parallel Comput., September, 2023
IEEE Trans. Parallel Distributed Syst., July, 2023
Proceedings of the 41st IEEE International Conference on Computer Design, 2023
Accelerating Inference of 3D-CNN on ARM Many-core CPU via Hierarchical Model Partition.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023
Proceedings of the Advanced Parallel Processing Technologies, 2023
2022
Parallel Comput., 2022
IEEE Internet Things J., 2022
CoRR, 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the 51st International Conference on Parallel Processing, 2022
2021
Model Parallelism Optimization for Distributed Inference Via Decoupled CNN Structure.
IEEE Trans. Parallel Distributed Syst., 2021
2020
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
2019
Proceedings of the Algorithms and Architectures for Parallel Processing, 2019
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019