Jiangsu Du

Orcid: 0000-0003-4707-9492

According to our database, Jiangsu Du authored at least 19 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

2023
Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
ACM Trans. Archit. Code Optim., December, 2023

Hierarchical Model Parallelism for Optimizing Inference on Many-core Processor via Decoupled 3D-CNN Structure.
ACM Trans. Archit. Code Optim., September, 2023

Optimizing massively parallel sparse matrix computing on ARM many-core processor.
Parallel Comput., September, 2023

Full-Stack Optimizing Transformer Inference on ARM Many-Core CPU.
IEEE Trans. Parallel Distributed Syst., July, 2023

ATP: Adaptive Tensor Parallelism for Foundation Models.
CoRR, 2023

MixRec: Orchestrating Concurrent Recommendation Model Training on CPU-GPU platform.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

Accelerating Inference of 3D-CNN on ARM Many-core CPU via Hierarchical Model Partition.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Enhancing Multi-physics Coupling on ARM Many-Core Cluster.
Proceedings of the Advanced Parallel Processing Technologies, 2023

2022
Optimizing small channel 3D convolution on GPU with tensor core.
Parallel Comput., 2022

Enhancing Distributed In-Situ CNN Inference in the Internet of Things.
IEEE Internet Things J., 2022

SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems.
CoRR, 2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models.
CoRR, 2022

Handling heavy-tailed input of transformer inference on GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Characterizing and Optimizing Transformer Inference on ARM Many-core Processor.
Proceedings of the 51st International Conference on Parallel Processing, 2022

2021
Model Parallelism Optimization for Distributed Inference Via Decoupled CNN Structure.
IEEE Trans. Parallel Distributed Syst., 2021

2020
A Distributed In-Situ CNN Inference System for IoT Applications.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

2019
Understanding the Resource Demand Differences of Deep Neural Network Training.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2019

P-SOBI: A Parallel Implementation for Second Order Blind Identification Algorithm.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

