Junmin Xiao

Orcid: 0000-0003-0457-4709

According to our database, Junmin Xiao authored at least 21 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Exploiting Fine-Grained Redundancy in Set-Centric Graph Pattern Mining.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

A Coordinated Strategy for GNN Combining Computational Graph and Operator Optimizations.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023
Accelerating k-Shape Time Series Clustering Algorithm Using GPU.
IEEE Trans. Parallel Distributed Syst., October, 2023

SI on parallel system and algorithm optimization.
CCF Trans. High Perform. Comput., September, 2023

AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format.
IEEE Trans. Parallel Distributed Syst., March, 2023

Adaptive Workload-Balanced Scheduling Strategy for Global Ocean Data Assimilation on Massive GPUs.
Proceedings of the International Conference for High Performance Computing, 2023

GraphPar: Efficient Workload-Aware Subgraph Matching System on Multiple GPUs.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

2022
Fast and accurate variable batch size convolution neural network training on large scale distributed systems.
Concurr. Comput. Pract. Exp., 2022

W-Cycle SVD: A Multilevel Algorithm for Batched SVD on GPUs.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

A W-cycle algorithm for efficient batched SVD on GPUs.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

MegTaiChi: dynamic tensor-based memory management optimization for DNN training.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021
I/O lower bounds for auto-tuning of convolutions in CNNs.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

2020
Fast Data-Obtaining Algorithm for Data Assimilation with Large Data Set.
Int. J. Parallel Program., 2020

Communication Lower Bounds of Convolutions in CNNs.
Proceedings of the SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, 2020

2019
Trade-offs between computation, communication, and synchronization in stencil-collective alternate update.
CCF Trans. High Perform. Comput., 2019

S-EnKF: co-designing for scalable ensemble Kalman filter.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

Tensor Layout Optimization of Convolution for Inference on Digital Signal Processor.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

A Variable Batch Size Strategy for Large Scale Distributed DNN Training.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

2018
Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model.
Proceedings of the 47th International Conference on Parallel Processing, 2018

AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

2013
Multilevel correction for collocation solutions of Volterra integral equations with proportional delays.
Adv. Comput. Math., 2013

