Qingxiao Sun

Orcid: 0000-0003-2927-362X

According to our database, Qingxiao Sun authored at least 19 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs.
IEEE Trans. Parallel Distributed Syst., January, 2024

2023
Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2022
Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
IEEE Trans. Computers, 2022

QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU.
Parallel Comput., 2022

Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU.
CoRR, 2022

CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Towards Optimized Streaming Tensor Completion on multiple GPUs.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
The Deep Learning Compiler: A Comprehensive Survey.
IEEE Trans. Parallel Distributed Syst., 2021

Towards efficient canonical polyadic decomposition on sunway many-core processor.
Inf. Sci., 2021

Highly scalable parallel genetic algorithm on Sunway many-core processors.
Future Gener. Comput. Syst., 2021

An optimized tensor completion library for multiple GPUs.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

csTuner: Scalable Auto-tuning Framework for Complex Stencil Computation on GPUs.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020
The Deep Learning Compiler: A Comprehensive Survey.
CoRR, 2020

SpTFS: sparse tensor format selection for MTTKRP via deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

Accelerating De Novo Assembler WTDBG2 on Commodity Servers.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2020

2019
Improving Thread-level Parallelism in GPUs Through Expanding Register File to Scratchpad Memory.
ACM Trans. Archit. Code Optim., 2019

SMQoS: Improving Utilization and Energy Efficiency with QoS Awareness on GPUs.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

