Piyush Sao

Orcid: 0000-0002-9432-5855

According to our database, Piyush Sao authored at least 23 papers between 2013 and 2024.

Bibliography

2024
PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU.
CoRR, 2024

2023
Newly Released Capabilities in the Distributed-Memory SuperLU Sparse Direct Solver.
ACM Trans. Math. Softw., March, 2023

Brief Announcement: Communication Optimal Sparse LU Factorization for Planar Matrices.
Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 2023

Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters.
Proceedings of the International Conference for High Performance Computing, 2023

Optimizing Communication in 2D Grid-Based MPI Applications at Exascale.
Proceedings of the 30th European MPI Users' Group Meeting, 2023

2022
Exaflops Biomedical Knowledge Graph Analytics.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

A single-tree algorithm to compute the Euclidean minimum spanning tree on GPUs.
Proceedings of the 51st International Conference on Parallel Processing, 2022

2021
Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Scalable All-pairs Shortest Paths for Huge Graphs on Multi-GPU Clusters.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

2020
Traversing Large Graphs on GPUs with Unified Memory.
Proc. VLDB Endow., 2020

Scalable knowledge graph analytics at 136 petaflop/s.
Proceedings of the International Conference for High Performance Computing, 2020

A supernodal all-pairs shortest path algorithm.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

2019
A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems.
J. Parallel Distributed Comput., 2019

Self-stabilizing Connected Components.
Proceedings of the 9th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2019

Multifrontal Non-negative Matrix Factorization.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

A communication-avoiding 3D sparse triangular solver.
Proceedings of the ACM International Conference on Supercomputing, 2019

2018
Scalable and resilient sparse linear solvers.
PhD thesis, 2018

A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

2016
A Self-Correcting Connected Components Algorithm.
Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale, 2016

2015
A Sparse Direct Solver for Distributed Memory Xeon Phi-Accelerated Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
A distributed kernel summation framework for general-dimension machine learning.
Stat. Anal. Data Min., 2014

A Distributed CPU-GPU Sparse Direct Solver.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
Self-stabilizing iterative solvers.
Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2013

