Sivasankaran Rajamanickam

Orcid: 0000-0002-5854-409X

According to our database, Sivasankaran Rajamanickam authored at least 99 papers between 2008 and 2024.

Breaking the Molecular Dynamics Timescale Barrier Using a Wafer-Scale System.
CoRR, 2024

Performance Portable Batched Sparse Linear Solvers.
IEEE Trans. Parallel Distributed Syst., May, 2023

Jet: Multilevel Graph Partitioning on GPUs.
CoRR, 2023

Exploiting Inter-Operation Data Reuse in Scientific Applications using GOGETA.
CoRR, 2023

An Experimental Study of Two-level Schwarz Domain-Decomposition Preconditioners on GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

A Comparison of Spectral and Spatial Graph Convolutional Neural Network Kernels Using GraphSAGE-Sparse.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

TenSQL: An SQL Database Built on GraphBLAS.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

Training-free hyperparameter optimization of neural networks for electronic structures in matter.
Mach. Learn. Sci. Technol., December, 2022

A Block-Based Triangle Counting Algorithm on Heterogeneous Environments.
IEEE Trans. Parallel Distributed Syst., 2022

Kokkos 3: Programming Model Extensions for the Exascale Era.
IEEE Trans. Parallel Distributed Syst., 2022

Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication.
IEEE Trans. Parallel Distributed Syst., 2022

FROSch Preconditioners for Land Ice Simulations of Greenland and Antarctica.
SIAM J. Sci. Comput., 2022

Parallel graph coloring algorithms for distributed GPU environments.
Parallel Comput., 2022

PGAbB: A Block-Based Graph Processing Framework for Heterogeneous Platforms.
CoRR, 2022

Enabling Flexibility for Sparse Tensor Acceleration via Heterogeneity.
CoRR, 2022

High-Performance GMRES Multi-Precision Benchmark: Design, Performance, and Challenges.
Proceedings of the IEEE/ACM International Workshop on Performance Modeling, 2022

Parallel, Portable Algorithms for Distance-2 Maximal Independent Set and Graph Coarsening.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Understanding the Design-Space of Sparse/Dense Multiphase GNN dataflows on Spatial Accelerators.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Concentric Spherical Neural Network for 3D Representation Learning.
Proceedings of the International Joint Conference on Neural Networks, 2022

Half-Precision Scalar Support in Kokkos and Kokkos Kernels: An Engineering Study and Experience Report.
Proceedings of the 18th IEEE International Conference on e-Science, 2022

Multicore Algorithms for Graph Connectivity Problems.
Proceedings of the Massive Graph Analytics, 2022

Partitioning Trillion-Edge Graphs.
Proceedings of the Massive Graph Analytics, 2022

Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems.
Parallel Comput., 2021

Co-design Center for Exascale Machine Learning Technologies (ExaLearn).
Int. J. High Perform. Comput. Appl., 2021

EXAGRAPH: Graph and combinatorial methods for enabling exascale applications.
Int. J. High Perform. Comput. Appl., 2021

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic.
Int. J. High Perform. Comput. Appl., 2021

The Kokkos EcoSystem: Comprehensive Performance Portability for High Performance Computing.
Comput. Sci. Eng., 2021

A Study of Mixed Precision Strategies for GMRES on GPUs.
CoRR, 2021

Two-Stage Gauss-Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU cluster.
CoRR, 2021

Kokkos Kernels: Performance Portable Sparse/Dense Linear Algebra and Graph Kernels.
CoRR, 2021

Concentric Spherical GNN for 3D Representation Learning.
CoRR, 2021

A Taxonomy for Classification and Comparison of Dataflows for GNN Accelerators.
CoRR, 2021

Experimental Evaluation of Multiprecision Strategies for GMRES on GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Performance-Portable Graph Coarsening for Efficient Multilevel Graph Analysis.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Extending Sparse Tensor Accelerators to Support Multiple Compression Formats.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

Scalable, Multi-Constraint, Complex-Objective Graph Partitioning.
IEEE Trans. Parallel Distributed Syst., 2020

Scalable Asynchronous Domain Decomposition Solvers.
SIAM J. Sci. Comput., 2020

An Algebraic Sparsified Nested Dissection Algorithm Using Low-Rank Approximations.
SIAM J. Matrix Anal. Appl., 2020

Accelerating Finite-temperature Kohn-Sham Density Functional Theory with Deep Neural Networks.
CoRR, 2020

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic.
CoRR, 2020

ADELUS: A Performance-Portable Dense LU Solver for Distributed-Memory Hardware-Accelerated Systems.
Proceedings of the Accelerator Programming Using Directives - 7th International Workshop, 2020

Distributed Memory Graph Coloring Algorithms for Multiple GPUs.
Proceedings of the 10th IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms, 2020

A performance-portable nonhydrostatic atmospheric dycore for the energy exascale earth system model running at cloud-resolving resolutions.
Proceedings of the International Conference for High Performance Computing, 2020

SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-enabled systems.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Geometric Mapping of Tasks to Processors on Parallel Computers with Mesh or Torus Networks.
IEEE Trans. Parallel Distributed Syst., 2019

A robust hierarchical solver for ill-conditioned systems with applications to ice sheet modeling.
J. Comput. Phys., 2019

How Robust Are Graph Neural Networks to Structural Noise?
CoRR, 2019

Scalable generation of graphs for benchmarking HPC community-detection algorithms.
Proceedings of the International Conference for High Performance Computing, 2019

A Portable SIMD Primitive Using Kokkos for Heterogeneous Architectures.
Proceedings of the Accelerator Programming Using Directives - 6th International Workshop, 2019

A Parallel Graph Algorithm for Detecting Mesh Singularities in Distributed Memory Ice Sheet Simulations.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Linear Algebra-Based Triangle Counting via Fine-Grained Tasking on Heterogeneous Environments: (Update on Static Graph Challenge).
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Scalable Inference for Sparse Deep Neural Networks using Kokkos Kernels.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Scalable Triangle Counting on Distributed-Memory Systems.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures.
Parallel Comput., 2018

A distributed-memory hierarchical solver for general sparse linear systems.
Parallel Comput., 2018

Ensemble Grouping Strategies for Embedded Stochastic Collocation Methods Applied to Anisotropic Diffusion Problems.
SIAM/ASA J. Uncertain. Quantification, 2018

Asynchronous One-Level and Two-Level Domain Decomposition Solvers.
CoRR, 2018

Geometric Partitioning and Ordering Strategies for Task Mapping on Parallel Computers.
CoRR, 2018

Sparse Matrix-Matrix Multiplication on Multilevel Memory Architectures: Algorithms and Experiments.
CoRR, 2018

Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.
CoRR, 2018

Experimental Design of Work Chunking for Graph Algorithms on High Bandwidth Memory Architectures.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Tacho: Memory-Scalable Task Parallel Sparse Cholesky Factorization.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Fast Triangle Counting Using Cilk.
Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Embedded Ensemble Propagation for Improving Performance, Portability, and Scalability of Uncertainty Quantification on Emerging Computational Architectures.
SIAM J. Sci. Comput., 2017

Basker: Parallel sparse LU factorization utilizing hierarchical parallelism and data layouts.
Parallel Comput., 2017

Distributed Graph Layout for Scalable Small-world Network Analysis.
CoRR, 2017

Designing vector-friendly compact BLAS and LAPACK kernels.
Proceedings of the International Conference for High Performance Computing, 2017

Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Partitioning Trillion-Edge Graphs in Minutes.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Fast linear algebra-based triangle counting with KokkosKernels.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Multi-Jagged: A Scalable Parallel Spatial Partitioning Algorithm.
IEEE Trans. Parallel Distributed Syst., 2016

Complex Network Partitioning Using Label Propagation.
SIAM J. Sci. Comput., 2016

Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout.
CoRR, 2016

A survey of direct methods for sparse linear systems.
Acta Numer., 2016

A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Parallel Graph Coloring for Manycore Architectures.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Basker: A Threaded Sparse LU Factorization Utilizing Hierarchical Parallelism and Data Layouts.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

A Comparison of High-Level Programming Choices for Incomplete Sparse Factorization Across Different Architectures.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

High-Performance Graph Analytics on Manycore Processors.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Towards Extreme-Scale Simulations for Low Mach Fluids with Second-Generation Trilinos.
Parallel Process. Lett., 2014

A Hybrid Approach for Parallel Transistor-Level Full-Chip Circuit Simulation.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster.
Proceedings of the International Conference for High Performance Computing, 2014

BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Towards Extreme-Scale Simulations with Next-Generation Trilinos: A Low Mach Fluid Application Case Study.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Exploiting Geometric Partitioning in Task Mapping for Parallel Computers.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Building blocks for graph based network analysis.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

PuLP: Scalable multi-objective multi-constraint partitioning for small-world networks.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

Electrical modeling and simulation for stockpile stewardship.
XRDS, 2013

Scalable matrix computations on large scale-free graphs using 2D graph partitioning.
Proceedings of the International Conference for High Performance Computing, 2013

Amesos2 and Belos: Direct and iterative solvers for large sparse linear systems.
Sci. Program., 2012

ShyLU: A Hybrid-Hybrid Solver for Multicore Platforms.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Multithreaded Algorithms for Maximum Matching in Bipartite Graphs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Parallel partitioning with Zoltan: Is hypergraph partitioning worth it?
Proceedings of the Graph Partitioning and Graph Clustering, 2012

Poster: a hybrid-hybrid solver for manycore platforms.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Enabling Next-Generation Parallel Circuit Simulation with Trilinos.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Algorithm 887: CHOLMOD, Supernodal Sparse Cholesky Factorization and Update/Downdate.
ACM Trans. Math. Softw., 2008
