# Padma Raghavan

## Awards

#### IEEE Fellow

IEEE Fellow 2014, "For contributions to robust scalable sparse solvers and energy-efficient parallel scientific computing".

## Bibliography

2017

An embedded sectioning scheme for multiprocessor topology-aware mapping of irregular applications.

IJHPCA, 2017

Co-Scheduling Algorithms for Cache-Partitioned Systems.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

2016

Co-scheduling algorithms for high-throughput workload execution.

J. Scheduling, 2016

Research and Education in Computational Science and Engineering.

CoRR, 2016

Locality-Aware Laplacian Mesh Smoothing.

CoRR, 2016

Locality-Aware Laplacian Mesh Smoothing.

Proceedings of the 45th International Conference on Parallel Processing, 2016

2015

STS-k: a multilevel sparse triangular solution scheme for NUMA multicores.

Proceedings of the International Conference for High Performance Computing, 2015

Phase Detection with Hidden Markov Models for DVFS on Many-Core Processors.

Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015

2014

Hybrid Sparse Linear Solutions with Substituted Factorization.

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

A multilevel compressed sparse row format for efficient sparse computations on multicore processors.

Proceedings of the 21st International Conference on High Performance Computing, 2014

2013

Special Issue: Selected Papers from Super Computing 2012.

Scientific Programming, 2013

Speedup-Aware Co-Schedules for Efficient Workload Management.

Parallel Processing Letters, 2013

Co-Scheduling Algorithms for High-Throughput Workload Execution.

CoRR, 2013

Scalable parallel graph partitioning.

Proceedings of the International Conference for High Performance Computing, 2013

Interference Resolver in Shared Storage Systems to Provide Fairness to I/O Intensive Applications.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012

Similarity Graph Neighborhoods for Enhanced Supervised Classification.

Proceedings of the International Conference on Computational Science, 2012

NUMA-aware graph mining techniques for performance and energy efficiency.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Fault tolerant preconditioned conjugate gradient for sparse linear system solution.

Proceedings of the International Conference on Supercomputing, 2012

Adapting Sparse Triangular Solution to GPUs.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Phase Partitioning Methods for I/O Cache Optimization.

Proceedings of the 41st International Conference on Parallel Processing, 2012

2011

Can models of scientific software-hardware interactions be predictive?

Proceedings of the International Conference on Computational Science, 2011

A Multilevel Cholesky Conjugate Gradients Hybrid Solver for Linear Systems with Multiple Right-hand Sides.

Proceedings of the International Conference on Computational Science, 2011

Exploiting dense substructures for fast sparse matrix vector multiplication.

IJHPCA, 2011

Virtual I/O caching: dynamic storage cache management for concurrent workloads.

Proceedings of the Conference on High Performance Computing Networking, 2011

Characterizing the impact of soft errors on iterative methods in scientific computing.

Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

2010

Parallel Hybrid Preconditioning: Incomplete Factorization with Selective Sparse Approximate Inversion.

SIAM J. Scientific Computing, 2010

PFFTC: An improved fast Fourier transform for the IBM cell broadband engine.

Proceedings of the International Conference on Computational Science, 2010

Characterizing sparse preconditioner performance for the support vector machine kernel.

Proceedings of the International Conference on Computational Science, 2010

Intra-application shared cache partitioning for multithreaded applications.

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Intra-application cache partitioning.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

T-NUCA - a novel approach to non-uniform access latency cache architectures for 3D CMPs.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Analyzing the soft error resilience of linear solvers on multicore multiprocessors.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Dynamic core partitioning for energy efficiency.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Feature subspace transformations for enhancing k-means clustering.

Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009

Adapting application execution in CMPs using helper threads.

J. Parallel Distrib. Comput., 2009

Towards Low-Cost, High-Accuracy Classifiers for Linear Solver Selection.

Proceedings of the Computational Science, 2009

Adapting Application Mapping to Systematic Within-Die Process Variations on Chip Multiprocessors.

Proceedings of the High Performance Embedded Architectures and Compilers, 2009

Hybrid Techniques for Fast Multicore Simulation.

Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Markov Model Based Disk Power Management for Data Intensive Workloads.

Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008

Evaluating the role of scratchpad memories in chip multiprocessors for sparse matrix computations.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Managing power, performance and reliability trade-offs.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Towards energy efficient scaling of scientific codes.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

A helper thread based EDP reduction scheme for adapting application execution in CMPs.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Ring data location prediction scheme for Non-Uniform Cache Architectures.

Proceedings of the 26th International Conference on Computer Design, 2008

2007

Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling.

The Journal of Supercomputing, 2007

Phase-aware adaptive hardware selection for power-efficient scientific computations.

Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

Analysis of the IPv4 Address Space Delegation Structure.

Proceedings of the 12th IEEE Symposium on Computers and Communications (ISCC 2007), 2007

Memory Optimizations For Fast Power-Aware Sparse Computations.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Load Miss Prediction - Exploiting Power Performance Trade-offs.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Link Shutdown Opportunities During Collective Communications in 3-D Torus Nets.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Ring Prediction for Non-Uniform Cache Architectures.

Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), 2007

2006

Effective Preconditioning through Ordering Interleaved with Incomplete Factorization.

SIAM J. Matrix Analysis Applications, 2006

Poster reception - Toward a power efficient computer architecture for Barnes-Hut N-body simulations.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Poster reception - Energy/performance modeling for collective communication in 3-D torus cluster networks.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Integrated link/CPU voltage scaling for reducing energy consumption of parallel sparse matrix applications.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Conjugate gradient sparse solvers: performance-power characteristics.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

On improving performance and energy profiles of sparse scientific applications.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Characterizing the Performance and Energy Attributes of Scientific Simulations.

Proceedings of the Computational Science, 2006

2005

Adaptive Software for Scientific Computing: Co-Managing Quality-Performance-Power Tradeoffs.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Reducing Power with Performance Constraints for Parallel Sparse Applications.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Multi-pass Mapping Schemes for Parallel Sparse Matrix Computations.

Proceedings of the Computational Science, 2005

2004

Faster PDE-based simulations using robust composite linear solvers.

Future Generation Comp. Syst., 2004

Parallel Hybrid Sparse Solvers Through Flexible Incomplete Cholesky Preconditioning.

Proceedings of the Applied Parallel Computing, 2004

Advanced Algorithms and Software Components for Scientific Computing: An Introduction.

Proceedings of the Applied Parallel Computing, 2004

Towards a Grid enabled system for multicomponent materials design.

Proceedings of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004

2003

A latency tolerant hybrid sparse solver using incomplete Cholesky factorization.

Numerical Lin. Alg. with Applic., 2003

Time-Memory Trade-Offs Using Sparse Matrix Methods for Large-Scale Eigenvalue Problems.

Proceedings of the Computational Science and Its Applications, 2003

The Role of Multi-method Linear Solvers in PDE-based Simulations.

Proceedings of the Computational Science and Its Applications, 2003

2002

Large-Scale Normal Coordinate Analysis on Distributed Memory Parallel Systems.

IJHPCA, 2002

A new data-mapping scheme for latency-tolerant distributed sparse triangular solution.

Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

A Combinatorial Scheme for Developing Efficient Composite Solvers.

Proceedings of the Computational Science - ICCS 2002, 2002

2001

Level search schemes for information filtering and retrieval.

Inf. Process. Manage., 2001

Scalable Preconditioning Using Incomplete Factors.

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

2000

Towards a Scalable Hybrid Sparse Solver.

Concurrency - Practice and Experience, 2000

A Grid Computing Environment for Enabling Large Scale Quantum Mechanical Simulations.

Proceedings of the Grid Computing, 2000

1999

Performance of Greedy Ordering Heuristics for Sparse Cholesky Factorization.

SIAM J. Matrix Analysis Applications, 1999

Incomplete Cholesky Parallel Preconditioners with Selective Inversion.

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

1998

Efficient Parallel Sparse Triangular Solution Using Selective Inversion.

Parallel Processing Letters, 1998

1997

Parallel Ordering Using Edge Contraction.

Parallel Computing, 1997

Performance of a Fully Parallel Sparse Solver.

IJHPCA, 1997

1995

Distributed Sparse Gaussian Elimination and Orthogonal Factorization.

SIAM J. Scientific Computing, 1995

A Cartesian Parallel Nested Dissection Algorithm.

SIAM J. Matrix Analysis Applications, 1995