David E. Keyes

ACM Trans. Math. Softw., 2016

KBLAS: An Optimized Library for Dense Matrix-Vector Multiplication on GPU Accelerators.

Ahmad Abdelfattah

ACM Trans. Math. Softw., 2016

Accelerated Dimension-Independent Adaptive Metropolis.

SIAM J. Sci. Comput., 2016

Convergence Analysis for the Multiplicative Schwarz Preconditioned Inexact Newton Algorithm.

Lulu Liu

SIAM J. Numer. Anal., 2016

Unstructured computational aerodynamics on many integrated core architecture.

Mohammed A. Al Farhan

Dinesh K. Kaushik

Parallel Comput., 2016

A performance model for the communication in fast multipole methods on high-performance computing platforms.

Int. J. High Perform. Comput. Appl., 2016

Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation.

CoRR, 2016

Research and Education in Computational Science and Engineering.

CoRR, 2016

A Matrix-free Preconditioner for the Helmholtz Equation based on the Fast Multipole Method.

CoRR, 2016

A Direct Elliptic Solver Based on Hierarchically Low-rank Schur Complements.

Gustavo Chávez

George M. Turkiyyah

CoRR, 2016

Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs.

Concurr. Comput. Pract. Exp., 2016

Efficiency of High Order Spectral Element Methods on Petascale Architectures.

Proceedings of the High Performance Computing - 31st International Conference, 2016

On the Robustness and Prospects of Adaptive BDDC Methods for Finite Element Discretizations of Elliptic PDEs with High-Contrast Coefficients.

Stefano Zampini

Proceedings of the Platform for Advanced Scientific Computing Conference, 2016

Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Efficient Sphere Detector Algorithm for Massive MIMO using GPU Hardware Accelerator.

Mohamed Amine Arfaoui

Proceedings of the International Conference on Computational Science 2016, 2016

High Performance Polar Decomposition on Distributed Memory Systems.

Dalal Sukkari

Proceedings of the Euro-Par 2016: Parallel Processing, 2016

Redesigning Triangular Dense Matrix Computations on GPUs.

Ali Charara

Proceedings of the Euro-Par 2016: Parallel Processing, 2016

2015

Dense Matrix Computations on NUMA Architectures with Distance-Aware Work Stealing.

Supercomput. Front. Innov., 2015

Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates.

SIAM J. Sci. Comput., 2015

Field-Split Preconditioned Inexact Newton Algorithms.

Lulu Liu

SIAM J. Sci. Comput., 2015

A parallel domain decomposition-based implicit method for the Cahn-Hilliard-Cook phase-field equation in 3D.

J. Comput. Phys., 2015

Smooth and robust solutions for Dirichlet boundary control of fluid-solid conjugate heat transfer problems.

Yan Yan

J. Comput. Phys., 2015

Multi-dimensional intra-tile parallelization for memory-starved stencil computations.

CoRR, 2015

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms.

Amani AlOnazi

Alexey L. Lastovetsky

Vladimir Rychkov

CoRR, 2015

Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

A Scalable Community Detection Algorithm for Large Graphs Using Stochastic Block Models.

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Composing Algorithmic Skeletons to Express High-Performance Scientific Applications.

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications.

Ahmad Abdelfattah

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014

Communication Complexity of the Fast Multipole Method and its Algebraic Variants.

George Turkiyyah

Supercomput. Front. Innov., 2014

Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking.

CoRR, 2014

A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms.

CoRR, 2014

Asynchronous Execution of the Fast Multipole Method Using Charm++.

Mustafa Abdul Jabbar

CoRR, 2014

Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System.

Proceedings of the International Conference for High Performance Computing, 2014

High Performance Pseudo-analytical Simulation of Multi-Object Adaptive Optics over Multi-GPU Systems.

Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013

The Miracle, Mandate and Mirage of High Performance Computing.

it Inf. Technol., 2013

Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor.

Int. J. High Perform. Comput. Appl., 2013

Multiphysics simulations: Challenges and opportunities.

Int. J. High Perform. Comput. Appl., 2013

Topic 14+16: High-Performance and Scientific Applications and Extreme-Scale Computing - (Introduction).

Marie-Christine Sawley

Thomas C. Schulthess

John Shalf

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012

A Quasi-algebraic Multigrid Approach to Fracture Problems Based on Extended Finite Elements.

SIAM J. Sci. Comput., 2012

Numerical simulation of four-field extended magnetohydrodynamics in dynamically adaptive curvilinear coordinates via Newton-Krylov-Schwarz.

Xuefei Yuan

Stephen C. Jardin

J. Comput. Phys., 2012

Optimizing Memory-Bound SYMV Kernel on GPU Hardware Accelerators.

Proceedings of the High Performance Computing for Computational Science, 2012

Multiplicative Algorithms for Constrained Non-negative Matrix Factorization.

Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Systematic Approach in Optimizing Numerical Memory-Bound Kernels on GPU.

Ahmad Abdelfattah

Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

2011

Special Section: 2010 Copper Mountain Conference.

SIAM J. Sci. Comput., 2011

The International Exascale Software Project roadmap.

Bertrand Braunschweig

Int. J. High Perform. Comput. Appl., 2011

Moving grids for magnetic reconnection via Newton-Krylov methods.

Xuefei Yuan

Stephen C. Jardin

Comput. Phys. Commun., 2011

Hybrid Programming Model for Implicit PDE Simulations on Multicore Architectures.

Proceedings of the OpenMP in the Petascale Era - 7th International Workshop on OpenMP, 2011

2010

Application of Alternating Decision Trees in Selecting Sparse Linear Solvers.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

2009

Linear augmented Slater-type orbital method for free standing clusters.

J. Comput. Chem., 2009

Partial Differential Equation-Based Applications and Solvers At Extreme Scale.

Int. J. High Perform. Comput. Appl., 2009

Modeling wildland fire propagation with level set methods.

Vivien Mallet

F. E. Fendell

Comput. Math. Appl., 2009

2008

Special Issue on Computational Science and Engineering.

Chris R. Johnson

Ulrich Rüde

SIAM J. Sci. Comput., 2008

2007

Additive Schwarz-based fully coupled implicit methods for resistive Hall magnetohydrodynamic problems.

J. Comput. Phys., 2007

Reconstructing parameters of the FitzHugh-Nagumo system from boundary potential measurements.

Yuan He

J. Comput. Neurosci., 2007

Petaflop/s, Seriously.

Proceedings of the High Performance Computing, 2007

2006

Multi-core issues - Multi-Core for HPC: breakthrough or breakdown?

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

M06 - Issues for the future of supercomputing: impact of Moore's law and architecture on application performance.

Erik DeBenedictis

Bart G. van Bloemen Waanders

Peter M. Kogge

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Grid-based Image Registration.

Proceedings of the Grid-Based Problem Solving Environments, 2006

Parallel Algorithms for PDE-Constrained Optimization.

Proceedings of the Parallel Processing for Scientific Computing, 2006

2004

Topic 11: Numerical Algorithms.

Proceedings of the Euro-Par 2004 Parallel Processing, 2004

2003

Pseudotransient Continuation and Differential-Algebraic Equations.

Todd S. Coffey

C. T. Kelley

SIAM J. Sci. Comput., 2003

2002

Nonlinearly Preconditioned Inexact Newton Algorithms.

Xiao-Chuan Cai

SIAM J. Sci. Comput., 2002

2001

High-performance parallel implicit CFD.

Parallel Comput., 2001

2000

Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD.

Int. J. High Perform. Comput. Appl., 2000

Performance Modeling and Tuning of an Unstructured Mesh CFD Application.

Proceedings of Supercomputing 2000, 2000

Analyzing the Parallel Scalability of an Implicit Unstructured Mesh CFD Code.

Proceedings of the High Performance Computing, 2000

Four Horizons for Enhancing the Performance of Parallel Simulations Based on Partial Differential Equations.

Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999

Adapting to Hostile Architectural Environments.

Parallel Distributed Comput. Pract., 1999

Three Parallel Programming Paradigms: Comparisons on an Archetypal PDE Computation.

M. Ethtesham Hayder

Cos S. Ierotheou

Parallel Distributed Comput. Pract., 1999

Achieving High Sustained Performance in an Unstructured Mesh CFD Application.

Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver.

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

1998

Parallel Newton-Krylov-Schwarz Algorithms for the Transonic Full Potential Equation.

SIAM J. Sci. Comput., 1998

1997

Parallel Implicit PDE Computations.

Proceedings of the Conference on Parallel Computational Fluid Dynamics 1997, 1997

1996

A Hyperbolic Model for Communications in Layered Parallel Processing Environments.

Ion Stoica

Florin Sultan

J. Parallel Distributed Comput., 1996

Evaluating the Hyperbolic Model on a Variety of Architectures.

Ion Stoica

Florin Sultan

Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995

Modeling Communication in Cluster Computing.

Ion Stoica

Florin Sultan

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

1994

Towards Polyalgorithmic Linear System Solvers for Nonlinear Elliptic Problems.

SIAM J. Sci. Comput., 1994

A comparison of some domain decomposition and ILU preconditioned iterative methods for nonsymmetric elliptic problems.

Xiao-Chuan Cai

William D. Gropp

Numer. Linear Algebra Appl., 1994

1992

Domain Decomposition with Local Mesh Refinement.

William D. Gropp

SIAM J. Sci. Comput., 1992

Parallel Performance of Domain-Decomposed Preconditioned Krylov Methods for PDEs with Locally Uniform Refinement.

William D. Gropp

SIAM J. Sci. Comput., 1992

1989

Domain decomposition on parallel computers.

IMPACT Comput. Sci. Eng., 1989

Balanced Divide-and-Conquer Algorithms for the Fine-Grained Parallel Direct Solution of Dense and Banded Triangular Linear Systems and their Connection Machine Implementation.

Z. George Mou

Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

Parallel Domain Decomposition with Local Mesh Refinement.

Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

1987

Analysis of a Parallelized Elliptic Solver for Reacting Flows-Abstract.

Mitchell D. Smooke

Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

1985

A comparison of domain decomposition techniques for elliptic partial differential equations and their parallel implementation.
