James Demmel

Proceedings of the SNC 2011, 2011

Improving communication performance in dense linear algebra via topology aware collectives.

Edgar Solomonik

Abhinav Bhatele

Proceedings of the Conference on High Performance Computing Networking, 2011

Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Communication-Avoiding QR Decomposition for GPUs.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

On improving trust-region variable projection algorithms for separable nonlinear least squares learning.

Proceedings of the 2011 International Joint Conference on Neural Networks, 2011

Rethinking algorithms for future architectures: Communication-avoiding algorithms.

Jim Demmel

Proceedings of the 2011 IEEE Hot Chips 23 Symposium (HCS), 2011

Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms.

Edgar Solomonik

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Avoiding Communication in Numerical Linear Algebra.

Proceedings of the Thirteenth Workshop on Algorithm Engineering and Experiments, 2011

2010

Communication-optimal Parallel and Sequential Cholesky Decomposition.

SIAM J. Sci. Comput., 2010

Minimizing Communication for Eigenproblems and the Singular Value Decomposition

Grey Ballard

CoRR, 2010

Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem.

Proceedings of the SPAA 2010: Proceedings of the 22nd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2010

2009

Extra-Precise Iterative Refinement for Overdetermined Least Squares Problems.

ACM Trans. Math. Softw., 2009

Nonnegative Diagonals and High Performance on Low-Profile Matrices from Householder QR.

SIAM J. Sci. Comput., 2009

Minimizing Communication in Linear Algebra

CoRR, 2009

A view of the parallel computing landscape.

Commun. ACM, 2009

Communication-optimal parallel and sequential Cholesky decomposition: extended abstract.

Proceedings of the SPAA 2009: Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2009

Minimizing communication in sparse matrix solvers.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

2008

Algorithm 880: A testing infrastructure for symmetric tridiagonal eigensolvers.

ACM Trans. Math. Softw., 2008

Cache efficient bidiagonalization using BLAS 2.5 operators.

ACM Trans. Math. Softw., 2008

Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers.

SIAM J. Sci. Comput., 2008

Continuation of Invariant Subspaces in Large Bifurcation Problems.

David Bindel

Mark J. Friedman

SIAM J. Sci. Comput., 2008

Sparse SOS Relaxations for Minimizing Functions that are Summations of Small Polynomials.

SIAM J. Optim., 2008

Global minimization of rational functions and the nearest GCDs.

Ming Gu

J. Glob. Optim., 2008

Communication-avoiding parallel and sequential QR factorizations

CoRR, 2008

Benchmarking GPUs to tune dense linear algebra.

Vasily Volkov

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Communication avoiding Gaussian elimination.

Laura Grigori

Hua Xiang

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Avoiding communication in sparse matrix computations.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007

Prospectus for a Dense Linear Algebra Software Library.

Proceedings of the Handbook of Parallel Computing - Models, Algorithms and Applications, 2007

Parallel Symbolic Factorization for Sparse LU with Static Pivoting.

Laura Grigori

SIAM J. Sci. Comput., 2007

Fast matrix multiplication is stable.

Numerische Mathematik, 2007

Fast linear algebra is stable.

Olga Holtz

Numerische Mathematik, 2007

Accurate and Efficient Expression Evaluation and Linear Algebra

CoRR, 2007

When cache blocking of sparse matrix vector multiply works and why.

Appl. Algebra Eng. Commun. Comput., 2007

Optimization of sparse matrix-vector multiplication on emerging multicore platforms.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Health monitoring of civil infrastructures using wireless sensor networks.

Proceedings of the 6th International Conference on Information Processing in Sensor Networks, 2007

2006

Error bounds from extra-precise iterative refinement.

ACM Trans. Math. Softw., 2006

Minimizing Polynomials via Sum of Squares over the Gradient Ideal.

Bernd Sturmfels

Math. Program., 2006

Accurate and efficient evaluation of Schur and Jack functions.

Plamen Koev

Math. Comput., 2006

Wireless sensor networks for structural health monitoring.

Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, 2006

Automatic Performance Tuning for the Multi-section with Multiple Eigenvalues Method for Symmetric Tridiagonal Eigenproblems.

Takahiro Katagiri

Christof Vömel

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Prospectus for the Next LAPACK and ScaLAPACK Libraries.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

2005

The Accurate and Efficient Solution of a Totally Positive Generalized Vandermonde Linear System.

Plamen Koev

SIAM J. Matrix Anal. Appl., 2005

Minimum Ellipsoid Bounds for Solutions of Polynomial Systems via Sum of Squares.

J. Glob. Optim., 2005

Toward accurate polynomial evaluation in rounded arithmetic

Olga Holtz

CoRR, 2005

Second-order backpropagation algorithms for a stagewise-partitioned separable Hessian matrix.

Stuart E. Dreyfus

Proceedings of the IEEE International Joint Conference on Neural Networks, 2005

Bifurcation Analysis of Large Equilibrium Systems in Matlab.

Proceedings of the Computational Science, 2005

Toward accurate polynomial evaluation in rounded arithmetic (short report).

Olga Holtz

Proceedings of the Algebraic and Numerical Algorithms and Computer-assisted Proofs, 2005

2004

Accurate and Efficient Floating Point Summation.

Yozo Hida

SIAM J. Sci. Comput., 2004

Accurate SVDs of weakly diagonally dominant M-matrices.

Plamen Koev

Numerische Mathematik, 2004

Fast and Accurate Floating Point Summation with Application to Computational Geometry.

Yozo Hida

Numer. Algorithms, 2004

Statistical Models for Empirical Search-Based Performance Tuning.

Richard W. Vuduc

Jeff A. Bilmes

Int. J. High Perform. Comput. Appl., 2004

Performance Tuning of Matrix Triple Products Based on Matrix Structure.

Proceedings of the Applied Parallel Computing, 2004

Model Reduction for RF MEMS Simulation.

David Bindel

Proceedings of the Applied Parallel Computing, 2004

Performance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply.

Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

2003

SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems.

ACM Trans. Math. Softw., 2003

On structure-exploiting trust-region regularized nonlinear least squares algorithms for neural-network learning.

Neural Networks, 2003

Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian-Vector Multiply.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

On sparsity-exploiting memory-efficient trust-region regularized nonlinear least squares algorithms for neural-network learning.

Proceedings of the International Joint Conference on Neural Networks, 2003

Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T Ax.

Proceedings of the Computational Science - ICCS 2003, 2003

2002

Design, implementation and testing of extended and mixed precision BLAS.

ACM Trans. Math. Softw., 2002

On computing Givens rotations reliably and efficiently.

ACM Trans. Math. Softw., 2002

Performance optimizations and bounds for sparse matrix-vector multiply.

Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

2001

On the Complexity of Computing Error Bounds.

Benjamin Diament

Gregorio Malajovich

Found. Comput. Math., 2001

Statistical Models for Automatic Performance Tuning.

Rich Vuduc

Jeff A. Bilmes

Proceedings of the Computational Science - ICCS 2001, 2001

A Data Broker for Distributed Computing Environments.

Leroy Anthony Drummond

Proceedings of the Computational Science - ICCS 2001, 2001

2000

Computing Connecting Orbits via an Improved Algorithm for Continuing Invariant Subspaces.

Luca Dieci

Mark J. Friedman

SIAM J. Sci. Comput., 2000

Accurate Singular Value Decompositions of Structured Matrices.

SIAM J. Matrix Anal. Appl., 2000

Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW.

Rich Vuduc

Proceedings of the Semantics, 2000

On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems.

Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Common Issues.

Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

Singular Value Decomposition.

Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

A Brief Tour of Eigenproblems.

Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

Non-Hermitian Eigenvalue Problems.

Gerard L. G. Sleijpen

Ruipeng Li

Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

1999

An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination.

John R. Gilbert

SIAM J. Matrix Anal. Appl., 1999

A Supernodal Approach to Sparse Partial Pivoting.

SIAM J. Matrix Anal. Appl., 1999

Making Sparse Matrix Computations Scalable (Invited Talk Abstract).

Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures, 1999

Parallel Multigrid Solver for 3D Unstructured Finite Element Problems.

Mark Adams

Jim Demmel

Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

A Scalable Sparse Direct Solver Using Static Pivoting.

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

LAPACK Users' Guide, Third Edition

Software, Environments and Tools, SIAM, ISBN: 978-0-89871-960-4, 1999

1998

Using the Matrix Sign Function to Compute Invariant Subspaces.

SIAM J. Matrix Anal. Appl., January, 1998

Programming Tools and Environments.

Commun. ACM, 1998

Making Sparse Gaussian Elimination Scalable by Static Pivoting.

Proceedings of the ACM/IEEE Conference on Supercomputing, 1998

1997

Practical Experience in the Numerical Dangers of Heterogeneous Computing.

ACM Trans. Math. Softw., 1997

The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers.

SIAM J. Sci. Comput., 1997

Models and Scheduling Algorithms for Mixed Data and Task Parallel Programs.

Soumen Chakrabarti

Katherine A. Yelick

J. Parallel Distributed Comput., 1997

ScaLAPACK: A Linear Algebra Library for Message-Passing Computers.

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology.

Proceedings of the 11th international conference on Supercomputing, 1997

Using PHiPAC to speed error back-propagation learning.

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Applied Numerical Linear Algebra.

SIAM, ISBN: 978-0-898713-89-3, 1997

1996

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.

Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Practical Experience in the Dangers of Heterogeneous Computing.

Proceedings of the Applied Parallel Computing, 1996

1995

Stability of block LU factorization.

Nicholas J. Higham

Robert S. Schreiber

Numer. Linear Algebra Appl., 1995

Algorithms for Intersecting Parametric and Algebraic Curves II: Multiple Intersections.

Dinesh Manocha

CVGIP Graph. Model. Image Process., 1995

Modeling the Benefits of Mixed Data and Task Parallelism.

Soumen Chakrabarti

Katherine A. Yelick

Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, 1995

Performance of a Parallel Global Atmospheric Chemical Tracer Model.

Sharon Smith

Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995

The Performance of Finding Eigenvalues and Eigenvectors of Dense Symmetric Matrices on Distributed Memory Computers.

Ken Stanley

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.

Proceedings of the Applied Parallel Computing, 1995

Templates for Linear Algebra Problems.

Proceedings of the Computer Science Today: Recent Trends and Developments, 1995

1994

Algorithms for intersecting parametric and algebraic curves I: simple intersections.

Dinesh Manocha

ACM Trans. Graph., 1994

Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods

Other Titles in Applied Mathematics, SIAM, ISBN: 978-1-61197-153-8, 1994

1993

Improved Error Bounds for Underdetermined System Solvers.

Nicholas J. Higham

SIAM J. Matrix Anal. Appl., January, 1993

The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part II: software and applications.

Bo Kågström

ACM Trans. Math. Softw., 1993

The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part I: theory and algorithms.

Bo Kågström

ACM Trans. Math. Softw., 1993

On computing condition numbers for the nonsymmetric eigenproblem.

A. McKenney

ACM Trans. Math. Softw., 1993

Computing the Generalized Singular Value Decomposition.

SIAM J. Sci. Comput., 1993

A New Algorithm for the Symmetric Tridiagonal Eigenvalue Problem.

Victor Y. Pan

J. Complex., 1993

LAPACK for Distributed Memory Architectures: The Next Generation.

Jack J. Dongarra

Robert A. van de Geijn

David W. Walker

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Design of a Parallel Nonsymmetric Eigenroutine Toolbox, Part I.

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Faster numerical algorithms via exception handling.

Proceedings of the 11th Symposium on Computer Arithmetic, 29 June, 1993

1992

Stability of block algorithms with fast level-3 BLAS.

Nicholas J. Higham

ACM Trans. Math. Softw., 1992

Jacobi's Method is More Accurate than QR.

Kresimir Veselic

SIAM J. Matrix Anal. Appl., 1992

The Componentwise Distance to the Nearest Singular Matrix.

SIAM J. Matrix Anal. Appl., 1992

1991

LAPACK: A portable linear algebra library for high-performance computers.

Concurr. Pract. Exp., 1991

1990

Accurate Singular Values of Bidiagonal Matrices.

William Kahan

SIAM J. Sci. Comput., 1990

Matrix Computations; Second Edition (Gene Golub and Charles F. Van Loan).

SIAM Rev., 1990

LAPACK: a portable linear algebra library for high-performance computers.

Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

1989

On a Block Implementation of Hessenberg Multishift QR Iteration.

Int. J. High Speed Comput., 1989

Optimal three finger grasps.

Gerardo Lafferriere

Proceedings of the 1989 IEEE International Conference on Robotics and Automation, 1989

1988

Theoretical and experimental studies using a multifinger planar manipulator.

Proceedings of the 1988 IEEE International Conference on Robotics and Automation, 1988

1987

The geometry of ill-conditioning.

J. Complex., 1987

Three methods for refining estimates of invariant subspaces.

Computing, 1987

On error analysis in arithmetic with varying relative precision.

Proceedings of the 8th IEEE Symposium on Computer Arithmetic, 1987

1985

An interval algorithm for solving systems of linear equations to prespecified accuracy.
