Jack J. Dongarra

Orcid: 0000-0003-3247-1782

Affiliations:

University of Tennessee, Knoxville, TN, USA
Oak Ridge National Laboratory, TN, USA
University of Manchester, Manchester, UK

According to our database¹, Jack J. Dongarra authored at least 819 papers between 1976 and 2026.

Collaborative distances:

Dijkstra number² of three.
Erdős number³ of three.

Awards

Turing Award recipient

Turing Award 2021, "For pioneering contributions to numerical algorithms and libraries that enabled high performance computational software to keep pace with exponential hardware improvements for over four decades."

ACM Fellow

ACM Fellow 2001, "For contributions in the field of scientific computing, the development of mathematical software, parallel methods, and enabling technologies for high-performance computing."

IEEE Fellow

IEEE Fellow 2000, "For contributions and leadership in the field of computational mathematics."

Bibliography

2026

HPL-MxP benchmark: Mixed-precision algorithms, iterative refinement, and scalable data generation.

[DOI]

Jack J. Dongarra

,

Int. J. High Perform. Comput. Appl., 2026

2025

Durable Engines of Discovery.

[DOI]

Jack J. Dongarra

Commun. ACM, December, 2025

The Stability of Block Eliminations and Additive Modifications.

[DOI]

,

,

Jack J. Dongarra

CoRR, September, 2025

Analysis of Floating-Point Matrix Multiplication Computed via Integer Arithmetic.

[DOI]

Ahmad Abdelfattah

,

Jack J. Dongarra

,

Massimiliano Fasi

,

Mantas Mikaitis

,

Françoise Tisseur

CoRR, June, 2025

Evolution of the computational science community: The dynamics of topics and collaborations in 24 years of ICCS and JoCS publications.

[DOI]

,

Klavdiya Bochenina

,

Tesfamariam M. Abuhay

,

,

,

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

Maciej Paszynski

,

Clélia de Mulatier

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2025

Computational science: Guiding the way towards a sustainable society.

[DOI]

Sergey V. Kovalchuk

,

Clélia de Mulatier

,

Valeria V. Krzhizhanovskaya

,

Leonardo Franco

,

Maciej Paszynski

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2025

Advancements of PAPI for the exascale generation.

[DOI]

,

Anthony Danalis

,

Giuseppe Congiu

,

,

Anthony Castaldo

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2025

Efficient Embedding Initialization via Dominant Eigenvector Projections.

[DOI]

Quentin R. Petit

,

,

,

Jack J. Dongarra

Proceedings of the SC '25 Workshops of the International Conference for High Performance Computing, 2025

Accelerating Supercomputing: AI-Hardware-Driven Innovation for Speed and Efficiency.

[DOI]

Jack J. Dongarra

,

John A. Gunnels

,

Harun Bayraktar

,

,

Proceedings of the IEEE High Performance Extreme Computing Conference, 2025

2024

Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes.

[DOI]

,

,

Jack J. Dongarra

ACM Trans. Math. Softw., December, 2024

Computation at the Cutting Edge of Science.

[DOI]

Sergey V. Kovalchuk

,

Clélia de Mulatier

,

Valeria V. Krzhizhanovskaya

,

,

Maciej Paszynski

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2024

Numerical eigen-spectrum slicing, accurate orthogonal eigen-basis, and mixed-precision eigenvalue refinement using OpenMP data-dependent tasks and accelerator offload.

[DOI]

,

Anthony Castaldo

,

Yaohung M. Tsai

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2024

XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing.

[DOI]

Torsten Hoefler

,

,

,

,

,

Manish Parashar

,

,

Matthias Troyer

,

Thomas C. Schulthess

,

,

Jack J. Dongarra

Comput. Sci. Eng., 2024

Hardware Trends Impacting Floating-Point Computations In Scientific Applications.

[DOI]

Jack J. Dongarra

,

John A. Gunnels

,

Harun Bayraktar

,

,

CoRR, 2024

Automated Data Analysis for Defining Performance Metrics from Raw Hardware Events.

[DOI]

,

Anthony Danalis

,

Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Trends in Computational Science: Natural Language Processing and Network Analysis of 23 Years of ICCS Publications.

[DOI]

,

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

Maciej Paszynski

,

Clélia de Mulatier

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the Computational Science - ICCS 2024, 2024

2023

The computational planet.

[DOI]

Sergey V. Kovalchuk

,

Clélia de Mulatier

,

,

Maciej Paszynski

,

Valeria V. Krzhizhanovskaya

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., September, 2023

Combining multitask and transfer learning with deep Gaussian processes for autotuning-based performance engineering.

[DOI]

,

Wissam M. Sid-Lakhdar

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., July, 2023

HPC Forecast: Cloudy and Uncertain.

[DOI]

,

,

Jack J. Dongarra

Commun. ACM, February, 2023

Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software.

[DOI]

,

,

Michael W. Mahoney

,

N. Benjamin Erichson

,

Maksim Melnichenko

,

Osman Asif Malik

,

,

,

Michal Derezinski

,

,

,

,

Jack J. Dongarra

CoRR, 2023

Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators.

[DOI]

,

,

Mohammed A. Al Farhan

,

,

Jack J. Dongarra

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure.

[DOI]

Ahmad Abdelfattah

,

Stanimire Tomov

,

,

,

Jack J. Dongarra

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

PAQR: Pivoting Avoiding QR factorization.

[DOI]

Wissam M. Sid-Lakhdar

,

Sébastien Cayrols

,

,

Ahmad Abdelfattah

,

,

,

Stanimire Tomov

,

,

David B. Williams-Young

,

Timothy A. Davis

,

Jack J. Dongarra

,

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements.

[DOI]

,

,

Anthony Danalis

,

Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Using Additive Modifications in LU Factorization Instead of Pivoting.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 37th International Conference on Supercomputing, 2023

2022

Reproducability Artifact for Running SLATE's GEMM and POTRF Operations on Summit and Crusher.

[DOI]

,

,

,

,

Sébastien Cayrols

,

,

Mohammed A. Al Farhan

,

Jack J. Dongarra

Dataset, August, 2022

Reproducability Artifact for Running SLATE's GEMM and POTRF Operations on Summit and Crusher.

[DOI]

,

,

,

,

Sébastien Cayrols

,

,

Ahmad Abdelfattah

,

Mohammed A. Al Farhan

,

Jack J. Dongarra

Dataset, August, 2022

Software for "Threshold Pivoting in LU Factorizations".

[DOI]

,

,

Jack J. Dongarra

Dataset, May, 2022

Software for "Threshold Pivoting for dense LU Factorization".

[DOI]

,

,

,

Jack J. Dongarra

Dataset, May, 2022

Accelerating Restarted GMRES With Mixed Precision Arithmetic.

[DOI]

,

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2022

Evaluating Data Redistribution in PaRSEC.

[DOI]

,

,

,

,

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2022

Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC.

[DOI]

,

,

,

,

Jack J. Dongarra

,

,

,

,

IEEE Trans. Parallel Distributed Syst., 2022

Using long vector extensions for MPI reductions.

[DOI]

,

,

,

Jack J. Dongarra

Parallel Comput., 2022

Computational science for a better future.

[DOI]

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

Maciej Paszynski

,

Dieter Kranzlmüller

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2022

Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

Valentin Le Fèvre

,

,

Jack J. Dongarra

Int. J. Netw. Comput., 2022

Reinventing High Performance Computing: Challenges and Opportunities.

[DOI]

,

,

Jack J. Dongarra

CoRR, 2022

The evolution of mathematical software.

[DOI]

Jack J. Dongarra

Commun. ACM, 2022

Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices.

[DOI]

Yaohung M. Tsai

,

,

Jack J. Dongarra

Proceedings of the IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, 2022

Threshold Pivoting for Dense LU Factorization.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, 2022

Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications.

[DOI]

,

,

,

,

,

,

Jack J. Dongarra

,

,

,

,

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers.

[DOI]

Ahmad Abdelfattah

,

,

,

Stanimire Tomov

,

Xiaoye Sherry Li

,

Jack J. Dongarra

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Sequential Task Flow Runtime Model Improvements and Limitations.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers, 2022

High-Performance GMRES Multi-Precision Benchmark: Design, Performance, and Challenges.

[DOI]

Ichitaro Yamazaki

,

Christian Glusa

,

Jennifer A. Loe

,

,

Sivasankaran Rajamanickam

,

Jack J. Dongarra

Proceedings of the IEEE/ACM International Workshop on Performance Modeling, 2022

Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era.

[DOI]

,

,

,

,

Sébastien Cayrols

,

,

Ahmad Abdelfattah

,

Mohammed A. Al Farhan

,

Jack J. Dongarra

Proceedings of the IEEE/ACM International Workshop on Performance, 2022

A Framework to Exploit Data Sparsity in Tile Low-Rank Cholesky Factorization.

[DOI]

,

,

,

,

,

,

Jack J. Dongarra

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Performance Analysis of Parallel FFT on Large Multi-GPU Systems.

[DOI]

,

,

Miroslav Stoyanov

,

,

Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Batch QR Factorization on GPUs: Design, Optimization, and Tuning.

[DOI]

Ahmad Abdelfattah

,

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2022, 2022

Deep Gaussian process with multitask and transfer learning for performance optimization.

[DOI]

Wissam M. Sid-Lakhdar

,

,

,

Jack J. Dongarra

Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

Message from the High Performance Computing and Communications 2022 General Chairs.

[DOI]

Jack J. Dongarra

,

,

Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Proposed Consistent Exception Handling for the BLAS and LAPACK.

[DOI]

,

Jack J. Dongarra

,

,

,

,

,

,

Weslley S. Pereira

,

,

Cindy Rubio-González

Proceedings of the Sixth IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2022

Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs.

[DOI]

Sébastien Cayrols

,

,

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021

A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines.

[DOI]

Ahmad Abdelfattah

,

Timothy B. Costa

,

Jack J. Dongarra

,

,

,

Sven Hammarling

,

Nicholas J. Higham

,

,

,

Stanimire Tomov

,

ACM Trans. Math. Softw., 2021

20 years of computational science: Selected papers from 2020 International Conference on Computational Science.

[DOI]

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

Maciej Paszynski

,

Gábor Závodszky

,

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2021

Translational process: Mathematical software perspective.

[DOI]

Jack J. Dongarra

,

,

,

Stanimire Tomov

J. Comput. Sci., 2021

Efficient exascale discretizations: High-order finite element methods.

[DOI]

Int. J. High Perform. Comput. Appl., 2021

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic.

[DOI]

Int. J. High Perform. Comput. Appl., 2021

Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems.

[DOI]

,

Saeid Nooshabadi

,

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Access, 2021

Scalability Issues in FFT Computation.

[DOI]

,

Stanimire Tomov

,

Miroslav Stoyanov

,

Jack J. Dongarra

Proceedings of the Parallel Computing Technologies, 2021

Distributed-memory multi-GPU block-sparse tensor contraction for electronic structure.

[DOI]

Thomas Hérault

,

,

,

Robert J. Harrison

,

Cannada A. Lewis

,

Edward F. Valeev

,

Jack J. Dongarra

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems.

[DOI]

,

,

,

,

,

,

Jack J. Dongarra

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Revisiting Credit Distribution Algorithms for Distributed Termination Detection.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

Valentin Le Fèvre

,

,

Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms.

[DOI]

,

Miroslav Stoyanov

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Accelerating Multi-Process Communication for Parallel 3-D FFT.

[DOI]

,

,

Miroslav Stoyanov

,

,

Jack J. Dongarra

Proceedings of the Workshop on Exascale MPI, 2021

2020

Software for Linear Algebra Targeting Exascale (SLATE) with a Recursive Butterfly Transform based solver.

[DOI]

,

,

Jack J. Dongarra

Dataset, August, 2020

Load-balancing Sparse Matrix Vector Product Kernels on GPUs.

[DOI]

,

,

,

Jack J. Dongarra

,

,

,

Stanimire Tomov

,

Yuhsiang M. Tsai

,

ACM Trans. Parallel Comput., 2020

Matrix multiplication on batches of small matrices in half and half-complex precisions.

[DOI]

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2020

Computational Science in the Interconnected World: Selected papers from 2019 International Conference on Computational Science.

[DOI]

Pedro J. S. Cardoso

,

João M. F. Rodrigues

,

Jânio M. Monteiro

,

,

Valeria V. Krzhizhanovskaya

,

Michael Harold Lees

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2020

MAGMA templates for scalable linear algebra on emerging architectures.

[DOI]

Mohammed A. Al Farhan

,

Ahmad Abdelfattah

,

Stanimire Tomov

,

,

,

,

Robert Rosenberg

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2020

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic.

[DOI]

CoRR, 2020

Reducing the amount of out-of-core data access for GPU-accelerated randomized SVD.

[DOI]

,

Ichitaro Yamazaki

,

,

Yasuyuki Matsushita

,

Stanimire Tomov

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2020

Improving the Performance of the GMRES Method Using Mixed-Precision Techniques.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI, 2020

Integrating Deep Learning in Domain Sciences at Exascale.

[DOI]

,

,

Eduardo F. D'Azevedo

,

Jack J. Dongarra

,

Markus Eisenbach

,

,

,

,

Stanimire Tomov

,

,

Proceedings of the Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI, 2020

Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2020

High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs.

[DOI]

,

Ahmad Abdelfattah

,

,

Jack J. Dongarra

,

Tzanio V. Kolev

,

Proceedings of the 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2020

Using Advanced Vector Extensions AVX-512 for MPI Reductions.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the EuroMPI/USA '20: 27th European MPI Users' Group Meeting, 2020

Evaluating the Performance of NVIDIA's A100 Ampere GPU for Sparse and Batched Computations.

[DOI]

,

Yuhsiang M. Tsai

,

Ahmad Abdelfattah

,

,

Jack J. Dongarra

Proceedings of the 2020 IEEE/ACM Performance Modeling, 2020

Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications.

[DOI]

,

,

,

Aleksandr Mikhalev

,

,

,

,

Jack J. Dongarra

Proceedings of the PASC '20: Platform for Advanced Scientific Computing Conference, Geneva, Switzerland, June 29, 2020

Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime.

[DOI]

,

,

,

,

Victor Eijkhout

,

Jack J. Dongarra

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Asynchronous SGD for DNN training on Shared-memory Parallel Architectures.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

heFFTe: Highly Efficient FFT for Exascale.

[DOI]

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2020, 2020

Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices Using GPUs.

[DOI]

Ahmad Abdelfattah

,

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2020, 2020

Scalable Data Generation for Evaluating Mixed-Precision Solvers.

[DOI]

,

Yaohung M. Tsai

,

,

,

Jack J. Dongarra

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs.

[DOI]

,

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

HAN: a Hierarchical AutotuNed Collective Communication Framework.

[DOI]

,

,

,

,

,

Thananon Patinyasakdikul

,

,

Jack J. Dongarra

Proceedings of the IEEE International Conference on Cluster Computing, 2020

Flexible Data Redistribution in a Task-Based Runtime System.

[DOI]

,

,

,

,

Aurelien Bouteiller

,

Jack J. Dongarra

Proceedings of the IEEE International Conference on Cluster Computing, 2020

Using Arm Scalable Vector Extension to Optimize OPEN MPI.

[DOI]

,

,

,

,

Shinji Sumimoto

,

,

Jack J. Dongarra

Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019

Solving Linear Diophantine Systems on Parallel Architectures.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2019

PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP.

[DOI]

Jack J. Dongarra

,

,

,

,

,

,

Ichitaro Yamazaki

,

,

Maksims Abalenkovs

,

Negin Bagherpour

,

Sven Hammarling

,

,

,

,

Samuel D. Relton

ACM Trans. Math. Softw., 2019

Performance of asynchronous optimized Schwarz with one-sided communication.

[DOI]

Ichitaro Yamazaki

,

,

Aurélien Bouteiller

,

Jack J. Dongarra

Parallel Comput., 2019

Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices.

[DOI]

,

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

,

,

Jack J. Dongarra

Parallel Comput., 2019

Comparing the performance of rigid, moldable and grid-shaped applications on failure-prone HPC platforms.

[DOI]

Valentin Le Fèvre

,

Thomas Hérault

,

,

Aurélien Bouteiller

,

,

,

Jack J. Dongarra

Parallel Comput., 2019

Variable-size batched Gauss-Jordan elimination for block-Jacobi preconditioning on graphics processors.

[DOI]

,

Jack J. Dongarra

,

,

Enrique S. Quintana-Ortí

Parallel Comput., 2019

Science at the intersection of data, modelling, and computation.

[DOI]

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

,

,

Michael Harold Lees

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2019

Fine-grained bit-flip protection for relaxation methods.

[DOI]

,

Jack J. Dongarra

,

Enrique S. Quintana-Ortí

J. Comput. Sci., 2019

Checkpointing Strategies for Shared High-Performance Computing Platforms.

[DOI]

Thomas Hérault

,

,

Aurélien Bouteiller

,

Dorian C. Arnold

,

Kurt B. Ferreira

,

,

Jack J. Dongarra

Int. J. Netw. Comput., 2019

Evaluation of directive-based performance portable programming models.

[DOI]

M. Graham Lopez

,

,

Verónica G. Vergara Larrea

,

Oscar R. Hernandez

,

,

Stanimire Tomov

,

Jack J. Dongarra

Int. J. High Perform. Comput. Netw., 2019

Distributed-memory lattice H-matrix factorization.

[DOI]

Ichitaro Yamazaki

,

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2019

PAPI software-defined events for in-depth performance analysis.

[DOI]

,

Anthony Danalis

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2019

Race to Exascale.

[DOI]

Jack J. Dongarra

,

Steven Gottlieb

,

William T. C. Kramer

Comput. Sci. Eng., 2019

Investigating power capping toward energy-efficient scientific applications.

[DOI]

,

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2019

Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers.

[DOI]

,

Jack J. Dongarra

,

,

Nicholas J. Higham

,

Enrique S. Quintana-Ortí

Concurr. Comput. Pract. Exp., 2019

Hands-On Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing, 2019

MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing.

[DOI]

,

Nathalie-Sofia Tomov

,

Frank Betancourt

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing, 2019

Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization.

[DOI]

,

,

Ichitaro Yamazaki

,

,

Jack J. Dongarra

Proceedings of the 2019 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI, 2019

Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC.

[DOI]

Thomas Hérault

,

,

,

Jack J. Dongarra

Proceedings of the 10th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2019

SLATE: design of a modern distributed and accelerated linear algebra library.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2019

Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools.

[DOI]

,

,

Thomas Hérault

,

,

Aleksandr Mikhalev

,

,

,

,

Jack J. Dongarra

Proceedings of the IEEE/ACM International Workshop on Programming and Performance Visualization Tools, 2019

Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs.

[DOI]

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 10th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2019

Towards Continuous Benchmarking: An Automated Performance Evaluation Framework for High Performance Software.

[DOI]

,

,

,

Jack J. Dongarra

,

,

,

Enrique S. Quintana-Ortí

,

Yuhsiang M. Tsai

,

Proceedings of the Platform for Advanced Scientific Computing Conference, 2019

Characterization of Power Usage and Performance in Data-Intensive Applications Using MapReduce over MPI.

[DOI]

Joshua Hoke Davis

,

,

Sunita Chandrasekaran

,

,

Anthony Danalis

,

Jack J. Dongarra

,

,

Proceedings of the Parallel Computing: Technology Trends, 2019

Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation.

[DOI]

Ichitaro Yamazaki

,

,

,

Jack J. Dongarra

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Software-Defined Events through PAPI.

[DOI]

Anthony Danalis

,

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

ParILUT - A Parallel Threshold ILU for GPUs.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs.

[DOI]

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Least squares solvers for distributed-memory machines with GPU accelerators.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the ACM International Conference on Supercomputing, 2019

Massively Parallel Automated Software Tuning.

[DOI]

,

Yaohung M. Tsai

,

,

Ahmad Abdelfattah

,

Jack J. Dongarra

Proceedings of the 48th International Conference on Parallel Processing, 2019

Increasing Accuracy of Iterative Refinement in Limited Floating-Point Arithmetic on Half-Precision Accelerators.

[DOI]

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Progressive Optimization of Batched LU Factorization on GPUs.

[DOI]

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators.

[DOI]

,

,

,

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the Euro-Par 2019: Parallel Processing, 2019

2018

Symmetric Indefinite Linear Solver Using OpenMP Task on Multicore Architectures.

[DOI]

Ichitaro Yamazaki

,

,

,

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2018

A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations.

[DOI]

,

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2018

Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2018

Autotuning Techniques for Performance-Portable Point Set Registration in 3D.

[DOI]

,

,

Ichitaro Yamazaki

,

David J. Keffer

,

Vasileios Maroulas

,

Jack J. Dongarra

Supercomput. Front. Innov., 2018

AlgoWiki Project as an Extension of the Top500 Methodology.

[DOI]

Alexander S. Antonov

,

Jack J. Dongarra

,

Vladimir V. Voevodin

Supercomput. Front. Innov., 2018

ParILUT - A New Parallel Threshold ILU Factorization.

[DOI]

,

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2018

The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale.

[DOI]

Jack J. Dongarra

,

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

SIAM Rev., 2018

From High-Level Specification to High-Performance Code.

[DOI]

Franz Franchetti

,

José M. F. Moura

,

,

Jack J. Dongarra

Proc. IEEE, 2018

Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators.

[DOI]

Jack J. Dongarra

,

,

,

,

Yaohung M. Tsai

Proc. IEEE, 2018

Autotuning in High-Performance Computing Applications.

[DOI]

Prasanna Balaprakash

,

Jack J. Dongarra

,

,

,

Jeffrey K. Hollingsworth

,

,

Richard W. Vuduc

Proc. IEEE, 2018

Accelerating the SVD two stage bidiagonal reduction and divide and conquer using GPUs.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Parallel Comput., 2018

Incomplete Sparse Approximate Inverses for Parallel Preconditioning.

[DOI]

,

Thomas K. Huckle

,

Jürgen Bräckle

,

Jack J. Dongarra

Parallel Comput., 2018

Post-exascale supercomputing: research opportunities abound.

[DOI]

,

Jack J. Dongarra

,

Frontiers Inf. Technol. Electron. Eng., 2018

Using Jacobi iterations and blocking for solving sparse triangular systems in incomplete factorization preconditioning.

[DOI]

,

,

Jennifer A. Scott

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2018

The art of computational science: Bridging gaps - forming alloys.

[DOI]

Sergey V. Kovalchuk

,

Valeria V. Krzhizhanovskaya

,

Petros Koumoutsakos

,

Eleni N. Chatzi

,

Michael Harold Lees

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2018

Accelerating the SVD bi-diagonalization of a batch of small matrices using GPUs.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

J. Comput. Sci., 2018

Batched one-sided factorizations of tiny matrices using GPUs: Challenges and countermeasures.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

J. Comput. Sci., 2018

Accelerating NWChem Coupled Cluster through dataflow-based execution.

[DOI]

,

Anthony Danalis

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2018

A failure detector for HPC platforms.

[DOI]

,

Aurélien Bouteiller

,

Amina Guermouche

,

Thomas Hérault

,

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2018

Big data and extreme-scale computing.

[DOI]

Int. J. High Perform. Comput. Appl., 2018

Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs.

[DOI]

,

Moritz Kreutzer

,

,

Gregory D. Peterson

,

Gerhard Wellein

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2018

Task based Cholesky decomposition on Xeon Phi architectures using OpenMP.

[DOI]

,

,

,

,

Jack J. Dongarra

Int. J. Comput. Sci. Eng., 2018

SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks.

[DOI]

,

,

,

,

,

,

Jack J. Dongarra

,

Maurice Herlihy

,

Rodrigo Fonseca

CoRR, 2018

Evaluation of dataflow programming models for electronic structure theory.

[DOI]

,

Anthony Danalis

,

,

Mathieu Faverge

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2018

The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer - Supercomputing History and the Immortality of Now.

[DOI]

Jack J. Dongarra

,

,

Computer, 2018

Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

,

Nicholas J. Higham

Proceedings of the International Conference for High Performance Computing, 2018

Variable-Size Batched Condition Number Calculation on GPUs.

[DOI]

,

Jack J. Dongarra

,

,

Thomas Grützmacher

Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs.

[DOI]

,

Jack J. Dongarra

Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters.

[DOI]

Ichitaro Yamazaki

,

Ahmad Abdelfattah

,

,

Satoshi Ohshima

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms.

[DOI]

Thomas Hérault

,

,

Aurélien Bouteiller

,

Dorian C. Arnold

,

Kurt B. Ferreira

,

,

Jack J. Dongarra

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques.

[DOI]

,

Ahmad Abdelfattah

,

,

,

Srikara Pranesh

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2018, 2018

Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

ADAPT: an event-based adaptive collective communication framework.

[DOI]

,

,

,

Thananon Patinyasakdikul

,

,

Jack J. Dongarra

Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

Do Moldable Applications Perform Better on Failure-Prone HPC Platforms?

[DOI]

Valentin Le Fèvre

,

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

2017

Design and Implementation of the PULSAR Programming System for Large Scale Computing.

[DOI]

,

,

Ichitaro Yamazaki

,

,

Jack J. Dongarra

Supercomput. Front. Innov., 2017

Preconditioned Krylov solvers on GPUs.

[DOI]

,

,

Jack J. Dongarra

,

Moritz Kreutzer

,

Gerhard Wellein

,

Parallel Comput., 2017

Data through the Computational Lens.

[DOI]

Sergey V. Kovalchuk

,

Tesfamariam M. Abuhay

,

,

Michael L. Norman

,

Michael Harold Lees

,

Valeria V. Krzhizhanovskaya

,

Jack J. Dongarra

,

Peter M. A. Sloot

J. Comput. Sci., 2017

Fast Cholesky factorization on GPUs for batch and native modes in MAGMA.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

J. Comput. Sci., 2017

Porting the PLASMA Numerical Library to the OpenMP Standard.

[DOI]

,

,

,

Jack J. Dongarra

Int. J. Parallel Program., 2017

Guest Editor's Note: Special Issue on Clusters, Clouds and Data for Scientific Computing.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 2017

A look back on 30 years of the Gordon Bell Prize.

[DOI]

,

David H. Bailey

,

Jack J. Dongarra

,

,

Int. J. High Perform. Comput. Appl., 2017

On the performance and energy efficiency of sparse linear algebra on GPUs.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2017

Structure-Aware Linear Solver for Realtime Convex Optimization for Embedded Systems.

[DOI]

Ichitaro Yamazaki

,

Saeid Nooshabadi

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Embed. Syst. Lett., 2017

With Extreme Computing, the Rules Have Changed.

[DOI]

Jack J. Dongarra

,

Stanimire Tomov

,

,

,

,

Ichitaro Yamazaki

,

,

,

Ahmad Abdelfattah

Comput. Sci. Eng., 2017

Non-GPU-resident symmetric indefinite factorization.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2017

Solving dense symmetric indefinite systems using GPUs.

[DOI]

,

Jack J. Dongarra

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

Concurr. Comput. Pract. Exp., 2017

A Framework for Out of Memory SVD Algorithms.

[DOI]

,

,

Stanimire Tomov

,

Aurélien Bouteiller

,

Jack J. Dongarra

Proceedings of the High Performance Computing - 32nd International Conference, 2017

Dynamic task discovery in PaRSEC: a data-flow task-based runtime.

[DOI]

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2017

Investigating half precision arithmetic to accelerate dense linear system solvers.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2017

Flexible batched sparse matrix-vector product on GPUs.

[DOI]

,

,

Jack J. Dongarra

,

,

Enrique S. Quintana-Ortí

Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2017

High-performance Cholesky factorization for GPU-only execution.

[DOI]

,

Ahmad Abdelfattah

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the General Purpose GPUs, 2017

Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs.

[DOI]

,

Jack J. Dongarra

,

,

Enrique S. Quintana-Ortí

Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, 2017

Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives.

[DOI]

Ichitaro Yamazaki

,

,

,

Jack J. Dongarra

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation.

[DOI]

Mathieu Faverge

,

,

,

Jack J. Dongarra

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

EduPar Keynote.

[DOI]

Jack J. Dongarra

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Supercomputing, 2017

Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning.

[DOI]

,

Jack J. Dongarra

,

,

Enrique S. Quintana-Ortí

Proceedings of the 46th International Conference on Parallel Processing, 2017

The Art of Computational Science, Bridging Gaps - Forming Alloys. Preface for ICCS 2017.

[DOI]

Petros Koumoutsakos

,

Eleni N. Chatzi

,

Valeria V. Krzhizhanovskaya

,

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the International Conference on Computational Science, 2017

The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems.

[DOI]

Jack J. Dongarra

,

Sven Hammarling

,

Nicholas J. Higham

,

Samuel D. Relton

,

Pedro Valero-Lara

,

Proceedings of the International Conference on Computational Science, 2017

Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2017

Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning.

[DOI]

,

Jack J. Dongarra

,

,

Enrique S. Quintana-Ortí

,

Andrés E. Tomás

Proceedings of the International Conference on Computational Science, 2017

Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2017

Towards numerical benchmark for half-precision floating point arithmetic.

[DOI]

,

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Out of memory SVD solver for big data.

[DOI]

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Power-aware computing: Measurement, control, and performance analysis for Intel Xeon Phi.

[DOI]

,

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Optimized Batched Linear Algebra for Modern Architectures.

[DOI]

Jack J. Dongarra

,

Sven Hammarling

,

Nicholas J. Higham

,

Samuel D. Relton

,

Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Sampling algorithms to update truncated SVD.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

Scaling point set registration in 3D across thread counts on multicore and hardware accelerator platforms through autotuning for large scale analysis of scientific point clouds.

[DOI]

,

,

Ichitaro Yamazaki

,

David J. Keffer

,

Jack J. Dongarra

Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

Bringing High Performance Computing to Big Data Algorithms.

[DOI]

,

Jack J. Dongarra

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

Proceedings of the Handbook of Big Data Technologies, 2017

2016

Domain Overlap for Iterative Sparse Triangular Solves on GPUs.

[DOI]

,

,

Daniel B. Szyld

,

Jack J. Dongarra

Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs.

[DOI]

,

,

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2016

Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

ACM Trans. Math. Softw., 2016

Assessing the cost of redistribution followed by a computational kernel: Complexity and performance results.

[DOI]

Julien Herrmann

,

,

Thomas Hérault

,

,

,

Jack J. Dongarra

Parallel Comput., 2016

Updating incomplete factorization preconditioners for model order reduction.

[DOI]

,

,

,

Jack J. Dongarra

Numer. Algorithms, 2016

High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems.

[DOI]

Jack J. Dongarra

,

Michael A. Heroux

,

Int. J. High Perform. Comput. Appl., 2016

Bidiagonalization with Parallel Tiled Algorithms.

[DOI]

Mathieu Faverge

,

,

,

Jack J. Dongarra

CoRR, 2016

Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs.

[DOI]

Ahmad Abdelfattah

,

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2016

Linear algebra software for large-scale accelerated multicore computing.

[DOI]

Ahmad Abdelfattah

,

,

Jack J. Dongarra

,

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

,

Acta Numer., 2016

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations.

[DOI]

,

,

Jack J. Dongarra

,

,

Frank Hülsemann

,

,

Proceedings of the High Performance Computing for Computational Science - VECPAR 2016, 2016

Task-Based Cholesky Decomposition on Knights Corner Using OpenMP.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing, 2016

Performance, Design, and Autotuning of Batched GEMM for GPUs.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing - 31st International Conference, 2016

Performance-Portable Autotuning of OpenCL Kernels for Convolutional Layers of Deep Neural Networks.

[DOI]

Yaohung M. Tsai

,

,

,

Jack J. Dongarra

Proceedings of the 2nd Workshop on Machine Learning in HPC Environments, 2016

Towards Achieving Performance Portability Using Directives for Accelerators.

[DOI]

M. Graham Lopez

,

Verónica G. Vergara Larrea

,

,

Oscar R. Hernandez

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Third Workshop on Accelerator Programming Using Directives, 2016

Failure detection and propagation in HPC systems.

[DOI]

,

Aurélien Bouteiller

,

Amina Guermouche

,

Thomas Hérault

,

,

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2016

Batched Generation of Incomplete Sparse Approximate Inverses on GPUs.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2016

Heterogeneous Streaming.

[DOI]

Chris J. Newburn

,

,

,

,

,

Alejandro Duran

,

,

Leonardo Borges

,

,

Stanimire Tomov

,

Jack J. Dongarra

,

,

,

,

,

,

Ichitaro Yamazaki

,

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Search Space Generation and Pruning System for Autotuners.

[DOI]

,

,

,

Anthony Danalis

,

Jack J. Dongarra

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Hessenberg Reduction with Transient Error Resilience on GPU-Based Hybrid Architectures.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Efficiency of General Krylov Methods on GPUs - An Experimental Study.

[DOI]

,

Jack J. Dongarra

,

Moritz Kreutzer

,

Gerhard Wellein

,

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Data through the Computational Lens, Preface for ICCS 2016.

[DOI]

,

,

,

Valeria V. Krzhizhanovskaya

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the International Conference on Computational Science 2016, 2016

Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs.

[DOI]

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science 2016, 2016

High-Performance Tensor Contractions for GPUs.

[DOI]

Ahmad Abdelfattah

,

,

,

Jack J. Dongarra

,

Christopher W. Earl

,

,

,

,

Tzanio V. Kolev

,

,

Stanimire Tomov

Proceedings of the International Conference on Computational Science 2016, 2016

LU, QR, and Cholesky factorizations: Programming model, performance analysis and optimization techniques for the Intel Knights Landing Xeon Phi.

[DOI]

,

Stanimire Tomov

,

Konstantin Arturov

,

Murat Efe Guney

,

,

Jack J. Dongarra

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations.

[DOI]

,

,

Stanimire Tomov

,

,

Jay Jay Billings

,

,

Jack J. Dongarra

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

GPU-Aware Non-contiguous Data Movement In Open MPI.

[DOI]

,

,

Rolf Vandevaart

,

Sylvain Jeaugey

,

Jack J. Dongarra

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

With Extreme Scale Computing the Rules Have Changed.

[DOI]

Jack J. Dongarra

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

High-Performance Matrix-Matrix Multiplications of Very Small Matrices.

[DOI]

,

Ahmad Abdelfattah

,

,

Stanimire Tomov

,

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2016: Parallel Processing, 2016

2015

Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy.

[DOI]

Aurélien Bouteiller

,

Thomas Hérault

,

,

,

Jack J. Dongarra

ACM Trans. Parallel Comput., 2015

AlgoWiki: an Open Encyclopedia of Parallel Algorithmic Features.

[DOI]

Vladimir V. Voevodin

,

Alexander S. Antonov

,

Jack J. Dongarra

Supercomput. Front. Innov., 2015

Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems.

[DOI]

Jack J. Dongarra

,

Maksims Abalenkovs

,

Ahmad Abdelfattah

,

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

,

Supercomput. Front. Innov., 2015

Computing Low-Rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and Its Application to Solving a Hierarchically Semiseparable Linear System of Equations.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

Sci. Program., 2015

HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi.

[DOI]

Jack J. Dongarra

,

,

,

,

,

,

Stanimire Tomov

Sci. Program., 2015

Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2015

Guest Editors' Note: Special Issue on Clusters, Clouds and Data for Scientific Computing.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Process. Lett., 2015

Mixing LU and QR factorization algorithms to design high-performance dense linear algebra solvers.

[DOI]

Mathieu Faverge

,

Julien Herrmann

,

,

Bradley R. Lowery

,

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2015

Composing resilience techniques: ABFT, periodic and incremental checkpointing.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Int. J. Netw. Comput., 2015

Acceleration of GPU-based Krylov solvers via data transfer reduction.

[DOI]

,

Stanimire Tomov

,

,

William B. Sawyer

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2015

A scalable approach to solving dense linear algebra problems on hybrid CPU-GPU systems.

[DOI]

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2015

A survey of recent developments in parallel implementations of Gaussian elimination.

[DOI]

Simplice Donfack

,

Jack J. Dongarra

,

Mathieu Faverge

,

,

,

,

Ichitaro Yamazaki

Concurr. Comput. Pract. Exp., 2015

Experiences in autotuning matrix multiplication for energy minimization on GPUs.

[DOI]

,

,

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2015

The TOP500 List and Progress in High-Performance Computing.

[DOI]

Erich Strohmaier

,

Hans Werner Meuer

,

Jack J. Dongarra

,

Computer, 2015

Exascale computing and big data.

[DOI]

,

Jack J. Dongarra

Commun. ACM, 2015

On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing - 30th International Conference, 2015

A Framework for Batched and GPU-Resident Factorization Algorithms Applied to Block Householder Transformations.

[DOI]

,

Tingxing Tim Dong

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing - 30th International Conference, 2015

Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing - 30th International Conference, 2015

Performance analysis and design of a Hessenberg reduction using stabilized blocked elementary transformations for new architectures.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Symposium on High Performance Computing, 2015

Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Symposium on High Performance Computing, 2015

Mixed-precision block Gram Schmidt orthogonalization.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

,

Jack J. Dongarra

,

Jesse L. Barlow

Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2015

Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster.

[DOI]

Ichitaro Yamazaki

,

,

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2015

Efficient implementation of quantum materials simulations on distributed CPU-GPU systems.

[DOI]

Raffaele Solcà

,

Anton Kozhevnikov

,

,

Stanimire Tomov

,

Jack J. Dongarra

,

Thomas C. Schulthess

Proceedings of the International Conference for High Performance Computing, 2015

Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs.

[DOI]

,

Ichitaro Yamazaki

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2015

Practical scalable consensus for pseudo-synchronous distributed systems.

[DOI]

Thomas Hérault

,

Aurélien Bouteiller

,

,

,

Keita Teranishi

,

Manish Parashar

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2015

Visualizing execution traces with task dependencies.

[DOI]

,

Stephen Richmond

,

,

,

Jack J. Dongarra

Proceedings of the 2nd Workshop on Visual Performance Analysis, 2015

Weighted dynamic scheduling with many parallelism grains for offloading of numerical workloads to multiple varied accelerators.

[DOI]

,

,

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2015

GPU-accelerated co-design of induced dimension reduction: algorithmic fusion and kernel overlap.

[DOI]

,

,

Gregory D. Peterson

,

Jack J. Dongarra

Proceedings of the 2nd International Workshop on Hardware-Software Co-Design for High Performance Computing, 2015

Tuning stationary iterative solvers for fault resilience.

[DOI]

,

Jack J. Dongarra

,

Enrique S. Quintana-Ortí

Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2015

Adaptive precision solvers for sparse linear systems.

[DOI]

,

Jack J. Dongarra

,

Enrique S. Quintana-Ortí

Proceedings of the 3rd International Workshop on Energy Efficient Supercomputing, 2015

Strengthening compute and data intensive capacities of Armenia.

[DOI]

Hrachya V. Astsatryan

,

Vladimir Sahakyan

,

Yuri Shoukourian

,

Jack J. Dongarra

,

Pierre-Henri Cros

,

Michel J. Daydé

,

Proceedings of the 2015 14th RoEduNet International Conference, 2015

Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery.

[DOI]

Aurélien Bouteiller

,

,

Jack J. Dongarra

Proceedings of the 22nd European MPI Users' Group Meeting, 2015

Optimization for performance and energy for batched matrix computations on GPUs.

[DOI]

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015

Towards batched linear solvers on accelerated hardware platforms.

[DOI]

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Energy efficiency and performance frontiers for sparse computations on GPU supercomputers.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, 2015

Accelerating NWChem Coupled Cluster Through Dataflow-Based Execution.

[DOI]

,

Anthony Danalis

,

,

Jack J. Dongarra

Proceedings of the Parallel Processing and Applied Mathematics, 2015

Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures.

[DOI]

,

Jack J. Dongarra

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

Proceedings of the Parallel Processing and Applied Mathematics, 2015

Hierarchical DAG Scheduling for Hybrid Distributed Systems.

[DOI]

,

Aurélien Bouteiller

,

,

Mathieu Faverge

,

Jack J. Dongarra

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Design for a Soft Error Resilient Dynamic Task-Based Runtime.

[DOI]

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Performance Analysis and Optimisation of Two-sided Factorization Algorithms for Heterogeneous Platform.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2015

MAGMA embedded: Towards a dense linear algebra library for energy efficient extreme computing.

[DOI]

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015

Flexible Linear Algebra Development and Scheduling with Cholesky Factorization.

[DOI]

,

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Iterative Sparse Triangular Solves for Preconditioning.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution.

[DOI]

Anthony Danalis

,

,

,

Jack J. Dongarra

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Accelerating collaborative filtering using concepts from high performance computing.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014

Model-Driven One-Sided Factorizations on Multicore Accelerated Systems.

[DOI]

Jack J. Dongarra

,

,

,

,

Stanimire Tomov

,

Supercomput. Front. Innov., 2014

Communication-Avoiding Symmetric-Indefinite Factorization.

[DOI]

,

Dulceneia Becker

,

,

Jack J. Dongarra

,

,

,

,

,

Ichitaro Yamazaki

SIAM J. Matrix Anal. Appl., 2014

An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems.

[DOI]

,

Dulceneia Becker

,

,

Anthony Danalis

,

Jack J. Dongarra

Parallel Comput., 2014

Looking back at dense linear algebra software.

[DOI]

,

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2014

Performance and reliability trade-offs for the double checkpointing algorithm.

[DOI]

Jack J. Dongarra

,

Thomas Hérault

,

Int. J. Netw. Comput., 2014

A novel hybrid CPU-GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

,

Raffaele Solcà

,

Thomas C. Schulthess

Int. J. High Perform. Comput. Appl., 2014

Power profiling of Cholesky and QR factorizations on distributed memory systems.

[DOI]

,

,

Jack J. Dongarra

Comput. Sci. Res. Dev., 2014

Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems.

[DOI]

Ichitaro Yamazaki

,

,

Raffaele Solcà

,

Stanimire Tomov

,

Jack J. Dongarra

,

Thomas C. Schulthess

Concurr. Comput. Pract. Exp., 2014

Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting.

[DOI]

Jack J. Dongarra

,

Mathieu Faverge

,

,

Concurr. Comput. Pract. Exp., 2014

Unified model for assessing checkpointing protocols at extreme-scale.

[DOI]

,

Aurélien Bouteiller

,

Elisabeth Brunet

,

Franck Cappello

,

Jack J. Dongarra

,

Amina Guermouche

,

Thomas Hérault

,

,

Frédéric Vivien

,

Dounia Zaidouni

Concurr. Comput. Pract. Exp., 2014

BlackjackBench: Portable Hardware Characterization with Automated Results' Analysis.

[DOI]

Anthony Danalis

,

,

,

Jeffrey S. Vetter

,

Jack J. Dongarra

Comput. J., 2014

Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Accelerating Computation of Eigenvectors in the Dense Nonsymmetric Eigenvalue Problem.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures.

[DOI]

,

Dimitar Lukarski

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Deflation strategies to improve the convergence of communication-avoiding GMRES.

[DOI]

Ichitaro Yamazaki

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014

PTG: an abstraction for unhindered parallelism.

[DOI]

Anthony Danalis

,

,

Aurélien Bouteiller

,

Thomas Hérault

,

Jack J. Dongarra

Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2014

Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors.

[DOI]

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014

clMAGMA: high performance dense linear algebra with OpenCL.

[DOI]

,

Jack J. Dongarra

,

,

,

,

Stanimire Tomov

Proceedings of the International Workshop on OpenCL, 2014

MIAMI: A framework for application performance diagnosis.

[DOI]

,

Jack J. Dongarra

,

Daniel Terpstra

Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime.

[DOI]

Ichitaro Yamazaki

,

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Improving the Performance of CA-GMRES on Multicores with Multiple GPUs.

[DOI]

Ichitaro Yamazaki

,

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Hybrid Multi-elimination ILU Preconditioners on GPUs.

[DOI]

Dimitar Lukarski

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

New Algorithm for Computing Eigenvectors of the Symmetric Eigenvalue Problem.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment.

[DOI]

,

,

,

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Designing LU-QR Hybrid Solvers for Performance and Stability.

[DOI]

Mathieu Faverge

,

Julien Herrmann

,

,

Bradley R. Lowery

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU.

[DOI]

,

,

Tzanio V. Kolev

,

Robert N. Rieben

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Dynamically Balanced Synchronization-Avoiding LU Factorization with Multicore and GPUs.

[DOI]

Simplice Donfack

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Assessing the Impact of ABFT and Checkpoint Composite Strategies.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Optimizing Krylov Subspace Solvers on Graphics Processing Units.

[DOI]

,

William B. Sawyer

,

Stanimire Tomov

,

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores.

[DOI]

,

Jack J. Dongarra

Proceedings of the 2014 International Conference on Supercomputing, 2014

Parallel Simulation of Superscalar Scheduling.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the 43rd International Conference on Parallel Processing, 2014

A Fast Batched Cholesky Factorization on a GPU.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 43rd International Conference on Parallel Processing, 2014

Big Data Meets Computational Science, Preface for ICCS 2014.

[DOI]

,

,

Valeria V. Krzhizhanovskaya

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the International Conference on Computational Science, 2014

LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU.

[DOI]

,

,

,

James Austin Harris

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Power monitoring with PAPI for extreme scale architectures and dataflow-based programming models.

[DOI]

,

,

Anthony Danalis

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

Utilizing dataflow-based execution for coupled cluster methods.

[DOI]

,

Anthony Danalis

,

Thomas Hérault

,

,

Jack J. Dongarra

,

,

Theresa L. Windus

Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

Access-averse framework for computing low-rank matrix approximations.

[DOI]

Ichitaro Yamazaki

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

Accelerating Numerical Dense Linear Algebra Calculations with GPUs.

[DOI]

Jack J. Dongarra

,

,

,

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

Proceedings of the Numerical Computations with GPUs, 2014

2013

LU Factorization with Partial Pivoting for a Multicore System with Accelerators.

[DOI]

,

,

Mathieu Faverge

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2013

High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures.

[DOI]

,

,

Jack J. Dongarra

ACM Trans. Math. Softw., 2013

Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms.

[DOI]

Fred G. Gustavson

,

Jerzy Wasniewski

,

Jack J. Dongarra

,

José R. Herrero

,

ACM Trans. Math. Softw., 2013

Accelerating Linear System Solutions Using Randomization Techniques.

[DOI]

,

Jack J. Dongarra

,

Julien Herrmann

,

Stanimire Tomov

ACM Trans. Math. Softw., 2013

Enabling workflows in GridSolve: request sequencing and service trading.

[DOI]

,

,

Jack J. Dongarra

,

,

Aurélie Hurault

J. Supercomput., 2013

Hierarchical QR factorization algorithms for multi-core clusters.

[DOI]

Jack J. Dongarra

,

Mathieu Faverge

,

Thomas Hérault

,

Mathias Jacquelin

,

,

Parallel Comput., 2013

Kernel-assisted and topology-aware MPI collective communications on multicore/many-core platforms.

[DOI]

,

,

Aurélien Bouteiller

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2013

Introduction for August Special Issue CCDSC.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 2013

Post-failure recovery of MPI communication capability: Design and rationale.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2013

PaRSEC: Exploiting Heterogeneity to Enhance Scalability.

[DOI]

,

Aurélien Bouteiller

,

Anthony Danalis

,

Mathieu Faverge

,

Thomas Hérault

,

Jack J. Dongarra

Comput. Sci. Eng., 2013

Correlated set coordination in fault tolerant message logging protocols for many-core clusters.

[DOI]

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2013

Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI.

[DOI]

,

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2013

Beyond the CPU: Hardware Performance Counter Monitoring on Blue Gene/Q.

[DOI]

,

Daniel Terpstra

,

Jack J. Dongarra

,

,

Roy G. Musselman

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations.

[DOI]

,

Raffaele Solcà

,

,

Stanimire Tomov

,

Thomas C. Schulthess

,

Jack J. Dongarra

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

CPU-GPU hybrid bidiagonal reduction with soft error resilience.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2013

Parallel reduction to Hessenberg form with algorithm-based fault tolerance.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2013

Optimal Checkpointing Period: Time vs. Energy.

[DOI]

,

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi.

[DOI]

Jack J. Dongarra

,

,

,

,

,

,

Stanimire Tomov

Proceedings of the Parallel Processing and Applied Mathematics, 2013

Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster.

[DOI]

Ichitaro Yamazaki

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Virtual Systolic Array for QR Decomposition.

[DOI]

,

,

,

Ichitaro Yamazaki

,

Jack J. Dongarra

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Revisiting the Double Checkpointing Algorithm.

[DOI]

Jack J. Dongarra

,

Thomas Hérault

,

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

HCW 2013 Keynote Talk.

[DOI]

Jack J. Dongarra

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures.

[DOI]

,

Dulceneia Becker

,

,

Jack J. Dongarra

,

,

,

,

,

Ichitaro Yamazaki

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Efficient parallelization of batch pattern training algorithm on many-core and cluster architectures.

[DOI]

Volodymyr Turchenko

,

,

Aurélien Bouteiller

,

Jack J. Dongarra

Proceedings of the IEEE 7th International Conference on Intelligent Data Acquisition and Advanced Computing Systems, 2013

Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Supercomputing, 2013

A Parallel Solver for Incompressible Fluid Flows.

[DOI]

,

,

Jack J. Dongarra

,

,

,

Olivier P. Le Maître

Proceedings of the International Conference on Computational Science, 2013

Computation at the Frontiers of Science, preface for ICCS 2013.

[DOI]

Vassil Alexandrov

,

,

Valeria V. Krzhizhanovskaya

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the International Conference on Computational Science, 2013

Standards for graph algorithm primitives.

[DOI]

,

,

Jonathan W. Berry

,

,

Jack J. Dongarra

,

Christos Faloutsos

,

,

John R. Gilbert

,

Joseph Gonzalez

,

Bruce Hendrickson

,

,

Charles E. Leiserson

,

Andrew Lumsdaine

,

,

,

Steven P. Reinhardt

,

Mike Stonebraker

,

,

Proceedings of the IEEE High Performance Extreme Computing Conference, 2013

Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization.

[DOI]

Aurélien Bouteiller

,

Franck Cappello

,

Jack J. Dongarra

,

Amina Guermouche

,

Thomas Hérault

,

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Implementing a Systolic Algorithm for QR Factorization on Multicore Clusters with PaRSEC.

[DOI]

,

Mathieu Faverge

,

,

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

2012

Autotuning GEMM Kernels for the Fermi GPU.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2012

BlackjackBench: portable hardware characterization.

[DOI]

Anthony Danalis

,

,

,

Jeffrey S. Vetter

,

Jack J. Dongarra

SIGMETRICS Perform. Evaluation Rev., 2012

Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems.

[DOI]

Christof Vömel

,

Stanimire Tomov

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2012

Toward a High Performance Tile Divide and Conquer Algorithm for the Dense Symmetric Eigenvalue Problem.

[DOI]

,

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2012

Multi-GPU Implementation of LU Factorization.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2012

High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2012

A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines.

[DOI]

,

Simplice Donfack

,

Jack J. Dongarra

,

,

,

Stanimire Tomov

Proceedings of the International Conference on Computational Science, 2012

Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems.

[DOI]

,

Stanimire Tomov

,

,

Jack J. Dongarra

,

Vincent Heuveline

Proceedings of the International Conference on Computational Science, 2012

Empowering Science through Computing, Preface for ICCS 2012.

[DOI]

,

,

Deepak Khazanchi

,

,

G. Dick van Albada

,

Jack J. Dongarra

,

Peter M. A. Sloot

Proceedings of the International Conference on Computational Science, 2012

From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming.

[DOI]

,

,

,

Stanimire Tomov

,

Gregory D. Peterson

,

Jack J. Dongarra

Parallel Comput., 2012

Introduction to the Special Issue.

[DOI]

,

Jack J. Dongarra

,

Int. J. High Perform. Comput. Appl., 2012

Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency.

[DOI]

,

,

Jack J. Dongarra

Comput. Sci. Res. Dev., 2012

A hybrid Hermitian general eigenvalue solver.

[DOI]

Raffaele Solcà

,

Thomas C. Schulthess

,

,

Stanimire Tomov

,

Ichitaro Yamazaki

,

Jack J. Dongarra

CoRR, 2012

High-performance computing systems: Status and outlook.

[DOI]

Jack J. Dongarra

,

Aad J. van der Steen

Acta Numer., 2012

Programming the LU Factorization for a Multicore System with Accelerators.

[DOI]

,

,

Mathieu Faverge

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science, 2012

Optimizing Memory-Bound SYMV Kernel on GPU Hardware Accelerators.

[DOI]

Ahmad Abdelfattah

,

Jack J. Dongarra

,

,

Proceedings of the High Performance Computing for Computational Science, 2012

A scalable framework for heterogeneous GPU-based clusters.

[DOI]

,

Jack J. Dongarra

Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012

Poster: A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks.

[DOI]

Raffaele Solcà

,

,

Stanimire Tomov

,

Thomas C. Schulthess

,

Jack J. Dongarra

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks.

[DOI]

Raffaele Solcà

,

,

Stanimire Tomov

,

Thomas C. Schulthess

,

Jack J. Dongarra

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Poster: Matrices over Runtime Systems at Exascale.

[DOI]

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Matrices Over Runtime Systems at Exascale.

[DOI]

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

An Evaluation of User-Level Failure Mitigation Support in MPI.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2012

Algorithm-based fault tolerance for dense matrix factorizations.

[DOI]

,

Aurélien Bouteiller

,

,

Thomas Hérault

,

Jack J. Dongarra

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters.

[DOI]

,

,

Aurélien Bouteiller

,

Jack J. Dongarra

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems.

[DOI]

Jack J. Dongarra

,

Mathieu Faverge

,

Thomas Hérault

,

,

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures.

[DOI]

,

Dulceneia Becker

,

Jack J. Dongarra

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

A Block-Asynchronous Relaxation Method for Graphics Processing Units.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

,

Vincent Heuveline

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the International Conference on Supercomputing, 2012

Anatomy of a globally recursive embedded LINPACK benchmark.

[DOI]

Jack J. Dongarra

,

Proceedings of the IEEE Conference on High Performance Extreme Computing, 2012

Scalable Dense Linear Algebra on Heterogeneous Hardware.

[DOI]

,

Aurélien Bouteiller

,

Anthony Danalis

,

Thomas Hérault

,

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

From Serial Loops to Parallel Execution on Distributed Systems.

[DOI]

,

Aurélien Bouteiller

,

Anthony Danalis

,

Thomas Hérault

,

Jack J. Dongarra

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI.

[DOI]

,

,

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

,

Vincent Heuveline

Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement.

[DOI]

,

,

Jack J. Dongarra

,

Vincent Heuveline

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Energy Footprint of Advanced Dense Numerical Linear Algebra Using Tile Algorithms on Multicore Architectures.

[DOI]

Jack J. Dongarra

,

,

,

Vincent M. Weaver

Proceedings of the 2012 Second International Conference on Cloud and Green Computing, 2012

Dense Linear Algebra on Accelerated Multicore Hardware.

[DOI]

Jack J. Dongarra

,

,

,

Stanimire Tomov

Proceedings of the High-Performance Scientific Computing - Algorithms and Applications., 2012

2011

TOP500.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

ScaLAPACK.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

PLASMA.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

Livermore Loops.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

LINPACK Benchmark.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

Linear Algebra Software.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

LAPACK.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

HPC Challenge Benchmark.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

Benchmarks.

[DOI]

Jack J. Dongarra

,

Proceedings of the Encyclopedia of Parallel Computing, 2011

Linear algebra - software issues.

[DOI]

,

Jack J. Dongarra

Scholarpedia, 2011

Preface.

[DOI]

,

Satoshi Matsuoka

,

Peter M. A. Sloot

,

G. Dick van Albada

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2011

Guest Editors Note.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Process. Lett., 2011

High-performance high-resolution semi-Lagrangian tracer transport on a sphere.

[DOI]

James Buford White III

,

Jack J. Dongarra

J. Comput. Phys., 2011

Trace-based performance analysis for the petascale simulation code FLASH.

[DOI]

,

Andreas Knüpfer

,

Jack J. Dongarra

,

Matthias Jurenz

,

Matthias S. Müller

,

Wolfgang E. Nagel

Int. J. High Perform. Comput. Appl., 2011

Selected papers of the Workshop on Clusters, Clouds and Grids for Scientific Computing (CCGSC).

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 2011

The International Exascale Software Project roadmap.

[DOI]

Jack J. Dongarra

,

Peter H. Beckman

,

,

,

Giovanni Aloisio

,

Jean-Claude Andre

,

,

Jean-Yves Berthou

,

,

Bertrand Braunschweig

,

Franck Cappello

,

Barbara M. Chapman

,

,

Alok N. Choudhary

,

Sudip S. Dosanjh

,

Thom H. Dunning

,

,

,

,

Robert J. Harrison

,

,

Michael A. Heroux

,

,

,

,

Yutaka Ishikawa

,

,

,

,

,

,

,

Alain Lichnewsky

,

,

,

,

Satoshi Matsuoka

,

,

Peter Michielse

,

,

Matthias S. Müller

,

Wolfgang E. Nagel

,

Hiroshi Nakashima

,

Michael E. Papka

,

,

,

,

,

,

,

Thomas L. Sterling

,

,

Frederick H. Streitz

,

,

Shinji Sumimoto

,

William M. Tang

,

,

,

Anne E. Trefethen

,

,

Aad J. van der Steen

,

Jeffrey S. Vetter

,

,

Robert W. Wisniewski

,

Katherine A. Yelick

Int. J. High Perform. Comput. Appl., 2011

QCG-OMPI: MPI applications on grids.

[DOI]

Emmanuel Agullo

,

,

Thomas Hérault

,

,

Sylvain Peyronnet

,

,

Franck Cappello

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2011

Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community.

[DOI]

Jeffrey S. Vetter

,

Richard Glassbrook

,

Jack J. Dongarra

,

,

,

Stephen Taylor McNally

,

Jeremy S. Meredith

,

James H. Rogers

,

,

,

Sudhakar Yalamanchili

Comput. Sci. Eng., 2011

Fully Empirical Autotuned QR Factorization For Multicore Architectures.

[DOI]

Emmanuel Agullo

,

Jack J. Dongarra

,

,

Stanimire Tomov

CoRR, 2011

Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures.

[DOI]

,

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2011

Panel: many-task computing meets exascales.

[DOI]

,

,

Jack J. Dongarra

,

,

Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, 2011

Optimizing symmetric dense matrix-vector multiplication on GPUs.

[DOI]

,

Stanimire Tomov

,

,

Jack J. Dongarra

Proceedings of the Conference on High Performance Computing Networking, 2011

Poster: new features of the PAPI hardware counter library.

[DOI]

,

Daniel Terpstra

,

Vincent M. Weaver

,

,

,

Jack J. Dongarra

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Conference on High Performance Computing Networking, 2011

Soft error resilient QR factorization for hybrid system with GPGPU.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the second workshop on Scalable algorithms for large-scale systems, 2011

High performance matrix inversion based on LU factorization for multicore architectures.

[DOI]

Jack J. Dongarra

,

Mathieu Faverge

,

,

Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers, 2011

Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW.

[DOI]

,

Aurélien Bouteiller

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2011

OMPIO: A Modular Software Architecture for MPI I/O.

[DOI]

Mohamad Chaarawi

,

,

,

Richard L. Graham

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2011

Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure.

[DOI]

,

Thomas Hérault

,

Pierre Lemarinier

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2011

Reducing the Time to Tune Parallel Dense Linear Algebra Routines with Partial Execution and Performance Modeling.

[DOI]

,

Jack J. Dongarra

Proceedings of the Parallel Processing and Applied Mathematics, 2011

Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Parallel Processing and Applied Mathematics, 2011

Reducing the Amount of Pivoting in Symmetric Indefinite Systems.

[DOI]

Dulceneia Becker

,

,

Jack J. Dongarra

Proceedings of the Parallel Processing and Applied Mathematics, 2011

Solving the Generalized Symmetric Eigenvalue Problem using Tile Algorithms on Multicore Architectures.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Exploiting Fine-Grain Parallelism in Recursive LU Factorization.

[DOI]

Jack J. Dongarra

,

Mathieu Faverge

,

,

Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Overlapping Computation and Communication for Advection on Hybrid Parallel Computers.

[DOI]

James Buford White III

,

Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Panel Statement.

[DOI]

,

William J. Dally

,

Jack J. Dongarra

,

Satoshi Matsuoka

,

Robert Schreiber

,

,

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Architecture-aware Algorithms and Software for Peta and Exascale Computing.

[DOI]

Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

DAGuE: A Generic Distributed DAG Engine for High Performance Computing.

[DOI]

,

Aurélien Bouteiller

,

Anthony Danalis

,

Thomas Hérault

,

Pierre Lemarinier

,

Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA.

[DOI]

,

Aurélien Bouteiller

,

Anthony Danalis

,

Mathieu Faverge

,

,

Thomas Hérault

,

,

,

Pierre Lemarinier

,

,

,

,

Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators.

[DOI]

Emmanuel Agullo

,

Cédric Augonnet

,

Jack J. Dongarra

,

Mathieu Faverge

,

,

Samuel Thibault

,

Stanimire Tomov

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs.

[DOI]

,

,

Aurélien Bouteiller

,

,

Jeffrey M. Squyres

,

Jack J. Dongarra

Proceedings of the International Conference on Parallel Processing, 2011

Evaluation of the HPC Challenge Benchmarks in Virtualized Environments.

[DOI]

,

,

,

Daniel Terpstra

,

Vincent M. Weaver

,

Jack J. Dongarra

Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Correlated Set Coordination in Fault Tolerant Message Logging Protocols.

[DOI]

Aurélien Bouteiller

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

A Fully Empirical Autotuned Dense QR Factorization for Multicore Architectures.

[DOI]

Emmanuel Agullo

,

Jack J. Dongarra

,

,

Stanimire Tomov

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Process Distance-Aware Adaptive MPI Collective Communications.

[DOI]

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

High Performance Dense Linear System Solver with Soft Error Resilience.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

On Scalability for MPI Runtime Systems.

[DOI]

,

Thomas Hérault

,

,

Jack J. Dongarra

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Performance Portability of a GPU Enabled Factorization with the DAGuE Framework.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

Pierre Lemarinier

,

Narapat Ohm Saengpatsa

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

EZTrace: A Generic Framework for Performance Analysis.

[DOI]

François Trahay

,

,

Mathieu Faverge

,

Yutaka Ishikawa

,

,

Jack J. Dongarra

Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

LU factorization for accelerator-based systems.

[DOI]

Emmanuel Agullo

,

Cédric Augonnet

,

Jack J. Dongarra

,

Mathieu Faverge

,

,

,

Stanimire Tomov

Proceedings of the 9th IEEE/ACS International Conference on Computer Systems and Applications, 2011

2010

Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures.

[DOI]

,

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2010

Rectangular full packed format for Cholesky's algorithm: factorization, solution, and inversion.

[DOI]

Fred G. Gustavson

,

Jerzy Wasniewski

,

Jack J. Dongarra

,

ACM Trans. Math. Softw., 2010

Scheduling two-sided transformations using tile algorithms on multicore architectures.

[DOI]

,

,

Jack J. Dongarra

,

Sci. Program., 2010

Improvement of parallelization efficiency of batch pattern BP training algorithm using Open MPI.

[DOI]

Volodymyr Turchenko

,

Lucio Grandinetti

,

,

Jack J. Dongarra

Proceedings of the International Conference on Computational Science, 2010

Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing.

[DOI]

Stanimire Tomov

,

,

Jack J. Dongarra

Parallel Comput., 2010

Towards dense linear algebra for hybrid GPU accelerated manycore systems.

[DOI]

Stanimire Tomov

,

Jack J. Dongarra

,

Parallel Comput., 2010

Preface.

[DOI]

Peter M. A. Sloot

,

Peter V. Coveney

,

Jack J. Dongarra

J. Comput. Sci., 2010

An Improved MAGMA GEMM For Fermi Graphics Processing Units.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2010

Self-healing network for scalable fault-tolerant runtime environments.

[DOI]

,

,

,

Jelena Pjesivac-Grbovic

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2010

Scheduling dense linear algebra operations on multicore processors.

[DOI]

,

,

Jack J. Dongarra

,

Concurr. Comput. Pract. Exp., 2010

SmartGridRPC: The new RPC model for high performance Grid computing.

[DOI]

,

Jack J. Dongarra

,

Michele Guidolin

,

Alexey L. Lastovetsky

,

Concurr. Comput. Pract. Exp., 2010

Redesigning the message logging model for high performance.

[DOI]

Aurélien Bouteiller

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2010

Accelerating GPU Kernels for Dense Linear Algebra.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators.

[DOI]

,

Stanimire Tomov

,

,

,

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures.

[DOI]

Emmanuel Agullo

,

Henricus Bouwmeester

,

Jack J. Dongarra

,

,

,

Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Conference on High Performance Computing Networking, 2010

Locality and Topology Aware Intra-node Communication among Multicore CPUs.

[DOI]

,

,

Aurélien Bouteiller

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2010

Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols.

[DOI]

,

Aurélien Bouteiller

,

Thomas Hérault

,

Pierre Lemarinier

,

Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2010

An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Applied Parallel and Scientific Computing, 2010

Dense linear algebra solvers for multicore with GPU accelerators.

[DOI]

Stanimire Tomov

,

,

,

Jack J. Dongarra

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Tile QR factorization with parallel panel processing for multicore architectures.

[DOI]

,

,

Emmanuel Agullo

,

Jack J. Dongarra

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

QR factorization of tall and skinny matrices in a grid computing environment.

[DOI]

Emmanuel Agullo

,

,

Jack J. Dongarra

,

Thomas Hérault

,

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Mixed-Tool Performance Analysis on Hybrid Multicore Architectures.

[DOI]

,

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the 39th International Conference on Parallel Processing, 2010

Dense Linear Algebra for Hybrid GPU-Based Systems.

[DOI]

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

BLAS for GPUs.

[DOI]

,

Stanimire Tomov

,

Jack J. Dongarra

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

Implementing Matrix Factorizations on the Cell B. E.

[DOI]

,

Jack J. Dongarra

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

Implementing Matrix Multiplication on the Cell B. E.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

2009

Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing.

[DOI]

,

Jack J. Dongarra

IEEE Trans. Computers, 2009

QR factorization for the Cell Broadband Engine.

[DOI]

,

Jack J. Dongarra

Sci. Program., 2009

Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor.

[DOI]

,

,

Jack J. Dongarra

Parallel Comput., 2009

Foreword.

[DOI]

Franck Cappello

,

Thomas Hérault

,

Jack J. Dongarra

Parallel Comput., 2009

A class of parallel tiled linear algebra algorithms for multicore architectures.

[DOI]

Alfredo Buttari

,

,

,

Jack J. Dongarra

Parallel Comput., 2009

Computing the conditioning of the components of a linear least-squares solution.

[DOI]

,

Jack J. Dongarra

,

,

Numer. Linear Algebra Appl., 2009

Algorithm-based fault tolerance applied to high performance computing.

[DOI]

,

,

Jack J. Dongarra

,

J. Parallel Distributed Comput., 2009

Editorial.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 2009

The Problem With the Linpack Benchmark 1.0 Matrix Generator.

[DOI]

Jack J. Dongarra

,

Int. J. High Perform. Comput. Appl., 2009

The International Exascale Software Project: a Call To Cooperative Action By the Global High-Performance Community.

[DOI]

Jack J. Dongarra

,

Peter H. Beckman

,

,

Franck Cappello

,

,

Satoshi Matsuoka

,

,

,

,

Anne E. Trefethen

,

Int. J. High Perform. Comput. Appl., 2009

Accelerating scientific computations with mixed precision algorithms.

[DOI]

,

Alfredo Buttari

,

Jack J. Dongarra

,

,

,

,

,

Stanimire Tomov

Comput. Phys. Commun., 2009

Paravirtualization effect on single- and multi-threaded memory-intensive linear algebra software.

[DOI]

,

,

,

Dmitrii Zagorodnov

,

Jack J. Dongarra

,

Clust. Comput., 2009

Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Comparative study of one-sided factorizations with multiple software packages on multi-core hardware.

[DOI]

Emmanuel Agullo

,

,

,

Jack J. Dongarra

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Towards Efficient MapReduce Using MPI.

[DOI]

Torsten Hoefler

,

Andrew Lumsdaine

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Collecting Performance Data with PAPI-C.

[DOI]

Daniel Terpstra

,

,

,

Jack J. Dongarra

Proceedings of the Tools for High Performance Computing 2009, 2009

Constructing Resiliant Communication Infrastructure for Runtime Environments.

[DOI]

,

,

Thomas Hérault

,

Pierre Lemarinier

,

Jack J. Dongarra

Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

CIFTS: A Coordinated Infrastructure for Fault-Tolerant Systems.

[DOI]

,

Peter H. Beckman

,

Byung-Hoon Park

,

,

,

,

Dhabaleswar K. Panda

,

Andrew Lumsdaine

,

Jack J. Dongarra

Proceedings of the ICPP 2009, 2009

A Scalable Non-blocking Multicast Scheme for Distributed DAG Scheduling.

[DOI]

,

Jack J. Dongarra

,

Proceedings of the Computational Science, 2009

A Note on Auto-tuning GEMM for GPUs.

[DOI]

,

Jack J. Dongarra

,

Stanimire Tomov

Proceedings of the Computational Science, 2009

A Holistic Approach for Performance Measurement and Analysis for Petascale Applications.

[DOI]

,

Jack J. Dongarra

,

,

Jeffrey S. Vetter

,

,

Allen D. Malony

Proceedings of the Computational Science, 2009

Analytical modeling and optimization for affinity based thread scheduling on multicore systems.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

Reasons for a pessimistic or optimistic message logging protocol in MPI uncoordinated failure, recovery.

[DOI]

Aurélien Bouteiller

,

,

,

Christine Morin

,

Jack J. Dongarra

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

Recent trends in high performance computing.

[DOI]

Jack J. Dongarra

,

Hans Werner Meuer

,

,

Erich Strohmaier

Proceedings of the Birth of Numerical Analysis, 2009

High Performance Heterogeneous Computing.

[DOI]

Alexey L. Lastovetsky

,

Jack J. Dongarra

Wiley series on parallel and distributed computing, Wiley, ISBN: 978-0-470-04039-3, 2009

2008

Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization.

[DOI]

,

Alfredo Buttari

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2008

Algorithm-Based Fault Tolerance for Fail-Stop Failures.

[DOI]

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 2008

Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy.

[DOI]

Alfredo Buttari

,

Jack J. Dongarra

,

,

,

Stanimire Tomov

ACM Trans. Math. Softw., 2008

State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems.

[DOI]

Christof Vömel

,

Stanimire Tomov

,

Osni A. Marques

,

,

,

Jack J. Dongarra

J. Comput. Phys., 2008

Special section: Grid computing and the message passing interface.

[DOI]

Beniamino Di Martino

,

Dieter Kranzlmüller

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2008

Special section: Cluster and computational grids for scientific computing.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Future Gener. Comput. Syst., 2008

Special section: Applications of distributed and grid computing.

[DOI]

,

Jack J. Dongarra

,

,

Jerzy Wasniewski

,

Future Gener. Comput. Syst., 2008

The PlayStation 3 for High-Performance Scientific Computing.

[DOI]

,

Alfredo Buttari

,

,

Jack J. Dongarra

Comput. Sci. Eng., 2008

Algorithmic Based Fault Tolerance Applied to High Performance Computing

[DOI]

,

,

Jack J. Dongarra

,

CoRR, 2008

Interactive grid-access using GridSolve and Giggle.

[DOI]

,

,

Jack J. Dongarra

,

,

Comput. Informatics, 2008

Netlib and NA-Net: Building a Scientific Computing Community.

[DOI]

Jack J. Dongarra

,

,

,

,

IEEE Ann. Hist. Comput., 2008

DARPA's HPCS Program: History, Models, Tools, Languages.

Jack J. Dongarra

,

Robert B. Graybill

,

William J. Harrod

,

Robert F. Lucas

,

,

,

,

,

Jeffrey S. Vetter

,

Katherine A. Yelick

,

,

Roy L. Campbell

,

Laura Carrington

,

,

,

Jeremy S. Meredith

,

Mustafa M. Tikir

Adv. Comput., 2008

An Overview of High Performance Computing and Challenges for the Future.

[DOI]

Jack J. Dongarra

Proceedings of the High Performance Computing for Computational Science, 2008

Matrix product on heterogeneous master-worker platforms.

[DOI]

Jack J. Dongarra

,

Jean-Francois Pineau

,

,

Frédéric Vivien

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the Synergistic Processing Element of the CELL Processor.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Computational Science, 2008

The impact of paravirtualized memory hierarchy on linear algebra computational kernels and software.

[DOI]

,

,

,

Jack J. Dongarra

,

Proceedings of the 17th International Symposium on High-Performance Distributed Computing (HPDC-17 2008), 2008

Scheduling for Numerical Linear Algebra Library at Scale.

[DOI]

,

,

Jack J. Dongarra

,

Proceedings of the High Speed and Large Scale Scientific Computing - Selected Papers from the High Performance Computing Workshop, Cetraro, Italy, June 30, 2008

A Scalable Checkpoint Encoding Algorithm for Diskless Checkpointing.

[DOI]

,

Jack J. Dongarra

Proceedings of the 11th IEEE High Assurance Systems Engineering Symposium, 2008

Request Sequencing: Enabling Workflow for Efficient Problem Solving in GridSolve.

[DOI]

,

Jack J. Dongarra

,

,

Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

A comparison of search heuristics for empirical code optimization.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007

Prospectus for a Dense Linear Algebra Software Library.

[DOI]

,

,

,

,

,

Alfredo Buttari

,

Stanimire Tomov

,

,

,

,

Christof Vömel

,

,

,

Jack J. Dongarra

,

,

Beresford N. Parlett

,

Proceedings of the Handbook of Parallel Computing - Models, Algorithms and Applications, 2007

Recovery Patterns for Iterative Methods in a Parallel Unstable Environment.

[DOI]

,

,

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2007

Improved Runtime and Transfer Time Prediction Mechanisms in a Network Enabled Servers Middleware.

[DOI]

Emmanuel Jeannot

,

,

,

Jack J. Dongarra

Parallel Process. Lett., 2007

The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot.

[DOI]

Christof Vömel

,

Stanimire Tomov

,

,

Osni A. Marques

,

Jack J. Dongarra

J. Comput. Phys., 2007

Preface.

[DOI]

Beniamino Di Martino

,

Dieter Kranzlmüller

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2007

High Performance Development for High End Computing With Python Language Wrapper (PLW).

[DOI]

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2007

Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems.

[DOI]

Alfredo Buttari

,

Jack J. Dongarra

,

,

,

,

Int. J. High Perform. Comput. Appl., 2007

Automatic analysis of inefficiency patterns in parallel applications.

[DOI]

,

,

Jack J. Dongarra

,

Concurr. Comput. Pract. Exp., 2007

Implementation of mixed precision in solving systems of linear equations on the Cell processor.

[DOI]

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2007

Editorial introduction to the special issue on computational linear algebra and sparse matrix computations.

[DOI]

Jerzy Wasniewski

,

Jack J. Dongarra

,

,

,

Appl. Algebra Eng. Commun. Comput., 2007

Bi-objective scheduling algorithms for optimizing makespan and reliability on heterogeneous systems.

[DOI]

Jack J. Dongarra

,

Emmanuel Jeannot

,

,

Proceedings of the SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007

Retrospect: Deterministic Replay of MPI Applications for Interactive Distributed Debugging.

[DOI]

Aurélien Bouteiller

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Parallel Tiled QR Factorization for Multicore Architectures.

[DOI]

Alfredo Buttari

,

,

,

Jack J. Dongarra

Proceedings of the Parallel Processing and Applied Mathematics, 2007

Optimal Routing in Binomial Graph Networks.

[DOI]

,

,

Bradley T. Vander Zanden

,

Jack J. Dongarra

Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

Self-healing in Binomial Graph Networks.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the On the Move to Meaningful Internet Systems 2007: OTM 2007 Workshops, 2007

Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Parallel and Distributed Processing and Applications, 2007

Revisiting Matrix Product on Master-Worker Platforms.

[DOI]

Jack J. Dongarra

,

Jean-Francois Pineau

,

,

,

Frédéric Vivien

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Self Adaptive Application Level Fault Tolerance for Parallel and Distributed Computing.

[DOI]

,

,

Guillermo A. Francia III

,

Jack J. Dongarra

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

L2 Cache Modeling for Scientific Applications on Chip Multi-Processors.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale Shared Memory Multiprocessors.

[DOI]

Karl Fürlinger

,

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2007, 7th International Conference, Beijing, China, May 27, 2007

Feedback-directed thread scheduling with memory considerations.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the 16th International Symposium on High-Performance Distributed Computing (HPDC-16 2007), 2007

Decision Trees and MPI Collective Algorithm Selection Problem.

[DOI]

Jelena Pjesivac-Grbovic

,

,

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2007, 2007

On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications.

[DOI]

Karl Fürlinger

,

,

Jack J. Dongarra

Proceedings of the Euro-Par 2007, 2007

Reliability Analysis of Self-Healing Network using Discrete-Event Simulation.

[DOI]

,

,

,

Jelena Pjesivac-Grbovic

,

Jack J. Dongarra

Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006

Recent Developments in Gridsolve.

[DOI]

,

,

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2006

Special Issue on Tools in the ACTS Collection 2004.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 2006

Conjugate-gradient eigenvalue solvers in computing electronic properties of nanostructure architectures.

[DOI]

Stanimire Tomov

,

,

Jack J. Dongarra

,

,

Int. J. Comput. Sci. Eng., 2006

Self-adapting numerical software (SANS) effort.

[DOI]

Jack J. Dongarra

,

,

,

Victor Eijkhout

,

,

,

,

,

Jelena Pjesivac-Grbovic

,

,

,

Sathish S. Vadhiyar

IBM J. Res. Dev., 2006

Scheduling workflow applications on processors with different capabilities.

[DOI]

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2006

An asynchronous algorithm on the NetSolve global computing system.

[DOI]

,

Seyed Abolfazl Shahzadeh Fazeli

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2006

S12 - The HPC Challenge (HPCC) benchmark suite.

[DOI]

,

David H. Bailey

,

Jack J. Dongarra

,

,

Robert F. Lucas

,

Rolf Rabenseifner

,

Daisuke Takahashi

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Tools and techniques for performance - Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems).

[DOI]

,

,

,

,

Alfredo Buttari

,

Jack J. Dongarra

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

HPC challenge - The 2006 HPC challenge awards.

[DOI]

Jack J. Dongarra

,

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Poster reception - Targeting multi-core architectures for linear algebra applications.

[DOI]

Alfredo Buttari

,

,

Jack J. Dongarra

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

MPI Collective Algorithm Selection and Quadtree Encoding.

[DOI]

Jelena Pjesivac-Grbovic

,

,

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Implementation and Usage of the PERUSE-Interface in Open MPI.

[DOI]

,

,

,

Michael M. Resch

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Scalable Fault Tolerant Protocol for Parallel Runtime Environments.

[DOI]

,

,

,

Jelena Pjesivac-Grbovic

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Implementing Linear Algebra Routines on Multi-core Processors with Pipelining and a Look Ahead.

[DOI]

,

Jack J. Dongarra

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Prospectus for the Next LAPACK and ScaLAPACK Libraries.

[DOI]

,

Jack J. Dongarra

,

Beresford N. Parlett

,

,

,

,

,

,

,

,

Christof Vömel

,

,

,

,

Alfredo Buttari

,

,

Stanimire Tomov

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

The Impact of Multicore on Math Software.

[DOI]

Alfredo Buttari

,

Jack J. Dongarra

,

,

,

,

Stanimire Tomov

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications.

[DOI]

Oscar R. Hernandez

,

,

Barbara M. Chapman

,

Jack J. Dongarra

,

,

,

Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

The Impact of Multicore on Math Software and Exploiting Single Precision Computing to Obtain Double Precision Results.

[DOI]

Jack J. Dongarra

Proceedings of the Parallel and Distributed Processing and Applications, 2006

Algorithm-based checkpoint-free fault tolerance for parallel matrix computations on volatile resources.

[DOI]

,

Jack J. Dongarra

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

GridSolve: The Evolution of A Network Enabled Solver.

[DOI]

,

Jack J. Dongarra

,

Proceedings of the Grid-Based Problem Solving Environments, 2006

Exploiting Mixed Precision Floating Point Hardware in Scientific Computations.

Alfredo Buttari

,

Jack J. Dongarra

,

,

,

,

,

Stanimire Tomov

Proceedings of the High Performance Computing and Grids in Action, 2006

Robust task scheduling in non-deterministic heterogeneous computing systems.

[DOI]

,

Emmanuel Jeannot

,

Jack J. Dongarra

Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Parallel Linear Algebra Software.

[DOI]

Victor Eijkhout

,

,

Jack J. Dongarra

Proceedings of the Parallel Processing for Scientific Computing, 2006

Trends in High-Performance Computing.

[DOI]

Jack J. Dongarra

Proceedings of the Handbook of Nature-Inspired and Innovative Computing, 2006

Engineering the grid - status and perspective.

Beniamino Di Martino

,

Jack J. Dongarra

,

,

Laurence Tianruo Yang

,

American Scientific Publishers, ISBN: 978-1-58883-038-8, 2006

2005

Condition Numbers of Gaussian Random Matrices.

[DOI]

,

Jack J. Dongarra

SIAM J. Matrix Anal. Appl., 2005

Special Issue on Program Generation, Optimization, and Platform Adaptation.

[DOI]

José M. F. Moura

,

Markus Püschel

,

,

Jack J. Dongarra

Proc. IEEE, 2005

Self-Adapting Linear Algebra Algorithms and Software.

[DOI]

James Demmel

,

Jack J. Dongarra

,

Victor Eijkhout

,

,

Antoine Petitet

,

Richard W. Vuduc

,

R. Clint Whaley

,

Katherine A. Yelick

Proc. IEEE, 2005

Recent trends in the marketplace of high performance computing.

[DOI]

Erich Strohmaier

,

Jack J. Dongarra

,

Hans Werner Meuer

,

Parallel Comput., 2005

The Component Structure of a Self-Adapting Numerical Software System.

[DOI]

Victor Eijkhout

,

,

,

Jack J. Dongarra

Int. J. Parallel Program., 2005

Recent Advances in Parallel Virtual Machine and Message Passing Interface.

[DOI]

Dieter Kranzlmüller

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2005

Evaluating Dynamic Communicators and One-Sided Operations for Current MPI Libraries.

[DOI]

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2005

Process Fault Tolerance: Semantics, Design and Applications for High Performance Computing.

[DOI]

,

,

,

,

,

Jelena Pjesivac-Grbovic

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2005

Biological sequence alignment on the computational grid using the GrADS framework.

[DOI]

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2005

High-performance computing: clusters, constellations, MPPs, and future directions.

[DOI]

Jack J. Dongarra

,

Thomas L. Sterling

,

,

Erich Strohmaier

Comput. Sci. Eng., 2005

Self adaptivity in Grid computing.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

Concurr. Pract. Exp., 2005

Enabling interactive and collaborative oil reservoir simulations on the Grid.

[DOI]

Manish Parashar

,

Rajeev Muralidhar

,

,

Dorian C. Arnold

,

Jack J. Dongarra

,

Mary F. Wheeler

Concurr. Pract. Exp., 2005

A Scalable Approach to MPI Application Performance Analysis.

[DOI]

,

,

Jack J. Dongarra

,

,

Allen D. Malony

,

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Hash Functions for Datatype Signatures in MPI.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Scalable Fault Tolerant MPI: Extending the Recovery Algorithm.

[DOI]

,

,

,

Jelena Pjesivac-Grbovic

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Fault tolerant high performance computing by a coding approach.

[DOI]

,

,

,

,

,

,

Jack J. Dongarra

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

Eigenvalue Computation with NetSolve Global Computing System.

[DOI]

Seyed Abolfazl Shahzadeh Fazeli

,

,

Jack J. Dongarra

Proceedings of the Large-Scale Scientific Computing, 5th International Conference, 2005

Performance Analysis of MPI Collective Operations.

[DOI]

Jelena Pjesivac-Grbovic

,

,

,

,

,

Jack J. Dongarra

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

NetSolve/D: A Massively Parallel Grid Execution System for Scalable Data Intensive Collaboration.

[DOI]

,

Jack J. Dongarra

,

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Automatic Experimental Analysis of Communication Patterns in Virtual Topologies.

[DOI]

,

,

,

Jack J. Dongarra

,

,

Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Comparison of Nonlinear Conjugate-Gradient Methods for Computing the Electronic Properties of Nanostructure Architectures.

[DOI]

Stanimire Tomov

,

,

,

,

Jack J. Dongarra

Proceedings of the Computational Science, 2005

Numerically Stable Real Number Codes Based on Random Matrices.

[DOI]

,

Jack J. Dongarra

Proceedings of the Computational Science, 2005

Processes Distribution of Homogeneous Parallel Linear Algebra Routines on Heterogeneous Clusters.

[DOI]

,

Luis-Pedro García

,

Domingo Giménez

,

Jack J. Dongarra

Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

2004

GrADSolve - a grid-based RPC system for parallel computing with application-level scheduling.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

J. Parallel Distributed Comput., 2004

Building and Using a Fault-Tolerant MPI Implementation.

[DOI]

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2004

The Virtual Instrument: Support for Grid-Enabled Mcell Simulations.

[DOI]

,

Francine Berman

,

Thomas M. Bartol

,

,

Terrence J. Sejnowski

,

,

Jack J. Dongarra

,

Michelle Miller

,

Mark H. Ellisman

,

,

Graziano Obertelli

,

,

Stuart M. Pomerantz

,

Int. J. High Perform. Comput. Appl., 2004

Selected numerical algorithms.

[DOI]

Jack J. Dongarra

,

,

Jerzy Wasniewski

Future Gener. Comput. Syst., 2004

Trends in High Performance Computing.

[DOI]

Jack J. Dongarra

Comput. J., 2004

TEG: A High-Performance, Scalable, Multi-network Point-to-Point Communications Methodology.

[DOI]

Timothy S. Woodall

,

Richard L. Graham

,

Ralph H. Castain

,

David J. Daniel

,

Mitchel W. Sukalski

,

,

,

,

,

Jack J. Dongarra

,

Jeffrey M. Squyres

,

,

Prabhanjan Kambadur

,

,

Andrew Lumsdaine

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Open MPI's TEG Point-to-Point Communications Methodology: Comparison to Existing Implementations.

[DOI]

Timothy S. Woodall

,

Richard L. Graham

,

Ralph H. Castain

,

David J. Daniel

,

Mitchel W. Sukalski

,

,

,

,

,

Jack J. Dongarra

,

Jeffrey M. Squyres

,

,

Prabhanjan Kambadur

,

,

Andrew Lumsdaine

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation.

[DOI]

,

,

,

,

Jack J. Dongarra

,

Jeffrey M. Squyres

,

,

Prabhanjan Kambadur

,

,

Andrew Lumsdaine

,

Ralph H. Castain

,

David J. Daniel

,

Richard L. Graham

,

Timothy S. Woodall

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Fault Tolerance in Message Passing and in Action.

[DOI]

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Present and Future Supercomputer Architectures.

[DOI]

Jack J. Dongarra

Proceedings of the Parallel and Distributed Processing and Applications, 2004

Improvements in the Efficient Composition of Applications Built Using a Component-Based Programming Environment.

[DOI]

,

Victor Eijkhout

,

Jack J. Dongarra

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

New Grid Scheduling and Rescheduling Methods in the GrADS Project.

[DOI]

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

An Algebra for Cross-Experiment Performance Analysis.

[DOI]

,

,

,

Jack J. Dongarra

,

Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Design of Interactive Environment for Numerically Intensive Parallel Linear Algebra Calculations.

[DOI]

,

Jack J. Dongarra

Proceedings of the Computational Science, 2004

Accurate Cache and TLB Characterization Using Hardware Counters.

[DOI]

Jack J. Dongarra

,

,

,

,

Proceedings of the Computational Science, 2004

NetSolve: Grid enabling scientific computing environments.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Grid Computing: The New Frontier of High Performance Computing [post-proceedings of the High Performance Computing Workshop], 2004

The LAPACK for Clusters Project: An Example of Self Adapting Numerical Software.

[DOI]

,

Jack J. Dongarra

,

,

Proceedings of the 37th Hawaii International Conference on System Sciences (HICSS-37 2004), 2004

High Performance Computing Trends, Supercomputers, Clusters and Grids.

[DOI]

Jack J. Dongarra

Proceedings of the Grid and Cooperative Computing, 2004

Efficient Pattern Search in Large Traces Through Successive Refinement.

[DOI]

,

,

Jack J. Dongarra

,

Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Automatic blocking of QR and LU factorizations for locality.

[DOI]

,

,

,

,

Jack J. Dongarra

Proceedings of the 2004 workshop on Memory System Performance, 2004

Application-Level Tools.

[DOI]

,

,

Jack J. Dongarra

,

Satoshi Matsuoka

Proceedings of the Grid 2 - Blueprint for a New Computing Infrastructure, Second Edition, 2004

2003

SRS: A Framework for Developing Malleable and Migratable Parallel Applications for Distributed Systems.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

Parallel Process. Lett., 2003

Preface.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Process. Lett., 2003

Self-adapting software for numerical linear algebra and LAPACK for clusters.

[DOI]

,

Jack J. Dongarra

,

,

Parallel Comput., 2003

Recent Advances in Parallel Virtual Machine and Message Passing Interface: (Selected Papers from the EuroPVMMPI 2002 Conference).

[DOI]

,

Dieter Kranzlmüller

,

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2003

Self-Adapting Numerical Software for Next Generation Applications.

[DOI]

Jack J. Dongarra

,

Victor Eijkhout

Int. J. High Perform. Comput. Appl., 2003

The LINPACK Benchmark: past, present and future.

[DOI]

Jack J. Dongarra

,

,

Antoine Petitet

Concurr. Comput. Pract. Exp., 2003

Evaluating the Performance of MPI-2 Dynamic Communicators and One-Sided Communication.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

Automatic Optimisation of Parallel Linear Algebra Routines in Systems with Variable Load.

[DOI]

,

Domingo Giménez

,

José González

,

Jack J. Dongarra

,

Proceedings of the 11th Euromicro Workshop on Parallel, 2003

High Performance Computing Trends and Self Adapting Numerical Software.

[DOI]

Jack J. Dongarra

Proceedings of the High Performance Computing, 5th International Symposium, 2003

Optimizing Performance and Reliability in Distributed Computing Systems through Wide Spectrum Storage.

[DOI]

,

,

Jack J. Dongarra

,

,

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Applying Aspect-Oriented Programming Concepts to a Component-Based Programming Model.

[DOI]

,

Jack J. Dongarra

,

Victor Eijkhout

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters.

[DOI]

Jack J. Dongarra

,

Kevin S. London

,

,

,

Daniel Terpstra

,

,

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

visPerf: Monitoring Tool for Grid Computing.

[DOI]

,

Jack J. Dongarra

,

Rudrapatna S. Ramakrishna

Proceedings of the Computational Science - ICCS 2003, 2003

Performance Instrumentation and Measurement for Terascale Systems.

[DOI]

Jack J. Dongarra

,

Allen D. Malony

,

,

,

Proceedings of the Computational Science - ICCS 2003, 2003

Self-Adapting Numerical Software and Automatic Tuning of Heuristics.

[DOI]

Jack J. Dongarra

,

Victor Eijkhout

Proceedings of the Computational Science - ICCS 2003, 2003

Self-Adapting Software for Numerical Linear Algebra Library Routines on Clusters.

[DOI]

,

Jack J. Dongarra

,

,

Proceedings of the Computational Science - ICCS 2003, 2003

Distributed Probabilistic Model-Building Genetic Algorithm.

[DOI]

Tomoyuki Hiroyasu

,

,

,

Hisashi Shimosaka

,

Shigeyoshi Tsutsui

,

Jack J. Dongarra

Proceedings of the Genetic and Evolutionary Computation, 2003

GrADSolve - RPC for High Performance Computing on the Grid.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

,

Proceedings of the Euro-Par 2003. Parallel Processing, 2003

A Performance Oriented Migration Framework For The Grid.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002

Preface to the special issue on the basic linear algebra subprograms (BLAS).

[DOI]

Ronald F. Boisvert

,

Jack J. Dongarra

ACM Trans. Math. Softw., 2002

A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures.

[DOI]

,

David S. Watkins

,

Jack J. Dongarra

SIAM J. Sci. Comput., 2002

Middleware for the use of storage in communication.

[DOI]

,

Dorian C. Arnold

,

Alessandro Bassi

,

Francine Berman

,

,

Jack J. Dongarra

,

,

Graziano Obertelli

,

,

D. Martin Swany

,

Sathish S. Vadhiyar

,

Parallel Comput., 2002

Active Netlib: An Active Mathematical Software Collection for Inquiry-based Computational Science and Engineering Education.

[DOI]

,

,

Jack J. Dongarra

,

Christian Halloy

,

J. Digit. Inf., 2002

Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard (2).

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2002

Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard (1).

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2002

HARNESS fault tolerant MPI design, usage and performance issues.

[DOI]

,

Jack J. Dongarra

Future Gener. Comput. Syst., 2002

NetBuild: transparent cross-platform access to computational software libraries.

[DOI]

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2002

Innovations of the NetSolve Grid Computing System.

[DOI]

Dorian C. Arnold

,

,

Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2002

High Performance Computing, Computational Grid, and Numerical Libraries.

[DOI]

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Active netlib: an active mathematical software collection for inquiry-based computational science & engineering education.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

Toward a Framework for Preparing and Executing Adaptive Grid Programs.

[DOI]

,

,

John M. Mellor-Crummey

,

Keith D. Cooper

,

,

Francine Berman

,

Andrew A. Chien

,

,

,

,

,

,

,

,

S. Lennart Johnsson

,

,

Jack J. Dongarra

,

Sathish S. Vadhiyar

,

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

A Metascheduler For The Grid.

[DOI]

Sathish S. Vadhiyar

,

Jack J. Dongarra

Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Experiments with Scheduling Using Simulated Annealing in a Grid Environment.

[DOI]

,

Jack J. Dongarra

Proceedings of the Grid Computing, 2002

Overview of GridRPC: A Remote Procedure Call API for Grid Computing.

[DOI]

,

Hidemoto Nakada

,

Satoshi Matsuoka

,

Jack J. Dongarra

,

,

Proceedings of the Grid Computing, 2002

Trends in High Performance Computing and Using Numerical Libraries on Cluster.

[DOI]

Jack J. Dongarra

Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

Three Tools to Help with Cluster and Grid Computing: SANS-Effort, PAPI, and NetSolve.

[DOI]

Jack J. Dongarra

Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2002), 2002

2001

An iterative solver benchmark.

[DOI]

Jack J. Dongarra

,

Victor Eijkhout

,

Henk A. van der Vorst

Sci. Program., 2001

Recursive approach in sparse matrix LU factorization.

[DOI]

Jack J. Dongarra

,

Victor Eijkhout

,

Sci. Program., 2001

Preface: Clusters and Computational Grids for Scientific Computing.

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Process. Lett., 2001

On the Convergence of Computational and Data Grids.

[DOI]

Dorian C. Arnold

,

Sathish S. Vadhiyar

,

Jack J. Dongarra

Parallel Process. Lett., 2001

Automated empirical optimizations of software and the ATLAS project.

[DOI]

R. Clinton Whaley

,

Antoine Petitet

,

Jack J. Dongarra

Parallel Comput., 2001

HARNESS and fault tolerant MPI.

[DOI]

,

Antonin Bukovsky

,

Jack J. Dongarra

Parallel Comput., 2001

Clusters and computational grids for scientific computing - introduction.

[DOI]

Jack J. Dongarra

,

Masaaki Shimasaki

,

Bernard Tourancheau

Parallel Comput., 2001

Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries.

[DOI]

,

,

Keith D. Cooper

,

Jack J. Dongarra

,

Robert J. Fowler

,

,

S. Lennart Johnsson

,

John M. Mellor-Crummey

,

J. Parallel Distributed Comput., 2001

Numerical Libraries and the Grid.

[DOI]

Antoine Petitet

,

L. Susan Blackford

,

Jack J. Dongarra

,

,

,

,

Sathish S. Vadhiyar

Int. J. High Perform. Comput. Appl., 2001

Numerical Libraries and Tools for Scalable Parallel Cluster Computing.

[DOI]

Jack J. Dongarra

,

,

Anne E. Trefethen

Int. J. High Perform. Comput. Appl., 2001

The GrADS Project: Software Support for High-Level Grid Application Development.

[DOI]

Francine Berman

,

Andrew A. Chien

,

Keith D. Cooper

,

Jack J. Dongarra

,

,

,

S. Lennart Johnsson

,

,

,

John M. Mellor-Crummey

,

,

,

Int. J. High Perform. Comput. Appl., 2001

The quest for petascale computing.

[DOI]

Jack J. Dongarra

,

David W. Walker

Comput. Sci. Eng., 2001

Numerical libraries and the grid: the GrADS experiments with ScaLAPACK.

[DOI]

Antoine Petitet

,

L. Susan Blackford

,

Jack J. Dongarra

,

,

,

,

Sathish S. Vadhiyar

Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

Review of Performance Analysis Tools for MPI Parallel Programs.

[DOI]

,

,

Kevin S. London

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001

Parallel IO Support for Meta-computing Applications: MPI_Connect IO Applied to PACX-MPI.

[DOI]

,

,

Michael M. Resch

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001

Packed Storage Extension for ScaLAPACK.

Eduardo F. D'Azevedo

,

Jack J. Dongarra

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

NetSolve and Its Applications.

[DOI]

Jack J. Dongarra

Proceedings of the IEEE International Symposium on Network Computing and Applications (NCA 2001), 2001

Automatic translation of Fortran to JVM bytecode.

[DOI]

,

Jack J. Dongarra

Proceedings of the ACM 2001 Java Grande Conference, Stanford University, California, USA, 2001

Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication.

[DOI]

,

Dorian C. Arnold

,

Alessandro Bassi

,

Jack J. Dongarra

,

,

,

D. Martin Swany

,

Sathish S. Vadhiyar

,

,

Francine Berman

,

,

Graziano Obertelli

Proceedings of the 3rd Annual International Workshop on Active Middleware Services (AMS 2001), 2001

Towards an Accurate Model for Collective Communications.

[DOI]

Sathish S. Vadhiyar

,

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2001, 2001

Fault Tolerant MPI for the HARNESS Meta-computing System.

[DOI]

,

Antonin Bukovsky

,

Jack J. Dongarra

Proceedings of the Computational Science - ICCS 2001, 2001

High Performance Computing and Trends: Connecting Computational Requirements with Computing Resources.

[DOI]

Jack J. Dongarra

Proceedings of the Euro-Par 2001: Parallel Processing, 2001

High Performance Computing and Trends: Connected Computational Requirements with Computing Resources.

[DOI]

Jack J. Dongarra

Proceedings of the 2001 IEEE International Conference on Cluster Computing (CLUSTER 2001), 2001

End-user Tools for Application Performance Analysis Using Hardware Counters.

Kevin S. London

,

Jack J. Dongarra

,

,

,

,

Proceedings of the ISCA 14th International Conference on Parallel and Distributed Computing Systems, 2001

LAPACK95 Users' Guide.

[DOI]

Vincent A. Barker

,

L. Susan Blackford

,

Jack J. Dongarra

,

,

Sven Hammarling

,

,

Jerzy Wasniewski

,

Plamen Y. Yalamov

Software, environments, tools 13, SIAM, ISBN: 978-0-89871-504-0, 2001

2000

Preface.

[DOI]

Frederica Darema

,

Jack J. Dongarra

,

Int. J. High Perform. Comput. Appl., 2000

A Portable Programming Interface for Performance Evaluation on Modern Processors.

[DOI]

,

Jack J. Dongarra

,

,

,

Int. J. High Perform. Comput. Appl., 2000

Guest Editors' Introduction to the top 10 algorithms.

[DOI]

Jack J. Dongarra

,

Francis Sullivan

Comput. Sci. Eng., 2000

The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines.

[DOI]

Eduardo F. D'Azevedo

,

Jack J. Dongarra

Concurr. Pract. Exp., 2000

Automatically Tuned Collective Communications.

[DOI]

Sathish S. Vadhiyar

,

,

Jack J. Dongarra

Proceedings of Supercomputing 2000, 2000

A Scalable Cross-Platform Infrastructure for Application Performance Tuning Using Hardware Counters.

[DOI]

,

Jack J. Dongarra

,

,

Kevin S. London

,

Proceedings of Supercomputing 2000, 2000

ACCT: Automatic Collective Communications Tuning.

[DOI]

,

Sathish S. Vadhiyar

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000

FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World.

[DOI]

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000

Developing an Architecture to Support the Implementation and Development of Scientific computing Applications.

Dorian C. Arnold

,

Jack J. Dongarra

Proceedings of the Architecture of Scientific Software, 2000

Invited Talk: The NetSolve Environment: Progressing Towards the Seamless Grid.

[DOI]

Dorian C. Arnold

,

Jack J. Dongarra

Proceedings of the 2000 International Workshop on Parallel Processing, 2000

A Grid Computing Environment for Enabling Large Scale Quantum Mechanical Simulations.

[DOI]

Jack J. Dongarra

,

Proceedings of the Grid Computing, 2000

Request Sequencing: Optimizing Communication for the Grid.

[DOI]

Dorian C. Arnold

,

Dieter Bachmann

,

Jack J. Dongarra

Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

Parallel and Distributed Scientific Computing.

[DOI]

Antoine Petitet

,

,

Jack J. Dongarra

,

,

R. Clinton Whaley

Proceedings of the Handbook on Parallel and Distributed Processing, 2000

Common Issues.

[DOI]

Jack J. Dongarra

,

,

,

,

Henk A. van der Vorst

Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

1999

Algorithmic Redistribution Methods for Block-Cyclic Decompositions.

[DOI]

Antoine Petitet

,

Jack J. Dongarra

IEEE Trans. Parallel Distributed Syst., 1999

JLAPACK-compiling LAPACK Fortran to Java.

[DOI]

David M. Doolin

,

Jack J. Dongarra

,

Sci. Program., 1999

A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures.

[DOI]

Françoise Tisseur

,

Jack J. Dongarra

SIAM J. Sci. Comput., 1999

Experiences with Windows NT as a Cluster Computing Platform for Parallel Computing.

[DOI]

,

Jack J. Dongarra

Parallel Distributed Comput. Pract., 1999

Parallel Numerical Linear Algebra.

[DOI]

Jack J. Dongarra

,

Erricos John Kontoghiorghes

Parallel Distributed Comput. Pract., 1999

A Comparison of Parallel Solvers for Diagonally Dominant and General Narrow-Banded Linear Systems.

[DOI]

,

Andrew J. Cleary

,

Jack J. Dongarra

,

Parallel Distributed Comput. Pract., 1999

Algorithmic Issues on Heterogeneous Computing Platforms.

[DOI]

,

Jack J. Dongarra

,

Fabrice Rastello

,

,

Frédéric Vivien

Parallel Process. Lett., 1999

The marketplace of high-performance computing.

[DOI]

Erich Strohmaier

,

Jack J. Dongarra

,

Hans Werner Meuer

,

Parallel Comput., 1999

Static tiling for heterogeneous computing platforms.

[DOI]

,

Jack J. Dongarra

,

,

Frédéric Vivien

Parallel Comput., 1999

Stochastic Performance Prediction for Iterative Algorithms in Distributed Environments.

[DOI]

,

Michael G. Thomason

,

Jack J. Dongarra

J. Parallel Distributed Comput., 1999

Clusters and Computational Grids for Scientific Computing.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 1999

Deploying fault tolerance and task migration with NetSolve.

[DOI]

,

,

,

Jack J. Dongarra

Future Gener. Comput. Syst., 1999

Scalable networked information processing environment (SNIPE).

[DOI]

,

,

Jack J. Dongarra

Future Gener. Comput. Syst., 1999

HARNESS: a next generation distributed virtual machine.

[DOI]

,

Jack J. Dongarra

,

,

,

,

James Arthur Kohl

,

Mauro Migliardi

,

,

,

Philip Papadopoulos

Future Gener. Comput. Syst., 1999

Tiling on systems with communication/computation overlap.

[DOI]

Pierre-Yves Calland

,

Jack J. Dongarra

,

Concurr. Pract. Exp., 1999

Logistical quality of service in NetSolve.

[DOI]

,

,

Jack J. Dongarra

,

,

,

Francine Berman

,

Comput. Commun., 1999

Automatically Tuned Linear Algebra Software.

R. Clinton Whaley

,

Jack J. Dongarra

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

The Future of the BLAS.

Jack J. Dongarra

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Adaptive Scheduling for Task Farming with Grid Middleware.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

A Comparison of Parallel Solvers for Diagonally Dominant and General Narrow-Banded Linear Systems II.

[DOI]

,

Andrew J. Cleary

,

Jack J. Dongarra

,

Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

LAPACK Users' Guide, Third Edition

[DOI]

Edward C. Anderson

,

,

Christian H. Bischof

,

L. Susan Blackford

,

,

Jack J. Dongarra

,

,

,

Sven Hammarling

,

,

Danny C. Sorensen

Software, Environments and Tools, SIAM, ISBN: 978-0-89871-960-4, 1999

1998

Using Agent-Based Software for Scientific Computing in the NetSolve System.

[DOI]

,

Jack J. Dongarra

Parallel Comput., 1998

National HPCC Software Exchange (NHSE): Uniting the High Performance Computing and Communications Community.

[DOI]

,

Jack J. Dongarra

,

,

,

D Lib Mag., 1998

Developing numerical libraries in Java.

[DOI]

Ronald F. Boisvert

,

Jack J. Dongarra

,

,

Karin A. Remington

,

Concurr. Pract. Exp., 1998

Programming Tools and Environments.

[DOI]

,

,

Susan L. Graham

,

,

,

Jack J. Dongarra

Commun. ACM, 1998

MPI_Connect Managing Heterogeneous MPI Applications Interoperation and Process Control.

[DOI]

,

Kevin S. London

,

Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1998

High Performance Linear Algebra Package for FORTRAN 90.

[DOI]

Jerzy Wasniewski

,

Jack J. Dongarra

Proceedings of the Applied Parallel Computing, 1998

Deploying Fault-Tolerance and Task Migration with NetSolve.

[DOI]

,

,

,

Jack J. Dongarra

Proceedings of the Applied Parallel Computing, 1998

More on Scheduling Block-Cyclic Array Redistribution.

[DOI]

Frederic Desprez

,

Stéphane Domas

,

Jack J. Dongarra

,

Antoine Petitet

,

Cyril Randriamaro

,

Proceedings of the Languages, 1998

Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing System.

[DOI]

Mauro Migliardi

,

Jack J. Dongarra

,

,

Vaidy S. Sunderam

Proceedings of the Computing in Object-Oriented Parallel Environments, 1998

High Performance Linear Algebra Package LAPACK90.

[DOI]

Jack J. Dongarra

,

Jerzy Wasniewski

Proceedings of the Parallel and Distributed Processing, 10 IPPS/SPDP'98 Workshops Held in Conjunction with the 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing, Orlando, Florida, USA, March 30, 1998

HARNESS: Heterogeneous Adaptable Reconfigurable NEtworked SystemS.

[DOI]

Jack J. Dongarra

,

,

,

James Arthur Kohl

,

Philip M. Papadopoulos

,

Stephen L. Scott

,

Vaidy S. Sunderam

,

Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, 1998

NetSolve: A Network-Enabled Solver; Examples and Users.

[DOI]

,

Jack J. Dongarra

Proceedings of the Seventh Heterogeneous Computing Workshop, 1998

Technologies for Repository Interoperation and Access Control.

[DOI]

,

Jack J. Dongarra

,

,

,

Proceedings of the 3rd ACM International Conference on Digital Libraries, 1998

1997

Practical Experience in the Numerical Dangers of Heterogeneous Computing.

[DOI]

L. Susan Blackford

,

Andrew J. Cleary

,

Antoine Petitet

,

R. Clinton Whaley

,

,

Inderjit S. Dhillon

,

,

,

Jack J. Dongarra

,

Sven Hammarling

ACM Trans. Math. Softw., 1997

The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers.

[DOI]

,

,

Jack J. Dongarra

,

Antoine Petitet

,

Howard Robinson

,

SIAM J. Sci. Comput., 1997

Workshop on Environments and Tools for Parallel Scientific Computing.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Comput., 1997

Key Concepts for Parallel Out-of-Core LU Factorization.

[DOI]

Jack J. Dongarra

,

Sven Hammarling

,

David W. Walker

Parallel Comput., 1997

Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing.

[DOI]

,

,

Jack J. Dongarra

J. Parallel Distributed Comput., 1997

Preface to the Special Issue.

[DOI]

Jack J. Dongarra

,

Bernard Tourancheau

Int. J. High Perform. Comput. Appl., 1997

NetSolve: a Network-Enabled Server for Solving Computational Science Problems.

[DOI]

,

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 1997

Message-Passing Performance of Various Computers.

[DOI]

Jack J. Dongarra

,

Thomas H. Dunigan

Concurr. Pract. Exp., 1997

Java Access to Numerical Libraries.

[DOI]

,

Jack J. Dongarra

,

David M. Doolin

Concurr. Pract. Exp., 1997

Scalable Networked Information Processing Environment (SNIPE).

[DOI]

,

,

Jack J. Dongarra

,

Proceedings of the ACM/IEEE Conference on Supercomputing, 1997

Heterogeneous MPI Application Interoperation and Process Management under PVMPI.

[DOI]

,

Jack J. Dongarra

,

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1997

Block-Cyclic Array Redistribution on Networks of Workstations.

[DOI]

Jack J. Dongarra

,

Frederic Desprez

,

Antoine Petitet

,

Cyril Randriamaro

,

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1997

PVMPI Provides Interoperability Between MPI Implementations.

,

Jack J. Dongarra

,

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

A Distributed Memory Implementation of the Nonsymmetric QR Algorithm.

Jack J. Dongarra

,

,

David S. Watkins

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

A Further Proposal for a Fortran 90 Interface for LAPACK.

L. Susan Blackford

,

Jack J. Dongarra

,

,

Sven Hammarling

,

Jerzy Wasniewski

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

ScaLAPACK: A Linear Algebra Library for Message-Passing Computers.

L. Susan Blackford

,

,

Andrew J. Cleary

,

Eduardo F. D'Azevedo

,

,

Inderjit S. Dhillon

,

Jack J. Dongarra

,

Sven Hammarling

,

,

Antoine Petitet

,

,

David W. Walker

,

R. Clinton Whaley

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

Scheduling Block-Cyclic Array Redistribution.

Frederic Desprez

,

Jack J. Dongarra

,

Antoine Petitet

,

Cyril Randriamaro

,

Proceedings of the Parallel Computing: Fundamentals, 1997

Industrial Application Areas of High-Performance Computing.

[DOI]

Erich Strohmaier

,

Jack J. Dongarra

,

Hans Werner Meuer

,

Proceedings of the High-Performance Computing and Networking, 1997

Tiling with limited resources.

[DOI]

Pierre-Yves Calland

,

Jack J. Dongarra

,

Proceedings of the 1997 International Conference on Application-Specific Systems, 1997

Determining the Idle Time of Tiling: New Results.

[DOI]

Frederic Desprez

,

Jack J. Dongarra

,

Fabrice Rastello

,

Proceedings of the 1997 Conference on Parallel Architectures and Compilation Techniques (PACT '97), 1997

1996

Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines.

[DOI]

,

Jack J. Dongarra

,

Susan Ostrouchov

,

Antoine Petitet

,

David W. Walker

,

R. Clinton Whaley

Sci. Program., 1996

PB-BLAS: a set of parallel block basic linear algebra subprograms.

[DOI]

,

Jack J. Dongarra

,

David W. Walker

Concurr. Pract. Exp., 1996

A Message Passing Standard for MPP and Workstations.

[DOI]

Jack J. Dongarra

,

,

,

David W. Walker

Commun. ACM, 1996

Providing Access to High Performance Computing Technologies.

[DOI]

Jack J. Dongarra

,

,

Proceedings of the Vector and Parallel Processing, 1996

NetSolve: A Network Server for Solving Computational Science Problems.

[DOI]

,

Jack J. Dongarra

Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.

[DOI]

L. Susan Blackford

,

,

Andrew J. Cleary

,

,

Inderjit S. Dhillon

,

Jack J. Dongarra

,

Sven Hammarling

,

,

Antoine Petitet

,

,

David W. Walker

,

R. Clinton Whaley

Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Case studies on the development of ScaLAPACK and the NAG Numerical PVM Library.

Jack J. Dongarra

,

Sven Hammarling

,

Antoine Petitet

Proceedings of the Quality of Numerical Software, 1996

Matrix Market: a web resource for test matrix collections.

Ronald F. Boisvert

,

,

Karin A. Remington

,

Richard F. Barrett

,

Jack J. Dongarra

Proceedings of the Quality of Numerical Software, 1996

Taskers and General Resource Managers: PVM Supporting DCE Process Management.

[DOI]

,

Kevin S. London

,

Jack J. Dongarra

Proceedings of the Parallel Virtual Machine, 1996

ScaLAPACK Tutorial.

[DOI]

Jack J. Dongarra

,

L. Susan Blackford

Proceedings of the Applied Parallel Computing, 1996

Practical Experience in the Dangers of Heterogeneous Computing.

[DOI]

Andrew J. Cleary

,

,

Inderjit S. Dhillon

,

Jack J. Dongarra

,

Sven Hammarling

,

Antoine Petitet

,

,

,

R. Clinton Whaley

Proceedings of the Applied Parallel Computing, 1996

Changing Technologies of HPC.

[DOI]

Jack J. Dongarra

,

Hans Werner Meuer

,

,

Erich Strohmaier

Proceedings of the High-Performance Computing and Networking, 1996

PARKBENCH: Methodology, Relations and Results.

[DOI]

Jack J. Dongarra

,

,

Erich Strohmaier

Proceedings of the High-Performance Computing and Networking, 1996

LAPACK for Fortran90 Compiler.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

Jerzy Wasniewski

,

Proceedings of the High-Performance Computing and Networking, 1996

Selected Results from the ParkBench Benchmark.

[DOI]

Jack J. Dongarra

,

,

Erich Strohmaier

Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995

Software Distribution using XNETLIB.

[DOI]

Jack J. Dongarra

,

,

ACM Trans. Math. Softw., 1995

Software Libraries for Linear Algebra Computations on High Performance Computers.

[DOI]

Jack J. Dongarra

,

David W. Walker

SIAM Rev., 1995

Performance Study of LU Factorization with Low Communication Overhead on Multiprocessors.

[DOI]

Frederic Desprez

,

Jack J. Dongarra

,

Bernard Tourancheau

Parallel Process. Lett., 1995

Parallel Matrix Transpose Algorithms on Distributed Memory Concurrent Computers.

[DOI]

,

Jack J. Dongarra

,

David W. Walker

Parallel Comput., 1995

A Parallel Algorithm for the Reduction of a Nonsymmetric Matrix to Block Upper-Hessenberg Form.

[DOI]

Michael W. Berry

,

Jack J. Dongarra

,

Parallel Comput., 1995

The design of a parallel dense linear algebra software library: Reduction to Hessenberg, tridiagonal, and bidiagonal form.

[DOI]

,

Jack J. Dongarra

,

David W. Walker

Numer. Algorithms, 1995

Recent Enhancements to PVM.

[DOI]

,

Jack J. Dongarra

,

,

,

Vaidy S. Sunderam

Int. J. High Perform. Comput. Appl., 1995

Visual programming and debugging for parallel computing.

[DOI]

James C. Browne

,

,

Jack J. Dongarra

,

,

Peter W. Newton

IEEE Parallel Distributed Technol. Syst. Appl., 1995

The Netlib Mathematical Software Repository.

[DOI]

,

Jack J. Dongarra

,

,

D Lib Mag., 1995

Location-Independent Naming for Virtual Distributed Software Repositories.

[DOI]

,

Jack J. Dongarra

,

,

,

,

,

Proceedings of the ACM SIGSOFT Symposium on Software Reusability, 1995

Distributed Information Management in the National HPCC Software Exchange.

[DOI]

,

Jack J. Dongarra

,

Geoffrey C. Fox

,

Kenneth A. Hawick

,

,

,

,

Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995, 1995

Position Paper.

Jack J. Dongarra

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

ScaLAPACK Tutorial.

[DOI]

Jack J. Dongarra

,

Antoine Petitet

Proceedings of the Applied Parallel Computing, 1995

A Proposal for a Fortran 90 Interface for LAPACK.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

Jerzy Wasniewski

,

Proceedings of the Applied Parallel Computing, 1995

A Proposal for a Set of Parallel Basic Linear Algebra Subprograms.

[DOI]

,

Jack J. Dongarra

,

Susan Ostrouchov

,

Antoine Petitet

,

David W. Walker

,

R. Clinton Whaley

Proceedings of the Applied Parallel Computing, 1995

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.

[DOI]

,

,

Inderjit S. Dhillon

,

Jack J. Dongarra

,

Susan Ostrouchov

,

Antoine Petitet

,

,

David W. Walker

,

R. Clinton Whaley

Proceedings of the Applied Parallel Computing, 1995

Scalable linear algebra software libraries for distributed memory concurrent computers.

[DOI]

,

Jack J. Dongarra

Proceedings of the 5th IEEE Workshop on Future Trends of Distributed Computing Systems (FTDCS 1995), 1995

Algorithm-Based Diskless Checkpointing for Fault Tolerant Matrix Operations.

[DOI]

,

,

Jack J. Dongarra

Proceedings of the Digest of Papers: FTCS-25, 1995

Management of the National HPCC Software Exchange - A Virtual Distributed Digital Library.

[DOI]

,

Jack J. Dongarra

,

,

Proceedings of the Second Annual Conference on the Theory and Practice of Digital Libraries, 1995

Digital Software and Data Repositories for Support of Scientific Computing.

[DOI]

Ronald F. Boisvert

,

,

Jack J. Dongarra

,

Proceedings of the Digital Libraries, Research and Technology Advances, 1995

Templates for Linear Algebra Problems.

[DOI]

,

,

,

Jack J. Dongarra

,

,

,

Henk A. van der Vorst

Proceedings of the Computer Science Today: Recent Trends and Developments, 1995

1994

PDS: A Performance Database Server.

[DOI]

Michael W. Berry

,

Jack J. Dongarra

,

Brian H. Larose

,

Todd A. Letsche

Sci. Program., 1994

HeNCE: A Heterogeneous Network Computing Environment.

[DOI]

,

Jack J. Dongarra

,

George Al Geist II

,

,

Sci. Program., 1994

The PVM Concurrent Computing System: Evolution, Experiences, and Trends.

[DOI]

Vaidy S. Sunderam

,

,

Jack J. Dongarra

,

Parallel Comput., 1994

Scalability Issues Affecting the Design of a Dense Linear Algebra Library.

[DOI]

Jack J. Dongarra

,

Robert A. van de Geijn

,

David W. Walker

J. Parallel Distributed Comput., 1994

Preface.

[DOI]

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 1994

CRPC Research into Linear Algebra Software for High Performance Computers.

[DOI]

,

Jack J. Dongarra

,

,

Danny C. Sorensen

,

David W. Walker

Int. J. High Perform. Comput. Appl., 1994

PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers.

[DOI]

,

David W. Walker

,

Jack J. Dongarra

Concurr. Pract. Exp., 1994

Constructing Numerical Software Libraries for High-Performance Computing Environments.

[DOI]

,

Jack J. Dongarra

,

,

David W. Walker

Proceedings of the Parallel Scientific Computing, First International Workshop, 1994

The Design of Scalable Software Libraries for Distributed Memory Concurrent Computers.

[DOI]

,

David W. Walker

,

Jack J. Dongarra

Proceedings of the 8th International Symposium on Parallel Processing, 1994

Constructing Numerical Software Libraries for HPCC Environments.

[DOI]

Jack J. Dongarra

Proceedings of the Third International Symposium on High Performance Distributed Computing, 1994

Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods

[DOI]

Richard F. Barrett

,

Michael W. Berry

,

,

,

,

Jack J. Dongarra

,

Victor Eijkhout

,

,

Charles H. Romine

,

Henk A. van der Vorst

Other Titles in Applied Mathematics, SIAM, ISBN: 978-1-61197-153-8, 1994

1993

A Parallel Algorithm for the Nonsymmetric Eigenvalue Problem.

[DOI]

Jack J. Dongarra

,

SIAM J. Sci. Comput., 1993

Performance of LAPACK: a portable library of numerical linear algebra routines.

[DOI]

Edward C. Anderson

,

Jack J. Dongarra

Proc. IEEE, 1993

Linear algebra libraries for high-performance computers: a personal perspective.

[DOI]

Jack J. Dongarra

IEEE Parallel Distributed Technol. Syst. Appl., 1993

Visualization and Debugging in a Heterogeneous Environment.

[DOI]

,

Jack J. Dongarra

,

,

Vaidy S. Sunderam

Computer, 1993

LAPACK++: a design overview of object-oriented extensions for high performance linear algebra.

[DOI]

Jack J. Dongarra

,

,

David W. Walker

Proceedings of Supercomputing '93, 1993

Two Dimensional Basic Linear Algebra Communication Subprograms.

Jack J. Dongarra

,

Robert A. van de Geijn

,

R. Clinton Whaley

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Using PVM 3.0 to Run Grand Challenge Applications on a Heterogeneous Network of Parallel Computers.

Jack J. Dongarra

,

,

,

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

LAPACK for Distributed Memory Architectures: The Next Generation.

,

Jack J. Dongarra

,

Robert A. van de Geijn

,

David W. Walker

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Tools for Heterogeneous Network Computing.

,

Jack J. Dongarra

,

,

,

,

Vaidy S. Sunderam

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

1992

Algorithm 710: FORTRAN subroutines for computing the eigenvalues and eigenvectors of a general matrix by reduction to general tridiagonal form.

[DOI]

Jack J. Dongarra

,

George Al Geist II

,

Charles H. Romine

ACM Trans. Math. Softw., 1992

Numerical Considerations in Computing Invariant Subspaces.

[DOI]

Jack J. Dongarra

,

Sven Hammarling

,

James Hardy Wilkinson

SIAM J. Matrix Anal. Appl., 1992

Reduction to condensed form for the eigenvalue problem on distributed memory architectures.

[DOI]

Jack J. Dongarra

,

Robert A. van de Geijn

Parallel Comput., 1992

Editorial.

[DOI]

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 1992

1991

1990 Gordon Bell Prize Winners.

[DOI]

Jack J. Dongarra

,

,

,

IEEE Softw., 1991

A comparative study of automatic vectorizing compilers.

[DOI]

,

,

Jack J. Dongarra

Parallel Comput., 1991

Parallel loops - a test suite for parallelizing compilers: description and example results.

[DOI]

Jack J. Dongarra

,

,

Steven P. Reinhardt

,

Parallel Comput., 1991

Gordon Bell prize lectures.

[DOI]

Jack J. Dongarra

,

,

,

Proceedings of Supercomputing '91, 1991

Graphical development tools for network-based concurrent supercomputing.

[DOI]

,

Jack J. Dongarra

Proceedings of Supercomputing '91, 1991

Solving Computational Grand Challenges Using a Network of Heterogeneous Supercomputers.

,

Jack J. Dongarra

,

,

,

Vaidy S. Sunderam

Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, 1991

LAPACK for Distributed Memory Architectures: Progress Report.

Edward C. Anderson

,

Annamaria Benzoni

,

Jack J. Dongarra

,

,

Susan Ostrouchov

,

Bernard Tourancheau

,

Robert A. van de Geijn

Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, 1991

Solving linear systems on vector and shared memory computers.

Jack J. Dongarra

,

,

Danny C. Sorensen

,

Henk A. van der Vorst

SIAM, ISBN: 978-0-89871-270-4, 1991

1990

Algorithm 679: a set of level 3 basic linear algebra subprograms: model implementation and test programs.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

ACM Trans. Math. Softw., 1990

A set of level 3 basic linear algebra subprograms.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

ACM Trans. Math. Softw., 1990

1989 Gordon Bell Prize.

[DOI]

Jack J. Dongarra

,

,

IEEE Softw., 1990

Performance of various computers using standard linear equations software.

[DOI]

Jack J. Dongarra

SIGARCH Comput. Archit. News, 1990

A Tool to Aid in the Design, Implementation, and Understanding of Matrix Algorithms for Parallel Processors.

[DOI]

Jack J. Dongarra

,

,

James Arthur Kohl

,

Samuel A. Fineberg

J. Parallel Distributed Comput., 1990

LAPACK: a portable linear algebra library for high-performance computers.

[DOI]

Edward C. Anderson

,

,

Jack J. Dongarra

,

,

,

,

Sven Hammarling

,

,

Christian H. Bischof

,

Danny C. Sorensen

Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990, 1990

1989

1988 Gordon Bell Prize.

[DOI]

James C. Browne

,

Jack J. Dongarra

,

,

,

IEEE Softw., 1989

Advanced Computing Research Facility, Mathematics and Computer Science Division, Argonne National Laboratory.

[DOI]

Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 1989

Evaluating Block Algorithm Variants in LAPACK.

Edward C. Anderson

,

Jack J. Dongarra

Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

A graphics tool to aid in the generation of parallel FORTRAN programs.

[DOI]

,

Jack J. Dongarra

,

Danny C. Sorensen

Proceedings of the 13th Annual International Computer Software and Applications Conference, 1989

1988

Corrigenda: "An Extended Set of FORTRAN Basic Linear Algebra Subprograms".

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

Richard J. Hanson

ACM Trans. Math. Softw., 1988

Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

Richard J. Hanson

ACM Trans. Math. Softw., 1988

An extended set of FORTRAN basic linear algebra subprograms.

[DOI]

Jack J. Dongarra

,

,

Sven Hammarling

,

Richard J. Hanson

ACM Trans. Math. Softw., 1988

Programming methodology and performance issues for advanced computer architectures.

[DOI]

Jack J. Dongarra

,

Danny C. Sorensen

,

Kathryn Connolly

,

Parallel Comput., 1988

Tools to aid in the analysis of memory access patterns for FORTRAN programs.

[DOI]

,

Jack J. Dongarra

,

Danny C. Sorensen

Parallel Comput., 1988

Vectorizing compilers: a test suite and results.

[DOI]

,

Jack J. Dongarra

,

Proceedings of Supercomputing '88, Orlando, FL, USA, November 12-17, 1988, 1988

1987

Performance of various computers using standard linear equations software in a Fortran environment.

[DOI]

Jack J. Dongarra

Simul., 1987

A portable environment for developing parallel FORTRAN programs.

[DOI]

Jack J. Dongarra

,

Danny C. Sorensen

Parallel Comput., 1987

Solving banded systems on a parallel processor.

[DOI]

Jack J. Dongarra

,

S. Lennart Johnsson

Parallel Comput., 1987

Distribution of Mathematical Software via Electronic Mail.

[DOI]

Jack J. Dongarra

,

Commun. ACM, 1987

SCHEDULE: An Environment for Developing Transportable Explicitly Parallel Codes in Fortran-Abstract.

Jack J. Dongarra

,

Danny C. Sorensen

Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

A Proposal for a Set of Level 3 Basic Linear Algebra Subprograms.

Jack J. Dongarra

,

,

,

Sven Hammarling

Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

The LINPACK Benchmark: An Explanation.

[DOI]

Jack J. Dongarra

Proceedings of the Supercomputing, 1987

1986

Comparison of the CRAY X-MP-4, Fujitsu VP-200, and Hitachi S-810/20.

[DOI]

Jack J. Dongarra

,

Simul., 1986

Implementation of some concurrent algorithms for matrix factorization.

[DOI]

Jack J. Dongarra

,

,

Danny C. Sorensen

Parallel Comput., 1986

Performance of advanced architectures.

[DOI]

Jack J. Dongarra

Proceedings of the 1986 Workshop on Applied Computing, 1986

1985

A fully parallel algorithm for the symmetric eigenvalue problem.

Jack J. Dongarra

,

Danny C. Sorensen

Selected Papers from the Second Conference on Parallel Processing for Scientific Computing, 1985

A fast algorithm for the symmetric eigenvalue problem.

[DOI]

Jack J. Dongarra

,

Danny C. Sorensen

Proceedings of the 7th IEEE Symposium on Computer Arithmetic, 1985

1984

Squeezing the most out of an algorithm in CRAY FORTRAN.

[DOI]

Jack J. Dongarra

,

Stanley C. Eisenstat

ACM Trans. Math. Softw., 1984

On some parallel banded system solvers.

[DOI]

Jack J. Dongarra

,

Parallel Comput., 1984

A collection of parallel linear equations routines for the Denelcor HEP.

[DOI]

Jack J. Dongarra

,

Robert E. Hiromoto

Parallel Comput., 1984

Multiprocessing linear algebra algorithms on the CRAY X-MP-2: Experiences with small granularity.

[DOI]

,

Jack J. Dongarra

,

Christopher C. Hsiung

J. Parallel Distributed Comput., 1984

1982

Algorithm 589: SICEDR: A FORTRAN Subroutine for Improving the Accuracy of Computed Matrix Eigenvalues.

[DOI]

Jack J. Dongarra

ACM Trans. Math. Softw., 1982

1979

Unrolling Loops in FORTRAN.

[DOI]

Jack J. Dongarra

,

Softw. Pract. Exp., 1979

1977

Matrix Eigensystem Routines - EISPACK Guide Extension

[DOI]

Burton S. Garbow

,

,

Jack J. Dongarra

,

Lecture Notes in Computer Science 51, Springer, ISBN: 0-387-08254-9, 1977

1976

Matrix Eigensystem Routines - EISPACK Guide, Second Edition

[DOI]

,

,

Jack J. Dongarra

,

Burton S. Garbow

,

,

Virginia C. Klema

,

Lecture Notes in Computer Science 6, Springer, ISBN: 0-387-07546-1, 1976
