Mark Gates

Orcid: 0000-0003-2996-1641

Affiliations:
  • University of Tennessee, Knoxville, TN, USA


According to our database, Mark Gates authored at least 45 papers between 2011 and 2023.

Bibliography

2023
Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

PAQR: Pivoting Avoiding QR factorization.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2022
Reproducability Artifact for Running SLATE's GEMM and POTRF Operations on Summit and Crusher.
Dataset, August, 2022

Software for "Threshold Pivoting for dense LU Factorization".
Dataset, May, 2022

Threshold Pivoting for Dense LU Factorization.
Proceedings of the IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, 2022

Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era.
Proceedings of the IEEE/ACM International Workshop on Performance, 2022

Proposed Consistent Exception Handling for the BLAS and LAPACK.
Proceedings of the Sixth IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2022

2021
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines.
ACM Trans. Math. Softw., 2021

Translational process: Mathematical software perspective.
J. Comput. Sci., 2021

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic.
Int. J. High Perform. Comput. Appl., 2021

Task-graph scheduling extensions for efficient synchronization and communication.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

2020
MAGMA templates for scalable linear algebra on emerging architectures.
Int. J. High Perform. Comput. Appl., 2020

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic.
CoRR, 2020

2019
PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP.
ACM Trans. Math. Softw., 2019

SLATE: design of a modern distributed and accelerated linear algebra library.
Proceedings of the International Conference for High Performance Computing, 2019

Least squares solvers for distributed-memory machines with GPU accelerators.
Proceedings of the ACM International Conference on Supercomputing, 2019

Massively Parallel Automated Software Tuning.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators.
Proceedings of the Euro-Par 2019: Parallel Processing, 2019

2018
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale.
SIAM Rev., 2018

Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators.
Proc. IEEE, 2018

Accelerating the SVD two stage bidiagonal reduction and divide and conquer using GPUs.
Parallel Comput., 2018

2017
Preconditioned Krylov solvers on GPUs.
Parallel Comput., 2017

With Extreme Computing, the Rules Have Changed.
Comput. Sci. Eng., 2017

Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Bringing High Performance Computing to Big Data Algorithms.
Proceedings of the Handbook of Big Data Technologies, 2017

2016
Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs.
IEEE Trans. Parallel Distributed Syst., 2016

Linear algebra software for large-scale accelerated multicore computing.
Acta Numer., 2016

Heterogeneous Streaming.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Search Space Generation and Pruning System for Autotuners.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

2015
Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems.
Supercomput. Front. Innov., 2015

HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi.
Sci. Program., 2015

High-performance hybrid CPU and GPU parallel algorithm for digital volume correlation.
Int. J. High Perform. Comput. Appl., 2015

A survey of recent developments in parallel implementations of Gaussian elimination.
Concurr. Comput. Pract. Exp., 2015

Accelerating collaborative filtering using concepts from high performance computing.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014
Accelerating Computation of Eigenvectors in the Dense Nonsymmetric Eigenvalue Problem.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors.
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014

clMAGMA: high performance dense linear algebra with OpenCL.
Proceedings of the International Workshop on OpenCL, 2014

Accelerating Numerical Dense Linear Algebra Calculations with GPUs.
Proceedings of the Numerical Computations with GPUs, 2014

2013
Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi.
Proceedings of the Parallel Processing and Applied Mathematics, 2013

Virtual Systolic Array for QR Decomposition.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication.
Proceedings of the International Conference on Supercomputing, 2013

2012
Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems.
Proceedings of the International Conference on Computational Science, 2012

2011
High performance digital volume correlation.
PhD thesis, 2011

