Holger Fehske

ACM Trans. Parallel Comput., September, 2023

2021

A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials.

Andreas Pieper

Int. J. High Perform. Comput. Appl., 2021

2020

ESSEX: Equipping Sparse Solvers For Exascale.

Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020

A Recursive Algebraic Coloring Technique for Hardware-efficient Symmetric Sparse Matrix-vector Multiplication.

ACM Trans. Parallel Comput., 2020

Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors.

Proceedings of the High Performance Computing - 35th International Conference, 2020

2018

Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs.

CoRR, 2018

Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs.

Proceedings of the High Performance Computing - 33rd International Conference, 2018

2017

GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems.

Moritz Kreutzer

Jonas Thies

Int. J. Parallel Program., 2017

PVSC-DTM: A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials.

Andreas Pieper

CoRR, 2017

2016

Towards an Exascale Enabled Sparse Solver Repository.

Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers.

Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations.

J. Comput. Phys., 2016

2015

Increasing the Performance of the Jacobi-Davidson Method by Blocking.

SIAM J. Sci. Comput., 2015

Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014

A Unified Sparse Matrix Data Format for Efficient General Sparse Matrix-Vector Multiplication on Modern Processors with Wide SIMD Units.

SIAM J. Sci. Comput., 2014

Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems.

CoRR, 2014

ESSEX: Equipping Sparse Solvers for Exascale.

Faisal Shahzad

Jonas Thies

Gerhard Wellein

Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

2013

A unified sparse matrix data format for modern processors with wide SIMD units.

CoRR, 2013

2012

Sparse Matrix-vector Multiplication on GPGPU Clusters: A New Storage Format and a Scalable Implementation.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2011

Hybrid-Parallel Sparse Matrix-Vector Multiplication with Explicit Communication Overlap on Current Multicore-Based Systems.

Parallel Process. Lett., 2011

High-order commutator-free exponential time-propagation of driven quantum systems.

Andreas Alvermann

J. Comput. Phys., 2011

Parallel Sparse Matrix-Vector Multiplication as a Test Case for Hybrid MPI+OpenMP Programming.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2009

Performance limitations for sparse matrix-vector multiplications on current multicore environments.

Gerald Schubert