José Gracia

Christian Simmendinger

Andreas Ruopp

Ramil Nabiev

Nathalie Favretto-Cristini

Proceedings of the High Performance Computing. ISC High Performance 2024 International Workshops, 2024

EE-HPC a Framework for Energy Efficient HPC System Management.

Christian Simmendinger

Marcel Marquardt

Jan Eitzinger

Thomas Gruber

Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

2023

The EU Center of Excellence for Exascale in Solid Earth (ChEESE): Implementation, results, and roadmap for the second phase.

Tomaso Esposti Ongaro

Joan Farnós

Andreas Fichtner

Alexandre Fournier

Alice-Agnes Gabriel

Jean-Matthieu Gallard

Steven J. Gibbons

Sylfest Glimsdal

José Manuel González-Vida

Maria Concetta Lorenzino

Beatriz Martínez Montesinos

Leonardo Mingari

Geneviève Moguilny

Vadim Montellier

Marisol Monterrubio Velasco

Georges-Emmanuel Moulard

Juan Esteban Rodriguez

Carlos Sánchez-Linares

Future Gener. Comput. Syst., 2023

2021

Callback-based completion notification using MPI Continuations.

Parallel Comput., 2021

Quo Vadis MPI RMA? Towards a More Efficient Use of MPI One-Sided Communication.

CoRR, 2021

Feasibility Study of Molecular Dynamics Kernels Exploitation Using EngineCL.

Proceedings of the Euro-Par 2021: Parallel Processing Workshops, 2021

2020

DASH: Distributed Data Structures and Parallel Algorithms in a Global Address Space.

Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020

Collectives in hybrid MPI+MPI code: Design, practice and performance.

Parallel Comput., 2020

Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU.

Future Gener. Comput. Syst., 2020

Fibers are not (P)Threads: The Case for Loose Coupling of Asynchronous Programming Models and MPI Through Continuations.

Joseph Schuchart

Proceedings of the EuroMPI/USA '20: 27th European MPI Users' Group Meeting, 2020

2019

Global Task Data-Dependencies in PGAS Applications.

Joseph Schuchart

Proceedings of the High Performance Computing - 34th International Conference, 2019

MPI Collectives for Multi-core Clusters: Optimized Performance of the Hybrid MPI+MPI Parallel Codes.

Ralf Schneider

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

The Impact of Taskyield on the Design of Tasks Communicating Through MPI.

Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

2017

Application Productivity and Performance Evaluation of Transparent Locality-aware One-sided Communication Primitives.

Int. J. Netw. Comput., 2017

Patterns for OpenMP Task Data Dependency Overhead Measurements.

Joseph Schuchart

Mathias Nachtmann

Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

2016

HPC Benchmarking: Problem Size Matters.

Vladimir Marjanovic

Proceedings of the 7th International Workshop on Performance Modeling, 2016

Asynchronous Progress Design for a MPI-Based PGAS One-Sided Communication System.

Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Towards Performance Portability through Locality-Awareness for Applications Using One-Sided Communication Primitives.

Proceedings of the Fourth International Symposium on Computing and Networking, 2016

2015

DART-MPI: An MPI-based Implementation of a PGAS Runtime System.

CoRR, 2015

CppSs - a C++ Library for Efficient Task Parallelism.

Steffen Brinkmann

CoRR, 2015

A Bandwidth-Saving Optimization for MPI Broadcast Collective Operation.

Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

Providing Parallel Debugging for DASH Distributed Data Structures with GDB.

Denis Hünich

Andreas Knüpfer

Proceedings of the International Conference on Computational Science, 2015

Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems.

Kamran Idrees

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014

Performance Modeling of the HPCG Benchmark.

Vladimir Marjanovic

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

DART-MPI: An MPI-based Implementation of a PGAS Runtime System.

Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

DASH: Data Structures and Algorithms with Support for Hierarchical Locality.

Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

2013

Programmability and portability for exascale: Top down programming methodology and tools with StarSs.

J. Comput. Sci., 2013

Cudagrind: Memory-Usage Checking for CUDA.

Thomas M. Baumann

Proceedings of the Tools for High Performance Computing 2013, 2013

POLCA - A Programming Model for Large Scale, Strongly Heterogeneous Infrastructures.

Lutz Schubert

Jan Kuper

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Cudagrind: A Valgrind Extension for CUDA.

Thomas M. Baumann

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

2012

Task Debugging with TEMANEJO.

Steffen Brinkmann

Proceedings of the Tools for High Performance Computing 2012, 2012

Avoiding Serialization Effects in Data / Dependency Aware Task Parallel Algorithms for Spatial Decomposition.

Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Hybrid MPI/StarSs - A Case Study.

Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Scheduling Overheads for Task-Based Parallel Programming Models.

Mathias Nachtmann