Peter Thoman

Philipp Gschwandtner

Int. J. Parallel Program., September, 2026

A Portable Compiler-Runtime Approach for Scalability Prediction.

Future Gener. Comput. Syst., 2026

Bridging usability and performance: High-level abstractions for advanced accelerator cluster programming.

Future Gener. Comput. Syst., 2026

2025

A High-Level API for Dynamic Load Balancing in Large-Scale Parameter Sweeps.

Philip Salzmann

Int. J. Parallel Program., August, 2025

Celerity-RSim: Porting Light Propagation Simulation to Accelerator Clusters Using a High-Level API.

Philipp Gschwandtner

Facundo Molina Heredia

Juan Jose Durillo Barrionuevo

Int. J. Parallel Program., June, 2025

Toward Heterogeneous, Distributed, and Energy-Efficient Computing with SYCL.

CoRR, May, 2025

Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems.

CoRR, March, 2025

STREAMLINE: Dynamic and Resource-Efficient Auto-Tuning of Stream Processing Data Pipeline Ensembles.

Stefan Pedratscher

Zahra Najafabadi Samani

Internet Things, 2025

FedGuard: Leader-Aware Fault-Tolerant Federated Learning in the Cloud-Edge Continuum.

Zahra Najafabadi Samani

Proceedings of the 18th IEEE/ACM International Conference on Utility and Cloud Computing, 2025

2024

Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs.

Int. J. Parallel Program., June, 2024

Balancing Tracking Granularity and Parallelism in Many-Task Systems: The Horizons Approach.

Philip Salzmann

SN Comput. Sci., April, 2024

SimSYCL: A SYCL Implementation Targeting Development, Debugging, Simulation and Conformance.

Luigi Crisci

Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs.

Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

LIGATE - LIgand Generator and portable drug discovery platform AT Exascale.

Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

2023

Declarative Data Flow in a Graph-Based Distributed Memory Runtime System.

Int. J. Parallel Program., 2023

Command Horizons: Coalescing Data Dependencies While Maintaining Asynchronicity.

Philip Salzmann

Proceedings of the Asynchronous Many-Task Systems and Applications, 2023

Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications.

Andrea Rosario Beccari

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Tunable and Portable Extreme-Scale Drug Discovery Platform at Exascale: the LIGATE Approach.

Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing.

Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

2022

The Celerity High-level API: C++20 for Accelerator Clusters.

Int. J. Parallel Program., 2022

Multi-GPU room response simulation with hardware raytracing.

Concurr. Comput. Pract. Exp., 2022

On the Compilation Performance of Current SYCL Implementations.

Facundo Molina Heredia

Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

Celerity: How (Well) Does the SYCL API Translate to Distributed Clusters?

Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

2021

The cluster coffer: Teaching HPC on the road.

J. Parallel Distributed Comput., 2021

ndzip-gpu: efficient lossless compression of scientific floating-point data on GPUs.

Proceedings of the International Conference for High Performance Computing, 2021

Sylkan: Towards a Vulkan Compute Target Platform for SYCL.

Daniel Gogl

Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich, Germany, April, 2021

Optimizing Embedded Industrial Safety Systems Based on Time-of-flight Depth Imaging.

Proceedings of the 17th IEEE International Conference on eScience, 2021

Porting Real-World Applications to GPU Clusters: A Celerity and Cronos Case Study.

Proceedings of the 17th IEEE International Conference on eScience, 2021

ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.

Proceedings of the 31st Data Compression Conference, 2021

2020

The allscale framework architecture.

Parallel Comput., 2020

AllScale toolchain pilot applications: PDE based solvers using a parallel development environment.

Comput. Phys. Commun., 2020

Datasets for Benchmarking Floating-Point Compressors.

CoRR, 2020

Running on Raygun.

Alexander Hirsch

CoRR, 2020

RTX-RSim: Accelerated Vulkan Room Response Simulation for Time-of-Flight Imaging.

Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing.

Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing.

Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019

Static Compiler Analyses for Application-specific Optimization of Task-Parallel Runtime Systems.

J. Signal Process. Syst., 2019

Compiler Generated Progress Estimation for OpenMP Programs.

Proceedings of the Parallel Computing Technologies, 2019

Celerity: High-Level C++ for Accelerator Clusters.

Proceedings of the Euro-Par 2019: Parallel Processing, 2019

The AllScale API.

Proceedings of the 15th International Conference on eScience, 2019

2018

Dataset for "The AllScale Runtime Application Model" publication.

Dataset, July, 2018

A taxonomy of task-based parallel programming technologies for high-performance computing.

J. Supercomput., 2018

Exploring the semantic gap in compiling embedded DSLs.

Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

The AllScale Runtime Application Model.

Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017

SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads.

Giorgis Georgakoudis

Hans Vandierendonck

Bronis R. de Supinski

ACM Trans. Archit. Code Optim., 2017

A Taxonomy of Task-Based Technologies for High-Performance Computing.

Proceedings of the Parallel Processing and Applied Mathematics, 2017

Characterizing Performance and Cache Impacts of Code Multi-versioning on Multicore Architectures.

Proceedings of the 25th Euromicro International Conference on Parallel, 2017

Task-parallel Runtime System Optimization Using Static Compiler Analysis.

Proceedings of the Computing Frontiers Conference, 2017

2016

The AllScale Runtime Interface - Theoretical Foundation and Concept.

Proceedings of the 9th Workshop on Many-Task Computing on Clouds, 2016

A Context-Aware Primitive for Nested Recursive Parallelism.

Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015

On the Quality of Implementation of the C++11 Thread Support Library.

Philipp Gschwandtner

Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Application-Level Energy Awareness for OpenMP.

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Optimizing Task Parallelism with Library-Semantics-Aware Compilation.

Stefan Moosbrugger

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014

Compiler multiversioning for automatic task granularity control.

Concurr. Comput. Pract. Exp., 2014

2013

Adaptive Granularity Control in Task Parallel Programs Using Multiversioning.

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

A High-Level IR Transformation System.

Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

INSPIRE: The insieme parallel intermediate representation.

Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012

A multi-objective auto-tuning framework for parallel codes.

Juan Jose Durillo Barrionuevo

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Automatic OpenMP Loop Scheduling: A Combined Compiler and Runtime Approach.

Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

2011

Automatic OpenCL Device Characterization: Guiding Optimized Kernel Design.

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010

Topology-Aware OpenMP Process Scheduling.

Hans Moritsch

Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

2008

GPU-Based Multigrid: Real-Time Performance in High Resolution Nonlinear Image Processing.

Harald Grossauer