Peter Thoman

Orcid: 0000-0002-4028-7451

According to our database, Peter Thoman authored at least 58 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
A High-Level API for Dynamic Load Balancing in Large-Scale Parameter Sweeps.
Int. J. Parallel Program., August, 2025

Celerity-RSim: Porting Light Propagation Simulation to Accelerator Clusters Using a High-Level API.
Int. J. Parallel Program., June, 2025

Toward Heterogeneous, Distributed, and Energy-Efficient Computing with SYCL.
CoRR, May, 2025

Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems.
CoRR, March, 2025

STREAMLINE: Dynamic and Resource-Efficient Auto-Tuning of Stream Processing Data Pipeline Ensembles.
Internet Things, 2025

2024
Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs.
Int. J. Parallel Program., June, 2024

Balancing Tracking Granularity and Parallelism in Many-Task Systems: The Horizons Approach.
SN Comput. Sci., April, 2024

SimSYCL: A SYCL Implementation Targeting Development, Debugging, Simulation and Conformance.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024


2023
Declarative Data Flow in a Graph-Based Distributed Memory Runtime System.
Int. J. Parallel Program., 2023

Command Horizons: Coalescing Data Dependencies While Maintaining Asynchronicity.
Proceedings of the Asynchronous Many-Task Systems and Applications, 2023

Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023


An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing.
Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

2022
The Celerity High-level API: C++20 for Accelerator Clusters.
Int. J. Parallel Program., 2022

Multi-GPU room response simulation with hardware raytracing.
Concurr. Comput. Pract. Exp., 2022

On the Compilation Performance of Current SYCL Implementations.
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

Celerity: How (Well) Does the SYCL API Translate to Distributed Clusters?
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

2021
The cluster coffer: Teaching HPC on the road.
J. Parallel Distributed Comput., 2021

ndzip-gpu: efficient lossless compression of scientific floating-point data on GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

Sylkan: Towards a Vulkan Compute Target Platform for SYCL.
Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich, Germany, April, 2021

Optimizing Embedded Industrial Safety Systems Based on Time-of-flight Depth Imaging.
Proceedings of the 17th IEEE International Conference on eScience, 2021

Porting Real-World Applications to GPU Clusters: A Celerity and Cronos Case Study.
Proceedings of the 17th IEEE International Conference on eScience, 2021

ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
Proceedings of the 31st Data Compression Conference, 2021

2020
The allscale framework architecture.
Parallel Comput., 2020

AllScale toolchain pilot applications: PDE based solvers using a parallel development environment.
Comput. Phys. Commun., 2020

Datasets for Benchmarking Floating-Point Compressors.
CoRR, 2020

Running on Raygun.
CoRR, 2020

RTX-RSim: Accelerated Vulkan Room Response Simulation for Time-of-Flight Imaging.
Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing.
Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019
Static Compiler Analyses for Application-specific Optimization of Task-Parallel Runtime Systems.
J. Signal Process. Syst., 2019

Compiler Generated Progress Estimation for OpenMP Programs.
Proceedings of the Parallel Computing Technologies, 2019

Celerity: High-Level C++ for Accelerator Clusters.
Proceedings of the Euro-Par 2019: Parallel Processing, 2019

The AllScale API.
Proceedings of the 15th International Conference on eScience, 2019

2018
Dataset for "The AllScale Runtime Application Model" publication.
Dataset, July, 2018

A taxonomy of task-based parallel programming technologies for high-performance computing.
J. Supercomput., 2018

Exploring the semantic gap in compiling embedded DSLs.
Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

The AllScale Runtime Application Model.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads.
ACM Trans. Archit. Code Optim., 2017

A Taxonomy of Task-Based Technologies for High-Performance Computing.
Proceedings of the Parallel Processing and Applied Mathematics, 2017

Characterizing Performance and Cache Impacts of Code Multi-versioning on Multicore Architectures.
Proceedings of the 25th Euromicro International Conference on Parallel, 2017

Task-parallel Runtime System Optimization Using Static Compiler Analysis.
Proceedings of the Computing Frontiers Conference, 2017

2016
The AllScale Runtime Interface - Theoretical Foundation and Concept.
Proceedings of the 9th Workshop on Many-Task Computing on Clouds, 2016

A Context-Aware Primitive for Nested Recursive Parallelism.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015
On the Quality of Implementation of the C++11 Thread Support Library.
Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Application-Level Energy Awareness for OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Optimizing Task Parallelism with Library-Semantics-Aware Compilation.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014
Compiler multiversioning for automatic task granularity control.
Concurr. Comput. Pract. Exp., 2014

2013
Adaptive Granularity Control in Task Parallel Programs Using Multiversioning.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

A High-Level IR Transformation System.
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

INSPIRE: The insieme parallel intermediate representation.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
A multi-objective auto-tuning framework for parallel codes.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Automatic OpenMP Loop Scheduling: A Combined Compiler and Runtime Approach.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

2011
Automatic OpenCL Device Characterization: Guiding Optimized Kernel Design.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010
Topology-Aware OpenMP Process Scheduling.
Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

2008
GPU-Based Multigrid: Real-Time Performance in High Resolution Nonlinear Image Processing.
Proceedings of the Computer Vision Systems, 6th International Conference, 2008

