Peter Thoman

Orcid: 0000-0002-4028-7451

According to our database, Peter Thoman authored at least 49 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SimSYCL: A SYCL Implementation Targeting Development, Debugging, Simulation and Conformance.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

2023
Declarative Data Flow in a Graph-Based Distributed Memory Runtime System.
Int. J. Parallel Program., 2023

Command Horizons: Coalescing Data Dependencies While Maintaining Asynchronicity.
Proceedings of the Asynchronous Many-Task Systems and Applications, 2023

Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023


An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing.
Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

2022
The Celerity High-level API: C++20 for Accelerator Clusters.
Int. J. Parallel Program., 2022

Multi-GPU room response simulation with hardware raytracing.
Concurr. Comput. Pract. Exp., 2022

On the Compilation Performance of Current SYCL Implementations.
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

Celerity: How (Well) Does the SYCL API Translate to Distributed Clusters?
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

2021
The cluster coffer: Teaching HPC on the road.
J. Parallel Distributed Comput., 2021

ndzip-gpu: efficient lossless compression of scientific floating-point data on GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

Sylkan: Towards a Vulkan Compute Target Platform for SYCL.
Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich, Germany, April 2021, 2021

Optimizing Embedded Industrial Safety Systems Based on Time-of-flight Depth Imaging.
Proceedings of the 17th IEEE International Conference on eScience, 2021

Porting Real-World Applications to GPU Clusters: A Celerity and Cronos Case Study.
Proceedings of the 17th IEEE International Conference on eScience, 2021

ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data.
Proceedings of the 31st Data Compression Conference, 2021

2020
The allscale framework architecture.
Parallel Comput., 2020

AllScale toolchain pilot applications: PDE based solvers using a parallel development environment.
Comput. Phys. Commun., 2020

Datasets for Benchmarking Floating-Point Compressors.
CoRR, 2020

Running on Raygun.
CoRR, 2020

AllScale API.
Comput. Informatics, 2020

RTX-RSim: Accelerated Vulkan Room Response Simulation for Time-of-Flight Imaging.
Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing.
Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019
Static Compiler Analyses for Application-specific Optimization of Task-Parallel Runtime Systems.
J. Signal Process. Syst., 2019

Compiler Generated Progress Estimation for OpenMP Programs.
Proceedings of the Parallel Computing Technologies, 2019

Celerity: High-Level C++ for Accelerator Clusters.
Proceedings of the Euro-Par 2019: Parallel Processing, 2019

2018
A taxonomy of task-based parallel programming technologies for high-performance computing.
J. Supercomput., 2018

Exploring the semantic gap in compiling embedded DSLs.
Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

The AllScale Runtime Application Model.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads.
ACM Trans. Archit. Code Optim., 2017

A Taxonomy of Task-Based Technologies for High-Performance Computing.
Proceedings of the Parallel Processing and Applied Mathematics, 2017

Characterizing Performance and Cache Impacts of Code Multi-versioning on Multicore Architectures.
Proceedings of the 25th Euromicro International Conference on Parallel, 2017

Task-parallel Runtime System Optimization Using Static Compiler Analysis.
Proceedings of the Computing Frontiers Conference, 2017

2016
The AllScale Runtime Interface - Theoretical Foundation and Concept.
Proceedings of the 9th Workshop on Many-Task Computing on Clouds, 2016

A Context-Aware Primitive for Nested Recursive Parallelism.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015
On the Quality of Implementation of the C++11 Thread Support Library.
Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Application-Level Energy Awareness for OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Optimizing Task Parallelism with Library-Semantics-Aware Compilation.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014
Compiler multiversioning for automatic task granularity control.
Concurr. Comput. Pract. Exp., 2014

2013
Adaptive Granularity Control in Task Parallel Programs Using Multiversioning.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

A High-Level IR Transformation System.
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

INSPIRE: The insieme parallel intermediate representation.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
A multi-objective auto-tuning framework for parallel codes.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Automatic OpenMP Loop Scheduling: A Combined Compiler and Runtime Approach.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

2011
Automatic OpenCL Device Characterization: Guiding Optimized Kernel Design.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010
Topology-Aware OpenMP Process Scheduling.
Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

2008
GPU-Based Multigrid: Real-Time Performance in High Resolution Nonlinear Image Processing.
Proceedings of the Computer Vision Systems, 6th International Conference, 2008

