Johannes Doerfert

Proceedings of the OpenMP: Balancing Productivity and Performance Portability, 2025

Predicting Performance for OpenMP GPU Parameter Choices.

Proceedings of the OpenMP: Balancing Productivity and Performance Portability, 2025

2024

ComPile: A Large IR Dataset from Production Sources.

Aiden Grossman

Ludger Paehler

Tal Ben-Nun

Jacob Hegna

William S. Moses

Mircea Trofin

J. Data-centric Mach. Learn. Res., 2024

Input-Gen: Guided Generation of Stateful Inputs for Testing, Tuning, and Training.

CoRR, 2024

Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs.

CoRR, 2024

Evaluating Tuning Opportunities of the LLVM/OpenMP Runtime.

Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations.

Sunita Chandrasekaran

Proceedings of the Advancing OpenMP for Future Accelerators, 2024

Automatic Parallelization and OpenMP Offloading of Fortran Array Notation.

Proceedings of the Advancing OpenMP for Future Accelerators, 2024

Leveraging LLVM OpenMP GPU Offload Optimizations for Kokkos Applications.

Proceedings of the 31st IEEE International Conference on High Performance Computing, 2024

2023

Quantum Task Offloading with the OpenMP API.

CoRR, 2023

GPU First - Execution of Legacy CPU Codes on GPUs.

CoRR, 2023

OpenMP Kernel Language Extensions for Performance Portable GPU Codes.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Precision and Performance Analysis of C Standard Math Library Functions on GPUs.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Memory Transfer Decomposition: Exploring Smart Data Movement Through Architecture-Aware Strategies.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay.

Proceedings of the International Conference for High Performance Computing, 2023

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs.

Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload.

Proceedings of the OpenMP: Advanced Task-Based, Device and Compiler Programming, 2023

The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned.

Damien Lebrun-Grandié

Proceedings of the OpenMP: Advanced Task-Based, Device and Compiler Programming, 2023

Maximizing Parallelism and GPU Utilization For Direct GPU Compilation Through Ensemble Execution.

Proceedings of the 52nd International Conference on Parallel Processing Workshops, 2023

Implementing OpenMP's SIMD Directive in LLVM's GPU Runtime.

Sunita Chandrasekaran

Proceedings of the 52nd International Conference on Parallel Processing, 2023

ORAQL - Optimistic Responses to Alias Queries in LLVM.

Jan Hückelheim

Proceedings of the 52nd International Conference on Parallel Processing, 2023

SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

OpenMP application experiences: Porting to accelerated nodes.

Parallel Comput., 2022

MARTINI: The Little Match and Replace Tool for Automatic Code Rewriting.

J. Open Source Softw., 2022

Remote OpenMP Offloading.

Atmn Patel

Sri Hari Krishna Narayanan

Proceedings of the High Performance Computing - 37th International Conference, 2022

Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation.

William S. Moses

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Piper: Pipelining OpenMP Offloading Execution Through Compiler Optimization For Performance.

Giorgis Georgakoudis

Ignacio Laguna

Thomas R. W. Scogland

Proceedings of the IEEE/ACM International Workshop on Performance, 2022

Direct GPU Compilation and Execution for Host Applications with OpenMP Parallelism.

Joseph Huber

Rafael A. Herrera Guaitero

Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Automatic Asynchronous Execution of Synchronously Offloaded OpenMP Target Regions.

Thomas Applencourt

Xiaoming Li

Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Just-in-Time Compilation and Link-Time Optimization for OpenMP Target Offloading.

Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Towards Automatic OpenMP-Aware Utilization of Fast GPU Memory.

Delaram Talaashrafi

Marc Moreno Maza

Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Towards Efficient Remote OpenMP Offloading.

Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution.

Giorgis Georgakoudis

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

A Pipeline Pattern Detection Technique in Polly.

Delaram Talaashrafi

Marc Moreno Maza

Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022

MARTINI: The Little Match and Replace Tool for Automatic Application Rewriting with Code Examples.

Proceedings of the Euro-Par 2022: Parallel Processing, 2022

Efficient Execution of OpenMP on GPUs.

Kuter Dinel

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer.

Sri Hari Krishna Narayanan

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

Reverse-mode automatic differentiation and optimization of GPU kernels via Enzyme.

Michel Schanen

Proceedings of the International Conference for High Performance Computing, 2021

Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1.

Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation.

Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II).

Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).

Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

Spray: Sparse Reductions of Arrays in OpenMP.

Jan Hückelheim

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

A Virtual GPU as Developer-Friendly OpenMP Offload Target.

Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

Towards Compile-Time-Reducing Compiler Optimization Selection via Machine Learning.

Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

Advancing OpenMP Offload Debugging Capabilities in LLVM.

Joseph Huber

Melanie Cornelius

Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

2020

Really Embedding Domain-Specific Languages into C++.

Alexander J. McCaskey

Tobi Popoola

Dmitry I. Lyakh

CoRR, 2020

Concurrent Execution of Deferred OpenMP Target Tasks with Hidden Helper Threads.

Proceedings of the Languages and Compilers for Parallel Computing, 2020

FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis.

Giorgis Georgakoudis

Ignacio Laguna

Thomas R. W. Scogland

Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

Automated Partitioning of Data-Parallel Kernels using Polyhedral Compilation.

Alexander Matz

Holger Fröning

Proceedings of the ICPP Workshops '20, Edmonton, AB, Canada, August 17-20, 2020

2019

Performance Exploration Through Optimistic Static Program Annotations.

Brian Homerding

Proceedings of the High Performance Computing - 34th International Conference, 2019

The TRegion Interface and Compiler Optimizations for OpenMP Target Regions.

Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

2018

Applicable and sound polyhedral optimization of low-level programs.

PhD thesis, 2018

Compiler Optimizations for Parallel Programs.

Proceedings of the Languages and Compilers for Parallel Computing, 2018

Compiler Optimizations for OpenMP.

Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

Polyhedral expression propagation.

Shrey Sharma

Sebastian Hack

Proceedings of the 27th International Conference on Compiler Construction, 2018

2017

Optimistic loop optimization.

Tobias Grosser

Sebastian Hack

Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016

Input space splitting for OpenCL.

Simon Moll

Sebastian Hack

Proceedings of the 25th International Conference on Compiler Construction, 2016

2015

Generalized Task Parallelism.

ACM Trans. Archit. Code Optim., 2015

Polly's Polyhedral Scheduling in the Presence of Reductions.

CoRR, 2015

Runtime pointer disambiguation.

Péricles Alves

Fabian Gruber

Fernando Magno Quintão Pereira

Alexandros Lamprineas

Tobias Grosser

Fabrice Rastello

Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, 2015

2014

Architecture-parametric timing analysis.

Jan Reineke