Biagio Cosenza

Sandro Fiore

Proceedings of the 3rd Special Track on Big Data and High-Performance Computing (BigHPC 2025) co-located with the 4th Italian Conference on Big Data and Data Science (ITADATA 2025), 2025

2024

Analysis and prediction of performance variability in large-scale computing systems.

Sascha Hunold

J. Supercomput., July, 2024

Out of kernel tuning and optimizations for portable large-scale docking experiments on GPUs.

J. Supercomput., May, 2024

Enabling performance portability on the LiGen drug discovery pipeline.

Andrea Rosario Beccari

Future Gener. Comput. Syst., 2024

SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs.

Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

Unlocking performance portability on LUMI-G supercomputer: A virtual screening case study.

Andrea Rosario Beccari

Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

miniLB: A Performance Portability Study of Lattice-Boltzmann Simulations.

Proceedings of the 2nd Special Track on Big Data and High-Performance Computing (BigHPC 2024) co-located with the 3rd Italian Conference on Big Data and Data Science (ITADATA 2024), 2024

MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns.

Sascha Hunold

Proceedings of the IEEE International Conference on Cluster Computing, 2024

LIGATE - LIgand Generator and portable drug discovery platform AT Exascale.

Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

2023

Improving computation efficiency using input and architecture features for a virtual screening application.

CoRR, 2023

SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving.

Proceedings of the International Conference for High Performance Computing, 2023

Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications.

Andrea Rosario Beccari

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Towards a SYCL API for Approximate Computing.

Lorenzo Carpentieri

Proceedings of the 2023 International Workshop on OpenCL, 2023

Algorithm Selection of MPI Collectives Considering System Utilization.

Sascha Hunold

Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

Tunable and Portable Extreme-Scale Drug Discovery Platform at Exascale: the LIGATE Approach.

Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing.

Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

EMPI: Enhanced Message Passing Interface in Modern C++.

Luigi Crisci

Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

2022

Celerity: How (Well) Does the SYCL API Translate to Distributed Clusters?

Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

Towards a Portable Drug Discovery Pipeline with SYCL 2020.

Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

An Analysis of Performance Variability on Dragonfly+ topology.

Proceedings of the IEEE International Conference on Cluster Computing, 2022

FLEXDP: flexible frequency scaling for energy-delay product optimization of GPU applications.

Kaijie Fan

Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

An Analysis of Long-Tailed Network Latency Distribution and Background Traffic on Dragonfly+.

Proceedings of the Benchmarking, Measuring, and Optimizing, 2022

2021

Easy and efficient agent-based simulations with the OpenABL language and compiler.

Mozhgan Kabiri Chimeh

Carmine Spagnuolo

Gennaro Cordasco

Vittorio Scarano

Future Gener. Comput. Syst., 2021

ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning.

Daniel Maier

Proceedings of the Euro-Par 2021: Parallel Processing, 2021

The Italian research on HPC key technologies across EuroHPC.

Marco Aldinucci

Giovanni Agosta

Antonio Andreini

Claudio Agostino Ardagna

Proceedings of the CF '21: Computing Frontiers Conference, 2021

2020

Vectorization cost modeling for NEON, AVX and SVE.

Perform. Evaluation, 2020

Accurate Energy and Performance Prediction for Frequency-Scaled GPU Kernels.

Kaijie Fan

Comput., 2020

SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing.

Proceedings of the IWOCL '20: International Workshop on OpenCL, 2020

SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing.

Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019

Portable Cost Modeling for Auto-Vectorizers.

Proceedings of the 27th IEEE International Symposium on Modeling, 2019

A Performance Analysis of Vector Length Agnostic Code.

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

Approximating Memory-bound Applications on Mobile GPUs.

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

Predictable GPUs Frequency Scaling for Energy and Performance.

Kaijie Fan

Proceedings of the 48th International Conference on Parallel Processing, 2019

Celerity: High-Level C++ for Accelerator Clusters.

Proceedings of the Euro-Par 2019: Parallel Processing, 2019

2018

Control Flow Vectorization for ARM NEON.

Proceedings of the 21st International Workshop on Software and Compilers for Embedded Systems, 2018

Accelerating the RICH Particle Detector Algorithm on Intel Xeon Phi.

Proceedings of the 26th Euromicro International Conference on Parallel, 2018

OpenABL: A Domain-Specific Language for Parallel and Distributed Agent-Based Simulations.

Mozhgan Kabiri Chimeh

Carmine Spagnuolo

Gennaro Cordasco

Vittorio Scarano

Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Cost Modelling for Vectorization on ARM.

Proceedings of the IEEE International Conference on Cluster Computing, 2018

Local memory-aware kernel perforation.

Daniel Maier

Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

2017

Stencil Autotuning with Ordinal Regression: Extended Abstract.

Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems, 2017

Autotuning Stencil Computations with Structural Ordinal Regression Learning.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Static optimization in PHP 7.

Proceedings of the 26th International Conference on Compiler Construction, 2017

2016

An evaluation of current SIMD programming models for C++.

Mauricio Alvarez-Mesa

Chi Ching Chi

Proceedings of the 3rd Workshop on Programming Models for SIMD/Vector Processing, 2016

2015

Spectral turning bands for efficient Gaussian random fields generation on GPUs and accelerators.

Concurr. Comput. Pract. Exp., 2015

Point Distribution Tensor Computation on Heterogeneous Systems.

Proceedings of the International Conference on Computational Science, 2015

Automatic Data Layout Optimizations for GPUs.

Klaus Kofler

Thomas Fahringer

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Behavioral Spherical Harmonics for Long-Range Agents' Interaction.

Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

2014

A uniform approach for programming distributed heterogeneous computing systems.

J. Parallel Distributed Comput., 2014

Kd-Tree Based N-Body Simulations with Volume-Mass Heuristic on the GPU.

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Random Fields Generation on the GPU with the Spectral Turning Bands Method.

Proceedings of the Euro-Par 2014: Parallel Processing, 2014

2013

Automatic problem size sensitive task partitioning on heterogeneous parallel systems.

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

An automatic input-sensitive approach for heterogeneous task partitioning.

Proceedings of the International Conference on Supercomputing, 2013

LibWater: heterogeneous distributed computing made easy.

Proceedings of the International Conference on Supercomputing, 2013

GPU Cost Estimation for Load Balancing in Parallel Ray Tracing.

Carsten Dachsbacher

Ugo Erra

Proceedings of the GRAPP & IVAPP 2013: Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications, 2013

2011

Efficient distributed load balancing for parallel algorithms.

PhD thesis, 2011

Distributed Load Balancing for Parallel Agent-Based Simulations.
