Carlo Bertolli

Orcid: 0009-0006-6852-1445

According to our database, Carlo Bertolli authored at least 45 papers between 2008 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Porting HPC Applications to AMD Instinct™ MI300A Using Unified Memory and OpenMP.

[DOI]

,

Leopold Grinberg

,

Gheorghe-Teodor Bercea

,

,

,

,

Nicholas Malaya

CoRR, 2024

Porting HPC Applications to AMD Instinct™ MI300A using Unified Memory and OpenMP®.

[DOI]

,

Leopold Grinberg

,

Gheorghe-Teodor Bercea

,

,

,

,

Nicholas Malaya

Proceedings of the ISC High Performance 2024 Research Paper Proceedings (39th International Conference), 2024

Performance Analysis of Runtime Handling of Zero-Copy for OpenMP Programs on MI300A APUs.

[DOI]

,

,

,

Nicole Aschenbrenner

,

Jan-Patrick Lehr

,

,

Dhruva R. Chakrabarti

,

Lawrence Meadows

,

Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

2023

Specialized Kernels for Optimizing GPU Offload in OpenMP.

[DOI]

Dhruva R. Chakrabarti

,

Gregory Rodgers

,

,

Gheorghe-Teodor Bercea

,

Jan-Patrick Lehr

,

,

,

,

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

2020

An open-source solution to performance portability for Summit and Sierra supercomputers.

[DOI]

Gheorghe-Teodor Bercea

,

,

Alexandre E. Eichenberger

,

,

John K. O'Brien

IBM J. Res. Dev., 2020

2017

Implementing implicit OpenMP data sharing on GPUs.

[DOI]

Gheorghe-Teodor Bercea

,

,

Arpith C. Jacob

,

Alexandre E. Eichenberger

,

,

,

,

,

Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, 2017

Hands on with OpenMP4.5 and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU Systems (Part II).

[DOI]

Leopold Grinberg

,

,

Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Hands on with OpenMP4.5 and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU Systems (Part I).

[DOI]

Leopold Grinberg

,

,

Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Efficient Fork-Join on GPUs Through Warp Specialization.

[DOI]

Arpith Chacko Jacob

,

Alexandre E. Eichenberger

,

,

Samuel F. Antão

,

Gheorghe-Teodor Bercea

,

,

,

,

,

,

,

Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016

Acceleration of a Full-Scale Industrial CFD Application with OP2.

[DOI]

István Z. Reguly

,

Gihan R. Mudalige

,

,

Michael B. Giles

,

,

Paul H. J. Kelly

,

IEEE Trans. Parallel Distributed Syst., 2016

Performance Analysis and Optimization of Clang's OpenMP 4.5 GPU Support.

[DOI]

,

Simon McIntosh-Smith

,

,

Arpith C. Jacob

,

Samuel F. Antão

,

Alexandre E. Eichenberger

,

Gheorghe-Teodor Bercea

,

,

,

,

,

,

Proceedings of the 7th International Workshop on Performance Modeling, 2016

Offloading Support for OpenMP in Clang and LLVM.

[DOI]

Samuel F. Antão

,

,

Arpith C. Jacob

,

Gheorghe-Teodor Bercea

,

Alexandre E. Eichenberger

,

,

,

,

,

,

,

,

,

Proceedings of the Third Workshop on the LLVM Compiler Infrastructure in HPC, 2016

Early Experiences Porting Three Applications to OpenMP 4.5.

[DOI]

,

,

Arpith C. Jacob

,

Samuel F. Antão

,

Gheorghe-Teodor Bercea

,

,

Bronis R. de Supinski

,

Erik W. Draeger

,

Alexandre E. Eichenberger

,

,

,

,

David Poliakoff

,

David F. Richards

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

2015

Active Memory Cube: A processing-in-memory architecture for exascale systems.

[DOI]

IBM J. Res. Dev., 2015

Integrating GPU support for OpenMP offloading directives into Clang.

[DOI]

,

,

Gheorghe-Teodor Bercea

,

Arpith C. Jacob

,

Alexandre E. Eichenberger

,

,

,

,

,

David Appelhans

,

Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 2015

Performance analysis of OpenMP on a GPU using a CORAL proxy application.

[DOI]

Gheorghe-Teodor Bercea

,

,

Samuel F. Antão

,

Arpith C. Jacob

,

Alexandre E. Eichenberger

,

,

,

,

,

David Appelhans

,

Proceedings of the 6th International Workshop on Performance Modeling, 2015

Progressive Codesign of an Architecture and Compiler Using a Proxy Application.

[DOI]

Arpith C. Jacob

,

,

,

,

,

,

,

Proceedings of the 27th International Symposium on Computer Architecture and High Performance Computing, 2015

Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach.

[DOI]

Arpith C. Jacob

,

,

Alexandre E. Eichenberger

,

,

,

,

,

,

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Data access optimization in a processing-in-memory system.

[DOI]

,

Arpith C. Jacob

,

,

Bryan S. Rosenburg

,

Olivier Sallenave

,

,

,

José R. Brunheroto

,

,

,

Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

2014

Coordinating GPU threads for OpenMP 4.0 in LLVM.

[DOI]

,

,

Alexandre E. Eichenberger

,

,

,

Arpith C. Jacob

,

,

Olivier Sallenave

Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

Generalizing Run-Time Tiling with the Loop Chain Abstraction.

[DOI]

Michelle Mills Strout

,

,

Christopher D. Krieger

,

,

Gheorghe-Teodor Bercea

,

Catherine Olschanowsky

,

,

Paul H. J. Kelly

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

2013

Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems.

[DOI]

Gihan R. Mudalige

,

,

Jeyarajan Thiyagalingam

,

,

,

Paul H. J. Kelly

,

Anne E. Trefethen

Parallel Comput., 2013

Designing OP2 for GPU architectures.

[DOI]

,

Gihan R. Mudalige

,

,

,

J. Parallel Distributed Comput., 2013

Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS.

[DOI]

Graham R. Markall

,

Florian Rathgeber

,

Lawrence Mitchell

,

Nicolas Loriant

,

,

,

Paul H. J. Kelly

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Loop Chaining: A Programming Abstraction for Balancing Locality and Parallelism.

[DOI]

Christopher D. Krieger

,

Michelle Mills Strout

,

Catherine Olschanowsky

,

,

Stephen M. Guzik

,

,

,

Paul H. J. Kelly

,

Gihan R. Mudalige

,

Brian van Straalen

,

Samuel Williams

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012

Predictive modeling and analysis of OP2 on distributed memory GPU clusters.

[DOI]

Gihan R. Mudalige

,

,

,

Paul H. J. Kelly

SIGMETRICS Perform. Evaluation Rev., 2012

PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes.

[DOI]

Florian Rathgeber

,

Graham R. Markall

,

Lawrence Mitchell

,

Nicolas Loriant

,

,

,

Paul H. J. Kelly

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

An Analytical Study of Loop Tiling for a Large-Scale Unstructured Mesh Application.

[DOI]

,

Gihan R. Mudalige

,

,

Paul H. J. Kelly

,

,

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Compiler Optimizations for Industrial Unstructured Mesh CFD Applications on GPUs.

[DOI]

,

,

Nicolas Loriant

,

Gihan R. Mudalige

,

,

,

,

Paul H. J. Kelly

Proceedings of the Languages and Compilers for Parallel Computing, 2012

Mesh independent loop fusion for unstructured mesh applications.

[DOI]

,

,

Paul H. J. Kelly

,

Gihan R. Mudalige

,

Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011

Fault tolerance for data parallel programs.

[DOI]

,

Marco Vanneschi

Concurr. Comput. Pract. Exp., 2011

Consistent reconfiguration protocols for adaptive high-performance applications.

[DOI]

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the 7th International Wireless Communications and Mobile Computing Conference, 2011

Consistent Rollback Protocols for Autonomic ASSISTANT Applications.

[DOI]

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Design and Performance of the OP2 Library for Unstructured Mesh Applications.

[DOI]

,

,

Gihan R. Mudalige

,

,

Paul H. J. Kelly

Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

2010

An Approach to Mobile Grid Platforms for the Development and Support of Complex Ubiquitous Applications.

[DOI]

,

,

Gabriele Mencagli

,

Marco Vanneschi

Int. J. Adv. Pervasive Ubiquitous Comput., 2010

Analyzing Memory Requirements for Pervasive Grid Applications.

[DOI]

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the 18th Euromicro Conference on Parallel, 2010

Enabling replication in the ASSISTANT programming model.

[DOI]

,

Marco Vanneschi

,

,

Francesco Quaglia

Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

An integrated communication-computing solution in emergency management.

[DOI]

,

,

Romano Fantacci

,

Marco Vanneschi

,

Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

Resource discovery support for time-critical adaptive applications.

[DOI]

,

,

Gabriele Mencagli

,

Massimo Torquati

,

Marco Vanneschi

,

Matteo Mordacchini

,

Franco Maria Nardini

Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

A cost model for autonomic reconfigurations in high-performance pervasive applications.

[DOI]

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the 4th ACM International Workshop on Context-Awareness for Self-Managing Systems, 2010

2009

Next generation grids and wireless communication networks: towards a novel integrated approach.

[DOI]

Romano Fantacci

,

Marco Vanneschi

,

,

Gabriele Mencagli

,

Wirel. Commun. Mob. Comput., 2009

Optimized Checkpointing Protocols for Data Parallel Programs.

[DOI]

,

Marco Vanneschi

Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Adaptivity in Risk and Emergency Management Applications on Pervasive Grids.

[DOI]

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the 10th International Symposium on Pervasive Systems, 2009

Expressing Adaptivity and Context Awareness in the ASSISTANT Programming Model.

[DOI]

,

,

Gabriele Mencagli

,

Marco Vanneschi

Proceedings of the Autonomic Computing and Communications Systems, 2009

2008

Fault Tolerance for High-Performance Applications Using Structured Parallelism Models.

[DOI]

PhD thesis, 2008
