Carlo Bertolli

ORCID: 0009-0006-6852-1445

According to our database, Carlo Bertolli authored at least 42 papers between 2008 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Specialized Kernels for Optimizing GPU Offload in OpenMP.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

2020
An open-source solution to performance portability for Summit and Sierra supercomputers.
IBM J. Res. Dev., 2020

2017
Implementing implicit OpenMP data sharing on GPUs.
Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, 2017

Hands on with OpenMP4.5 and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU Systems (Part II).
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Hands on with OpenMP4.5 and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU Systems (Part I).
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Efficient Fork-Join on GPUs Through Warp Specialization.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016
Acceleration of a Full-Scale Industrial CFD Application with OP2.
IEEE Trans. Parallel Distributed Syst., 2016

Performance Analysis and Optimization of Clang's OpenMP 4.5 GPU Support.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Offloading Support for OpenMP in Clang and LLVM.
Proceedings of the Third Workshop on the LLVM Compiler Infrastructure in HPC, 2016


2015
Active Memory Cube: A processing-in-memory architecture for exascale systems.
IBM J. Res. Dev., 2015

Integrating GPU support for OpenMP offloading directives into Clang.
Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 2015

Performance analysis of OpenMP on a GPU using a CORAL proxy application.
Proceedings of the 6th International Workshop on Performance Modeling, 2015

Progressive Codesign of an Architecture and Compiler Using a Proxy Application.
Proceedings of the 27th International Symposium on Computer Architecture and High Performance Computing, 2015

Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Data access optimization in a processing-in-memory system.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

2014
Coordinating GPU threads for OpenMP 4.0 in LLVM.
Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

Generalizing Run-Time Tiling with the Loop Chain Abstraction.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

2013
Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems.
Parallel Comput., 2013

Designing OP2 for GPU architectures.
J. Parallel Distributed Comput., 2013

Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Loop Chaining: A Programming Abstraction for Balancing Locality and Parallelism.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Predictive modeling and analysis of OP2 on distributed memory GPU clusters.
SIGMETRICS Perform. Evaluation Rev., 2012

PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

An Analytical Study of Loop Tiling for a Large-Scale Unstructured Mesh Application.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Compiler Optimizations for Industrial Unstructured Mesh CFD Applications on GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2012

Mesh independent loop fusion for unstructured mesh applications.
Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011
Fault tolerance for data parallel programs.
Concurr. Comput. Pract. Exp., 2011

Consistent reconfiguration protocols for adaptive high-performance applications.
Proceedings of the 7th International Wireless Communications and Mobile Computing Conference, 2011

Consistent Rollback Protocols for Autonomic ASSISTANT Applications.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Design and Performance of the OP2 Library for Unstructured Mesh Applications.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

2010
An Approach to Mobile Grid Platforms for the Development and Support of Complex Ubiquitous Applications.
Int. J. Adv. Pervasive Ubiquitous Comput., 2010

Analyzing Memory Requirements for Pervasive Grid Applications.
Proceedings of the 18th Euromicro Conference on Parallel, 2010

Enabling replication in the ASSISTANT programming model.
Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

An integrated communication-computing solution in emergency management.
Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

Resource discovery support for time-critical adaptive applications.
Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, 2010

A cost model for autonomic reconfigurations in high-performance pervasive applications.
Proceedings of the 4th ACM International Workshop on Context-Awareness for Self-Managing Systems, 2010

2009
Next generation grids and wireless communication networks: towards a novel integrated approach.
Wirel. Commun. Mob. Comput., 2009

Optimized Checkpointing Protocols for Data Parallel Programs.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Adaptivity in Risk and Emergency Management Applications on Pervasive Grids.
Proceedings of the 10th International Symposium on Pervasive Systems, 2009

Expressing Adaptivity and Context Awareness in the ASSISTANT Programming Model.
Proceedings of the Autonomic Computing and Communications Systems, 2009

2008
Fault Tolerance for High-Performance Applications Using Structured Parallelism Models.
PhD thesis, 2008

