José Gracia

Orcid: 0000-0002-8925-6592

According to our database, José Gracia authored at least 34 papers between 2011 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
The EU Center of Excellence for Exascale in Solid Earth (ChEESE): Implementation, results, and roadmap for the second phase.
Future Gener. Comput. Syst., 2023

2021
Callback-based completion notification using MPI Continuations.
Parallel Comput., 2021

Quo Vadis MPI RMA? Towards a More Efficient Use of MPI One-Sided Communication.
CoRR, 2021

Feasibility Study of Molecular Dynamics Kernels Exploitation Using EngineCL.
Proceedings of the Euro-Par 2021: Parallel Processing Workshops, 2021

2020
DASH: Distributed Data Structures and Parallel Algorithms in a Global Address Space.
Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020

Collectives in hybrid MPI+MPI code: Design, practice and performance.
Parallel Comput., 2020

Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU.
Future Gener. Comput. Syst., 2020

Fibers are not (P)Threads: The Case for Loose Coupling of Asynchronous Programming Models and MPI Through Continuations.
Proceedings of the EuroMPI/USA '20: 27th European MPI Users' Group Meeting, 2020

2019
Global Task Data-Dependencies in PGAS Applications.
Proceedings of the High Performance Computing - 34th International Conference, 2019

MPI Collectives for Multi-core Clusters: Optimized Performance of the Hybrid MPI+MPI Parallel Codes.
Proceedings of the 48th International Conference on Parallel Processing, 2019

2018
The Impact of Taskyield on the Design of Tasks Communicating Through MPI.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

2017
Application Productivity and Performance Evaluation of Transparent Locality-aware One-sided Communication Primitives.
Int. J. Netw. Comput., 2017

Patterns for OpenMP Task Data Dependency Overhead Measurements.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

2016
HPC Benchmarking: Problem Size Matters.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Asynchronous Progress Design for a MPI-Based PGAS One-Sided Communication System.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Towards Performance Portability through Locality-Awareness for Applications Using One-Sided Communication Primitives.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016

2015
DART-MPI: An MPI-based Implementation of a PGAS Runtime System.
CoRR, 2015

CppSs - a C++ Library for Efficient Task Parallelism.
CoRR, 2015

A Bandwidth-Saving Optimization for MPI Broadcast Collective Operation.
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

Providing Parallel Debugging for DASH Distributed Data Structures with GDB.
Proceedings of the International Conference on Computational Science, 2015

Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014
Performance Modeling of the HPCG Benchmark.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

DART-MPI: An MPI-based Implementation of a PGAS Runtime System.
Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

DASH: Data Structures and Algorithms with Support for Hierarchical Locality.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

2013
Programmability and portability for exascale: Top down programming methodology and tools with StarSs.
J. Comput. Sci., 2013

Cudagrind: Memory-Usage Checking for CUDA.
Proceedings of the Tools for High Performance Computing 2013, 2013

POLCA - A Programming Model for Large Scale, Strongly Heterogeneous Infrastructures.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Cudagrind: A Valgrind Extension for CUDA.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

2012
Task Debugging with TEMANEJO.
Proceedings of the Tools for High Performance Computing 2012, 2012

Avoiding Serialization Effects in Data / Dependency Aware Task Parallel Algorithms for Spatial Decomposition.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Hybrid MPI/StarSs - A Case Study.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Scheduling Overheads for Task-Based Parallel Programming Models.
Proceedings of the Facing the Multicore-Challenge, 2012

2011
Temanejo: Debugging of Thread-Based Task-Parallel Programs in StarSS.
Proceedings of the Tools for High Performance Computing 2011, 2011

TEMANEJO - a debugger for task based parallel programming models.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

