Simon Garcia De Gonzalo

Orcid: 0000-0002-5699-1793

According to our database, Simon Garcia De Gonzalo authored at least 24 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CommBench: Micro-Benchmarking Hierarchical Networks with Multi-GPU, Multi-NIC Nodes.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code.
CoRR, 2023

Few-shot HPC application runtime prediction.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code.
Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 2023

2022
MemXCT: Design, Optimization, Scaling, and Reproducibility of X-Ray Tomography Imaging.
IEEE Trans. Parallel Distributed Syst., 2022

An efficient GPU implementation and scaling for higher-order 3D stencils.
Inf. Sci., 2022

OmpSs-2 and OpenACC Interoperation.
Proceedings of the 9th Workshop on Accelerator Programming Using Directives, 2022

Towards OmpSs-2 and OpenACC interoperation.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022

2021
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

2020
Techniques for enabling GPU code generation of low-level optimizations and dynamic parallelism from high-level abstractions.
PhD thesis, 2020

Petascale XCT: 3D image reconstruction with hierarchical communications on multi-GPU nodes.
Proceedings of the International Conference for High Performance Computing, 2020

Workshop 8: AsHES Accelerators and Hybrid Exascale Systems.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

2019
Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

MemXCT: memory-centric X-ray CT reconstruction with massive parallelization.
Proceedings of the International Conference for High Performance Computing, 2019

DeepStore: In-Storage Acceleration for Intelligent Queries.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

An Efficient GPU Implementation Technique for Higher-Order 3D Stencils.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function-as-a-Service.
Proceedings of the 12th IEEE International Conference on Cloud Computing, 2019

2018
TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments.
CoRR, 2018

2017
Chai: Collaborative heterogeneous applications for integrated-architectures.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

Rebooting the Data Access Hierarchy of Computing Systems.
Proceedings of the IEEE International Conference on Rebooting Computing, 2017

Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures.
Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017

2015
Enhancing the Usability and Utilization of Accelerated Architectures via Docker.
Proceedings of the 8th IEEE/ACM International Conference on Utility and Cloud Computing, 2015

2013
Thermal aware automated load balancing for HPC applications.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013
