Brice Goglin

Orcid: 0000-0002-8671-4615

According to our database, Brice Goglin authored at least 57 papers between 2004 and 2023.

Bibliography

2023
H2M: Exploiting Heterogeneous Shared Memory Architectures.
Future Gener. Comput. Syst., November, 2023

A survey of software techniques to emulate heterogeneous memory systems in high-performance computing.
Parallel Comput., 2023

2022
Towards EXtreme scale technologies and accelerators for euROhpc hw/Sw supercomputing applications for exascale: The TEXTAROSSA approach.
Microprocess. Microsystems, November, 2022

Using Performance Attributes for Managing Heterogeneous Memory in HPC Applications.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Relative Performance Projection on Arm Architectures.
Proceedings of the Euro-Par 2022: Parallel Processing, 2022

H2M: Towards Heuristics for Heterogeneous Memory.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
Profiles of Upcoming HPC Applications and Their Impact on Reservation Strategies.
IEEE Trans. Parallel Distributed Syst., 2021

Using Bandwidth Throttling to Quantify Application Sensitivity to Heterogeneous Memory.
Proceedings of the IEEE/ACM Workshop on Memory Centric High Performance Computing, 2021


2020
Application-Driven Requirements for Node Resource Management in Next-Generation Systems.
Proceedings of the 2020 IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers, 2020

Reservation and Checkpointing Strategies for Stochastic Jobs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
Modeling Non-Uniform Memory Access on Large Compute Nodes with the Cache-Aware Roofline Model.
IEEE Trans. Parallel Distributed Syst., 2019

Modeling high-throughput applications for in situ analytics.
Int. J. High Perform. Comput. Appl., 2019

Co-scheduling HPC workloads on cache-partitioned CMP platforms.
Int. J. High Perform. Comput. Appl., 2019

M&MMs: navigating complex memory spaces with hwloc.
Proceedings of the International Symposium on Memory Systems, 2019

Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Opportunities for Partitioning Non-volatile Memory DIMMs Between Co-scheduled Jobs on HPC Nodes.
Proceedings of the Euro-Par 2019: Parallel Processing Workshops, 2019

2018
Hardware topology management in MPI applications through hierarchical communicators.
Parallel Comput., 2018

Memory Footprint of Locality Information on Many-Core Platforms.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

2017
Modeling Large Compute Nodes with Heterogeneous Memories with Cache-Aware Roofline Model.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

On the Overhead of Topology Discovery for Locality-Aware Scheduling in HPC.
Proceedings of the 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2017

Netloc: A Tool for Topology-Aware Process Mapping.
Proceedings of the Euro-Par 2017: Parallel Processing Workshops, 2017

2016
Exposing the Locality of Heterogeneous Memory Architectures to HPC Applications.
Proceedings of the Second International Symposium on Memory Systems, 2016

2015
A Topology-Aware Performance Monitoring Tool for Shared Resource Management in Multicore Systems.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

2014
Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Netloc: Towards a Comprehensive View of the HPC System Topology.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

A benchmark-based performance model for memory-bound HPC applications.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

Managing the topology of heterogeneous cluster nodes with hardware locality (hwloc).
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

MBSPDiscover: An Automatic Benchmark for MultiBSP Performance Analysis.
Proceedings of the High Performance Computing - First HPCLATAM, 2014

Vers des mécanismes génériques de communication et une meilleure maîtrise des affinités dans les grappes de calculateurs hiérarchiques. (Towards generic Communication Mechanisms and better Affinity Management in Clusters of Hierarchical Nodes).
, 2014

2013
KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework.
J. Parallel Distributed Comput., 2013

2011
High-performance message-passing over generic Ethernet hardware with Open-MX.
Parallel Comput., 2011

NIC-assisted cache-efficient receive stack for message passing over Ethernet.
Concurr. Comput. Pract. Exp., 2011

Dodging Non-uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs.
Proceedings of the International Conference on Parallel Processing, 2011

Introduction.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010
ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures.
Int. J. Parallel Program., 2010

Adaptive MPI Multirail Tuning for Non-uniform Input/Output Access.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications.
Proceedings of the 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010

Optimizing MPI communication within large multicore nodes with kernel assistance.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Structuring the execution of OpenMP applications for multicore architectures.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

2009
High Throughput Intra-Node MPI Communication with Open-MX.
Proceedings of the 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2009

Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective.
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Enabling high-performance memory migration for multithreaded applications on LINUX.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Decoupling memory pinning from the application with overlapped on-demand pinning and MMU notifiers.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis.
Proceedings of the ICPP 2009, 2009

Finding a tradeoff between host interrupt load and MPI latency over Ethernet.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2008
Interaction efficace entre les réseaux rapides et le stockage distribué dans les grappes de calcul.
Tech. Sci. Informatiques, 2008

Design and implementation of Open-MX: High-performance message passing over generic Ethernet hardware.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Improving message passing over Ethernet with I/OAT copy offload in Open-MX.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007
An Efficient OpenMP Runtime System for Hierarchical Arch
CoRR, 2007

An Efficient OpenMP Runtime System for Hierarchical Architectures.
Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007

2005
Réseaux rapides et stockage distribué dans les grappes de calculateurs : propositions pour une interaction efficace.
PhD thesis, 2005

An Efficient Network API for in-Kernel Applications in Clusters.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

2004
Transparent Remote File Access Through a Shared Library Client.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

Optimizations of Client's Side Communications in a Distributed File System within a Myrinet Cluster.
Proceedings of the 29th Annual IEEE Conference on Local Computer Networks (LCN 2004), 2004

Performance Analysis of Remote File System Access over a High-Speed Local Network.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

