Aurelien Bouteiller

Orcid: 0000-0001-5108-509X

According to our database, Aurelien Bouteiller authored at least 68 papers between 2002 and 2023.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2023

Elastic deep learning through resilient collective operations.

[…], […], Aurelien Bouteiller, […]

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

2022

Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms.

[…], Aurélien Bouteiller, Thomas Hérault, Valentin Le Fèvre, […], Jack J. Dongarra

Int. J. Netw. Comput., 2022

Implicit Actions and Non-blocking Failure Recovery with MPI.

Aurelien Bouteiller, […]

Proceedings of the 12th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2022

Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach.

Matthew Whitlock, Nicolas Morales, […], Aurelien Bouteiller, […], Keita Teranishi, […], […]

Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021

Revisiting Credit Distribution Algorithms for Distributed Termination Detection.

[…], Aurélien Bouteiller, Thomas Hérault, Valentin Le Fèvre, […], Jack J. Dongarra

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

2020

Overhead of using spare nodes.

[…], Kazumi Yoshinaga, Thomas Hérault, Aurelien Bouteiller, […], Yutaka Ishikawa

Int. J. High Perform. Comput. Appl., 2020

Fault tolerance of MPI applications in exascale systems: The ULFM solution.

[…], Patricia González, María J. Martín, […], Aurélien Bouteiller, Keita Teranishi

Future Gener. Comput. Syst., 2020

Flexible Data Redistribution in a Task-Based Runtime System.

[…], […], […], […], Aurelien Bouteiller, Jack J. Dongarra

Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019

Performance of asynchronous optimized Schwarz with one-sided communication.

Ichitaro Yamazaki, […], Aurélien Bouteiller, Jack J. Dongarra

Parallel Comput., 2019

Comparing the performance of rigid, moldable and grid-shaped applications on failure-prone HPC platforms.

Valentin Le Fèvre, Thomas Hérault, […], Aurélien Bouteiller, […], […], Jack J. Dongarra

Parallel Comput., 2019

Checkpointing Strategies for Shared High-Performance Computing Platforms.

Thomas Hérault, […], Aurélien Bouteiller, Dorian C. Arnold, Kurt B. Ferreira, […], Jack J. Dongarra

Int. J. Netw. Comput., 2019

Local rollback for resilient MPI applications with application-level checkpointing and message logging.

[…], […], Aurélien Bouteiller, Patricia González, María J. Martín

Future Gener. Comput. Syst., 2019

Asynchronous Receiver-Driven Replay for Local Rollback of MPI Applications.

[…], Aurélien Bouteiller, […]

Proceedings of the 9th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2019

Runtime level failure detection and propagation in HPC systems.

[…], Aurélien Bouteiller, […], […]

Proceedings of the 26th European MPI Users' Group Meeting, 2019

2018

PMIx: Process management for exascale environments.

Ralph H. Castain, […], Aurélien Bouteiller, […]

Parallel Comput., 2018

A failure detector for HPC platforms.

[…], Aurélien Bouteiller, Amina Guermouche, Thomas Hérault, […], […], Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2018

Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms.

Thomas Hérault, […], Aurélien Bouteiller, Dorian C. Arnold, Kurt B. Ferreira, […], Jack J. Dongarra

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Do Moldable Applications Perform Better on Failure-Prone HPC Platforms?

Valentin Le Fèvre, […], Aurélien Bouteiller, Thomas Hérault, […], […], Jack J. Dongarra

Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

2017

A Framework for Out of Memory SVD Algorithms.

[…], […], Stanimire Tomov, Aurélien Bouteiller, Jack J. Dongarra

Proceedings of the High Performance Computing - 32nd International Conference, 2017

Evaluating Contexts in OpenSHMEM-X Reference Implementation.

Aurélien Bouteiller, Swaroop Pophale, […], Matthew B. Baker, Manjunath Gorentla Venkata

Proceedings of the OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, 2017

2016

Failure detection and propagation in HPC systems.

[…], Aurélien Bouteiller, Amina Guermouche, Thomas Hérault, […], […], Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2016

Surviving Errors with OpenSHMEM.

Aurélien Bouteiller, […], Manjunath Gorentla Venkata

Proceedings of the OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments, 2016

2015

Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy.

Aurélien Bouteiller, Thomas Hérault, […], […], Jack J. Dongarra

ACM Trans. Parallel Comput., 2015

Composing resilience techniques: ABFT, periodic and incremental checkpointing.

[…], Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Int. J. Netw. Comput., 2015

Practical scalable consensus for pseudo-synchronous distributed systems.

Thomas Hérault, Aurélien Bouteiller, […], […], Keita Teranishi, Manish Parashar, Jack J. Dongarra

Proceedings of the International Conference for High Performance Computing, 2015

Sliding Substitution of Failed Nodes.

[…], Kazumi Yoshinaga, Thomas Hérault, Aurélien Bouteiller, […], Yutaka Ishikawa

Proceedings of the 22nd European MPI Users' Group Meeting, 2015

Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery.

Aurélien Bouteiller, […], Jack J. Dongarra

Proceedings of the 22nd European MPI Users' Group Meeting, 2015

From MPI to OpenSHMEM: Porting LAMMPS.

[…], Aurélien Bouteiller, Thomas Hérault, Manjunath Gorentla Venkata, […]

Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015

Hierarchical DAG Scheduling for Hybrid Distributed Systems.

[…], Aurélien Bouteiller, […], Mathieu Faverge, Jack J. Dongarra

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

UCX: An Open Source Framework for HPC Network APIs and Beyond.

Proceedings of the 23rd IEEE Annual Symposium on High-Performance Interconnects, 2015

2014

Unified model for assessing checkpointing protocols at extreme-scale.

[…], Aurélien Bouteiller, Elisabeth Brunet, Franck Cappello, Jack J. Dongarra, Amina Guermouche, Thomas Hérault, […], Frédéric Vivien, Dounia Zaidouni

Concurr. Comput. Pract. Exp., 2014

PTG: an abstraction for unhindered parallelism.

Anthony Danalis, […], Aurélien Bouteiller, Thomas Hérault, Jack J. Dongarra

Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2014

A Multithreaded Communication Substrate for OpenSHMEM.

Aurélien Bouteiller, Thomas Hérault, […]

Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

Assessing the Impact of ABFT and Checkpoint Composite Strategies.

[…], Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

2013

Kernel-assisted and topology-aware MPI collective communications on multicore/many-core platforms.

[…], […], Aurélien Bouteiller, Jack J. Dongarra

J. Parallel Distributed Comput., 2013

Post-failure recovery of MPI communication capability: Design and rationale.

[…], Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Int. J. High Perform. Comput. Appl., 2013

PaRSEC: Exploiting Heterogeneity to Enhance Scalability.

[…], Aurélien Bouteiller, Anthony Danalis, Mathieu Faverge, Thomas Hérault, Jack J. Dongarra

Comput. Sci. Eng., 2013

Correlated set coordination in fault tolerant message logging protocols for many-core clusters.

Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2013

Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI.

[…], […], Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2013

An evaluation of User-Level Failure Mitigation support in MPI.

[…], Aurélien Bouteiller, Thomas Hérault, […], […], Jack J. Dongarra

Computing, 2013

Efficient parallelization of batch pattern training algorithm on many-core and cluster architectures.

Volodymyr Turchenko, […], Aurélien Bouteiller, Jack J. Dongarra

Proceedings of the IEEE 7th International Conference on Intelligent Data Acquisition and Advanced Computing Systems, 2013

Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization.

Aurélien Bouteiller, Franck Cappello, Jack J. Dongarra, Amina Guermouche, Thomas Hérault, […]

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012

DAGuE: A generic distributed DAG engine for High Performance Computing.

[…], Aurélien Bouteiller, Anthony Danalis, Thomas Hérault, Pierre Lemarinier, Jack J. Dongarra

Parallel Comput., 2012

Algorithm-based fault tolerance for dense matrix factorizations.

[…], Aurélien Bouteiller, […], Thomas Hérault, Jack J. Dongarra

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters.

[…], […], Aurélien Bouteiller, Jack J. Dongarra

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Scalable Dense Linear Algebra on Heterogeneous Hardware.

[…], Aurélien Bouteiller, Anthony Danalis, Thomas Hérault, […], […], Stanimire Tomov, Jack J. Dongarra

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

From Serial Loops to Parallel Execution on Distributed Systems.

[…], Aurélien Bouteiller, Anthony Danalis, Thomas Hérault, Jack J. Dongarra

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI.

[…], […], Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011

Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW.

[…], Aurélien Bouteiller, […], Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2011

Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA.

[…], Aurélien Bouteiller, Anthony Danalis, Mathieu Faverge, […], Thomas Hérault, […], […], Pierre Lemarinier, […], […], […], Jack J. Dongarra

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs.

[…], […], Aurélien Bouteiller, […], Jeffrey M. Squyres, Jack J. Dongarra

Proceedings of the International Conference on Parallel Processing, 2011

Correlated Set Coordination in Fault Tolerant Message Logging Protocols.

Aurélien Bouteiller, Thomas Hérault, […], Jack J. Dongarra

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Performance Portability of a GPU Enabled Factorization with the DAGuE Framework.

[…], Aurélien Bouteiller, Thomas Hérault, Pierre Lemarinier, Narapat Ohm Saengpatsa, Stanimire Tomov, Jack J. Dongarra

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010

Redesigning the message logging model for high performance.

Aurélien Bouteiller, […], Jack J. Dongarra

Concurr. Comput. Pract. Exp., 2010

Locality and Topology Aware Intra-node Communication among Multicore CPUs.

[…], […], Aurélien Bouteiller, Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2010

Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols.

[…], Aurélien Bouteiller, Thomas Hérault, Pierre Lemarinier, Jack J. Dongarra

Proceedings of the Recent Advances in the Message Passing Interface, 2010

2009

Reasons for a pessimistic or optimistic message logging protocol in MPI uncoordinated failure, recovery.

Aurélien Bouteiller, […], […], Christine Morin, Jack J. Dongarra

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2008

Fault Tolerance Management for a Hierarchical GridRPC Middleware.

Aurélien Bouteiller, Frédéric Desprez

Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

2007

Retrospect: Deterministic Replay of MPI Applications for Interactive Distributed Debugging.

Aurélien Bouteiller, […], Jack J. Dongarra

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

2006

MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI.

Aurélien Bouteiller, Thomas Hérault, Géraud Krawezik, Pierre Lemarinier, Franck Cappello

Int. J. High Perform. Comput. Appl., 2006

Hybrid Preemptive Scheduling of Message Passing Interface Applications on Grids.

Aurélien Bouteiller, Hinde-Lilia Bouziane, Thomas Hérault, Pierre Lemarinier, Franck Cappello

Int. J. High Perform. Comput. Appl., 2006

Diet: New Developments and Recent Results.

[…], […], Aurélien Bouteiller, […], […], […], Pushpinder-Kaur Chouhan, […], […], Benjamin Depardon, Frédéric Desprez, Jean-Sébastien Gay, […]

Proceedings of the Euro-Par 2006 Workshops: Parallel Processing, 2006

2005

Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI.

Aurélien Bouteiller, […], Thomas Hérault, Pierre Lemarinier, Franck Cappello

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004

Coordinated checkpoint versus message log for fault tolerant MPI.

Pierre Lemarinier, Aurélien Bouteiller, Géraud Krawezik, Franck Cappello

Int. J. High Perform. Comput. Netw., 2004

Hybrid Preemptive Scheduling of MPI Applications on the Grids.

Aurélien Bouteiller, Hinde-Lilia Bouziane, Thomas Hérault, Pierre Lemarinier, Franck Cappello

Proceedings of the 5th International Workshop on Grid Computing (GRID 2004), 2004

Improved message logging versus improved coordinated checkpointing for fault tolerant MPI.

Pierre Lemarinier, Aurélien Bouteiller, Thomas Hérault, Géraud Krawezik, Franck Cappello

Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2003

MPICH-V2: a Fault Tolerant MPI for Volatile Nodes based on Pessimistic Sender Based Message Logging.

Aurélien Bouteiller, Franck Cappello, Thomas Hérault, Géraud Krawezik, Pierre Lemarinier, Frédéric Magniette

Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

2002

MPICH-V: toward a scalable fault tolerant MPI for volatile nodes.

[…], Aurélien Bouteiller, Franck Cappello, […], […], Cécile Germain, Thomas Hérault, Pierre Lemarinier, Oleg Lodygensky, Frédéric Magniette, […], […]

Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002
