Thomas Hérault

According to our database, Thomas Hérault
  • authored at least 89 papers between 2001 and 2017.
  • has a "Dijkstra number" of three.

Bibliography

2017
Dynamic task discovery in PaRSEC: a data-flow task-based runtime.
Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2017

2016
Assessing the cost of redistribution followed by a computational kernel: Complexity and performance results.
Parallel Computing, 2016

Failure detection and propagation in HPC systems.
Proceedings of the International Conference for High Performance Computing, 2016

2015
Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy.
TOPC, 2015

Composing resilience techniques: ABFT, periodic and incremental checkpointing.
IJNC, 2015

Practical scalable consensus for pseudo-synchronous distributed systems.
Proceedings of the International Conference for High Performance Computing, 2015

Sliding Substitution of Failed Nodes.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015

From MPI to OpenSHMEM: Porting LAMMPS.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015

Design for a Soft Error Resilient Dynamic Task-Based Runtime.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
Performance and reliability trade-offs for the double checkpointing algorithm.
IJNC, 2014

Unified model for assessing checkpointing protocols at extreme-scale.
Concurrency and Computation: Practice and Experience, 2014

PTG: an abstraction for unhindered parallelism.
Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2014

A Multithreaded Communication Substrate for OpenSHMEM.
Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

Determining the Optimal Redistribution for a Given Data Partition.
Proceedings of the IEEE 13th International Symposium on Parallel and Distributed Computing, 2014

Assessing the Impact of ABFT and Checkpoint Composite Strategies.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Utilizing dataflow-based execution for coupled cluster methods.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013
Hierarchical QR factorization algorithms for multi-core clusters.
Parallel Computing, 2013

Post-failure recovery of MPI communication capability: Design and rationale.
IJHPCA, 2013

PaRSEC: Exploiting Heterogeneity to Enhance Scalability.
Computing in Science and Engineering, 2013

On the Combination of Silent Error Detection and Checkpointing.
CoRR, 2013

Optimal Checkpointing Period: Time vs. Energy.
CoRR, 2013

Correlated set coordination in fault tolerant message logging protocols for many-core clusters.
Concurrency and Computation: Practice and Experience, 2013

Extending the scope of the Checkpoint-on-Failure protocol for forward recovery in standard MPI.
Concurrency and Computation: Practice and Experience, 2013

An evaluation of User-Level Failure Mitigation support in MPI.
Computing, 2013

Optimal Checkpointing Period: Time vs. Energy.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

On the Combination of Silent Error Detection and Checkpointing.
Proceedings of the IEEE 19th Pacific Rim International Symposium on Dependable Computing, 2013

Revisiting the Double Checkpointing Algorithm.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012
DAGuE: A generic distributed DAG engine for High Performance Computing.
Parallel Computing, 2012

An Evaluation of User-Level Failure Mitigation Support in MPI.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Algorithm-based fault tolerance for dense matrix factorizations.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Scalable Dense Linear Algebra on Heterogeneous Hardware.
Proceedings of the Transition of HPC Towards Exascale Computing, 2012

From Serial Loops to Parallel Execution on Distributed Systems.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011
QCG-OMPI: MPI applications on grids.
Future Generation Comp. Syst., 2011

Hierarchical QR factorization algorithms for multi-core cluster systems.
CoRR, 2011

Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

DAGuE: A Generic Distributed DAG Engine for High Performance Computing.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Correlated Set Coordination in Fault Tolerant Message Logging Protocols.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Process Distance-Aware Adaptive MPI Collective Communications.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

On Scalability for MPI Runtime Systems.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Performance Portability of a GPU Enabled Factorization with the DAGuE Framework.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Supple: a flexible probabilistic data dissemination protocol for wireless sensor networks.
Proceedings of the 13th International Symposium on Modeling Analysis and Simulation of Wireless and Mobile Systems, 2010

DSL-Lab: A Low-Power Lightweight Platform to Experiment on Domestic Broadband Internet.
Proceedings of the Ninth International Symposium on Parallel and Distributed Computing, 2010

QR factorization of tall and skinny matrices in a grid computing environment.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

SAFE-OS: A secure and usable desktop operating system.
Proceedings of the CRiSIS 2010, 2010

Scalability and Parallelization of Monte-Carlo Tree Search.
Proceedings of the Computers and Games - 7th International Conference, 2010

Planning Large Data Transfers in Institutional Grids.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009
Foreword.
Parallel Computing, 2009

Hierarchical Replication Techniques to Ensure Checkpoint Storage Reliability in Grid Environment.
Journal of Interconnection Networks, 2009

QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment.
CoRR, 2009

Constructing Resiliant Communication Infrastructure for Runtime Environments.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

MPI Applications on Grids: A Topology Aware Approach.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Running Parallel Applications with Topology-Aware Grid Middleware.
Proceedings of the Fifth International Conference on e-Science, 2009

High accuracy failure injection in parallel and distributed systems using virtualization.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2008
Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols.
Future Generation Comp. Syst., 2008

Cell Assisted APMC.
Proceedings of the Fifth International Conference on the Quantitative Evaluation of Systems (QEST 2008), 2008

On the Complexity of a Self-Stabilizing Spanning Tree Algorithm for Large Scale Systems.
Proceedings of the 14th IEEE Pacific Rim International Symposium on Dependable Computing, 2008

Emulation platform for high accuracy failure injection in grids.
Proceedings of the High Speed and Large Scale Scientific Computing - Selected Papers from the High Performance Computing Workshop, Cetraro, Italy, June 30, 2008

Grid Services for MPI.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

Hierarchical Replication Techniques to Ensure Checkpoint Storage Reliability in Grid Environment.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

Hierarchical replication techniques to ensure checkpoint storage reliability in grid environment.
Proceedings of the 6th ACS/IEEE International Conference on Computer Systems and Applications, 2008

2007
Evaluating Complex MAC Protocols for Sensor Networks with APMC.
Electr. Notes Theor. Comput. Sci., 2007

Virtual Parallel Machines Through Virtualization: Impact on MPI Executions.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Grid Services for MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

A Model for Large Scale Self-Stabilization.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

A Distributed and Replicated Service for Checkpoint Storage.
Proceedings of the Making Grids Work: Proceedings of the CoreGRID Workshop on Programming Models Grid and P2P System Architecture Grid Systems, 2007

2006
MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI.
IJHPCA, 2006

Hybrid Preemptive Scheduling of Message Passing Interface Applications on Grids.
IJHPCA, 2006

Distribution, Approximation and Probabilistic Model Checking.
Electr. Notes Theor. Comput. Sci., 2006

Brief Announcement: Self-stabilizing Spanning Tree Algorithm for Large Scale Systems.
Proceedings of the Stabilization, 2006

MPI tools and performance studies - Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Probabilistic verification of sensor networks.
Proceedings of the 4th International Conference on Computer Sciences: Research, 2006

APMC 3.0: Approximate Verification of Discrete and Continuous Time Markov Chains.
Proceedings of the Third International Conference on the Quantitative Evaluation of Systems (QEST 2006), 2006

FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

2005
Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid.
Future Generation Comp. Syst., 2005

Probabilistic Model Checking of the CSMA/CD Protocol Using PRISM and APMC.
Electr. Notes Theor. Comput. Sci., 2005

Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
Approximate Probabilistic Model Checking.
Proceedings of the Verification, 2004

RPC-V: Toward Fault-Tolerant RPC for Internet Connected Desktop Grids with Volatile Nodes.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Hybrid Preemptive Scheduling of MPI Applications on the Grids.
Proceedings of the 5th International Workshop on Grid Computing (GRID 2004), 2004

Improved message logging versus improved coordinated checkpointing for fault tolerant MPI.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2003
MPICH-V2: a Fault Tolerant MPI for Volatile Nodes based on Pessimistic Sender Based Message Logging.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

2002
Fault-Local Stabilization: The Shortest Path Tree.
Proceedings of the 21st Symposium on Reliable Distributed Systems (SRDS 2002), 2002

MPICH-V: toward a scalable fault tolerant MPI for volatile nodes.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

2001
Easy Stabilization with an Agent.
Proceedings of the Self-Stabilizing Systems, 5th International Workshop, 2001

