Don E. Maxwell

Orcid: 0000-0002-3794-5687

According to our database, Don E. Maxwell authored at least 14 papers between 2008 and 2021.

Bibliography

2021
Understanding failures through the lifetime of a top-level supercomputer.
J. Parallel Distributed Comput., 2021

2020
GPU lifetimes on Titan supercomputer: survival analysis and reliability.
Proceedings of the International Conference for High Performance Computing, 2020

Towards a Model to Estimate the Reliability of Large-Scale Hybrid Supercomputers.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019
Are we witnessing the spectre of an HPC meltdown?
Concurr. Comput. Pract. Exp., 2019


Analyzing a Five-Year Failure Record of a Leadership-Class Supercomputer.
Proceedings of the 31st International Symposium on Computer Architecture and High Performance Computing, 2019

2018
GPU age-aware scheduling to improve the reliability of leadership jobs on Titan.
Proceedings of the International Conference for High Performance Computing, 2018


2017
Experiences Evaluating Functionality and Performance of IBM POWER8+ Systems.
Proceedings of the High Performance Computing, 2017

2015
Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge Leadership Computing Facility.
Proceedings of the International Conference for High Performance Computing, 2015

Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Understanding and Exploiting Spatial Properties of System Failures on Extreme-Scale HPC Systems.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

2013
TUE, a New Energy-Efficiency Metric Applied at ORNL's Jaguar.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

2008
New algorithm to enable 400+ TFlop/s sustained performance in simulations of disorder effects in high-Tc superconductors.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

