William T. Kramer

Affiliations:
  • University of Illinois at Urbana-Champaign, IL, USA


According to our database, William T. Kramer authored at least 51 papers between 1989 and 2024.

Bibliography

2024
Blue Waters system and component reliability.
Concurr. Comput. Pract. Exp., 2024

2022
Analysis of User-Support Tickets in the Lifetime of the Blue Waters System.
Proceedings of the IEEE/ACM International Workshop on HPC User Support Tools, 2022

2021
A new sustained system performance metric for scientific performance evaluation.
J. Supercomput., 2021

2020
Convergence of artificial intelligence and high performance computing on NSF-supported cyberinfrastructure.
J. Big Data, 2020

Measuring Congestion in High-Performance Datacenter Interconnects.
Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, 2020

2019
Race to Exascale.
Comput. Sci. Eng., 2019

Enabling real-time multi-messenger astrophysics discoveries with deep learning.
CoRR, 2019

Live Forensics for Distributed Storage Systems.
CoRR, 2019

Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo.
CoRR, 2019

Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era.
CoRR, 2019

Best practices for management and operation of large HPC installations.
Concurr. Comput. Pract. Exp., 2019

A Study of Network Congestion in Two Supercomputing High-Speed Interconnects.
Proceedings of the 2019 IEEE Symposium on High-Performance Interconnects, 2019

2018
Resiliency of HPC Interconnects: A Case Study of Interconnect Failures and Recovery in Blue Waters.
IEEE Trans. Dependable Secur. Comput., 2018

Big data and extreme-scale computing.
Int. J. High Perform. Comput. Appl., 2018

Best practices and lessons from deploying and operating a sustained-petascale system: the Blue Waters experience.
Proceedings of the International Conference for High Performance Computing, 2018

2017
Workload Analysis of Blue Waters.
CoRR, 2017

Challenges of Workload Analysis on Large HPC Systems: A Case Study on NCSA Blue Waters.
Proceedings of the Practice and Experience in Advanced Research Computing 2017: Sustainability, 2017

BOSS-LDG: A Novel Computational Framework That Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery.
Proceedings of the 13th IEEE International Conference on e-Science, 2017

A Performance Projection of Mini-Applications onto Benchmarks Toward the Performance Projection of Real-Applications.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Holistic Measurement-Driven System Assessment.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016
HPCMASPA 2016 Keynote.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

2015
Deployment and testing of the sustained petascale Blue Waters system.
J. Comput. Sci., 2015

LogDiver: A Tool for Measuring Resilience of Extreme-Scale Systems and Applications.
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2015

Measuring and Understanding Extreme-Scale Application Resilience: A Field Study of 5,000,000 HPC Application Runs.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

2014
Toward Exascale Resilience: 2014 update.
Supercomput. Front. Innov., 2014

Deploying a Large Petascale System: The Blue Waters Experience.
Proceedings of the International Conference on Computational Science, 2014

Lessons Learned from the Analysis of System Failures at Petascale: The Case of Blue Waters.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

2013
Failure prediction for HPC systems and applications: Current situation and open issues.
Int. J. High Perform. Comput. Appl., 2013

2012
Fault prediction under the microscope: a closer look into HPC systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Taming of the Shrew: Modeling the Normal and Faulty Behaviour of Large-scale HPC Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Top500 versus sustained performance: the top problems with the top500 list - and what to do about them.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
The International Exascale Software Project roadmap.
Int. J. High Perform. Comput. Appl., 2011

How to measure useful, sustained performance.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Performance modeling for systematic performance tuning.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Modeling and tolerating heterogeneous failures in large parallel systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Event Log Mining Tool for Large Scale HPC Systems.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2009
Resource Management.
Int. J. High Perform. Comput. Appl., 2009

Consistent Application Performance at the Exascale.
Int. J. High Perform. Comput. Appl., 2009

An Exascale Approach to Software and Hardware Design.
Int. J. High Perform. Comput. Appl., 2009

Toward Exascale Resilience.
Int. J. High Perform. Comput. Appl., 2009

2006
S06 - Computing protection in open HPC environments.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Procurement - Best practice in HPC procurements.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

2005
Control Considerations for Scalable Event Processing.
Proceedings of the Ambient Networks, 2005

2004
Deep scientific computing requires deep data.
IBM J. Res. Dev., 2004

2003
Performance Variability of Highly Parallel Architectures.
Proceedings of the Computational Science - ICCS 2003, 2003

2002
SCinet: Testbed for High-Performance Networked Applications.
Computer, 2002

2000
ESP: A System Utilization Benchmark.
Proceedings Supercomputing 2000, 2000

System Utilization Benchmark on the Cray T3E and IBM SP.
Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

1999
Building the Teraflops/Petabytes Production Supercomputing Center.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

1994
NAS experiences with a prototype cluster of workstations.
Proceedings Supercomputing '94, 1994

1989
Effective use of Cray supercomputers.
Proceedings Supercomputing '89, Reno, NV, USA, November 12-17, 1989, 1989

