Todd Gamblin

CoRR, November, 2025

XaaS Containers: Performance-Portable Representation With Source and IR Containers.

Dataset, September, 2025

Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions.

Proceedings of the High Performance Computing, 2025

Bridging the Gap Between Binary and Source Based Package Management in Spack.

Proceedings of the International Conference for High Performance Computing, 2025

XaaS Containers: Performance-Portable Representation With Source and IR Containers.

Proceedings of the International Conference for High Performance Computing, 2025

2024

HPC and Cloud Convergence Beyond Technical Boundaries: Strategies for Economic Sustainability, Standardization, and Data Accessibility.

Computer, June, 2024

Providing a Flexible and Comprehensive Software Stack Via Spack, an Extreme-Scale Scientific Software Stack, and Software Development Kits.

James M. Willenbring

Sameer Suresh Shende

Comput. Sci. Eng., 2024

Toward a Cohesive AI and Simulation Software Ecosystem for Scientific Innovation.

CoRR, 2024

Performance-Aligned LLMs for Generating Fast Code.

CoRR, 2024

HPC-Coder: Modeling Parallel Programs using Large Language Models.

Proceedings of the ISC High Performance 2024 Research Paper Proceedings (39th International Conference), 2024

A Probabilistic Approach To Selecting Build Configurations in Package Managers.

Proceedings of the International Conference for High Performance Computing, 2024

Learning to Predict and Improve Build Successes in Package Ecosystems.

Proceedings of the 21st IEEE/ACM International Conference on Mining Software Repositories, 2024

An Exploration of Global Optimization Strategies for Autotuning OpenMP-based Codes.

Gregory Bolet

Giorgis Georgakoudis

Kirk W. Cameron

David Beckingsale

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023

Scalable Comparative Visualization of Ensembles of Call Graphs.

IEEE Trans. Vis. Comput. Graph., March, 2023

Modeling Parallel Programs using Large Language Models.

CoRR, 2023

Machine Learning-Driven Adaptive OpenMP For Portable Performance on Heterogeneous Systems.

Giorgis Georgakoudis

Chunhua Liao

David Beckingsale

CoRR, 2023

Towards Collaborative Continuous Benchmarking for HPC.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Flexible and Optimal Dependency Management via Max-SMT.

Proceedings of the 45th IEEE/ACM International Conference on Software Engineering, 2023

2022

AI4IO: A suite of AI-based tools for IO-aware scheduling.

Int. J. High Perform. Comput. Appl., 2022

Overcoming Challenges to Continuous Integration in HPC.

Daniel S. Katz

Comput. Sci. Eng., 2022

Using Solver-Aided Languages to Build Package Managers.

CoRR, 2022

Reliabuild: Searching for High-Fidelity Builds Using Active Learning.

Tom Scogland

CoRR, 2022

Mapping Out the HPC Dependency Chaos.

Farid Zakaria

Thomas R. W. Scogland

Carlos Maltzahn

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Using Answer Set Programming for HPC Dependency Solving.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Searching for High-Fidelity Builds Using Active Learning.

Tom Scogland

Proceedings of the 19th IEEE/ACM International Conference on Mining Software Repositories, 2022

Resource Utilization Aware Job Scheduling to Mitigate Performance Variability.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

2021

Visualizing Hierarchical Performance Profiles of Parallel Codes Using CallFlow.

IEEE Trans. Vis. Comput. Graph., 2021

Extending OpenMP for Machine Learning-Driven Adaptation.

Chunhua Liao

Anjia Wang

Giorgis Georgakoudis

Yonghong Yan

David Beckingsale

Carlos Eduardo Arango Gutierrez

Proceedings of the Accelerator Programming Using Directives - 8th International Workshop, 2021

Artemis: Automatic Runtime Tuning of Parallel Execution Parameters Using Machine Learning.

Proceedings of the High Performance Computing - 36th International Conference, 2021

A Holistic View of Memory Utilization on HPC Systems: Current and Future Trends.

Proceedings of the MEMSYS 2021: The International Symposium on Memory Systems, Washington, USA, September 27, 2021

2020

Scalable Comparative Visualization of Ensembles of Call Graphs.

CoRR, 2020

archspec: A library for detecting, labeling, and reasoning about microarchitectures.

Massimiliano Culpo

Gregory Becker

Kenneth Hoste

Proceedings of the 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, 2020

Usability and Performance Improvements in Hatchet.

Stephanie Brink

Ian Lumsden

Connor Scully-Allison

Proceedings of the IEEE/ACM International Workshop on HPC User Support Tools and Workshop on Programming and Performance Visualization Tools, 2020

Workflows are the New Applications: Challenges in Performance, Portability, and Productivity.

Proceedings of the IEEE/ACM International Workshop on Performance, 2020

CanarIO: Sounding the Alarm on IO-Related Performance Degradation.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Auto-tuning Parameter Choices in HPC Applications using Bayesian Optimization.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

CodeSeer: input-dependent code variants selection via machine learning.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019

Preserving Command Line Workflow for a Package Management System Using ASCII DAG Visualization.

Katherine E. Isaacs

IEEE Trans. Vis. Comput. Graph., 2019

Using Malleable Task Scheduling to Accelerate Package Manager Installations.

Samuel Knight

Jeremiah J. Wilke

Proceedings of the Tools and Techniques for High Performance Computing, 2019

Preparation and optimization of a diverse workload for a large-scale heterogeneous system.

Ian Karlin

Yoonho Park

Guillaume Thomas-Collignon

Sara Kokkila Schumacher

Proceedings of the International Conference for High Performance Computing, 2019

Hatchet: pruning the overgrowth in parallel profiles.

Stephanie Brink

Proceedings of the International Conference for High Performance Computing, 2019

Analyzing Cost-Performance Tradeoffs of HPC Network Designs under Different Constraints using Simulations.

Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, 2019

FuncyTuner: Auto-tuning Scientific Applications With Per-loop Compilation.

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

MemAxes: Visualization and Analytics for Characterizing Complex Memory Performance Behaviors.

IEEE Trans. Vis. Comput. Graph., 2018

Autotuning in High-Performance Computing Applications.

Jeffrey K. Hollingsworth

Boyana Norris

Richard W. Vuduc

Proc. IEEE, 2018

PADDLE: Performance Analysis Using a Data-Driven Learning Environment.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Bootstrapping Parameter Space Exploration for Fast Tuning.

Proceedings of the 32nd International Conference on Supercomputing, 2018

PRIONN: Predicting Runtime and IO using Neural Networks.

Proceedings of the 47th International Conference on Parallel Processing, 2018

2017

xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit.

Supercomput. Front. Innov., 2017

Projecting Performance Data over Simulation Geometry Using SOSflow and ALPINE.

Proceedings of the Programming and Performance Visualization Tools, 2017

Performance modeling under resource constraints using deep transfer learning.

Proceedings of the International Conference for High Performance Computing, 2017

Predicting the performance impact of different fat-tree configurations.

Proceedings of the International Conference for High Performance Computing, 2017

ScrubJay: deriving knowledge from the disarray of HPC performance data.

Proceedings of the International Conference for High Performance Computing, 2017

DR-BW: Identifying Bandwidth Contention in NUMA Architectures with Supervised Learning.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Partitioning Low-Diameter Networks to Eliminate Inter-Job Interference.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

REPPAR Keynote.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016

Ordering Traces Logically to Identify Lateness in Message Passing Programs.

IEEE Trans. Parallel Distributed Syst., 2016

Evaluating and extending user-level fault tolerance in MPI applications.

Kathryn M. Mohror

Howard Pritchard

Int. J. High Perform. Comput. Appl., 2016

A Scalable Observation System for Introspection and In Situ Analytics.

Proceedings of the 5th Workshop on Extreme-Scale Programming Tools, 2016

VIPACT: A Visualization Interface for Analyzing Calling Context Trees.

Proceedings of the Third Workshop on Visual Performance Analysis, 2016

Evaluating HPC networks via simulation of parallel workloads.

Proceedings of the International Conference for High Performance Computing, 2016

A machine learning framework for performance coverage analysis of proxy applications.

Tanzima Z. Islam

Proceedings of the International Conference for High Performance Computing, 2016

Caliper: performance introspection for HPC software stacks.

Proceedings of the International Conference for High Performance Computing, 2016

Managing Combinatorial Software Installations with Spack.

Proceedings of the 2016 Third International Workshop on HPC User Support Tools, 2016

A Study of Failures in Community Clusters: The Case of Conte.

Proceedings of the 2016 IEEE International Symposium on Software Reliability Engineering Workshops, 2016

IPDRM Introduction and Committees.

Shuaiwen Leon Song

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

MPMD Framework for Offloading Load Balance Computation.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

VarSys Introduction.

Kirk W. Cameron

Dimitrios S. Nikolopoulos

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Machine Learning Predictions of Runtime and IO Traffic on High-End Clusters.

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015

Diagnosis of Performance Faults in Large Scale MPI Applications via Probabilistic Progress-Dependence Inference.

Saurabh Bagchi

IEEE Trans. Parallel Distributed Syst., 2015

Connecting Performance Analysis and Visualization (Dagstuhl Perspectives Workshop 14022).

Dagstuhl Manifestos, 2015

Debugging high-performance computing applications at massive scales.

Commun. ACM, 2015

Recovering logical structure from Charm++ event traces.

Proceedings of the International Conference for High Performance Computing, 2015

Relating memory performance data to application domain data using an integration API.

Proceedings of the 2nd Workshop on Visual Performance Analysis, 2015

The Spack package manager: bringing order to HPC software chaos.

Scott Futral

Proceedings of the International Conference for High Performance Computing, 2015

Decoupled load balancing.

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Identifying the Culprits Behind Network Congestion.

Andrew R. Titus

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014

Combing the Communication Hairball: Visualizing Parallel Execution Traces using Logical Time.

IEEE Trans. Vis. Comput. Graph., 2014

State of the Art of Performance Visualization.

Proceedings of the 16th Eurographics Conference on Visualization, 2014

Dissecting On-Node Memory Access Performance: A Semantic Approach.

Proceedings of the International Conference for High Performance Computing, 2014

Evaluating User-Level Fault Tolerance for MPI Applications.

Proceedings of the 21st European MPI Users' Group Meeting, 2014

Extracting logical structure and identifying stragglers in parallel execution traces.

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Accurate application progress analysis for large-scale parallel debugging.

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery.

Naoya Maruyama

Satoshi Matsuoka

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Load balancing n-body simulations with highly non-uniform density.

Tom Arsenlis

Proceedings of the 2014 International Conference on Supercomputing, 2014

Optimizing the performance of parallel applications on a 5D torus via task mapping.

Proceedings of the 21st International Conference on High Performance Computing, 2014

A User-Level InfiniBand-Based File System and Checkpoint Strategy for Burst Buffers.

Naoya Maruyama

Satoshi Matsuoka

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

2013

Parallelizing heavyweight debugging tools with mpiecho.

Barry Rountree

Parallel Comput., 2013

Trellis: Portability across architectures with a high-level framework.

Lukasz G. Szafaryn

Kevin Skadron

J. Parallel Distributed Comput., 2013

Predicting application performance using supervised learning on communication features.

Proceedings of the International Conference for High Performance Computing, 2013

Performance Analysis Techniques for the Exascale Co-Design Process.

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Efficient and Scalable Retrieval Techniques for Global File Properties.

Michael J. Brim

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Massively parallel loading.

Felix Wolf

Proceedings of the International Conference on Supercomputing, 2013

2012

Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations.

IEEE Trans. Vis. Comput. Graph., 2012

Memory Trace Compression and Replay for SPMD Systems Using Extended PRSDs.

Sandeep Budanur

Frank Mueller

Comput. J., 2012

Design and modeling of a non-blocking checkpointing system.

Satoshi Matsuoka

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Abstract: Slack-Conscious Lightweight Loop Scheduling for Improving Scalability of Bulk-synchronous MPI Applications.

Vivek Kale

Torsten Hoefler

William D. Gropp

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Exploring Performance Data with Boxfish.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Mapping applications with collectives over sub-communicators on torus networks.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Novel views of performance data to analyze large-scale adaptive applications.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Poster: Evaluation Topology Mapping via Graph Partitioning.

Anshu Arya

Laxmikant V. Kalé

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Evaluating Topology Mapping via Graph Partitioning.

Anshu Arya

Dimitrios S. Nikolopoulos

Laxmikant V. Kalé

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

The myrmics memory allocator: hierarchical, message-passing allocation for global address spaces.

Spyros Lyberis

Polyvios Pratikakis

Proceedings of the International Symposium on Memory Management, 2012

Quantifying the effectiveness of load balance algorithms.

Proceedings of the International Conference on Supercomputing, 2012

Probabilistic diagnosis of performance faults in large-scale parallel applications.

Saurabh Bagchi

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Large scale debugging of parallel tasks with AutomaDeD.

Proceedings of the Conference on High Performance Computing Networking, 2011

Creating a Tool Set for Optimizing Topology-Aware Node Mappings.

Proceedings of the Tools for High Performance Computing 2011, 2011

Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs.

Zoltán Szebenyi

Felix Wolf

Brian J. N. Wylie

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Interpreting Performance Data across Intuitive Domains.

Proceedings of the International Conference on Parallel Processing, 2011

2010

ScalaTrace: Tracing, Analysis and Modeling of HPC Codes at Scale.

Frank Mueller

Xing Wu

Proceedings of the Applied Parallel and Scientific Computing, 2010

Clustering performance data efficiently at massive scales.

Robert J. Fowler

Daniel A. Reed

Proceedings of the 24th International Conference on Supercomputing, 2010

Scaling Algebraic Multigrid Solvers: On the Road to Exascale.

Proceedings of the Competence in High Performance Computing 2010, 2010

2009

Scalable performance measurement and analysis.

PhD thesis, 2009

2008

Scalable load-balance measurement for SPMD codes.

Robert J. Fowler

Daniel A. Reed

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Scalable methods for monitoring and detecting behavioral equivalence classes in scientific codes.
