David F. Richards

ORCID: 0000-0003-2047-3780

According to our database, David F. Richards authored at least 31 papers between 2005 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
Accelerating communication for parallel programming models on GPU systems.
Parallel Comput., 2022

Improving Scalability with GPU-Aware Asynchronous Tasks.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Lessons Learned from Accelerating Quicksilver on Programmable Integrated Unified Memory Architecture (PIUMA) and How That's Different from CPU.
Proceedings of the High Performance Computing - 36th International Conference, 2021

GPU-aware Communication with UCX in Parallel Programming Models: Charm++, MPI, and Python.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

CharminG: A Scalable GPU-resident Runtime System.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

2020
Sierra Center of Excellence: Lessons learned.
IBM J. Res. Dev., 2020

Achieving Computation-Communication Overlap with Overdecomposition on GPU Systems.
Proceedings of the 5th IEEE/ACM International Workshop on Extreme Scale Programming Models and Middleware, 2020

End-to-end performance modeling of distributed GPU applications.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019
Exploring dynamic load imbalance solutions with the CoMD proxy application.
Future Gener. Comput. Syst., 2019

ClangJIT: Enhancing C++ with Just-in-Time Compilation.
CoRR, 2019

ClangJIT: Enhancing C++ with Just-in-Time Compilation.
Proceedings of the 2019 IEEE/ACM International Workshop on Performance, 2019

Thin-Threads: An Approach for History-Based Monte Carlo on GPUs.
Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

2017
Quicksilver: A Proxy App for the Monte Carlo Transport Code Mercury.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016
Evaluating and extending user-level fault tolerance in MPI applications.
Int. J. High Perform. Comput. Appl., 2016

Enabling Work Migration in CoMD to Study Dynamic Load Imbalance Solutions.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Optimizing PGAS Overhead in a Multi-locale Chapel Implementation of CoMD.
Proceedings of the 2016 PGAS Applications Workshop, 2016

IPAS: intelligent protection against silent output corruption in scientific applications.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

2015
Papillary Muscles Contraction Does Not Change Ventricular Wall Mechanics.
Proceedings of the Computing in Cardiology, 2015

2014
Evaluating User-Level Fault Tolerance for MPI Applications.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

2013
Analysis of scalable data-privatization threading algorithms for hybrid MPI/OpenMP parallelization of molecular dynamics.
J. Supercomput., 2013

Science at LLNL with IBM Blue Gene/Q.
IBM J. Res. Dev., 2013

Performance Analysis Techniques for the Exascale Co-Design Process.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Performance Characteristics of Hardware Transactional Memory for Molecular Dynamics Application on BlueGene/Q: Toward Efficient Multithreading Strategies for Large-Scale Scientific Applications.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

2012
Dynamic load balancing algorithm for molecular dynamics based on Voronoi cells domain decompositions.
Comput. Phys. Commun., 2012

Toward real-time modeling of human heart ventricles at cellular resolution: simulation of drug-induced arrhythmias.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

2009
Beyond homogeneous decomposition: scaling long-range forces on Massively Parallel Systems.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

2007
Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin-Helmholtz instability.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

2005
The Use of Conformal Voxels for Consistent Extractions from Multiple Level-Set Fields.
Proceedings of the Computational Science, 2005

