# Simon D. Hammond

According to our database, Simon D. Hammond authored at least 49 papers between 2007 and 2018.

## Bibliography

2018
Sparse Matrix-Matrix Multiplication on Multilevel Memory Architectures: Algorithms and Experiments.
CoRR, 2018

Optimizing for KNL Usage Modes When Data Doesn't Fit in MCDRAM.
Proceedings of the 47th International Conference on Parallel Processing, 2018

2017
Optical interconnects for extreme scale computing systems.
Parallel Computing, 2017

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation.
J. Parallel Distrib. Comput., 2017

Designing vector-friendly compact BLAS and LAPACK kernels.
Proceedings of the International Conference for High Performance Computing, 2017

Performance analysis for using non-volatile memory DIMMs: opportunities and challenges.
Proceedings of the International Symposium on Memory Systems, 2017

Double Buffering for MCDRAM on Second Generation Intel® Xeon Phi™ Processors with OpenMP.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Fast linear algebra-based triangle counting with KokkosKernels.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures.
Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017

2016
Analyzing allocation behavior for multi-level memory.
Proceedings of the Second International Symposium on Memory Systems, 2016

Multi-Level Memory Policies: What You Add Is More Important Than What You Take Out.
Proceedings of the Second International Symposium on Memory Systems, 2016

End-to-End Modeling and Optimization of Power Consumption in HPC Interconnects.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016

(SAI) Stalled, Active and Idle: Characterizing Power and Performance of Large-Scale Dragonfly Networks.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015
Design Methodology for Optimizing Optical Interconnection Networks in High Performance Systems.
Proceedings of the High Performance Computing - 30th International Conference, 2015

The Potential and Perils of Multi-Level Memory.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

k-Means Clustering on Two-Level Memory Systems.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Two-Level Main Memory Co-Design: Multi-threaded Algorithmic Primitives, Analysis, and Simulation.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

2014
An evaluation of MPI message rate on hybrid-core processors.
IJHPCA, 2014

Exascale design space exploration and co-design.
Future Generation Comp. Syst., 2014

SNAP: Strong Scaling High Fidelity Molecular Dynamics Simulations on Leadership-Class Computing Platforms.
Proceedings of the Supercomputing - 29th International Conference, 2014

Abstract machine models and proxy architectures for exascale computing.
Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014

2013
Reducing the Bulk in the Bulk Synchronous Parallel Model.
Parallel Processing Letters, 2013

An investigation of the performance portability of OpenCL.
J. Parallel Distrib. Comput., 2013

Parallel File System Analysis Through Application I/O Tracing.
Comput. J., 2013

Towards Automated Memory Model Generation Via Event Tracing.
Comput. J., 2013

Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

The impact of hybrid-core processors on MPI message rate.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Application Explorations for Future Interconnects.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

GPU acceleration of Data Assembly in Finite Element Methods and its energy implications.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013

2012
On the Acceleration of Wavefront Applications using Distributed Many-Core Architectures.
Comput. J., 2012

Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Navigating an Evolutionary Fast Path to Exascale.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Poster: Assessing the Predictive Capabilities of Mini-applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

LDPLFS: Improving I/O Performance without Application Modification.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

2011
Should we worry about memory loss?
SIGMETRICS Performance Evaluation Review, 2011

Performance analysis of a hybrid MPI/CUDA implementation of the NASLU benchmark.
SIGMETRICS Performance Evaluation Review, 2011

Benchmarking and modelling of POWER7, Westmere, BG/P, and GPUs: an industry case study.
SIGMETRICS Performance Evaluation Review, 2011

Predictive analysis of a hydrodynamics application on large-scale CMP clusters.
Computer Science - R&D, 2011

Light-Weight Parallel I/O Analysis at Scale.
Proceedings of the Computer Performance Engineering, 2011

WMTools - Assessing Parallel Application Memory Utilisation at Scale.
Proceedings of the Computer Performance Engineering, 2011

2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

2009
Performance prediction and procurement in practice: assessing the suitability of commodity cluster components for wavefront codes.
IET Software, 2009

WARPP: a toolkit for simulating high-performance parallel scientific codes.
Proceedings of the 2nd International Conference on Simulation Tools and Techniques for Communications, 2009

Predictive analysis and optimisation of pipelined wavefront computations.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Predictive Simulation of HPC Applications.
Proceedings of the IEEE 23rd International Conference on Advanced Information Networking and Applications, 2009

2007