John Shalf

According to our database, John Shalf authored at least 117 papers between 1996 and 2018.

Bibliography

2018
SimpleSSD: Modeling Solid State Drives for Holistic System Simulation.
Computer Architecture Letters, 2018

Phase asynchronous AMR execution for productive and performant astrophysical flows.
Proceedings of the International Conference for High Performance Computing, 2018

MRG8: Random Number Generation for the Exascale Era.
Proceedings of the Platform for Advanced Scientific Computing Conference, 2018

Open2C: open-source generator for exploration of coherent cache memory subsystems.
Proceedings of the International Symposium on Memory Systems, 2018

2017
Trends in Data Locality Abstractions for HPC Systems.
IEEE Trans. Parallel Distrib. Syst., 2017

Towards an Integrated Strategy to Preserve Digital Computing Performance Scaling Using Emerging Technologies.
Proceedings of the High Performance Computing, 2017

Reconfigurable Silicon Photonic Interconnect for Many-Core Architecture.
Proceedings of the High Performance Computing, 2017

CASPER - Configurable design space exploration of programmable architectures for machine learning using beyond moore devices.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2017

TraceTracker: Hardware/software co-evaluation for large-scale I/O workload reconstruction.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Overlapping Data Transfers with Computation on GPU with Tiles.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Last Level Collective Hardware Prefetching For Data-Parallel Applications.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

OpenSoC system architect: An open toolkit for building soft-cores on FPGAs.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Nonintrusive AMR Asynchrony for Communication Optimization.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

APHiD: Hierarchical Task Placement to Enable a Tapered Fat Tree Topology for Lower Power and Cost in HPC Networks.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
NANDFlashSim: High-Fidelity, Microarchitecture-Aware NAND Flash Memory Simulation.
TOS, 2016

BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework.
SIAM J. Scientific Computing, 2016

TiDA: High-Level Programming Abstractions for Data Locality Management.
Proceedings of the High Performance Computing - 31st International Conference, 2016

Perilla: metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement.
Proceedings of the International Conference for High Performance Computing, 2016

Characterizing the Performance of Hybrid Memory Cube Using ApexMAP Application Probes.
Proceedings of the Second International Symposium on Memory Systems, 2016

OpenSoC Fabric: On-chip network generator.
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

Silicon photonic memory interconnect for many-core architectures.
Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

2015
ExaSAT: An exascale co-design tool for performance modeling.
IJHPCA, 2015

Computing beyond Moore's Law.
IEEE Computer, 2015

OpenNVM: An open-sourced FPGA-based NVM controller for low level memory characterization.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Memory Errors in Modern Systems: The Good, The Bad, and The Ugly.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures.
Proceedings of the 2015 International Conference on Parallel Architecture and Compilation, 2015

Integrating 3D Resistive Memory Cache into GPGPU for Energy-Efficient Data Processing.
Proceedings of the 2015 International Conference on Parallel Architecture and Compilation, 2015

2014
Abstract machine models and proxy architectures for exascale computing.
Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014

Variable-width datapath for on-chip network static power reduction.
Proceedings of the Eighth IEEE/ACM International Symposium on Networks-on-Chip, 2014

OpenSoC Fabric: On-Chip Network Generator: Using Chisel to Generate a Parameterizable On-Chip Interconnect Fabric.
Proceedings of the 2014 International Workshop on Network on Chip Architectures, 2014

Collective memory transfers for multi-core chips.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Triple-A: a Non-SSD based autonomic all-flash array for high performance storage systems.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013
Exascale Computing Trends: Adjusting to the "New Normal" for Computer Architecture.
Computing in Science and Engineering, 2013

Software Design Space Exploration for Exascale Combustion Co-design.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Exploring the future of out-of-core computing with compute-local non-volatile memory.
Proceedings of the International Conference for High Performance Computing, 2013

A communications simulation methodology for AMR codes using task dependency analysis.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

Extending Summation Precision for Network Reduction Operations.
Proceedings of the 25th International Symposium on Computer Architecture and High Performance Computing, 2013

Design of a large-scale storage-class RRAM system.
Proceedings of the International Conference on Supercomputing, 2013

Topic 14+16: High-Performance and Scientific Applications and Extreme-Scale Computing - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012
A preliminary evaluation of the hardware acceleration of the Cray Gemini interconnect for PGAS languages and comparison with MPI.
SIGMETRICS Performance Evaluation Review, 2012

Optimization of geometric multigrid for emerging multi- and manycore processors.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

The Analysis of Impact of Energy Efficiency Requirements on Programming Environments.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

NANDFlashSim: Intrinsic latency variation aware NAND flash memory system modeling and simulation at microarchitecture level.
Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012

Toward codesign in high performance computing systems.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Experiences with 100Gbps network applications.
Proceedings of the DIDC'12, 2012

On the Role of Co-design in High Performance Computing.
Proceedings of the Transition of HPC Towards Exascale Computing, 2012

2011
Green Flash: Climate Machine (LBNL).
Proceedings of the Encyclopedia of Parallel Computing, 2011

The International Exascale Software Project roadmap.
IJHPCA, 2011

Rethinking Hardware-Software Codesign for Exascale Systems.
IEEE Computer, 2011

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning.
Proceedings of the Conference on High Performance Computing Networking, 2011

Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms.
Proceedings of the Conference on High Performance Computing Networking, 2011

Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Hardware/software co-design for energy-efficient seismic modeling.
Proceedings of the Conference on High Performance Computing Networking, 2011

Let there be light!: the future of memory systems is photonics and 3D stacking.
Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011

2010
Communication Requirements and Interconnect Optimization for High-End Scientific Applications.
IEEE Trans. Parallel Distrib. Syst., 2010

Exascale Computing Technology Challenges.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

Parallel I/O performance: From events to ensembles.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

An auto-tuning framework for parallel multicore stencil computations.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Exascale Computing and the Role of Co-Design.
Proceedings of the High Performance Computing: From Grids and Clouds to Exascale, 2010

Silicon Nanophotonic Network-on-Chip Using TDM Arbitration.
Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010

Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud.
Proceedings of the Cloud Computing, Second International Conference, 2010

Defining future platform requirements for e-Science clouds.
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010

2009
Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors.
SIAM Review, 2009

HPC global file system performance analysis using a scientific-application derived benchmark.
Parallel Computing, 2009

Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms.
J. Parallel Distrib. Comput., 2009

Energy-Efficient Computing for Extreme-Scale Science.
IEEE Computer, 2009

A design methodology for domain-optimized power-efficient supercomputing.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

A Comparison of Different Communication Structures for Scalable Parallel Three Dimensional FFTs in First Principles Codes.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Analysis of photonic networks for a chip multiprocessor using scientific applications.
Proceedings of the Third International Symposium on Networks-on-Chips, 2009

Scalability challenges for massively parallel AMR applications.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture.
Proceedings of the Architecture of Computing Systems, 2009

2008
Towards Ultra-High Resolution Models of Climate and Weather.
IJHPCA, 2008

Scientific Application Performance On Leading Scalar and Vector Supercomputing Platforms.
IJHPCA, 2008

Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Lattice Boltzmann simulation optimization on leading multicore platforms.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Power efficiency in high performance computing.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
Scientific Computing Kernels on the Cell Processor.
International Journal of Parallel Programming, 2007

Optimization of sparse matrix-vector multiplication on emerging multicore platforms.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Investigation of leading HPC I/O performance using a scientific-application derived benchmark.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Scientific Application Performance on Candidate PetaScale Platforms.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Reconfigurable hybrid interconnection for static and dynamic scientific applications.
Proceedings of the 4th Conference on Computing Frontiers, 2007

2006
Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems.
Proceedings of the High Performance Computing for Computational Science, 2006

HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets using Fast Bitmap Indices.
Proceedings of the 18th International Conference on Scientific and Statistical Database Management, 2006

The potential of the cell processor for scientific computing.
Proceedings of the Third Conference on Computing Frontiers, 2006

Implicit and explicit optimizations for stencil computations.
Proceedings of the 2006 workshop on Memory System Performance and Correctness, 2006

2005
The Astrophysics Simulation Collaboratory Portal: a framework for effective distributed research.
Future Generation Comp. Syst., 2005

Performance evaluation of the SX-6 vector architecture for scientific computations.
Concurrency - Practice and Experience, 2005

Query-Driven Visualization of Large Data Sets.
Proceedings of the 16th IEEE Visualization Conference, 2005

DEX: Increasing the Capability of Scientific Data Analysis Pipelines by Using Efficient Bitmap Indices to Accelerate Scientific Visualization.
Proceedings of the 17th International Conference on Scientific and Statistical Database Management, 2005

Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Impact of modern memory subsystems on cache optimizations for stencil computations.
Proceedings of the 2005 workshop on Memory System Performance, 2005

2004
Scientific Computations on Modern Parallel Vector Systems.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003
Enabling Applications on the Grid: A Gridlab Overview.
IJHPCA, 2003

The Grid and Future Visualization System Architectures.
IEEE Computer Graphics and Applications, 2003

Deploying Web-Based Visual Exploration Tools on the Grid.
IEEE Computer Graphics and Applications, 2003

Grid-Distributed Visualizations Using Connectionless Protocols.
IEEE Computer Graphics and Applications, 2003

Interoperability of Visualization Software and Data Models is NOT an Achievable Goal.
Proceedings of the 14th IEEE Visualization 2003 Conference, 2003

Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Parallel Cell Projection Rendering of Adaptive Mesh Refinement Data.
Proceedings of the IEEE Symposium on Parallel and Large-Data Visualization and Graphics 2003, 2003

2002
Community software development with the Astrophysics Simulation Collaboratory.
Concurrency and Computation: Practice and Experience, 2002

The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development.
Cluster Computing, 2002

The Cactus Framework and Toolkit: Design and Applications.
Proceedings of the High Performance Computing for Computational Science, 2002

GridLab: Enabling Applications on the Grid.
Proceedings of the Grid Computing, 2002

2001
The Cactus Worm: Experiments with Dynamic Resource Discovery and Allocation in a Grid Environment.
IJHPCA, 2001

Cactus Tools for Grid Applications.
Cluster Computing, 2001

High-quality Volume Rendering of Adaptive Mesh Refinement Data.
Proceedings of the Vision Modeling and Visualization Conference 2001 (VMV-01), 2001

Extraction of Crack-free Isosurfaces from Adaptive Mesh Refinement Data.
Proceedings of the 2001 Joint Eurographics and IEEE TCVG Symposium on Visualization, 2001

The Astrophysics Simulation Collaboratory Portal: A Science Portal Enabling Community Software Development.
Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10 2001), 2001

2000
The Cactus Code: A Problem Solving Environment for the Grid.
Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing, 2000

1999
Diving deep: data-management and visualization strategies for adaptive mesh refinement simulations.
Computing in Science and Engineering, 1999

Solving Einstein's Equations on Supercomputers.
IEEE Computer, 1999

Numerical Relativity in a Distributed Environment.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

1996
Galaxies Collide On the I-Way: an Example of Heterogeneous Wide-Area Collaborative Supercomputing.
IJHPCA, 1996

