Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016

LiRa: A New Likelihood-Based Similarity Score for Collaborative Filtering.

Veronika Strnadová-Neeley

CoRR, 2016

Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor.

Proceedings of the High Performance Computing, 2016

Extreme scale plasma turbulence simulations on top supercomputers worldwide.

Carlos Rosales-Fernandez

Timothy J. Williams

Proceedings of the International Conference for High Performance Computing, 2016

Evaluating and Optimizing the NERSC Workload on Knights Landing.

Proceedings of the 7th International Workshop on Performance Modeling, 2016

Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015

Special issue "Graph analysis for scientific discovery".

Aydin Buluç

Leonid Oliker

John R. Gilbert

Parallel Comput., 2015

Parallel processing of filtered queries in attributed semantic graphs.

J. Parallel Distributed Comput., 2015

HipMer: an extreme-scale de novo genome assembler.

Proceedings of the International Conference for High Performance Computing, 2015

Thread-level parallelization and optimization of NWChem for the Intel MIC architecture.

Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, 2015

merAligner: A Fully Parallel Sequence Aligner.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Compiler-Directed Transformation for Higher-Order Stencils.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Parallel Performance Optimizations on Unstructured Mesh-based Simulations.

Jeffrey K. Hollingsworth

Allen D. Malony

Samuel Williams

Leonid Oliker

Proceedings of the International Conference on Computational Science, 2015

Efficient data reduction for large-scale genetic mapping.

Veronika Strnadová-Neeley

Proceedings of the 6th ACM Conference on Bioinformatics, 2015

2014

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis.

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly.

Proceedings of the International Conference for High Performance Computing, 2014

Efficient and accurate clustering for large-scale genetic mapping.

Proceedings of the 2014 IEEE International Conference on Bioinformatics and Biomedicine, 2014

2013

Best paper awards: 26th international parallel and distributed processing symposium (IPDPS 2012).

Leonid Oliker

Katherine A. Yelick

J. Parallel Distributed Comput., 2013

Introduction for Special Issue on Autotuning.

Leonid Oliker

Richard W. Vuduc

Int. J. High Perform. Comput. Appl., 2013

Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms.

Int. J. High Perform. Comput. Appl., 2013

Kinetic turbulence simulations at extreme scale on leadership-class systems.

Proceedings of the International Conference for High Performance Computing, 2013

Performance Tuning of Fock Matrix and Two-Electron Integral Calculations for NWChem on Leading HPC Platforms.

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

High-Productivity and High-Performance Analysis of Filtered Semantic Graphs.

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Compiler generation and autotuning of communication-avoiding operators for geometric multigrid.

Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

2012

Optimization of Parallel Particle-to-Grid Interpolation on Leading Multicore Platforms.

IEEE Trans. Parallel Distributed Syst., 2012

Optimization of geometric multigrid for emerging multi- and manycore processors.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Poster: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Can Network-Offload Based Non-blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms?

Krishna Chaitanya Kandalla

Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

High-performance analysis of filtered semantic graphs.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Green Flash: Climate Machine (LBNL).

Proceedings of the Encyclopedia of Parallel Computing, 2011

Emerging programming paradigms for large-scale scientific computing.

Leonid Oliker

Rajesh Nishtala

Rupak Biswas

Parallel Comput., 2011

Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms.

Parallel Comput., 2011

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning.

Proceedings of the Conference on High Performance Computing Networking, 2011

Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems.

Proceedings of the Conference on High Performance Computing Networking, 2011

Hardware/software co-design for energy-efficient seismic modeling.

Proceedings of the Conference on High Performance Computing Networking, 2011

Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Cosmic microwave background map-making at the petascale and beyond.

Rajesh Sudarsan

Julian Borrill

Christopher Cantalupo

Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

2010

Communication Requirements and Interconnect Optimization for High-End Scientific Applications.

IEEE Trans. Parallel Distributed Syst., 2010

Parallel I/O performance: From events to ensembles.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

An auto-tuning framework for parallel multicore stencil computations.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures.

Aparna Chandramowlishwaran

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Silicon Nanophotonic Network-on-Chip Using TDM Arbitration.

Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010

Sparse Matrix-Vector Multiplication on Multicore and Accelerators.

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

Auto-Tuning Stencil Computations on Multicore and Accelerators.

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

2009

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors.

SIAM Rev., 2009

HPC global file system performance analysis using a scientific-application derived benchmark.

Parallel Comput., 2009

Revolutionary technologies for acceleration of emerging petascale applications.

Rupak Biswas

Leonid Oliker

Jeffrey S. Vetter

Parallel Comput., 2009

Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms.

J. Parallel Distributed Comput., 2009

Energy-Efficient Computing for Extreme-Scale Science.

Computer, 2009

A design methodology for domain-optimized power-efficient supercomputing.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Analysis of photonic networks for a chip multiprocessor using scientific applications.

Proceedings of the Third International Symposium on Networks-on-Chips, 2009

Green flash: Designing an energy efficient climate supercomputer.

Leonid Oliker

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture.

Proceedings of the Architecture of Computing Systems, 2009

2008

Towards Ultra-High Resolution Models of Climate and Weather.

Michael F. Wehner

Leonid Oliker

John Shalf

Int. J. High Perform. Comput. Appl., 2008

Scientific Application Performance On Leading Scalar and Vector Supercomputering Platforms.

Int. J. High Perform. Comput. Appl., 2008

Preface.

Rupak Biswas

Leonid Oliker

Int. J. High Perform. Comput. Appl., 2008

Large-scale gyrokinetic particle simulation of microturbulence in magnetically confined fusion plasmas.

IBM J. Res. Dev., 2008

Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Lattice Boltzmann simulation optimization on leading multicore platforms.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007

Scientific Computing Kernels on the Cell Processor.

Int. J. Parallel Program., 2007

Optimization of sparse matrix-vector multiplication on emerging multicore platforms.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Investigation of leading HPC I/O performance using a scientific-application derived benchmark.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Scientific Application Performance on Candidate PetaScale Platforms.

Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Reconfigurable hybrid interconnection for static and dynamic scientific applications.

Proceedings of the 4th Conference on Computing Frontiers, 2007

2006

Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems.

Jonathan Carter

Leonid Oliker

John Shalf

Proceedings of the High Performance Computing for Computational Science, 2006

The potential of the cell processor for scientific computing.

Proceedings of the Third Conference on Computing Frontiers, 2006

Performance characteristics of an adaptive mesh refinement calculation on scalar and vector platforms.

Proceedings of the Third Conference on Computing Frontiers, 2006

Implicit and explicit optimizations for stencil computations.

Proceedings of the 2006 workshop on Memory System Performance and Correctness, 2006

Performance Evaluation and Modeling of Ultra-Scale Systems.

Leonid Oliker

Rupak Biswas

Rob F. Van der Wijngaart

David H. Bailey

Allan Snavely

Proceedings of the Parallel Processing for Scientific Computing, 2006

2005

Performance evaluation of the SX-6 vector architecture for scientific computations.

Rob F. Van der Wijngaart

Concurr. Pract. Exp., 2005

Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Leading Computational Methods on Scalar and Vector HEC Platforms.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Integrated Performance Monitoring of a Cosmology Application on Leading HEC Platforms.

Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Impact of modern memory subsystems on cache optimizations for stencil computations.

Proceedings of the 2005 workshop on Memory System Performance, 2005

2004

A Performance Evaluation of the Cray X1 for Scientific Applications.

Proceedings of the High Performance Computing for Computational Science, 2004

Scientific Computations on Modern Parallel Vector Systems.

Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Performance Characteristics of a Cosmology Package on Leading HPC Architectures.

Jonathan Carter

Julian Borrill

Leonid Oliker

Proceedings of the High Performance Computing, 2004

2003

Message passing and shared address space parallelism on an SMP cluster.

Parallel Comput., 2003

Job Superscheduler Architecture and Performance in Computational Grid Environments.

Hongzhang Shan

Leonid Oliker

Rupak Biswas

Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations.

Rob F. Van der Wijngaart

Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Performance Evaluation of Two Emerging Media Processors: VIRAM and Imagine.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002

Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations.

SIAM Rev., 2002

Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines.

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001

Design Strategies for Irregularly Adapting Parallel Applications.

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

Ordering Sparse Matrices for Cache-Based Systems.

Rupak Biswas

Leonid Oliker

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

Message Passing Vs. Shared Address Space on a Clusters of SMPs.

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

2000

Parallelization of a Dynamic Unstructured Algorithm Using Three Leading Programming Paradigms.

Leonid Oliker

Rupak Biswas

IEEE Trans. Parallel Distributed Syst., 2000

Parallel tetrahedral mesh adaptation with dynamic load balancing.

Leonid Oliker

Rupak Biswas

Harold N. Gabow

Parallel Comput., 2000

ESP: A System Utilization Benchmark.

Proceedings of Supercomputing 2000, 2000

A Comparison of Three Programming Models for Adaptive Applications on the Origin2000.

Proceedings of Supercomputing 2000, 2000

System Utilization Benchmark on the Cray T3E and IBM SP.

Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems.

Proceedings of the Parallel and Distributed Processing, 2000

1999

Parallelization of a Dynamic Unstructured Application using Three Leading Paradigms.

Leonid Oliker

Rupak Biswas

Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

Dynamic Load Balancing for Parallel Adaptive Unstructured Grid Computations.

Rupak Biswas

Leonid Oliker

Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications.

Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

1998

PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes.

Leonid Oliker

Rupak Biswas

J. Parallel Distributed Comput., 1998

Performance Analysis and Portability of the PLUM Load Balancing System.

Leonid Oliker

Rupak Biswas

Harold N. Gabow

Proceedings of the Euro-Par '98 Parallel Processing, 1998

1997

Efficient Load Balancing and Data Remapping for Adaptive Grid Calculations.

Leonid Oliker

Rupak Biswas

Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, 1997

Load Balancing Unstructured Adaptive Grids for CFD Problems.

Rupak Biswas

Leonid Oliker

Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

Load balancing sequences of unstructured adaptive grids.

Rupak Biswas

Leonid Oliker

Proceedings of the Fourth International Conference on High-Performance Computing, 1997

1996

Algorithms for Automatic Alignment of Arrays.

Siddhartha Chatterjee

J. Parallel Distributed Comput., 1996

Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems.

Rupak Biswas

Leonid Oliker

Andrew Sohn

Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2.

Leonid Oliker

Rupak Biswas

Roger C. Strawn

Proceedings of the Parallel Algorithms for Irregularly Structured Problems, 1996

Leonid Oliker
