Andrew Lumsdaine

Orcid: 0000-0002-9153-6622

According to our database, Andrew Lumsdaine authored at least 235 papers between 1988 and 2024.

Bibliography

2024
Scalable, Programmable and Dense: The HammerBlade Open-Source RISC-V Manycore.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

2023
Minibatching Offers Improved Generalization Performance for Second Order Optimizers.
CoRR, 2023

2022
Direction-optimizing Label Propagation Framework for Structure Detection in Graphs: Design, Implementation, and Experimental Analysis.
ACM J. Exp. Algorithmics, 2022

NWHy: A Framework for Hypergraph Analytics: Representations, Data structures, and Algorithms.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

High-order Line Graphs of Non-uniform Hypergraphs: Algorithms, Applications, and Experimental Analysis.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

NWGraph: A Library of Generic Graph Algorithms and Data Structures in C++20.
Proceedings of the 36th European Conference on Object-Oriented Programming, 2022


The Parallel Boost Graph Library 2.0.
Proceedings of the Massive Graph Analytics, 2022

2021
Critique of "Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility" by SCC Team From University of Washington.
IEEE Trans. Parallel Distributed Syst., 2021

Parallel Algorithms for Efficient Computation of High-Order Line Graphs of Hypergraphs.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Towards Modern C++ Language Support for MPI.
Proceedings of the Workshop on Exascale MPI, 2021

2020
Unsupervised Monocular Depth Estimation From Light Field Image.
IEEE Trans. Image Process., 2020

Efficient Computation of High-Order Line Graphs of Hypergraphs.
CoRR, 2020

EduPar-20 Invited Panel.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020


Fast and Efficient Neural Network for Light Field Disparity Estimation.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Flexible Spatial and Angular Light Field Super Resolution.
Proceedings of the IEEE International Conference on Image Processing, 2020

Triangle Counting with Cyclic Distributions.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Direction-optimizing label propagation and its application to community detection.
Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

2019
A Comparative Study of Asynchronous Many-Tasking Runtimes: Cilk, Charm++, ParalleX and AM++.
CoRR, 2019

RDMA Managed Buffers: A Case for Accelerating Communication Bound Processes via Fine-Grained Events for Zero-Copy Message Passing.
Proceedings of the 18th International Symposium on Parallel and Distributed Computing, 2019

Learning Depth Cues from Focal Stack for Light Field Depth Estimation.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Distributed Direction-Optimizing Label Propagation for Community Detection.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

A Parallel Graph Environment for Real-World Data Analytics Workflows.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

A Synchronization-Avoiding Distance-1 Grundy Coloring Algorithm for Power-Law Graphs.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Real-Time Refocusing Using an FPGA-Based Standard Plenoptic Camera.
IEEE Trans. Ind. Electron., 2018

A scalable distance-1 vertex coloring algorithm for power-law graphs.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Distributed, Shared-Memory Parallel Triangle Counting.
Proceedings of the Platform for Advanced Scientific Computing Conference, 2018

Runtime Scheduling Policies for Distributed Graph Algorithms.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

PyGB: GraphBLAS DSL in Python with Dynamic Compilation Into Efficient C++.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Scale and Orientation Aware EPI-Patch Learning for Light Field Depth Estimation.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

rmalloc() and rpipe(): a uGNI-based Distributed Remote Memory Allocator and Access Library for One-sided Messaging.
Proceedings of the 8th International Workshop on Runtime and Operating Systems for Supercomputers, 2018

Adaptive Runtime Features for Distributed Graph Algorithms.
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Synchronization-Avoiding Graph Algorithms.
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Enabling Efficient Inter-Node Message Passing and Remote Memory Access Via a uGNI Based Light-Weight Network Substrate for Cray Interconnects.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

2017
Keeping up with technology: Teaching Parallel, Distributed and High-Performance Computing.
J. Parallel Distributed Comput., 2017

Families of Distributed Memory Parallel Graph Algorithms from Self-Stabilizing Kernels - An SSSP Case Study.
CoRR, 2017

Declarative Guide Creation.
Proceedings of the Visualization and Data Analysis 2017, Burlingame, CA, USA, 29 January 2017, 2017

POSTER: Distributed Control: The Benefits of Eliminating Global Synchronization via Effective Scheduling.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Characterizing Performance of Imbalanced Collectives on Hybrid and Task Centric Runtimes for Two-Phase Reduction.
Proceedings of the Languages and Compilers for Parallel Computing, 2017

Light-field flow: A subpixel-accuracy depth flow estimation with geometric occlusion model from a single light-field image.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Distributed-memory fast maximal independent set.
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Parallel Asynchronous Distributed-Memory Maximal Independent Set Algorithm with Work Ordering.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Families of Graph Algorithms: SSSP Case Study.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Edge-aware Light-Field Flow for Depth Estimation and Occlusion Detection.
Proceedings of the Computational Imaging XV, Burlingame, 2017

2016
Introduction to the Special Issue on PPoPP'14.
ACM Trans. Parallel Comput., 2016

Matrix-free Krylov iteration for implicit convolution of numerically low-rank data.
J. Comput. Appl. Math., 2016

A Survey of Methods for Collective Communication Optimization and Tuning.
CoRR, 2016

Abstract Graph Machine.
CoRR, 2016

The Value of Variance.
Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering, 2016

Context Matters: Distributed Graph Algorithms and Runtime Systems: A Case Study of Distributed Graph Traversals.
Proceedings of the Platform for Advanced Scientific Computing Conference, 2016

GBTL-CUDA: Graph Algorithms and Primitives for GPUs.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Depth estimation with cascade occlusion culling filter for light-field cameras.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016


Network-Managed Virtual Global Address Space for Message-driven Runtimes.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

Epoch Persistence: Safe, Efficient, On-demand Rendering for Streaming Data.
Proceedings of the 49th Hawaii International Conference on System Sciences, 2016

Improving Performance of Distributed Graph Traversals via Application-Aware Plug-In Work Scheduler.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015
The Anatomy of Large-Scale Distributed Graph Algorithms.
CoRR, 2015

Pixel-oriented techniques for visualizing next-generation HPC systems.
Proceedings of the 3rd IEEE Working Conference on Software Visualization, 2015

Dynamic parallelism for simple and efficient GPU graph algorithms.
Proceedings of the 5th Workshop on Irregular Applications - Architectures and Algorithms, 2015

An Embedded DSL for High Performance Declarative Communication with Correctness Guarantees in C++.
Proceedings of the Languages and Compilers for Parallel Computing, 2015

Declarative Patterns for Imperative Distributed Graph Algorithms.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

A Unifying Programming Model for Parallel Graph Algorithms.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

EduPar Introduction and Committees.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Comparison of Single Source Shortest Path Algorithms on Two Recent Asynchronous Many-task Runtime Systems.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Dynamic Adaptation for Elastic System Services Using Virtual Servers.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

Importance of Runtime Considerations in Performance Engineering of Large-Scale Distributed Graph Algorithms.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

Multimode plenoptic imaging.
Proceedings of the Digital Photography XI, 2015

2014
Multi-scale contrast-based saliency enhancement for salient object detection.
IET Comput. Vis., 2014

Abstract rendering: out-of-core rendering for information visualization.
Proceedings of the Visualization and Data Analysis 2014, 2014

Distributed control: priority scheduling for single source shortest paths without synchronization.
Proceedings of the Fourth Workshop on Irregular Applications, 2014

Region-based memory management for GPU programming languages: enabling rich data structures on a spartan host.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

Hybrid MPI: a case study on the Xeon Phi platform.
Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers, 2014

The radon image as plenoptic function.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Scoping rules on a platter: a framework for understanding and specifying name binding.
Proceedings of the 10th ACM SIGPLAN workshop on Generic programming, 2014

2013
Special Section Guest Editorial: Mobile Computational Photography.
J. Electronic Imaging, 2013

Optimizing process creation and execution on multi-core architectures.
Int. J. High Perform. Comput. Appl., 2013

What Makes Code Hard to Understand?
CoRR, 2013

Hybrid MPI: efficient message passing for multi-core systems.
Proceedings of the International Conference for High Performance Computing, 2013

Ownership passing: efficient distributed memory programming on multi-core systems.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

GPU Programming in Rust: Implementing High-Level Abstractions in a Systems-Level Language.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Expressing graph algorithms using generalized active messages.
Proceedings of the International Conference on Supercomputing, 2013

Line Assisted Light Field Triangulation and Stereo Matching.
Proceedings of the IEEE International Conference on Computer Vision, 2013


Plenoptic depth map in the case of occlusions.
Proceedings of the Multimedia Content and Mobile Devices 2013, 2013

Fourier analysis of the focused plenoptic camera.
Proceedings of the Multimedia Content and Mobile Devices 2013, 2013

Lytro camera technology: theory, algorithms, performance analysis.
Proceedings of the Multimedia Content and Mobile Devices 2013, 2013

Introduction to the JEI Focal Track Presentations.
Proceedings of the Multimedia Content and Mobile Devices 2013, 2013

Overplotting: Unified solutions under Abstract Rendering.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

2012
Efficient, dynamic data visualization with persistent data structures.
Proceedings of the Visualization and Data Analysis 2012, 2012

Position Paper: Logic Programming for Parallel Irregular Applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Breaking the speed and scalability barriers for graph exploration on distributed-memory machines.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Cognitive architectures: a way forward for the psychology of programming.
Proceedings of the ACM Symposium on New Ideas in Programming and Reflections on Software, 2012

Plenoptic rendering with interactive performance using GPUs.
Proceedings of the Image Processing: Algorithms and Systems X; and Parallel Processing for Imaging Applications II, 2012

Watch this: A taxonomy for dynamic data visualization.
Proceedings of the 7th IEEE Conference on Visual Analytics Science and Technology, 2012

Optimizing latency and throughput for spawning processes on massively multicore processors.
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers, 2012

Avalanche: a fine-grained flow graph model for irregular applications on distributed-memory systems.
Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing, 2012

The design and implementation of a multi-level content-addressable checkpoint file system.
Proceedings of the 19th International Conference on High Performance Computing, 2012

Spatial analysis of discrete plenoptic sampling.
Proceedings of the Digital Photography VIII, 2012

The multifocus plenoptic camera.
Proceedings of the Digital Photography VIII, 2012

An analysis of color demosaicing in plenoptic cameras.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Visualizing cells and their connectivity graphs for CompuCell3D.
Proceedings of the 2012 IEEE Symposium on Biological Data Visualization, 2012

Spatial autocorrelation-based information visualization evaluation.
Proceedings of the 2012 BELIV Workshop: Beyond Time and Errors, 2012

2011
A language for generic programming in the large.
Sci. Comput. Program., 2011

Using Focused Plenoptic Cameras for Rich Image Capture.
IEEE Computer Graphics and Applications, 2011

Active pebbles: a programming model for highly parallel fine-grained data-driven computations.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Declarative Parallel Programming for GPUs.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Kanor - A Declarative Language for Explicit Communication.
Proceedings of the Practical Aspects of Declarative Languages, 2011

Communication Optimization Beyond MPI.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Reasonable abstractions: Semantics for dynamic data visualization.
Proceedings of the 6th IEEE Conference on Visual Analytics Science and Technology, 2011

Active pebbles: parallel programming for data-driven applications.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

ConceptClang: an implementation of C++ concepts in Clang.
Proceedings of the seventh ACM SIGPLAN workshop on Generic programming, 2011

Partial globalization of partitioned address spaces for zero-copy communication with shared memory.
Proceedings of the 18th International Conference on High Performance Computing, 2011

Superresolution with the focused plenoptic camera.
Proceedings of the Computational Imaging IX, 2011

2010
Accurately measuring overhead, communication time and progression of blocking and nonblocking collective operations at massive scale.
Int. J. Parallel Emergent Distributed Syst., 2010

Focused plenoptic camera and rendering.
J. Electronic Imaging, 2010

Lazy Evaluation and Delimited Control.
Log. Methods Comput. Sci., 2010

Reducing Plenoptic Camera Artifacts.
Comput. Graph. Forum, 2010

Workflows for parameter studies of multi-cell modeling.
Proceedings of the 2010 Spring Simulation Multiconference, 2010

Lightfield photography: theory and methods.
Proceedings of the ACM SIGGRAPH ASIA 2010 Courses, 2010

Characterizing the Influence of System Noise on Large-Scale Applications by Simulation.
Proceedings of the Conference on High Performance Computing Networking, 2010

Checkpoint/Restart-Enabled Parallel Debugging.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Efficient MPI Support for Advanced Hybrid Programming Models.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Scalable communication protocols for dynamic sparse data exchange.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Extensible PGAS semantics for C++.
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, 2010

Automatic Application of the Data-State Model in Data-Flow Contexts.
Proceedings of the 14th International Conference on Information Visualisation, 2010

LogGOPSim: simulating large-scale applications in the LogGOPS model.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

A space-efficient parallel algorithm for computing betweenness centrality in distributed memory.
Proceedings of the 2010 International Conference on High Performance Computing, 2010

Theory and Methods of Lightfield Photography.
Proceedings of the 31st Annual Conference of the European Association for Computer Graphics, 2010

AM++: a generalized active message framework.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
LogGP in theory and practice - An in-depth analysis of modern interconnection networks and benchmarking methods for collective operations.
Simul. Model. Pract. Theory, 2009

The Effect of Network Noise on Large-Scale Collective Communications.
Parallel Process. Lett., 2009

Software Engineering and Computational Science.
Comput. Sci. Eng., 2009

PFunc: modern task parallelism for modern high performance computing.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Towards Efficient MapReduce Using MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Extending Task Parallelism For Frequent Pattern Mining.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Algebraic Guide Generation.
Proceedings of the 13th International Conference on Information Visualisation, 2009

The impact of network noise at large-scale communication performance.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

A power-aware, application-based performance study of modern commodity cluster interconnection networks.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Group Operation Assembly Language - A Flexible Way to Express Collective Communication.
Proceedings of the ICPP 2009, 2009

CIFTS: A Coordinated Infrastructure for Fault-Tolerant Systems.
Proceedings of the ICPP 2009, 2009

Interconnect agnostic checkpoint/restart in Open MPI.
Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, 2009

Optimized Routing for Large-Scale InfiniBand Networks.
Proceedings of the 17th IEEE Symposium on High Performance Interconnects, 2009

Demand-driven execution of static directed acyclic graphs using task parallelism.
Proceedings of the 16th International Conference on High Performance Computing, 2009

Reusable, generic program analyses and transformations.
Proceedings of the Generative Programming and Component Engineering, 2009

Toward foundations for type-reflective metaprogramming.
Proceedings of the Generative Programming and Component Engineering, 2009

Depth of Field in Plenoptic Cameras.
Proceedings of the 30th Annual Conference of the European Association for Computer Graphics, 2009

2008
Leveraging non-blocking collective communication in high-performance applications.
Proceedings of the 20th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2008), 2008

Representing unit test data for large scale software development.
Proceedings of the ACM 2008 Symposium on Software Visualization, 2008

Communication Optimization for Medical Image Reconstruction Algorithms.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Sparse Non-blocking Collectives in Quantum Mechanical Calculations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Design and implementation of a high-performance MPI for C# and the common language infrastructure.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

OpenMP Extensions for Generic Libraries.
Proceedings of the OpenMP in a New Era of Parallelism, 4th International Workshop, 2008

Stencil: A Conceptual Model for Representation and Interaction.
Proceedings of the 12th International Conference on Information Visualisation, 2008

Accurately measuring collective operations at massive scale.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Optimizing non-blocking collective operations for InfiniBand.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Stateless Clustering Using OSCAR and PERCEUS.
Proceedings of the 22nd Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2008), 2008

Integrating semantics and compilation: using C++ concepts to develop robust and efficient reusable libraries.
Proceedings of the Generative Programming and Component Engineering, 2008

Theory and Methods of Light-Field Photography.
Proceedings of the 29th Annual Conference of the European Association for Computer Graphics, 2008

Unified Frequency Domain Analysis of Lightfield Cameras.
Proceedings of the Computer Vision, 2008

Multistage switches are not crossbars: Effects of static routing in high-performance networks.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

Message progression in parallel computing - to thread or not to thread?
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

Overlapping Communication and Computation with High Level Communication Routines.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

2007
Challenges in Parallel Graph Processing.
Parallel Process. Lett., 2007

Optimizing a conjugate gradient solver with non-blocking collective operations.
Parallel Comput., 2007

An extended comparative study of language support for generic programming.
J. Funct. Program., 2007

Implementation and performance analysis of non-blocking collective operations for MPI.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

An Extensible Framework for Distributed Testing of MPI Implementations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

A Case for Standard Non-blocking Collective Operations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Analysis of Implementation Options for MPI-2 One-Sided.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Parallelization of Generic Libraries Based on Type Properties.
Proceedings of the Computational Science, 2007

Netgauge: A Network Performance Measurement Framework.
Proceedings of the High Performance Computing and Communications, 2007

Interpreting large visual similarity matrices.
Proceedings of the APVIS 2007, 2007

A comparison of vertex ordering algorithms for large graph visualization.
Proceedings of the APVIS 2007, 2007

2006
High-Performance Direct Pairwise Comparison of Large Genomic Sequences.
IEEE Trans. Parallel Distributed Syst., 2006

Modernizing the C++ Interface to MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Algorithm specialization in generic programming: challenges of constrained generics in C++.
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006

Runtime synthesis of high-performance code from scripting languages.
Proceedings of the Companion to the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2006

Concepts: linguistic support for generic programming in C++.
Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2006

Expression and Loop Libraries for High-Performance Code Synthesis.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

A Case for Non-blocking Collective Operations.
Proceedings of the Frontiers of High Performance Computing and Networking, 2006

Effecting parallel graph eigensolvers through library composition.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

DFS: A Simple to Write Yet Difficult to Execute Benchmark.
Proceedings of the 2006 IEEE International Symposium on Workload Characterization, 2006

Accelerating sparse matrix computations via data compression.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

The Introduction of the OSCAR Database API (ODA).
Proceedings of the 20th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2006), 2006

Distributed Force-Directed Graph Layout and Visualization.
Proceedings of the 6th Eurographics Symposium on Parallel Graphics and Visualization, 2006

Single-Source Shortest Paths with the Parallel Boost Graph Library.
Proceedings of the Shortest Path Problem, 2006

Open MPI: A High-Performance, Heterogeneous MPI.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

Parallel Tools and Environments: A Survey.
Proceedings of the Parallel Processing for Scientific Computing, 2006

2005
MultiArray: a C++ library for generic programming with arrays.
Softw. Pract. Exp., 2005

Generic Programming and High-Performance Libraries.
Int. J. Parallel Program., 2005

The LAM/MPI Checkpoint/Restart Framework: System-Initiated Checkpointing.
Int. J. High Perform. Comput. Appl., 2005

Using MPI with C# and the Common Language Infrastructure.
Concurr. Pract. Exp., 2005

Generic programming for high-performance scientific applications.
Concurr. Pract. Exp., 2005

Analysis of the Component Architecture Overhead in Open MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Essential language support for generic programming.
Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

Associated types and constraint propagation for mainstream object-oriented generics.
Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2005

Lifting sequential graph algorithms for distributed-memory parallel computation.
Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2005

Revamping the OSCAR Database: A Flexible Approach to Cluster Configuration Data Management.
Proceedings of the 19th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2005), 2005

Language Requirements for Large-Scale Generic Libraries.
Proceedings of the Generative Programming and Component Engineering, 2005

2004
TEG: A High-Performance, Scalable, Multi-network Point-to-Point Communications Methodology.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Open MPI's TEG Point-to-Point Communications Methodology: Comparison to Existing Implementations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

2003
The Lambda Library: unnamed functions in C++.
Softw. Pract. Exp., 2003

Krylov Subspace Acceleration of Waveform Relaxation.
SIAM J. Numer. Anal., 2003

A Component Architecture for LAM/MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

A comparative study of language support for generic programming.
Proceedings of the 2003 ACM SIGPLAN Conference on Object-Oriented Programming Systems, 2003

The Generic Message Passing Framework.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Concept-Controlled Polymorphism.
Proceedings of the Generative Programming and Component Engineering, 2003

2002
Guaranteed Optimization: Proving Nullspace Properties of Compilers.
Proceedings of the Static Analysis, 9th International Symposium, 2002

Concept-Based Component Libraries and Optimizing Compilers.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

The Boost Graph Library - User Guide and Reference Manual.
C++ in-depth series, Pearson / Prentice Hall, ISBN: 978-0-201-72914-6, 2002

2001
Object-oriented analysis and design of the Message Passing Interface.
Concurr. Comput. Pract. Exp., 2001

1999
The Matrix Template Library: generic components for high-performance scientific computing.
Comput. Sci. Eng., 1999

The Generic Graph Component Library.
Proceedings of the 1999 ACM SIGPLAN Conference on Object-Oriented Programming Systems, 1999

Generic Graph Algorithms for Sparse Matrix Ordering.
Proceedings of the Computing in Object-Oriented Parallel Environments, 1999

1998
The Matrix Template Library: A Generic Programming Approach to High Performance Numerical Linear Algebra.
Proceedings of the Computing in Object-Oriented Parallel Environments, 1998

A Rational Approach to Portable High Performance: The Basic Linear Algebra Instruction Set (BLAIS) and the Fixed Algorithm Size Template (FAST) Library.
Proceedings of the Object-Oriented Technology, ECOOP'98 Workshop Reader, 1998

The Matrix Template Library: A Unifying Framework for Numerical Linear Algebra.
Proceedings of the Object-Oriented Technology, ECOOP'98 Workshop Reader, 1998

1997
Spectra and Pseudospectra of Waveform Relaxation Operators.
SIAM J. Sci. Comput., 1997

Parallel Extensions to the Matrix Template Library.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

The Design and Evolution of the MPI-2 C++ Interface.
Proceedings of the Scientific Computing in Object-Oriented Parallel Environments, 1997

The Role of Abstraction in High-Performance Computing.
Proceedings of the Scientific Computing in Object-Oriented Parallel Environments, 1997

1996
Accelerated waveform methods for parallel transient simulation of semiconductor devices.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1996

MPI-2: Extending the Message-Passing Interface.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995
Decomposition of Space-Time Domains: Accelerated Waveform Methods, with Application to Semiconductor Device Simulation.
Proceedings of the Domain-Based Parallelism and Problem Decomposition Methods in Computational Science and Engineering, 1995

1994
Maximum Likelihood Parameter Estimation for Non-Gaussian Prior Signal Models.
Proceedings of the 1994 International Conference on Image Processing, 1994

1993
Massively parallel simulation algorithms for grid-based analog signal processors.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1993

Waveform Iterative Techniques for Device Transient Simulation on Parallel Machines.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Accelerated waveform methods for parallel transient simulation of semiconductor devices.
Proceedings of the 1993 IEEE/ACM International Conference on Computer-Aided Design, 1993

1991
Nonlinear analog networks for image smoothing and segmentation.
J. VLSI Signal Process., 1991

Conjugate Direction Waveform Methods for Transient Two-Dimensional Simulation for MOS Devices.
Proceedings of the 1991 IEEE/ACM International Conference on Computer-Aided Design, 1991

1990
Parallel Simulation Algorithms for Grid-Based Analog Signal Processors.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 1990

1988
A band relaxation algorithm for reliable and parallelizable circuit simulation.
Proceedings of the 1988 IEEE International Conference on Computer-Aided Design, 1988

