Lawrence Rauchwerger

Affiliations:
  • Texas A&M University, Parasol Lab, College Station, USA


According to our database, Lawrence Rauchwerger authored at least 92 papers between 1990 and 2021.

Awards

IEEE Fellow 2012, "For contributions to thread-level speculation, parallelizing compilers, and parallel libraries".

Bibliography

2021
FFT blitz: the tensor cores strike back.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Accelerating SARS-CoV-2 low frequency variant calling on ultra deep sequencing datasets.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
Introduction to the Special Issue on PPoPP 2017 (Part 2).
ACM Trans. Parallel Comput., 2020

Provably optimal parallel transport sweeps on semi-structured grids.
J. Comput. Phys., 2020

2019
Rethinking Incremental and Parallel Pointer Analysis.
ACM Trans. Program. Lang. Syst., 2019

Introduction to the Special Issue on PPoPP 2017 (Part 1).
ACM Trans. Parallel Comput., 2019

Two Roads to Parallelism: From Serial Code to Programming with STAPL.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

2018
Nested Parallelism with Algorithmic Skeletons.
Proceedings of the Languages and Compilers for Parallel Computing, 2018

2016
Fast Approximate Distance Queries in Unweighted Graphs Using Bounded Asynchrony.
Proceedings of the Languages and Compilers for Parallel Computing, 2016

2015
Finding schedule-sensitive branches.
Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, 2015

A hierarchical approach to reducing communication in parallel graph algorithms.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Asynchronous Nested Parallelism for Dynamic Applications in Distributed Memory.
Proceedings of the Languages and Compilers for Parallel Computing, 2015

A Hybrid Approach to Processing Big Data Graphs on Memory-Restricted Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Composing Algorithmic Skeletons to Express High-Performance Scientific Applications.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

STAPL-RTS: An Application Driven Runtime System.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Scalable conditional induction variables (CIV) analysis.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

An Algorithmic Approach to Communication Reduction in Parallel Graph Algorithms.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
SCCMulti: an improved parallel strongly connected components algorithm.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

The stapl Skeleton Framework.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Author retrospective for adaptive reduction parallelization techniques.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

KLA: a new algorithmic paradigm for parallel graph computations.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

Processing big data graphs on memory-restricted systems.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

From petascale to the pocket: Adaptively scaling parallel programs for mobile SoCs.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2012
Logical inference techniques for loop parallelization.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

The STAPL Parallel Graph Library.
Proceedings of the Languages and Compilers for Parallel Computing, 2012

2011
Speculative Parallelization of Loops.
Proceedings of the Encyclopedia of Parallel Computing, 2011

The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization.
CoRR, 2011

The STAPL parallel container framework.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

A Hybrid Approach to Proving Memory Reference Monotonicity.
Proceedings of the Languages and Compilers for Parallel Computing, 2011

2010
STAPL: standard template adaptive parallel library.
Proceedings of SYSTOR 2010: The 3rd Annual Haifa Experimental Systems Conference, 2010

The STAPL pView.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

2009
Two memory allocators that use hints to improve locality.
Proceedings of the 8th International Symposium on Memory Management, 2009

2008
Implementation of Sensitivity Analysis for Automatic Parallelization.
Proceedings of the Languages and Compilers for Parallel Computing, 2008

Design for Interoperability in stapl: pMatrices and Linear Algebra Algorithms.
Proceedings of the Languages and Compilers for Parallel Computing, 2008

2007
The STAPL pArray.
Proceedings of the 2007 workshop on MEmory performance, 2007

Associative Parallel Containers in STAPL.
Proceedings of the Languages and Compilers for Parallel Computing, 2007

Sensitivity analysis for automatic parallelization on multi-cores.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007

2006
An Adaptive Algorithm Selection Framework for Reduction Parallelization.
IEEE Trans. Parallel Distributed Syst., 2006

SmartApps: middle-ware for adaptive applications on reconfigurable platforms.
ACM SIGOPS Oper. Syst. Rev., 2006

Armi: a High Level Communication Library for Stapl.
Parallel Process. Lett., 2006

Custom Memory Allocation for Free.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

Design and Use of htalib - A Library for Hierarchically Tiled Arrays.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

Region array SSA.
Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), 2006

2005
Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors.
ACM Trans. Archit. Code Optim., 2005

Finding strongly connected components in distributed graphs.
J. Parallel Distributed Comput., 2005

An Experimental Evaluation of the HP V-Class and SGI Origin 2000 Multiprocessors using Microbenchmarks and Scientific Applications.
Int. J. Parallel Program., 2005

Parallel protein folding with STAPL.
Concurr. Comput. Pract. Exp., 2005

A framework for adaptive algorithm selection in STAPL.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

Scalable Array SSA and Array Data Flow Analysis.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

2004
Automatic Parallelization Using the Value Evolution Graph.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

An Adaptive Algorithm Selection Framework.
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

The Value Evolution Graph and its Use in Memory Reference Analysis.
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003
Hybrid Analysis: Static & Dynamic Memory Reference Analysis.
Int. J. Parallel Program., 2003

ARMI: an adaptive, platform independent communication library.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

Using Software Logging to Support Multi-Version Buffering in Thread-Level Speculation.
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), 27 September, 2003

2002
Parallel Reductions: An Application of Adaptive Algorithm Selection.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

The R-LRPD Test: Speculative Parallelization of Partially Parallel Loops.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

SmartApps: An Application Centric Approach to High Performance Computing: Compiler-Assisted Software and Hardware Support for Reduction Operations.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Finding strongly connected components in parallel in particle transport sweeps.
Proceedings of the Thirteenth Annual ACM Symposium on Parallel Algorithms and Architectures, 2001

Identifying Strongly Connected Components in Parallel.
Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

STAPL: An Adaptive, Generic Parallel C++ Library.
Proceedings of the Languages and Compilers for Parallel Computing, 2001

Removing architectural bottlenecks to the scalability of speculative parallelization.
Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001

Architectural Support for Parallel Reductions in Scalable Shared-Memory Multiprocessors.
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

2000
Parallel computing for irregular applications.
Parallel Comput., 2000

Speculative Parallelization of Partially Parallel Loops.
Proceedings of the Languages, 2000

SmartApps: An Application Centric Approach to High Performance Computing.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

Adaptive reduction parallelization techniques.
Proceedings of the 14th international conference on Supercomputing, 2000

Techniques for Reducing the Overhead of Run-Time Parallelization.
Proceedings of the Compiler Construction, 9th International Conference, 2000

1999
The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization.
IEEE Trans. Parallel Distributed Syst., 1999

Parallel Transport Computations by Spatial Decomposition.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Run-Time Parallelization Optimization Techniques.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

Comparing the memory system performance of the HP V-class and SGI Origin 2000 multiprocessors using microbenchmarks and scientific applications.
Proceedings of the 13th international conference on Supercomputing, 1999

Hardware for Speculative Parallelization of Partially-Parallel Loops in DSM Multiprocessors.
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

Implementation Issues of Loop-Level Speculative Run-Time Parallelization.
Proceedings of the Compiler Construction, 8th International Conference, 1999

1998
Run-Time Parallelization: Its Time Has Come.
Parallel Comput., 1998

Standard Templates Adaptive Parallel Library (STAPL).
Proceedings of the Languages, 1998

Principles of Speculative Run-Time Parallelization.
Proceedings of the Languages and Compilers for Parallel Computing, 1998

Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

1996
Parallel Programming with Polaris.
Computer, 1996

Restructuring Programs for High-Speed Computers with Polaris.
Proceedings of the 1996 International Conference on Parallel Processing Workshop, 1996

1995
A scalable method for run-time loop parallelization.
Int. J. Parallel Program., 1995

Parallelizing while loops for multiprocessor systems.
Proceedings of IPPS '95, 1995

Run-Time Methods for Parallelizing Partially Parallel Loops.
Proceedings of the 9th international conference on Supercomputing, 1995

1994
Automatic Detection of Parallelism: A grand challenge for high performance computing.
IEEE Parallel Distributed Technol. Syst. Appl., 1994

Polaris: Improving the Effectiveness of Parallelizing Compilers.
Proceedings of the Languages and Compilers for Parallel Computing, 1994

The privatizing DOALL test: a run-time technique for DOALL loop identification and array privatization.
Proceedings of the 8th international conference on Supercomputing, 1994

1993
Measuring limits of parallelism and characterizing its vulnerability to resource constraints.
Proceedings of the 26th Annual International Symposium on Microarchitecture, 1993

1990
A multiple floating point coprocessor architecture.
SIGARCH Comput. Archit. News, 1990

