David A. Padua

According to our database, David A. Padua authored at least 185 papers between 1980 and 2020.

Awards

ACM Fellow

ACM Fellow 2007, "For contributions to compiler support for parallel computing."

IEEE Fellow

IEEE Fellow 2000, "For contributions to compiler technology for parallel computing."

Bibliography

2020
Parallel Processing, 1980 to 2020
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2020

2019
Managing code transformations for better performance portability.
Int. J. High Perform. Comput. Appl., 2019

Dataflow Execution of Hierarchically Tiled Arrays.
Proceedings of the Euro-Par 2019: Parallel Processing, 2019

Locus: A System and a Language for Program Optimization.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018
From High-Level Specification to High-Performance Code.
Proc. IEEE, 2018

An empirical study of the effect of source-level loop transformations on compiler stability.
Proc. ACM Program. Lang., 2018

Towards an Achievable Performance for the Loop Nests.
Proceedings of the Languages and Compilers for Parallel Computing, 2018

2017
LORE: A loop repository for the evaluation of compilers.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

A DSL for Performance Orchestration.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
High level abstractions and automatic optimization techniques for the programming of irregular algorithms.
Proceedings of the 6th Workshop on Irregular Applications: Architecture and Algorithms, 2016

DSMR: a shared and distributed memory algorithm for single-source shortest path problem.
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

DSMR: A Parallel Algorithm for Single-Source Shortest Path Problem.
Proceedings of the 2016 International Conference on Supercomputing, 2016

2015
Vectorization of apply to reduce interpretation overhead of R.
Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, 2015

Compilers and the Future of High Performance Computing.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

2014
Optimal Parallelogram Selection for Hierarchical Tiling.
ACM Trans. Archit. Code Optim., 2014

Vector seeker: a tool for finding vector potential.
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, 2014

Tiled Linear Algebra a System for Parallel Graph Algorithms.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

Directive-Based Compilers for GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

Optimizing R VM: Allocation Removal and Path Length Reduction via Interpreter-level Specialization.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013

Hydra: Automatic algorithm exploration from linear algebra equations.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

2012
Optimization techniques for efficient HTA programs.
Parallel Comput., 2012

Performance Portability with the Chapel Language.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Hierarchical overlapped tiling.
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

2011
Prolog Machines.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Processors-in-Memory.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Processes, Tasks, and Threads.
Proceedings of the Encyclopedia of Parallel Computing, 2011

POSIX Threads (Pthreads).
Proceedings of the Encyclopedia of Parallel Computing, 2011

Pipelining.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Petascale Computer.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Perfect Benchmarks.
Proceedings of the Encyclopedia of Parallel Computing, 2011

PARSEC Benchmarks.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Parallelization, Automatic.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Network of Workstations.
Proceedings of the Encyclopedia of Parallel Computing, 2011

nCUBE.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Proceedings of the Encyclopedia of Parallel Computing, 2011

MasPar.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Proceedings of the Encyclopedia of Parallel Computing, 2011

Proceedings of the Encyclopedia of Parallel Computing, 2011

NSF/IEEE-TCPP curriculum initiative on parallel and distributed computing: core topics for undergraduates.
Proceedings of the 42nd ACM technical symposium on Computer science education, 2011

Autotuning for high performance computing.
Proceedings of the WHPCF'11, 2011

Scheduling of stream-based real-time applications for heterogeneous systems.
Proceedings of the ACM SIGPLAN/SIGBED 2011 conference on Languages, 2011

Panel Statement.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

An Evaluation of Vectorizing Compilers.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
A Parallel Numerical Solver Using Hierarchically Tiled Arrays.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

Program Composition and Optimization: An Introduction.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Executive Summary - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Abstracts Collection - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

2009
Writing productive stencil codes with overlapped tiling.
Concurr. Comput. Pract. Exp., 2009

Communication contention in APN list scheduling algorithm.
Sci. China Ser. F Inf. Sci., 2009

Compiler research: the next 50 years.
Commun. ACM, 2009

Task-Parallel versus Data-Parallel Library-Based Programming in Multicore Systems.
Proceedings of the 17th Euromicro International Conference on Parallel, 2009

Optimization of tele-immersion codes.
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, 2009

2008
Design Issues in Parallel Array Languages for Shared Memory.
Proceedings of the Embedded Computer Systems: Architectures, 2008

Programming with tiles.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

P-Ray: A Software Suite for Multi-core Architecture Characterization.
Proceedings of the Languages and Compilers for Parallel Computing, 2008

Automatic generation of a parallel sorting algorithm.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
Optimizing Sorting with Machine Learning Algorithms.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
In search of a program generator to implement generic transformations for high-performance computing.
Sci. Comput. Program., 2006

Programming for parallelism and locality with hierarchically tiled arrays.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Optimizing data permutations for SIMD devices.
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006

Design and Use of htalib - A Library for Hierarchically Tiled Arrays.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

Hierarchically tiled arrays for parallelism and locality.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
Is Search Really Necessary to Generate High-Performance BLAS?
Proc. IEEE, 2005

SPIRAL: Code Generation for DSP Transforms.
Proc. IEEE, 2005

Special Issue on Program Generation, Optimization, and Platform Adaptation.
Proc. IEEE, 2005

Compiler techniques for high performance sequentially consistent java programs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

A sampling-based framework for parallel data mining.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

Evaluating the Impact of Thread Escape Analysis on a Memory Consistency Model-Aware Compiler.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

Analytic Models and Empirical Search: A Hybrid Approach to Code Optimization.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

A Language for the Compact Representation of Multiple Program Versions.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

Parallel mining of closed sequential patterns.
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005

An Empirical Study On the Vectorization of Multimedia Applications for Multimedia Extensions.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Optimizing Sorting with Genetic Algorithms.
Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

2004
Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms.
Int. J. High Perform. Comput. Appl., 2004

A compiler for multiple memory models.
Concurr. Comput. Pract. Exp., 2004

HiLO: High Level Optimization of FFTs.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays.
Proceedings of the Languages and Compilers for High Performance Computing, 2004

Performance Modeling and Programming Environments for Petaflops Computers and the Blue Gene Machine.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

A Dynamically Tuned Sorting Library.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

2003
Compiler Techniques for the Distribution of Data and Computation.
IEEE Trans. Parallel Distributed Syst., 2003

Programming the FlexRAM parallel intelligent memory system.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

A comparison of empirical and model-driven optimization.
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003, 2003

A Preliminary Study on the Vectorization of Multimedia Applications for Multimedia Extensions.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

The Power of Belady's Algorithm in Register Allocation for Long Basic Blocks.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

Programming for Locality and Parallelism with Hierarchically Tiled Arrays.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

Estimating cache misses and locality using stack distances.
Proceedings of the 17th Annual International Conference on Supercomputing, 2003

2002
An Advanced Compiler Framework for Non-Cache-Coherent Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 2002

Efficient and precise array access analysis.
ACM Trans. Program. Lang. Syst., 2002

MaJIC: Compiling MATLAB for Speed and Responsiveness.
Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2002

Automatic Implementation of Programming Language Consistency Models.
Proceedings of the Languages and Compilers for Parallel Computing, 15th Workshop, 2002

Calculating stack distances efficiently.
Proceedings of The Workshop on Memory Systems Performance (MSP 2002), 2002

The Pensieve Project: A Compiler Infrastructure for Memory Models.
Proceedings of the International Symposium on Parallel Architectures, 2002

Is OpenMP for Grids?
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Instance-wise points-to analysis for loop-based dependence testing.
Proceedings of the 16th international conference on Supercomputing, 2002

2001
Hiding Relaxed Memory Consistency with a Compiler.
IEEE Trans. Computers, 2001

SPL: A Language and Compiler for DSP Algorithms.
Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2001

Compiling for a Hybrid Programming Model Using the LMAD Representation.
Proceedings of the Languages and Compilers for Parallel Computing, 2001

Induction Variable Analysis without Idiom Recognition: Beyond Monotonicity.
Proceedings of the Languages and Compilers for Parallel Computing, 2001

A synthesis of memory mechanisms for distributed architectures.
Proceedings of the 15th international conference on Supercomputing, 2001

Monotonic evolution: an alternative to induction variable substitution for dependence analysis.
Proceedings of the 15th international conference on Supercomputing, 2001

Automatic Array Privatization.
Proceedings of the Compiler Optimizations for Scalable Parallel Systems Languages, 2001

2000
Compilers and Interpreters Archive.
ACM SIGPLAN Notices, 2000

Containers on the Parallelization of General-Purpose Java Programs.
Int. J. Parallel Program., 2000

The Fortran I compiler.
Comput. Sci. Eng., 2000

A Simple Framework to Calculate the Reaching Definition of Array References and Its Use in Subscript Array Analysis.
Concurr. Pract. Exp., 2000

Compiler analysis of irregular memory accesses.
Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2000

Searching for the Best FFT Formulas with the SPL Compiler.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

MaJIC: A Matlab Just-In-time Compiler.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

Analysis of Irregular Single-Indexed Array Accesses and Its Applications in Compiler Optimizations.
Proceedings of the Compiler Construction, 9th International Conference, 2000

Hiding Relaxed Memory Consistency with Compilers.
Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00), 2000

1999
The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization.
IEEE Trans. Parallel Distributed Syst., 1999

Techniques for the Translation of MATLAB Programs into Fortran 90.
ACM Trans. Program. Lang. Syst., 1999

On the automatic parallelization of sparse and irregular Fortran programs.
Sci. Program., 1999

Semantic Inlining - the Compiler Support for Java in Technical Computing.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Basic Compiler Algorithms for Parallel Programs.
Proceedings of the 1999 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'99), 1999

Demand-Driven Interprocedural Array Property Analysis.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

Compile-Time Based Performance Prediction.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

Access Descriptor Based Locality Analysis for Distributed-Shared Memory Multiprocessors.
Proceedings of the International Conference on Parallel Processing 1999, 1999

MATmarks: A Shared Memory Environment for MATLAB Programming.
Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, 1999

1998
On the Automatic Parallelization of the Perfect Benchmarks.
IEEE Trans. Parallel Distributed Syst., 1998

A Constant Propagation Algorithm for Explicitly Parallel Programs.
Int. J. Parallel Program., 1998

Simplification of Array Access Patterns for Compiler Optimizations.
Proceedings of the ACM SIGPLAN '98 Conference on Programming Language Design and Implementation (PLDI), 1998

Beyond Arrays - A Container-Centric Approach for Parallelization of Real-World Symbolic Applications.
Proceedings of the Languages and Compilers for Parallel Computing, 1998

Retrospective: The Cedar System.
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998

Experimental Study of Compiler Techniques for NUMA Machines.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

Parallelization of Benchmarks for Scalable Shared-Memory Multiprocessors.
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1997
Compiling for Scalable Multiprocessors with Polaris.
Parallel Process. Lett., 1997

Concurrent Static Single Assignment Form and Constant Propagation for Explicitly Parallel Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors.
Proceedings of the 1997 International Conference on Parallel Processing (ICPP '97), 1997

1996
Static and Dynamic Evaluation of Data Dependence Analysis Techniques.
IEEE Trans. Parallel Distributed Syst., 1996

Parallel Programming with Polaris.
Computer, 1996

Automatic Parallelization for Non-cache Coherent Multiprocessors.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

A MATLAB to Fortran 90 Translator and Its Effectiveness.
Proceedings of the 10th international conference on Supercomputing, 1996

Restructuring Programs for High-Speed Computers with Polaris.
Proceedings of the 1996 International Conference on Parallel Processing Workshop, 1996

1995
A scalable method for run-time loop parallelization.
Int. J. Parallel Program., 1995

Automatic Program Restructuring for Parallel Computing and the Polaris Fortran Translator.
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

Efficient Building and Placing of Gating Functions.
Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation (PLDI), 1995

Quantitative analysis of vector code.
Proceedings of the 3rd Euromicro Workshop on Parallel and Distributed Processing (PDP '95), 1995

FALCON: A MATLAB Interactive Restructuring Compiler.
Proceedings of the Languages and Compilers for Parallel Computing, 1995

Parallelizing while loops for multiprocessor systems.
Proceedings of IPPS '95, 1995

Gated SSA-based Demand-Driven Symbolic Analysis for Parallelizing Compilers.
Proceedings of the 9th international conference on Supercomputing, 1995

Run-Time Methods for Parallelizing Partially Parallel Loops.
Proceedings of the 9th international conference on Supercomputing, 1995

1994
The polaris internal representation.
Int. J. Parallel Program., 1994

Editors' introduction.
Int. J. Parallel Program., 1994

Automatic Detection of Parallelism: A grand challenge for high performance computing.
IEEE Parallel Distributed Technol. Syst. Appl., 1994

Polaris: Improving the Effectiveness of Parallelizing Compilers.
Proceedings of the Languages and Compilers for Parallel Computing, 1994

The privatizing DOALL test: a run-time technique for DOALL loop identification and array privatization.
Proceedings of the 8th international conference on Supercomputing, 1994

Comparing the Performance of the DASH and CEDAR Multiprocessors.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

1993
Restructuring Fortran programs for Cedar.
Concurr. Pract. Exp., 1993


Loop Transformations for Prolog Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 1993


Static and Dynamic Evaluation of Data Dependence Analysis.
Proceedings of the 7th international conference on Supercomputing, 1993

1992
Detecting Nondeterminacy in Parallel Programs.
IEEE Softw., 1992

Problem-solving environments for parallel computers.
Future Gener. Comput. Syst., 1992

Array Privatization for Shared and Distributed Memory Machines (Extended Abstract).
Proceedings of the 2nd SIGPLAN Workshop on Languages, Compilers, and Run-Time Environments for Distributed Memory Multiprocessors, Boulder, Colorado, September 30, 1992

Dynamic Dependence Analysis: A Novel Method for Data Dependence Evaluation.
Proceedings of the Languages and Compilers for Parallel Computing, 1992

1991
Guest Editor's Introduction.
IEEE Trans. Parallel Distributed Syst., 1991

Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

Fortran-Style Transformations for Functional Programs.
Proceedings of the International Conference on Parallel Processing, 1991

A Comparison of Four Synchronization Optimization Techniques.
Proceedings of the International Conference on Parallel Processing, 1991

Effects of Program Parallelization and Stripmining Transformation on Cache Performance in a Multiprocessor.
Proceedings of the International Conference on Parallel Processing, 1991

1990
Cedar Fortran and other vector and parallel Fortran dialects.
J. Supercomput., 1990

Issues in the Optimization of Parallel Programs.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

Cedar Fortran and Its Compiler.
Proceedings of the CONPAR 90, 1990

1989
Utilizing Multidimensional Loop Parallelism on Large-Scale Parallel Processor Systems.
IEEE Trans. Computers, 1989

Event synchronization analysis for debugging parallel programs.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

Problem solving environments.
Proceedings of the 13th Annual International Computer Software and Applications Conference, 1989

1988
OR parallel execution of Prolog programs with side effects.
J. Supercomput., 1988

Automatic Detection of Nondeterminacy in Parallel Programs.
Proceedings of the ACM SIGPLAN and SIGOPS Workshop on Parallel and Distributed Debugging, 1988

Parcel: project for the automatic restructuring and concurrent evaluation of LISP.
Proceedings of the 2nd international conference on Supercomputing, 1988

Automatic Compound Function Definition for Multiprocessors.
Proceedings of the International Conference on Parallel Processing, 1988

Prolog at the University of Illinois.
Proceedings of the COMPCON'88, Digest of Papers, Thirty-Third IEEE Computer Society International Conference, San Francisco, California, USA, February 29, 1988

1987
Compiler Algorithms for Synchronization.
IEEE Trans. Computers, 1987

Debugging Parallel Fortran on a Shared Memory Machine.
Proceedings of the International Conference on Parallel Processing, 1987

1986
Advanced Compiler Optimizations for Supercomputers.
Commun. ACM, 1986

Execution of Parallel Loops on Parallel Processor Systems.
Proceedings of the International Conference on Parallel Processing, 1986

Compiler Generated Synchronization for Do Loops.
Proceedings of the International Conference on Parallel Processing, 1986

Representing S-Expressions for the Efficient Evaluation of LISP on Parallel Processors.
Proceedings of the International Conference on Parallel Processing, 1986

1982
Some Results on the Working Set Anomalies in Numerical Programs.
IEEE Trans. Software Eng., 1982

A Second Opinion on Data Flow Machines and Languages.
Computer, 1982

1981
Interconnection Networks Using Shuffles.
Computer, 1981

Dependence Graphs and Compiler Optimizations.
Proceedings of the Conference Record of the Eighth Annual ACM Symposium on Principles of Programming Languages, 1981

1980
High-Speed Multiprocessors and Compilation Techniques.
IEEE Trans. Computers, 1980

