Ken Kennedy

According to our database, Ken Kennedy
  • authored at least 191 papers between 1975 and 2016.
  • has a "Dijkstra number" of three.

Bibliography

2016
Algebraic multigrid support vector machines.
CoRR, 2016

2015
Automotive big data: Applications, workloads and infrastructures.
Proceedings of the 2015 IEEE International Conference on Big Data, 2015

2013
Automotive big data.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

2012
A self-stabilizing algorithm for optimally efficient sets in graphs.
Inf. Process. Lett., 2012

2011
The rise and fall of high performance Fortran.
Commun. ACM, 2011

2009
Scheduling Tasks to Maximize Usage of Aggregate Variables in Place.
Proceedings of the Compiler Construction, 18th International Conference, 2009

2008
Model-guided empirical tuning of loop fusion.
IJHPSA, 2008

Redundancy elimination revisited.
Proceedings of the 17th International Conference on Parallel Architecture and Compilation Techniques, 2008

2007
Improving Compilation of Java Scientific Applications.
IJHPCA, 2007

Compiling Parallel MATLAB for General Distributions using Telescoping Languages.
Proceedings of the IEEE International Conference on Acoustics, 2007

The rise and fall of High Performance Fortran: an historical object lesson.
Proceedings of the Third ACM SIGPLAN History of Programming Languages Conference (HOPL-III), 2007

Relative Performance of Scheduling Algorithms in Grid Environments.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006
Automatic tuning of whole applications using direct search and a performance-based transformation system.
The Journal of Supercomputing, 2006

Grid scheduling and protocols - Evaluation of a workflow scheduler using integrated performance modelling and batch queue wait time prediction.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Dependence-Based Code Generation for a CELL Processor.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

Performance modeling and prediction for scientific Java applications.
Proceedings of the 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006

Why Performance Models Matter for Grid Computing.
Proceedings of the Grid-Based Problem Solving Environments, 2006

Profitable loop fusion and tiling using model-driven empirical search.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Software Challenges for Multicore Computing.
Proceedings of the High Performance Computing, 2006

Scalable Grid Application Scheduling via Decoupled Resource Selection and Scheduling.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005
Scalarization Using Loop Alignment and Loop Skewing.
The Journal of Supercomputing, 2005

Telescoping Languages: A System for Automatic Generation of Domain Languages.
Proceedings of the IEEE, 2005

New Grid Scheduling and Rescheduling Methods in the GrADS Project.
International Journal of Parallel Programming, 2005

An algorithm for partial Grundy number on trees.
Discrete Mathematics, 2005

Compiling almost-whole Java programs.
Concurrency - Practice and Experience, 2005

A Cache-Conscious Profitability Model for Empirical Tuning of Loop Fusion.
Proceedings of the Languages and Compilers for Parallel Computing, 2005

Scalarization on Short Vector Machines.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005

Scheduling strategies for mapping application workflows onto the grid.
Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing, 2005

Task scheduling strategies for workflow-based applications in grids.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

2004
Transforming Complex Loop Nests for Locality.
The Journal of Supercomputing, 2004

Improving effective bandwidth through compiler enhancement of global cache reuse.
J. Parallel Distrib. Comput., 2004

Improving Memory Hierarchy Performance through Combined Loop Interchange and Multi-Level Fusion.
IJHPCA, 2004

Defining and Measuring the Productivity of Programming Languages.
IJHPCA, 2004

New Grid Scheduling and Rescheduling Methods in the GrADS Project.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Scheduling workflow applications in GrADS.
Proceedings of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004

Automatic blocking of QR and LU factorizations for locality.
Proceedings of the 2004 workshop on Memory System Performance, 2004

2003
Automatic Type-Driven Library Generation for Telescoping Languages.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Slice-Hoisting for Array-Size Inference in MATLAB.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

2002
Reducing and Vectorizing Procedures for Telescoping Languages.
International Journal of Parallel Programming, 2002

KELPIO: a telescope-ready domain-specific I/O library for irregular block-structured applications.
Future Generation Comp. Syst., 2002

Advanced optimization strategies in the Rice dHPF compiler.
Concurrency and Computation: Practice and Experience, 2002

A Rice University perspective on software engineering licensing.
Commun. ACM, 2002

Fast Copy Coalescing and Live-Range Identification.
Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2002

Almost-whole-program compilation.
Proceedings of the 2002 Joint ACM-ISCOPE Conference on Java Grande 2002, 2002

Toward a Framework for Preparing and Executing Adaptive Grid Programs.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
What Are the Top Ten Most Influential Parallel and Distributed Processing Concepts of the Past Millennium?
J. Parallel Distrib. Comput., 2001

Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries.
J. Parallel Distrib. Comput., 2001

Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings.
International Journal of Parallel Programming, 2001

Fast Greedy Weighted Fusion.
International Journal of Parallel Programming, 2001

The GrADS Project: Software Support for High-Level Grid Application Development.
IJHPCA, 2001

JaMake: A Java Compiler Environment.
Proceedings of the Large-Scale Scientific Computing, Third International Conference, 2001

Improving Effective Bandwidth through Compiler Enhancement of Global Cache Reuse.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Optimizing strategies for telescoping languages: procedure strength reduction and procedure vectorization.
Proceedings of the 15th international conference on Supercomputing, 2001

Software Support for High Performance Problem-Solving on Computational Grids.
Proceedings of the Computational Science - ICCS 2001, 2001

High Performance Fortran 2.0.
Proceedings of the Compiler Optimizations for Scalable Parallel Systems Languages, 2001

KelpIO: A Telescope-Ready Domain-Specific I/O Library for Irregular Block-Structured Applications.
Proceedings of the First IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001

Optimizing Compilers for Modern Architectures: A Dependence-based Approach
Morgan Kaufmann, ISBN: 1-55860-286-0, 2001

2000
A balanced code placement framework.
ACM Trans. Program. Lang. Syst., 2000

Transforming loops to recursion for multi-level memory hierarchies.
Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2000

Telescoping Languages: A Compiler Strategy for Implementation of High-Level Domain-Specific Programming Systems.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

The Memory Bandwidth Bottleneck and its Amelioration by a Compiler.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

Fast greedy weighted fusion.
Proceedings of the 14th international conference on Supercomputing, 2000

1999
The cost of being object-oriented: A preliminary study.
Scientific Programming, 1999

Prospects for Scientific Computing in Polymorphic, Object-Oriented Style.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

Improving Cache Performance in Dynamic Applications through Data and Computation Reorganization at Run Time.
Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1999

Inter-array Data Regrouping.
Proceedings of the Languages and Compilers for Parallel Computing, 1999

Improving memory hierarchy performance for irregular applications.
Proceedings of the 13th international conference on Supercomputing, 1999

1998
Automatic Data Layout for Distributed-Memory Machines.
ACM Trans. Program. Lang. Syst., 1998

Loop Fusion in High Performance Fortran.
Proceedings of the 12th international conference on Supercomputing, 1998

1997
Automatic Data Distribution for Composite Grid Applications.
Scientific Programming, 1997

Experiences in Data-Parallel Programming.
Scientific Programming, 1997

Optimizing Java: theory and practice.
Concurrency - Practice and Experience, 1997

A Nationwide Parallel Computing Environment.
Commun. ACM, 1997

Compiling Stencils in High Performance Fortran.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1997

1996
Optimal register assignment to loops for embedded code generation.
ACM Trans. Design Autom. Electr. Syst., 1996

Interprocedural Compilation on Fortran D.
J. Parallel Distrib. Comput., 1996

Parallelization support for coupled grid applications with small meshes.
Concurrency - Practice and Experience, 1996

Dependence Analysis of Fortran90 Array Syntax.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1996

Resource-Based Communication Placement Analysis.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

Cross-Loop Reuse Analysis and Its Application to Cache Optimizations.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

A Method for Register Allocation to Loops in Multiple Register File Architectures.
Proceedings of IPPS '96, 1996

A communication placement framework with unified dependence and data-flow analysis.
Proceedings of the 3rd International Conference on High Performance Computing, 1996

1995
Integer Programming for Array Subscript Analysis.
IEEE Trans. Parallel Distrib. Syst., 1995

Automatic Data Layout for High Performance Fortran.
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995, 1995

Index Array Flattening Through Program Transformation.
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995, 1995

Distributed Information Management in the National HPCC Software Exchange.
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995, 1995

An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs.
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995, 1995

A Linear-Time Algorithm for Computing the Memory Access Sequence in Data-Parallel Programs.
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1995

A Model and Compilation Strategy for Out-of-Core Data Parallel Programs.
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1995

Optimizing Fortran 90 Shift Operations on Distributed-Memory Multicomputers.
Proceedings of the Languages and Compilers for Parallel Computing, 1995

Optimal register assignment to loops for embedded code generation.
Proceedings of the 8th International Symposium on System Synthesis (ISSS 1995), 1995

Combining dependence and data-flow analyses to optimize communication.
Proceedings of IPPS '95, 1995

Efficient Address Generation for Block-Cyclic Distributions.
Proceedings of the 9th international conference on Supercomputing, 1995

Management of the National HPCC Software Exchange - A Virtual Distributed Digital Library.
Proceedings of the Second Annual Conference on the Theory and Practice of Digital Libraries, 1995

The Prospects for Architecture-Independent Parallel Programming.
Proceedings of the 1995 ACM 23rd Annual Conference on Computer Science, CSC '95, Nashville, TN, USA, February 28, 1995

1994
Improving the Ratio of Memory Operations to Floating-Point Operations in Loops.
ACM Trans. Program. Lang. Syst., 1994

Scalar Replacement in the Presence of Conditional Control Flow.
Softw., Pract. Exper., 1994

Evaluating Compiler Optimizations for Fortran D.
J. Parallel Distrib. Comput., 1994

Compiler technology for machine-independent parallel programming.
International Journal of Parallel Programming, 1994

Centers of Supercomputing: Making Parallel Computing Truly Usable: Research, Education, and Knowledge Transfer At the Center for Research On Parallel Computation.
IJHPCA, 1994

Integrated Support for Task and Data Parallelism.
IJHPCA, 1994

Requirements for Data-Parallel Programming Environments.
IEEE P&DT, 1994

The D editor: a new interactive parallel programming tool.
Proceedings of Supercomputing '94, 1994

GIVE-N-TAKE - A Balanced Code Placement Framework.
Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation (PLDI), 1994

Parallelization of Linearized Applications in Fortran D.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Automatic Data Layout Using 0-1 Integer Programming.
Proceedings of the Parallel Architectures and Compilation Techniques, 1994

Compilation techniques for block-cyclic distributions.
Proceedings of the 8th international conference on Supercomputing, 1994

Parallel Processing: What Have We Done Wrong?
Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994

The Prospects for Architecture-Independent Parallel Programming.
Proceedings of the 1994 International Conference on Parallel and Distributed Systems, 1994

Value-Based Distributions in Fortran D.
Proceedings of the High-Performance Computing and Networking, 1994

1993
Unified Compilation of Fortran 77D and 90D.
LOPLAS, 1993

Analysis and transformation in an interactive parallel programming tool.
Concurrency - Practice and Experience, 1993

A Methodology for Procedure Cloning.
Comput. Lang., 1993

Preliminary experiences with the Fortran D compiler.
Proceedings of Supercomputing '93, 1993

Cache coherence using local knowledge.
Proceedings of Supercomputing '93, 1993

Experiences Using the ParaScope Editor: an Interactive Parallel Programming Tool.
Proceedings of the Fourth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1993

Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution.
Proceedings of the Languages and Compilers for Parallel Computing, 1993

1992
Software for supercomputers of the future.
The Journal of Supercomputing, 1992

Vector Register Allocation.
IEEE Trans. Computers, 1992

Efficient Call Graph Analysis.
LOPLAS, 1992

Supercomputing - Introduction to the Special Section.
Commun. ACM, 1992

Compiling Fortran D for MIMD Distributed Memory Machines.
Commun. ACM, 1992

Interprocedural Compilation of Fortran D for MIMD Distributed-Memory Machines.
Proceedings of Supercomputing '92, 1992

Compiler Blockability of Numerical Algorithms.
Proceedings of Supercomputing '92, 1992

Relaxing SIMD Control Flow Constraints using Loop Transformations.
Proceedings of the ACM SIGPLAN'92 Conference on Programming Language Design and Implementation (PLDI), 1992

Compiler Analysis for Irregular Problems in Fortran D.
Proceedings of the Languages and Compilers for Parallel Computing, 1992

Optimizing for parallelism and data locality.
Proceedings of the 6th international conference on Supercomputing, 1992

Evaluation of compiler optimizations for Fortran D on MIMD distributed memory machines.
Proceedings of the 6th international conference on Supercomputing, 1992

Automatic software cache coherence through vectorization.
Proceedings of the 6th international conference on Supercomputing, 1992

Procedure cloning.
Proceedings of the ICCL'92, 1992

1991
Interactive Parallel Programming using the ParaScope Editor.
IEEE Trans. Parallel Distrib. Syst., 1991

An Implementation of Interprocedural Bounded Regular Section Analysis.
IEEE Trans. Parallel Distrib. Syst., 1991

Compiler optimizations for Fortran D on MIMD distributed-memory machines.
Proceedings of Supercomputing '91, 1991

Interprocedural transformations for parallel code generation.
Proceedings of Supercomputing '91, 1991

A Static Performance Estimator to Guide Data Partitioning Decisions.
Proceedings of the Third ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1991

Practical Dependence Testing.
Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation (PLDI), 1991

An Overview of the Fortran D Programming System.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

Analysis and transformation in the ParaScope editor.
Proceedings of the 5th international conference on Supercomputing, 1991

Software Prefetching.
Proceedings of ASPLOS-IV, 1991

1990
Constructing the Procedure Call Multigraph.
IEEE Trans. Software Eng., 1990

Loop distribution with arbitrary control flow.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990, 1990

Parallel program debugging with on-the-fly anomaly detection.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990, 1990

Experience with interprocedural analysis of array side effects.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990, 1990

Analysis of Event Synchronization in A Parallel Programming Tool.
Proceedings of the Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1990

Improving register allocation for subscripted variables (with retrospective)
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1990

Improving Register Allocation for Subscripted Variables.
Proceedings of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation (PLDI), 1990

1989
1988 Gordon Bell Prize.
IEEE Software, 1989

Performance of parallel processors.
Parallel Computing, 1989

The parascope editor: an interactive parallel programming tool.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989, 1989

Blocking Linear Algebra Codes for Memory Hierarchies.
Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

Fast Interprocedural Alias Analysis.
Proceedings of the Conference Record of the Sixteenth Annual ACM Symposium on Principles of Programming Languages, 1989

Coloring heuristics for register allocation (with retrospective)
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1989

Coloring Heuristics for Register Allocation.
Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), 1989

A Technique for Summarizing Data Access and Its Use in Parallelism Enhancing Transformations.
Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), 1989

Compile-time detection of race conditions in a parallel program.
Proceedings of the 3rd international conference on Supercomputing, 1989

1988
Compiling programs for distributed-memory multiprocessors.
The Journal of Supercomputing, 1988

Efficient computation of flow-insensitive interprocedural summary information - a correction.
SIGPLAN Notices, 1988

Analysis of Interprocedural Side Effects in a Parallel Programming Environment.
J. Parallel Distrib. Comput., 1988

Estimating Interlock and Improving Balance for Pipelined Architectures.
J. Parallel Distrib. Comput., 1988

Interprocedural side-effect analysis in linear time (with retrospective)
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1988

Interprocedural Side-Effect Analysis in Linear Time.
Proceedings of the ACM SIGPLAN'88 Conference on Programming Language Design and Implementation (PLDI), 1988

1987
Automatic Translation of Fortran Programs to Vector Form.
ACM Trans. Program. Lang. Syst., 1987

A Practical Environment for Scientific Programming.
IEEE Computer, 1987

Automatic Decomposition of Fortran Programs for Execution on Multiprocessors-Abstract.
Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

Automatic Decomposition of Scientific Programs for Parallel Execution.
Proceedings of the Conference Record of the Fourteenth Annual ACM Symposium on Principles of Programming Languages, 1987

Analysis of Interprocedural Side Effects in a Parallel Programming Environment.
Proceedings of the Supercomputing, 1987

Estimating Interlock and Improving Balance for Pipelined Architectures.
Proceedings of the International Conference on Parallel Processing, 1987

Parallel Programming Support in ParaScope.
Proceedings of the Parallel Computing in Science and Engineering, 1987

1986
The Impact of Interprocedural Analysis and Optimization in the Rn Programming Environment.
ACM Trans. Program. Lang. Syst., 1986

Interprocedural optimization: eliminating unnecessary recompilation.
Proceedings of the 1986 SIGPLAN Symposium on Compiler Construction, 1986

Interprocedural constant propagation.
Proceedings of the 1986 SIGPLAN Symposium on Compiler Construction, 1986

Efficient recompilation of module interfaces in a software development environment.
Proceedings of the SESPSDE'86: ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, 1986

Editing and compiling whole programs.
Proceedings of the SESPSDE'86: ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, 1986

Interprocedural constant propagation (with retrospective)
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1986

PTOOL: A Semi-Automatic Parallel Programming Assistant.
Proceedings of the International Conference on Parallel Processing, 1986

1985
A Parallel Programming Environment.
IEEE Software, 1985

The impact of interprocedural analysis and optimization on the design of a software development environment.
SIGPLAN Notices, 1985

1984
Efficient computation of flow insensitive interprocedural summary information.
Proceedings of the 1984 SIGPLAN Symposium on Compiler Construction, 1984

Automatic loop interchange.
Proceedings of the 1984 SIGPLAN Symposium on Compiler Construction, 1984

Automatic loop interchange (with retrospective)
Proceedings of the 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999, 1984

1983
Conversion of Control Dependence to Data Dependence.
Proceedings of the Conference Record of the Tenth Annual ACM Symposium on Principles of Programming Languages, 1983

1981
Pathlistings Applied to Data Flow Analysis.
Acta Inf., 1981

1979
A Deterministic Attribute Grammar Evaluator Based on Dynamic Scheduling.
ACM Trans. Program. Lang. Syst., 1979

1978
Use-Definition Chains with Applications.
Comput. Lang., 1978

1977
An Algorithm for Reduction of Operator Strength.
Commun. ACM, 1977

Applications of Graph Grammar for Program Control Flow Analysis.
Proceedings of the Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, 1977

1976
A Comparison of Two Algorithms for Global Data Flow Analysis.
SIAM J. Comput., 1976

PLANET: A simulation approach to PERT.
Computers & OR, 1976

Automatic Generation of Efficient Evaluators for Attribute Grammars.
Proceedings of the Conference Record of the Third ACM Symposium on Principles of Programming Languages, 1976

Graph Grammars and Global Program Data Flow Analysis
Proceedings of the 17th Annual Symposium on Foundations of Computer Science, 1976

1975
Node Listings Applied to Data Flow Analysis.
Proceedings of the Conference Record of the Second ACM Symposium on Principles of Programming Languages, 1975

