Louis-Noël Pouchet

Jason Cong

Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

Formal Verification of Source-to-Source Transformations for HLS.

Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023

Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules.

Martin Monperrus

IEEE Trans. Software Eng., July, 2023

2022

MARTA: Multi-configuration Assembly pRofiler and Toolkit for performance Analysis.

Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

FOURST: A code generator for FFT-based fast stencil computations.

Zafar Ahmad

Mohammad Mahdi Javanmard

Gregory Thomas Croisdale

Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

Accelerator design with decoupled hardware customizations: benefits and challenges: invited.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

PALMED: Throughput Characterization for Superscalar Architectures.

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair.

IEEE Trans. Software Eng., 2021

Self-Supervised Learning to Prove Equivalence Between Programs via Semantics-Preserving Rewrite Rules.

Martin Monperrus

CoRR, 2021

Proving Equivalence Between Complex Expressions Using Graph-to-Sequence Neural Models.

Théo Barollet

Miguel Á. Abella-González

CoRR, 2021

Optimizing Coherence Traffic in Manycore Processors Using Closed-Form Caching/Home Agent Mappings.

IEEE Access, 2021

PolyBench/Python: benchmarking Python environments with polyhedral optimizations.

Pedro Carollo-Fernández

Gabriel Rodríguez

Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021

2020

Building a Polyhedral Representation from an Instrumented Execution: Making Dynamic Analyses of Nonaffine Programs Scalable.

ACM Trans. Archit. Code Optim., 2020

From micro-OPs to abstract resources: constructing a simpler CPU performance model through microbenchmarking.

CoRR, 2020

Coherence Traffic in Manycore Processors with Opaque Distributed Directories.

CoRR, 2020

Equivalence of Dataflow Graphs via Rewrite Rules Using a Graph-to-Sequence Neural Model.

Théo Barollet

CoRR, 2020

Automated derivation of parametric data movement lower bounds for affine programs.

Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

Efficient Execution of Dynamic Programming Algorithms on Apache Spark.

Mohammad Mahdi Javanmard

Proceedings of the IEEE International Conference on Cluster Computing, 2020

Deriving parametric multi-way recursive divide-and-conquer dynamic programming algorithms using polyhedral compilers.

Mohammad Mahdi Javanmard

Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

2019

Data-flow/dependence profiling for structured transformations.

Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

Model-driven transformations for multi- and many-core CPUs.

Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

Generating piecewise-regular code from irregular structures.

Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

On Optimizing Complex Stencils on GPUs.

Miheer Vaidya

Atanas Rountev

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Synthetic Lung Nodule 3D Image Generation Using Autoencoders.

Proceedings of the International Joint Conference on Neural Networks, 2019

Effect of Distributed Directories in Mesh Interconnects.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A Code Generator for High-Performance Tensor Contractions on GPUs.

Vineeth Thumma

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018

Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations.

Miheer Vaidya

Proc. IEEE, 2018

Analytical modeling of cache behavior for affine programs.

Proc. ACM Program. Lang., 2018

A Performance Vocabulary for Affine Loop Transformations.

CoRR, 2018

Associative instruction reordering to alleviate register pressure.

Proceedings of the International Conference for High Performance Computing, 2018

[BibT_eX]

[DOI]

Atanas Rountev

Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Performance modeling for GPUs using abstract kernel emulation.

Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

GPU code optimization using abstract kernel emulation and sensitivity analysis.

Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

2017

Efficient Cache Simulation for Affine Computations.

Proceedings of the Languages and Compilers for Parallel Computing, 2017

Simplification and runtime resolution of data dependence constraints for loop transformations.

Diogo Nunes Sampaio

Proceedings of the International Conference on Supercomputing, 2017

Accurate High-level Modeling and Automated Hardware/Software Co-design for Effective SoC Design Space Exploration.

Proceedings of the 54th Annual Design Automation Conference, 2017

POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Static and Dynamic Frequency Scaling on Multicore CPUs.

Sudheer Chunduri

ACM Trans. Archit. Code Optim., 2016

A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment.

Samyam Rajbhandari

Proceedings of the International Conference for High Performance Computing, 2016

PIPES: a language and compiler for task-based programming on distributed-memory clusters.

Proceedings of the International Conference for High Performance Computing, 2016

Effective resource management for enhancing performance of 2D and 3D stencils on GPUs.

Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

PolyCheck: dynamic verification of iteration space transformations on affine programs.

Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2016

Effective padding of multidimensional arrays to avoid cache conflict misses.

Albert Cohen

Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2016

On fusing recursive traversals of K-d trees.

Samyam Rajbhandari

Proceedings of the 25th International Conference on Compiler Construction, 2016

POSTER: Hybrid Data Dependence Analysis for Loop Transformations.

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

Resource Conscious Reuse-Driven Tiling for GPUs.

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015

SDSLc: a multi-target domain-specific compiler for stencil computations.

Proceedings of the 5th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2015

Distributed memory code generation for mixed Irregular/Regular computations.

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

On Characterizing the Data Access Complexity of Programs.

Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2015

Polyhedral Optimizations for a Data-Flow Graph Language.

Proceedings of the Languages and Compilers for Parallel Computing, 2015

A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC.

Ponnuswamy Sadayappan

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Automatic Selection of Sparse Matrix Representation on GPUs.

Naser Sedaghati

Te Mu

Srinivasan Parthasarathy

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Optimistic Delinearization of Parametrically Sized Arrays.

Tobias Grosser

Jagannathan Ramanujam

Sebastian Pop

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

A Polyhedral-based SystemC Modeling and Generation Framework for Effective Low-power Design Space Exploration.

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Resource-Aware Throughput Optimization for High-Level Synthesis.

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Characterizing and enhancing global memory data coalescing on GPUs.

Naznin Fauzia

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014

Automatic parallelization of a class of irregular loops for distributed memory systems.

ACM Trans. Parallel Comput., 2014

Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs.

ACM Trans. Archit. Code Optim., 2014

On Using the Roofline Model with Lower Bounds on Data Movement.

ACM Trans. Archit. Code Optim., 2014

On characterizing the data movement complexity of computational DAGs for parallel execution.

Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014

Oil and Water Can Mix: An Integration of Polyhedral and AST-Based Transformations.

Jun Shirako

Vivek Sarkar

Proceedings of the International Conference for High Performance Computing, 2014

A framework for enhancing data reuse via associative reordering.

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

Verification of Polyhedral Optimizations with Constant Loop Bounds in Finite State Space Computations.

Proceedings of the Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications, 2014

Transformations for throughput optimization in high-level synthesis (abstract only).

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

2013

Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential.

ACM Trans. Archit. Code Optim., 2013

Predictive Modeling in a Polyhedral Optimization Space.

Int. J. Parallel Program., 2013

When polyhedral transformations meet SIMD code generation.

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013

A stencil compiler for short-vector SIMD architectures.

Proceedings of the International Conference on Supercomputing, 2013

Polyhedral-based data reuse optimization for configurable computing.

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Improving polyhedral code generation for high-level synthesis.

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2013

2012

Using machine learning to improve automatic vectorization.

Kevin Stock

ACM Trans. Archit. Code Optim., 2012

Code generation for parallel execution of a class of irregular loops on distributed memory systems.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Dynamic trace-based analysis of vectorization potential of applications.

Justin Holewinski

Ragavendar Ramamurthi

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

High-performance code generation for stencil computations on GPU architectures.

Justin Holewinski

Mohamed-Walid Benabderrahmane

Proceedings of the International Conference on Supercomputing, 2012

Analytical Bounds for Optimal Tile Size Selection.

Proceedings of the Compiler Construction - 21st International Conference, 2012

2011

ACOTES Project: Advanced Compiler Technologies for Embedded Streaming.

Int. J. Parallel Program., 2011

The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization.

CoRR, 2011

Loop transformations: convexity, pruning and optimization.

Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2011

Dynamic selection of tile sizes.

Proceedings of the 18th International Conference on High Performance Computing, 2011

Predictive modeling in a polyhedral optimization space.

Proceedings of the CGO 2011, 2011

Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures.

Proceedings of the Compiler Construction - 20th International Conference, 2011

StVEC: A Vector Instruction Extension for High Performance Stencil Computation.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework.

Proceedings of the Conference on High Performance Computing Networking, 2010

The Polyhedral Model Is More Widely Applicable Than You Think.

Albert Cohen

Cédric Bastoul

Proceedings of the Compiler Construction, 19th International Conference, 2010

2008

Iterative optimization in the polyhedral model: part II, multidimensional time.

Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008

2007

Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time.

Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

Automatic Correction of Loop Transformations.

Nicolas Vasilache

Albert Cohen