Louis-Noël Pouchet

Orcid: 0000-0001-5103-3097

According to our database, Louis-Noël Pouchet authored at least 89 papers between 2005 and 2024.

Bibliography

2024
Automatic Hardware Pragma Insertion in High-Level Synthesis: A Non-Linear Programming Approach.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

Formal Verification of Source-to-Source Transformations for HLS.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules.
IEEE Trans. Software Eng., July, 2023

2022
MARTA: Multi-configuration Assembly pRofiler and Toolkit for performance Analysis.
Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

FOURST: A code generator for FFT-based fast stencil computations.
Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

Accelerator design with decoupled hardware customizations: benefits and challenges: invited.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

PALMED: Throughput Characterization for Superscalar Architectures.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair.
IEEE Trans. Software Eng., 2021

Self-Supervised Learning to Prove Equivalence Between Programs via Semantics-Preserving Rewrite Rules.
CoRR, 2021

Proving Equivalence Between Complex Expressions Using Graph-to-Sequence Neural Models.
CoRR, 2021

Optimizing Coherence Traffic in Manycore Processors Using Closed-Form Caching/Home Agent Mappings.
IEEE Access, 2021

PolyBench/Python: benchmarking Python environments with polyhedral optimizations.
Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021

2020
Building a Polyhedral Representation from an Instrumented Execution: Making Dynamic Analyses of Nonaffine Programs Scalable.
ACM Trans. Archit. Code Optim., 2020

From micro-OPs to abstract resources: constructing a simpler CPU performance model through microbenchmarking.
CoRR, 2020

Coherence Traffic in Manycore Processors with Opaque Distributed Directories.
CoRR, 2020

Equivalence of Dataflow Graphs via Rewrite Rules Using a Graph-to-Sequence Neural Model.
CoRR, 2020

Automated derivation of parametric data movement lower bounds for affine programs.
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

Efficient Execution of Dynamic Programming Algorithms on Apache Spark.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

Deriving parametric multi-way recursive divide-and-conquer dynamic programming algorithms using polyhedral compilers.
Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

2019
Data-flow/dependence profiling for structured transformations.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

Model-driven transformations for multi- and many-core CPUs.
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

Generating piecewise-regular code from irregular structures.
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

On Optimizing Complex Stencils on GPUs.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Synthetic Lung Nodule 3D Image Generation Using Autoencoders.
Proceedings of the International Joint Conference on Neural Networks, 2019

Effect of Distributed Directories in Mesh Interconnects.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A Code Generator for High-Performance Tensor Contractions on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018
Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations.
Proc. IEEE, 2018

Analytical modeling of cache behavior for affine programs.
Proc. ACM Program. Lang., 2018

A Performance Vocabulary for Affine Loop Transformations.
CoRR, 2018

Associative instruction reordering to alleviate register pressure.
Proceedings of the International Conference for High Performance Computing, 2018

Register optimizations for stencils on GPUs.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Performance modeling for GPUs using abstract kernel emulation.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

GPU code optimization using abstract kernel emulation and sensitivity analysis.
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

2017
Efficient Cache Simulation for Affine Computations.
Proceedings of the Languages and Compilers for Parallel Computing, 2017

Simplification and runtime resolution of data dependence constraints for loop transformations.
Proceedings of the International Conference on Supercomputing, 2017

Accurate High-level Modeling and Automated Hardware/Software Co-design for Effective SoC Design Space Exploration.
Proceedings of the 54th Annual Design Automation Conference, 2017

POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Static and Dynamic Frequency Scaling on Multicore CPUs.
ACM Trans. Archit. Code Optim., 2016

A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment.
Proceedings of the International Conference for High Performance Computing, 2016

PIPES: a language and compiler for task-based programming on distributed-memory clusters.
Proceedings of the International Conference for High Performance Computing, 2016

Effective resource management for enhancing performance of 2D and 3D stencils on GPUs.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

PolyCheck: dynamic verification of iteration space transformations on affine programs.
Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2016

Effective padding of multidimensional arrays to avoid cache conflict misses.
Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2016

On fusing recursive traversals of K-d trees.
Proceedings of the 25th International Conference on Compiler Construction, 2016

POSTER: Hybrid Data Dependence Analysis for Loop Transformations.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

Resource Conscious Reuse-Driven Tiling for GPUs.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
SDSLc: a multi-target domain-specific compiler for stencil computations.
Proceedings of the 5th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2015

Distributed memory code generation for mixed Irregular/Regular computations.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

On Characterizing the Data Access Complexity of Programs.
Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2015

Polyhedral Optimizations for a Data-Flow Graph Language.
Proceedings of the Languages and Compilers for Parallel Computing, 2015

A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Automatic Selection of Sparse Matrix Representation on GPUs.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Optimistic Delinearization of Parametrically Sized Arrays.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

A Polyhedral-based SystemC Modeling and Generation Framework for Effective Low-power Design Space Exploration.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Resource-Aware Throughput Optimization for High-Level Synthesis.
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Characterizing and enhancing global memory data coalescing on GPUs.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014
Automatic parallelization of a class of irregular loops for distributed memory systems.
ACM Trans. Parallel Comput., 2014

Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs.
ACM Trans. Archit. Code Optim., 2014

On Using the Roofline Model with Lower Bounds on Data Movement.
ACM Trans. Archit. Code Optim., 2014

On characterizing the data movement complexity of computational DAGs for parallel execution.
Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014

Oil and Water Can Mix: An Integration of Polyhedral and AST-Based Transformations.
Proceedings of the International Conference for High Performance Computing, 2014

A framework for enhancing data reuse via associative reordering.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

Verification of Polyhedral Optimizations with Constant Loop Bounds in Finite State Space Computations.
Proceedings of the Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications, 2014

Transformations for throughput optimization in high-level synthesis (abstract only).
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

2013
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential.
ACM Trans. Archit. Code Optim., 2013

Predictive Modeling in a Polyhedral Optimization Space.
Int. J. Parallel Program., 2013

When polyhedral transformations meet SIMD code generation.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013

A stencil compiler for short-vector SIMD architectures.
Proceedings of the International Conference on Supercomputing, 2013

Polyhedral-based data reuse optimization for configurable computing.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Improving polyhedral code generation for high-level synthesis.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2013

2012
Using machine learning to improve automatic vectorization.
ACM Trans. Archit. Code Optim., 2012

Code generation for parallel execution of a class of irregular loops on distributed memory systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Dynamic trace-based analysis of vectorization potential of applications.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

High-performance code generation for stencil computations on GPU architectures.
Proceedings of the International Conference on Supercomputing, 2012

Analytical Bounds for Optimal Tile Size Selection.
Proceedings of the Compiler Construction - 21st International Conference, 2012

2011
ACOTES Project: Advanced Compiler Technologies for Embedded Streaming.
Int. J. Parallel Program., 2011

The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization.
CoRR, 2011

Loop transformations: convexity, pruning and optimization.
Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2011

Dynamic selection of tile sizes.
Proceedings of the 18th International Conference on High Performance Computing, 2011

Predictive modeling in a polyhedral optimization space.
Proceedings of the CGO 2011, 2011

Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures.
Proceedings of the Compiler Construction - 20th International Conference, 2011

StVEC: A Vector Instruction Extension for High Performance Stencil Computation.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework.
Proceedings of the Conference on High Performance Computing Networking, 2010

The Polyhedral Model Is More Widely Applicable Than You Think.
Proceedings of the Compiler Construction, 19th International Conference, 2010

2008
Iterative optimization in the polyhedral model: part ii, multidimensional time.
Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008

2007
Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time.
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

Automatic Correction of Loop Transformations.
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2005
Inside Vaucanson.
Proceedings of the Implementation and Application of Automata, 2005

