Joël Falcou

Orcid: 0000-0001-5380-7375

Affiliations:
  • University of Paris-Sud, Laboratory for Computer Science (LRI), France


According to our database, Joël Falcou authored at least 40 papers between 2003 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
ctbench - compile-time benchmarking and analysis.
J. Open Source Softw., August, 2023

2019
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices.
Parallel Comput., 2019

2018
A Case Study on Optimizing Accurate Half Precision Average.
Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

Modern Generative Programming for Optimizing Small Matrix-Vector Multiplication.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

Data Layout and SIMD Abstraction Layers: Decoupling Interfaces from Implementations.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

2016
Automatic Task-Based Code Generation for High Performance Domain Specific Embedded Language.
Int. J. Parallel Program., 2016

Meta-programming and Multi-stage Programming for GPGPUs.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

High-Performance Tensor Contractions for GPUs.
Proceedings of the International Conference on Computational Science 2016, 2016

High-Performance Matrix-Matrix Multiplications of Very Small Matrices.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016

2015
Metaprogramming Dense Linear Algebra Solvers Applications to Multi and Many-Core Architectures.
Proceedings of the 2015 IEEE TrustCom/BigDataSE/ISPA, 2015

Designing HPC libraries in the modern C++ world.
Proceedings of the 2015 International Conference on High Performance Computing & Simulation, 2015

2014
The numerical template toolbox: A modern C++ design for scientific computing.
J. Parallel Distributed Comput., 2014

Parallel spherical harmonic transforms on heterogeneous architectures (graphics processing units/multi-core CPUs).
Concurr. Comput. Pract. Exp., 2014

Exploring the vectorization of python constructs using pythran and boost SIMD.
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, 2014

Boost.SIMD: generic programming for portable SIMDization.
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, 2014

Software Abstractions for Parallel Architectures. (Abstractions Logicielles pour Architectures Parallèles).
, 2014

2013
Parallel Smith-Waterman Comparison on Multicore and Manycore Computing Platforms with BSP++.
Int. J. Parallel Program., 2013

High level transforms to reduce energy consumption of signal and image processing operators.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

A Parallel Solver for Incompressible Fluid Flows.
Proceedings of the International Conference on Computational Science, 2013

2012
Exploiting Multimedia Extensions in C++: A Portable Approach.
Comput. Sci. Eng., 2012

Impact of high level transforms on high level synthesis for motion detection algorithm.
Proceedings of the 2012 Conference on Design and Architectures for Signal and Image Processing, 2012

Boost.SIMD: generic programming for portable SIMDization.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Parallelization Schemes for Memory Optimization on the Cell Processor: A Case Study on the Harris Corner Detector.
Trans. High Perform. Embed. Archit. Compil., 2011

Spherical harmonic transform on heterogeneous architectures using hybrid programming.
CoRR, 2011

A framework for an automatic hybrid MPI+OpenMP code generation.
Proceedings of the 2011 Spring Simulation Multi-conference, 2011

Parallel Biological Sequence Comparison on Heterogeneous High Performance Computing Platforms with BSP++.
Proceedings of the 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

Spherical Harmonic Transform with GPUs.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

2009
Une bibliothèque métaprogrammée pour la programmation parallèle. (A metaprogrammed library for parallel programming).
Tech. Sci. Informatiques, 2009

Parallel Programming with Skeletons.
Comput. Sci. Eng., 2009

Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor.
Proceedings of the PACT 2009, 2009

2008
Altivec Vector Unit Customization for Embedded Systems.
Int. J. Comput. Sci. Appl., 2008

Functional Meta-programming for Parallel Skeletons.
Proceedings of the Computational Science, 2008

Meta-programming Applied to Automatic SMP Parallelization of Linear Algebra Code.
Proceedings of the Euro-Par 2008, 2008

2007
Formal Semantics Applied to the Implementation of a Skeleton-Based Parallel Programming Library.
Proceedings of the Parallel Computing: Architectures, 2007

2006
Un cluster pour la vision temps réel : architecture, outils et application. (A cluster for real time computer vision: architecture, tools and application).
PhD thesis, 2006

Quaff: efficient C++ design for parallel skeletons.
Parallel Comput., 2006

2005
E.V.E., An Object Oriented SIMD Library.
Scalable Comput. Pract. Exp., 2005

A Parallel Implementation of a 3D Reconstruction Algorithm for Real-Time Vision.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

2004
EVE, an Object Oriented SIMD Library.
Proceedings of the Computational Science, 2004

2003
CamlG4 : une bibliothèque de calcul parallèle pour Objective Caml. (CamlG4: a parallel computing library for Objective Caml).
Proceedings of the Journées francophones des langages applicatifs (JFLA'03), 2003

