Markus Püschel

According to our database, Markus Püschel
  • authored at least 133 papers between 1997 and 2018.
  • has a "Dijkstra number" of four.

Bibliography

2018
A practical construction for decomposing numerical abstract domains.
PACMPL, 2018

SIMD intrinsics on managed language runtimes.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Program generation for small-scale linear algebra applications.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

2017
Characterizing and Enumerating Walsh-Hadamard Transform Algorithms.
CoRR, 2017

Fast polyhedra abstract domain.
Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, 2017

Staging for generic programming in space and time.
Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, 2017

Optimal Streamed Linear Permutations.
Proceedings of the 24th IEEE Symposium on Computer Arithmetic, 2017

2016
Streaming Sorting Networks.
ACM Trans. Design Autom. Electr. Syst., 2016

e-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem.
Journal of Machine Learning Research, 2016

RandIR: differential testing for embedded compilers.
Proceedings of the 7th ACM SIGPLAN Symposium on Scala, 2016

Program generation for performance.
Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, 2016

Optimal Circuits for Streamed Linear Permutations Using RAM.
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

A basic linear algebra compiler for structured matrices.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

2015
Distributed Optimization With Local Domains: Applications in MPC and Network Flows.
IEEE Trans. Automat. Contr., 2015

Go Meta! A Case for Generative Programming and DSLs in Performance Critical Systems.
Proceedings of the 1st Summit on Advances in Programming Languages, 2015

Making numerical program analysis fast.
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

A basic linear algebra compiler for embedded processors.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
High-performance sparse fast Fourier transforms.
Proceedings of the 2014 IEEE Workshop on Signal Processing Systems, 2014

Abstracting Vector Architectures in Library Generators: Case Study Convolution Filters.
ARRAY'14: Proceedings of the 2014 ACM SIGPLAN International Workshop on Libraries, 2014

Applying the roofline model.
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Extending the roofline model: Bottleneck analysis with microarchitectural constraints.
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Automatic locality-friendly interface extension of numerical functions.
Proceedings of the Generative Programming: Concepts and Experiences, 2014

A Basic Linear Algebra Compiler.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013
D-ADMM: A Communication-Efficient Distributed Algorithm for Separable Optimization.
IEEE Trans. Signal Processing, 2013

Distributed Optimization With Local Domains: Applications in MPC and Network Flows
CoRR, 2013

Active Learning for Multi-Objective Optimization.
Proceedings of the 30th International Conference on Machine Learning, 2013

Spiral in scala: towards the systematic construction of generators for performance libraries.
Proceedings of the Generative Programming: Concepts and Experiences, 2013

Distributed compressed sensing algorithms: Completing the puzzle.
Proceedings of the IEEE Global Conference on Signal and Information Processing, 2013

A unified algorithmic approach to distributed optimization.
Proceedings of the IEEE Global Conference on Signal and Information Processing, 2013

2012
Efficient Compression of QRS Complexes Using Hermite Expansion.
IEEE Trans. Signal Processing, 2012

Algebraic Signal Processing Theory: 1-D Nearest Neighbor Models.
IEEE Trans. Signal Processing, 2012

Distributed Basis Pursuit.
IEEE Trans. Signal Processing, 2012

Computer Generation of Hardware for Linear Digital Signal Processing Transforms.
ACM Trans. Design Autom. Electr. Syst., 2012

Compiling math to fast code.
Proceedings of the ACM SIGPLAN 2012 Workshop on Partial Evaluation and Program Manipulation, 2012

"Smart" design space sampling to predict Pareto-optimal solutions.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2012

D-ADMM: A distributed algorithm for compressed sensing and other separable optimization problems.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improving fixed-point accuracy of FFT cores in O-OFDM systems.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Computer generation of streaming sorting networks.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

ADMM for consensus on colored networks.
Proceedings of the 51st IEEE Conference on Decision and Control, 2012

Distributed ADMM for model predictive control and congestion control.
Proceedings of the 51st IEEE Conference on Decision and Control, 2012

2011
Spiral.
Encyclopedia of Parallel Computing, 2011

FFT (Fast Fourier Transform).
Encyclopedia of Parallel Computing, 2011

Algebraic Signal Processing Theory: Cooley-Tukey-Type Algorithms for Polynomial Transforms Based on Induction.
SIAM J. Matrix Analysis Applications, 2011

Automatic performance programming.
Proceedings of the ACM Symposium on New Ideas in Programming and Reflections on Software, 2011

Automatic SIMD vectorization of fast fourier transforms for the larrabee and AVX instruction sets.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Compression of QRS complexes using Hermite expansion.
Proceedings of the IEEE International Conference on Acoustics, 2011

Basis Pursuit in sensor networks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Real-time software implementation of an IEEE 802.11a baseband receiver on Intel multicore.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Systematic construction of real lapped tight frame transforms.
IEEE Trans. Signal Processing, 2010

Algebraic signal processing theory: sampling for infinite and finite 1-D space.
IEEE Trans. Signal Processing, 2010

Distributed Basis Pursuit
CoRR, 2010

Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for Polynomial Transforms Based on Induction
CoRR, 2010

Offline library adaptation using automatically generated heuristics.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Hardware implementation of the discrete fourier transform with non-power-of-two problem size.
Proceedings of the IEEE International Conference on Acoustics, 2010

Computer Generation of Efficient Software Viterbi Decoders.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Program Composition and Optimization: An Introduction.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Executive Summary - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Abstracts Collection - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

2009
Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for Real DFTs.
IEEE Trans. Signal Processing, 2009

Permuting streaming data using RAMs.
J. ACM, 2009

Automatic synthesis of high performance mathematical programs.
Proceedings of the Symbolic and Algebraic Computation, International Symposium, 2009

Computer generation of fast fourier transforms for the cell broadband engine.
Proceedings of the 23rd international conference on Supercomputing, 2009

Bandit-based optimization on graphs with application to library performance tuning.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Generating high performance pruned FFT implementations.
Proceedings of the IEEE International Conference on Acoustics, 2009

Operator Language: A Program Generation Framework for Fast Kernels.
Proceedings of the Domain-Specific Languages, IFIP TC 2 Working Conference, 2009

Automatic generation of streaming datapaths for arbitrary fixed permutations.
Proceedings of the Design, Automation and Test in Europe, 2009

Computer Generation of General Size Linear Transform Libraries.
Proceedings of the CGO 2009, 2009

Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling.
Proceedings of the PACT 2009, 2009

2008
Algebraic Signal Processing Theory: 1-D Space.
IEEE Trans. Signal Processing, 2008

Algebraic Signal Processing Theory: Foundation and 1-D Time.
IEEE Trans. Signal Processing, 2008

Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for DCTs and DSTs.
IEEE Trans. Signal Processing, 2008

Algebraic signal processing theory: Cooley-Tukey type algorithms on the 2-D hexagonal spatial lattice.
Appl. Algebra Eng. Commun. Comput., 2008

Axonal bouton modeling, detection and distribution analysis for the study of neural circuit organization and plasticity.
Proceedings of the 2008 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2008

Domain-specific library generation for parallel software and hardware platforms.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Haar filter banks for 1-D space signals.
Proceedings of the IEEE International Conference on Acoustics, 2008

Alternatives to the discrete fourier transform.
Proceedings of the IEEE International Conference on Acoustics, 2008

Formal datapath representation and manipulation for implementing DSP transforms.
Proceedings of the 45th Design Automation Conference, 2008

Generating SIMD Vectorized Permutations.
Proceedings of the Compiler Construction, 17th International Conference, 2008

System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries.
Proceedings of the Algebraic Methodology and Software Technology, 2008

2007
Mechanical Derivation of Fused Multiply-Add Algorithms for Linear Transforms.
IEEE Trans. Signal Processing, 2007

Algebraic Signal Processing Theory: 2-D Spatial Hexagonal Lattice.
IEEE Trans. Image Processing, 2007

Time-Multiplexed Multiple-Constant Multiplication.
IEEE Trans. on CAD of Integrated Circuits and Systems, 2007

Multiplierless multiple constant multiplication.
ACM Trans. Algorithms, 2007

Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for DCTs and DSTs
CoRR, 2007

An Adaptive Multiresolution Approach to Fingerprint Recognition.
Proceedings of the International Conference on Image Processing, 2007

SIMD Vectorization of Non-Two-Power Sized FFTs.
Proceedings of the IEEE International Conference on Acoustics, 2007

FFT Compiler: from math to efficient hardware HLDVT invited short paper.
Proceedings of the IEEE International High Level Design Validation and Test Workshop, 2007

Performance/Energy Optimization of DSP Transforms on the XScale Processor.
Proceedings of the High Performance Embedded Architectures and Compilers, 2007

How to Write Fast Numerical Code: A Small Introduction.
Proceedings of the Generative and Transformational Techniques in Software Engineering II, 2007

Can we teach computers to write fast libraries?
Proceedings of the Generative Programming and Component Engineering, 2007

Generating FPGA-Accelerated DFT Libraries.
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007

2006
Algebraic Signal Processing Theory
CoRR, 2006

A Rewriting System for the Vectorization of Signal Transforms.
Proceedings of the High Performance Computing for Computational Science, 2006

Tools and techniques for performance - FFT program generation for shared memory: SMP and multicore.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Automatic Performance Optimization of the Discrete Fourier Transform on Distributed Memory Computers.
Proceedings of the Parallel and Distributed Processing and Applications, 2006

Algebraic Derivation of General Radix Cooley-Tukey Algorithms for the Real Discrete Fourier Transform.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

The Algebraic Structure in Signal Processing: Time and Space.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Sampling Theorem Associated With the Discrete Cosine Transform.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Fast and accurate resource estimation of automatically generated custom DFT IP cores.
Proceedings of the ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, 2006

Program generation for the all-pairs shortest path problem.
Proceedings of the 15th International Conference on Parallel Architecture and Compilation Techniques (PACT 2006), 2006

2005
SPIRAL: Code Generation for DSP Transforms.
Proceedings of the IEEE, 2005

Special Issue on Program Generation, Optimization, and Platform Adaptation.
Proceedings of the IEEE, 2005

Formal loop merging for signal transforms.
Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

Fourier transform for the spatial quincunx lattice.
Proceedings of the 2005 International Conference on Image Processing, 2005

Fourier transform for the directed quincunx lattice.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Performance analysis of the filtered backprojection image reconstruction algorithms.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Real, Tight Frames with Maximal Robustness to Erasures.
Proceedings of the 2005 Data Compression Conference (DCC 2005), 2005

Automatic generation of customized discrete fourier transform IPs.
Proceedings of the 42nd Design Automation Conference, 2005

2004
Special issue on computer algebra and signal processing: forward by the guest editors.
J. Symb. Comput., 2004

Symmetry-based matrix factorization.
J. Symb. Comput., 2004

Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms.
IJHPCA, 2004

Automatically Tuned FFTs for BlueGene/L's Double FPU.
Proceedings of the High Performance Computing for Computational Science, 2004

Custom-optimized multiplierless implementations of DSP algorithms.
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004

Automatic cost minimization for multiplierless implementations of discrete signal transforms.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Automatic generation of implementations for DSP transforms on fused multiply-add architectures.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

The discrete triangle transform.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Automatically generated high-performance code for discrete wavelet transforms.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Multiple constant multiplication by time-multiplexed mapping of addition chains.
Proceedings of the 41st Design Automation Conference, 2004

2003
The Algebraic Approach to the Discrete Cosine and Sine Transforms and Their Fast Algorithms.
SIAM J. Comput., 2003

Short Vector Code Generation for the Discrete Fourier Transform.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Cooley-Tukey FFT like algorithms for the DCT.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Fast automatic software implementations of FIR filters.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Short vector code generation and adaptation for DSP algorithms.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Toward efficient static analysis of finite-precision effects in DSP applications via affine arithmetic modeling.
Proceedings of the 40th Design Automation Conference, 2003

2002
Decomposing Monomial Representations of Solvable Groups.
J. Symb. Comput., 2002

A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Automatic generation of fast discrete signal transforms.
IEEE Trans. Signal Processing, 2001

Fast Automatic Generation of DSP Algorithms.
Proceedings of the Computational Science - ICCS 2001, 2001

2000
In search of the optimal Walsh-Hadamard transform.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Fast Quantum Fourier Transforms for a Class of Non-Abelian Groups.
Proceedings of the Applied Algebra, 1999

1998
Solving Puzzles Related to Permutation Groups.
Proceedings of the 1998 International Symposium on Symbolic and Algebraic Computation, 1998

Konstruktive Darstellungstheorie und Algorithmengenerierung.
PhD thesis, 1998

1997
Decomposing a Permutation into a Conjugated Tensor Product.
Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, 1997

