Franz Franchetti

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Generating High-Performance Number Theoretic Transform Implementations for Vector Architectures.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

Optimization and Performance Analysis of Shor's Algorithm in Qiskit.

Dewang Sun

Naifeng Zhang

Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

ProtoX: A First Look.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

2022

A framework for low communication approaches for large scale 3D convolution.

Jelena Kovacevic

Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022

Flexible Hardware Accelerator Design Generation with Spiral.

Guanglin Xu

Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

A High Throughput Hardware Accelerator for FFTW Codelets: A First Look.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

A Compiler for Sound Floating-Point Computations using Affine Arithmetic.

Joao Rivera

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Introduction to GraphBLAS.

Massive Graph Analytics, 2022

2021

Leveraging High Dimensional Spatial Graph Embedding as a Heuristic for Graph Algorithms.

Peter Oostema

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

An Auto-tuning with Adaptation of A64 Scalable Vector Extension for SPIRAL.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Optimized Quantum Circuit Generation with SPIRAL.

Scott Mionis

Jason Larkin

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Graph Embedding and Field Based Detection of Non-Local Webs in Large Scale Free Networks.

Michael E. Franusich

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

An Interval Compiler for Sound Floating-Point Computations.

Joao Rivera

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

2020

A Flexible Framework for Multidimensional DFTs.

SIAM J. Sci. Comput., 2020

Verified Translation Between Purely Functional and Imperative Domain Specific Languages in HELIX.

Vadim Zaliva

Ilia Zaichuk

Proceedings of the Software Verification - 12th International Conference, 2020

Massive Scaling of MASSIF: Algorithm Development and Analysis for Simulation on GPUs.

Jelena Kovacevic

Proceedings of the PASC '20: Platform for Advanced Scientific Computing Conference, Geneva, Switzerland, June 29, 2020

FESIA: A Fast and SIMD-Efficient Set Intersection Approach on Modern CPUs.

Jiyuan Zhang

Yi Lu

Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

GBTLX: A First Look.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

FFTE on SVE: SPIRAL-Generated Kernels.

Daisuke Takahashi

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

2019

A Flexible Framework for Parallel Multi-Dimensional DFTs.

CoRR, 2019

Efficient SpMV Operation for Large and Highly Sparse Matrices using Scalable Multi-way Merge Parallelization.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

FFTX for Micromechanical Stress-Strain Analysis.

Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

2018

From High-Level Specification to High-Performance Code.

Proc. IEEE, 2018

SPIRAL: Extreme Performance Portability.

Richard Michael Veras

Proc. IEEE, 2018

Large Bandwidth-Efficient FFTs on Multicore and Multi-socket Systems.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

High Performance Zero-Memory Overhead Direct Convolutions.

Jiyuan Zhang

Proceedings of the 35th International Conference on Machine Learning, 2018

HELIX: a case study of a formal verification of high performance program generation.

Vadim Zaliva

Proceedings of the 7th ACM SIGPLAN International Workshop on Functional High-Performance Computing, 2018

Preliminary Exploration of Large-Scale Triangle Counting on Shared-Memory Multicore System.

Jiyuan Zhang

Scott McMillan

Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

PageRank Acceleration for Large Graphs with Scalable Hardware and Two-Step SpMV.

Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Fast and accurate object detection in high resolution 4K and 8K video using GPUs.

Vít Ruzicka

Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

Linear Algebraic Formulation of Edge-centric K-truss Algorithms with Adjacency Matrices.

Proceedings of the 2018 IEEE High Performance Extreme Computing Conference, 2018

FFTX and SpectralPack: A First Look.

Proceedings of the 25th IEEE International Conference on High Performance Computing Workshops, 2018

Large-Scale Algorithm Design for Parallel FFT-based Simulations on GPUs.

Jelena Kovacevic

Proceedings of the 2018 IEEE Global Conference on Signal and Information Processing, 2018

2017

Algebraic description and automatic generation of multigrid methods in SPIRAL.

Concurr. Comput. Pract. Exp., 2017

Multirotor UAV state prediction through multi-microphone side-channel fusion.

Hendrik Vincent Koops

Proceedings of the 2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 2017

A scale-free structure for real world networks.

Richard Michael Veras

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Algorithm and hardware co-optimized solution for large SpMV problems.

Fazle Sadi

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

Mixed data layout kernels for vectorized complex arithmetic.

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

First look: Linear algebra-based triangle counting without matrix multiplication.

Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017

High Assurance Code Generation for Cyber-Physical Systems.

Proceedings of the 18th IEEE International Symposium on High Assurance Systems Engineering, 2017

2016

FFTs with Near-Optimal Memory Access Through Block Data Layouts: Algorithm, Architecture and Design Automation.

J. Signal Process. Syst., 2016

Accelerating Architectural Simulation Via Statistical Techniques: A Survey.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

HAMLeT Architecture for Parallel Data Reorganization in Memory.

IEEE Micro, 2016

Automating the Last-Mile for High Performance Dense Linear Algebra.

Richard Michael Veras

Tyler Michael Smith

Robert A. van de Geijn

CoRR, 2016

Compilers, hands-off my hands-on optimizations.

Proceedings of the 3rd Workshop on Programming Models for SIMD/Vector Processing, 2016

A scale-free structure for power-law graphs.

Richard Veras

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

3D DRAM based application specific hardware accelerator for SpMV.

Fazle Sadi

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

Mathematical foundations of the GraphBLAS.

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

Big data computation of taxi movement in New York City.

Joya A. Deri

José M. F. Moura

Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015

Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.

Computer, 2015

Enabling portable energy efficiency with memory accelerated library.

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Data reorganization in memory using 3D-stacked DRAM.

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Generating Optimized Fourier Interpolation Routines for Density Functional Theory Using SPIRAL.

Francis P. Russell

Karl A. Wilkinson

Chris-Kriton Skylaris

Paul H. J. Kelly

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

An ensemble technique for estimating vehicle speed and gear position from acoustic data.

Hendrik Vincent Koops

Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Optimizing Space Time Adaptive Processing through accelerating memory-bounded operations.

Qi Guo

Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015

A synthesis methodology for application-specific logic-in-memory designs.

Proceedings of the 52nd Annual Design Automation Conference, 2015

2014

Special issue on automatic application tuning for HPC architectures.

Siegfried Benkner

Hans Michael Gerndt

Jeffrey K. Hollingsworth

Sci. Program., 2014

Capturing the Expert: Generating Fast Matrix-Multiply Kernels with Spiral.

Richard Veras

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

Barometric and GPS altitude sensor fusion.

Vadim Zaliva

Proceedings of the IEEE International Conference on Acoustics, 2014

FFTs with near-optimal memory access through block data layouts.

Proceedings of the IEEE International Conference on Acoustics, 2014

Algorithm/hardware co-optimized SAR image reconstruction with 3D-stacked logic in memory.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

HAMLeT: Hardware accelerated memory layout transform within 3D-stacked DRAM.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2014

Efficient and secure intellectual property (IP) design with split fabrication.

Proceedings of the 2014 IEEE International Symposium on Hardware-Oriented Security and Trust, 2014

Understanding the design space of DRAM-optimized hardware FFT accelerators.

Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2013

Local Interpolation-based Polar Format SAR: Algorithm, Hardware Implementation and Design Automation.

J. Signal Process. Syst., 2013

An Information-Theoretic Approach to PMU Placement in Electric Power Systems.

IEEE Trans. Smart Grid, 2013

Automatic Application Tuning for HPC Architectures (Dagstuhl Seminar 13401).

Siegfried Benkner

Hans Michael Gerndt

Jeffrey K. Hollingsworth

Dagstuhl Reports, 2013

Power system probabilistic and security analysis on commodity high performance computing systems.

Proceedings of the 3rd International Workshop on High Performance Computing, 2013

When polyhedral transformations meet SIMD code generation.

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013

A Quasi-Monte Carlo approach for radial distribution system probabilistic load flow.

Proceedings of the IEEE PES Innovative Smart Grid Technologies Conference, 2013

A stencil compiler for short-vector SIMD architectures.

Proceedings of the International Conference on Supercomputing, 2013

Accelerating sparse matrix-matrix multiplication with 3D-stacked logic-in-memory hardware.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2013

A 3D-stacked logic-in-memory accelerator for application-specific data intensive computing.

Proceedings of the 2013 IEEE International 3D Systems Integration Conference (3DIC), 2013

2012

Computer Generation of Hardware for Linear Digital Signal Processing Transforms.

ACM Trans. Design Autom. Electr. Syst., 2012

A Smart Memory Accelerated Computed Tomography Parallel Backprojection.

Qiuling Zhu

Proceedings of the VLSI-SoC: From Algorithms to Circuits and System-on-Chip Design, 2012

Cost-effective smart memory implementation for parallel backprojection in computed tomography.

Qiuling Zhu

Proceedings of the 20th IEEE/IFIP International Conference on VLSI and System-on-Chip, 2012

Automatic Generation of the HPC Challenge's Global FFT Benchmark for BlueGene/P.

Gheorghe Almási

Proceedings of the High Performance Computing for Computational Science, 2012

Highly Efficient Performance Portable Tracking of Evolving Surfaces.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Polar format synthetic aperture radar in energy efficient application-specific logic-in-memory.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Optimized parallel distribution load flow solver on commodity multi-core CPU.

Proceedings of the IEEE Conference on High Performance Extreme Computing, 2012

Algorithm and architecture optimization for large size two dimensional discrete Fourier transform (abstract only).

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Memory Bandwidth Efficient Two-Dimensional Fast Fourier Transform Algorithm and Implementation for Large Problem Sizes.

Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

Design Automation Framework for Application-Specific Logic-in-Memory Blocks.

Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

2011

Spiral.

Encyclopedia of Parallel Computing, 2011

FFT (Fast Fourier Transform).

Encyclopedia of Parallel Computing, 2011

Autotuning a Random Walk Boolean Satisfiability Solver.

Proceedings of the International Conference on Computational Science, 2011

On-line Decentralized Charging of Plug-In Electric Vehicles in Power Systems.

CoRR, 2011

Automatic SIMD vectorization of fast Fourier transforms for the Larrabee and AVX instruction sets.

Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

Real-time software implementation of an IEEE 802.11a baseband receiver on Intel multicore.

Proceedings of the IEEE International Conference on Acoustics, 2011

Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures.

Proceedings of the Compiler Construction - 20th International Conference, 2011

2010

High Performance Stereo Vision Designed for Massively Data Parallel Platforms.

IEEE Trans. Circuits Syst. Video Technol., 2010

Abstract only: SPIRAL-generated modular FFTs.

ACM Commun. Comput. Algebra, 2010

Fast bilateral filtering by adapting block size.

Proceedings of the International Conference on Image Processing, 2010

Fast and robust active contours for image segmentation.

Proceedings of the International Conference on Image Processing, 2010

Hardware implementation of the discrete Fourier transform with non-power-of-two problem size.

Proceedings of the IEEE International Conference on Acoustics, 2010

Computer Generation of Efficient Software Viterbi Decoders.

Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Spiral-generated modular FFT algorithms.

Proceedings of the 4th International Workshop on Parallel Symbolic Computation, 2010

2009

Discrete Fourier transform on multicore.

IEEE Signal Process. Mag., 2009

Computer generation of fast Fourier transforms for the cell broadband engine.

Srinivas Chellappa

Proceedings of the 23rd international conference on Supercomputing, 2009

Generating high performance pruned FFT implementations.

Proceedings of the IEEE International Conference on Acoustics, 2009

Operator Language: A Program Generation Framework for Fast Kernels.

Proceedings of the Domain-Specific Languages, IFIP TC 2 Working Conference, 2009

2008

BlueGene/L applications: Parallelism On a Massive Scale.

Int. J. High Perform. Comput. Appl., 2008

Domain-specific library generation for parallel software and hardware platforms.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Formal datapath representation and manipulation for implementing DSP transforms.

Proceedings of the 45th Design Automation Conference, 2008

Generating SIMD Vectorized Permutations.

Proceedings of the Compiler Construction, 17th International Conference, 2008

System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries.

Proceedings of the Algebraic Methodology and Software Technology, 2008

2007

SIMD Vectorization of Non-Two-Power Sized FFTs.

Proceedings of the IEEE International Conference on Acoustics, 2007

FFT Compiler: from math to efficient hardware HLDVT invited short paper.

Proceedings of the IEEE International High Level Design Validation and Test Workshop, 2007

Performance/Energy Optimization of DSP Transforms on the XScale Processor.

Paolo D'Alberto

Proceedings of the High Performance Embedded Architectures and Compilers, 2007

How to Write Fast Numerical Code: A Small Introduction.

Srinivas Chellappa

Proceedings of the Generative and Transformational Techniques in Software Engineering II, 2007

Generating FPGA-Accelerated DFT Libraries.

Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007

2006

A Rewriting System for the Vectorization of Signal Transforms.

Proceedings of the High Performance Computing for Computational Science, 2006

Gordon Bell finalists I - Large-scale electronic structure calculations of high-Z metals on the BlueGene/L platform.

François Gygi

Erik W. Draeger

Martin Schulz

Bronis R. de Supinski

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Tools and techniques for performance - FFT program generation for shared memory: SMP and multicore.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Automatic Performance Optimization of the Discrete Fourier Transform on Distributed Memory Computers.

Proceedings of the Parallel and Distributed Processing and Applications, 2006

Program generation for the all-pairs shortest path problem.

Sung-Chul Han

Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), 2006

2005

SPIRAL: Code Generation for DSP Transforms.

Proc. IEEE, 2005

Efficient Utilization of SIMD Extensions.

Proc. IEEE, 2005

Vectorization techniques for the Blue Gene/L double FPU.

IBM J. Res. Dev., 2005

Large-Scale First-Principles Molecular Dynamics simulations on the BlueGene/L Platform using the Qbox code.

Bronis R. de Supinski

John A. Gunnels

James C. Sexton

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Formal loop merging for signal transforms.

Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

Performance analysis of the filtered backprojection image reconstruction algorithms.

Thammanit Pipatsrisawat

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Automatically Tuned FFTs for BlueGene/L's Double FPU.

Proceedings of the High Performance Computing for Computational Science, 2004

FFT Compiler Techniques.

Peter Wurzinger

Proceedings of the Compiler Construction, 13th International Conference, 2004

2003

On Using ZENTURIO for Performance and Parameter Studies on Cluster and Grid Architectures.

Proceedings of the 11th Euromicro Workshop on Parallel, 2003

Short Vector Code Generation for the Discrete Fourier Transform.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Short vector code generation and adaptation for DSP algorithms.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

SIMD Vectorization of Straight Line FFT Code.

Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002

A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms.

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001

Architecture independent short vector FFTs.

Herbert Karner