Greg Stitt

Wesley Piard

Christopher Crary

Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2026

Bridging the Gap: A Module-Context Modeling Methodology for Hyperscale FPGA Applications.

Madison N. Emas

Austin Baylis

Proceedings of the 2026 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2026

2025

Using FPGA devices to accelerate the evaluation phase of tree-based genetic programming: an extended analysis.

Genet. Program. Evolvable Mach., June, 2025

2024

Novel Toolset for Efficient Hardwired Micro-Op Translation in Embedded Microarchitectures.

IEEE Embed. Syst. Lett., December, 2024

Low-Latency, Line-Rate Variable-Length Field Parsing for 100+ Gb/s Ethernet.

Wesley Piard

Christopher Crary

Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023

An Exploration of ATPG Methods for Redacted IP and Reconfigurable Hardware.

Jackson Fugate

Naren Vikram Raj Masna

Proceedings of the 41st IEEE VLSI Test Symposium, 2023

Using FPGA Devices to Accelerate Tree-Based Genetic Programming: A Preliminary Exploration with Recent Technologies.

Proceedings of the Genetic Programming - 26th European Conference, 2023

2022

Work-in-Progress: Toward a Robust, Reconfigurable Hardware Accelerator for Tree-Based Genetic Programming.

Proceedings of the International Conference on Compilers, 2022

2021

Scalable Performance Prediction of Irregular Workloads in Multi-Phase Particle-in-Cell Applications.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

2020

PANDORA: An Architecture-Independent Parallelizing Approximation-Discovery Framework.

David Campbell

ACM Trans. Embed. Comput. Syst., 2020

FPGA Acceleration of Fluid-Flow Kernels.

Ryan Blanchard

Herman Lam

Proceedings of the 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2020

2019

Seiba: An FPGA Overlay-Based Approach to Rapid Application Development.

Proceedings of the 2019 International Conference on ReConFigurable Computing and FPGAs, 2019

Machine Learning-based Prediction for Dynamic, Runtime Architectural Optimizations of Embedded Systems.

Proceedings of the 2019 IEEE Nordic Circuits and Systems Conference, 2019

PANDORA: a parallelizing approximation-discovery framework (WIP paper).

David Campbell

Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

A FPGA-Pipelined, High-Throughput Approach to Coarse-Grained Simulation of HPC Systems.

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

Multi-Parameter Performance Modeling using Symbolic Regression.

Sai P. Chenna

Herman Lam

Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

Energy Prediction for Cache Tuning in Embedded Systems.

Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Machine Learning-based Prediction for Dynamic Architectural Optimizations.

Proceedings of the Tenth International Green and Sustainable Computing Conference, 2019

Dynamic Scheduling on Heterogeneous Multicores.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Offloading cache configuration prediction to an FPGA for hardware speedup and overhead reduction: work-in-progress.

Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019

2018

Scalable Behavioral Emulation of Extreme-Scale Systems Using Structural Simulation Toolkit.

Proceedings of the 47th International Conference on Parallel Processing, 2018

A Recurrently Generated Overlay Architecture for Rapid FPGA Application Development.

Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018

Scalable Window Generation for the Intel Broadwell+Arria 10 and High-Bandwidth FPGA Systems.

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

High-Frequency Absorption-FIFO Pipelining for Stratix 10 HyperFlex.

Madison N. Emas

Austin Baylis

Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017

Serial Arithmetic Strategies for Improving FPGA Throughput.

ACM Trans. Embed. Comput. Syst., 2017

A High-Level Synthesis Scheduling and Binding Heuristic for FPGA Fault Tolerance.

Aniruddha Shastri

Int. J. Reconfigurable Comput., 2017

A Uniquified Virtualization Approach to Hardware Security.

IEEE Embed. Syst. Lett., 2017

A Scalable, Low-Overhead Finite-State Machine Overlay for Rapid FPGA Application Development.

CoRR, 2017

Overlay-based side-channel countermeasures: A case study on correlated noise generation.

Austin Baylis

Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

2016

The Unified Accumulator Architecture: A Configurable, Portable, and Extensible Floating-Point Accumulator.

ACM Trans. Reconfigurable Technol. Syst., 2016

A Parallel Sliding-Window Generator for High-Performance Digital-Signal Processing on FPGAs.

Eric Schwartz

Patrick Cooke

ACM Trans. Reconfigurable Technol. Syst., 2016

Behavioral Emulation for Scalable Design-Space Exploration of Algorithms and Architectures.

Proceedings of the High Performance Computing, 2016

Doubling FPGA Throughput via a Soft SerDes Architecture for Full-Bandwidth Serial Pipelining (Abstract Only).

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

2015

Low-Overhead FPGA Middleware for Application Portability and Productivity.

Robert Kirchgessner

ACM Trans. Reconfigurable Technol. Syst., 2015

A Tradeoff Analysis of FPGAs, GPUs, and Multicores for Sliding-Window Applications.

ACM Trans. Reconfigurable Technol. Syst., 2015

Finite-State-Machine Overlay Architectures for Fast FPGA Compilation and Application Portability.

Patrick Cooke

Lu Hao

ACM Trans. Embed. Comput. Syst., 2015

Core-Level Modeling and Frequency Prediction for DSP Applications on FPGAs.

Int. J. Reconfigurable Comput., 2015

Revisiting Serial Arithmetic: A Performance and Tradeoff Analysis for Parallel Applications on Modern FPGAs.

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Adjustable-Cost Overlays for Runtime Compilation.

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

A scheduling and binding heuristic for high-level synthesis of fault-tolerant FPGA applications.

Aniruddha Shastri

Eduardo Riccio

Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

An interpolation-based approach to multi-parameter performance modeling for heterogeneous systems.

Dylan Rudolph

Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

2014

Fast, Flexible High-Level Synthesis from OpenCL using Reconfiguration Contexts.

IEEE Micro, 2014

A framework for dynamic parallelization of FPGA-accelerated applications.

Jeremy Fowers

Jianye Liu

Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, 2014

A High Memory Bandwidth FPGA Accelerator for Sparse Matrix-Vector Multiplication.

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

2013

A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors.

ACM Trans. Archit. Code Optim., 2013

Dynafuse: dynamic dependence analysis for FPGA pipeline fusion and locality optimizations.

Jeremy Fowers

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

A high-performance, low-energy FPGA accelerator for correntropy-based feature tracking (abstract only).

Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Pseudo-constant logic optimization.

Proceedings of the 24th International Conference on Application-Specific Systems, 2013

Virtual finite-state-machine architectures for fast compilation and portability.

Lu Hao

Proceedings of the 24th International Conference on Application-Specific Systems, 2013

A comparison of correntropy-based feature tracking on FPGAs and GPUs.

Proceedings of the 24th International Conference on Application-Specific Systems, 2013

2012

SCF: A Framework for Task-Level Coordination in Reconfigurable, Heterogeneous Systems.

ACM Trans. Reconfigurable Technol. Syst., 2012

RCML: An Environment for Estimation Modeling of Reconfigurable Computing Systems.

ACM Trans. Embed. Comput. Syst., 2012

Elastic computing: A portable optimization framework for hybrid computers.

Parallel Comput., 2012

Bandwidth-Sensitivity-Aware Arbitration for FPGAs.

Lu Hao

IEEE Embed. Syst. Lett., 2012

RACECAR: a heuristic for automatic function specialization on multi-core heterogeneous systems.

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

VirtualRC: a virtual FPGA platform for applications and tools portability.

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications.

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Communication visualization for bottleneck detection of high-level synthesis applications.

John Curreri

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

BPR: fast FPGA placement and routing using macroblocks.

Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis, 2012

The RACECAR heuristic for automatic function specialization on multi-core heterogeneous systems.

Jeremy Fowers

Proceedings of the 15th International Conference on Compilers, 2012

A low-overhead interconnect architecture for virtual reconfigurable fabrics.

Proceedings of the 15th International Conference on Compilers, 2012

2011

Platform-aware bottleneck detection for reconfigurable computing applications.

Seth Koehler

ACM Trans. Reconfigurable Technol. Syst., 2011

Thread Warping: Dynamic and Transparent Synthesis of Thread Accelerators.

ACM Trans. Design Autom. Electr. Syst., 2011

Are Field-Programmable Gate Arrays Ready for the Mainstream?

IEEE Micro, 2011

High-Level Synthesis of In-Circuit Assertions for Verification, Debugging, and Timing Analysis.

John Curreri

Int. J. Reconfigurable Comput., 2011

Intermediate Fabrics: Virtual Architectures for Near-Instant FPGA Compilation.

IEEE Embed. Syst. Lett., 2011

An End-to-End Tool Flow for FPGA-Accelerated Scientific Computing.

IEEE Des. Test Comput., 2011

Novo-G: At the Forefront of Scalable Reconfigurable Supercomputing.

Herman Lam

Comput. Sci. Eng., 2011

2010

Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures.

Int. J. Reconfigurable Comput., 2010

Performance modeling for multilevel communication in SHMEM+.

Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, 2010

Elastic computing: a framework for transparent, portable, and adaptive multi-core heterogeneous computing.

Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, 2010

High-level synthesis techniques for in-circuit assertion-based verification.

John Curreri

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

A scalable performance prediction heuristic for implementation planning on heterogeneous systems.

Proceedings of the 8th IEEE Workshop on Embedded Systems for Real-Time Multimedia, 2010

An Automated Scheduling and Partitioning Algorithm for Scalable Reconfigurable Computing Systems.

Proceedings of the 2010 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2010

Novo-G: A View at the HPC Crossroads for Scientific Computing.

Proceedings of the 2010 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2010

Intermediate fabrics: virtual architectures for circuit portability and fast placement and routing.

Proceedings of the 8th International Conference on Hardware/Software Codesign and System Synthesis, 2010

2009

A framework for core-level modeling and design of reconfigurable computing algorithms.

Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2009

Bridging parallel and reconfigurable computing with multilevel PGAS and SHMEM+.

Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2009

SCF: a device- and language-independent task coordination framework for reconfigurable, heterogeneous systems.

Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2009

A Traversal Cache Framework for FPGA Acceleration of Pointer Data Structures: A Case Study on Barnes-Hut N-body Simulation.

Proceedings of the ReConFig'09: 2009 International Conference on Reconfigurable Computing and FPGAs, 2009

2008

Warp Processing: Dynamic Translation of Binaries to FPGA Circuits.

Roman L. Lysecky

Computer, 2008

Recursion flattening.

Jason R. Villarreal

Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

Hardware/software partitioning with multi-version implementation exploration.

Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

C is for circuits: capturing FPGA circuits as sequential code for portability.

Scott Sirowy

Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

Traversal caches: a first step towards FPGA acceleration of pointer-based data structures.

Gaurav Chaudhari

Proceedings of the 6th International Conference on Hardware/Software Codesign and System Synthesis, 2008

2007

Binary synthesis.

ACM Trans. Design Autom. Electr. Syst., 2007

Thread warping: a framework for dynamic synthesis of thread accelerators.

Proceedings of the 5th International Conference on Hardware/Software Codesign and System Synthesis, 2007

2006

Warp Processors.

Roman L. Lysecky

ACM Trans. Design Autom. Electr. Syst., 2006

A code refinement methodology for performance-improved synthesis from C.

Walid A. Najjar

Proceedings of the 2006 International Conference on Computer-Aided Design, 2006

2005

Techniques for synthesizing binaries to an advanced register/memory structure.

Proceedings of the ACM/SIGDA 13th International Symposium on Field Programmable Gate Arrays, 2005

A Decompilation Approach to Partitioning Software for Microprocessor/FPGA Platforms.

Proceedings of the 2005 Design, 2005

Hardware/software partitioning of software binaries: a case study of H.264 decode.

Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

2004

Energy savings and speedups from partitioning critical software loops to hardware in embedded systems.

Shawn Nematbakhsh

ACM Trans. Embed. Comput. Syst., 2004

2003

Highly configurable platforms for embedded computing systems.

Microelectron. J., 2003

Profiling tools for hardware/software partitioning of embedded applications.

Proceedings of the 2003 Conference on Languages, 2003

Dynamic hardware/software partitioning: a first approach.

Roman L. Lysecky

Proceedings of the 40th Design Automation Conference, 2003

2002

Energy Advantages of Microprocessor Platforms with On-Chip Configurable Logic.

IEEE Des. Test Comput., 2002

Improving Software Performance with Configurable Logic.

Des. Autom. Embed. Syst., 2002

Hardware/software partitioning of software binaries.

Proceedings of the 2002 IEEE/ACM International Conference on Computer-aided Design, 2002

Using On-Chip Configurable Logic to Reduce Embedded System Software Energy.

Proceedings of the 10th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2002), 2002

Codesign-extended applications.

Brian Grattan

Proceedings of the Tenth International Symposium on Hardware/Software Codesign, 2002

2001

Propagating constants past software to hardware peripherals in fixed-application embedded systems.

Rilesh Patel