Nachiket Kapre

Srinirdheeshwar Kuttuva Prakash

ACM Trans. Reconfigurable Technol. Syst., 2022

HopliteML: Evolving Application Customized FPGA NoCs with Adaptable Routers and Regulators.

ACM Trans. Reconfigurable Technol. Syst., 2022

Managing HBM Bandwidth on Multi-Die FPGAs with FPGA Overlay NoCs.

Hiren D. Patel

Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

2021

Worst-case latency analysis for the Versal NoC network packet switch.

Ian Elmor Lang

Rodolfo Pellizzoni

Proceedings of the NOCS '21: International Symposium on Networks-on-Chip, 2021

Mocarabe: High-Performance Time-Multiplexed Overlays for FPGAs.

Frederick Tombs

Alireza Mellat

Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

2020

HopliteBuf: Network Calculus-Based Design of FPGA NoCs with Provably Stall-Free FIFOs.

ACM Trans. Reconfigurable Technol. Syst., 2020

RapidLayout: Fast Hard Block Placement of FPGA-optimized Systolic Arrays using Evolutionary Algorithms.

Niansong Zhang

Xiang Chen

CoRR, 2020

DarwiNN: efficient distributed neuroevolution under communication constraints.

Gurshaant Singh Malik

Lucian Petrica

Michaela Blott

Proceedings of the GECCO '20: Genetic and Evolutionary Computation Conference, 2020

RapidLayout: Fast Hard Block Placement of FPGA-Optimized Systolic Arrays using Evolutionary Algorithms.

Niansong Zhang

Xiang Chen

Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Learn the Switches: Evolving FPGA NoCs with Stall-Free and Backpressure Based Routers.

Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Exploring the Impact of Switch Arity on Butterfly Fat Tree FPGA NoCs.

Ian Elmor Lang

Ziqiang Huang

Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

2019

Partitioning FPGA-Optimized Systolic Arrays for Fun and Profit.

Long Chung Chan

Gurshaant Malik

Proceedings of the International Conference on Field-Programmable Technology, 2019

Scaling the Cascades: Interconnect-Aware FPGA Implementation of Machine Learning Problems.

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Timing-Aware Routing in the RapidWright Framework.

Leo Liu

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Enhancing Butterfly Fat Tree NoCs for FPGAs with Lightweight Flow Control.

Gurshaant Singh Malik

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

HopliteBuf: FPGA NoCs with Provably Stall-Free FIFOs.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

RapidRoute: Fast Assembly of Communication Structures for FPGA Overlays.

Leo Liu

Jay Weng

Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

2018

CaffePresso: Accelerating Convolutional Networks on Embedded SoCs.

Gopalakrishna Hegde

ACM Trans. Embed. Comput. Syst., 2018

FastTrack: Leveraging Heterogeneous FPGA Wires to Design Low-Cost High-Performance Soft NoCs.

Tushar Krishna

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Implementing NEF Neural Networks on Embedded FPGAs.

Proceedings of the International Conference on Field-Programmable Technology, 2018

DaCO: A High-Performance Token Dataflow Coprocessor Overlay for FPGAs.

Proceedings of the International Conference on Field-Programmable Technology, 2018

FastTrack: Exploiting Fast FPGA Wiring for Implementing NoC Shortcuts (Abstract Only).

Tushar Krishna

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

LegUp-NoC: High-Level Synthesis of Loops with Indirect Addressing.

Asif Islam

Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Hoplite-Q: Priority-Aware Routing in FPGA Overlay NoCs.

Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017

Hoplite: A Deflection-Routed Directional Torus NoC for FPGAs.

Jan Gray

ACM Trans. Reconfigurable Technol. Syst., 2017

The structure and dynamics of knowledge network in domain-specific Q&A sites: a case study of Stack Overflow.

Deheng Ye

Zhenchang Xing

Empir. Softw. Eng., 2017

Out-of-Order Dataflow Scheduling for FPGA Overlays.

CoRR, 2017

Applying Models of Computation to OpenCL Pipes for FPGA Computing.

Hiren D. Patel

Proceedings of the 5th International Workshop on OpenCL, 2017

HopliteRT: An efficient FPGA NoC for real-time applications.

Saud Wasly

Rodolfo Pellizzoni

Proceedings of the International Conference on Field Programmable Technology, 2017

Enabling partial reconfiguration and low latency routing using segmented FPGA NoCs.

Kizheppatt Vipin

Jan Gray

Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Deflection-routed butterfly fat trees on FPGAs.

Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

120-core microAptiv MIPS Overlay for the Terasic DE5-NET FPGA board.

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Implementing FPGA Overlay NoCs Using the Xilinx UltraScale Memory Cascades.

Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

On Bit-Serial NoCs for FPGAs.

Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

eBSP: Managing NoC traffic for BSP workloads on the 16-core Adapteva Epiphany-III processor.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

2016

Optimizing Soft Vector Processing in FPGA-Based Embedded Systems.

ACM Trans. Reconfigurable Technol. Syst., 2016

Software-Specific Named Entity Recognition in Software Engineering Social Content.

Proceedings of the IEEE 23rd International Conference on Software Analysis, 2016

Software-specific part-of-speech tagging: an experimental study on Stack Overflow.

Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016

Learning to Extract API Mentions from Informal Natural Language Discussions.

Proceedings of the 2016 IEEE International Conference on Software Maintenance and Evolution, 2016

Deflection routing for multi-level FPGA overlay NoCs.

Kumar H. B. Chethan

Shubham Agarwal

Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

Boosting convergence of timing closure using feature selection in a Learning-driven approach.

Que Yanghua

Harnhua Ng

Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Vector FPGA acceleration of 1-D DWT computations using sparse matrix skeletons.

Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Survey of domain-specific languages for FPGA computing.

Samuel Bayliss

Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Hoplite-DSP: Harnessing the Xilinx DSP48 multiplexers to efficiently support NoCs on FPGAs.

Kumar H. B. Chethan

Chinnakkannu Adaikkala Raj

Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Case for Design-Specific Machine Learning in Timing Closure of FPGA Designs.

Que Yanghua

Harnhua Ng

Kirvy Teo

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Machine-Learning driven Auto-Tuning of High-Level Synthesis for FPGAs (Abstract Only).

Li Ting

Harri Wijaya

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of FPGA Datapaths.

Deheng Ye

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Improving Classification Accuracy of a Machine Learning Approach for FPGA Timing Closure.

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Communication Optimization for the 16-Core Epiphany Floating-Point Processor Array.

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Marathon: Statically-Scheduled Conflict-Free Routing on FPGA Overlay NoCs.

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Evaluating Embedded FPGA Accelerators for Deep Learning Applications.

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Preventive Detection of Mosquito Populations using Embedded Machine Learning on Low Power IoT Platforms.

Prashant Ravi

Uma Syam

Proceedings of the 7th Annual Symposium on Computing for Development, 2016

CaffePresso: an optimized library for deep learning on embedded accelerator-based platforms.

Proceedings of the 2016 International Conference on Compilers, 2016

2015

Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs.

IEEE Trans. Parallel Distributed Syst., 2015

A Case for Embedded FPGA-based SoCs in Energy-Efficient Acceleration of Graph Problems.

Pradeep Moorthy

Supercomput. Front. Innov., 2015

G-DMA: improving memory access performance for hardware accelerated sparse graph computation.

Andrew Bean

Peter Y. K. Cheung

Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2015

GraphMMU: Memory Management Unit for Sparse Graph Accelerators.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Enhancing Speedups for FPGA Accelerated SPICE through Frequency Scaling and Precision Reduction.

Lim Hui Hui

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Limits of FPGA acceleration of 3D Green's Function computation for geophysical applications.

Sagar Shrishailappa Masuti

Jayakrishnan Selva Kumar

Parjanya Gupta

Sylvain Barbot

Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

Hoplite: Building austere overlay NoCs for FPGAs.

Jan Gray

Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

On Data Forwarding in Deeply Pipelined Soft Processors.

Hui Yan Cheah

Suhaib A. Fahmy

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

FPGA Acceleration of Irregular Iterative Computations using Criticality-Aware Dataflow Optimizations (Abstract Only).

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

InTime: A Machine Learning Approach for Efficient Selection of FPGA CAD Tool Parameters.

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

Zedwulf: Power-Performance Tradeoffs of a 32-Node Zynq SoC Cluster.

Pradeep Moorthy

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Driving Timing Convergence of FPGA Designs through Machine Learning and Cloud Computing.

Bibin Chandrashekaran

Harnhua Ng

Kirvy Teo

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Sparse Graph Processing with Soft-Processors.

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Energy-Efficient Acceleration of OpenCV Saliency Computation Using Soft Vector Processors.

Gopalakrishna Hegde

Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Custom FPGA-based soft-processors for sparse graph acceleration.

Sagar Shrishailappa Masuti

Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

2014

Relax-Miracle: GPU parallelization of semi-analytic Fourier-domain solvers for earthquake modeling.

Sylvain Barbot

Proceedings of the 21st International Conference on High Performance Computing, 2014

Fanout decomposition dataflow optimizations for FPGA-based Sparse LU factorization.

Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

Analysis and optimization of a deeply pipelined FPGA soft processor.

Hui Yan Cheah

Suhaib A. Fahmy

Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

Heterogeneous dataflow architectures for FPGA-based sparse LU factorization.

Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

Comparing soft and hard vector processing in FPGA-based embedded systems.

Soh Jun Jie

Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

MixFX-SCORE: Heterogeneous Fixed-Point Compilation of Dataflow Computations.

Deheng Ye

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

Timing Fault Detection in FPGA-Based Circuits.

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization.

Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

2013

System-level FPGA device driver with high-level synthesis support.

Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

Application Composition and Communication Optimization in Iterative Solvers Using FPGAs.

Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

Exploiting Input Parameter Uncertainty for Reducing Datapath Precision of SPICE Device Models.

Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

2012

SPICE²: Spatial Processors Interconnected for Concurrent Execution for Accelerating the SPICE Circuit Simulator Using an FPGA.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2012

Enhancing performance of Tall-Skinny QR factorization using FPGAs.

Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

FX-SCORE: A Framework for Fixed-Point Compilation of SPICE Device Models Using Gappa++.

Hélène Martorell

Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

A High Throughput FPGA-Based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem.

Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2012

2011

Spatial hardware implementation for sparse graph algorithms in GraphStep.

ACM Trans. Auton. Adapt. Syst., 2011

An NoC Traffic Compiler for Efficient FPGA Implementation of Sparse Graph-Oriented Workloads.

Int. J. Reconfigurable Comput., 2011

VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration.

Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011

2010

An NoC Traffic Compiler for efficient FPGA implementation of Parallel Graph Applications.

Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip, 2010

2009

Pipelining Saturated Accumulation.

IEEE Trans. Computers, 2009

Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors.

Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Accelerating SPICE Model-Evaluation using FPGAs.

Proceedings of the FCCM 2009, 2009

2007

Optimistic Parallelization of Floating-Point Accumulation.
