Sai Rahul Chalamalasetti

Orcid: 0000-0001-9004-440X

  • Hewlett Packard Enterprise Labs, Palo Alto, CA, USA

According to our database, Sai Rahul Chalamalasetti authored at least 35 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.






Predicting Heterogeneity and Serverless Principles of Converged High-Performance Computing, Artificial Intelligence, and Workflows.
Computer, January, 2024

Kernel-as-a-Service: A Serverless Programming Model for Heterogeneous Hardware Accelerators.
Proceedings of the 24th International Middleware Conference, 2023

A Python-based High-Level Programming Flow for CPU-FPGA Heterogeneous Systems (Invited Paper).
Proceedings of the IEEE/ACM Programming Environments for Heterogeneous Computing, 2021

Resource Sharing and Security Implications on Machine Learning Inference Accelerators.
Proceedings of the IEEE 45th Annual Computers, Software, and Applications Conference, 2021

Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM.
IEEE Trans. Computers, 2020

Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

FPGA Demonstrator of a Programmable Ultra-Efficient Memristor-Based Machine Learning Inference Accelerator.
Proceedings of the 2019 IEEE International Conference on Rebooting Computing, 2019

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

Regular Expression Matching with Memristor TCAMs for Network Security.
Proceedings of the 14th IEEE/ACM International Symposium on Nanoscale Architectures, 2018

Regular Expression Matching with Memristor TCAMs.
Proceedings of the 2018 IEEE International Conference on Rebooting Computing, 2018

Autotuning high-level synthesis for FPGAs using OpenTuner and LegUp.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2017

Generalize or Die: Operating Systems Support for Memristor-Based Accelerators.
Proceedings of the IEEE International Conference on Rebooting Computing, 2017

High Level Programming of Document Classification Systems for Heterogeneous Environments using OpenCL (Abstract Only).
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

High Level Programming for Heterogeneous Architectures.
CoRR, 2014

High level programming framework for FPGAs in the data center.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

Design and Evaluation of High-Performance Processing Elements for Reconfigurable Systems.
IEEE Trans. Very Large Scale Integr. Syst., 2013

Throughput/Resource-Efficient Reconfigurable Processor for Multimedia Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2013

A hybrid CPU-FPGA system for high throughput (10Gb/s) streaming document classification.
SIGARCH Comput. Archit. News, 2013

An FPGA memcached appliance.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

High throughput filtering using FPGA-acceleration.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Throughput Analysis for a High-Performance FPGA-Accelerated Real-Time Search Application.
Int. J. Reconfigurable Comput., 2012

Evaluating FPGA-acceleration for real-time unstructured search.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

A few lines of code, thousands of cores: High-level FPGA programming using vector processor networks.
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011

Radiation-Hardened Reconfigurable Array With Instruction Roll-Back.
IEEE Embed. Syst. Lett., 2010

Design of self correcting radiation hardened digital circuits using decoupled ground bus.
Proceedings of the 20th ACM Great Lakes Symposium on VLSI 2010, 2010

A C++-embedded Domain-Specific Language for programming the MORA soft processor array.
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010

Low overhead soft error detection and correction scheme for reconfigurable pipelined data paths.
Proceedings of the 2010 NASA/ESA Conference on Adaptive Hardware and Systems, 2010

A 1.2 V, 1.02 GHz 8-bit SIMD compatible highly parallel arithmetic data path for multi-precision arithmetic.
Proceedings of the 19th ACM Great Lakes Symposium on VLSI 2009, 2009

A low cost reconfigurable soft processor for multimedia applications: Design synthesis and programming model.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Programming Model and Low-level Language for a Coarse-Grained Reconfigurable Multimedia Processor.
Proceedings of the 2009 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2009

MORA - An Architecture and Programming Model for a Resource Efficient Coarse Grained Reconfigurable Processor.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2009

Power/throughput/area efficient PIM-based reconfigurable array for parallel processing.
Proceedings of the 21st Annual IEEE International SoC Conference, SoCC 2008, 2008

Power-Efficient High Throughput Reconfigurable Datapath Design for Portable Multimedia Devices.
Proceedings of the ReConFig'08: 2008 International Conference on Reconfigurable Computing and FPGAs, 2008