Sreenivas Subramoney

Konstantinos Kanellopoulos

CoRR, 2024

CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware.

CoRR, 2024

Telescope: Telemetry for Gargantuan Memory Footprint Applications.

Proceedings of the 2024 USENIX Annual Technical Conference, 2024

Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution.

Mohammad Sadrosadati

Abhimanyu Rajeshkumar Bambhaniya

Onur Mutlu

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

2023

Enhanced regularization for on-chip training using analog and temporary memory weights.

Maryam Shojaei Baghini

Udayan Ganguly

Neural Networks, August, 2023

Telescope: Telemetry at Terabyte Scale.

CoRR, 2023

Motivating Next-Generation OS Physical Memory Management for Terabyte-Scale NVMMs.

CoRR, 2023

Reclaimer: A Reinforcement Learning Approach to Dynamic Resource Allocation for Cloud Microservices.

CoRR, 2023

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs.

Geonhwa Jeong

Sana Damani

Eric Qin

Christopher J. Hughes

Hyesoon Kim

Tushar Krishna

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

2022

A Unified Programmable Edge Matrix Processor for Deep Neural Networks and Matrix Algebra.

ACM Trans. Embed. Comput. Syst., September, 2022

A Survey of Deep Learning on CPUs: Opportunities and Co-Optimizations.

Sparsh Mittal

Poonam Rajput

IEEE Trans. Neural Networks Learn. Syst., 2022

Unsupervised Learning of Depth, Camera Pose and Optical Flow from Monocular Video.

Dipan Mandal

Abhilash Jain

CoRR, 2022

Disrupting Low-Write-Energy vs. Fast-Read Dilemma in RRAM to Enable L1 Instruction Cache.

Proceedings of the VLSI Design and Test - 26th International Symposium, 2022

Speculative Code Compaction: Eliminating Dead Code via Speculative Microcode Transformations.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Compute-In-Memory Using 6T SRAM for a Wide Variety of Workloads.

Pramod Kumar Bharti

Saurabh Jain

Kamlesh R. Pillai

Sagar Varma Sayyaparaju

Gurpreet S. Kalsi

Joycee Mekie

Niranjan K. Soundararajan

Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Thermometer: profile-guided BTB replacement for data center applications.

Shixin Song

Tanvir Ahmed Khan

Sara Mahdizadeh-Shahri

Akshitha Sriraman

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping.

Damla Senol Cali

Konstantinos Kanellopoulos

Nour Almadhoun Alserr

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion.

CoRR, 2021

Page Table Management for Heterogeneous Memory Systems.

CoRR, 2021

PDede: Partitioned, Deduplicated, Delta Branch Target Buffer.

Niranjan K. Soundararajan

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Cryptographic Capability Computing.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Twig: Profile-Guided BTB Prefetching for Data Center Applications.

Tanvir Ahmed Khan

Nathan Brown

Akshitha Sriraman

Niranjan K. Soundararajan

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning.

Rahul Bera

Konstantinos Kanellopoulos

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Radiant: efficient page table management for tiered memory systems.

Proceedings of the ISMM '21: 2021 ACM SIGPLAN International Symposium on Memory Management, 2021

REDUCT: Keep it Close, Keep it Cool!: Efficient Scaling of DNN Inference on Multi-core CPUs with Near-Cache Compute.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

ONT-X: An FPGA Approach to Real-time Portable Genomic Analysis.

C. N. Ramachandra

Anirban Nag

Rajeev Balasubramonian

Gurpreet S. Kalsi

Kamlesh R. Pillai

Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU.

Geonhwa Jeong

Eric Qin

Ananda Samajdar

Christopher J. Hughes

Akshay Krishna Ramanathan

Hyesoon Kim

Tushar Krishna

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020

REAL: REquest Arbitration in Last Level Caches.

ACM Trans. Embed. Comput. Syst., 2020

AccSS3D: Accelerator for Spatially Sparse 3D DNNs.

CoRR, 2020

Proximu: Efficiently Scaling DNN Inference in Multi-core CPUs through Near-Cache Compute.

CoRR, 2020

Look-Up Table based Energy Efficient Processing in Cache Support for Neural Network Acceleration.

Gurpreet S. Kalsi

Srivatsa Srinivasa

Tarun Makesh Chandran

Kamlesh R. Pillai

Om Ji Omer

Vijaykrishnan Narayanan

Nagadastagiri Challapalle

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis.

Rachata Ausavarungnirun

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Characterization of Data Generating Neural Network Applications on x86 CPU Architecture.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Auto-Predication of Critical Branches.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Focused Value Prediction.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Towards Noise Resilient SLAM.

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Descriptor Scoring for Feature Selection in Real-Time Visual SLAM.

Proceedings of the IEEE International Conference on Image Processing, 2020

PSB-RNN: A Processing-in-Memory Systolic Array Architecture using Block Circulant Matrices for Recurrent Neural Networks.

Vijaykrishnan Narayanan

Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Opportunistic Early Pipeline Re-steering for Data-dependent Branches.

Saurabh Gupta

Niranjan Soundararajan

Ragavendra Natarajan

Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019

Towards the adoption of Local Branch Predictors in Modern Out-of-Order Superscalar Processors.

Niranjan Soundararajan

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

DSPatch: Dual Spatial Pattern Prefetcher.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Bandwidth-Aware Last-Level Caching: Efficiently Coordinating Off-Chip Read and Write Bandwidth.

Mainak Chaudhuri

Santhosh Kumar Rethinagiri

Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Visual Inertial Odometry At the Edge: A Hardware-Software Co-design Approach for Ultra-low Latency and Power.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

2018

MARS: Memory Aware Reordered Source.

CoRR, 2018

Tackling memory access latency through DRAM row management.

Proceedings of the International Symposium on Memory Systems, 2018

Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-Level Cache Hierarchies.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Density Tradeoffs of Non-Volatile Memory as a Replacement for SRAM Based Last Level Cache.

Sasikanth Manipatruni

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Closed yet open DRAM: achieving low latency and high performance in DRAM memory systems.

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of Cores, Caches, and On-chip Network.

Rahul Jain

ACM Trans. Archit. Code Optim., 2017

Micro-Sector Cache: Improving Space Utilization in Sectored DRAM Caches.

ACM Trans. Archit. Code Optim., 2017

Near-Optimal Access Partitioning for Memory Hierarchies with Multiple Heterogeneous Bandwidth Sources.

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

A coordinated multi-agent reinforcement learning approach to multi-level cache co-partitioning.

Rahul Jain

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

2016

Base-Victim Compression: An Opportunistic Cache Compression Architecture.

Alaa R. Alameldeen

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Machine Learned Machines: Adaptive co-optimization of caches, cores, and On-chip Network.

Rahul Jain

Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

2014

Array scalarization in high level synthesis.

Namita Sharma

Arun Kumar Pilania

Gummidipudi Krishnaiah

Ashok Jagannathan

Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013

Efficient management of last-level caches in graphics processors for 3D scene rendering workloads.

Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

2012

Introducing hierarchy-awareness in replacement and bypass algorithms for last-level caches.

Mainak Chaudhuri

Nithiyanandan Bashyam

Joseph Nuzman

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Bypass and insertion algorithms for exclusive last-level caches.

Mainak Chaudhuri

Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

2004

Prefetch injection based on hardware monitoring and object metadata.

Ali-Reza Adl-Tabatabai

Richard L. Hudson

Mauricio J. Serrano