Mohammad Sadrosadati

Yahya Can Tugrul

CoRR, 2024

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems.

CoRR, 2024

Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

CoMeT: Count-Min-Sketch-based Row Tracking to Mitigate RowHammer at Low Cost.

F. Nisa Bostanci

Ismail Emir Yüksel

Ataberk Olgun

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

2023

ALP: Alleviating CPU-Memory Data Movement Overheads in Memory-Centric Systems.

Nastaran Hajinazar

Juan Gómez-Luna

IEEE Trans. Emerg. Top. Comput., 2023

PULSAR: Simultaneous Many-Row Activation for Reliable and High-Performance Computing in Off-the-Shelf DRAM Chips.

Ismail Emir Yüksel

Yahya Can Tugrul

F. Nisa Bostanci

CoRR, 2023

MetaStore: High-Performance Metagenomic Analysis via In-Storage Computing.

CoRR, 2023

TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems.

CoRR, 2023

Energy Consumption Analysis of Instruction Cache Prefetching Methods.

Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshops, 2023

Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources.

Davide Basilio Bartolini

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Utopia: Fast and Efficient Address Translation via Hybrid Restrictive & Flexible Virtual-to-Physical Address Mappings.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

TransPimLib: Efficient Transcendental Functions for Processing-in-Memory Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

RowPress: Amplifying Read Disturbance in Modern DRAM Chips.

Haocong Luo

Ataberk Olgun

Ehsan Yousefzadeh-Asl-Miandoab

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022

pLUTo: Enabling Massively Parallel Computation In DRAM via Lookup Tables.

Dataset, July, 2022

OSM: Off-Chip Shared Memory for GPUs.

Sina Darabi

IEEE Trans. Parallel Distributed Syst., 2022

NURA: A Framework for Supporting Non-Uniform Resource Accesses in GPUs.

Sina Darabi

Negin Mahani

Hazhir Bakhishi

Ehsan Yousefzadeh-Asl-Miandoab

Proc. ACM Meas. Anal. Comput. Syst., 2022

TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering.

CoRR, 2022

RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory.

Nika Mansouri-Ghiasi

Geraldo F. Oliveira

CoRR, 2022

Chapter One - Traffic-load-aware virtual channel power-gating in network-on-chips.

Negar Akbarzadeh

Mehdi Modarressi

Adv. Comput., 2022

Chapter Two - An efficient DVS scheme for on-chip networks.

Negar Akbarzadeh

Homa Aghilinasab

Adv. Comput., 2022

Chapter Three - A power-performance balanced network-on-chip for mixed CPU-GPU systems.

Behnaz Soltani

Adv. Comput., 2022

GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping.

Nour Almadhoun Alserr

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction.

Rahul Bera

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Flash-Cosmos: In-Flash Bulk Bitwise Operations Using Inherent Computation Capability of NAND Flash Memory.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

2021

Efficient Nearest-Neighbor Data Sharing in GPUs.

ACM Trans. Archit. Code Optim., 2021

pLUTo: In-DRAM Lookup Tables to Enable Massively Parallel General-Purpose Computation.

CoRR, 2021

Data-Aware Compression of Neural Networks.

IEEE Comput. Archit. Lett., 2021

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks.

IEEE Access, 2021

CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

2020

Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation.

Seyyed Hossein Seyyedaghaei Rezaei

CoRR, 2020

NoM: Network-on-Memory for Inter-Bank Data Transfer in Highly-Banked Memories.

Mehdi Modarressi

Masoud Daneshtalab

IEEE Comput. Archit. Lett., 2020

FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

2019

Highly Concurrent Latency-tolerant Register Files for GPUs.

Fatemehsadat Mireshghallah

ACM Trans. Comput. Syst., 2019

Energy-Efficient Permanent Fault Tolerance in Hard Real-Time Systems.

Mohammad Bakhshalipour

IEEE Trans. Computers, 2019

ITAP: Idle-Time-Aware Power Management for GPU Execution Units.

Seyed Borna Ehsani

Hajar Falahati

ACM Trans. Archit. Code Optim., 2019

Dataplant: In-DRAM Security Mechanisms for Low-Cost Devices.

CoRR, 2019

Focus on What is Needed: Area and Power Efficient FPGAs Using Turn-Restricted Switch Boxes.

Fatemeh Serajeh-hassani

Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

2018

BARAN: Bimodal Adaptive Reconfigurable-Allocator Network-on-Chip.

Fatemeh Aghamohammadi

Mehdi Modarressi

ACM Trans. Parallel Comput., 2018

ORIGAMI: A Heterogeneous Split Architecture for In-Memory Acceleration of Learning.

CoRR, 2018

Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions.

CoRR, 2018

Neda: Supporting Direct Inter-Core Neighbor Data Exchange in GPUs.

IEEE Comput. Archit. Lett., 2018

Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices.

Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching.

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017

BiNoCHS: Bimodal Network-on-Chip for CPU-GPU Heterogeneous Systems.

Proceedings of the Eleventh IEEE/ACM International Symposium on Networks-on-Chip, 2017

Effective cache bank placement for GPUs.

Shahin Roozkhosh

Hazhir Bakhishi

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

POSTER: Elastic Reconfiguration for Heterogeneous NoCs with BiNoCHS.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

A Method to Improve Adaptivity of Odd-Even Routing Algorithm in Mesh NoCs.

Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Reducing Power Consumption of GPGPUs Through Instruction Reordering.

Homa Aghilinasab

Mohammad Hossein Samavatian

Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

Quantifying the difference in resource demand among classic and modern NoC workloads.

Maryam Zare

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

2015

An efficient DVS scheme for on-chip networks using reconfigurable Virtual Channel allocators.

Homa Aghilinasab

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

An energy-efficient virtual channel power-gating mechanism for on-chip networks.
