Amin Farmahini Farahani

Hamidreza Khaleghzadeh

CoRR, 2020

2019

Power Profiling of Modern Die-Stacked Memory.

Dylan C. Stow

Sudhanva Gurumurthi

Michael Ignatowski

Yuan Xie

IEEE Comput. Archit. Lett., 2019

2018

Challenges of High-Capacity DRAM Stacks and Potential Directions.

Sudhanva Gurumurthi

Gabriel H. Loh

Michael Ignatowski

Proceedings of the Workshop on Memory Centric High Performance Computing, 2018

RegMutex: Inter-Warp GPU Register Time-Sharing.

Farzad Khorasani

Hodjat Asghari Esfeden

Nuwan Jayasena

Vivek Sarkar

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

2016

Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules.

Hadi Asghari Moghaddam

Jung Ho Ahn

IEEE Micro, 2016

Analytical Study on Bandwidth Efficiency of Heterogeneous Memory Systems.

David Roberts

Nuwan Jayasena

Proceedings of the Second International Symposium on Memory Systems, 2016

2015

DRAMA: An Architecture for Accelerated Processing Near Memory.

Jung Ho Ahn

IEEE Comput. Archit. Lett., 2015

NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules.

Jung Ho Ahn

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

NMI: A new memory interface to enable innovation.

David Roberts

Proceedings of the 2015 IEEE Hot Chips 27 Symposium (HCS), 2015

2014

Energy-efficient reconfigurable cache architectures for accelerator-enabled embedded systems.

Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Process variation-aware workload partitioning algorithms for GPUs supporting spatial-multitasking.

Paula Aguilera

Jungseob Lee

Michael J. Schulte

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2013

Modular Design of High-Throughput, Low-Latency Sorting Units.

Henry J. Duwe III

Michael J. Schulte

Katherine Compton

IEEE Trans. Computers, 2013

2011

Modular high-throughput and low-latency sorting units for FPGAs in the Large Hadron Collider.

Anthony E. Gregerson

Michael J. Schulte

Katherine Compton

Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011

2010

Parallel scalable hardware implementation of asynchronous discrete particle swarm optimization.

Eng. Appl. Artif. Intell., 2010

2009

FPGA Design Analysis of the Clustering Algorithm for the CERN Large Hadron Collider.

Anthony E. Gregerson

Proceedings of the FCCM 2009, 2009

2008

Scalable Architecture for on-Chip Neural Network Training using Swarm Intelligence.

Seid Mehdi Fakhraie

Saeed Safari

Proceedings of the Design, Automation and Test in Europe, 2008

2007

Simulation of Voice Processing Applications through VLIW DSP Architectures.

Naser Sedaghati-Mokhtari

Mahdi Nazm Bojnordi

Mahmoud Mousavinezhad

Seid Mehdi Fakhraie

Proceedings of the 14th IEEE International Conference on Electronics, 2007

SOPC-Based Architecture for Discrete Particle Swarm Optimization.

Seid Mehdi Fakhraie

Saeed Safari

Proceedings of the 14th IEEE International Conference on Electronics, 2007

HW/SW partitioning using discrete particle swarm.

Mehdi Kamal

Seid Mehdi Fakhraie

Saeed Safari

Proceedings of the 17th ACM Great Lakes Symposium on VLSI 2007, 2007

2006

Parallel-Genetic-Algorithm-Based HW/SW Partitioning.
