Juan Gómez-Luna

CoRR, 2024

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems.

CoRR, 2024

BIMSA: accelerating long sequence alignment using processing-in-memory.

Alejandro Alonso-Marín

Bioinform., 2024

MATSA: An MRAM-Based Energy-Efficient Accelerator for Time Series Analysis.

IEEE Access, 2024

SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing.

Geraldo F. Oliveira

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis.

Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

Read Disturbance in High Bandwidth Memory: A Detailed Experimental Study on HBM2 DRAM Chips.

Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System.

Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023

GVLE: a highly optimized GPU-based implementation of variable-length encoding.

Rafael Medina Carnicer

J. Supercomput., May, 2023

Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs.

Bioinform., May, 2023

A framework for high-throughput sequence alignment using real processing-in-memory systems.

Bioinform., May, 2023

PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM.

ACM Trans. Archit. Code Optim., March, 2023

ALP: Alleviating CPU-Memory Data Movement Overheads in Memory-Centric Systems.

Nastaran Hajinazar

IEEE Trans. Emerg. Top. Comput., 2023

PULSAR: Simultaneous Many-Row Activation for Reliable and High-Performance Computing in Off-the-Shelf DRAM Chips.

Ismail Emir Yuksel

Yahya Can Tugrul

F. Nisa Bostanci

CoRR, 2023

Understanding Read Disturbance in High Bandwidth Memory: An Experimental Analysis of Real HBM2 DRAM Chips.

Majd Osseiran

CoRR, 2023

DaPPA: A Data-Parallel Framework for Processing-in-Memory Architectures.

CoRR, 2023

TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems.

CoRR, 2023

Extending Memory Capacity in Modern Consumer Systems With Emerging Non-Volatile Memory: Experimental Analysis and Characterization Using the Intel Optane SSD.

IEEE Access, 2023

Casper: Accelerating Stencil Computations Using Near-Cache Processing.

IEEE Access, 2023

High-Performance and Scalable Agent-Based Simulation with BioDynaMo.

Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

TransPimLib: Efficient Transcendental Functions for Processing-in-Memory Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Evaluating Machine Learning Workloads on Memory-Centric Computing Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Evaluating Homomorphic Operations on a Real-World Processing-In-Memory System.

Harshita Gupta

Mayank Kabra

Proceedings of the IEEE International Symposium on Workload Characterization, 2023

SPARTA: Spatial Acceleration for Efficient and Scalable Horizontal Diffusion Weather Stencil Computation.

Proceedings of the 37th International Conference on Supercomputing, 2023

An Experimental Analysis of RowHammer in HBM2 DRAM Chips.

Majd Osseiran

Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory.

Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022

pLUTo: Enabling Massively Parallel Computation In DRAM via Lookup Tables.

Dataset, July, 2022

Accelerating Weather Prediction Using Near-Memory Reconfigurable Fabric.

ACM Trans. Reconfigurable Technol. Syst., 2022

CAVLCU: an efficient GPU-based implementation of CAVLC.

Rafael Medina Carnicer

J. Supercomput., 2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proc. ACM Meas. Anal. Comput. Syst., 2022

Accelerating Neural Network Inference With Processing-in-DRAM: From the Edge to the Cloud.

IEEE Micro, 2022

GUD-Canny: a real-time GPU-based unsupervised and distributed Canny edge detector.

Rafael Medina Carnicer

J. Real Time Image Process., 2022

Accelerating Time Series Analysis via Processing using Non-Volatile Memories.

CoRR, 2022

RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory.

Nika Mansouri-Ghiasi

Mohammad Sadrosadati

Geraldo F. Oliveira

CoRR, 2022

LEAPER: Modeling Cloud FPGA-based Systems via Transfer Learning.

CoRR, 2022

An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System.

CoRR, 2022

Going From Molecules to Genomic Variations to Scientific Discovery: Intelligent Algorithms and Architectures for Intelligent Genome Analysis.

CoRR, 2022

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.

CoRR, 2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.

CoRR, 2022

Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System.

IEEE Access, 2022

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proceedings of the SIGMETRICS/PERFORMANCE '22: ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, Mumbai, India, June 6, 2022

pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Flash-Cosmos: In-Flash Bulk Bitwise Operations Using Inherent Computation Capability of NAND Flash Memory.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Machine Learning Training on a Real Processing-in-Memory System.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

SparseP: Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Exploiting Near-Data Processing to Accelerate Time Series Analysis.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping.

Damla Senol Cali

Nour Almadhoun Alserr

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Algorithmic Improvement and GPU Acceleration of the GenASM Algorithm.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

High-throughput Pairwise Alignment with the Wavefront Algorithm using Processing-in-Memory.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

LEAPER: Fast and Accurate FPGA-based System Performance Prediction via Transfer Learning.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

A Compiler Framework for Optimizing Dynamic Parallelism on GPUs.

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

2021

FPGA-Based Near-Memory Acceleration of Modern Data-Intensive Applications.

Mohammed Alser

Damla Senol Cali

Henk Corporaal

IEEE Micro, 2021

Casper: Accelerating Stencil Computation using Near-cache Processing.

CoRR, 2021

Extending Memory Capacity in Consumer Devices with Emerging Non-Volatile Memory: An Experimental Study.

CoRR, 2021

NERO: Accelerating Weather Prediction using Near-Memory Reconfigurable Fabric.

CoRR, 2021

SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM.

CoRR, 2021

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture.

CoRR, 2021

pLUTo: In-DRAM Lookup Tables to Enable Massively Parallel General-Purpose Computation.

CoRR, 2021

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.

Maciej Besta

Raghavendra Kanakagiri

Grzegorz Kwasniewski

Jakub Beránek

Kacper Janda

Zur Vonarburg-Shmaria

Salvatore Di Girolamo

Marek Konieczny

Torsten Hoefler

CoRR, 2021

BurstLink: Techniques for Energy-Efficient Conventional and Virtual Reality Video Display.

CoRR, 2021

SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs.

Bioinform., 2021

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks.

IEEE Access, 2021

BurstLink: Techniques for Energy-Efficient Video Display for Conventional and Virtual Reality Systems.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.

Maciej Besta

Raghavendra Kanakagiri

Grzegorz Kwasniewski

Jakub Beránek

Kacper Janda

Zur Vonarburg-Shmaria

Salvatore Di Girolamo

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors.

Mohammed Alser

Ivan Puddu

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware.

Proceedings of the 12th International Green and Sustainable Computing Workshops, 2021

Modeling FPGA-Based Systems via Few-Shot Learning.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

SIMDRAM: a framework for bit-serial SIMD processing using DRAM.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020

A Modern Primer on Processing in Memory.

Saugata Ghose

CoRR, 2020

Accelerating B-spline interpolation on GPUs: Application to medical image registration.

Comput. Methods Programs Biomed., 2020

Fast parallel vessel segmentation.

Comput. Methods Programs Biomed., 2020

GPU acceleration of liver enhancement for tumor segmentation.

Comput. Methods Programs Biomed., 2020

Accelerating sparse matrix-matrix multiplication with GPU Tensor Cores.

Comput. Electr. Eng., 2020

Accelerating Chan-Vese model with cross-modality guided contrast enhancement for liver segmentation.

Nitin Satpute

Joaquín Olivares

Comput. Biol. Medicine, 2020

FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

NATSA: A Near-Data Processing Accelerator for Time Series Analysis.

Proceedings of the 38th IEEE International Conference on Computer Design, 2020

NERO: A Near High-Bandwidth Memory Stencil Accelerator for Weather Prediction Modeling.

Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Boyi: A Systematic Framework for Automatically Deciding the Right Execution Model of OpenCL Applications on FPGAs.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

2019

Processing data where it makes sense: Enabling in-memory computation.

Saugata Ghose

Microprocess. Microsystems, 2019

Processing-in-memory: A workload-driven perspective.

IBM J. Res. Dev., 2019

A Workload and Programming Ease Driven Perspective of Processing-in-Memory.

CoRR, 2019

Dataplant: In-DRAM Security Mechanisms for Low-Cost Devices.

CoRR, 2019

Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.

Sitao Huang

Li-Wen Chang

Izzat El Hajj

Simon Garcia De Gonzalo

Sai Rahul Chalamalasetti

Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

NAPEL: Near-Memory Computing Application Performance Prediction via Ensemble Learning.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Enabling Practical Processing in and near Memory for Data-Intensive Computing.

Saugata Ghose

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.

Simon Garcia De Gonzalo

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018

High-throughput Ant Colony Optimization on graphics processing units.

J. Parallel Distributed Comput., 2018

High-Performance Computation of Bézier Surfaces on Parallel and Heterogeneous Platforms.

Rafael Palomar

Faouzi Alaya Cheikh

Joaquín Olivares Bueno

Ole Jakob Elle

Int. J. Parallel Program., 2018

Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions.

CoRR, 2018

Improving tasks throughput on accelerators using OpenCL command concurrency.

A. J. Lázaro-Muñoz

CoRR, 2018

FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices.

Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

2017

A tasks reordering model to reduce transfers overhead on GPUs.

A. J. Lázaro-Muñoz

J. Parallel Distributed Comput., 2017

Collaborative Computing for Heterogeneous Integrated Systems.

Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, 2017

Chai: Collaborative heterogeneous applications for integrated-architectures.

Simon Garcia De Gonzalo

Thomas B. Jablin

Antonio J. Peña

Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

Efficient OpenCL-based concurrent tasks offloading on accelerators.

A. J. Lázaro-Muñoz

Proceedings of the International Conference on Computational Science, 2017

2016

In-Place Matrix Transposition on GPUs.

I-Jui Sung

Li-Wen Chang

IEEE Trans. Parallel Distributed Syst., 2016

Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs.

Gert-Jan van den Braak

Henk Corporaal

IEEE Trans. Computers, 2016

A programming system for future proofing performance critical libraries.

Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Efficient kernel synthesis for performance portable programming.

Li-Wen Chang

Izzat El Hajj

Christopher I. Rodrigues

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Evaluating the effect of last-level cache sharing on integrated GPU-CPU systems with heterogeneous applications.

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

2015

Calculation of dense trajectory descriptors on a heterogeneous embedded architecture.

Julián Ramos Cózar

Manuel J. Marín-Jiménez

J. Syst. Archit., 2015

In-Place Data Sliding Algorithms for Many-Core Architectures.

Proceedings of the 44th International Conference on Parallel Processing, 2015

2014

In-place transposition of rectangular matrices on accelerators.

I-Jui Sung

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Low-textured regions detection for improving stereoscopy algorithms.

Salvador Ibarra-Delgado

Julián Ramos Cózar

Proceedings of the International Conference on High Performance Computing & Simulation, 2014

CUVLE: Variable-length encoding on CUDA.

Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing, 2014

2013

Performance Modeling of Atomic Additions on GPU Scratchpad Memory.

José Ignacio Benavides Benítez

IEEE Trans. Parallel Distributed Syst., 2013

An optimized approach to histogram computation on GPU.

Mach. Vis. Appl., 2013

A robust and low resource FPGA-based stereoscopic vision algorithm.

Salvador Ibarra-Delgado

Manuel Hernandez Calviño

Proceedings of the 2012 International Conference on Reconfigurable Computing and FPGAs, 2013

Simulation and architecture improvements of atomic operations on GPU scratchpad memory.

Gert-Jan van den Braak

Henk Corporaal

Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

2012

Performance models for asynchronous data transfers on consumer Graphics Processing Units.

J. Parallel Distributed Comput., 2012

2011

Load Balancing versus Occupancy Maximization on Graphics Processing Units: The Generalized Hough Transform as a Case Study.

Emilio L. Zapata

Int. J. High Perform. Comput. Appl., 2011

simARQ, An Automatic Repeat Request Simulator for Teaching Purposes.

Carlos García-García

José Ignacio Benavides Benítez

Ezequiel Herruzo Gomez

Proceedings of the IT Revolutions, 2011

Egomotion compensation and moving objects detection algorithm on GPU.

Holger Endt

Walter Stechele

Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

2010

Parallelizing and Optimizing LIP-Canny Using NVIDIA CUDA.

Proceedings of the Trends in Applied Intelligent Systems, 2010

2009

MESI Cache Coherence Simulator for Teaching Purposes.

Ezequiel Herruzo

José Ignacio Benavides Benítez

CLEI Electron. J., 2009

FPGA Implementation of the Generalized Hough Transform.

Sergio Ruben Geninatti

Manuel Hernandez Calviño

Proceedings of the ReConFig'09: 2009 International Conference on Reconfigurable Computing and FPGAs, 2009

Parallelization of a Video Segmentation Algorithm on CUDA-Enabled Graphics Processing Units.

José I. Benavides