Richard Membarth

ORCID: 0000-0002-9979-7579

According to our database, Richard Membarth authored at least 39 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Temporal Coherence-Based Distributed Ray Tracing of Massive Scenes.
IEEE Trans. Vis. Comput. Graph., February, 2024

2023
XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments.
ACM Trans. Archit. Code Optim., March, 2023

AnyQ: An Evaluation Framework for Massively-Parallel Queue Algorithms.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2022
AnySeq/GPU: a novel approach for faster sequence alignment on GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021
tinyMD: Mapping molecular dynamics simulations to heterogeneous hardware using partial evaluation.
J. Comput. Sci., 2021

FLOWER: A comprehensive dataflow compiler for high-level synthesis.
Proceedings of the International Conference on Field-Programmable Technology, 2021

2020
AnyHLS: High-Level Synthesis With Partial Evaluation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

tinyMD: A Portable and Scalable Implementation for Pairwise Interactions Simulations.
CoRR, 2020

AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
Rodent: generating renderers without writing a generator.
ACM Trans. Graph., 2019

Efficient Mapping of Streaming Applications for Image Processing on Graphics Cards.
Trans. High Perform. Embed. Archit. Compil., 2019

Parallel Multi-Hypothesis Algorithm for Criticality Estimation in Traffic and Collision Avoidance.
Proceedings of the 2019 IEEE Intelligent Vehicles Symposium, 2019

2018
AnyDSL: a partial evaluation framework for programming high-performance libraries.
Proc. ACM Program. Lang., 2018

A Data Layout Transformation for Vectorizing Compilers.
Proceedings of the 4th Workshop on Programming Models for SIMD/Vector Processing, 2018

Unified Code Generation for the Parallel Computation of Pairwise Interactions Using Partial Evaluation.
Proceedings of the 17th International Symposium on Parallel and Distributed Computing, 2018

2017
Generating FPGA-based image processing accelerators with Hipacc: (Invited paper).
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

RaTrace: simple and efficient abstractions for BVH ray traversal algorithms.
Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, 2017

The Next Generation of In-home Streaming: Light Fields, 5K, 10 GbE, and Foveated Compression.
Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, 2017

2016
HIPAcc: A Domain-Specific Language and Compiler for Image Processing.
IEEE Trans. Parallel Distributed Syst., 2016

2015
Advanced In-Home Streaming To Mobile Devices and Wearables.
Int. J. Comput. Sci. Appl., 2015

Shallow embedding of DSLs via online partial evaluation.
Proceedings of the 2015 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, 2015

2014
Code Refinement of Stencil Codes.
Parallel Process. Lett., 2014

Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language.
J. Parallel Distributed Comput., 2014

Target-specific refinement of multigrid codes.
Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2014

Specialization through dynamic staging.
Proceedings of the Generative Programming: Concepts and Experiences, 2014

Code generation for embedded heterogeneous architectures on android.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Code generation from a domain-specific language for C-based HLS of hardware accelerators.
Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, 2014

2013
Code Generation for GPU Accelerators from a Domain-Specific Language for Medical Imaging.
PhD thesis, 2013

2012
Towards Domain-Specific Computing for Stencil Codes in HPC.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Automatic Optimization of In-Flight Memory Transactions for GPU Accelerators Based on a Domain-Specific Language for Medical Imaging.
Proceedings of the 11th International Symposium on Parallel and Distributed Computing, 2012

Generating Device-specific GPU Code for Local Operators in Medical Imaging.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Mastering Software Variant Explosion for GPU Accelerators.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging.
Proceedings of the Architecture of Computing Systems - ARCS 2012 - 25th International Conference, Munich, Germany, February 28, 2012

2011
Frameworks for GPU Accelerators: A comprehensive evaluation using 2D/3D image registration.
Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011

Detector defect correction of medical images on graphics processors.
Proceedings of the Medical Imaging 2011: Image Processing, 2011

Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration.
Proceedings of the Architecture of Computing Systems - ARCS 2011, 2011

2010
Generating GPU Code from a High-Level Representation for Image Processing Kernels.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2010

2009
Efficient Mapping of Multiresolution Image Filtering Algorithms on Graphics Processors.
Proceedings of the Embedded Computer Systems: Architectures, 2009

Acceleration of Multiresolution Imaging Algorithms: A Comparative Study.
Proceedings of the 20th IEEE International Conference on Application-Specific Systems, 2009

