Dominik Grewe

Affiliations:
  • University of Edinburgh, UK


According to our database, Dominik Grewe authored at least 17 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
PartIR: Composing SPMD Partitioning Strategies for Machine Learning.
CoRR, 2024

2022
Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR.
CoRR, 2022

Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning.
Proceedings of Machine Learning and Systems 2022, 2022

2021
Automap: Towards Ergonomic Automated Parallelism for ML Models.
CoRR, 2021

2019
TF-Replicator: Distributed Machine Learning for Researchers.
CoRR, 2019

2018

2016
Mastering the game of Go with deep neural networks and tree search.
Nat., 2016

2014
Mapping parallel programs to heterogeneous multi-core systems.
PhD thesis, 2014

Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems.
ACM Trans. Archit. Code Optim., 2014

NOVA: A Functional Language for Data Parallelism.
ARRAY'14: Proceedings of the 2014 ACM SIGPLAN International Workshop on Libraries, 2014

2013
OpenCL Task Partitioning in the Presence of GPU Contention.
Proceedings of the Languages and Compilers for Parallel Computing, 2013

Portable mapping of data parallel programs to OpenCL for heterogeneous systems.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

Prius: a runtime for hybrid computing.
Proceedings of the First International Workshop on Code Optimisation for Multi and Many Cores, 2013

Input-aware auto-tuning for directive-based GPU programming.
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013

2011
A workload-aware mapping approach for data-parallel programs.
Proceedings of the High Performance Embedded Architectures and Compilers, 2011

A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL.
Proceedings of the Compiler Construction - 20th International Conference, 2011

Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation.
Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

