Norman Rubin

According to our database, Norman Rubin authored at least 30 papers between 1978 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024

2021
Generating GPU Compiler Heuristics using Reinforcement Learning.
CoRR, 2021

2020
ArmorAll: Compiler-based Resilience Targeting GPU Applications.
ACM Trans. Archit. Code Optim., 2020

Griffin: Hardware-Software Support for Efficient Page Migration in Multi-GPU Systems.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2018
PRISM: predicting resilience of GPU applications using statistical methods.
Proceedings of the International Conference for High Performance Computing, 2018

Diesel: DSL for linear algebra and neural net computations on GPUs.
Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018

Airavat: Improving energy efficiency of heterogeneous applications.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017
Moka: Model-based concurrent kernel analysis.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

2016
LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
Dynamic thread block launch: a lightweight execution mechanism to support irregular applications on GPUs.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Revisiting ILP Designs for Throughput-Oriented GPGPU Architecture.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
Heterogeneous computing: what does it mean for compiler research?
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

A Case for a Flexible Scalar Unit in SIMT Architecture.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

ParallelJS: An Execution Framework for JavaScript on Heterogeneous Systems.
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

2013
Characterizing scalar opportunities in GPGPU applications.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Exploiting uniform vector instructions for GPGPU performance, energy efficiency, and opportunistic reliability enhancement.
Proceedings of the International Conference on Supercomputing, 2013

Accelerating simulation of agent-based models on heterogeneous architectures.
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013

2012
Enabling task-level scheduling on heterogeneous platforms.
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, 2012

Shared memory multiplexing: a novel way to improve GPGPU throughput.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Many-thread aware instruction-level parallelism: architecting shader cores for GPU computing.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Analyzing program flow within a many-kernel OpenCL application.
Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

A new method for GPU based irregular reductions and its application to k-means clustering.
Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

2010
ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2010

2008
Issues and challenges in compiling for graphics processors.
Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

GPU evolution: will graphics morph into compute?
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

1998
FX!32: a profile-directed binary translator.
IEEE Micro, 1998

1997
Efficient instruction scheduling using finite state automata.
Int. J. Parallel Program., 1997

1993
Data Flow Computing and the Conjugate Gradient Method.
Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, 1993

1978
Another Generalization of Resolution.
J. ACM, 1978

