Norman Rubin

CoRR, 2021

2020

ArmorAll: Compiler-based Resilience Targeting GPU Applications.

ACM Trans. Archit. Code Optim., 2020

Griffin: Hardware-Software Support for Efficient Page Migration in Multi-GPU Systems.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance.

Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2018

PRISM: predicting resilience of GPU applications using statistical methods.

Proceedings of the International Conference for High Performance Computing, 2018

Diesel: DSL for linear algebra and neural net computations on GPUs.

Venmugil Elango

Hariharan Sandanagobalane

Mahesh Ravishankar

Vinod Grover

Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018

Airavat: Improving energy efficiency of heterogeneous applications.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017

Moka: Model-based concurrent kernel analysis.

Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

2016

LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs.

Albert Sidelnik

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015

Dynamic thread block launch: a lightweight execution mechanism to support irregular applications on GPUs.

Albert Sidelnik

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Revisiting ILP Designs for Throughput-Oriented GPGPU Architecture.

Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014

Heterogeneous computing: what does it mean for compiler research?

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

A Case for a Flexible Scalar Unit in SIMT Architecture.

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

ParallelJS: An Execution Framework for JavaScript on Heterogeneous Systems.

Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

2013

Characterizing scalar opportunities in GPGPU applications.

Zhongliang Chen

David R. Kaeli

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Exploiting uniform vector instructions for GPGPU performance, energy efficiency, and opportunistic reliability enhancement.

Proceedings of the International Conference on Supercomputing, 2013

Accelerating simulation of agent-based models on heterogeneous architectures.

Haicheng Wu

Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013

2012

Enabling task-level scheduling on heterogeneous platforms.

Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, 2012

Shared memory multiplexing: a novel way to improve GPGPU throughput.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Many-thread aware instruction-level parallelism: architecting shader cores for GPU computing.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Analyzing program flow within a many-kernel OpenCL application.

Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

A new method for GPU based irregular reductions and its application to k-means clustering.

Balaji Dhanasekaran

Proceedings of 4th Workshop on General Purpose Processing on Graphics Processing Units, 2011

2010

ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs.

Budirijanto Purnomo

Michael Houston

Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2010

2008

Issues and challenges in compiling for graphics processors.

Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

GPU evolution: will graphics morph into compute?

Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

1998

FX!32: a profile-directed binary translator.

S. Bharadwaj Yadavalli

John Yates

IEEE Micro, 1998

1995

Efficient instruction scheduling using finite state automata.

Vasanth Bala

Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29, 1995

1993

Data Flow Computing and the Conjugate Gradient Method.

Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, 1993

1978

Another Generalization of Resolution.

Malcolm C. Harrison