Mark Stephenson

Orcid: 0000-0002-1350-0165

Affiliations:
  • NVIDIA, Austin, TX, USA
  • IBM Research, Austin, TX, USA
  • Massachusetts Institute of Technology, Cambridge, MA, USA (PhD)


According to our database, Mark Stephenson authored at least 29 papers between 2000 and 2023.

Bibliography

2023
cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications.
Proc. ACM Program. Lang., 2023

2022
GPU Subwarp Interleaving.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
Cooperative Profile Guided Optimizations.
Comput. Graph. Forum, 2021

PGZ: automatic zero-value code specialization.
Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021

2020
Zeroploit: Exploiting Zero Valued Operands in Interactive Gaming Applications.
ACM Trans. Archit. Code Optim., 2020

AZP: Automatic Specialization for Zero Values in Gaming Applications.
CoRR, 2020

Estimating Silent Data Corruption Rates Using a Two-Level Model.
CoRR, 2020

Speculative reconvergence for improved SIMT efficiency.
Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

2019
Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs.
ACM Trans. Archit. Code Optim., 2019

NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

2018
Software-Directed Techniques for Improved GPU Register File Utilization.
ACM Trans. Archit. Code Optim., 2018

2017
SASSIFI: An architecture-level fault injection tool for GPU application resilience evaluation.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

2016
Towards high performance paged memory for GPUs.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Flexible software profiling of GPU architectures.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Page Placement Strategies for GPUs within Heterogeneous Memory Systems.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

2013
A study of application-level recovery methods for transient network faults.
Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2013

The power 775 architecture at scale.
Proceedings of the International Conference on Supercomputing, 2013

2010
Statistically regulating program behavior via mainstream computing.
Proceedings of the CGO 2010, 2010

2009
Lightweight predication support for out of order processors.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

2007
Characterizing and Improving the Performance of Bioinformatics Workloads on the POWER5 Architecture.
Proceedings of the IEEE 10th International Symposium on Workload Characterization, 2007

2006
Automating the construction of compiler heuristics using machine learning.
PhD thesis, 2006

2005
Predicting Unroll Factors Using Supervised Classification.
Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

2004
Convergent Scheduling.
J. Instr. Level Parallelism, 2004

2003
Meta optimization: improving compiler heuristics with machine learning.
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, 2003

Adapting Convergent Scheduling Using Machine-Learning.
Proceedings of the Languages and Compilers for Parallel Computing, 2003

Genetic Programming Applied to Compiler Heuristic Optimization.
Proceedings of the Genetic Programming, 6th European Conference, EuroGP 2003, 2003

2000
Bitwidth analysis with application to silicon compilation.
Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2000

