Adwait Jog

Dataset, March, 2025

Dissecting Performance Overheads of Confidential Computing on GPU-based Systems.

Yang Yang

Mohammad Sonji

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2025

TrioSim: A Lightweight Simulator for Large-Scale DNN Workloads on Multi-GPU Systems.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

NetCrafter: Tailoring Network Traffic for Non-Uniform Bandwidth Multi-GPU Systems.

Amel Fatima

Yang Yang

Yifan Sun

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

2024

Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs.

Rishabh Jain

Vivek M. Bhasi

Anand Sivasubramaniam

Mahmut T. Kandemir

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Aspis: Lightweight Neural Network Protection Against Soft Errors.

Proceedings of the 35th IEEE International Symposium on Software Reliability Engineering, 2024

Probing Weaknesses in GPU Reliability Assessment: A Cross-Layer Approach.

Lishan Yang

George Papadimitriou

Dimitrios Sartzetakis

Evgenia Smirni

Dimitris Gizopoulos

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

GPU Reliability Assessment: Insights Across the Abstraction Layers.

Proceedings of the IEEE International Conference on Cluster Computing, 2024

2023

Asynchronous Automata Processing on GPUs.

Sreepathi Pai

Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2023

Path Forward Beyond Simulators: Fast and Accurate GPU Execution Time Prediction for DNN Workloads.

Ying Li

Yifan Sun

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs.

Ying Li

Yifan Sun

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Optimizing CPU Performance for Recommendation Systems At-Scale.

Anand Sivasubramaniam

Mahmut Taylan Kandemir

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022

Improving GPU Throughput through Parallel Execution Using Tensor Cores and CUDA Cores.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

2021

Practical Resilience Analysis of GPGPU Applications in the Presence of Single- and Multi-Bit Faults.

IEEE Trans. Computers, 2021

SUGAR: Speeding Up GPGPU Application Resilience Estimation with Input Sizing.

Proceedings of the SIGMETRICS '21: ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2021

Enabling Software Resilience in GPGPU Applications via Partial Thread Protection.

Proceedings of the 43rd IEEE/ACM International Conference on Software Engineering, 2021

Analyzing and Leveraging Decoupled L1 Caches in GPUs.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Data-centric Reliability Management in GPUs.

Gurunath Kadam

Evgenia Smirni

Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021

Accelerating DNN Architecture Search at Scale Using Selective Weight Transfer.

Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020

BCoal: Bucketing-Based Memory Coalescing for Efficient and Secure GPUs.

Gurunath Kadam

Danfeng Zhang

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Characterizing Accuracy-Aware Resilience of GPGPU Applications.

Bin Nie

Evgenia Smirni

Proceedings of the 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, 2020

Why GPUs are Slow at Executing NFAs and How to Make them Faster.

Sreepathi Pai

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Analyzing and Leveraging Shared L1 Caches in GPUs.

Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019

Quantifying Data Locality in Dynamic Parallelism in GPUs.

Mahmut Taylan Kandemir

Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems, 2019

Opportunistic computing in GPU architectures.

Anand Sivasubramaniam

Proceedings of the 46th International Symposium on Computer Architecture, 2019

Address-stride assisted approximate load value prediction in GPUs.

Haonan Wang

Sparsh Mittal

Proceedings of the ACM International Conference on Supercomputing, 2019

Exploiting Latency and Error Tolerance of GPGPU Applications for an Energy-Efficient DRAM.

Haonan Wang

Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

Analyzing and Leveraging Remote-Core Bandwidth for Enhanced Performance in GPUs.

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Decoupling GPU Programming Models from Resource Management for Enhanced Programming Ease, Portability, and Performance.

CoRR, 2018

Fault Site Pruning for Practical Reliability Analysis of GPGPU Applications.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Architectural Support for Efficient Large-Scale Automata Processing.

Sreepathi Pai

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Efficient and Fair Multi-programming in GPUs via Effective Bandwidth Management.

Haonan Wang

Fan Luo

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

RCoal: Mitigating GPU Timing Attack via Subwarp-Based Randomized Coalescing Techniques.

Gurunath Kadam

Danfeng Zhang

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

MASK: Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency.

Christopher J. Rossbach

Onur Mutlu

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017

Improving Multi-Application Concurrency Support Within the GPU Memory System.

Christopher J. Rossbach

CoRR, 2017

Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File.

Proceedings of the 30th International Conference on VLSI Design and 16th International Conference on Embedded Systems, 2017

Architecting SOT-RAM Based GPU Register File.

Mehdi Baradaran Tahoori

Jeffrey S. Vetter

Proceedings of the 2017 IEEE Computer Society Annual Symposium on VLSI, 2017

Controlled Kernel Launch for Dynamic Parallelism in GPUs.

Mahmut T. Kandemir

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

2016

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps.

CoRR, 2016

Exploiting Core Criticality for Enhanced GPU Performance.

Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2016

Zorua: A holistic approach to resource virtualization in GPUs.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Characterization of quantum workloads on SIMD architectures.

Robert Risque

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation Techniques, 2016

μC-States: Fine-grained GPU Datapath Power Management.

Ashutosh Pattnaik

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation Techniques, 2016

2015

Anatomy of GPU Memory System for Multi-Application Execution.

Proceedings of the 2015 International Symposium on Memory Systems, 2015

A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps.

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014

Managing GPU Concurrency in Heterogeneous Architectures.

Nachiappan Chidambaram Nachiappan

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications.

Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

Trading cache hit rate for memory performance.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2014

2013

Orchestrated scheduling and prefetching for GPGPUs.

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.

Nachiappan Chidambaram Nachiappan

Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2013

Neither more nor less: Optimizing thread-level parallelism for GPGPUs.

Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012

Cache revive: architecting volatile STT-RAM caches for enhanced performance in CMPs.

Vijaykrishnan Narayanan

Ravishankar R. Iyer