Mike O'Connor

ORCID: 0000-0003-0944-2393

According to our database, Mike O'Connor authored at least 49 papers between 1997 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing.
ACM Trans. Comput. Syst., 2023

2022
Characterizing and Mitigating Soft Errors in GPU DRAM.
IEEE Micro, 2022

Saving PAM4 Bus Energy with SMOREs: Sparse Multi-level Opportunistic Restricted Encodings.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

2020
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
Near-memory data transformation for efficient sparse matrix multi-vector multiplication.
Proceedings of the International Conference for High Performance Computing, 2019

DeLTA: GPU Performance Model for Deep Learning Applications with In-Depth Memory System Traffic Analysis.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

2018
What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study.
Proc. ACM Meas. Anal. Comput. Syst., 2018

Voltron: Understanding and Exploiting the Voltage-Latency-Reliability Trade-Offs in Modern DRAM Chips to Improve Energy Efficiency.
CoRR, 2018

Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Reducing Data Transfer Energy by Exploiting Similarity within a Data Transaction.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017
Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms.
Proc. ACM Meas. Anal. Comput. Syst., 2017

Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks.
CoRR, 2017

Understanding Reduced-Voltage Operation in Modern DRAM Chips: Characterization, Analysis, and Mechanisms.
CoRR, 2017

Toward standardized near-data processing with unrestricted data placement for GPUs.
Proceedings of the International Conference for High Performance Computing, 2017

Fine-grained DRAM: energy-efficient DRAM for extreme bandwidth systems.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Architecting an Energy-Efficient DRAM System for GPUs.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

2016
Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism.
CoRR, 2016

CLARA: Circular Linked-List Auto and Self Refresh Architecture.
Proceedings of the Second International Symposium on Memory Systems, 2016

Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
Designing Efficient Heterogeneous Memory Architectures.
IEEE Micro, 2015

Toggle-Aware Compression for GPUs.
IEEE Comput. Archit. Lett., 2015

Flexible software profiling of GPU architectures.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

A variable warp size architecture.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Priority-based cache allocation in throughput processors.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Unlocking bandwidth for GPUs in CC-NUMA systems.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

MemcachedGPU: scaling-up scale-out key-value stores.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

Page Placement Strategies for GPUs within Heterogeneous Memory Systems.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

Exploiting asymmetry in Booth-encoded multipliers for reduced energy multiplication.
Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, 2015

2014
Cache Coherence for GPU Architectures.
IEEE Micro, 2014

A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches.
IEEE Micro, 2014

Learning your limit: managing massively multithreaded caches through scheduling.
Commun. ACM, 2014

Scaling the Power Wall: A Path to Exascale.
Proceedings of the International Conference for High Performance Computing, 2014

Managing DRAM Latency Divergence in Irregular GPGPU Applications.
Proceedings of the International Conference for High Performance Computing, 2014

A scalable multi-path microarchitecture for efficient GPU control flow.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
Cache-Conscious Thread Scheduling for Massively Multithreaded Processors.
IEEE Micro, 2013

Divergence-aware warp scheduling.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Resilient die-stacked DRAM caches.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

GPUDet: a deterministic GPU architecture.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

Cache-Conscious Wavefront Scheduling.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

Characterizing and evaluating a key-value store application on heterogeneous CPU-GPU systems.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Accelerated processing and the Fusion System Architecture.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

2006
Network Communication as a Service-Oriented Capability.
Proceedings of the High Performance Computing and Grids in Action, 2006

2001
The iFlow Address Processor.
IEEE Micro, 2001

1999
Performance analysis and validation of the picoJava processor.
IEEE Micro, 1999

1998
PicoJava: A Direct Execution Engine For Java Bytecode.
Computer, 1998

Tracking Web Usage with Network Flight Recorder.
Proceedings of WebNet 98, 1998

1997
Open Standard Content Cookies: Utility vs. Privacy.
Proceedings of WebNet 97, 1997

