Vijay Nagarajan

ORCID: 0009-0000-5045-4754

Affiliations:
  • University of Utah, Salt Lake City, UT, USA
  • University of Edinburgh
  • University of California, Riverside, CA, USA (Ph.D., 2009)


According to our database, Vijay Nagarajan authored at least 74 papers between 2004 and 2025.

Bibliography

2025
The LAW theorem: Local Reads and Linearizable Asynchronous Replication.
Proc. VLDB Endow., May, 2025

Stop Taking the Scenic Route: the Shortest Distance Between the CPU and the NIC is MMIO.
Proceedings of the 2025 Workshop on Hot Topics in Operating Systems, 2025

Fast, Highly Available, and Recoverable Transactions on Disaggregated Data Stores.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

2024
Determining the Minimum Number of Virtual Networks for Different Coherence Protocols.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

PipeGen: Automated Transformation of a Single-Core Pipeline into a Multicore Pipeline for a Given Memory Consistency Model.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023
Compound Memory Models.
Proc. ACM Program. Lang., 2023

HeteroGen: Automatic Synthesis of Heterogeneous Cache Coherence Protocols.
IEEE Micro, 2023

Āpta: Fault-tolerant object-granular CXL disaggregated memory for accelerating FaaS.
Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023

2021
Extending Classic Paxos for High-performance Read-Modify-Write Registers.
CoRR, 2021

Avocado: A Secure In-Memory Distributed Storage System.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

Dvé: Improving DRAM Reliability and Performance On-Demand via Coherent Replication.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Towards the Synthesis of Coherence/Replication Protocols from Consistency Models via Real-Time Orderings.
Proceedings of the PaPoC@EuroSys 2021, 2021

Odyssey: the impact of modern hardware on strongly-consistent replication protocols.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021

2020
A Primer on Memory Consistency and Cache Coherence, Second Edition
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01764-3, 2020

Kite: efficient and available release consistency for the datacenter.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

HieraGen: Automated Generation of Concurrent, Hierarchical Cache Coherence Protocols.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Hermes: A Fast, Fault-Tolerant and Linearizable Replication Protocol.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Lazy Release Persistency.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
Poise: Balancing Thread-Level Parallelism and Memory System Performance in GPUs Using Machine Learning.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018
Solving the task variant allocation problem in distributed robotics.
Auton. Robots, 2018

ProtoGen: Automatically Generating Directory Cache Coherence Protocols from Atomic Specifications.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

DHTM: Durable Hardware Transactional Memory.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Automatic Parameter Tuning of Motion Planning Algorithms.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Scale-out ccNUMA: exploiting skew with strongly consistent caching.
Proceedings of the Thirteenth EuroSys Conference, 2018

VerC3: A library for explicit state synthesis of concurrent systems.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Blasting through the Front-End Bottleneck with Shotgun.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Evaluating and mitigating bandwidth bottlenecks across the memory hierarchy in GPUs.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

ATOM: Atomic Durability in Non-volatile Memory through Hardware Logging.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Boomerang: A Metadata-Free Architecture for Control Flow Delivery.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Verification of a lazy cache coherence protocol against a weak memory model.
Proceedings of the 2017 Formal Methods in Computer Aided Design, 2017

2016
Fence Placement for Legacy Data-Race-Free Programs via Synchronization Read Detection.
ACM Trans. Archit. Code Optim., 2016

Cooperative Caching for GPUs.
ACM Trans. Archit. Code Optim., 2016

DCA: a DRAM-cache-aware DRAM controller.
Proceedings of the International Conference for High Performance Computing, 2016

Task Variant Allocation in Distributed Robotics.
Proceedings of the Robotics: Science and Systems XII, University of Michigan, Ann Arbor, Michigan, USA, June 18, 2016

C³D: Mitigating the NUMA bottleneck via coherent DRAM caches.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Automatic configuration of ROS applications for near-optimal performance.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Characterizing memory bottlenecks in GPGPU workloads.
Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

McVerSi: A test generation framework for fast memory consistency verification in simulation.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Understanding the Effects of Data Corruption on Application Behavior Based on Data Characteristics.
Proceedings of the Computer Safety, Reliability, and Security, 2015

Efficient persist barriers for multicores.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Dynamic process migration in heterogeneous ROS-based environments.
Proceedings of the International Conference on Advanced Robotics, 2015

RC3: Consistency Directed Cache Coherence for x86-64 with RC Extensions.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
Erratum: A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2014

Fence Scoping.
Proceedings of the International Conference for High Performance Computing, 2014

Static Approximation of MPI Communication Graphs for Optimized Process Placement.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

Increasing cache capacity via critical-words-only cache.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

TSO-CC: Consistency directed cache coherence for TSO.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

ATCache: reducing DRAM cache latency via a small SRAM tag cache.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
Fast RMWs for TSO: semantics and implementation.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013

Address-aware fences.
Proceedings of the International Conference on Supercomputing, 2013

2012
Erratum: A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2012

A system for debugging via online tracing and dynamic slicing.
Softw. Pract. Exp., 2012

Efficient Sequential Consistency Using Conditional Fences.
Int. J. Parallel Program., 2012

SuperCoP: a general, correct, and performance-efficient supervised memory system.
Proceedings of the Computing Frontiers Conference, CF'12, 2012

Efficient sequential consistency via conflict ordering.
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

2010
Execution suppression: An automated iterative technique for locating memory errors.
ACM Trans. Program. Lang. Syst., 2010

2009
IMPRESS: Improving Multicore Performance and Reliability via Efficient Software Support for Monitoring.
PhD thesis, 2009

Compiler-Assisted Memory Encryption for Embedded Processors.
Trans. High Perform. Embed. Archit. Compil., 2009

Automated dynamic detection of busy-wait synchronizations.
Softw. Pract. Exp., 2009

Runtime monitoring on multicores via OASES.
ACM SIGOPS Oper. Syst. Rev., 2009

Speculative Parallelization of Sequential Loops on Multicores.
Int. J. Parallel Program., 2009

Architectural support for shadow memory in multiprocessors.
Proceedings of the 5th International Conference on Virtual Execution Environments, 2009

Speculative Optimizations for Parallel Programs on Multicores.
Proceedings of the Languages and Compilers for Parallel Computing, 2009

Self-recovery in server programs.
Proceedings of the 8th International Symposium on Memory Management, 2009

ECMon: exposing cache events for monitoring.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

2008
Copy or Discard execution model for speculative parallelization on multicores.
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Dynamic recognition of synchronization operations for improved data race detection.
Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis, 2008

Support for symmetric shadow memory in multiprocessors.
Proceedings of the 6th Workshop on Parallel and Distributed Systems: Testing, 2008

Scalable dynamic information flow tracking and its applications.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
High-throughput VLSI Implementations of Iterative Decoders and Related Code Construction Problems.
J. VLSI Signal Process., 2007

ONTRAC: A system for efficient ONline TRACing for debugging.
Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

Matching Control Flow of Program Versions.
Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

2004
The effect of channel side information at transmitter on coding complexity.
Proceedings of the 2004 IEEE International Symposium on Information Theory, 2004

High-throughput VLSI implementations of iterative decoders and related code construction problems.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004
