V. Krishna Nandivada

Orcid: 0000-0002-5949-0046

According to our database, V. Krishna Nandivada authored at least 47 papers between 2003 and 2024.

Bibliography

2024
COWS for High Performance: Cost Aware Work Stealing for Irregular Parallel Loop.
ACM Trans. Archit. Code Optim., March, 2024

2023
UWOmppro: UWOmp++ with Point-to-Point Synchronization, Reduction and Schedules.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2021
Homeostasis: Design and Implementation of a Self-Stabilizing Compiler.
CoRR, 2021

2020
Optimizing Remote Communication in X10.
ACM Trans. Archit. Code Optim., 2020

DisGCo: A Compiler for Distributed Graph Analytics.
ACM Trans. Archit. Code Optim., 2020

On the fly MHP analysis.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

A Study of Graph Analytics for Massive Datasets on Distributed Multi-GPUs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Chunking loops with non-uniform workloads.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity.
Proceedings of the CC '20: 29th International Conference on Compiler Construction, 2020

2019
PYE: A Framework for Precise-Yet-Efficient Just-In-Time Analyses for Java Programs.
ACM Trans. Program. Lang. Syst., 2019

Efficient lock-step synchronization in task-parallel languages.
Softw. Pract. Exp., 2019

An Adaptive Load Balancer For Graph Analytical Applications on GPUs.
CoRR, 2019

Batch Alias Analysis.
Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019

Graph Coloring Using GPUs.
Proceedings of the Euro-Par 2019: Parallel Processing, 2019

Compare less, defer more: scaling value-contexts based whole-program heap analyses.
Proceedings of the 28th International Conference on Compiler Construction, 2019

Efficiency and expressiveness in UW-OpenMP.
Proceedings of the 28th International Conference on Compiler Construction, 2019

Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Identifying refactoring opportunities for replacing type code with subclass and state.
Proc. ACM Program. Lang., 2018

TTLG - An Efficient Tensor Transposition Library for GPUs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Optimizing remote data transfers in X10.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Energy-Efficient Compilation of Irregular Task-Parallel Loops.
ACM Trans. Archit. Code Optim., 2017

Refactoring opportunities for replacing type code with state and subclass.
Proceedings of the 39th International Conference on Software Engineering, 2017

Optimizing recursive task parallel programs.
Proceedings of the International Conference on Supercomputing, 2017

2016
Lexical state analyzer for JavaCC grammars.
Softw. Pract. Exp., 2016

Improved MHP Analysis.
Proceedings of the 25th International Conference on Compiler Construction, 2016

2015
IMSuite: A benchmark suite for simulating distributed algorithms.
J. Parallel Distributed Comput., 2015

DCAFE: Dynamic load-balanced loop Chunking & Aggressive Finish Elimination for Recursive Task Parallel Programs.
CoRR, 2015

Unique Worker model for OpenMP.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Loop Tiling in the Presence of Exceptions.
Proceedings of the 29th European Conference on Object-Oriented Programming, 2015

2013
A Transformation Framework for Optimizing Task-Parallel Programs.
ACM Trans. Program. Lang. Syst., 2013

Improved bitwidth-aware variable packing.
ACM Trans. Archit. Code Optim., 2013

Lexical State Analyzer.
CoRR, 2013

2012
Identifying services from legacy batch applications.
Proceedings of the 5th Annual India Software Engineering Conference, 2012

2011
Fault localization for data-centric programs.
Proceedings of the SIGSOFT/FSE'11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC'11: 13th European Software Engineering Conference (ESEC-13), 2011

A framework for analyzing programs written in proprietary languages.
Proceedings of the Companion to the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011

2010
Inferring arbitrary distributions for data and computation.
Proceedings of the Companion to the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

Reducing task creation and termination overhead in explicitly parallel programs.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Efficient, portable implementation of asynchronous multi-place programs.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Chunking parallel loops in the presence of synchronization.
Proceedings of the 23rd international conference on Supercomputing, 2009

2008
Static Detection of Place Locality and Elimination of Runtime Checks.
Proceedings of the Programming Languages and Systems, 6th Asian Symposium, 2008

2007
A Framework for End-to-End Verification and Evaluation of Register Allocators.
Proceedings of the Static Analysis, 14th International Symposium, 2007

Advances in Register Allocation Techniques.
The Compiler Design Handbook: Optimizations and Machine Code Generation, 2007

2006
Dynamic state restoration using versioning exceptions.
High. Order Symb. Comput., 2006

SARA: Combining Stack Allocation and Register Allocation.
Proceedings of the Compiler Construction, 15th International Conference, 2006

2005
Timing Analysis of TCP Servers for Surviving Denial-of-Service Attacks.
Proceedings of the 11th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2005), 2005

Compile-Time Concurrent Marking Write Barrier Removal.
Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

2003
Efficient spill code for SDRAM.
Proceedings of the International Conference on Compilers, 2003

