Ranjay Krishna

ORCID: 0000-0001-8784-2531

According to our database, Ranjay Krishna authored at least 76 papers between 2015 and 2024.

Bibliography

2024
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion.
CoRR, 2024

m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks.
CoRR, 2024

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use.
CoRR, 2024

Training Language Model Agents without Modifying Language Models.
CoRR, 2024

THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation.
CoRR, 2024

Scaling Up LLM Reviews for Google Ads Content Moderation.
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

2023
Guest Editorial: Introduction to the Special Section on Graphs in Vision and Pattern Analysis.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Explanations Can Reduce Overreliance on AI Systems During Decision-Making.
Proc. ACM Hum. Comput. Interact., April, 2023

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions.
Proc. VLDB Endow., 2023

EQUI-VOCAL Demonstration: Synthesizing Video Queries from User Interactions.
Proc. VLDB Endow., 2023

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building.
Proc. VLDB Endow., 2023

Designing LLM Chains by Adapting Techniques from Crowdsourcing Workflows.
CoRR, 2023

Holodeck: Language Guided Generation of 3D Embodied AI Environments.
CoRR, 2023

Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos.
CoRR, 2023

Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models.
CoRR, 2023

Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World.
CoRR, 2023

Lasagna: Layered Score Distillation for Disentangled Object Relighting.
CoRR, 2023

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback.
CoRR, 2023

Selective Visual Representations Improve Convergence and Generalization for Embodied AI.
CoRR, 2023

Improving Interpersonal Communication by Simulating Audiences with Language Models.
CoRR, 2023

Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation.
CoRR, 2023

Cultural and Linguistic Diversity Improves Visual Representations.
CoRR, 2023

EcoAssistant: Using LLM Assistant More Affordably and Accurately.
CoRR, 2023

Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models.
CoRR, 2023

Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias.
CoRR, 2023

MIMIC: Masked Image Modeling with Image Correspondences.
CoRR, 2023

COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?
CoRR, 2023

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions [Technical Report].
CoRR, 2023

Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cola: A Benchmark for Compositional Text-to-image Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OBJECT 3DIT: Language-guided 3D-aware Image Editing.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Quilt-1M: One Million Image-Text Pairs for Histopathology.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023



TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AR2-D2: Training a Robot Without a Robot.
Proceedings of the Conference on Robot Learning, 2023

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning.
CoRR, 2022

ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Measuring Compositional Consistency for Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

VOCAL: Video Organization and Interactive Compositional AnaLytics.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

2021
Visual intelligence through human learning.
PhD thesis, 2021

Visual Intelligence through Human Interaction.
CoRR, 2021

On the Opportunities and Risks of Foundation Models.
CoRR, 2021

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Conceptual Metaphors Impact Perceptions of Human-AI Collaboration.
Proc. ACM Hum. Comput. Interact., 2020

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning.
Proceedings of the Sixth Workshop on Noisy User-generated Text, 2020

2019
Action Genome: Actions as Composition of Spatio-temporal Scene Graphs.
CoRR, 2019

Deep Bayesian Active Learning for Multiple Correct Outputs.
CoRR, 2019

HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

HYPE: Human-eYe Perceptual Evaluation of Generative Models.
Proceedings of the Deep Generative Models for Highly Structured Data, 2019

Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Scene Graph Prediction With Limited Labels.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

AI-Based Request Augmentation to Increase Crowdsourcing Participation.
Proceedings of the Seventh AAAI Conference on Human Computation and Crowdsourcing, 2019

Information Maximizing Visual Question Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Eevee: Transforming Images by Bridging High-level Goals and Low-level Edit Operations.
Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

2018
The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary.
CoRR, 2018

Engagement Learning: Expanding Visual Knowledge by Engaging Online Participants.
Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology Adjunct Proceedings, 2018

Referring Relationships.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations.
Int. J. Comput. Vis., 2017

ActivityNet Challenge 2017 Summary.
CoRR, 2017

Crowd Research: Open and Scalable University Laboratories.
Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, 2017

Dense-Captioning Events in Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

A Hierarchical Approach for Generating Descriptive Image Paragraphs.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

A Glimpse Far into the Future: Understanding Long-term Crowd Worker Quality.
Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 2017

2016
A Glimpse Far into the Future: Understanding Long-term Crowd Worker Accuracy.
CoRR, 2016

Visual Relationship Detection with Language Priors.
Proceedings of the Computer Vision - ECCV 2016, 2016

Embracing Error to Enable Rapid Crowdsourcing.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

2015
SentenceRacer: A Game with a Purpose for Image Sentence Annotation.
CoRR, 2015


Image retrieval using scene graphs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval.
Proceedings of the Fourth Workshop on Vision and Language, 2015

