Dhruv Batra

ORCID: 0000-0002-1358-0011

According to our database, Dhruv Batra authored at least 222 papers between 2008 and 2024.

Bibliography

2024
IndoorSim-to-OutdoorReal: Learning to Navigate Outdoors Without Any Outdoor Experience.
IEEE Robotics Autom. Lett., May, 2024

ASC: Adaptive Skill Coordination for Robotic Mobile Manipulation.
IEEE Robotics Autom. Lett., January, 2024

Towards Open-World Mobile Manipulation in Homes: Lessons from the NeurIPS 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge.
CoRR, 2024

Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control.
CoRR, 2024

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation.
CoRR, 2024

Seeing the Unseen: Visual Common Sense for Semantic Placement.
CoRR, 2024

VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

What Do We Learn from a Large-Scale Study of Pre-Trained Visual Representations in Sim and Real Environments?
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

2023
An Extensible, Data-Oriented Architecture for High-Performance, Many-World Simulation.
ACM Trans. Graph., August, 2023

Navigating to objects in the real world.
Sci. Robotics, June, 2023

Emergence of Maps in the Memories of Blind Navigation Agents.
AI Matters, June, 2023

Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis.
CoRR, 2023

GOAT: GO to Any Thing.
CoRR, 2023

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations.
CoRR, 2023

Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation.
CoRR, 2023

AutoNeRF: Training Implicit Scene Representations with Autonomous Agents.
CoRR, 2023

Adaptive Skill Coordination for Robotic Mobile Manipulation.
CoRR, 2023

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
CoRR, 2023

OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav.
CoRR, 2023

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ViNL: Visual Navigation and Locomotion Over Obstacles.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Adaptive Coordination in Social Embodied Rearrangement.
Proceedings of the International Conference on Machine Learning, 2023

BC-IRL: Learning Generalizable Reward Functions from Demonstrations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Navigating to Objects Specified by Images.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Skill Transformer: A Monolithic Policy for Mobile Manipulation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Habitat-Matterport 3D Semantics Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PIRLNav: Pretraining with Imitation and RL Finetuning for OBJECTNAV.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

FindThis: Language-Driven Object Disambiguation in Indoor Environments.
Proceedings of the Conference on Robot Learning, 2023

Simple and Effective Synthesis of Indoor 3D Scenes.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances.
CoRR, 2022

Retrospectives on the Embodied AI Workshop.
CoRR, 2022

Offline Visual Representation Learning for Embodied Navigation.
CoRR, 2022

VER: Scaling On-Policy RL Leads to the Emergence of Navigation in Embodied Rearrangement.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Benchmarking Augmentation Methods for Learning Robust Navigation Agents: the Winning Entry of the 2021 iGibson Challenge.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Memory-Augmented Reinforcement Learning for Image-Goal Navigation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

SSL Enables Learning from Sparse Rewards in Image-Goal Navigation.
Proceedings of the International Conference on Machine Learning, 2022

Housekeep: Tidying Virtual Households Using Commonsense Reasoning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Is Mapping Necessary for Realistic PointGoal Navigation?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Episodic Memory Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Rethinking Sim2Real: Lower Fidelity Simulation Leads to Higher Sim2Real Transfer in Navigation.
Proceedings of the Conference on Robot Learning, 2022

Cross-Domain Transfer via Semantic Skill Imitation.
Proceedings of the Conference on Robot Learning, 2022

How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2021
Bi-Directional Domain Adaptation for Sim2Real Transfer of Embodied Navigation Agents.
IEEE Robotics Autom. Lett., 2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video.
CoRR, 2021

Learning Robust Agents for Visual Navigation in Dynamic Environments: The Winning Entry of iGibson Challenge 2021.
CoRR, 2021

Realistic PointGoal Navigation via Auxiliary Losses and Information Bottleneck.
CoRR, 2021

Model-Advantage Optimization for Model-Based Reinforcement Learning.
CoRR, 2021

Auxiliary Tasks and Exploration Enable ObjectNav.
CoRR, 2021

Habitat 2.0: Training Home Assistants to Rearrange their Habitat.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Success Weighted by Completion Time: A Dynamics-Aware Evaluation Criteria for Embodied Navigation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Learning Navigation Skills for Legged Robots with Learned Robot Embeddings.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Large Batch Simulation for Deep Reinforcement Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Auxiliary Tasks and Exploration Enable ObjectGoal Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

THDA: Treasure Hunt Data Augmentation for Semantic Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Waypoint Models for Instruction-guided Navigation in Continuous Environments.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Contrast and Classify: Training Robust VQA Models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Sim2Real Predictivity: Does Evaluation in Simulation Predict Real-World Performance?
IEEE Robotics Autom. Lett., 2020

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization.
Int. J. Comput. Vis., 2020

Rearrangement: A Challenge for Embodied AI.
CoRR, 2020

Contrast and Classify: Alternate Training for Robust VQA.
CoRR, 2020

Semantic MapNet: Building Allocentric Semantic Maps and Representations from Egocentric Views.
CoRR, 2020

Auxiliary Tasks Speed Up Learning PointGoal Navigation.
CoRR, 2020

ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects.
CoRR, 2020

Analyzing Visual Representations in Embodied Navigation Tasks.
CoRR, 2020

Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

IR-VIC: Unsupervised Discovery of Sub-goals for Transfer in RL.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Embodied Multimodal Multitask Learning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames.
Proceedings of the 8th International Conference on Learning Representations, 2020

Where Are You? Localization from Embodied Dialog.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Large-Scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline.
Proceedings of the Computer Vision - ECCV 2020, 2020

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web.
Proceedings of the Computer Vision - ECCV 2020, 2020

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments.
Proceedings of the Computer Vision - ECCV 2020, 2020

Spatially Aware Multimodal Transformers for TextVQA.
Proceedings of the Computer Vision - ECCV 2020, 2020

Auxiliary Tasks Speed Up Learning Point Goal Navigation.
Proceedings of the 4th Conference on Robot Learning, 2020

Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents.
Proceedings of the 4th Conference on Robot Learning, 2020

Sim-to-Real Transfer for Vision-and-Language Navigation.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Visual Dialog.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering.
Int. J. Comput. Vis., 2019

Are We Making Real Progress in Simulated Environments? Measuring the Sim2Real Gap in Embodied Visual Navigation.
CoRR, 2019

Decentralized Distributed PPO: Solving PointGoal Navigation.
CoRR, 2019

Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning.
CoRR, 2019

The Replica Dataset: A Digital Replica of Indoor Spaces.
CoRR, 2019

Emergence of Compositional Language with Deep Generational Transmission.
CoRR, 2019

Embodied Visual Recognition.
CoRR, 2019

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future.
CoRR, 2019

Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded.
CoRR, 2019

EvalAI: Towards Better Evaluation Systems for AI Agents.
CoRR, 2019

Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018).
CoRR, 2019

Dialog System Technology Challenge 7.
CoRR, 2019

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Chasing Ghosts: Instruction Following as Bayesian State Tracking.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering.
Proceedings of the 36th International Conference on Machine Learning, 2019

Trainable Decoding of Sets of Sequences for Neural Sequence Models.
Proceedings of the 36th International Conference on Machine Learning, 2019

Counterfactual Visual Explanations.
Proceedings of the 36th International Conference on Machine Learning, 2019

TarMAC: Targeted Multi-Agent Communication.
Proceedings of the 36th International Conference on Machine Learning, 2019

Modeling the Long Term Future in Model-Based Reinforcement Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Embodied Amodal Recognition: Learning to Move to Perceive Objects.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Habitat: A Platform for Embodied AI Research.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

nocaps: novel object captioning at scale.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Improving Generative Visual Dialog by Answering Diverse Questions.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Multi-Target Embodied Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Embodied Question Answering in Photorealistic Environments With Point Cloud Perception.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Towards VQA Models That Can Read.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Audio Visual Scene-Aware Dialog.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Fabrik: An Online Collaborative Neural Network Editor.
CoRR, 2018

Pythia v0.1: the Winning Entry to the VQA Challenge 2018.
CoRR, 2018

Talk the Walk: Navigating New York City through Grounded Dialogue.
CoRR, 2018

Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7.
CoRR, 2018

Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples.
CoRR, 2018

Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations.
Proceedings of the 35th International Conference on Machine Learning, 2018

Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples.
Proceedings of the 6th International Conference on Learning Representations, 2018

Graph R-CNN for Scene Graph Generation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Choose Your Neuron: Incorporating Domain Knowledge Through Neuron-Importance.
Proceedings of the Computer Vision - ECCV 2018, 2018

Visual Coreference Resolution in Visual Dialog Using Neural Module Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

Neural Baby Talk.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Embodied Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Neural Modular Control for Embodied Question Answering.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Diverse Beam Search for Improved Description of Complex Scenes.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Empirical Minimum Bayes Risk Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

VQA: Visual Question Answering - www.visualqa.org.
Int. J. Comput. Vis., 2017

Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
Comput. Vis. Image Underst., 2017

Resolving vision and language ambiguities together: Joint segmentation & prepositional attachment resolution in captioned scenes.
Comput. Vis. Image Underst., 2017

CoDraw: Visual Dialog for Collaborative Drawing.
CoRR, 2017

C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset.
CoRR, 2017

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation.
Proceedings of the 5th International Conference on Learning Representations, 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Evaluating Visual Conversational Agents via Cooperative Human-AI Games.
Proceedings of the Fifth AAAI Conference on Human Computation and Crowdsourcing, 2017

ParlAI: A Dialog Research Software Platform.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

The Promise of Premise: Harnessing Question Premises in Visual Question Answering.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Deal or No Deal? End-to-End Learning of Negotiation Dialogues.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Visual Dialog.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Counting Everyday Objects in Everyday Scenes.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Gender Classification of Walkers via Underfloor Accelerometer Measurements.
IEEE Internet Things J., 2016

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models.
CoRR, 2016

Grad-CAM: Why did you say that?
CoRR, 2016

Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization.
CoRR, 2016

A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories.
CoRR, 2016

Interpreting Visual Question Answering Models.
CoRR, 2016

Reducing Overfitting in Deep Networks by Decorrelating Representations.
Proceedings of the 4th International Conference on Learning Representations, 2016

Measuring Machine Intelligence Through Visual Question Answering.
AI Mag., 2016

Pose tracking by efficiently exploiting global features.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Hierarchical Question-Image Co-Attention for Visual Question Answering.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories.
Proceedings of the NAACL HLT 2016, 2016

Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Sort Story: Sorting Jumbled Images and Captions into Stories.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Analyzing the Behavior of Visual Question Answering Models.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Yin and Yang: Balancing and Answering Binary Visual Questions.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Joint Unsupervised Learning of Deep Representations and Image Clusters.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Object-Proposal Evaluation Protocol is 'Gameable'.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

We are Humor Beings: Understanding and Predicting Visual Humor.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Radio transformer networks: Attention models for learning to synchronize in wireless systems.
Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers, 2016

2015
Human pose estimation via multi-layer composite models.
Signal Process., 2015

Guest Editors' Introduction: Special Section on Higher Order Graphical Models in Computer Vision.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

A Comparative Study of Modern Inference Techniques for Structured Discrete Energy Minimization Problems.
Int. J. Comput. Vis., 2015

Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks.
CoRR, 2015

CloudCV: Large Scale Distributed Computer Vision as a Cloud Service.
CoRR, 2015

SubmodBoxes: Near-Optimal Search for a Set of Diverse Object Proposals.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

VQA: Visual Question Answering.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Optimizing Expected Intersection-Over-Union with Candidate-Constrained CRFs.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Active learning for structured probabilistic models with histogram approximation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

VIP: Finding important people in images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

CloudCV: Large-Scale Distributed Computer Vision as a Cloud Service.
Proceedings of the Mobile Cloud Visual Media Computing - From Interaction to Service, 2015

2014
Putting the User in the Loop for Image-Based Modeling.
Int. J. Comput. Vis., 2014

Combining the Best of Graphical Models and ConvNets for Semantic Segmentation.
CoRR, 2014

Candidate Constrained CRFs for Loss-Aware Structured Prediction.
CoRR, 2014

Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Empirical Minimum Bayes Risk Prediction: How to Extract an Extra Few % Performance from Vision Models with Just Three More Parameters.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Multimodal Learning in Loosely-Organized Web Images.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Efficiently Enforcing Diversity in Multi-Output Structured Prediction.
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, 2014

2013
Group Norm for Learning Structured SVMs with Unstructured Latent Variables.
Proceedings of the IEEE International Conference on Computer Vision, 2013

A Systematic Exploration of Diversity in Machine Translation.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Discriminative Re-ranking of Diverse Segmentations.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

A Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

DivMCuts: Faster Training of Structural SVMs with Diverse M-Best Cutting-Planes.
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013

2012
An Efficient Message-Passing Algorithm for the M-Best MAP Problem.
Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, 2012

Multiple Choice Learning: Learning to Produce Multiple Structured Outputs.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, 2012

Diverse M-Best Solutions in Markov Random Fields.
Proceedings of the Computer Vision - ECCV 2012, 2012

Learning the right model: Efficient max-margin learning in Laplacian CRFs.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

MaxFlow Revisited: An Empirical Comparison of Maxflow Algorithms for Dense Vision Problems.
Proceedings of the British Machine Vision Conference, 2012

A Multi-layer Composite Model for Human Pose Estimation.
Proceedings of the British Machine Vision Conference, 2012

2011
Interactive Co-segmentation of Objects in Image Collections.
Springer Briefs in Computer Science, Springer, ISBN: 978-1-4614-1915-0, 2011

Tighter Relaxations for MAP-MRF Inference: A Local Primal-Dual Gap based Separation Algorithm.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

Interactively Co-segmentating Topically Related Images with Intelligent Scribble Guidance.
Int. J. Comput. Vis., 2011

Dynamic Tree Block Coordinate Ascent.
Proceedings of the 28th International Conference on Machine Learning, 2011

Scribble based interactive 3D reconstruction via scene co-segmentation.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Inference for order reduction in Markov random fields.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Making the right moves: Guiding alpha-expansion using local primal-dual gaps.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
iModel: Interactive Co-segmentation for Object of Interest 3D Modeling.
Proceedings of the Trends and Topics in Computer Vision, 2010

iCoseg: Interactive co-segmentation with intelligent scribble guidance.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Beyond trees: MRF inference via outer-planar decomposition.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Seed Image Selection in interactive cosegmentation.
Proceedings of the International Conference on Image Processing, 2009

Cutout-search: Putting a name to the picture.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

2008
Learning class-specific affinities for image labelling.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Semi-Supervised Clustering via Learnt Codeword Distances.
Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

