Li Fei-Fei

ORCID: 0000-0002-7481-0810

Affiliations:
  • Stanford University, Department of Computer Science, CA, USA


According to our database, Li Fei-Fei authored at least 336 papers between 2003 and 2024.

Bibliography

2024
BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation.
CoRR, 2024

DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation.
CoRR, 2024

Position Paper: Agent AI Towards a Holistic Intelligence.
CoRR, 2024

An Interactive Agent Foundation Model.
CoRR, 2024

Agent AI: Surveying the Horizons of Multimodal Interaction.
CoRR, 2024

Wild2Avatar: Rendering Humans Behind Occlusions.
CoRR, 2024

Differentially Private Video Activity Recognition.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

2023
Guest Editorial: Introduction to the Special Section on Graphs in Vision and Pattern Analysis.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Photorealistic Video Generation with Diffusion Models.
CoRR, 2023

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator.
CoRR, 2023

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image.
CoRR, 2023

Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI.
CoRR, 2023

D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Robotic Manipulation.
CoRR, 2023

MindAgent: Emergent Gaming Interaction.
CoRR, 2023

HomE: Homography-Equivariant Video Representation Learning.
CoRR, 2023

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects.
CoRR, 2023

Dynamic-Resolution Model Learning for Object Pile Manipulation.
Proceedings of the Robotics: Science and Systems XIX, Daegu, 2023

Holistic Evaluation of Text-to-Image Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Siamese Masked Autoencoders.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Model-Based Control with Sparse Neural Dynamics.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Primitive Skill-Based Robot Learning from Human Evaluative Feedback.
IROS, 2023

Active Task Randomization: Learning Robust Skills via Unsupervised Generation of Diverse and Feasible Tasks.
IROS, 2023

M-EMBER: Tackling Long-Horizon Mobile Manipulation via Factorized Domain Transfer.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Task-Driven Graph Attention for Hierarchical Relational Object Navigation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Modeling Dynamic Environments with Scene Graph Memory.
Proceedings of the International Conference on Machine Learning, 2023

VIMA: Robot Manipulation with Multimodal Prompts.
Proceedings of the International Conference on Machine Learning, 2023

MaskViT: Masked Visual Pre-Training for Video Prediction.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Rendering Humans from Object-Occluded Monocular Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities.
Proceedings of the Conference on Robot Learning, 2023

MimicPlay: Long-Horizon Imitation Learning by Watching Human Play.
Proceedings of the Conference on Robot Learning, 2023

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models.
Proceedings of the Conference on Robot Learning, 2023

Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation.
Proceedings of the Conference on Robot Learning, 2023

2022
Author Correction: Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., October, 2022

Generalizable Task Planning Through Representation Pretraining.
IEEE Robotics Autom. Lett., 2022

Advances, challenges and opportunities in creating data for trustworthy AI.
Nat. Mach. Intell., 2022

Active Task Randomization: Learning Visuomotor Skills for Sequential Manipulation by Proposing Feasible and Novel Tasks.
CoRR, 2022

Retrospectives on the Embodied AI Workshop.
CoRR, 2022

VIMA: General Robot Manipulation with Multimodal Prompts.
CoRR, 2022

BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents.
CoRR, 2022

ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

GaitForeMer: Self-supervised Pre-training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

A Study of Face Obfuscation in ImageNet.
Proceedings of the International Conference on Machine Learning, 2022

MetaMorph: Learning Universal Controllers with Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens.
Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Revisiting the "Video" in Video-Language Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A Dual Representation Framework for Robot Learning with Human Guidance.
Proceedings of the Conference on Robot Learning, 2022

See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation.
Proceedings of the Conference on Robot Learning, 2022

2021
Next Big Challenges in Core AI Technology.
Proceedings of the Reflections on Artificial Intelligence for Humanity, 2021

Neural Event Semantics for Grounded Language Understanding.
Trans. Assoc. Comput. Linguistics, 2021

Quantifying Parkinson's disease motor severity under uncertainty using MDS-UPDRS videos.
Medical Image Anal., 2021

Visual Intelligence through Human Interaction.
CoRR, 2021

On the Opportunities and Risks of Foundation Models.
CoRR, 2021

Neural Abstructions: Abstractions that Support Construction for Grounded Language Learning.
CoRR, 2021

Physion: Evaluating Physical Prediction from Vision in Humans and Machines.
CoRR, 2021

Embodied Intelligence via Learning and Evolution.
CoRR, 2021

Representation Learning with Statistical Independence to Mitigate Bias.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Discovering Generalizable Skills via Automated Generation of Diverse Tasks.
Proceedings of the Robotics: Science and Systems XVII, Virtual Event, July 12-16, 2021

MOMA: Multi-Object Multi-Actor Activity Parsing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Physion: Evaluating Physical Prediction from Vision in Humans and Machines.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Deep Affordance Foresight: Planning Through What Can Be Done in the Future.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Learning Multi-Arm Manipulation Through Collaborative Teleoperation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adaptive Procedural Task Generation for Hard-Exploration Problems.
Proceedings of the 9th International Conference on Learning Representations, 2021

Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Scalable Differential Privacy With Sparse Network Finetuning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Metadata Normalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Error-Aware Imitation Learning from Teleoperation Data for Mobile Manipulation.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks.
IEEE Trans. Robotics, 2020

Conceptual Metaphors Impact Perceptions of Human-AI Collaboration.
Proc. ACM Hum. Comput. Interact., 2020

Assessing the accuracy of automatic speech recognition for psychotherapy.
npj Digit. Medicine, 2020

Illuminating the dark spaces of healthcare with ambient intelligence.
Nat., 2020

Automatic detection of hand hygiene using computer vision technology.
J. Am. Medical Informatics Assoc., 2020

Learning task-oriented grasping for tool manipulation from simulated self-supervision.
Int. J. Robotics Res., 2020

Human-in-the-Loop Imitation Learning using Remote Teleoperation.
CoRR, 2020

iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes.
CoRR, 2020

Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations.
CoRR, 2020

GTI: Learning to Generalize across Long-Horizon Tasks from Human Demonstrations.
Proceedings of the Robotics: Science and Systems XVI, 2020

Learning Physical Graph Representations from Visual Scenes.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Vision-Based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

KETO: Learning Keypoint Representations for Tool Manipulation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Motion Reasoning for Goal-Based Imitation Learning.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Towards fairer datasets: filtering and balancing the distribution of the people subtree in the ImageNet hierarchy.
Proceedings of the FAT* '20: Conference on Fairness, Accountability, and Transparency, 2020

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

Procedure Planning in Instructional Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
A computer vision system for deep learning-based detection of patient mobilization activities in the ICU.
npj Digit. Medicine, 2019

Automated abnormality detection in lower extremity radiographs using deep learning.
Nat. Mach. Intell., 2019

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs.
CoRR, 2019

Deep Bayesian Active Learning for Multiple Correct Outputs.
CoRR, 2019

IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data.
CoRR, 2019

Bias-Resilient Neural Network.
CoRR, 2019

Causal Induction from Visual Observations for Goal Directed Tasks.
CoRR, 2019

Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs.
CoRR, 2019

SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning.
CoRR, 2019

D³TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.
CoRR, 2019

HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Regression Planning Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Scaling Robot Supervision to Hundreds of Hours with RoboTurk: Robotic Manipulation Dataset through Human Reasoning and Dexterity.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks.
Proceedings of the International Conference on Robotics and Automation, 2019

Eidetic 3D LSTM: A Model for Video Prediction and Beyond.
Proceedings of the 7th International Conference on Learning Representations, 2019

Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Situational Fusion of Visual Representation for Visual Navigation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Scene Graph Prediction With Limited Labels.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Audio-linguistic Embeddings for Spoken Sentences.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

AI-Based Request Augmentation to Increase Crowdsourcing Participation.
Proceedings of the Seventh AAAI Conference on Human Computation and Crowdsourcing, 2019

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Composing Text and Image for Image Retrieval - an Empirical Odyssey.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Information Maximizing Visual Question Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

Thoracic Disease Identification and Localization with Limited Supervision.
Proceedings of the Deep Learning and Convolutional Neural Networks for Medical Imaging and Clinical Informatics, 2019

2018
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos.
Int. J. Comput. Vis., 2018

Vision-Based Gait Analysis for Senior Care.
CoRR, 2018

Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference.
CoRR, 2018

A Fully Private Pipeline for Deep Learning on Electronic Health Records.
CoRR, 2018

Privacy-Preserving Action Recognition for Smart Hospitals using Low-Resolution Depth Images.
CoRR, 2018

Measuring Depression Symptom Severity from Spoken Language and 3D Facial Expressions.
CoRR, 2018

DDRprog: A CLEVR Differentiable Dynamic Reasoning Programmer.
CoRR, 2018

Learning to Play with Intrinsically-Motivated Self-Aware Agents.
CoRR, 2018

Scaling Human-Object Interaction Recognition Through Zero-Shot Learning.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Engagement Learning: Expanding Visual Knowledge by Engaging Online Participants.
Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology Adjunct Proceedings, 2018

Flexible neural representation for physics prediction.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning to Decompose and Disentangle Representations for Video Prediction.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning to Play With Intrinsically-Motivated, Self-Aware Agents.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

3D Point Cloud-Based Visual Prediction of ICU Mobility Care Activities.
Proceedings of the Machine Learning for Healthcare Conference, 2018

Neural Task Programming: Learning to Generalize Across Hierarchical Tasks.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
Proceedings of the 35th International Conference on Machine Learning, 2018

MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels.
Proceedings of the 35th International Conference on Machine Learning, 2018

HiDDeN: Hiding Data With Deep Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

Graph Distillation for Action Detection with Privileged Modalities.
Proceedings of the Computer Vision - ECCV 2018, 2018

Progressive Neural Architecture Search.
Proceedings of the Computer Vision - ECCV 2018, 2018

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos.
Proceedings of the Computer Vision - ECCV 2018, 2018

Dynamic Task Prioritization for Multitask Learning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Neural Graph Matching Networks for Fewshot 3D Action Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Thoracic Disease Identification and Localization With Limited Supervision.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Referring Relationships.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Image Generation From Scene Graphs.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Iterative Visual Reasoning Beyond Convolutions.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Emergence of Structured Behaviors from Curiosity-Based Intrinsic Motivation.
Proceedings of the 40th Annual Meeting of the Cognitive Science Society, 2018

2017
Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States.
Proc. Natl. Acad. Sci. USA, 2017

Deep Visual-Semantic Alignments for Generating Image Descriptions.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Evidence for similar patterns of neural activity elicited by picture- and word-based representations of natural scenes.
NeuroImage, 2017

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations.
Int. J. Comput. Vis., 2017

MentorNet: Regularizing Very Deep Neural Networks on Corrupted Labels.
CoRR, 2017

Progressive Neural Architecture Search.
CoRR, 2017

Label Efficient Learning of Transferable Representations across Domains and Tasks.
CoRR, 2017

Graph Distillation for Action Detection with Privileged Information.
CoRR, 2017

Tackling Over-pruning in Variational Autoencoders.
CoRR, 2017

Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US.
CoRR, 2017

Label Efficient Learning of Transferable Representations across Domains and Tasks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards Vision-Based Smart Hospitals: A System for Tracking and Monitoring Hand Hygiene Compliance.
Proceedings of the Machine Learning for Health Care Conference, 2017

AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems.
Proceedings of the Robotics Research, The 18th International Symposium, 2017

Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017

Target-driven visual navigation in indoor scenes using deep reinforcement learning.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Unsupervised camera localization in crowded spaces.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Visual Semantic Planning Using Deep Successor Representations.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Dense-Captioning Events in Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Inferring and Executing Programs for Visual Reasoning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Characterizing and Improving Stability in Neural Style Transfer.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Fine-Grained Recognition in the Wild: A Multi-task Domain Adaptation Approach.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Knowledge Acquisition for Visual Question Answering via Iterative Querying.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning to Learn from Noisy Web Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Scene Graph Generation by Iterative Message Passing.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Unsupervised Learning of Long-Term Motion Dynamics for Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

A Hierarchical Approach for Generating Descriptive Image Paragraphs.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

A Glimpse Far into the Future: Understanding Long-term Crowd Worker Quality.
Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 2017

Scalable Annotation of Fine-Grained Categories Without Experts.
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017

End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos.
Proceedings of the British Machine Vision Conference 2017, 2017

Computer Vision-based Approach to Maintain Independent Living for Seniors.
Proceedings of the AMIA 2017, 2017

Fine-Grained Car Detection for Visual Census Estimation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Learning to Predict Human Behavior in Crowded Scenes.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

Tracking Millions of Humans in Crowded Spaces.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

2016
Leveraging the Wisdom of the Crowd for Fine-Grained Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Typicality sharpens category representations in object-selective cortex.
NeuroImage, 2016

Crowdsourcing in Computer Vision.
Found. Trends Comput. Graph. Vis., 2016

A Glimpse Far into the Future: Understanding Long-term Crowd Worker Accuracy.
CoRR, 2016

Viewpoint Invariant 3D Human Pose Estimation with Recurrent Error Feedback.
CoRR, 2016

Toward More Gender Diversity in CS through an Artificial Intelligence Summer Program for High School Girls.
Proceedings of the 47th ACM Technical Symposium on Computing Science Education, 2016

Vision-Based Classification of Developmental Disorders Using Eye-Movements.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, 2016

Visual Relationship Detection with Language Priors.
Proceedings of the Computer Vision - ECCV 2016, 2016

The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Perceptual Losses for Real-Time Style Transfer and Super-Resolution.
Proceedings of the Computer Vision - ECCV 2016, 2016

Connectionist Temporal Modeling for Weakly Supervised Action Labeling.
Proceedings of the Computer Vision - ECCV 2016, 2016

Towards Viewpoint Invariant 3D Human Pose Estimation.
Proceedings of the Computer Vision - ECCV 2016, 2016

What's the Point: Semantic Segmentation with Point Supervision.
Proceedings of the Computer Vision - ECCV 2016, 2016

Visual7W: Grounded Question Answering in Images.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

End-to-End Learning of Action Detection from Frame Glimpses in Videos.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Detecting Events and Key Actors in Multi-person Videos.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

DenseCap: Fully Convolutional Localization Networks for Dense Captioning.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Recurrent Attention Models for Depth-Based Person Identification.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Social LSTM: Human Trajectory Prediction in Crowded Spaces.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Embracing Error to Enable Rapid Crowdsourcing.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

Vision-Based Hand Hygiene Monitoring in Hospitals.
Proceedings of the AMIA 2016, 2016

2015
RGB-W Dataset.
Dataset, December, 2015

Basic Level Category Structure Emerges Gradually across Human Ventral Visual Cortex.
J. Cogn. Neurosci., 2015

ImageNet Large Scale Visual Recognition Challenge.
Int. J. Comput. Vis., 2015

Building a Large-scale Multimodal Knowledge Base for Visual Question Answering.
CoRR, 2015

Visualizing and Understanding Recurrent Networks.
CoRR, 2015

SentenceRacer: A Game with a Purpose for Image Sentence Annotation.
CoRR, 2015

Improving Image Classification with Location Context.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning Temporal Embeddings for Complex Video Analysis.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Love Thy Neighbors: Image Annotation by Exploiting Image Metadata.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

RGB-W: When Vision Meets Wireless.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Best of both worlds: Human-machine collaboration for object annotation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning semantic relationships for better action retrieval in images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Fine-grained recognition without part annotations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Image retrieval using scene graphs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval.
Proceedings of the Fourth Workshop on Vision and Language, 2015

2014
Object Bank: An Object-Level Image Representation for High-Level Visual Recognition.
Int. J. Comput. Vis., 2014

VideoSET: Video Summary Evaluation through Text.
CoRR, 2014

Affordances Provide a Fundamental Categorization Principle for Visual Scenes.
CoRR, 2014

Visual Noise from Natural Scene Statistics Reveals Human Scene Category Representations.
CoRR, 2014

Understanding the 3D layout of a cluttered room from multiple images.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Deep Fragment Embeddings for Bidirectional Image Sentence Mapping.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Learning Features and Parts for Fine-Grained Recognition.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Reasoning about Object Affordances in a Knowledge Base Representation.
Proceedings of the Computer Vision - ECCV 2014, 2014

Linking People in Videos with "Their" Names Using Coreference Resolution.
Proceedings of the Computer Vision - ECCV 2014, 2014

Efficient Image and Video Co-localization with Frank-Wolfe Algorithm.
Proceedings of the Computer Vision - ECCV 2014, 2014

Co-localization in Real-World Images.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Large-Scale Video Classification with Convolutional Neural Networks.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Socially-Aware Large-Scale Crowd Forecasting.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Discovering the Signatures of Joint Attention in Child-Caregiver Interaction.
Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Scalable multi-label annotation.
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2014

Social Role Recognition for Human Event Understanding.
Proceedings of the Human-Centered Social Media Analytics, 2014

Integrating Randomization and Discrimination for Classifying Human-Object Interaction Activities.
Proceedings of the Human-Centered Social Media Analytics, 2014

2013
Differential connectivity within the Parahippocampal Place Area.
NeuroImage, 2013

Object discovery in 3D scenes via shape analysis.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

3D Object Representations for Fine-Grained Categorization.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

Discovering Object Functionality.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Combining the Right Features for Complex Event Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Detecting Avocados to Zucchinis: What Have We Done, and Where Are We Going?
Proceedings of the IEEE International Conference on Computer Vision, 2013

Video Event Understanding Using Natural Language Descriptions.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Discriminative Segment Annotation in Weakly Labeled Video.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Social Role Discovery in Human Events.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Fine-Grained Crowdsourcing for Fine-Grained Recognition.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Free your Camera: 3D Indoor Scene Understanding from Arbitrary Camera Motion.
Proceedings of the British Machine Vision Conference, 2013

2012
Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Voxel-level functional connectivity using spatial regularization.
NeuroImage, 2012

Shifting Weights: Adapting Object Detectors from Image to Video.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Web image prediction using multivariate point processes.
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

Efficient Euclidean Projections onto the Intersection of Norm Balls.
Proceedings of the 29th International Conference on Machine Learning, 2012

Crowdsourcing Annotations for Visual Object Detection.
Proceedings of the 4th Human Computation Workshop, 2012

Action Recognition with Exemplar Based 2.5D Graph Matching.
Proceedings of the Computer Vision - ECCV 2012, 2012

Object-Centric Spatial Pooling for Image Classification.
Proceedings of the Computer Vision - ECCV 2012, 2012

A codebook-free and annotation-free approach for fine-grained image categorization.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Learning latent temporal structure for complex event detection.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Multi-Level Structured Image Coding on High-Dimensional Image Representation.
Proceedings of the Computer Vision, 2012

2011
ReVision: automated classification, analysis and redesign of chart images.
Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, 2011

GENIE TRECVID 2011 Multimedia Event Detection: Late-Fusion Approaches to Combine Multiple Audio-Visual features.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Large-Scale Category Structure Aware Image Categorization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Human action recognition by learning bases of action attributes and parts.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Distributed cosegmentation via submodular optimization on anisotropic diffusion.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Online detection of unusual events in videos via dynamic sparse coding.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Combining randomization and discrimination for fine-grained image categorization.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Hierarchical semantic indexing for large scale image retrieval.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Multi-view Object Categorization and Pose Estimation.
Proceedings of the Computer Vision: Detection, Recognition and Reconstruction, 2010

What, Where and Who? Telling the Story of an Image by Activity Classification, Scene Recognition and Object Categorization.
Proceedings of the Computer Vision: Detection, Recognition and Reconstruction, 2010

Learning Object Categories From Internet Image Searches.
Proc. IEEE, 2010

OPTIMOL: Automatic Online Picture Collection via Incremental Model Learning.
Int. J. Comput. Vis., 2010

Large Margin Learning of Upstream Scene Understanding Models.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Image Segmentation with Topic Random Field.
Proceedings of the Computer Vision - ECCV 2010, 2010

Attribute Learning in Large-Scale Datasets.
Proceedings of the Trends and Topics in Computer Vision, 2010

Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification.
Proceedings of the Computer Vision, 2010

Objects as Attributes for Scene Classification.
Proceedings of the Trends and Topics in Computer Vision, 2010

What Does Classifying More Than 10,000 Image Categories Tell Us?
Proceedings of the Computer Vision - ECCV 2010, 2010

Modeling mutual context of object and human pose in human-object interaction activities.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Grouplet: A structured image representation for recognizing human and object interactions.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Efficient extraction of human motion volumes by tracking.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Building and using a semantivisual image hierarchy.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Mining discriminative adjectives and prepositions for natural scene recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

Simultaneous image classification and annotation.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

A multi-view probabilistic model for 3D object classes.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Towards total scene understanding: Classification, annotation and segmentation in an automatic framework.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

ImageNet: A large-scale hierarchical image database.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words.
Int. J. Comput. Vis., 2008

Variational Transform Invariant Mixture of Probabilistic PCA.
Proceedings of the 9th IEEE Workshop on Applications of Computer Vision (WACV 2008), 2008

View Synthesis for Recognizing Unseen Poses of Object Classes.
Proceedings of the Computer Vision, 2008

Extracting Moving People from Internet Videos.
Proceedings of the Computer Vision, 2008

Towards Scalable Dataset Construction: An Active Learning Approach.
Proceedings of the Computer Vision, 2008

2007
Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories.
Comput. Vis. Image Underst., 2007

3D generic object categorization, localization and pose estimation.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

What, where and who? Classifying events by scene and object recognition.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

A Hierarchical Model of Shape and Appearance for Human Action Classification.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

OPTIMOL: automatic Online Picture collecTion via Incremental MOdel Learning.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

OPTIMOL: A Framework for Online Picture Collection via Incremental Model Learning.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
One-Shot Learning of Object Categories.
IEEE Trans. Pattern Anal. Mach. Intell., 2006

Variational Shift Invariant Probabilistic PCA for Face Recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Audio-Visual Speaker Localization Using Graphical Models.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Using Dependent Regions for Object Categorization in a Generative Framework.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

2005
Learning Object Categories from Google's Image Search.
Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

A Bayesian Hierarchical Model for Learning Natural Scene Categories.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

2004
What do reflections tell us about the shape of a mirror?
Proceedings of the 1st Symposium on Applied Perception in Graphics and Visualization, 2004

2003
A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

