Ishan Misra

Nur Muhammad (Mahi) Shafiullah

CoRR, 2023

On Bringing Robots Home.

CoRR, 2023

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning.

CoRR, 2023

SelfEval: Leveraging the discriminative nature of generative models for evaluation.

Sai Saketh Rambhatla

CoRR, 2023

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation.

CoRR, 2023

DINOv2: Learning Robust Visual Features without Supervision.

CoRR, 2023

Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities.

CoRR, 2023

A Simple Recipe for Competitive Low-compute Self supervised Vision Models.

Quentin Duval

Nicolas Ballas

CoRR, 2023

MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses.

Yang Fu

Xiaolong Wang

Proceedings of the International Conference on Machine Learning, 2023

RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

The hidden uniform cluster prior in self-supervised learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Vision-Language Models Performing Zero-Shot Tasks Exhibit Disparities Between Gender Groups.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

The effectiveness of MAE pre-pretraining for billion-scale pretraining.

Mannat Singh

Quentin Duval

Kalyan Vasudev Alwala

Christoph Feichtenhofer

Ross B. Girshick

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MOST: Multiple Object localization with Self-supervised Transformers for object discovery.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

GeneCIS: A Benchmark for General Conditional Image Similarity.

Sagar Vaze

Nicolas Carion

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OmniMAE: Single Model Masked Pretraining on Images and Videos.

Alaaeldin El-Nouby

Mannat Singh

Kalyan Vasudev Alwala

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ImageBind: One Embedding Space to Bind Them All.

Kalyan Vasudev Alwala

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cut and Learn for Unsupervised Object Detection and Instance Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Video Representations from Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

The Hidden Uniform Cluster Prior in Self-Supervised Learning.

CoRR, 2022

Multiplane NeRF-Supervised Disentanglement of Depth and Camera Pose from Videos.

Yang Fu

Xiaolong Wang

CoRR, 2022

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision.

CoRR, 2022

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments.

Randall Balestriero

Yann LeCun

CoRR, 2022

A Data-Augmentation Is Worth A Thousand Samples: Analytical Moments And Sampling-Free Training.

Randall Balestriero

Yann LeCun

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Frame Averaging for Invariant and Equivariant Network Design.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Detecting Twenty-Thousand Classes Using Image-Level Supervision.

Proceedings of the Computer Vision - ECCV 2022, 2022

Masked Siamese Networks for Label-Efficient Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

Omnivore: A Single Model for Many Visual Modalities.

Mannat Singh

Nikhila Ravi

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Masked-attention Mask Transformer for Universal Image Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scaling up Instance Segmentation using Approximately Localized Phrases.

Karan Desai

Justin Johnson

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021

Mask2Former for Video Instance Segmentation.

CoRR, 2021

Self-supervised Pretraining of Visual Features in the Wild.

CoRR, 2021

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers.

Christoph Feichtenhofer

Andrea Vedaldi

João F. Henriques

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Barlow Twins: Self-Supervised Learning via Redundancy Reduction.

Proceedings of the 38th International Conference on Machine Learning, 2021

Self-Supervised Pretraining of 3D Features on any Point-Cloud.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

An End-to-End Transformer Model for 3D Object Detection.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

MDETR - Modulated Detection for End-to-End Multi-Modal Understanding.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Emerging Properties in Self-Supervised Vision Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

3D Spatial Recognition Without Spatially Labeled 3D.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Audio-Visual Instance Discrimination with Cross-Modal Agreement.

Pedro Morgado

Nuno Vasconcelos

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Audio-Visual Instance Discrimination.

Pedro Morgado

Nuno Vasconcelos

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Can Temporal Information Help with Contrastive Self-Supervised Learning?

CoRR, 2020

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

ClusterFit: Improving Generalization of Visual Representations.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Self-Supervised Learning of Pretext-Invariant Representations.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

In Defense of Grid Features for Visual Question Answering.

Huaizu Jiang

Marcus Rohrbach

Erik G. Learned-Miller

Xinlei Chen

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Binary Image Selection (BISON): Interpretable Evaluation of Visual Grounding.

Hexiang Hu

CoRR, 2019

Evaluating Text-to-Image Matching using Binary Image Selection (BISON).

Hexiang Hu

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

3D-RelNet: Joint Object and Relational Network for 3D Prediction.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Scaling and Benchmarking Self-Supervised Visual Representation Learning.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Does Object Recognition Work for Everyone?

Terrance DeVries

Changhan Wang

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2018

Mainstream: Dynamic Stem-Sharing for Multi-Tenant Video Processing.

Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Learning by Asking Questions.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection.

Debidatta Dwibedi

Proceedings of the IEEE International Conference on Computer Vision, 2017

From Red Wine to Red Tomato: Composition with Context.

Abhinav Gupta

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Generating Natural Questions About an Image.

CoRR, 2016

Unsupervised Learning using Sequential Verification for Action Recognition.

C. Lawrence Zitnick

CoRR, 2016

Visual Storytelling.

Ting-Hao (Kenneth) Huang

Proceedings of the NAACL HLT 2016, 2016

Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification.

C. Lawrence Zitnick

Proceedings of the Computer Vision - ECCV 2016, 2016

Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Cross-Stitch Networks for Multi-task Learning.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Generating Natural Questions About an Image.

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015

Applying artificial vision models to human scene understanding.

Elissa Michele Aminoff

Frontiers Comput. Neurosci., 2015

Learning Visual Classifiers using Human-centric Annotations.

CoRR, 2015

Watch and learn: Semi-supervised learning of object detectors from videos.

Abhinav Shrivastava

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

Data-driven exemplar model selection.

Abhinav Shrivastava

Sivaramakrishna Bharadwaj

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

2013

CPU and/or GPU: Revisiting the GPU Vs. CPU Myth.

CoRR, 2013

2011

Hybrid implementation of error diffusion dithering.

Aditya Deshpande