Yi Yang

Affiliations:

Google DeepMind, London, UK
Baidu Research, Institute of Deep Learning, Sunnyvale, CA, USA
University of California Irvine, CA, USA (PhD 2013)

According to our database, Yi Yang authored at least 45 papers between 2010 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Recurrent Video Masked Autoencoders.

CoRR, December, 2025

TAPNext: Tracking Any Point (TAP) as Next Token Prediction.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

From Image to Video: An Empirical Study of Diffusion Representations.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

2024

Scaling 4D Representations.

CoRR, 2024

Moving Off-the-Grid: Scene-Grounded Video Representations.

Sjoerd van Steenkiste

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TAPVid-3D: A Benchmark for Tracking Any Point in 3D.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Deep SE(3)-Equivariant Geometric Reasoning for Precise Placement Tasks.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Learning from One Continuous Video Stream.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

BootsTAP: Bootstrapped Training for Tracking-Any-Point.

Proceedings of the Computer Vision - ACCV 2024, 2024

2023

TacticAI: an AI assistant for football tactics.

CoRR, 2023

Perception Test: A Diagnostic Benchmark for Multimodal Video Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

TAP-Vid: A Benchmark for Tracking Any Point in a Video.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2020

Large-scale multilingual audio visual dubbing.

CoRR, 2020

2019

Feedback Convolutional Neural Network for Visual Localization and Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

A Refined 3D Pose Dataset for Fine-Grained Object Categories.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Recognizing Part Attributes With Insufficient Data.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Depth-Based Hand Pose Estimation: Methods, Data, and Challenges.

James Steven Supancic III

Int. J. Comput. Vis., 2018

Zero-Shot Transfer VQA Dataset.

CoRR, 2018

Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories.

CoRR, 2018

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos.

CoRR, 2018

3D Pose Estimation for Fine-Grained Object Categories.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Occlusion Aware Unsupervised Learning of Optical Flow.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Occlusion Aware Unsupervised Learning of Optical Flow.

CoRR, 2017

Unsupervised Learning Layers for Video Analysis.

CoRR, 2017

Dynamic Computational Time for Visual Attention.

CoRR, 2017

Dynamic Computational Time for Visual Attention.

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2016

Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

CNN-RNN: A Unified Framework for Multi-label Image Classification.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Attention to Scale: Scale-Aware Semantic Image Segmentation.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN).

Proceedings of the 3rd International Conference on Learning Representations, 2015

Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images.

CoRR, 2015

Depth-Based Hand Pose Estimation: Data, Methods, and Challenges.

James Steven Supancic III

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning Like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

2014

Explain Images with Multimodal Recurrent Neural Networks.

CoRR, 2014

AutoCaption: Automatic caption generation for personal photos.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Parsing Occluded People.

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Articulated Human Detection with Flexible Mixtures of Parts.

Yi Yang

Deva Ramanan

IEEE Trans. Pattern Anal. Mach. Intell., 2013

2012

Layered Object Models for Image Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., 2012

Recognizing proxemics in personal photos.

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

Articulated pose estimation with flexible mixtures-of-parts.

Yi Yang

Deva Ramanan

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010

Layered object detection for multi-class segmentation.

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010
