Yue Cao

Orcid: 0000-0002-1679-0444

Affiliations:

Microsoft Research Asia, Beijing, China
Tsinghua University, School of Software, Beijing, China (PhD 2019)

According to our database, Yue Cao authored at least 62 papers between 2015 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Masked Channel Modeling for Bootstrapping Visual Pre-training.

Int. J. Comput. Vis., February, 2025

2024

Correlation-Embedded Transformer Tracking: A Single-Branch Framework.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

EVA-02: A visual representation for neon genesis.

Image Vis. Comput., 2024

2023

Global Context Networks.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

SegGPT: Segmenting Everything In Context.

CoRR, 2023

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models.

CoRR, 2023

EVA-CLIP: Improved Training Techniques for CLIP at Scale.

CoRR, 2023

Improving CLIP Fine-tuning Performance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SegGPT: Towards Segmenting Everything In Context.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Deep Incubation: Training Large Models by Divide-and-Conquering.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

On Data Scaling in Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Revealing the Dark Secrets of Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-training for Visual Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Images Speak in Images: A Generalist Painter for In-Context Visual Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Deep Model Assembling.

CoRR, 2022

Could Giant Pretrained Image Models Extract Universal Representations?

CoRR, 2022

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation.

CoRR, 2022

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition.

CoRR, 2022

Could Giant Pre-trained Image Models Extract Universal Representations?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model.

Proceedings of the Computer Vision - ECCV 2022, 2022

A Simple Approach and Benchmark for 21,000-Category Object Detection.

Proceedings of the Computer Vision - ECCV 2022, 2022

Correlation-Aware Deep Tracking.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SimMIM: a Simple Framework for Masked Image Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Video Swin Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Swin Transformer V2: Scaling Up Capacity and Resolution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Cleaning timestamps with temporal constraints.

VLDB J., 2021

A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model.

CoRR, 2021

Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning.

CoRR, 2021

Self-Supervised Learning with Swin Transformers.

CoRR, 2021

Bootstrap Your Object Detector via Mixed Training.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Leveraging Batch Normalization for Vision Transformers.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Group-Free 3D Object Detection via Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Cross-Iteration Batch Normalization.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

RepPoints v2: Verification Meets Regression for Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Parametric Instance Classification for Unsupervised Visual Feature learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Disentangled Non-local Neural Networks.

Proceedings of the Computer Vision - ECCV 2020, 2020

A Closer Look at Local Aggregation Operators in Point Cloud Analysis.

Proceedings of the Computer Vision - ECCV 2020, 2020

Negative Margin Matters: Understanding Margin in Few-Shot Classification.

Proceedings of the Computer Vision - ECCV 2020, 2020

Memory Enhanced Global-Local Aggregation for Video Object Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Transferable Representation Learning with Deep Adaptation Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Spatial-Temporal Relation Networks for Multi-Object Tracking.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Maximum-Margin Hamming Hashing.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Deep Triplet Quantization.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Cross-Modal Hamming Hashing.

Proceedings of the Computer Vision - ECCV 2018, 2018

HashGAN: Deep Learning to Hash With Pair Conditional Wasserstein GAN.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep Cauchy Hashing for Hamming Space Retrieval.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Unsupervised Domain Adaptation With Distribution Matching Machines.

Yue Cao, Mingsheng Long, Jianmin Wang

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Deep Visual-Semantic Quantization for Efficient Image Retrieval.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Correlation Hashing Network for Efficient Cross-Modal Retrieval.

Yue Cao, Mingsheng Long, Jianmin Wang

Proceedings of the British Machine Vision Conference 2017, 2017

Collective Deep Quantization for Efficient Cross-Modal Retrieval.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Deep Learning of Transferable Representation for Scalable Domain Adaptation.

IEEE Trans. Knowl. Data Eng., 2016

Cleaning Timestamps with Temporal Constraints.

Shaoxu Song, Yue Cao, Jianmin Wang

Proc. VLDB Endow., 2016

Composite Correlation Quantization for Efficient Multimodal Retrieval.

Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Correlation Autoencoder Hashing for Supervised Cross-Modal Search.

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Deep Visual-Semantic Hashing for Cross-Modal Retrieval.

Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Deep Hashing Network for Efficient Similarity Retrieval.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Deep Quantization Network for Efficient Image Retrieval.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Learning Transferable Features with Deep Adaptation Networks.

Proceedings of the 32nd International Conference on Machine Learning, 2015
