Ting Yao

Orcid: 0000-0001-7587-101X

Affiliations:
  • JD Explore Academy, Beijing, China


According to our database, Ting Yao authored at least 200 papers between 2010 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Interactive Conversational Head Generation.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Creatively Upscaling Images with Global-Regional Priors.
Int. J. Comput. Vis., August, 2025

Visual Autoregressive Modeling for Instruction-Guided Image Editing.
CoRR, August, 2025

Kernel Masked Image Modeling Through the Lens of Theoretical Understanding.
IEEE Trans. Neural Networks Learn. Syst., July, 2025

DreamJourney: Perpetual View Generation with Video Diffusion Models.
CoRR, June, 2025

Teaching Masked Autoencoder With Strong Augmentations.
IEEE Trans. Neural Networks Learn. Syst., May, 2025

HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer.
CoRR, May, 2025

Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots.
CoRR, May, 2025

Exploring Vision-Language Foundation Model for Novel Object Captioning.
IEEE Trans. Circuits Syst. Video Technol., January, 2025

Stream-ViT: Learning Streamlined Convolutions in Vision Transformer.
IEEE Trans. Multim., 2025

Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MotionPro: A Precise Motion Controller for Image-to-Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Visualizing and Understanding Patch Interactions in Vision Transformer.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

HIRI-ViT: Scaling Vision Transformer With High Resolution Inputs.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

Explaining Cross-domain Recognition with Interpretable Deep Classifier.
ACM Trans. Multim. Comput. Commun. Appl., March, 2024

End-to-End Video Scene Graph Generation With Temporal Propagation Transformer.
IEEE Trans. Multim., 2024

Learning Temporal Dynamics in Videos With Image Transformer.
IEEE Trans. Multim., 2024

Bidirectional Knowledge Reconfiguration for Lightweight Point Cloud Analysis.
IEEE Trans. Multim., 2024

Learning 3D Shape Latent for Point Cloud Completion.
IEEE Trans. Multim., 2024

A Closer Look at the Reflection Formulation in Single Image Reflection Removal.
IEEE Trans. Image Process., 2024

SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer.
CoRR, 2024

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM.
CoRR, 2024

FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Improving Virtual Try-On with Garment-Focused Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning.
Proceedings of the Computer Vision - ECCV 2024, 2024

VideoStudio: Generating Consistent-Content and Multi-scene Videos.
Proceedings of the Computer Vision - ECCV 2024, 2024

Improving Text-Guided Object Inpainting with Semantic Pre-inpainting.
Proceedings of the Computer Vision - ECCV 2024, 2024

SD-DiT: Unleashing the Power of Self-Supervised Discrimination in Diffusion Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Dual Vision Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

Lightweight and Progressively-Scalable Networks for Semantic Segmentation.
Int. J. Comput. Vis., August, 2023

Bi-calibration Networks for Weakly-Supervised Video Representation Learning.
Int. J. Comput. Vis., July, 2023

A Low Rank Promoting Prior for Unsupervised Contrastive Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Boosting Scene Graph Generation with Visual Relation Saliency.
ACM Trans. Multim. Comput. Commun. Appl., January, 2023

Boosting Vision-and-Language Navigation with Direction Guiding and Backtracing.
ACM Trans. Multim. Comput. Commun. Appl., January, 2023

Bottom-up and Top-down Object Inference Networks for Image Captioning.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Boosting Relationship Detection in Images with Multi-Granular Self-Supervised Learning.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Boosting Generic Visual-Linguistic Representation With Dynamic Contexts.
IEEE Trans. Multim., 2023

Prototypical Matching Networks for Video Object Segmentation.
IEEE Trans. Image Process., 2023

Contextual Transformer Networks for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Selective Volume Mixup for Video Action Recognition.
CoRR, 2023

Deep Equilibrium Multimodal Fusion.
CoRR, 2023

Visual-Aware Text-to-Speech.
CoRR, 2023

Learning and Evaluating Human Preferences for Conversational Head Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CARIS: Context-Aware Referring Image Segmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Control3D: Towards Controllable Text-to-3D Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning Neural Implicit Surfaces with Object-Aware Radiance Fields.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Visual-Aware Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

HGNet: Learning Hierarchical Geometry from Points, Edges, and Surfaces.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

3D Human Pose Estimation with Spatio-Temporal Criss-Cross Attention.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Modality-Agnostic Debiasing for Single Domain Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Semantic-Conditional Diffusion Networks for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AnchorFormer: Point Cloud Completion from Discriminative Nodes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning to Generate Language-Supervised and Open-Vocabulary Scene Graph Using Pre-Trained Visual-Semantic Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
msr-vtt.
Dataset, December, 2022

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Unpaired Image Captioning With semantic-Constrained Self-Learning.
IEEE Trans. Multim., 2022

3D Cascade RCNN: High Quality Object Detection in Point Clouds.
IEEE Trans. Image Process., 2022

Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization.
CoRR, 2022

Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation.
CoRR, 2022

Generalized One-shot Domain Adaption of Generative Adversarial Networks.
CoRR, 2022

Dual Vision Transformer.
CoRR, 2022

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation.
CoRR, 2022

Motion-Focused Contrastive Learning of Video Representations.
CoRR, 2022

Contextual and selective attention networks for image captioning.
Sci. China Inf. Sci., 2022

Generalized One-shot Domain Adaptation of Generative Adversarial Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Out-of-Distribution Detection via Conditional Kernel Independence Model.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Responsive Listening Head Generation: A Benchmark Dataset and Baseline.
Proceedings of the Computer Vision - ECCV 2022, 2022

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dynamic Temporal Filtering in Video Models.
Proceedings of the Computer Vision - ECCV 2022, 2022

Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Stand-Alone Inter-Frame Attention in Video Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Comprehending and Ordering Semantics for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

3D-Producer: A Hybrid and User-Friendly 3D Reconstruction System.
Proceedings of the Artificial Intelligence - Second CAAI International Conference, 2022

2021
Smart Director: An Event-Driven Directing System for Live Broadcasting.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Single Shot Video Object Detector.
IEEE Trans. Multim., 2021

MINet: Meta-Learning Instance Identifiers for Video Object Detection.
IEEE Trans. Image Process., 2021

Noise Augmented Double-Stream Graph Convolutional Networks for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2021

A Style and Semantic Memory Mechanism for Domain Generalization.
CoRR, 2021

A Low Rank Promoting Prior for Unsupervised Contrastive Learning.
CoRR, 2021

Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Transferrable Contrastive Learning for Visual Domain Adaptation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Optimization Planning for 3D ConvNets.
Proceedings of the 38th International Conference on Machine Learning, 2021

Core-Text: Improving Scene Text Detection with Contrastive Relational Reasoning.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Condensing a Sequence to One Informative Frame for Video Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Motion-Focused Contrastive Learning of Video Representations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Style and Semantic Memory Mechanism for Domain Generalization.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Boosting Video Representation Learning With Multi-Faceted Integration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Representing Videos As Discriminative Sub-Graphs for Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Coarse-to-Fine Localization of Temporal Action Proposals.
IEEE Trans. Multim., 2020

Deep Metric Learning With Density Adaptivity.
IEEE Trans. Multim., 2020

Pre-training for Video Captioning Challenge 2020 Summary.
CoRR, 2020

Joint Contrastive Learning with Infinite Possibilities.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

iDirector: An Intelligent Directing System for Live Broadcast.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Exploring Depth Information for Spatial Relation Recognition.
Proceedings of the 3rd IEEE Conference on Multimedia Information Processing and Retrieval, 2020

Learning to Localize Actions from Moments.
Proceedings of the Computer Vision - ECCV 2020, 2020

Transferring and Regularizing Prediction for Semantic Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

X-Linear Attention Networks for Image Captioning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning a Unified Sample Weighting Network for Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Deep Learning-Based Multimedia Analytics: A Review.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Editorial to Special Issue on Deep Learning for Intelligent Multimedia Analytics.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Learning Click-Based Deep Structure-Preserving Embeddings with Visual Attention.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Unified Spatio-Temporal Attention Networks for Action Recognition in Videos.
IEEE Trans. Multim., 2019

See and chat: automatically generating viewer-level comments on images.
Multim. Tools Appl., 2019

Vision and Language: from Visual Perception to Content Creation.
CoRR, 2019

Multi-Source Domain Adaptation and Semi-Supervised Domain Adaptation with Focus on Visual Domain Adaptation Challenge 2019.
CoRR, 2019

Scheduled Differentiable Architecture Search for Visual Recognition.
CoRR, 2019

vireoJD-MM at Activity Detection in Extended Videos.
CoRR, 2019

Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019.
CoRR, 2019

VireoJD-MM @ TRECVid 2019: Activities in Extended Video (ActEV).
Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Long Short-Term Relation Networks for Video Action Detection.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Animating Your Life: Real-Time Video-to-Animation Translation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Mocycle-GAN: Unpaired Video-to-Video Translation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Learning for Video Captioning: A Review.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Hierarchy Parsing for Image Captioning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Relation Distillation Networks for Video Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Customizable Architecture Search for Semantic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Spatio-Temporal Representation With Local and Global Diffusion.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Transferrable Prototypical Networks for Unsupervised Domain Adaptation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Gaussian Temporal Awareness Networks for Action Localization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Pointing Novel Objects in Image Captioning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Exploring Object Relation in Mean Teacher for Cross-Domain Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Learning Deep Spatio-Temporal Dependence for Semantic Video Segmentation.
IEEE Trans. Multim., 2018

Exploiting Web Images for Video Highlight Detection With Triplet Deep Ranking.
IEEE Trans. Multim., 2018

Boosting image sentiment analysis with visual attention.
Neurocomputing, 2018

Deep Domain Adaptation Hashing with Adversarial Learning.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Greedy Layer-Wise Training of Long Short Term Memory Networks.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Exploring Visual Relationship for Image Captioning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Recurrent Tubelet Proposal and Recognition Networks for Action Detection.
Proceedings of the Computer Vision - ECCV 2018, 2018

Fully Convolutional Adaptation Networks for Semantic Segmentation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Jointly Localizing and Describing Events for Dense Video Captioning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Memory Matching Networks for One-Shot Image Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep learning for video classification and captioning.
Proceedings of the Frontiers of Multimedia Research, 2018

2017
Detecting shot boundary with sparse coding for video summarization.
Neurocomputing, 2017

Learning hierarchical video representation for action recognition.
Int. J. Multim. Inf. Retr., 2017

Deep Semantic Hashing with Generative Adversarial Networks.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Seeing Bot.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Learning Multimodal Attention LSTM Networks for Video Captioning.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

To Create What You Tell: Generating Videos from Captions.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Boosting Image Captioning with Attributes.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Quantization: Encoding Convolutional Activations with Deep Generative Model.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Video Captioning with Transferred Semantic Attributes.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Deep Learning for Video Classification and Captioning.
CoRR, 2016

Multi-Scale Triplet CNN for Person Re-Identification.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Share-and-Chat: Achieving Human-Level Video Commenting by Search and Multi-View Embedding.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Video ChatBot: Triggering Live Social Interactions by Automatic Video Commenting.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Deep Semantic-Preserving and Ranking-Based Hashing for Image Retrieval.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Learning Deep Intrinsic Video Representation by Exploring Temporal Coherence and Graph Structure.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Highlight Detection with Pairwise Deep Ranking for First-Person Video Summarization.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

MSR-VTT: A Large Video Description Dataset for Bridging Video and Language.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Jointly Modeling Embedding and Translation to Bridge Video and Language.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Click-boosting multi-modality graph-based reranking for image search.
Multim. Syst., 2015

Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Learning Query and Image Similarities with Ranking Canonical Correlation Analysis.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Semi-supervised Domain Adaptation with Subspace Learning for visual recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
VIREO-TNO @ TRECVID 2014: Multimedia Event Detection and Recounting (MED and MER).
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

VIREO @ TRECVID 2014: Instance Search and Semantic Indexing.
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Click-through-based cross-view learning for image search.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Click-through-based Subspace Learning for Image Search.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

2013
Circular Reranking for Visual Search.
IEEE Trans. Image Process., 2013

Unified entity search in social media community.
Proceedings of the 22nd International World Wide Web Conference, 2013

VIREO/ECNU @ TRECVID 2013: A Video Dance of Detection, Recounting and Search with Motion Relativity and Concept Learning from Wild.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Annotation for free: video tagging by mining user search behavior.
Proceedings of the ACM Multimedia Conference, 2013

Image search by graph-based label propagation with image representation from DNN.
Proceedings of the ACM Multimedia Conference, 2013

Video concept detection by learning from web images: A case study on cross domain learning.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Click-boosting random walk for image search reranking.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

2012
VIREO @ TRECVID 2012: Searching with Topology, Recounting will Small Concepts, Learning with Free Examples.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Predicting domain adaptivity: redo or recycle?
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

2011
VIREO @ TRECVID 2011: Instance Search, Semantic Indexing, Multimedia Event Detection and Known-Item Search.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Context-based friend suggestion in online photo-sharing community.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

2010
Co-reranking by mutual reinforcement for image search.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

