Junyu Gao

IEEE Trans. Pattern Anal. Mach. Intell., October, 2025

Cross-Modal Dual-Causal Learning for Long-Term Action Recognition.

CoRR, July, 2025

Learning Probabilistic Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2025

NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments.

Xuan Yao

CoRR, June, 2025

Active Cross-Modal Domain Adaptation.

IEEE Trans. Multim., 2025

2024

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding.

ACM Trans. Multim. Comput. Commun. Appl., October, 2024

Multimodal Imbalance-Aware Gradient Modulation for Weakly-Supervised Audio-Visual Video Parsing.

IEEE Trans. Circuits Syst. Video Technol., June, 2024

Feature Disentanglement Network: Multi-Object Tracking Needs More Differentiated Features.

ACM Trans. Multim. Comput. Commun. Appl., March, 2024

Learning Proposal-Aware Re-Ranking for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Circuits Syst. Video Technol., January, 2024

Learning Multi-Expert Distribution Calibration for Long-Tailed Video Classification.

IEEE Trans. Multim., 2024

Exploring Rich Semantics for Open-Set Action Recognition.

IEEE Trans. Multim., 2024

Spatiotemporal Orthogonal Projection Capsule Network for Incremental Few-Shot Action Recognition.

Yangbo Feng

IEEE Trans. Multim., 2024

Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation.

Xinhong Ma

IEEE Trans. Image Process., 2024

A Comprehensive Survey on Evidential Deep Learning and Its Applications.

CoRR, 2024

Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Open-Vocabulary Video Scene Graph Generation via Union-aware Semantic Alignment.

Ziyue Wu

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation.

Xuan Yao

Proceedings of the Forty-first International Conference on Machine Learning, 2024

R-EDL: Relaxing Nonessential Settings of Evidential Deep Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Vectorized Evidential Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Uncertainty-Aware Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Weakly-Supervised Video Object Grounding via Causal Intervention.

Wei Wang

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Many Hands Make Light Work: Transferring Knowledge From Auxiliary Tasks for Video-Text Retrieval.

IEEE Trans. Multim., 2023

Weakly-Supervised Video Object Grounding via Learning Uni-Modal Associations.

Wei Wang

IEEE Trans. Multim., 2023

Learning Scene-Aware Spatio-Temporal GNNs for Few-Shot Early Action Prediction.

IEEE Trans. Multim., 2023

Spatial-Temporal Exclusive Capsule Network for Open Set Action Recognition.

IEEE Trans. Multim., 2023

Learning Dual-Routing Capsule Graph Neural Network for Few-Shot Video Classification.

Yangbo Feng

IEEE Trans. Multim., 2023

Test-time Adaptive Vision-and-Language Navigation.

Xuan Yao

CoRR, 2023

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing.

CoRR, 2023

Video Entailment via Reaching a Structure-Aware Cross-modal Consensus.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Weakly-supervised Video Scene Graph Generation via Unbiased Cross-modal Learning.

Ziyue Wu

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Lite-MKD: A Multi-modal Knowledge Distillation Framework for Lightweight Few-shot Action Recognition.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Leveraging Attribute Knowledge for Open-set Action Recognition.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.

ACM Trans. Multim. Comput. Commun. Appl., 2022

The Model May Fit You: User-Generalized Cross-Modal Retrieval.

IEEE Trans. Multim., 2022

Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization.

IEEE Trans. Image Process., 2022

Learning Video Moment Retrieval Without a Single Annotated Video.

IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Semantic-Aware Spatial-Temporal Attention for Interpretable Action Recognition.

IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Muti-expert Distribution Calibration for Long-tailed Video Classification.

CoRR, 2022

Dual-Evidential Learning for Weakly-supervised Temporal Action Localization.

Proceedings of the Computer Vision - ECCV 2022, 2022

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Health Status Prediction with Local-Global Heterogeneous Behavior Graph.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Knowledge-driven Egocentric Multimodal Activity Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval.

IEEE Trans. Multim., 2021

Learning Dual-Pooling Graph Neural Networks for Few-Shot Video Classification.

IEEE Trans. Multim., 2021

Unsupervised Video Summarization via Relation-Aware Assignment Learning.

IEEE Trans. Multim., 2021

Learning to Model Relationships for Zero-Shot Video Classification.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Weakly-Supervised Video Object Grounding via Stable Context Learning.

Wei Wang

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Diving Into The Relations: Leveraging Semantic and Visual Structures For Video Moment Retrieval.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Active Universal Domain Adaptation.

Xinhong Ma

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fast Video Moment Retrieval.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

CI-GNN: Building a Category-Instance Graph for Zero-Shot Video Classification.

IEEE Trans. Multim., 2020

Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

SMART: Joint Sampling and Regression for Visual Tracking.

IEEE Trans. Image Process., 2019

Graph Convolutional Tracking.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

P2T: Part-to-Target Tracking via Deep Regression Learning.

IEEE Trans. Image Process., 2018

Watch, Think and Attend: End-to-End Video Classification via Dynamic Knowledge Evolution Modeling.

Gorthi R. K. Sai Subrahmanyam

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.

Abdelrahman Eldesokey

Gustavo Fernández

Álvaro García-Martín

Álvaro Iglesias-Arias

A. Aydin Alatan

Abel González-García

Alfredo Petrosino

Alireza Memarmoghadam

Andrea Vedaldi

Andrej Muhic

Anfeng He

Arnold W. M. Smeulders

Guilherme Sousa Bastos

Haibin Ling

Hamed Kiani Galoogahi

Jorge Rodríguez Herranz

Mario Edoardo Maresca

Martin Danelljan

Ming-Hsuan Yang

Mohamed H. Abdelpakey

Pablo Vicente-Moñivar

Rama Krishna Sai Subrahmanyam Gorthi

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

2017

Deep Relative Tracking.

IEEE Trans. Image Process., 2017

A Unified Personalized Video Recommendation via Dynamic Recurrent Neural Networks.
