Junyu Gao

Orcid: 0000-0002-8105-5497

Affiliations:

Chinese Academy of Sciences, Institute of Automation, National Lab of Pattern Recognition, Beijing, China
University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, China

According to our database¹, Junyu Gao authored at least 73 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Links

Online presence:

on orcid.org
on ieeexplore.ieee.org

Bibliography

2026

A Comprehensive Survey on Evidential Deep Learning and its Applications.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2026

Dual-level Adaptation for Multi-Object Tracking: Building Test-Time Calibration from Experience and Intuition.

CoRR, March, 2026

History-Guided Prompt Generation for Vision-and-Language Navigation.

IEEE Trans. Cybern., February, 2026

HulluEdit: Single-Pass Evidence-Consistent Subspace Editing for Mitigating Hallucinations in Large Vision-Language Models.

CoRR, February, 2026

2025

Revisiting Essential and Nonessential Settings of Evidential Deep Learning.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2025

Learning Probabilistic Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2025

Active Cross-Modal Domain Adaptation.

IEEE Trans. Multim., 2025

R²A²-MoE: Ridge Regression-Based Analytic Adaptation with Mixture of Experts for Continual Learning with Vision-Language Models.

Proceedings of the Pattern Recognition and Computer Vision - 8th Chinese Conference, 2025

Cross-Modal Dual-Causal Learning for Long-Term Action Recognition.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Learning Evidential Delta Denoising Scores for Video Editing.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Building Embodied EvoAgent: A Brain-inspired Paradigm for Bridging Multimodal Large Models and World Models.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Evidential Knowledge Distillation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

2024

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding.

ACM Trans. Multim. Comput. Commun. Appl., October, 2024

Multimodal Imbalance-Aware Gradient Modulation for Weakly-Supervised Audio-Visual Video Parsing.

IEEE Trans. Circuits Syst. Video Technol., June, 2024

Feature Disentanglement Network: Multi-Object Tracking Needs More Differentiated Features.

ACM Trans. Multim. Comput. Commun. Appl., March, 2024

Learning Proposal-Aware Re-Ranking for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Circuits Syst. Video Technol., January, 2024

Learning Multi-Expert Distribution Calibration for Long-Tailed Video Classification.

IEEE Trans. Multim., 2024

Exploring Rich Semantics for Open-Set Action Recognition.

IEEE Trans. Multim., 2024

Spatiotemporal Orthogonal Projection Capsule Network for Incremental Few-Shot Action Recognition.

IEEE Trans. Multim., 2024

Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation.

IEEE Trans. Image Process., 2024

Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Open-Vocabulary Video Scene Graph Generation via Union-aware Semantic Alignment.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

R-EDL: Relaxing Nonessential Settings of Evidential Deep Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Vectorized Evidential Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Uncertainty-Aware Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Weakly-Supervised Video Object Grounding via Causal Intervention.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Many Hands Make Light Work: Transferring Knowledge From Auxiliary Tasks for Video-Text Retrieval.

IEEE Trans. Multim., 2023

Weakly-Supervised Video Object Grounding via Learning Uni-Modal Associations.

IEEE Trans. Multim., 2023

Learning Scene-Aware Spatio-Temporal GNNs for Few-Shot Early Action Prediction.

IEEE Trans. Multim., 2023

Spatial-Temporal Exclusive Capsule Network for Open Set Action Recognition.

IEEE Trans. Multim., 2023

Learning Dual-Routing Capsule Graph Neural Network for Few-Shot Video Classification.

IEEE Trans. Multim., 2023

Test-time Adaptive Vision-and-Language Navigation.

CoRR, 2023

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing.

CoRR, 2023

Video Entailment via Reaching a Structure-Aware Cross-modal Consensus.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Weakly-supervised Video Scene Graph Generation via Unbiased Cross-modal Learning.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Lite-MKD: A Multi-modal Knowledge Distillation Framework for Lightweight Few-shot Action Recognition.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Leveraging Attribute Knowledge for Open-set Action Recognition.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.

ACM Trans. Multim. Comput. Commun. Appl., 2022

The Model May Fit You: User-Generalized Cross-Modal Retrieval.

IEEE Trans. Multim., 2022

Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization.

IEEE Trans. Image Process., 2022

Learning Video Moment Retrieval Without a Single Annotated Video.

IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Semantic-Aware Spatial-Temporal Attention for Interpretable Action Recognition.

IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Muti-expert Distribution Calibration for Long-tailed Video Classification.

CoRR, 2022

Dual-Evidential Learning for Weakly-supervised Temporal Action Localization.

Proceedings of the Computer Vision - ECCV 2022, 2022

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Health Status Prediction with Local-Global Heterogeneous Behavior Graph.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Knowledge-driven Egocentric Multimodal Activity Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval.

IEEE Trans. Multim., 2021

Learning Dual-Pooling Graph Neural Networks for Few-Shot Video Classification.

IEEE Trans. Multim., 2021

Unsupervised Video Summarization via Relation-Aware Assignment Learning.

IEEE Trans. Multim., 2021

Learning to Model Relationships for Zero-Shot Video Classification.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Weakly-Supervised Video Object Grounding via Stable Context Learning.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Diving Into The Relations: Leveraging Semantic and Visual Structures For Video Moment Retrieval.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Active Universal Domain Adaptation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fast Video Moment Retrieval.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

CI-GNN: Building a Category-Instance Graph for Zero-Shot Video Classification.

IEEE Trans. Multim., 2020

Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

SMART: Joint Sampling and Regression for Visual Tracking.

IEEE Trans. Image Process., 2019

Graph Convolutional Tracking.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

P2T: Part-to-Target Tracking via Deep Regression Learning.

IEEE Trans. Image Process., 2018

Watch, Think and Attend: End-to-End Video Classification via Dynamic Knowledge Evolution Modeling.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.

Michael Felsberg, Roman P. Pflugfelder, Luka Cehovin Zajc, Abdelrahman Eldesokey, Gustavo Fernández, Álvaro García-Martín, Álvaro Iglesias-Arias, A. Aydin Alatan, Abel González-García, Alfredo Petrosino, Alireza Memarmoghadam, Arnold W. M. Smeulders, Asanka G. Perera, Changzhen Xiong, Efstratios Gavves, Erik Velasco-Salido, Fahad Shahbaz Khan, Francesco Battistone, Gorthi R. K. Sai Subrahmanyam, Guilherme Sousa Bastos, Hamed Kiani Galoogahi, Horst Possegger, Hyung Jin Chang, Isabela Drummond, Jaime Spencer Martin, Javaan Singh Chahl, Joakim Johnander, João F. Henriques, Joost van de Weijer, Jorge Rodríguez Herranz, José M. Martínez, Luca Bertinetto, Mario Edoardo Maresca, Martin Danelljan, Ming-Hsuan Yang, Mohamed H. Abdelpakey, Mohamed Shehata, Pablo Vicente-Moñivar, Philip H. S. Torr, Priya Mariam Raju, Rafael Martin Nieto, Rama Krishna Sai Subrahmanyam Gorthi, Richard M. Everson, Shuangping Huang, Stuart Golodetz, Vincenzo Santopietro, Yiannis Demiris, et al.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

2017

Deep Relative Tracking.

IEEE Trans. Image Process., 2017

A Unified Personalized Video Recommendation via Dynamic Recurrent Neural Networks.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

The Visual Object Tracking VOT2017 Challenge Results.

Michael Felsberg, Roman P. Pflugfelder, Luka Cehovin Zajc, Abdelrahman Eldesokey, Gustavo Fernández, Álvaro García-Martín, Alfredo Petrosino, Alireza Memarmoghadam, Antoine Manzanera, A. Aydin Alatan, Erik Velasco-Salido, Fahad Shahbaz Khan, Francesco Battistone, Gorthi R. K. Sai Subrahmanyam, Guilherme Sousa Bastos, Guna Seetharaman, Hongliang Zhang, Isabela Drummond, João F. Henriques, José M. Martínez, Kannappan Palaniappan, Luca Bertinetto, Mahdieh Poostchi, Martin Danelljan, Matthias Mueller, Ming-Hsuan Yang, Pallavi M. Venugopal, Philip H. S. Torr, Rafael Martin Nieto, Stuart Golodetz, Vincenzo Santopietro, et al.

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2016

The Visual Object Tracking VOT2016 Challenge Results.

Michael Felsberg, Roman P. Pflugfelder, Gustavo Fernández, Alfredo Petrosino, Alireza Memarmoghadam, Álvaro García-Martín, Andrés Solís Montero, Andreas Robinson, Anton Varfolomieiev, A. Aydin Alatan, Brais Martínez, Chang-Ming Chang, Fahad Shahbaz Khan, Francesco Battistone, Gorthi R. K. Sai Subrahmanyam, Guilherme Sousa Bastos, Guna Seetharaman, Horst Possegger, Hyung Jin Chang, Isabela Drummond, João F. Henriques, José M. Martínez, Kannappan Palaniappan, Krystian Mikolajczyk, Luca Bertinetto, Madan Kumar Rapuru, Mahdieh Poostchi, Mario Edoardo Maresca, Martin Danelljan, Matthias Mueller, Michel F. Valstar, Muhammad Haris Khan, Noor Al-Shakarji, Philip H. S. Torr, Rafael Martin Nieto, Rengarajan Pelapur, Robert Laganière, Sebastian Bernd Krah, Shengping Zhang, Stuart Golodetz, Sumithra Kakanuru, Thomas Mauthner, Tony P. Pridmore, Vincenzo Santopietro, Wolfgang Hübner, Yiannis Demiris, et al.

Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016
