Zhaobo Qi

Orcid: 0000-0001-9196-9818

According to our database, Zhaobo Qi authored at least 27 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Distinguishing semantically similar queries in temporal video grounding via LLM-generated query.
Multim. Syst., April, 2026

STAND: Semantic Anchoring Constraint with Dual-Granularity Disambiguation for Remote Sensing Image Change Captioning.
CoRR, April, 2026

SinkRouter: Sink-Aware Routing for Efficient Long-Context Decoding in Large Language and Multimodal Models.
CoRR, April, 2026

TowerDataset: A Heterogeneous Benchmark for Transmission Corridor Segmentation with a Global-Local Fusion Framework.
CoRR, April, 2026

Multimodal-guided mixture-of-experts bias removal strategy for natural language video localization.
Multim. Syst., February, 2026

AIFIND: Artifact-Aware Interpreting Fine-Grained Alignment for Incremental Face Forgery Detection.
Proceedings of the 2026 International Conference on Multimedia Retrieval, 2026

2025
VPA: Multi-Modal Virtual Point Augmentation for 3D Object Detection.
IEEE Trans. Circuits Syst. Video Technol., December, 2025

Multi-Modal 3D Object Detector with Object-Guided Fusion and Hierarchical Sample Selection.
ACM Trans. Multim. Comput. Commun. Appl., August, 2025

KN-VLM: KNowledge-guided Vision-and-Language Model for visual abductive reasoning.
Multim. Syst., April, 2025

Dual-guided multi-modal bias removal strategy for temporal sentence grounding in video.
Multim. Syst., April, 2025

Uncertainty-Aware Mixture of Experts for Video Action Anticipation.
IEEE Trans. Circuits Syst. Video Technol., 2025

Matching Street View and Satellite Images via Drone Imagery and Semantic Descriptions.
Proceedings of the 3rd International Workshop on UAVs in Multimedia: Capturing the World from a New Perspective, 2025

Combatting Data Imbalance and Noise in Micro-Action Recognition.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Enhancing Pre-trained Representation Classifiability can Boost its Interpretability.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Video Language Model Pretraining with Spatio-temporal Masking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Uncertainty-Boosted Robust Video Activity Anticipation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Collaborative Debias Strategy for Temporal Sentence Grounding in Video.
IEEE Trans. Circuits Syst. Video Technol., November, 2024

Improving Sequential DeepFake Detection with Local information enhancement.
Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024

Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Temporal Dynamic Concept Modeling Network for Explainable Video Event Recognition.
ACM Trans. Multim. Comput. Commun. Appl., November, 2023

Self-Regulated Learning for Egocentric Video Activity Anticipation.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Semantic-Aware Dynamic Feature Selection and Fusion for Object Detection in UAV Videos.
Proceedings of the ACM Multimedia Asia 2023, 2023

2020
Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards More Explainability: Concept Knowledge Mining Network for Event Recognition.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

