Xi Zhou

Orcid: 0000-0001-9943-5482

Affiliations:
  • CloudWalk Technology, Shanghai, China


According to our database, Xi Zhou authored at least 26 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
COST: Contrastive one-stage transformer for vision-language small object tracking.
Inf. Fusion, 2026

2025
How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline.
CoRR, December, 2025

Boosting Nighttime UAV Tracking via Self-prompting Autoregressive Learning and a New Benchmark.
Proceedings of the Pattern Recognition and Computer Vision - 8th Chinese Conference, 2025

MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
High-compressed deepfake video detection with contrastive spatiotemporal distillation.
Neurocomputing, January, 2024

Multi-Level Signal Fusion for Enhanced Weakly-Supervised Audio-Visual Video Parsing.
IEEE Signal Process. Lett., 2024

Point Spatio-Temporal Pyramid Network for Point Cloud Video Understanding.
IEEE Signal Process. Lett., 2024

Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2.
CoRR, 2024

Awesome Multi-modal Object Tracking.
CoRR, 2024

WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Video Moment Retrieval via Comprehensive Relation-Aware Network.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

AVForensics: Audio-driven Deepfake Video Detection with Masking Strategy in Self-supervision.
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Audio-Driven Talking Head Video Generation with Diffusion Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Exploiting Multi-modal Fusion for Robust Face Representation Learning with Missing Modality.
Proceedings of the Artificial Neural Networks and Machine Learning, 2023

PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Efficient Video Grounding With Which-Where Reading Comprehension.
IEEE Trans. Circuits Syst. Video Technol., 2022

You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model Pretraining.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Self-Guided Body Part Alignment With Relation Transformers for Occluded Person Re-Identification.
IEEE Signal Process. Lett., 2021

Skeleton-Based Action Recognition With Focusing-Diffusion Graph Convolutional Networks.
IEEE Signal Process. Lett., 2021

Relation-aware Video Reading Comprehension for Temporal Language Grounding.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020
Receptive Multi-Granularity Representation for Person Re-Identification.
IEEE Trans. Image Process., 2020

Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Focusing and Diffusion: Bidirectional Attentive Graph Convolutional Networks for Skeleton-based Action Recognition.
CoRR, 2019

