Xihan Wei

According to our database, Xihan Wei authored at least 21 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning.

CoRR, September, 2025

LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs.

CoRR, June, 2025

HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context.

CoRR, June, 2025

CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization.

CoRR, May, 2025

ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding.

CoRR, April, 2025

ViSpeak: Visual Instruction Feedback in Streaming Videos.

CoRR, March, 2025

R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning.

Jiaxing Zhao, Xihan Wei, Liefeng Bo

CoRR, March, 2025

HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding.

CoRR, January, 2025

Omni-Emotion: Extending Video MLLM with Detailed Face and Audio Modeling for Multimodal Emotion Analysis.

CoRR, January, 2025

Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness.

CoRR, January, 2025

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding.

CoRR, January, 2025

A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection.

IEEE Trans. Multim., 2025

Person De-reidentification: A Variation-guided Identity Shift Modeling.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

DreamView: Injecting View-Specific Text Guidance Into Text-to-3D Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2022

Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition.

Proceedings of the Computer Vision - ECCV 2022, 2022

SP-ViT: Learning 2D Spatial Priors for Vision Transformers.

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021

Interactive Self-Training With Mean Teachers for Semi-Supervised Object Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Continual Local Replacement for Few-shot Image Recognition.

CoRR, 2020

2019

Learning Continually from Low-shot Data Stream.

CoRR, 2019
