Boshen Xu

Orcid: 0009-0000-1896-9600

According to our database¹, Boshen Xu authored at least 16 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video.

CoRR, May, 2026

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation.

CoRR, February, 2026

An Integrated Wearable Electromagnetic Sensing System with Wireless Vector Readout for Noninvasive Glucose Monitoring.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2026

2025

Xiaomi MiMo-VL-Miloco Technical Report.

CoRR, December, 2025

TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding.

CoRR, November, 2025

REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding.

CoRR, November, 2025

TimeZero: Temporal Video Grounding with Reasoning-Guided LVLM.

CoRR, March, 2025

EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SPAFormer: Sequential 3D Part Assembly with Transformers.

Boshen Xu, Sipeng Zheng, Qin Jin

Proceedings of the International Conference on 3D Vision, 2025

2024

EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?

CoRR, 2024

Unveiling Visual Biases in Audio-Visual Localization Benchmarks.

Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

2023

POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-view World.

Boshen Xu, Sipeng Zheng, Qin Jin

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Open-Category Human-Object Interaction Pre-training via Language Modeling Framework.

Sipeng Zheng, Boshen Xu, Qin Jin

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2021

Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report.

CoRR, 2021
