Boshen Xu

Orcid: 0009-0000-1896-9600

According to our database, Boshen Xu authored at least 16 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video.
CoRR, May, 2026

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation.
CoRR, February, 2026

An Integrated Wearable Electromagnetic Sensing System with Wireless Vector Readout for Noninvasive Glucose Monitoring.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2026

2025
Xiaomi MiMo-VL-Miloco Technical Report.
CoRR, December, 2025

TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding.
CoRR, November, 2025

REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding.
CoRR, November, 2025

TimeZero: Temporal Video Grounding with Reasoning-Guided LVLM.
CoRR, March, 2025

EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SPAFormer: Sequential 3D Part Assembly with Transformers.
Proceedings of the International Conference on 3D Vision, 2025

2024
EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?
CoRR, 2024

Unveiling Visual Biases in Audio-Visual Localization Benchmarks.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

2023
POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-view World.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Open-Category Human-Object Interaction Pre-training via Language Modeling Framework.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2021
Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report.
CoRR, 2021

