Baoqi Pei

Orcid: 0009-0007-7811-7961

According to our database¹, Baoqi Pei authored at least 18 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline.

CoRR, March, 2026

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding.

Int. J. Comput. Vis., January, 2026

2025

Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT.

CoRR, November, 2025

Guiding Audio-Visual Question Answering with Collective Question Reasoning.

Int. J. Comput. Vis., October, 2025

EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT.

CoRR, October, 2025

Vinci: A Real-time Smart Assistant Based on Egocentric Vision-language Model for Portable Devices.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., September, 2025

Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision.

CoRR, June, 2025

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant.

CoRR, March, 2025

EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model.

CoRR, 2024

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation.

CoRR, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding.

CoRR, 2024

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding.

CoRR, 2024

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.

Proceedings of Computer Vision - ECCV 2024, 2024

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
