Qingpei Guo
ORCID: 0009-0001-0521-9664
According to our database, Qingpei Guo authored at least 41 papers between 2015 and 2025.
Bibliography
2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer.
CoRR, October, 2025
VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples.
CoRR, September, 2025
IEEE Trans. Circuits Syst. Video Technol., August, 2025
CoRR, July, 2025
CoRR, June, 2025
CoRR, May, 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction.
CoRR, May, 2025
From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval.
CoRR, April, 2025
CoRR, March, 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance.
CoRR, February, 2025
Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining, 2025
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
SNP-S3: Shared Network Pre-Training and Significant Semantic Strengthening for Various Video-Text Tasks.
IEEE Trans. Circuits Syst. Video Technol., April, 2024
CoRR, 2024
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval.
CoRR, 2024
SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks.
CoRR, 2024
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining.
CoRR, 2024
Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition.
CoRR, 2024
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
EVE: Efficient Zero-Shot Text-Based Video Editing With Depth Map Guidance and Temporal Consistency Constraints.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
CoRR, 2023
Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input.
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Automatic Car Damage Assessment System: Reading and Understanding Videos as Professional Insurance Inspectors.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2017
Non-Frontal Facial Expression Recognition Using a Depth-Patch Based Deep Neural Network.
J. Comput. Sci. Technol., 2017
2015
The Implementation of Hadoop-based Crawler System and Graphlite-based PageRank-Calculation In Search Engine.
CoRR, 2015