Meng Cao
ORCID: 0000-0002-8946-4228
Affiliations:
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
- Peking University, School of Electronic and Computer Engineering, Shenzhen, China (PhD 2023)
According to our database, Meng Cao authored at least 64 papers between 2019 and 2026.
Bibliography
2026
A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model.
CoRR, April, 2026
ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation.
CoRR, March, 2026
Trans. Mach. Learn. Res., 2026
Trans. Mach. Learn. Res., 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CoRR, December, 2025
CoRR, December, 2025
CoRR, December, 2025
Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling.
CoRR, December, 2025
Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding.
CoRR, December, 2025
CoRR, July, 2025
CoRR, May, 2025
Cross-Modal Conditioned Reconstruction for Language-Guided Medical Image Segmentation.
IEEE Trans. Medical Imaging, April, 2025
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese.
CoRR, April, 2025
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs.
CoRR, April, 2025
CoRR, April, 2025
CoRR, March, 2025
CoRR, February, 2025
ChineseSimpleVQA - "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models.
CoRR, February, 2025
When Large Vision Language Models Meet Multimodal Sequential Recommendation: An Empirical Study.
Proceedings of the ACM on Web Conference 2025, 2025
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
ACM Trans. Multim. Comput. Commun. Appl., December, 2024
IEEE Trans. Circuits Syst. Video Technol., October, 2024
CoRR, 2024
Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps.
CoRR, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
IEEE Trans. Image Process., 2023
CoRR, 2023
Video Referring Expression Comprehension via Transformer with Content-conditioned Query.
CoRR, 2023
Video Referring Expression Comprehension via Transformer with Content-conditioned Query.
Proceedings of the 1st International Workshop on Deep Multimodal Learning for Information Retrieval, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
IEEE Trans. Image Process., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
CoRR, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
IEEE Trans. Image Process., 2021
Synergic learning for noise-insensitive webly-supervised temporal action localization.
Image Vis. Comput., 2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
CoLA: Weakly-Supervised Temporal Action Localization With Snippet Contrastive Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Weakly Labelled Audio Tagging Via Convolutional Networks with Spatial and Channel-Wise Attention.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
GISCA: Gradient-Inductive Segmentation Network With Contextual Attention for Scene Text Detection.
IEEE Access, 2019