Qing Li
ORCID: 0000-0003-1185-5365
Affiliations:
- Beijing Institute for General Artificial Intelligence (BIGAI), National Key Laboratory of General Artificial Intelligence, Beijing, China
- University of California, Los Angeles, CA, USA (former)
- University of Science and Technology of China, Hefei, China (former)
According to our database, Qing Li authored at least 52 papers between 2016 and 2025.
Bibliography
2025
Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation.
CoRR, July, 2025
LEO-VL: Towards 3D Vision-Language Generalists via Data Scaling with Efficient Representation.
CoRR, June, 2025
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes.
CoRR, June, 2025
CoRR, May, 2025
CoRR, May, 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation.
CoRR, May, 2025
CoRR, April, 2025
CoRR, April, 2025
Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding.
CoRR, January, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
2024
CoRR, 2024
INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations.
CoRR, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
2022
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation.
CoRR, 2022
2021
A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics.
CoRR, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning.
Proceedings of the 37th International Conference on Machine Learning, 2020
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018
Proceedings of the Computer Vision - ECCV 2018, 2018
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2017
Int. J. Multim. Inf. Retr., 2017
2016
Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016