Hao Li
ORCID: 0009-0002-4473-6012
Affiliations:
- Chinese University of Hong Kong, SAR, China
- Tsinghua University, China (former)
According to our database, Hao Li authored at least 62 papers between 2019 and 2026.
Bibliography
2026
CoRR, May, 2026
IEEE Trans. Circuits Syst. Video Technol., April, 2026
CoRR, April, 2026
CoRR, March, 2026
CoRR, March, 2026
CoRR, March, 2026
CoRR, February, 2026
RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation.
CoRR, February, 2026
CoRR, January, 2026
CoRR, January, 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2025
CoRR, October, 2025
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy.
CoRR, October, 2025
CoRR, October, 2025
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints.
CoRR, October, 2025
CoRR, August, 2025
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation.
CoRR, July, 2025
Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models.
CoRR, July, 2025
CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation.
CoRR, June, 2025
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation.
CoRR, June, 2025
The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants.
CoRR, May, 2025
CoRR, May, 2025
CoRR, May, 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT.
CoRR, May, 2025
CoRR, April, 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing.
CoRR, March, 2025
CoRR, March, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
CityGS-$\mathcal{X}$: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Beyond GPT-5: Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing.
Proceedings of the 2025 7th International Conference on Distributed Artificial Intelligence, 2025
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
2024
Auton. Agents Multi Agent Syst., June, 2024
CoRR, 2024
PET-NeRV: Bridging Generalized Video Codec and Content-Specific Neural Representation.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks.
CoRR, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019