Zun Wang

ORCID: 0009-0005-9502-050X

Affiliations:

University of North Carolina at Chapel Hill, NC, USA
Australian National University, Canberra, ACT, Australia (2020 - 2023)

According to our database, Zun Wang authored at least 30 papers between 2022 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)?

CoRR, May, 2026

PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation.

CoRR, May, 2026

OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards.

CoRR, March, 2026

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising.

CoRR, March, 2026

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories.

CoRR, February, 2026

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning.

CoRR, February, 2026

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent.

CoRR, January, 2026

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agents.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

DreamRunner: Fine-Grained Compositional Story-to-Video Generation with Retrieval-Augmented Motion Adaptation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models.

CoRR, December, 2025

Planning with Sketch-Guided Verification for Physics-Aware Video Generation.

CoRR, November, 2025

Error-Driven Scene Editing for 3D Grounding in Large Language Models.

CoRR, November, 2025

Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale.

CoRR, September, 2025

ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2025

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance.

CoRR, May, 2025

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models.

Trans. Mach. Learn. Res., 2024

DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation.

CoRR, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding.

CoRR, 2024

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.

Proceedings of the Computer Vision - ECCV 2024, 2024

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.

CoRR, 2023

Scaling Data Generation in Vision-and-Language Navigation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

InternVideo: General Video Foundation Models via Generative and Discriminative Learning.

CoRR, 2022

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges.

CoRR, 2022

1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022).

CoRR, 2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
