Zun Wang

ORCID: 0009-0005-9502-050X

Affiliations:
  • University of North Carolina at Chapel Hill, NC, USA
  • Australian National University, Canberra, ACT, Australia (2020 - 2023)


According to our database, Zun Wang authored at least 27 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards.
CoRR, March, 2026

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising.
CoRR, March, 2026

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories.
CoRR, February, 2026

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning.
CoRR, February, 2026

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent.
CoRR, January, 2026

DreamRunner: Fine-Grained Compositional Story-to-Video Generation with Retrieval-Augmented Motion Adaptation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models.
CoRR, December, 2025

Planning with Sketch-Guided Verification for Physics-Aware Video Generation.
CoRR, November, 2025

Error-Driven Scene Editing for 3D Grounding in Large Language Models.
CoRR, November, 2025

Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale.
CoRR, September, 2025

ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2025

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance.
CoRR, May, 2025

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models.
Trans. Mach. Learn. Res., 2024

DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation.
CoRR, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding.
CoRR, 2024

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.
CoRR, 2023

Scaling Data Generation in Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning.
CoRR, 2022

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges.
CoRR, 2022

1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022).
CoRR, 2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

