Yanyuan Qiao

Orcid: 0000-0002-5606-0702

According to our database, Yanyuan Qiao authored at least 31 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Embodied Domain Adaptation for Object Detection.
CoRR, June, 2025

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation.
CoRR, June, 2025

NavBench: Probing Multimodal Large Language Models for Embodied Navigation.
CoRR, June, 2025

BadNAVer: Exploring Jailbreak Attacks On Vision-and-Language Navigation.
CoRR, May, 2025

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation.
CoRR, March, 2025

FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks.
CoRR, March, 2025

SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation.
CoRR, March, 2025

Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments.
CoRR, February, 2025

General Scene Adaptation for Vision-and-Language Navigation.
CoRR, January, 2025

GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

General Scene Adaptation for Vision-and-Language Navigation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models.
Trans. Mach. Learn. Res., 2024

Effective Tuning Strategies for Generalist Robot Manipulation Policies.
CoRR, 2024

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation.
CoRR, 2024

Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs.
CoRR, 2024

MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

VL-Mamba: Exploring State Space Models for Multimodal Learning.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

Improving Online Source-Free Domain Adaptation for Object Detection by Unsupervised Data Acquisition.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

LLM as Copilot for Coarse-Grained Vision-and-Language Navigation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Multi-modal Adapter for Medical Vision-and-Language Learning.
Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

March in Chat: Interactive Prompting for Remote Embodied Referring Expression.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation.
CoRR, 2022

HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Referring Expression Comprehension: A Survey of Methods and Datasets.
IEEE Trans. Multim., 2021

R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
RankVQA: Answer Re-Ranking For Visual Question Answering.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

VC-VQA: Visual Calibration Mechanism For Visual Question Answering.
Proceedings of the IEEE International Conference on Image Processing, 2020

2019
Improving visual question answering using dropout and enhanced question encoder.
Pattern Recognit., 2019

2018
Enhancing Visual Question Answering Using Dropout.
Proceedings of the 2018 ACM Multimedia Conference, 2018

