Yanyuan Qiao

Orcid: 0000-0002-5606-0702

According to our database, Yanyuan Qiao authored at least 33 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs.
CoRR, September, 2025

Lang2Morph: Language-Driven Morphological Design of Robotic Hands.
CoRR, September, 2025

A Knowledge-driven Adaptive Collaboration of LLMs for Enhancing Medical Decision-making.
CoRR, September, 2025

Embodied Domain Adaptation for Object Detection.
CoRR, June, 2025

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation.
CoRR, June, 2025

NavBench: Probing Multimodal Large Language Models for Embodied Navigation.
CoRR, June, 2025

BadNAVer: Exploring Jailbreak Attacks On Vision-and-Language Navigation.
CoRR, May, 2025

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation.
CoRR, March, 2025

SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation.
CoRR, March, 2025

FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks.
IEEE Trans. Multim., 2025

GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Effective Tuning Strategies for Generalist Robot Manipulation Policies.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Ground-Level Viewpoint Vision-and-Language Navigation in Continuous Environments.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

General Scene Adaptation for Vision-and-Language Navigation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models.
Trans. Mach. Learn. Res., 2024

MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

VL-Mamba: Exploring State Space Models for Multimodal Learning.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

Improving Online Source-Free Domain Adaptation for Object Detection by Unsupervised Data Acquisition.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

LLM as Copilot for Coarse-Grained Vision-and-Language Navigation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Multi-modal Adapter for Medical Vision-and-Language Learning.
Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

March in Chat: Interactive Prompting for Remote Embodied Referring Expression.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation.
CoRR, 2022

HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Referring Expression Comprehension: A Survey of Methods and Datasets.
IEEE Trans. Multim., 2021

R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
RankVQA: Answer Re-Ranking For Visual Question Answering.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

VC-VQA: Visual Calibration Mechanism For Visual Question Answering.
Proceedings of the IEEE International Conference on Image Processing, 2020

2019
Improving visual question answering using dropout and enhanced question encoder.
Pattern Recognit., 2019

2018
Enhancing Visual Question Answering Using Dropout.
Proceedings of the 2018 ACM Multimedia Conference, 2018
