Yuqian Fu

Orcid: 0000-0002-0412-5500

According to our database, Yuqian Fu authored at least 89 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

SeGDP: Source-free Cross-domain Few-shot Learning via Semantic Guided Diversity Prompting.

ACM Trans. Multim. Comput. Commun. Appl., May, 2026

Are Full Rollouts Necessary for On-Policy Distillation?

CoRR, May, 2026

Afford-VLA: Action-Aligned Visual Planning via Internalized Affordance.

CoRR, May, 2026

Accelerating Vision Foundation Models with Drop-in Depthwise Convolution.

CoRR, May, 2026

VISAFF: Speaker-Centered Visual Affective Feature Learning for Emotion Recognition in Conversation.

CoRR, May, 2026

Evo-Depth: A Lightweight Depth-Enhanced Vision-Language-Action Model.

CoRR, May, 2026

Focusable Monocular Depth Estimation.

CoRR, May, 2026

Self-supervised pretraining for an iterative image size agnostic vision transformer.

CoRR, April, 2026

OFlow: Injecting Object-Aware Temporal Flow Matching for Robust Robotic Manipulation.

CoRR, April, 2026

The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results.

Adhémar de Senneville

CoRR, April, 2026

ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis.

CoRR, April, 2026

Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes.

CoRR, March, 2026

OCRA: Object-Centric Learning with 3D and Tactile Priors for Human-to-Robot Action Transfer.

CoRR, March, 2026

InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing.

CoRR, March, 2026

VisNec: Measuring and Leveraging Visual Necessity for Multimodal Instruction Tuning.

CoRR, March, 2026

CPIG: Leveraging Consistency Policy With Intention Guidance for Multiagent Exploration.

IEEE Trans. Cogn. Dev. Syst., February, 2026

EgoSound: Benchmarking Sound Understanding in Egocentric Videos.

CoRR, February, 2026

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

MinD-3D++: Advancing fMRI-Based 3D Reconstruction With High-Quality Textured Mesh Generation and a Comprehensive Dataset.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2025

StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios.

CoRR, December, 2025

ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos.

CoRR, December, 2025

V²-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence.

CoRR, November, 2025

Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis.

CoRR, November, 2025

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks.

CoRR, October, 2025

RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba.

CoRR, October, 2025

SCOOP'D: Learning Mixed-Liquid-Solid Scooping via Sim2Real Generative Policy.

CoRR, October, 2025

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods.

CoRR, October, 2025

EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark.

CoRR, October, 2025

MAT: Multi-Range Attention Transformer for Efficient Image Super-Resolution.

IEEE Trans. Circuits Syst. Video Technol., September, 2025

Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization.

CoRR, September, 2025

Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents.

CoRR, September, 2025

Meta Learning Task Representation in Multiagent Reinforcement Learning: From Global Inference to Local Inference.

IEEE Trans. Neural Networks Learn. Syst., August, 2025

Vision encoders should be image size agnostic and task driven.

CoRR, August, 2025

SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning.

CoRR, June, 2025

CLiViS: Unleashing Cognitive Map through Linguistic-Visual Synergy for Embodied Visual Reasoning.

CoRR, June, 2025

Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025.

CoRR, June, 2025

Manifold-aware Representation Learning for Degradation-agnostic Image Restoration.

CoRR, May, 2025

MLLMs are Deeply Affected by Modality Bias.

CoRR, May, 2025

EarthSynth: Generating Informative Earth Observation with Diffusion Models.

CoRR, May, 2025

CAFuser: Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes.

IEEE Robotics Autom. Lett., April, 2025

CamSAM2: Segment Anything Accurately in Camouflaged Videos.

CoRR, March, 2025

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation.

CoRR, March, 2025

AVA: Attentive VLM Agent for Mastering StarCraft II.

CoRR, March, 2025

MergeIT: From Selection to Merging for Efficient Instruction Tuning.

CoRR, March, 2025

LDR: Learning Discrete Representation to Improve Noise Robustness in Multiagent Tasks.

IEEE Trans. Syst. Man Cybern. Syst., January, 2025

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Sequential Multi-Object Grasping with One Dexterous Hand.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Offline Goal-Conditioned Reinforcement Learning with Elastic-Subgoal Diffused Policy Learning.

Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems, 2025

DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

XTrack: Multimodal Training Boosts RGB-X Video Object Trackers.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Understanding Museum Exhibits using Vision-Language Reasoning.

Ada-Astrid Balauca

Sanjana Garai

Stefan Balauca

Rasesh Udayakumar Shetty

Naitik Agrawal

Dhwanil Subhashbhai Shah

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

RLAE: Reinforcement Learning-Assisted Ensemble for LLMs.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

NTIRE 2025 Challenge on Cross-Domain Few-Shot Object Detection: Methods and Results.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Understanding the World's Museums through Vision-Language Reasoning.

Ada-Astrid Balauca

Sanjana Garai

Stefan Balauca

Rasesh Udayakumar Shetty

Naitik Agrawal

Dhwanil Subhashbhai Shah

CoRR, 2024

Prompt as Free Lunch: Enhancing Diversity in Source-Free Cross-domain Few-shot Learning through Semantic-Guided Prompting.

CoRR, 2024

ObjectRelator: Enabling Cross-View Object Relation Understanding in Ego-Centric and Exo-Centric Videos.

CoRR, 2024

CPEG: Leveraging Consistency Policy with Consensus Guidance for Multi-agent Exploration.

CoRR, 2024

Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes.

CoRR, 2024

fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction.

CoRR, 2024

Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community.

CoRR, 2024

Towards a Generalist and Blind RGB-X Tracker.

CoRR, 2024

MinD-3D: Reconstruct High-Quality 3D Objects in Human Brain.

Proceedings of the Computer Vision - ECCV 2024, 2024

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector.

Proceedings of the Computer Vision - ECCV 2024, 2024

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Test-Time Linear Out-of-Distribution Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Aligning Credit for Multi-Agent Cooperation via Model-based Counterfactual Imagination.

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Open-Vocabulary Video Relation Extraction.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

MinD-3D: Reconstruct High-quality 3D objects in Human Brain.

CoRR, 2023

Meta Style Adversarial Training for Cross-Domain Few-Shot Learning.

CoRR, 2023

On the Importance of Spatial Relations for Few-shot Action Recognition.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Generalized Meta-FDMixup: Cross-Domain Few-Shot Learning Guided by Labeled Target Data.

IEEE Trans. Image Process., 2022

Wave-SAN: Wavelet based Style Augmentation Network for Cross-Domain Few-Shot Learning.

CoRR, 2022

TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

LILAC: Learning a Leader for Cooperative Reinforcement Learning.

Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

2021

Meta-FDMixup: Cross-Domain Few-Shot Learning Guided by Labeled Target Data.

Yuqian Fu

Yanwei Fu

Yu-Gang Jiang

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Can Action be Imitated? Learn to Reconstruct and Transfer Human Dynamics from Videos.

Yuqian Fu

Yanwei Fu

Yu-Gang Jiang

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

E-ACJ: Accurate Junction Extraction For Event Cameras.

Zhihao Liu

Yuqian Fu

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

2020

Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019

Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent.

Proceedings of the 27th ACM International Conference on Multimedia, 2019
