Xiaoshuai Hao

Orcid: 0009-0007-4209-6695

According to our database¹, Xiaoshuai Hao authored at least 69 papers between 2020 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Affine modulation-based audiogram fusion network for joint noise reduction and hearing loss compensation.

Björn W. Schuller

Inf. Fusion, 2026

2025

RoboOS-NeXT: A Unified Memory-based Framework for Lifelong, Scalable, and Robust Multi-Robot Collaboration.

Shanghang Zhang

CoRR, October, 2025

Query-Specific GNN: A Comprehensive Graph Representation Learning Method for Retrieval Augmented Generation.

CoRR, October, 2025

Stability Under Scrutiny: Benchmarking Representation Paradigms for Online HD Mapping.

CoRR, October, 2025

Team Xiaomi EV-AD VLA: Learning to Navigate Socially Through Proactive Risk Perception - Technical Report for IROS 2025 RoboSense Challenge Social Navigation Track.

CoRR, October, 2025

Team Xiaomi EV-AD VLA: Caption-Guided Retrieval System for Cross-Modal Drone Navigation - Technical Report for IROS 2025 RoboSense Challenge Track 4.

CoRR, October, 2025

SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs.

CoRR, October, 2025

From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models.

CoRR, September, 2025

VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results.

CoRR, September, 2025

MFFI: Multi-Dimensional Face Forgery Image Dataset for Real-World Scenarios.

Joey Tianyi Zhou

CoRR, September, 2025

VQualA 2025 Challenge on Engagement Prediction for Short Videos: Methods and Results.

CoRR, September, 2025

NavA³: Understanding Any Instruction, Navigating Anywhere, Finding Anything.

Shanghang Zhang

CoRR, August, 2025

VisualTrans: A Benchmark for Real-World Visual Transformation Reasoning.

CoRR, August, 2025

Synergistic Prompting for Robust Visual Recognition with Missing Modalities.

CoRR, July, 2025

Training-free Generation of Temporally Consistent Rewards from VLMs.

CoRR, July, 2025

RoboBrain 2.0 Technical Report.

Shanghang Zhang

CoRR, July, 2025

What Really Matters for Robust Multi-Sensor HD Map Construction?

CoRR, July, 2025

I²S-TFCKD: Intra-Inter Set Knowledge Distillation with Time-Frequency Calibration for Speech Enhancement.

Björn W. Schuller

CoRR, June, 2025

Video-CoT: A Comprehensive Dataset for Spatiotemporal Understanding of Videos Based on Chain-of-Thought.

Shanghang Zhang

CoRR, June, 2025

SVD: Spatial Video Dataset.

M. H. Izadimehr

Mallesham Dasari

Christian Timmerer

CoRR, June, 2025

Your Classifier Can Do More: Towards Bridging the Gaps in Classification, Robustness, and Generation.

CoRR, May, 2025

VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation.

CoRR, May, 2025

RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration.

Shanghang Zhang

CoRR, May, 2025

MEGA: Second-Order Gradient Alignment for Catastrophic Forgetting Mitigation in GFSCIL.

CoRR, April, 2025

FastRSR: Efficient and Accurate Road Surface Reconstruction from Bird's Eye View.

CoRR, April, 2025

STViT+: improving self-supervised multi-camera depth estimation with spatial-temporal context and adversarial geometry regularization.

Appl. Intell., April, 2025

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.

Shanghang Zhang

CoRR, March, 2025

TLA: Tactile-Language-Action Model for Contact-Rich Manipulation.

CoRR, March, 2025

AffordGrasp: In-Context Affordance Reasoning for Open-Vocabulary Task-Oriented Grasping in Clutter.

Shanghang Zhang

CoRR, March, 2025

AS-GCL: Asymmetric Spectral Augmentation on Graph Contrastive Learning.

IEEE Trans. Multim., 2025

Multi-Modal Molecular Representation Learning via Structure Awareness.

IEEE Trans. Image Process., 2025

A hierarchical reinforcement learning framework for multi-UAV combat using leader-follower strategy.

Noureldin Mohamed Abdelaal Ahmed Mohamed

Knowl. Based Syst., 2025

BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation.

Inf. Fusion, 2025

MapFusion: A novel BEV feature fusion network for multi-modal map construction.

Inf. Fusion, 2025

A universal sampling method based on feature and structural comprehensive proximity measure.

Neurocomputing, 2025

ESC-MISR: Enhancing Spatial Correlations for Multi-image Super-Resolution in Remote Sensing.

Proceedings of the MultiMedia Modeling, 2025

Enhancing Adversarial Robustness of Vision-Language Models through Low-Rank Adaptation.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Open-Vocabulary Fine-Grained Hand Action Detection.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

SafeMap: Robust HD Map Construction from Incomplete Observations.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Uneven Event Modeling for Partially Relevant Video Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Multi-Granularity Based Collaborative Learning for Semi-Supervised Hashing.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

TASAR: Transfer-based Attack on Skeletal Action Recognition.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete.

Shanghang Zhang

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

M3-Net: A Cost-Effective Graph-Free MLP-Based Model for Traffic Prediction.

Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025

RPGCN: Relational Probabilistic Graphs for EEG-Based Emotion Mining.

Proceedings of the Advanced Data Mining and Applications - 21st International Conference, 2025

MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation.

Shanghang Zhang

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

KALAHash: Knowledge-Anchored Low-Resource Adaptation for Deep Hashing.

Vijaykrishnan Narayanan

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Communication-Efficient Personalized Federal Graph Learning via Low-Rank Decomposition.

CoRR, 2024

DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering.

CoRR, 2024

BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation.

CoRR, 2024

Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track.

CoRR, 2024

The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition.

Benoit R. Cottereau

Xingliang Huang

Xiaoqiang Cheng

CoRR, 2024

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey.

CoRR, 2024

DOR3D-Net: Dense Ordinal Regression Network for 3D Hand Pose Estimation.

CoRR, 2024

Is Your HD Map Constructor Reliable under Sensor Corruptions?

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning.

Huang Tai Sheng

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MBFusion: A New Multi-modal BEV Feature Fusion Method for HD Map Construction.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Customized Treatment Per Pixel for Blind Image Super-Resolution.

Proceedings of the IEEE International Conference on Acoustics, 2024

MapDistill: Boosting Efficient Camera-Based HD Map Construction via Camera-LiDAR Fusion Model Distillation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Enhancing 3D Hand Pose Estimation via Dense Ordinal Regression Network.

Proceedings of the 35th British Machine Vision Conference, 2024

2023

Team AcieLee: Technical Report for EPIC-SOUNDS Audio-Based Interaction Recognition Challenge 2023.

Chuanguang Yang

CoRR, 2023

MixGen: A New Multi-Modal Data Augmentation.

Srikar Appalaraju

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Listen and Look: Multi-Modal Aggregation and Co-Attention Network for Video-Audio Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

2021

Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval.

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

What Matters: Attentive and Relational Feature Aggregation Network for Video-Text Retrieval.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

2020

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020).

CoRR, 2020
