Yihua Shao

Orcid: 0009-0002-0475-7142

According to our database, Yihua Shao authored at least 34 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis.
CoRR, March, 2026

Geometry OR Tracker: Universal Geometric Operating Room Tracking.
CoRR, March, 2026

Do MLLMs Really Understand Space? A Mathematical Reasoning Evaluation.
CoRR, February, 2026

Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning.
CoRR, February, 2026

StyMam: A Mamba-Based Generator for Artistic Style Transfer.
CoRR, January, 2026

Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation.
CoRR, January, 2026

Enhancing point cloud feature representation via historical node state increments in graph neural networks.
Pattern Recognit., 2026

3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

TR-DQ: Time-Rotation Diffusion Quantization.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

RAGAR: Retrieval Augmented Personalized Image Generation Guided by Recommendation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

MoFu: Scale-Aware Modulation and Fourier Fusion for Multi-Subject Video Generation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning.
CoRR, December, 2025

CurriFlow: Curriculum-Guided Depth Fusion with Optical Flow-Based Temporal Alignment for 3D Semantic Scene Completion.
CoRR, October, 2025

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook.
CoRR, September, 2025

PDFT: parameter-diminish fine-tuning for transformer-based models.
Vis. Comput., July, 2025

EventVAD: Training-Free Event-Aware Video Anomaly Detection.
CoRR, April, 2025

WonderVerse: Extendable 3D Scene Generation with Video Generative Models.
CoRR, March, 2025

GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts.
CoRR, March, 2025

TR-DQ: Time-Rotation Diffusion Quantization.
CoRR, March, 2025

EventVAD: Training-Free Event-Aware Video Anomaly Detection.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

AccidentBlip: Agent of Accident Warning Based on MA-Former.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2025

In-Context Meta LoRA Generation.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Renderworld: World Model with Self-Supervised 3D Label.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook.
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

AdsQA: Towards Advertisement Video Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MambaIC: State Space Models for High-Performance Learned Image Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

TDADL-IE: A Deep Learning-Driven Cryptographic Architecture for Medical Image Security.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2025

HAFT: Hierarchical Attentional Fusion Transformer for Adaptive Feature Fusion in Medical Image Segmentation.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2025

2024
3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting.
CoRR, 2024

GWQ: Gradient-Aware Weight Quantization for Large Language Models.
CoRR, 2024

AccidentBlip2: Accident Detection With Multi-View MotionBlip2.
CoRR, 2024

