Yihao Ding

Orcid: 0000-0001-5065-6911

According to our database, Yihao Ding authored at least 25 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Embodied intelligence for 3D understanding: A survey on 3D Scene question answering.
Inf. Fusion, 2026

2025
SynDoc: A Hybrid Discriminative-Generative Framework for Enhancing Synthetic Domain-Adaptive Document Key Information Extraction.
CoRR, September, 2025

DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections.
CoRR, August, 2025

A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends.
CoRR, July, 2025

Pseudo-labeling and knowledge-guided contrastive learning for radiology report generation.
J. Biomed. Informatics, 2025

VRD-IU: Lessons from Visually Rich Document Intelligence and Understanding.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Natural Language Processing in Support of Evidence-based Medicine: A Scoping Review.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Graph neural networks for text classification: a survey.
Artif. Intell. Rev., August, 2024

DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights.
CoRR, 2024

Deep Learning based Visually Rich Document Content Understanding: A Survey.
CoRR, 2024

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering.
CoRR, 2024

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding.
CoRR, 2024

MMVQA: A Comprehensive Dataset for Investigating Multipage Multimodal Information Retrieval in PDF-based Visual Question Answering.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

The Language Model Can Have the Personality: Joint Learning for Personality Enhanced Language Model (Student Abstract).
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
PDFVQA: A New Dataset for Real-World VQA on PDF Documents.
CoRR, 2023

Form-NLU: Dataset for the Form Language Understanding.
CoRR, 2023

Form-NLU: Dataset for the Form Natural Language Understanding.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

PDF-VQA: A New Dataset for Real-World VQA on PDF Documents.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, 2023

Workshop on Document Intelligence Understanding.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
V-Doc : Visual questions answers with Documents.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

DDI-MuG: Multi-aspect Graphs for Drug-Drug Interaction Extraction.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

