Yihao Ding

Orcid: 0000-0001-5065-6911

According to our database, Yihao Ding authored at least 40 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
NarrativeSense: Predicting Affective States in University Students through Smartphone Sensing and Contextual Narratives.
ACM Trans. Comput. Heal., April, 2026

MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation.
CoRR, April, 2026

STIndex: A Context-Aware Multi-Dimensional Spatiotemporal Information Extraction System.
CoRR, April, 2026

Deep learning based visually rich document content understanding: a survey.
Artif. Intell. Rev., April, 2026

LPL3D: LVLM-Driven Pseudo-Labeling for 3D Object Detection.
IEEE Trans. Circuits Syst. Video Technol., March, 2026

GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration.
CoRR, March, 2026

ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning.
CoRR, March, 2026

BRIDGE: Benchmark for multi-hop Reasoning In long multimodal Documents with Grounded Evidence.
CoRR, March, 2026

Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs.
CoRR, February, 2026

Statistical Verification of Medium-Access Parameterization for Power-Grid Edge Ad Hoc Sensor Networks.
CoRR, February, 2026

Docs2Synth: A Synthetic Data Trained Retriever Framework for Scanned Visually Rich Documents Understanding.
CoRR, January, 2026

Embodied intelligence for 3D understanding: A survey on 3D Scene question answering.
Inf. Fusion, 2026

SynJAC: synthetic-data-driven joint-granular adaptation and calibration for domain specific scanned document key information extraction.
Inf. Fusion, 2026

A Disease-Aware Dual-Stage Framework for Chest X-ray Report Generation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
PROPA: Toward Process-level Optimization in Visual Reasoning via Reinforcement Learning.
CoRR, November, 2025

SynDoc: A Hybrid Discriminative-Generative Framework for Enhancing Synthetic Domain-Adaptive Document Key Information Extraction.
CoRR, September, 2025

DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections.
CoRR, August, 2025

A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends.
CoRR, July, 2025

Pseudo-labeling and knowledge-guided contrastive learning for radiology report generation.
J. Biomed. Informatics, 2025

VRD-IU: Lessons from Visually Rich Document Intelligence and Understanding.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

KIEPrompter: Leveraging Lightweight Models' Predictions for Cost-Effective Key Information Extraction using Vision LLMs.
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025

Natural Language Processing in Support of Evidence-based Medicine: A Scoping Review.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Graph neural networks for text classification: a survey.
Artif. Intell. Rev., August, 2024

DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights.
CoRR, 2024

Deep Learning based Visually Rich Document Content Understanding: A Survey.
CoRR, 2024

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering.
CoRR, 2024

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding.
CoRR, 2024

MMVQA: A Comprehensive Dataset for Investigating Multipage Multimodal Information Retrieval in PDF-based Visual Question Answering.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

The Language Model Can Have the Personality: Joint Learning for Personality Enhanced Language Model (Student Abstract).
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
PDFVQA: A New Dataset for Real-World VQA on PDF Documents.
CoRR, 2023

Form-NLU: Dataset for the Form Language Understanding.
CoRR, 2023

Form-NLU: Dataset for the Form Natural Language Understanding.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

PDF-VQA: A New Dataset for Real-World VQA on PDF Documents.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, 2023

Workshop on Document Intelligence Understanding.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
V-Doc : Visual questions answers with Documents.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

DDI-MuG: Multi-aspect Graphs for Drug-Drug Interaction Extraction.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

