Siwen Luo

Orcid: 0000-0003-0480-1991

According to our database, Siwen Luo authored at least 22 papers between 2008 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Docs2Synth: A Synthetic Data Trained Retriever Framework for Scanned Visually Rich Documents Understanding.
CoRR, January, 2026

2025
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends.
CoRR, July, 2025

MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

'No' Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Multimodal Commonsense Knowledge Distillation for Visual Question Answering (Student Abstract).
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Local Interpretations for Explainable Natural Language Processing: A Survey.
ACM Comput. Surv., September, 2024

Multimodal Commonsense Knowledge Distillation for Visual Question Answering.
CoRR, 2024

'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue.
CoRR, 2024

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering.
CoRR, 2024

MMVQA: A Comprehensive Dataset for Investigating Multipage Multimodal Information Retrieval in PDF-based Visual Question Answering.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

2023
SceneGATE: Scene-Graph Based Co-Attention Networks for Text Visual Question Answering.
Robotics, June, 2023

PDFVQA: A New Dataset for Real-World VQA on PDF Documents.
CoRR, 2023

PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals.
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

PDF-VQA: A New Dataset for Real-World VQA on PDF Documents.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, 2023

Workshop on Document Intelligence Understanding.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2020
VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks.
CoRR, 2020

REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2013
Cluster Labeling Extraction and Ranking Feature Selection for High Quality XML Pseudo Relevance Feedback Fragments Set.
Proceedings of the Advanced Data Mining and Applications - 9th International Conference, 2013

2008
Research on the Reformation of Information Literacy Cultivation in Colleges and Universities Based on the E-learning.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

