Jinxu Zhang

ORCID: 0009-0000-9876-1454

According to our database, Jinxu Zhang authored at least 11 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Kiran K. Somasundaram

Giovanni Maria Farinella

Int. J. Comput. Vis., December, 2025

DocRouter: Prompt guided vision transformer and Mixture of Experts connector for document understanding.

Jinxu Zhang

Yu Zhang

Inf. Fusion, 2025

Predicting trajectories of coastal area vessels with a lightweight Slice-Diff self attention.

Complex Intell. Syst., 2025

DREAM: Integrating Hierarchical Multimodal Retrieval with Multi-page Multimodal Language Model for Documents VQA.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MP-FIRE: An End-to-End Cross-Modal Framework for Complex Multi-Page Document Question Answering.

Yongqi Yu

Jinxu Zhang

Yu Zhang

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

DocAssistant: Integrating Key-region Reading and Step-wise Reasoning for Robust Document Visual Question Answering.

Jinxu Zhang

Qiyuan Fan

Yu Zhang

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024

CFRet-DVQA: Coarse-to-Fine Retrieval and Efficient Tuning for Document Visual Question Answering.

Jinxu Zhang

Yongqi Yu

Yu Zhang

CoRR, 2024

CREAM: Coarse-to-Fine Retrieval and Multi-modal Efficient Tuning for Document VQA.

Jinxu Zhang

Yongqi Yu

Yu Zhang

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Sequence-to-Sequence Based Muti-Semantic Network with Attention for Long-Term Vessel Trajectory Prediction.

Proceedings of the 2nd International Conference on Computer, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Kiran K. Somasundaram

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

et al.

CoRR, 2023
