Davide Caffagni

ORCID: 0009-0002-3279-6480

According to our database¹, Davide Caffagni authored at least 14 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2025

Seeing Beyond Words: Self-Supervised Visual Learning for Multimodal Large Language Models.

Mark Granroth-Wilding

Rita Cucchiara

CoRR, December, 2025

ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering.

CoRR, November, 2025

Recurrence Meets Transformers for Universal Multimodal Retrieval.

CoRR, September, 2025

Mitigating Hallucinations in Multimodal LLMs via Object-aware Preference Optimization.

CoRR, August, 2025

Augmenting and mixing Transformers with synthetic data for image captioning.

Image Vis. Comput., 2025

Benchmarking BERT-based Models for Latin: A Case Study on Biblical References in Ancient Christian Literature.

Proceedings of the 21st Conference on Information and Research science Connecting to Digital and Library science, 2025

LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

Generating Synthetic Data with Large Language Models for Low-Resource Sentence Retrieval.

Proceedings of the Linking Theory and Practice of Digital Libraries, 2025

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

The (R)Evolution of Multimodal Large Language Models: A Survey.

CoRR, 2024

Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization.

Proceedings of the 35th British Machine Vision Conference, 2024

The Revolution of Multimodal Large Language Models: A Survey.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

SynthCap: Augmenting Transformers with Synthetic Data for Image Captioning.

Proceedings of the Image Analysis and Processing - ICIAP 2023, 2023
