Lorenzo Baraldi

Orcid: 0000-0001-5125-4957

Affiliations:
  • University of Modena and Reggio Emilia, Italy
  • University of Pisa, Pisa, Toscana, Italy - PhD student


According to our database, Lorenzo Baraldi authored at least 23 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors.
CoRR, June, 2025

What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models.
CoRR, May, 2025

Learning to mask and permute visual tokens for Vision Transformer pre-training.
Comput. Vis. Image Underst., 2025

Semantically Conditioned Prompts for Visual Recognition Under Missing Modality Scenarios.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

AIGeN-Llama: An Adversarial Approach for Instruction Generation in VLN using Llama2 Model.
Proceedings of the 21st Conference on Information and Research science Connecting to Digital and Library science, 2025

Causal Graphical Models for Vision-Language Compositional Understanding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Multimodal Emotion Recognition in Conversation via Possible Speaker's Audio and Visual Sequence Selection.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Hyperbolic Safety-Aware Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Personalizing Multimodal Large Language Models for Image Captioning: An Experimental Analysis.
CoRR, 2024

Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training.
CoRR, 2024

Optimizing Resource Consumption in Diffusion Models through Hallucination Early Detection.
CoRR, 2024

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities.
CoRR, 2024

The (R)Evolution of Multimodal Large Language Models: A Survey.
CoRR, 2024

Adapt to Scarcity: Few-Shot Deepfake Detection via Low-Rank Adaptation.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

Intelligent Multimodal Artificial Agents that Talk and Express Emotions.
Proceedings of the Human-Friendly Robotics 2024 - HFR: 17th International Workshop on Human-Friendly Robotics, Lugano, Switzerland, 30 September, 2024

AIGeN: An Adversarial Approach for Instruction Generation in VLN.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization.
Proceedings of the 35th British Machine Vision Conference, 2024

2023
Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Unveiling the Impact of Image Transformations on Deepfake Detection: An Experimental Analysis.
Proceedings of the Image Analysis and Processing - ICIAP 2023, 2023

