Ali Vosoughi

ORCID: 0000-0003-1014-2937

According to our database, Ali Vosoughi authored at least 15 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
OPENXRD: A Comprehensive Benchmark and Enhancement Framework for LLM/MLLM XRD Question Answering.
CoRR, July, 2025

Can Sound Replace Vision in LLaVA With Token Substitution?
CoRR, June, 2025

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness.
CoRR, May, 2025

I²G: Generating Instructional Illustrations via Text-Conditioned Diffusion.
CoRR, May, 2025

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting.
CoRR, April, 2025

VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity.
CoRR, March, 2025

2024
Cross Modality Bias in Visual Question Answering: A Causal View With Possible Worlds VQA.
IEEE Trans. Multim., 2024

OSCaR: Object State Captioning and State Change Representation.
CoRR, 2024

OSCaR: Object State Captioning and State Change Representation.
Findings of the Association for Computational Linguistics: NAACL 2024, 2024

EAGLE: Egocentric AGgregated Language-video Engine.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 2024

Learning Audio Concepts from Counterfactual Natural Language.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024

2023
Video Understanding with Large Language Models: A Survey.
CoRR, 2023

Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation.
CoRR, 2023

MISAR: A Multimodal Instructional System with Augmented Reality.
CoRR, 2023

Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA.
CoRR, 2023
