Omkar Thawakar

According to our database¹, Omkar Thawakar authored at least 32 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

CoVR-R: Reason-Aware Composed Video Retrieval.

Omkar Thawakar

Dmitry Demidov

Vaishnav Potlapalli

Bogireddy Sai Prasanna Teja

Viswanatha Reddy Gajjala

Alaa Mostafa Lasheen

Rao Muhammad Anwer

Fahad Shahbaz Khan

CoRR, March, 2026

Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device.

Abdelrahman M. Shaker

Ahmed Heakl

Jaseel Muhammad Kaithakkodan

CoRR, February, 2026

A Multi-Agent Diffusion Approach for MRI Anomaly Segmentation via Modality-Specific LoRA Specialization.

Wafa Al Ghallabi

Muhammad Zaigham Zaheer

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

DuwatBench: Bridging Language and Visual Heritage through an Arabic Calligraphy Benchmark for Multimodal Understanding.

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

2025

Thinking Beyond Labels: Vocabulary-Free Fine-Grained Recognition using Reasoning-Augmented LMMs.

CoRR, December, 2025

EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards.

Omkar Thawakar

Shravan Venkatraman

Ritesh Thawkar

Abdelrahman M. Shaker

CoRR, November, 2025

How Good are Foundation Models in Step-by-Step Embodied Reasoning?

CoRR, September, 2025

ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark.

CoRR, May, 2025

LLM Post-Training: A Deep Dive into Reasoning Large Language Models.

CoRR, February, 2025

AIN: The Arabic INclusive Large Multimodal Model.

CoRR, February, 2025

Video Instance Segmentation in an Open-World.

Int. J. Comput. Vis., January, 2025

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Omkar Thawakar

Ali Husain Salem Abdulla Alharthi

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Vocabulary-Free Fine-Grained Visual Recognition via Enriched Contextually Grounded Vision-Language Model.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Mihail Minkov Mihaylov

Chao Qin

Abdelrahman M. Shaker

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Feno Heriniaina Rabevohitra

Azril Hafizi Amirudin

Muhammad Ridzuan

Daniya Najiha Abdul Kareem

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

Nathan Augusto Zacarias Xavier

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Abdelrahman M. Shaker

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Feno Heriniaina Rabevohitra

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

CoRR, 2024

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Omkar Thawakar

Ali Husain Salem Abdulla Alharthi

CoRR, 2024

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration.

CoRR, 2024

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT.

CoRR, 2024

Composed Video Retrieval via Enriched Context and Discriminative Embeddings.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models.

Omkar Thawakar

Abdelrahman M. Shaker

Sahal Shaji Mullappilly

Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Omkar Thawakar

Abdelrahman M. Shaker

Sahal Shaji Mullappilly

CoRR, 2023

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM.

Sahal Shaji Mullappilly

Abdelrahman M. Shaker

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Fast Video Instance Segmentation via Recurrent Encoder-Based Transformers.

Proceedings of the Computer Analysis of Images and Patterns, 2023

2022

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

2019

Motion Saliency Based Generative Adversarial Network for Underwater Moving Object Segmentation.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Image and Video Super Resolution using Recurrent Generative Adversarial Network.

Proceedings of the 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2019
