Omkar Thawakar

According to our database, Omkar Thawakar authored at least 26 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

How Good are Foundation Models in Step-by-Step Embodied Reasoning?

CoRR, September, 2025

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications.

CoRR, August, 2025

Vocabulary-free Fine-grained Visual Recognition via Enriched Contextually Grounded Vision-Language Model.

CoRR, July, 2025

Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs.

CoRR, May, 2025

ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark.

CoRR, May, 2025

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding.

CoRR, March, 2025

LLM Post-Training: A Deep Dive into Reasoning Large Language Models.

CoRR, February, 2025

AIN: The Arabic INclusive Large Multimodal Model.

CoRR, February, 2025

Video Instance Segmentation in an Open-World.

Int. J. Comput. Vis., January, 2025

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Omkar Thawakar

Ali Husain Salem Abdulla Alharthi

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Mihail Minkov Mihaylov

Chao Qin

Abdelrahman M. Shaker

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Feno Heriniaina Rabevohitra

Azril Hafizi Amirudin

Muhammad Ridzuan

Daniya Najiha Abdul Kareem

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

Nathan Augusto Zacarias Xavier

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Abdelrahman M. Shaker

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Feno Heriniaina Rabevohitra

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

CoRR, 2024

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Omkar Thawakar

Ali Husain Salem Abdulla Alharthi

CoRR, 2024

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration.

CoRR, 2024

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT.

CoRR, 2024

Composed Video Retrieval via Enriched Context and Discriminative Embeddings.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models.

Omkar Thawakar

Abdelrahman M. Shaker

Sahal Shaji Mullappilly

Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Omkar Thawakar

Abdelrahman M. Shaker

Sahal Shaji Mullappilly

CoRR, 2023

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Arabic Mini-ClimateGPT : A Climate Change and Sustainability Tailored Arabic LLM.

Sahal Shaji Mullappilly

Abdelrahman M. Shaker

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Fast Video Instance Segmentation via Recurrent Encoder-Based Transformers.

Proceedings of the Computer Analysis of Images and Patterns, 2023

2022

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

2019

Motion Saliency Based Generative Adversarial Network for Underwater Moving Object Segmentation.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Image and Video Super Resolution using Recurrent Generative Adversarial Network.

Proceedings of the 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2019
