Sai Rajeswar

Rabiul Awal

Shambhavi Mishra

Akshay Kalkunte Suresh

CoRR, November, 2025

Apriel-1.5-15b-Thinker.

CoRR, October, 2025

Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval.

CoRR, October, 2025

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs.

CoRR, September, 2025

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation.

CoRR, August, 2025

Apriel-Nemotron-15B-Thinker.

CoRR, August, 2025

BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning.

Shiva Krishna Reddy Malay

CoRR, August, 2025

The Promise of RL for Autoregressive Image Editing.

Saba Ahmadi

Rabiul Awal

Ankur Sikarwar

Amirhossein Kazemnejad

CoRR, August, 2025

Rendering-Aware Reinforcement Learning for Vector Graphics Generation.

Mohammad Reza Samsami

CoRR, May, 2025

Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA.

Rishabh Maheshwary

Masoud Hashemi

Khyati Mahajan

Spandana Gella

Vikas Yadav

CoRR, May, 2025

StarFlow: Generating Structured Workflow Outputs From Sketch Images.

CoRR, March, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.

CoRR, March, 2025

PairBench: A Systematic Framework for Selecting Reliable Judge VLMs.

Aarash Feizi

Adriana Romero-Soriano

Reihaneh Rabbany

Spandana Gella

João Monteiro

CoRR, February, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding.

CoRR, February, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

StarVector: Generating Scalable Vector Graphics Code from Images and Text.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

StarVector: Generating Scalable Vector Graphics Code from Images and Text.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks.

CoRR, 2024

Representing Positional Information in Generative World Models for Object Manipulation.

CoRR, 2024

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation.

CoRR, 2024

Multimodal foundation world models for generalist embodied agents.

CoRR, 2024

VCR: Visual Caption Restoration.

CoRR, 2024

GenRL: Multimodal-foundation world models for generalization in embodied agents.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Efficient Dynamics Modeling in Interactive Environments with Koopman Theory.

Arnab Kumar Mondal

Siba Smarak Panigrahi

Kaleem Siddiqi

Siamak Ravanbakhsh

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Capture the Flag: Uncovering Data Insights with Large Language Models.

Issam H. Laradji

Perouz Taslakian

CoRR, 2023

Equivariant Adaptation of Large Pretrained Models.

Arnab Kumar Mondal

Siba Smarak Panigrahi

Sékou-Oumar Kaba

Siamak Ravanbakhsh

CoRR, 2023

Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels.

Proceedings of the International Conference on Machine Learning, 2023

Hyperbolic Deep Reinforcement Learning for Continuous Control.

Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Choreographer: Learning and Adapting Skills in Imagination.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Unsupervised Model-based Pre-training for Data-efficient Control from Pixels.

CoRR, 2022

Multi-label Iterated Learning for Image Classification with Label Ambiguity.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Consistency-CAM: Towards Improved Weakly Supervised Semantic Segmentation.

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021

Touch-based Curiosity for Sparse-Reward Tasks.

CoRR, 2021

Haptics-based Curiosity for Sparse-reward Tasks.

Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2020

Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images Using a View-Based Representation.

Fahim Mannan

Florian Golemo

Jérôme Parent-Lévesque

David Vázquez

Derek Nowrouzezahrai

Aaron C. Courville

Int. J. Comput. Vis., 2020

2019

Adversarial Computation of Optimal Transport Maps.

CoRR, 2019

2018

Hierarchical Adversarially Learned Inference.

Mohamed Ishmael Belghazi

CoRR, 2018

A Deep Reinforcement Learning Chatbot (Short Version).

Alexandre de Brébisson

CoRR, 2018

MINE: Mutual Information Neural Estimation.

CoRR, 2018

Towards Text Generation with Adversarially Learned Neural Outlines.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Mutual Information Neural Estimation.

Mohamed Ishmael Belghazi

Proceedings of the 35th International Conference on Machine Learning, 2018

Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data.

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Adversarial Generation of Natural Language.

Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

2015

OCR for bilingual documents using language modeling.

Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

A hypothesize-and-verify framework for text recognition using deep recurrent neural networks.

Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

Text recognition using deep BLSTM networks.

Proceedings of the Eighth International Conference on Advances in Pattern Recognition, 2015

2014

Scene Text Analysis using Deep Belief Networks.
