Arushi Goel

According to our database, Arushi Goel authored at least 32 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Benchmarking Single-Factor Physical Video-to-Audio Generation.

Gopala Anumanchipalli

Ming-Yu Liu

CoRR, May, 2026

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music.

CoRR, April, 2026

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos.

CoRR, March, 2026

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception.

CoRR, January, 2026

2025

Music Flamingo: Scaling Music Understanding in Audio Language Models.

CoRR, November, 2025

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM.

CoRR, October, 2025

UALM: Unified Audio Language Model for Understanding, Generation and Reasoning.

CoRR, October, 2025

Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding.

CoRR, August, 2025

Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models.

CoRR, July, 2025

ETTA: Elucidating the Design Space of Text-to-Audio Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Fugatto 1: Foundational Generative Audio Transformer Opus 1.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Visually Interpretable Subtask Reasoning for Visual Question Answering.

Yu Cheng

Arushi Goel

Hakan Bilen

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024

OMCAT: Omni Context Aware Transformer.

CoRR, 2024

Audio Dialogues: Dialogues dataset for audio and music understanding.

CoRR, 2024

TiV-ODE: A Neural ODE-based Approach for Controllable Video Generation From Text-Image Pairs.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE.

CoRR, 2023

Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories.

Thomas Mensink

Jasper R. R. Uijlings

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Who are you referring to? Coreference resolution in image narrations.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Semi-supervised multimodal coreference resolution in image narrations.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter.

Georgios Tziafas

Yucheng Xu

Arushi Goel

Mohammadreza Mohades Kasaei

Zhibin Li

Hamidreza Kasaei

Proceedings of the Conference on Robot Learning, 2023

2022

Who are you referring to? Weakly supervised coreference resolution with multimodal grounding.

CoRR, 2022

WiCV 2022: The Tenth Women In Computer Vision Workshop.

Niveditha Kalavakonda

CoRR, 2022

WiCV 2021: The Eighth Women In Computer Vision Workshop.

Arushi Goel

Niveditha Kalavakonda

CoRR, 2022

PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels.

Arushi Goel

Yunlong Jiao

Jordan Massiah

CoRR, 2022

Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2020

Injecting Prior Knowledge into Image Caption Generation.

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

2019

Learning to Caption Images with Two-Stream Attention and Sentence Auto-Encoder.

CoRR, 2019

Cross-Domain Image Classification through Neural-Style Transfer Data Augmentation.

Yijie Xu

Arushi Goel

CoRR, 2019

A Multimodal LSTM for Predicting Listener Empathic Responses Over Time.

Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

An End-To-End Network for Generating Social Relationship Graphs.

Arushi Goel

Keng Teck Ma

Cheston Tan

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Semantic Roles in VerbNet and FrameNet: Statistical Analysis and Evaluation.

Aliaksandr Huminski

Fiona Liausvia

Arushi Goel

Proceedings of the Computational Linguistics and Intelligent Text Processing, 2019
