We stand with Ukraine

Vineet Gandhi

Orcid: 0000-0001-8861-7731

According to our database, Vineet Gandhi authored at least 61 papers between 2012 and 2024.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2024

Towards Improving NAM-to-Speech Synthesis Intelligibility using Self-Supervised Speech Models.

Neil Kumar Shah

,

Shirish Karande

,

CoRR, 2024

Major Entity Identification: A Generalizable Alternative to Coreference Resolution.

Kawshik Manikantan

,

Shubham Toshniwal

,

Makarand Tapaswi

,

CoRR, 2024

VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

Darshana Saravanan

,

Darshan Singh S

,

,

,

,

Makarand Tapaswi

CoRR, 2024

SARI: Simplistic Average and Robust Identification based Noisy Partial Label Learning.

Darshana Saravanan

,

,

CoRR, 2024

Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings.

,

,

Adhiraj Anil Deshmukh

,

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations.

Neil Kumar Shah

,

,

Vishal Tambrahalli

,

,

,

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023

MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting.

Neil Kumar Shah

,

Vishal Tambrahalli

,

,

Niranjan Pedanekar

,

CoRR, 2023

ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations.

,

Neil Kumar Shah

,

Vishal Tambrahalli

,

,

CoRR, 2023

Bringing Generalization to Deep Multi-View Pedestrian Detection.

,

Swetanjal Dutta

,

,

Shyamgopal Karthik

,

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

Assessing active speaker detection algorithms through the lens of automated editing.

,

,

Adhiraj Deshmukh

,

Proceedings of the 2023 ACM International Conference on Interactive Media Experiences Workshops, 2023

Instance-Level Semantic Maps for Vision Language Navigation.

,

,

,

Raghav Prabhakar

,

,

,

Krishna Murthy Jatavallabhula

,

A. H. Abdul Hafez

,

,

K. Madhava Krishna

Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification.

,

Shyamgopal Karthik

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Ground then Navigate: Language-guided Navigation in Dynamic Scenes.

,

Varun Chhangani

,

,

K. Madhava Krishna

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Adversarial Robustness of Mel Based Speaker Recognition Systems.

Ritu Srivastava

,

,

Sarath Sivaprasad

,

,

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations.

,

Neil Kumar Shah

,

Vishal Tambrahalli

,

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

The Prose Storyboard Language: A Tool for Annotating and Directing Movies.

,

,

,

Vaishnavi Ameya Murukutla

Proceedings of the Workshop on Intelligent Cinematography and Editing, 2022

Framework to Computationally Analyze Kathakali Videos.

Pratikkumar Bulani

,

Jayachandran S.

,

Sarath Sivaprasad

,

Proceedings of the Workshop on Intelligent Cinematography and Editing, 2022

Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems.

,

Sarath Sivaprasad

,

Niranjan Pedanekar

,

,

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Cross-Domain Class-Contrastive Learning: Finding Lower Dimensional Representations for Improved Domain Generalization.

,

,

Proceedings of the Thirteenth Indian Conference on Computer Vision, 2022

Does Audio help in deep Audio-Visual Saliency prediction models?

,

,

,

Sarath Sivaprasad

,

Proceedings of the International Conference on Multimodal Interaction, 2022

Comprehensive Multi-Modal Interactions for Referring Image Segmentation.

,

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

Reappraising Domain Generalization in Neural Networks.

Sarath Sivaprasad

,

Akshay Goindani

,

,

CoRR, 2021

Bringing Generalization to Deep Multi-view Detection.

,

Swetanjal Dutta

,

Shyamgopal Karthik

,

CoRR, 2021

The Curious Case of Convex Neural Networks.

Sarath Sivaprasad

,

,

,

Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Grounding Linguistic Commands to Navigable Regions.

,

,

Unni Krishnan R. Nair

,

,

K. Madhava Krishna

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction.

,

Pradeep Yarlagadda

,

,

Shyamgopal Karthik

,

Ramanathan Subramanian

,

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Emotional Prosody Control for Speech Generation.

Sarath Sivaprasad

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks.

Shyamgopal Karthik

,

,

Puneet K. Dokania

,

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

TextureToMTF: predicting spatial frequency response in the wild.

,

Sajal Maheshwari

,

Signal Image Video Process., 2020

AViNet: Diving Deep into Audio-Visual Saliency Prediction.

,

Pradeep Yarlagadda

,

Ramanathan Subramanian

,

CoRR, 2020

The Curious Case of Convex Networks.

Sarath Sivaprasad

,

,

CoRR, 2020

Simple Unsupervised Multi-Object Tracking.

Shyamgopal Karthik

,

,

CoRR, 2020

GAZED - Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings.

K. L. Bhanu Moorthy

,

,

Ramanathan Subramanian

,

Proceedings of the 9th Workshop on Intelligent Cinematography and Editing, 2020

CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems.

,

K. L. Bhanu Moorthy

,

,

,

,

Anoop M. Namboodiri

Proceedings of the 9th Workshop on Intelligent Cinematography and Editing, 2020

Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability.

Shyamgopal Karthik

,

Abhinav Moudgil

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

LiDAR guided Small obstacle Segmentation.

,

Aditya Kamireddypalli

,

,

K. Madhava Krishna

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Tidying Deep Saliency Prediction Architectures.

,

,

Pradeep Yarlagadda

,

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

ColorArt: Suggesting Colorizations For Graphic Arts Using Optimal Color-Graph Matching.

,

Proceedings of the 45th Graphics Interface Conference 2020, 2020

Cosine Meets Softmax: A Tough-to-beat Baseline for Visual Grounding.

,

Unni Krishnan R. Nair

,

K. Madhava Krishna

,

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

2019

Talk to the Vehicle: Language Conditioned Autonomous Navigation of Self Driving Cars.

,

,

Jayaganesh Kalyanasundaram

,

,

Brojeshwar Bhowmick

,

K. Madhava Krishna

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision.

Syed Ashar Javed

,

,

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Nose, Eyes and Ears: Head Pose Estimation by Locating Facial Keypoints.

,

Kalpit C. Thakkar

,

,

P. J. Narayanan

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Watch to Edit: Video Retargeting using Gaze.

Kranthi Kumar Rachavarapu

,

,

,

Ramanathan Subramanian

Comput. Graph. Forum, 2018

Automated Top View Registration of Broadcast Football Videos.

Rahul Anand Sharma

,

,

,

Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

MergeNet: A Deep Net Architecture for Small Obstacle Discovery.

,

Syed Ashar Javed

,

,

K. Madhava Krishna

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

An Iterative Approach for Shadow Removal in Document Images.

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Document Quality Estimation Using Spatial Frequency Response.

Pranjal Kumar Rai

,

Sajal Maheshwari

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Long-Term Visual Object Tracking Benchmark.

Abhinav Moudgil

,

Proceedings of the Computer Vision - ACCV 2018, 2018

2017

Automatic analysis of broadcast football videos using contextual priors.

Rahul Anand Sharma

,

,

,

Signal Image Video Process., 2017

Zooming On All Actors: Automatic Focus+Context Split Screen Video Generation.

,

,

,

Michael Gleicher

Comput. Graph. Forum, 2017

3D Region Proposals For Selective Object Search.

,

,

K. Madhava Krishna

Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) - Volume 5: VISAPP, Porto, Portugal, February 27, 2017

Beyond OCRs for Document Blur Estimation.

Pranjal Kumar Rai

,

Sajal Maheshwari

,

,

Parikshit Sakurikar

,

Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Small obstacle detection using stereo vision for autonomous ground vehicle.

,

Sarthak Upadhyay

,

,

K. Madhava Krishna

Proceedings of the Advances in Robotics, 2017

2016

Document blur detection using edge profile mining.

Sajal Maheshwari

,

Pranjal Kumar Rai

,

,

Proceedings of the Tenth Indian Conference on Computer Vision, 2016

2015

The Prose Storyboard Language: A Tool for Annotating and Directing Movies.

,

,

CoRR, 2015

A Computational Framework for Vertical Video Editing.

,

Proceedings of the 4th Workshop on Intelligent Cinematography and Editing, 2015

Capturing and Indexing Rehearsals: The Design and Usage of a Digital Archive of Performing Arts.

,

Benoît Encelle

,

,

Pierre-Antoine Champin

,

,

,

Cyrille Migniot

,

Proceedings of the 2nd Digital Heritage International Congress, 2015

2014

Automatic Rush Generation with Application to Theatre Performances. (Génération Automatique de Prises de Vues Cinématographiques avec Applications aux Captations de Théâtre).

PhD thesis, 2014

Multi-clip video editing from a single viewpoint.

,

,

Michael Gleicher

Proceedings of the 11th European Conference on Visual Media Production, 2014

2013

Detecting and Naming Actors in Movies Using Generative Appearance Models.

,

Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012

High-resolution depth maps based on TOF-stereo fusion.

,

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2012
