Vineet Gandhi

Orcid: 0000-0001-8861-7731

According to our database, Vineet Gandhi authored at least 58 papers between 2012 and 2024.

Bibliography

2024
SARI: Simplistic Average and Robust Identification based Noisy Partial Label Learning.
CoRR, 2024

Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting.
CoRR, 2023

ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations.
CoRR, 2023

Bringing Generalization to Deep Multi-View Pedestrian Detection.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

Assessing active speaker detection algorithms through the lens of automated editing.
Proceedings of the 2023 ACM International Conference on Interactive Media Experiences Workshops, 2023

Instance-Level Semantic Maps for Vision Language Navigation.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Ground then Navigate: Language-guided Navigation in Dynamic Scenes.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Adversarial Robustness of Mel Based Speaker Recognition Systems.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
The Prose Storyboard Language: A Tool for Annotating and Directing Movies.
Proceedings of the Workshop on Intelligent Cinematography and Editing, 2022

Framework to Computationally Analyze Kathakali Videos.
Proceedings of the Workshop on Intelligent Cinematography and Editing, 2022

Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Cross-Domain Class-Contrastive Learning: Finding Lower Dimensional Representations for Improved Domain Generalization.
Proceedings of the Thirteenth Indian Conference on Computer Vision, 2022

Does Audio help in deep Audio-Visual Saliency prediction models?
Proceedings of the International Conference on Multimodal Interaction, 2022

Comprehensive Multi-Modal Interactions for Referring Image Segmentation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Reappraising Domain Generalization in Neural Networks.
CoRR, 2021

Bringing Generalization to Deep Multi-view Detection.
CoRR, 2021

The Curious Case of Convex Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Grounding Linguistic Commands to Navigable Regions.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Emotional Prosody Control for Speech Generation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
TextureToMTF: predicting spatial frequency response in the wild.
Signal Image Video Process., 2020

AViNet: Diving Deep into Audio-Visual Saliency Prediction.
CoRR, 2020

The Curious Case of Convex Networks.
CoRR, 2020

Simple Unsupervised Multi-Object Tracking.
CoRR, 2020

CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems.
Proceedings of the 9th Workshop on Intelligent Cinematography and Editing, 2020

Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

LiDAR guided Small obstacle Segmentation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Tidying Deep Saliency Prediction Architectures.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

ColorArt: Suggesting Colorizations For Graphic Arts Using Optimal Color-Graph Matching.
Proceedings of the 45th Graphics Interface Conference 2020, 2020

Cosine Meets Softmax: A Tough-to-beat Baseline for Visual Grounding.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

GAZED- Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings.
Proceedings of the CHI '20: CHI Conference on Human Factors in Computing Systems, 2020

2019
Talk to the Vehicle: Language Conditioned Autonomous Navigation of Self Driving Cars.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Nose, Eyes and Ears: Head Pose Estimation by Locating Facial Keypoints.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Watch to Edit: Video Retargeting using Gaze.
Comput. Graph. Forum, 2018

Automated Top View Registration of Broadcast Football Videos.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

MergeNet: A Deep Net Architecture for Small Obstacle Discovery.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

An Iterative Approach for Shadow Removal in Document Images.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Document Quality Estimation Using Spatial Frequency Response.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Long-Term Visual Object Tracking Benchmark.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Automatic analysis of broadcast football videos using contextual priors.
Signal Image Video Process., 2017

Zooming On All Actors: Automatic Focus+Context Split Screen Video Generation.
Comput. Graph. Forum, 2017

3D Region Proposals For Selective Object Search.
Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) - Volume 5: VISAPP, Porto, Portugal, February 27, 2017

Beyond OCRs for Document Blur Estimation.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Small obstacle detection using stereo vision for autonomous ground vehicle.
Proceedings of the Advances in Robotics, 2017

2016
Document blur detection using edge profile mining.
Proceedings of the Tenth Indian Conference on Computer Vision, 2016

2015
The Prose Storyboard Language: A Tool for Annotating and Directing Movies.
CoRR, 2015

A Computational Framework for Vertical Video Editing.
Proceedings of the 4th Workshop on Intelligent Cinematography and Editing, 2015

Capturing and Indexing Rehearsals: The Design and Usage of a Digital Archive of Performing Arts.
Proceedings of the 2nd Digital Heritage International Congress, 2015

2014
Automatic Rush Generation with Application to Theatre Performances. (Génération Automatique de Prises de Vues Cinématographiques avec Applications aux Captations de Théâtre).
PhD thesis, 2014

Multi-clip video editing from a single viewpoint.
Proceedings of the 11th European Conference on Visual Media Production, 2014

2013
Detecting and Naming Actors in Movies Using Generative Appearance Models.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
High-resolution depth maps based on TOF-stereo fusion.
Proceedings of the IEEE International Conference on Robotics and Automation, 2012

