Vimal Bhat

According to our database, Vimal Bhat authored at least 16 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
RefTok: Reference-Based Tokenization for Video Generation.
CoRR, July, 2025

GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Now you see Me: Context-Aware Automatic Audio Description.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Detect, Disambiguate, and Translate: On-Demand Visual Reasoning for Multimodal Machine Translation with Large Vision-Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Learning Rich Speech Representations with Acoustic-Semantic Factorization.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Beyond Speaker Identity: Text Guided Target Speech Extraction.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
NowYouSee Me: Context-Aware Automatic Audio Description.
CoRR, 2024

DiffSign: AI-Assisted Generation of Customizable Sign Language Videos With Enhanced Realism.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Text-Guided Video Masked Autoencoder.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Nearest-Neighbor Inter-Intra Contrastive Learning from Unlabeled Videos.
CoRR, 2023

A Simple and Efficient method for Dubbed Audio Sync Detection using Compressive Sensing.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Motion-Guided Masking for Spatiotemporal Representation Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LipNeRF: What is the right feature space to lip-sync a NeRF?
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

