Vimal Bhat

According to our database, Vimal Bhat authored at least 22 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding.
CoRR, April, 2026

VIRTUE: Versatile Video Retrieval Through Unified Embeddings.
CoRR, January, 2026

Attribute-Controlled Translation with Preference Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

2025
What Happens Next? Next Scene Prediction with a Unified Video Model.
CoRR, December, 2025

RosettaSpeech: Zero-Shot Speech-to-Speech Translation from Monolingual Data.
CoRR, November, 2025

RefTok: Reference-Based Tokenization for Video Generation.
CoRR, July, 2025

GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Now you see Me: Context-Aware Automatic Audio Description.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Detect, Disambiguate, and Translate: On-Demand Visual Reasoning for Multimodal Machine Translation with Large Vision-Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Learning Rich Speech Representations with Acoustic-Semantic Factorization.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Beyond Speaker Identity: Text Guided Target Speech Extraction.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
NowYouSee Me: Context-Aware Automatic Audio Description.
CoRR, 2024

DiffSign: AI-Assisted Generation of Customizable Sign Language Videos With Enhanced Realism.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Text-Guided Video Masked Autoencoder.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Nearest-Neighbor Inter-Intra Contrastive Learning from Unlabeled Videos.
CoRR, 2023

A Simple and Efficient method for Dubbed Audio Sync Detection using Compressive Sensing.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Motion-Guided Masking for Spatiotemporal Representation Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LipNeRF: What is the right feature space to lip-sync a NeRF?
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

