Nirmesh J. Shah

Orcid: 0000-0002-7294-6757

Affiliations:

Sony Research India

According to our database¹, Nirmesh J. Shah authored at least 42 papers between 2013 and 2026.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2026

Gesture2Speech: How Far Can Hand Movements Shape Expressive Speech?

,

,

Ashishkumar P. Gudmalwar

,

CoRR, March, 2026

2025

REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion.

Ishan D. Biyani

,

Nirmesh J. Shah

,

Ashishkumar P. Gudmalwar

,

,

Rajiv Ratn Shah

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion.

Ashishkumar Prabhakar Gudmalwar

,

Ishan Darshan Biyani

,

Nirmesh J. Shah

,

,

Rajiv Ratn Shah

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion.

Ashishkumar Gudmalwar

,

Ishan D. Biyani

,

,

,

Rajiv Ratn Shah

CoRR, 2024

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing.

,

Ashishkumar P. Gudmalwar

,

,

,

Rajiv Ratn Shah

CoRR, 2024

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech.

Ashishkumar P. Gudmalwar

,

,

,

,

Rajiv Ratn Shah

CoRR, 2024

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning.

Shivam Ratnakant Mhaskar

,

Nirmesh J. Shah

,

,

Ashishkumar P. Gudmalwar

,

,

Rajiv Ratn Shah

CoRR, 2024

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning.

,

,

,

Ashishkumar P. Gudmalwar

,

,

Rajiv Ratn Shah

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing.

,

Ashishkumar Gudmalwar

,

,

,

Rajiv Ratn Shah

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech.

Ashishkumar Gudmalwar

,

,

,

,

Rajiv Ratn Shah

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

Nonparallel Emotional Voice Conversion for Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing.

,

Mayank Kumar Singh

,

Naoya Takahashi

,

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Semi-supervised Acoustic and Language Modeling for Hindi ASR.

Tarun Sai Bandarupalli

,

,

,

,

Sriram Ganapathy

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation.

Vishal M. Chudasama

,

,

Ashish Gudmalwar

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

Exploiting Phase-based Features for Whisper vs. Speech Classification.

Nirmesh J. Shah

,

M. Ali Basha Shaik

,

,

Hemant A. Patil

,

Proceedings of the 29th European Signal Processing Conference, 2021

2020

Intelligibility Improvement of Dysarthric Speech using MMSE DiscoGAN.

,

,

Harshit Malaviya

,

,

,

Nirmesh J. Shah

,

,

Hemant A. Patil

Proceedings of the International Conference on Signal Processing and Communications, 2020

Query-By-Example Spoken Term Detection Using Generative Adversarial Network.

,

,

Maulik C. Madhavi

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

A novel approach to remove outliers for parallel voice conversion.

Nirmesh J. Shah

,

Hemant A. Patil

Comput. Speech Lang., 2019

Novel Inception-GAN for Whispered-to-Normal Speech Conversion.

,

,

,

,

Hemant A. Patil

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Whether to Pretrain DNN or not?: An Empirical Analysis for Voice Conversion.

Nirmesh J. Shah

,

Hardik B. Sailor

,

Hemant A. Patil

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Phone Aware Nearest Neighbor Technique Using Spectral Transition Measure for Non-Parallel Voice Conversion.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Novel Metric Learning for Non-parallel Voice Conversion.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Effectiveness of Cross-Domain Architectures for Whisper-to-Normal Speech Conversion.

,

,

Nirmesh J. Shah

,

,

Hemant A. Patil

Proceedings of the 27th European Signal Processing Conference, 2019

Novel Adaptive Generative Adversarial Network for Voice Conversion.

,

,

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Effectiveness of Generative Adversarial Network for Non-Audible Murmur-to-Whisper Speech Conversion.

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Effectiveness of Dynamic Features in INCA and Temporal Context-INCA.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Unsupervised Vocal Tract Length Warped Posterior Features for Non-Parallel Voice Conversion.

Nirmesh J. Shah

,

Maulik C. Madhavi

,

Hemant A. Patil

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Novel Inter Mixture Weighted GMM Posteriorgram for DNN and GAN-based Voice Conversion.

Nirmesh J. Shah

,

,

,

Hemant A. Patil

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Analysis of Features and Metrics for Alignment in Text-Dependent Voice Conversion.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Novel Amplitude Scaling method for bilinear frequency Warping-based Voice Conversion.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Quality assessment of voice converted speech using articulatory features.

,

Nirmesh J. Shah

,

,

Hemant A. Patil

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

On the convergence of INCA algorithm.

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A novel filtering-based F0 estimation algorithm with an application to voice conversion.

Nirmesh J. Shah

,

Pramod B. Bachhav

,

Hemant A. Patil

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Novel Pre-processing using Outlier Removal in Voice Conversion.

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

2015

Effectiveness of multiscale fractal dimension for improvement of frame classification rate.

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 23rd European Signal Processing Conference, 2015

2014

Effectiveness of fractal dimension for ASR in low resource language.

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Deterministic annealing EM algorithm for developing TTS system in Gujarati.

Nirmesh J. Shah

,

Hemant A. Patil

,

Maulik C. Madhavi

,

Hardik B. Sailor

,

Tanvina B. Patel

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Effectiveness of PLP-based phonetic segmentation for speech synthesis.

Nirmesh J. Shah

,

Bhavik B. Vachhani

,

Hardik B. Sailor

,

Hemant A. Patil

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Effectiveness of multiscale fractal dimension-based phonetic segmentation in speech synthesis for low resource language.

,

Nirmesh J. Shah

,

Hemant A. Patil

Proceedings of the 2014 International Conference on Asian Language Processing, 2014

Influence of various asymmetrical contextual factors for TTS in a low resource language.

Nirmesh J. Shah

,

,

Hemant A. Patil

Proceedings of the 2014 International Conference on Asian Language Processing, 2014

2013

Algorithms for speech segmentation at syllable-level for text-to-speech synthesis system in Gujarati.

Hemant A. Patil

,

Tanvina B. Patel

,

,

Nirmesh J. Shah

,

Hardik B. Sailor

,

Bhavik B. Vachhani

,

,

Bhargav Kanakiya

,

,

Vibha Prajapati

Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

A syllable-based framework for unit selection synthesis in 13 Indian languages.

Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

A Novel Gaussian Filter-Based Automatic Labeling of Speech Data for TTS System in Gujarati Language.

,

Hemant A. Patil

,

Tanvina B. Patel

,

Hardik B. Sailor

,

Nirmesh J. Shah

Proceedings of the 2013 International Conference on Asian Language Processing, 2013
