Shivam Mehta

ORCID: 0000-0002-1886-681X

According to our database, Shivam Mehta authored at least 21 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction.

CoRR, October, 2025

SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment.

CoRR, July, 2025

EmojiVoice: Towards long-term controllable expressivity in robot speech.

CoRR, June, 2025

SawtArabi: A Benchmark Corpus for Arabic TTS. Standard, Dialectal and Code-Switching.

Shammur Absar Chowdhury

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Make Some Noise: Towards LLM audio reasoning and generation using sound tokens.

Shivam Mehta

Nebojsa Jojic

Hannes Gamper

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Take a Look, it's in a Book, a Reading Robot.

Jean-Julien Aucouturier

Angelica Lim

Proceedings of the 20th ACM/IEEE International Conference on Human-Robot Interaction, 2025

2024

Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis.

CoRR, 2024

Drone Pollution Tracking in Cities Using Recurrent Proximal Policy Optimization Learning.

Proceedings of the IEEE International Smart Cities Conference, 2024

Beyond graphemes and phonemes: continuous phonological features in neural text-to-speech synthesis.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching.

Proceedings of the IEEE International Conference on Acoustics, 2024

Unified Speech and Gesture Synthesis Using Flow Matching.

Proceedings of the IEEE International Conference on Acoustics, 2024

Fake it to make it: Using synthetic data to remedy the data shortage in joint multi-modal speech-and-gesture synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Stereotypical nationality representations in HRI: perspectives from international young adults.

Frontiers Robotics AI, March, 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

OverFlow: Putting flows on top of neural transducers for better TTS.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation.

Proceedings of the 25th International Conference on Multimodal Interaction, 2023

Prosody-Controllable Spontaneous TTS with Neural HMMs.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).

Proceedings of the IEEE International Conference on Acoustics, 2022

2020

Finding the Blank with Sequence Labeling for English Learning.

Shivam Mehta

Ivan Smetannikov

Proceedings of the CCRIS 2020: International Conference on Control, 2020
