Shivam Mehta

ORCID: 0000-0002-1886-681X

According to our database, Shivam Mehta authored at least 19 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment.
CoRR, July, 2025

EmojiVoice: Towards long-term controllable expressivity in robot speech.
CoRR, June, 2025

Make Some Noise: Towards LLM audio reasoning and generation using sound tokens.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Take a Look, it's in a Book, a Reading Robot.
Proceedings of the 20th ACM/IEEE International Conference on Human-Robot Interaction, 2025

2024
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis.
CoRR, 2024

Drone Pollution Tracking in Cities Using Recurrent Proximal Policy Optimization Learning.
Proceedings of the IEEE International Smart Cities Conference, 2024

Beyond graphemes and phonemes: continuous phonological features in neural text-to-speech synthesis.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching.
Proceedings of the IEEE International Conference on Acoustics, 2024

Unified Speech and Gesture Synthesis Using Flow Matching.
Proceedings of the IEEE International Conference on Acoustics, 2024

Fake it to make it: Using synthetic data to remedy the data shortage in joint multi-modal speech-and-gesture synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Stereotypical nationality representations in HRI: perspectives from international young adults.
Frontiers Robotics AI, March, 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

OverFlow: Putting flows on top of neural transducers for better TTS.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

Prosody-Controllable Spontaneous TTS with Neural HMMs.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).
Proceedings of the IEEE International Conference on Acoustics, 2022

2020
Finding the Blank with Sequence Labeling for English Learning.
Proceedings of the CCRIS 2020: International Conference on Control, 2020

