Thomas Merritt

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech.

Jaime Lorenzo-Trueba

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AE-Flow: Autoencoder Normalizing Flow.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Expressive, Variable, and Controllable Duration Modelling in TTS.

CoRR, 2022

Text-free non-parallel many-to-many voice conversion using normalising flows.

CoRR, 2022

Remap, Warp and Attend: Non-Parallel Many-to-Many Accent Conversion with Normalizing Flows.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion.

Magdalena Proszewska

Grzegorz Beringer

Daniel Sáez-Trigueros

Abdelhamid Ezzerg

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Creating New Voices using Normalizing Flows.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Expressive, Variable, and Controllable Duration Modelling in TTS.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Text-Free Non-Parallel Many-To-Many Voice Conversion Using Normalising Flow.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Low-Resource Expressive Text-To-Speech Using Data Augmentation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

CAMP: A Two-Stage Approach to Modelling Prosody in Context.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Parallel WaveNet conditioned on VAE latent vectors.

CoRR, 2020

2019

In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data.

Nishant Prateek

Mateusz Lajszczak

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Towards Achieving Robust Universal Neural Vocoding.

Alexis Moinet

Vatsal Aggarwal

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Effect of Data Reduction on Sequence-to-sequence Neural TTS.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Effect of data reduction on sequence-to-sequence neural TTS.

CoRR, 2018

Robust universal neural vocoding.

CoRR, 2018

Analysing Shortcomings of Statistical Parametric Speech Synthesis.

CoRR, 2018

Comprehensive Evaluation of Statistical Speech Waveform Synthesis.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2017

Overcoming the limitations of statistical parametric speech synthesis.

PhD thesis, 2017

Phrase Break Prediction for Long-Form Reading TTS: Exploiting Text Structure Information.

Thomas Drugman

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

From HMMs to DNNs: Where do the improvements come from?

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Deep neural network-guided unit selection synthesis.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

The CSTR entry to the Blizzard Challenge 2016.

Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016

2015

Deep neural network context embeddings for model selection in rich-context HMM synthesis.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Attributing modelling errors in HMM synthesis by stepping gradually from natural to modelled speech.

Javier Latorre

Simon King

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014

Investigating source and filter contributions, and their interaction, to statistical parametric speech synthesis.

Tuomo Raitio

Simon King

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A flexible front-end for HTS.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

Investigating the shortcomings of HMM synthesis.
