Jaime Lorenzo-Trueba

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Learned Conditional Prior for the VAE Acoustic Space of a TTS System.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling.

Proceedings of the IEEE International Conference on Acoustics, 2021

Low-Resource Expressive Text-To-Speech Using Data Augmentation.

Proceedings of the IEEE International Conference on Acoustics, 2021

CAMP: A Two-Stage Approach to Modelling Prosody in Context.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Voice Conversion for Whispered Speech Synthesis.

IEEE Signal Process. Lett., 2020

Parallel WaveNet conditioned on VAE latent vectors.

CoRR, 2020

Dynamic Prosody Generation for Speech Synthesis Using Linguistics-Driven Acoustic Embedding Selection.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.

CoRR, 2019

In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data.

Nishant Prateek

Mateusz Lajszczak

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Towards Achieving Robust Universal Neural Vocoding.

Alexis Moinet

Vatsal Aggarwal

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Effect of Data Reduction on Sequence-to-sequence Neural TTS.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.

Speech Commun., 2018

Effect of data reduction on sequence-to-sequence neural TTS.

CoRR, 2018

Robust universal neural vocoding.

CoRR, 2018

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.

Fernando Villavicencio

Tomi Kinnunen

Zhen-Hua Ling

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.

Fernando Villavicencio

Zhen-Hua Ling

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Expressive Speech Synthesis Using Sentiment Embeddings.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Misperceptions of the Emotional Content of Natural and Vocoded Speech in a Car.

Cassia Valentini-Botinhao

Gustav Eje Henter

Rubén San-Segundo-Hernández

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Segmenting human activities based on HMMs using smartphone inertial sensors.

Beatriz Martínez-González

José M. Pardo

Pervasive Mob. Comput., 2016

Continuous Expressive Speaking Styles Synthesis based on CVSM and MR-HMM.

Ascensión Gallardo-Antolín

Proceedings of the COLING 2016, 2016

2015

Emotion transplantation through adaptation in HMM-based speech synthesis.

Comput. Speech Lang., 2015

2014

Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation.

Julián D. Echeverry-Correa

Rubén San-Segundo-Hernández

Javier Ferreiros

Ascensión Gallardo-Antolín

Juan Manuel Montero-Martínez

Simon King

Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

Towards Cross-Lingual Emotion Transplantation.

Fernando Fernández Martínez

Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

2013

I Feel You: The Design and Evaluation of a Domotic Affect-Sensitive Spoken Conversational Agent.

Syaheerah Lebai Lutfi

Sensors, 2013

Towards speaking style transplantation in speech synthesis.

Oliver Watts

Fernando Fernández Martínez

Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

NEMOHIFI: an affective HiFi agent.

Syaheerah Lebai Lutfi

Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

2012

Sentence selection for improving the tuning process of a statistical machine translation system.

Verónica López-Ludeña

Rubén San Segundo

Proces. del Leng. Natural, 2012

Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization.

Beatriz Martínez-González

Verónica López-Ludeña

Javier Ferreiros

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Towards Glottal Source Controllability in Expressive Speech Synthesis.
