Jaime Lorenzo-Trueba

Orcid: 0000-0003-0459-1429

According to our database, Jaime Lorenzo-Trueba authored at least 49 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations.
CoRR, 2024

2023
Multilingual context-based pronunciation learning for Text-to-Speech.
CoRR, 2023

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech.
CoRR, 2023

Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings.
CoRR, 2023

2022
Computer-assisted pronunciation training - Speech synthesis is almost all you need.
Speech Commun., 2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation.
Proceedings of the Interspeech 2022, 2022

Cross-Speaker Style Transfer for Text-to-Speech Using Data Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2022

Voice Filter: Few-Shot Text-to-Speech Speaker Adaptation Using Voice Conversion as a Post-Processing Module.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Enhancing audio quality for expressive Neural Text-to-Speech.
CoRR, 2021

Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments.
CoRR, 2021

EmoCat: Language-agnostic Emotional Voice Conversion.
CoRR, 2021

Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021

Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Learned Conditional Prior for the VAE Acoustic Space of a TTS System.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling.
Proceedings of the IEEE International Conference on Acoustics, 2021

Low-Resource Expressive Text-To-Speech Using Data Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2021

CAMP: A Two-Stage Approach to Modelling Prosody in Context.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Voice Conversion for Whispered Speech Synthesis.
IEEE Signal Process. Lett., 2020

Parallel WaveNet conditioned on VAE latent vectors.
CoRR, 2020

Dynamic Prosody Generation for Speech Synthesis Using Linguistics-Driven Acoustic Embedding Selection.
Proceedings of the Interspeech 2020, 2020

Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019

In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Towards Achieving Robust Universal Neural Vocoding.
Proceedings of the Interspeech 2019, 2019

Effect of Data Reduction on Sequence-to-sequence Neural TTS.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.
Speech Commun., 2018

Effect of data reduction on sequence-to-sequence neural TTS.
CoRR, 2018

Robust universal neural vocoding.
CoRR, 2018

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Expressive Speech Synthesis Using Sentiment Embeddings.
Proceedings of the Interspeech 2018, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Misperceptions of the Emotional Content of Natural and Vocoded Speech in a Car.
Proceedings of the Interspeech 2017, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.
Proceedings of the Interspeech 2017, 2017

2016
Segmenting human activities based on HMMs using smartphone inertial sensors.
Pervasive Mob. Comput., 2016

Continuous Expressive Speaking Styles Synthesis based on CVSM and MR-HMM.
Proceedings of the COLING 2016, 2016

2015
Emotion transplantation through adaptation in HMM-based speech synthesis.
Comput. Speech Lang., 2015

2014
Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation.
Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

Towards Cross-Lingual Emotion Transplantation.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

2013
I Feel You: The Design and Evaluation of a Domotic Affect-Sensitive Spoken Conversational Agent.
Sensors, 2013

Towards speaking style transplantation in speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

NEMOHIFI: an affective HiFi agent.
Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

2012
Sentence selection for improving the tuning process of a statistical machine translation system.
Proces. del Leng. Natural, 2012

Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Towards Glottal Source Controllability in Expressive Speech Synthesis.
Proceedings of the INTERSPEECH 2012, 2012

