Roberto Barra-Chicote

Orcid: 0000-0003-0844-7037

According to our database, Roberto Barra-Chicote authored at least 69 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations.
CoRR, 2024

2023
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech.
CoRR, 2023

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2022
Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech.
CoRR, 2022

Text-free non-parallel many-to-many voice conversion using normalising flows.
CoRR, 2022

Remap, Warp and Attend: Non-Parallel Many-to-Many Accent Conversion with Normalizing Flows.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Prosodic alignment for off-screen automatic dubbing.
Proceedings of the Interspeech 2022, 2022

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion.
Proceedings of the Interspeech 2022, 2022

Creating New Voices using Normalizing Flows.
Proceedings of the Interspeech 2022, 2022

Text-Free Non-Parallel Many-To-Many Voice Conversion Using Normalising Flow.
Proceedings of the IEEE International Conference on Acoustics, 2022

Voice Filter: Few-Shot Text-to-Speech Speaker Adaptation Using Voice Conversion as a Post-Processing Module.
Proceedings of the IEEE International Conference on Acoustics, 2022

Duration Modeling of Neural TTS for Automatic Dubbing.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Improving Multi-Speaker TTS Prosody Variance with a Residual Encoder and Normalizing Flows.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Intra-Sentential Speaking Rate Control in Neural Text-To-Speech for Automatic Dubbing.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Improving the Expressiveness of Neural Vocoding with Non-Affine Normalizing Flows.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

SynthASR: Unlocking Synthetic Data for Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Exploring the application of synthetic audio in training keyword spotters.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improvements to Prosodic Alignment for Automatic Dubbing.
Proceedings of the IEEE International Conference on Acoustics, 2021

Machine Translation Verbosity Control for Automatic Dubbing.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Parallel WaveNet conditioned on VAE latent vectors.
CoRR, 2020

From Speech-to-Speech Translation to Automatic Dubbing.
CoRR, 2020

From Speech-to-Speech Translation to Automatic Dubbing.
Proceedings of the 17th International Conference on Spoken Language Translation, 2020

Evaluating and Optimizing Prosodic Alignment for Automatic Dubbing.
Proceedings of the Interspeech 2020, 2020

BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Towards Achieving Robust Universal Neural Vocoding.
Proceedings of the Interspeech 2019, 2019

Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech.
Proceedings of the Interspeech 2019, 2019

2018
Robust universal neural vocoding.
CoRR, 2018

Comprehensive Evaluation of Statistical Speech Waveform Synthesis.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2017
Phrase Break Prediction for Long-Form Reading TTS: Exploiting Text Structure Information.
Proceedings of the Interspeech 2017, 2017

2016
Feature extraction from smartphone inertial signals for human activity segmentation.
Signal Process., 2016

Continuous Expressive Speaking Styles Synthesis based on CVSM and MR-HMM.
Proceedings of the COLING 2016, 2016

2015
Emotion transplantation through adaptation in HMM-based speech synthesis.
Comput. Speech Lang., 2015

Knowledge versus data in TTS: evaluation of a continuum of synthesis systems.
Proceedings of the INTERSPEECH 2015, 2015

2014
Translating bus information into sign language for deaf people.
Eng. Appl. Artif. Intell., 2014

Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation.
Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

Generating segmental foreign accent.
Proceedings of the INTERSPEECH 2014, 2014

Towards Cross-Lingual Emotion Transplantation.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

2013
I Feel You: The Design and Evaluation of a Domotic Affect-Sensitive Spoken Conversational Agent.
Sensors, 2013

LSESpeak: A spoken language generator for Deaf people.
Expert Syst. Appl., 2013

Towards speaking style transplantation in speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

NEMOHIFI: an affective HiFi agent.
Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

2012
Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation.
IEEE Trans. Speech Audio Process., 2012

Selection of TDOA Parameters for MDM Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Towards Glottal Source Controllability in Expressive Speech Synthesis.
Proceedings of the INTERSPEECH 2012, 2012

2011
Speaker Diarization Based on Intensity Channel Contribution.
IEEE Trans. Speech Audio Process., 2011

2010
Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech.
Speech Commun., 2010

Estudio del tipo de alineamiento en un sistema de traducción estadística de castellano a Lengua de Signos Española (LSE).
Proces. del Leng. Natural, 2010

Spoken Spanish generation from sign language.
Interact. Comput., 2010

HIFI-AV: An Audio-visual Corpus for Spoken Language Human-Machine Dialogue Research in Spanish.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

2009
Speech Technology at Home: Enhanced Interfaces for People with Disabilities.
Intell. Autom. Soft Comput., 2009

Novel Applications of Neural Networks in Speech Technology Systems: Search Space Reduction and Prosodic Modeling.
Intell. Autom. Soft Comput., 2009

Speeding Up the Design of Dialogue Applications by Using Database Contents and Structure Information.
Proceedings of the SIGDIAL 2009 Conference, 2009

Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions.
Proceedings of the INTERSPEECH 2009, 2009

Expressive Speech Identifications based on Hidden Markov Model.
Proceedings of the Second International Conference on Health Informatics, 2009

2008
Speech to sign language translation system for Spanish.
Speech Commun., 2008

Desarrollo de un Robot-Guía con Integración de un Sistema de Diálogo y Expresión de Emociones: Proyecto ROBINT.
Proces. del Leng. Natural, 2008

Aplicación de métodos estadísticos para la traducción de voz a Lengua de Signos.
Proces. del Leng. Natural, 2008

Evaluation of a spoken dialogue system for controlling a Hifi audio system.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

2007
Automatic phonetic segmentation of Spanish emotional speech.
Proceedings of the INTERSPEECH 2007, 2007

Language identification using several sources of information with a multiple-Gaussian classifier.
Proceedings of the INTERSPEECH 2007, 2007

On the limitations of voice conversion techniques in emotion identification tasks.
Proceedings of the INTERSPEECH 2007, 2007

2006
A Spanish speech to sign language translation system for assisting deaf-mute people.
Proceedings of the INTERSPEECH 2006, 2006

Prosodic and Segmental Rubrics in Emotion Identification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
New Advances in Cross-Task and Speaker Adaptation for Air Traffic Control Tasks.
Proces. del Leng. Natural, 2005

New word-level and sentence-level confidence scoring using graph theory calculus and its evaluation on speech understanding.
Proceedings of the INTERSPEECH 2005, 2005

