Alexis Moinet

According to our database, Alexis Moinet authored at least 43 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data.
CoRR, 2024

2023
A Comparative Analysis of Pretrained Language Models for Text-to-Speech.
CoRR, 2023

Controllable Emphasis with zero data for text-to-speech.
CoRR, 2023

eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer.
CoRR, 2023

2022
Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody.
CoRR, 2022

Expressive, Variable, and Controllable Duration Modelling in TTS.
CoRR, 2022

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer.
CoRR, 2022

Cross-lingual Style Transfer with Conditional Prior VAE and Style Loss.
Proceedings of the Interspeech 2022, 2022

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody.
Proceedings of the Interspeech 2022, 2022

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer.
Proceedings of the Interspeech 2022, 2022

Expressive, Variable, and Controllable Duration Modelling in TTS.
Proceedings of the Interspeech 2022, 2022

Distribution Augmentation for Low-Resource Expressive Text-To-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech.
CoRR, 2021

A Learned Conditional Prior for the VAE Acoustic Space of a TTS System.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2021

Camp: A Two-Stage Approach to Modelling Prosody in Context.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Voice Conversion for Whispered Speech Synthesis.
IEEE Signal Process. Lett., 2020

Parallel WaveNet conditioned on VAE latent vectors.
CoRR, 2020

CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech.
Proceedings of the Interspeech 2020, 2020

Singing Synthesis: With a Little Help from my Attention.
Proceedings of the Interspeech 2020, 2020

2019
Towards Achieving Robust Universal Neural Vocoding.
Proceedings of the Interspeech 2019, 2019

2018
Traditional Machine Learning for Pitch Detection.
IEEE Signal Process. Lett., 2018

Comprehensive Evaluation of Statistical Speech Waveform Synthesis.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Parameter Generation Algorithms for Text-To-Speech Synthesis with Recurrent Neural Networks.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2017
Phrase Break Prediction for Long-Form Reading TTS: Exploiting Text Structure Information.
Proceedings of the Interspeech 2017, 2017

2016
A Semantic and Content-Based Search User Interface for Browsing Large Collections of Foley Sounds.
Proceedings of the Audio Mostly 2016, Norrköping, Sweden, October 4-6, 2016, 2016

2015
An HMM approach for synthesizing amused speech with a controllable intensity of smile.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

2013
Mage - HMM-based speech synthesis reactively controlled by the articulators.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Mage - reactive articulatory feature control of HMM-based parametric speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

MAGE 2.0: New Features and its Application in the Development of a Talking Guitar.
Proceedings of the 13th International Conference on New Interfaces for Musical Expression, 2013

VideoCycle: User-Friendly Navigation by Similarity in Video Databases.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Reactive Statistical Mapping: Towards the Sketching of Performative Control with Data.
Proceedings of the Innovative and Creative Developments in Multimodal Interaction Systems, 2013

2012
Stylistic gait synthesis based on hidden Markov models.
EURASIP J. Adv. Signal Process., 2012

LoopJam: turning the dance floor into a collaborative instrumental map.
Proceedings of the 12th International Conference on New Interfaces for Musical Expression, 2012

2010
AVLaughterCycle.
J. Multimodal User Interfaces, 2010

The AVLaughterCycle Database.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

2009
Cross-language voice conversion based on eigenvoices.
Proceedings of the INTERSPEECH 2009, 2009

Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
RAMCESS 2.X framework - expressive voice analysis for realtime and accurate synthesis of singing.
J. Multimodal User Interfaces, 2008

Glottal Source Estimation Robustness - A Comparison of Sensitivity of Voice Source Estimation Techniques.
Proceedings of the SIGMAP 2008, 2008

Voice source parameters estimation by fitting the glottal formant and the inverse filtering open phase.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Causal/anticausal Decomposition for mixed-phase Description of brass and Bowed String sounds.
Proceedings of the 2007 International Computer Music Conference, 2007

Towards a Voice Conversion System Based on Frame Selection.
Proceedings of the IEEE International Conference on Acoustics, 2007

