Junichi Yamagishi

According to our database, Junichi Yamagishi authored at least 272 papers between 2002 and 2020.

Bibliography

2020
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.
Comput. Speech Lang., 2020

Introduction to the special issue "Speaker and language characterization and recognition: Voice modeling, conversion, synthesis and ethical aspects".
Comput. Speech Lang., 2020

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals.
CoRR, 2020

Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems.
CoRR, 2020

NAUTILUS: a Versatile Voice Cloning System.
CoRR, 2020

Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis.
CoRR, 2020

The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment.
CoRR, 2020

Design Choices for X-vector Based Speaker Anonymization.
CoRR, 2020

Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction.
CoRR, 2020

Reverberation Modeling for Source-Filter-based Neural Vocoder.
CoRR, 2020

Introducing the VoicePrivacy Initiative.
CoRR, 2020

iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning.
CoRR, 2020

An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning.
CoRR, 2020

Transferring Neural Speech Waveform Synthesizers to Musical Instrument Sounds Generation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effect of Choice of Probability Distribution, Randomness, and Search Methods for Alignment Modeling in Sequence-to-Sequence Text-to-Speech Synthesis Using Hard Alignment.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Zero-Shot Multi-Speaker Text-To-Speech with State-Of-The-Art Neural Speaker Embeddings.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-Based Detection.
Proceedings of the Advanced Information Networking and Applications, 2020

2019
Introduction to Voice Presentation Attack Detection and Recent Advances.
Proceedings of the Handbook of Biometric Anti-Spoofing, 2019

Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Spatio-temporal generative adversarial network for gait anonymization.
J. Inf. Secur. Appl., 2019

Detecting and Correcting Adversarial Images Using Image Processing Operations.
CoRR, 2019

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019

The ASVspoof 2019 database.
CoRR, 2019

Security of Facial Forensics Models Against Adversarial Attacks.
CoRR, 2019

A Method for Identifying Origin of Digital Images Using a Convolution Neural Network.
CoRR, 2019

Use of a Capsule Network to Detect Fake Images and Videos.
CoRR, 2019

Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment.
CoRR, 2019

Transferring neural speech waveform synthesizers to musical instrument sounds generation.
CoRR, 2019

Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments.
CoRR, 2019

Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis.
CoRR, 2019

A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation.
CoRR, 2019

Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos.
CoRR, 2019

Speaker Anonymization Using X-vector and Neural Waveform Models.
CoRR, 2019

Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform.
CoRR, 2019

Introduction to Voice Presentation Attack Detection and Recent Advances.
CoRR, 2019

ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection.
Proceedings of the Interspeech 2019, 2019

Training Multi-Speaker Neural Text-to-Speech Systems Using Speaker-Imbalanced Speech Corpora.
Proceedings of the Interspeech 2019, 2019

MOSNet: Deep Learning-Based Objective Assessment for Voice Conversion.
Proceedings of the Interspeech 2019, 2019

GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram.
Proceedings of the Interspeech 2019, 2019

Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise.
Proceedings of the Interspeech 2019, 2019

Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and WaveNet.
Proceedings of the Interspeech 2019, 2019

Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language.
Proceedings of the IEEE International Conference on Acoustics, 2019

Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

STFT Spectral Loss for Training a Neural Speech Waveform Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos.
Proceedings of the IEEE International Conference on Acoustics, 2019

Attentive Filtering Networks for Audio Replay Attack Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

Waveform Generation for Text-to-speech Synthesis Using Pitch-synchronous Multi-scale Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

Audiovisual Speaker Conversion: Jointly and Simultaneously Transforming Facial Expression and Acoustic Characteristics.
Proceedings of the IEEE International Conference on Acoustics, 2019

Bootstrapping Non-Parallel Voice Conversion from Speaker-Adaptive Text-to-Speech.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

An RGB Gait Anonymization Model for Low-Quality Silhouettes.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Investigating very deep highway networks for parametric speech synthesis.
Speech Commun., 2018

Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.
Speech Commun., 2018

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis.
CoRR, 2018

Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra.
CoRR, 2018

Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder.
IEEE Access, 2018

Transforming acoustic characteristics to deceive playback spoofing countermeasures of speaker verification systems.
Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security, 2018

MesoNet: a Compact Facial Video Forgery Detection Network.
Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security, 2018

Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-Based Speech Synthesis Systems.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Identifying Computer-Translated Paragraphs using Coherence Features.
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation, 2018

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Integrated Presentation Attack Detection and Automatic Speaker Verification: Common Features and Gaussian Back-end Fusion.
Proceedings of the Interspeech 2018, 2018

Multimodal Speech Synthesis Architecture for Unsupervised Speaker Adaptation.
Proceedings of the Interspeech 2018, 2018

Investigating Accuracy of Pitch-accent Annotations in Neural Network-based Speech Synthesis and Denoising Effects.
Proceedings of the Interspeech 2018, 2018

Speaker-independent Raw Waveform Model for Glottal Excitation.
Proceedings of the Interspeech 2018, 2018

Expressive Speech Synthesis Using Sentiment Embeddings.
Proceedings of the Interspeech 2018, 2018

Transformation on Computer-Generated Facial Image to Avoid Detection by Spoofing Detector.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Unsupervised Speaker Adaptation for DNN-based Speech Synthesis using Input Codes.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Modular Convolutional Neural Network for Discriminating between Computer-Generated Images and Photographic Images.
Proceedings of the 13th International Conference on Availability, Reliability and Security, 2018

2017
Introduction to the Issue on Spoofing and Countermeasures for Automatic Speaker Verification.
IEEE J. Sel. Top. Signal Process., 2017

ASVspoof: The Automatic Speaker Verification Spoofing and Countermeasures Challenge.
IEEE J. Sel. Top. Signal Process., 2017

Influence of speaker familiarity on blind and visually impaired children's and young adults' perception of synthetic voices.
Comput. Speech Lang., 2017

An approach for gait anonymization using deep learning.
Proceedings of the 2017 IEEE Workshop on Information Forensics and Security, 2017

Distinguishing computer graphics from natural images using convolution neural networks.
Proceedings of the 2017 IEEE Workshop on Information Forensics and Security, 2017

An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis.
Proceedings of the Interspeech 2017, 2017

Speech Intelligibility in Cars: The Effect of Speaking Style, Noise and Listener Age.
Proceedings of the Interspeech 2017, 2017

Direct Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis.
Proceedings of the Interspeech 2017, 2017

Learning Word Vector Representations Based on Acoustic Counts.
Proceedings of the Interspeech 2017, 2017

Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra.
Proceedings of the Interspeech 2017, 2017

Misperceptions of the Emotional Content of Natural and Vocoded Speech in a Car.
Proceedings of the Interspeech 2017, 2017

The ASVspoof 2017 Challenge: Assessing the Limits of Replay Spoofing Attack Detection.
Proceedings of the Interspeech 2017, 2017

Generative Adversarial Network-Based Postfilter for STFT Spectrograms.
Proceedings of the Interspeech 2017, 2017

Reducing Mismatch in Training of DNN-Based Glottal Excitation Models in a Statistical Parametric Text-to-Speech System.
Proceedings of the Interspeech 2017, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.
Proceedings of the Interspeech 2017, 2017

An autoregressive recurrent mixture density network for parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Adapting and controlling DNN-based speech synthesis using input codes.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Non-parallel voice conversion using i-vector PLDA: towards unifying speaker verification and transformation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Identifying computer-generated text using statistical analysis.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

User Generated Dialogue Systems: uDialogue.
Proceedings of the Human-Harmonized Information Technology, Volume 2, 2017

2016
Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis.
IEICE Trans. Inf. Syst., 2016

ALISA: An automatic lightly supervised speech segmentation and alignment tool.
Comput. Speech Lang., 2016

Multidimensional scaling of systems in the Voice Conversion Challenge 2016.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Parallel and cascaded deep neural networks for text-to-speech synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Development of a statistical parametric synthesis system for operatic singing in German.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Voice Liveness Detection for Speaker Verification based on a Tandem Single/Double-channel Pop Noise Detector.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks.
Proceedings of the Interspeech 2016, 2016

Analysis of the Voice Conversion Challenge 2016 Evaluation Results.
Proceedings of the Interspeech 2016, 2016

Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.
Proceedings of the Interspeech 2016, 2016

Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016.
Proceedings of the Interspeech 2016, 2016

Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks.
Proceedings of the Interspeech 2016, 2016

The Voice Conversion Challenge 2016.
Proceedings of the Interspeech 2016, 2016

Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis.
Proceedings of the Interspeech 2016, 2016

Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Proceedings of the Interspeech 2016, 2016

Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering.
Proceedings of the Interspeech 2016, 2016

The SIWIS Database: A Multilingual Speech Database with Acted Emphasis.
Proceedings of the Interspeech 2016, 2016

A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Wavelet-based decomposition of F0 as a secondary task for DNN-based speech synthesis with multi-task learning.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Deep neural network-guided unit selection synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Initial investigation of speech synthesis based on complex-valued neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Privacy-preserving sound to degrade automatic speaker verification performance.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Testing the consistency assumption: Pronunciation variant forced alignment in read and spontaneous speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Continuous Expressive Speaking Styles Synthesis based on CVSM and MR-HMM.
Proceedings of the COLING 2016, 2016

2015
Anti-spoofing, Voice Databases.
Proceedings of the Encyclopedia of Biometrics, Second Edition, 2015

A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Spoofing and countermeasures for speaker verification: A survey.
Speech Commun., 2015

Intelligibility of time-compressed synthetic speech: Compression method and speaking style.
Speech Commun., 2015

Emotion transplantation through adaptation in HMM-based speech synthesis.
Comput. Speech Lang., 2015

Deep Denoising Auto-encoder for Statistical Speech Synthesis.
CoRR, 2015

A Comparison of Manual and Automatic Voice Repair for Individual with Vocal Disabilities.
Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Automatic speaker verification spoofing and countermeasures (ASVspoof 2015): open discussion and future plans.
Proceedings of the INTERSPEECH 2015, 2015

ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge.
Proceedings of the INTERSPEECH 2015, 2015

Human vs machine spoofing detection on wideband and narrowband data.
Proceedings of the INTERSPEECH 2015, 2015

Multiple feed-forward deep neural networks for statistical parametric speech synthesis.
Proceedings of the INTERSPEECH 2015, 2015

Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification.
Proceedings of the INTERSPEECH 2015, 2015

A perceptual investigation of wavelet-based decomposition of f0 for text-to-speech synthesis.
Proceedings of the INTERSPEECH 2015, 2015

Influence of speaker familiarity on blind and visually impaired children's perception of synthetic voices in audio games.
Proceedings of the INTERSPEECH 2015, 2015

Deep neural network context embeddings for model selection in rich-context HMM synthesis.
Proceedings of the INTERSPEECH 2015, 2015

Reconstructing voices within the multiple-average-voice-model framework.
Proceedings of the INTERSPEECH 2015, 2015

Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning.
Proceedings of the INTERSPEECH 2015, 2015

SAS: A speaker verification spoofing database containing diverse attacks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Methods for applying dynamic sinusoidal models to statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Speaker Recognition Anti-spoofing.
Proceedings of the Handbook of Biometric Anti-Spoofing, 2014

Statistical parametric speech synthesis for Ibibio.
Speech Commun., 2014

Combining Vocal Tract Length Normalization With Hierarchical Linear Transformations.
IEEE J. Sel. Top. Signal Process., 2014

Glottal Spectral Separation for Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Intelligibility enhancement of HMM-generated speech in additive noise by modifying Mel cepstral coefficients to increase the glimpse proportion.
Comput. Speech Lang., 2014

Intelligibility analysis of fast synthesized speech.
Proceedings of the INTERSPEECH 2014, 2014

Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation.
Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

Generating segmental foreign accent.
Proceedings of the INTERSPEECH 2014, 2014

An investigation of the application of dynamic sinusoidal models to statistical parametric speech synthesis.
Proceedings of the INTERSPEECH 2014, 2014

DNN-based stochastic postfilter for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2014, 2014

Neural net word representations for phrase-break prediction without a part of speech tagger.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multiple-average-voice-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

A fixed dimension and perceptually based dynamic sinusoidal model of speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

Towards Cross-Lingual Emotion Transplantation.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

2013
Articulatory Control of HMM-Based Parametric Speech Synthesis Using Feature-Space-Switched Multiple Regression.
IEEE Trans. Speech Audio Process., 2013

Speech Synthesis Based on Hidden Markov Models.
Proceedings of the IEEE, 2013

Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis.
Comput. Speech Lang., 2013

Building personalised synthetic voices for individuals with severe speech impairment.
Comput. Speech Lang., 2013

Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Real-time control of expressive speech synthesis using kinect body tracking.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Using neighbourhood density and selective SNR boosting to increase the intelligibility of synthetic speech in noise.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Using adaptation to improve speech transcription alignment in noisy and reverberant environments.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Towards speaking style transplantation in speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

An experimental comparison of multiple vocoder types.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Mage - HMM-based speech synthesis reactively controlled by the articulators.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Mage - reactive articulatory feature control of HMM-based parametric speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Towards Personalised Synthesised Voices for Individuals with Vocal Disabilities: Voice Banking and Reconstruction.
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, 2013

The voice bank corpus: Design, collection and data analysis of a large regional accent speech database.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise.
Proceedings of the INTERSPEECH 2013, 2013

TUNDRA: a multilingual corpus of found data for TTS research created with light supervision.
Proceedings of the INTERSPEECH 2013, 2013

Lightly supervised discriminative training of grapheme models for improved sentence-level alignment of speech and text data.
Proceedings of the INTERSPEECH 2013, 2013

On the evaluation of inversion mapping performance in the acoustic domain.
Proceedings of the INTERSPEECH 2013, 2013

Spoofing and countermeasures for automatic speaker verification.
Proceedings of the INTERSPEECH 2013, 2013

Reactive accent interpolation through an interactive map application.
Proceedings of the INTERSPEECH 2013, 2013

Improving intelligibility in noise of HMM-generated speech via noise-dependent and -independent methods.
Proceedings of the IEEE International Conference on Acoustics, 2013

Lightly supervised GMM VAD to use audiobook for speech synthesiser.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech.
IEEE Trans. Speech Audio Process., 2012

Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping.
Speech Commun., 2012

Impacts of machine translation and speech synthesis on speech-to-speech translation.
Speech Commun., 2012

Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis.
Speech Commun., 2012

Noise-robust whispered speech recognition using a non-audible-murmur microphone with VTS compensation.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders.
Proceedings of the INTERSPEECH 2012, 2012

Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise.
Proceedings of the INTERSPEECH 2012, 2012

Evaluating speech intelligibility enhancement for HMM-based synthetic speech in noise.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Towards Glottal Source Controllability in Expressive Speech Synthesis.
Proceedings of the INTERSPEECH 2012, 2012

Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis.
Proceedings of the INTERSPEECH 2012, 2012

Synthetic Speech Discrimination using Pitch Pattern Statistics Derived from Image Analysis.
Proceedings of the INTERSPEECH 2012, 2012

Analysis of speaker clustering strategies for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2012, 2012

Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Combining vocal tract length normalization with hierarchical linear transformations.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering.
IEEE Trans. Speech Audio Process., 2011

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate.
Speech Commun., 2011

Unsupervised Continuous-Valued Word Features for Phrase-Break Prediction without a Part-of-Speech Tagger.
Proceedings of the INTERSPEECH 2011, 2011

Can Objective Measures Predict the Intelligibility of Modified HMM-Based Synthetic Speech in Noise?
Proceedings of the INTERSPEECH 2011, 2011

Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis.
Proceedings of the INTERSPEECH 2011, 2011

Formant-Controlled HMM-Based Speech Synthesis.
Proceedings of the INTERSPEECH 2011, 2011

Evaluation of objective measures for intelligibility prediction of HMM-based synthetic speech in noise.
Proceedings of the IEEE International Conference on Acoustics, 2011

Detection of synthetic speech for the problem of imposture.
Proceedings of the IEEE International Conference on Acoustics, 2011

An analysis of machine translation and speech synthesis in speech-to-speech translation system.
Proceedings of the IEEE International Conference on Acoustics, 2011

HMM-based speech synthesiser using the LF-model of the glottal source.
Proceedings of the IEEE International Conference on Acoustics, 2011

Vocal attractiveness of statistical speech synthesisers.
Proceedings of the IEEE International Conference on Acoustics, 2011

Voice banking and voice reconstruction for MND patients.
Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility, 2011

2010
Thousands of Voices for HMM-Based Speech Synthesis - Analysis and Application of TTS Systems Built on Various ASR Corpora.
IEEE Trans. Speech Audio Process., 2010

Synthesis of Child Speech With HMM Adaptation and Voice Conversion.
IEEE Trans. Speech Audio Process., 2010

Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis.
Speech Commun., 2010

An Analysis of HMM-based prediction of articulatory movements.
Speech Commun., 2010

Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech.
Speech Commun., 2010

Measuring the Gap Between HMM-Based ASR and TTS.
IEEE J. Sel. Top. Signal Process., 2010

Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Letter-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

An unified and automatic approach of Mandarin HTS system.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

An HMM-based speech synthesiser using glottal post-filtering.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Utilising spontaneous conversational speech in HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Roles of the average voice in speaker-adaptive HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2010, 2010

The role of higher-level linguistic features in HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2010, 2010

Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners.
Proceedings of the INTERSPEECH 2010, 2010

HMM-based text-to-articulatory-movement prediction and analysis of critical articulators.
Proceedings of the INTERSPEECH 2010, 2010

Simple methods for improving speaker-similarity of HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

Revisiting the security of speaker verification systems against imposture using synthetic speech.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis.
IEEE Trans. Speech Audio Process., 2009

Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm.
IEEE Trans. Speech Audio Process., 2009

Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2009

Thousands of voices for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2009, 2009

HMM adaptation and voice conversion for the synthesis of child speech: a comparison.
Proceedings of the INTERSPEECH 2009, 2009

Identification of contrast and its emphatic realization in HMM based speech synthesis.
Proceedings of the INTERSPEECH 2009, 2009

Speech synthesis without a phone inventory.
Proceedings of the INTERSPEECH 2009, 2009

2008
Phone duration modeling using gradient tree boosting.
Speech Commun., 2008

HMM-based synthesis of child speech.
Proceedings of the 1st Workshop on Child, Computer and Interaction, 2008

Robustness of HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2008, 2008

Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge.
Proceedings of the INTERSPEECH 2008, 2008

Unsupervised adaptation for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2008, 2008

Speech-driven lip motion generation with a trajectory HMM.
Proceedings of the INTERSPEECH 2008, 2008

Glottal spectral separation for parametric speech synthesis.
Proceedings of the INTERSPEECH 2008, 2008

Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS 2007" for the Blizzard Challenge 2007.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training.
IEICE Trans. Inf. Syst., 2007

A Style Control Technique for HMM-Based Expressive Speech Synthesis.
IEICE Trans. Inf. Syst., 2007

The HMM-based speech synthesis system (HTS) version 2.0.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Towards an improved modeling of the glottal source in statistical parametric speech synthesis.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Speech driven head motion synthesis based on a trajectory model.
Proceedings of the 34th International Conference on Computer Graphics and Interactive Techniques, 2007

Performance evaluation of HMM-based style classification with a small amount of training data.
Proceedings of the INTERSPEECH 2007, 2007

Model Adaptation Approach to Speech Synthesis with Diverse Voices and Styles.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features.
IEICE Trans. Inf. Syst., 2006

A technique for controlling voice quality of synthetic speech using multiple regression HSMM.
Proceedings of the INTERSPEECH 2006, 2006

Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis.
Proceedings of the INTERSPEECH 2006, 2006

A style control technique for speech synthesis using multiple regression HSMM.
Proceedings of the INTERSPEECH 2006, 2006

Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis.
Proceedings of the INTERSPEECH 2006, 2006

HSMM-Based Model Adaptation Algorithms for Average-Voice-Based Speech Synthesis.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2005

Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing.
IEICE Trans. Inf. Syst., 2005

Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM.
IEICE Trans. Inf. Syst., 2005

Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis.
Proceedings of the INTERSPEECH 2005, 2005

Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2005, 2005

Adaptive Training for Hidden Semi-Markov Model.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Human Walking Motion Synthesis Based on Multiple Regression Hidden Semi-Markov Model.
Proceedings of the 4th International Conference on Cyberworlds (CW 2005), 2005

2004
MLLR adaptation for hidden semi-Markov model based speech synthesis.
Proceedings of the INTERSPEECH 2004, 2004

Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Modeling of various speaking styles and emotions for HMM-based speech synthesis.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A training method for average voice model based on shared decision tree context clustering and speaker adaptive training.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A context clustering technique for average voice model in HMM-based speech synthesis.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

