Tomoki Koriyama

Orcid: 0000-0002-8347-5604

According to our database, Tomoki Koriyama authored at least 47 papers between 2010 and 2024.

Bibliography

2024
Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech.
CoRR, 2024

2023
Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Structured State Space Decoder for Speech Recognition and Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022.
Proceedings of the Interspeech 2022, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

2021
Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation.
Speech Commun., 2021

Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Harmonic WaveGAN: GAN-Based Speech Waveform Generation Model with Harmonic Structure Discriminator.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Emotion-Controllable Speech Synthesis Using Emotion Soft Labels and Fine-Grained Prosody Factors.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Generative Moment Matching Network-Based Neural Double-Tracking for Synthesized and Natural Singing Voices.
IEICE Trans. Inf. Syst., 2020

DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
Proceedings of the Interspeech 2020, 2020

Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space.
Proceedings of the Interspeech 2020, 2020

Multi-Speaker Text-to-Speech Synthesis Using Deep Gaussian Processes.
Proceedings of the Interspeech 2020, 2020

Utterance-Level Sequential Modeling for Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Statistical Parametric Speech Synthesis Using Deep Gaussian Processes.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

JVS corpus: free Japanese multi-speaker voice corpus.
CoRR, 2019

Semi-Supervised Prosody Modeling Using Deep Gaussian Process Latent Variable Model.
Proceedings of the Interspeech 2019, 2019

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Training Method Using DNN-guided Layerwise Pretraining for Deep Gaussian Processes.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
GPR-based Thai speech synthesis using multi-level duration prediction.
Speech Commun., 2018

2017
Sampling-Based Speech Parameter Generation Using Moment-Matching Networks.
Proceedings of the Interspeech 2017, 2017

Duration prediction using multiple Gaussian process experts for GPR-based speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Enhanced F0 generation for GPR-based speech synthesis considering syllable-based prosodic features.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Speech emotion recognition using convolutional long short-term memory neural network and support vector machines.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Unsupervised Stress Information Labeling Using Gaussian Process Latent Variable Model for Statistical Speech Synthesis.
Proceedings of the Interspeech 2016, 2016

A speaker adaptation technique for Gaussian process regression based speech synthesis using feature space transform.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling.
Comput. Speech Lang., 2015

Duration prediction using multi-level model for GPR-based speech synthesis.
Proceedings of the INTERSPEECH 2015, 2015

A comparison of speech synthesis systems based on GPR, HMM, and DNN with a small amount of training data.
Proceedings of the INTERSPEECH 2015, 2015

Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis.
Speech Commun., 2014

Statistical Parametric Speech Synthesis Based on Gaussian Process Regression.
IEEE J. Sel. Top. Signal Process., 2014

Parametric speech synthesis using local and global sparse Gaussian processes.
Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2014

Transform mapping using shared decision tree context clustering for HMM-based cross-lingual speech synthesis.
Proceedings of the INTERSPEECH 2014, 2014

Accent type and phrase boundary estimation using acoustic and language models for automatic prosodic labeling.
Proceedings of the INTERSPEECH 2014, 2014

Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization.
Proceedings of the IEEE International Conference on Acoustics, 2014

HMM-based Thai speech synthesis using unsupervised stress context labeling.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A style control technique for singing voice synthesis based on multiple-regression HSMM.
Proceedings of the INTERSPEECH 2013, 2013

Statistical nonparametric speech synthesis using sparse Gaussian processes.
Proceedings of the INTERSPEECH 2013, 2013

HMM-based expressive speech synthesis based on phrase-level F0 context labeling.
Proceedings of the IEEE International Conference on Acoustics, 2013

Frame-level acoustic modeling based on Gaussian process regression for statistical nonparametric speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation.
Proceedings of the INTERSPEECH 2012, 2012

An F0 modeling technique based on prosodic events for spontaneous speech synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
On the Use of Extended Context for HMM-Based Spontaneous Conversational Speech Synthesis.
Proceedings of the INTERSPEECH 2011, 2011

2010
Conversational spontaneous speech synthesis using average voice model.
Proceedings of the INTERSPEECH 2010, 2010

