Shinji Takaki

Orcid: 0000-0001-7294-7699

According to our database1, Shinji Takaki authored at least 46 papers between 2010 and 2023.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2023
Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System.
Proceedings of the IEEE International Conference on Acoustics, 2023

2021
PeriodNet: A Non-Autoregressive Raw Waveform Generative Model With a Structure Separating Periodic and Aperiodic Components.
IEEE Access, 2021

Periodnet: A Non-Autoregressive Waveform Generation Model with a Structure Separating Periodic and Aperiodic Components.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F<sub>0</sub> Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences.
IEEE Access, 2020

Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019

Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform.
CoRR, 2019

Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise.
Proceedings of the Interspeech 2019, 2019

Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language.
Proceedings of the IEEE International Conference on Acoustics, 2019

Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

STFT Spectral Loss for Training a Neural Speech Waveform Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Investigating very deep highway networks for parametric speech synthesis.
Speech Commun., 2018

Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.
Speech Commun., 2018

Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra.
CoRR, 2018

Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder.
IEEE Access, 2018

Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Unsupervised Speaker Adaptation for DNN-based Speech Synthesis using Input Codes.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis.
Proceedings of the Interspeech 2017, 2017

Direct Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis.
Proceedings of the Interspeech 2017, 2017

Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra.
Proceedings of the Interspeech 2017, 2017

Generative Adversarial Network-Based Postfilter for STFT Spectrograms.
Proceedings of the Interspeech 2017, 2017

An autoregressive recurrent mixture density network for parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Adapting and controlling DNN-based speech synthesis using input codes.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Constructing a Deep Neural Network Based Spectral Model for Statistical Speech Synthesis.
Proceedings of the Recent Advances in Nonlinear Speech Processing, 2016

Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis.
IEICE Trans. Inf. Syst., 2016

A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.
Proceedings of the Interspeech 2016, 2016

Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks.
Proceedings of the Interspeech 2016, 2016

Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Proceedings of the Interspeech 2016, 2016

A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Deep Denoising Auto-encoder for Statistical Speech Synthesis.
CoRR, 2015

Characteristics of Polyphonic Music Style and Markov Model of Pitch-Class Intervals.
Proceedings of the Mathematics and Computation in Music - 5th International Conference, 2015

Multiple feed-forward deep neural networks for statistical parametric speech synthesis.
Proceedings of the INTERSPEECH 2015, 2015

2014
Contextual Additive Structure for HMM-Based Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

2013
Contextual partial additive structure for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Separable lattice 2-D HMMS introducing state duration control for recognition of images with various variations.
Proceedings of the IEEE International Conference on Acoustics, 2013

2011
An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Spectral modeling with contextual additive structure for HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010


  Loading...