HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis.

Sang-Hoon Lee

Seung-Bin Kim

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

TTS-by-TTS 2: Data-Selective Augmentation for Neural Speech Synthesis Using Ranking Support Vector Machine with Variational Autoencoder.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Effective Data Augmentation Methods for Neural Text-to-Speech Systems.

Proceedings of the International Conference on Electronics, Information, and Communication, 2022

Linear Prediction-based Parallel WaveGAN Speech Synthesis.

Proceedings of the International Conference on Electronics, Information, and Communication, 2022

2021

Improved Parallel WaveGAN Vocoder with Perceptually Weighted Spectrogram Loss.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

LiteTTS: A Lightweight Mel-Spectrogram-Free Text-to-Wave Synthesizer Based on Generative Adversarial Networks.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Parallel Waveform Synthesis Based on Generative Adversarial Networks with Voicing-Aware Conditional Discriminators.

Proceedings of the IEEE International Conference on Acoustics, 2021

TTS-by-TTS: TTS-Driven Data Augmentation for Fast and High-Quality Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Speaker-Adaptive Neural Vocoders for Parametric Speech Synthesis Systems.

Proceedings of the 22nd IEEE International Workshop on Multimedia Signal Processing, 2020

Neural Text-to-Speech with a Modeling-by-Generation Excitation Vocoder.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram.

Ryuichi Yamamoto

Eunwoo Song

Jae-Min Kim

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving LPCNet-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems.

CoRR, 2019

Probability Density Distillation with Generative Adversarial Networks for High-Quality Parallel Waveform Generation.

Ryuichi Yamamoto

Eunwoo Song

Jae-Min Kim

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

ExcitNet Vocoder: A Neural Excitation Model for Parametric Speech Synthesis Systems.

Eunwoo Song

Kyungguen Byun

Hong-Goo Kang

Proceedings of the 27th European Signal Processing Conference, 2019

2018

Speaker-adaptive neural vocoders for statistical parametric speech synthesis systems.

CoRR, 2018

Acoustic Modeling Using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Modeling-By-Generation-Structured Noise Compensation Algorithm for Glottal Vocoding Speech Synthesis System.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems.

Eunwoo Song

Frank K. Soong

Hong-Goo Kang

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Perceptual quality and modeling accuracy of excitation parameters in DLSTM-based speech synthesis systems.

Eunwoo Song

Frank K. Soong

Hong-Goo Kang

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis.

Eunwoo Song

Frank K. Soong

Hong-Goo Kang

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Multi-class learning algorithm for deep neural network-based statistical parametric speech synthesis.

Eunwoo Song

Hong-Goo Kang

Proceedings of the 24th European Signal Processing Conference, 2016

Area-efficient one-cycle correction scheme for timing errors in flip-flop based pipelines.

Proceedings of the IEEE Asian Solid-State Circuits Conference, 2016

2015

Deep neural network-based statistical parametric speech synthesis system using improved time-frequency trajectory excitation model.

Eunwoo Song

Hong-Goo Kang

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Improved time-frequency trajectory excitation modeling for a statistical parametric speech synthesis system.

Eunwoo Song

Young-Sun Joo

Hong-Goo Kang

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

A constrained two-layer compression technique for ECG waves.

Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

2014

Fixed-point implementation of MPEG-D unified speech and audio coding decoder.

Eunwoo Song

Hong-Goo Kang

Joonil Lee

Proceedings of the 19th International Conference on Digital Signal Processing, 2014

2013

Speech enhancement for pathological voice using time-frequency trajectory excitation modeling.

Eunwoo Song

Jongyoub Ryu

Hong-Goo Kang

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Eunwoo Song
