Bajibabu Bollepalli

Sudarsana Reddy Kadiri

IEEE Access, 2021

Multi-Scale Spectrogram Modelling for Neural Text-to-Speech.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

2020

Improving the quality of text-to-speech (TTS) using deep learning - Emphasis on vocoders and speaking style adaptation.

PhD thesis, 2020

Multiscale System for Alzheimer's Dementia Recognition Through Spontaneous Speech.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

GlotNet - A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Normal-to-Lombard adaptation of speech synthesis using long short-term memory recurrent neural networks.

Cassia Valentini-Botinhao

Manu Airaksinen

Speech Commun., 2019

GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Lombard Speech Synthesis Using Transfer Learning in a Tacotron Text-to-Speech System.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Waveform Generation for Text-to-speech Synthesis Using Pitch-synchronous Multi-scale Generative Adversarial Networks.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention.

CoRR, 2018

Speaker-independent Raw Waveform Model for Glottal Excitation.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Glottal Vocoding With Frequency-Warped Time-Weighted Linear Prediction.

IEEE Signal Process. Lett., 2017

Reducing Mismatch in Training of DNN-Based Glottal Excitation Models in a Statistical Parametric Text-to-Speech System.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Lombard speech synthesis using long short-term memory recurrent neural networks.

Manu Airaksinen

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Frequency-warped time-weighted linear prediction for glottal vocoding.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

DNN-based Speech Synthesis for Indian Languages from ASCII text.

Srikanth Ronanki

Siva Reddy Gangireddy

Simon King

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

High-pitched excitation generation for glottal vocoding in statistical parametric speech synthesis using a deep neural network.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2014

The Tutorbot Corpus - A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue.

Maria Koutsombogera

Samer Al Moubayed

Ahmed Hussen Abdelaziz

Martin Johansson

José David Águas Lopes

Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A comparative evaluation of vocoding techniques for HMM-based laughter synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2014

Human-robot collaborative tutoring using multiparty multimodal spoken dialogue.

Ahmed Hussen Abdelaziz

Martin Johansson

Maria Koutsombogera

José David Águas Lopes

Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, 2014

Effect of MPEG audio compression on vocoders used in statistical parametric speech synthesis.

Tuomo Raitio

Proceedings of the 22nd European Signal Processing Conference, 2014

2013

Non-linear Pitch Modification in Voice Conversion Using Artificial Neural Networks.

Jonas Beskow

Joakim Gustafson

Proceedings of the Advances in Nonlinear Speech Processing - 6th International Conference, 2013

Effect of MPEG audio compression on HMM-based speech synthesis.

Tuomo Raitio

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Tutoring Robots - Multiparty Multimodal Social Dialogue with an Embodied Tutor.

Samer Al Moubayed

Jonas Beskow

Ahmed Hussen Abdelaziz

Martin Johansson

Maria Koutsombogera

José David Águas Lopes

Proceedings of the Innovative and Creative Developments in Multimodal Interaction Systems, 2013

2012

Modelling a Noisy-channel for Voice Conversion Using Articulatory Features.

Alan W. Black

Kishore Prahallad

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

SWS task: Articulatory phonetic units and sliding DTW.

Gautam Varma Mantena