Tamás Gábor Csapó

CoRR, 2024

2023

Investigations on speaker adaptation using a continuous vocoder within recurrent neural network based text-to-speech synthesis.

Multim. Tools Appl., April, 2023

Future Speech Interfaces with Sensors and Machine Intelligence.

Bruce Denby

Michael Wand

Sensors, February, 2023

Data Augmentation Methods on Ultrasound Tongue Images for Articulation-to-Speech Synthesis.

Ibrahim Ibrahimov

Gábor Gosztolya

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Modeling Irregular Voice in End-to-End Speech Synthesis via Speaker Adaptation.

Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2023

Enhancing End-to-End Speech Synthesis by Modeling Interrogative Sentences with Speaker Adaptation.

Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2023

Advancing Limited Data Text-to-Speech Synthesis: Non-Autoregressive Transformer for High-Quality Parallel Synthesis.

Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2023

Nonparallel Expressive TTS for Unseen Target Speaker using Style-Controlled Adaptive Layer and Optimized Pitch Embedding.

Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2023

Comparison of acoustic-to-articulatory and brain-to-articulatory mapping during speech production using ultrasound tongue imaging and EEG.

Péter Nagy

Ádám Boncz

Proceedings of the 2023 Workshop on Speech, Music and Mind, 2023

Towards Ultrasound Tongue Image prediction from EEG during speech production.

Péter Nagy

Ádám Boncz

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adaptation of Tongue Ultrasound-Based Silent Speech Interfaces Using Spatial Transformer Networks.

László Tóth

Gábor Gosztolya

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Raw Ultrasound-Based Phonetic Segments Classification Via Mask Modeling.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Concept and Pictogram-Based User-Interface Design of a Helper Tool for People with Aphasia.

Peter Mayer

Katharina Werner

Ilídio Castro Oliveira

Samuel S. Silva

Melinda Szeker

António J. S. Teixeira

Paul Panek

Proceedings of the dHealth 2023, 2023

2022

Optimizing the Ultrasound Tongue Image Representation for Residual Network-Based Articulatory-to-Acoustic Mapping.

Gábor Gosztolya

László Tóth

Alexandra Markó

Sensors, 2022

Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0.

Proceedings of the 30th European Signal Processing Conference, 2022

2021

Noise and acoustic modeling with waveform generator in text-to-speech and neutral speech conversion.

Multim. Tools Appl., 2021

Advances in Speech Vocoding for Text-to-Speech with Continuous Parameters.

CoRR, 2021

Convolutional Neural Network-Based Age Estimation Using B-Mode Ultrasound Tongue Image.

Kele Xu

Ming Feng

CoRR, 2021

Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging.

László Tóth

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Effects of F0 Estimation Algorithms on Ultrasound-Based Silent Speech Interfaces.

Pengyu Dai

Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2021

Speaker Adaptation with Continuous Vocoder-Based DNN-TTS.

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Neural Speaker Embeddings for Ultrasound-Based Silent Speech Interfaces.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Continuous Wavelet Vocoder-Based Decomposition of Parametric Speech Waveform Synthesis.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Neural Silent Speech Interface Models by Adversarial Training.

Proceedings of the International Conference on Artificial Intelligence and Computer Vision, 2021

Towards a Practical Lip-to-Speech Conversion System Using Deep Neural Networks and Mobile Application Frontend.

Proceedings of the International Conference on Artificial Intelligence and Computer Vision, 2021

2020

Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis.

IEICE Trans. Inf. Syst., 2020

A continuous vocoder for statistical parametric speech synthesis and its evaluation using an audio-visual phonetically annotated Arabic corpus.

Comput. Speech Lang., 2020

Ultrasound-Based Articulatory-to-Acoustic Mapping with WaveGlow Speech Synthesis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Quantification of Transducer Misalignment in Ultrasound Tongue Imaging.

Kele Xu

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker Dependent Acoustic-to-Articulatory Inversion Using Real-Time MRI of the Vocal Tract.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker Dependent Articulatory-to-Acoustic Mapping Using Real-Time MRI of the Vocal Tract.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Continuous vocoder applied in deep neural network based voice conversion.

Multim. Tools Appl., 2019

Parallel Voice Conversion Based on a Continuous Sinusoidal Model.

Proceedings of the 2019 International Conference on Speech Technology and Human-Computer Dialogue, 2019

Articulatory Analysis of Transparent Vowel /iː/ in Harmonic and Antiharmonic Hungarian Stems: Is There a Difference?

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

V-to-V Coarticulation Induced Acoustic and Articulatory Variability of Vowels: The Effect of Pitch-Accent.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder.

Alexander Sepúlveda-Sepúlveda

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

DNN-based Acoustic-to-Articulatory Inversion using Ultrasound Tongue Imaging.

Dagoberto Porras

Proceedings of the International Joint Conference on Neural Networks, 2019

Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces.

Proceedings of the International Joint Conference on Neural Networks, 2019

RNN-based speech synthesis using a continuous sinusoidal model.

Proceedings of the International Joint Conference on Neural Networks, 2019

2018

A Continuous Vocoder Using Sinusoidal Model for Statistical Parametric Speech Synthesis.

Proceedings of the Speech and Computer - 20th International Conference, 2018

Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Deep Recurrent Neural Networks in Speech Synthesis Using a Continuous Vocoder.

Proceedings of the Speech and Computer - 19th International Conference, 2017

Word-Initial Irregular Phonation as a Function of Speech Rate and Vowel Quality in Hungarian.

Proceedings of the Studies on Speech Production - 11th International Seminar, 2017

DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Time-Domain Envelope Modulating the Noise Component of Excitation in a Continuous Residual-Based Vocoder for Statistical Parametric Speech Synthesis.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer.

Proceedings of the Speech and Computer - 18th International Conference, 2016

Continuous fundamental frequency prediction with deep neural networks.

Bálint Pál Tóth

Proceedings of the 24th European Signal Processing Conference, 2016

Modeling unvoiced sounds in statistical parametric speech synthesis with a continuous vocoder.

Proceedings of the 24th European Signal Processing Conference, 2016

2015

Residual-Based Excitation with Continuous F0 Modeling in HMM-Based Speech Synthesis.

Milos Cernak

Proceedings of the Statistical Language and Speech Processing, 2015

Automatic transformation of irregular to regular voice by residual analysis and synthesis.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Error analysis of extracted tongue contours from 2d ultrasound images.

Steven M. Lulich

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

From text to formants - indirect model for trajectory prediction based on a multi-speaker parallel speech database.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

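A gépi beszéd-előállítás természetességének növelése rejtett Markov-modell alapú szövegfelolvasó rendszerben [Increasing the naturalness of machine speech generation in a hidden Markov model based text-to-speech system].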

PhD thesis, 2014

Modeling Irregular Voice in Statistical Parametric Speech Synthesis With Residual Codebook Based Excitation.

IEEE J. Sel. Top. Signal Process., 2014

Statistical parametric speech synthesis with a novel codebook-based excitation model.

Intell. Decis. Technol., 2014

2013

A novel irregular voice model for HMM-based speech synthesis.

Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Speech-centric Multimodal Interaction for Easy-to-access Online Services - A Personal Life Assistant for the Elderly.

António J. S. Teixeira

Proceedings of the 5th International Conference on Software Development for Enhancing Accessibility and Fighting Info-exclusion, 2013

2012

Synthesizing expressive speech from amateur audiobook recordings.

Julie Carson-Berndsen

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

A novel codebook-based excitation model for use in speech synthesis.

Proceedings of the IEEE 3rd International Conference on Cognitive Infocommunications, 2012

2011

Context and Speaker Dependency in the Relation of Vowel Formants and Subglottal Resonances - Evidence from Hungarian.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Special Speech Synthesis for Social Network Websites.

Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

2009

Relation of formants and subglottal resonances in Hungarian vowels.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2007

Increasing prosodic variability of text-to-speech synthesizers.

Márk Fék