Martti Vainio

Orcid: 0000-0003-2570-0196

Affiliations:
  • University of Helsinki, Finland


According to our database, Martti Vainio authored at least 67 papers between 1996 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody.
CoRR, 2023

2020
Analyzing second language proficiency using wavelet-based prominence estimates.
J. Phonetics, 2020

Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis.
CoRR, 2020

2019
Towards transformational creation of novel songs.
Connect. Sci., 2019

The Sound of Grasp Affordances: Influence of Grasp-Related Size of Categorized Objects on Vocalization.
Cogn. Sci., 2019

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings.
Proceedings of the Interspeech 2019, 2019

Prosodic Representations of Prominence Classification Neural Networks and Autoencoders Using Bottleneck Features.
Proceedings of the Interspeech 2019, 2019

2017
Hierarchical representation and estimation of prosody using continuous wavelet transform.
Comput. Speech Lang., 2017

Comparing Languages Using Hierarchical Prosodic Analysis.
Proceedings of the Interspeech 2017, 2017

2016
Phase perception of the glottal excitation and its relevance in statistical parametric speech synthesis.
Speech Commun., 2016

Congruency Effect Between Articulation and Grasping in Native English Speakers.
Proceedings of the Interspeech 2016, 2016

Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination.
Proceedings of the Interspeech 2016, 2016

2015
Rapid and automatic speech-specific learning mechanism in human neocortex.
NeuroImage, 2015

Hierarchical Representation of Prosody for Statistical Speech Synthesis.
CoRR, 2015

Action planning and congruency effect between articulation and grasping.
Proceedings of the INTERSPEECH 2015, 2015

Phase perception of the glottal excitation of vocoded speech.
Proceedings of the INTERSPEECH 2015, 2015

Different parts of the same elephant: A roadmap to disentangle and connect different perspectives on prosodic prominence.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Pitch, perceived duration and auditory biases: Comparison among languages.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Prosodic and syntactic segmentation of spontaneous speech: A preliminary study.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

2014
Emergent consonantal quantity contrast and context-dependence of gestural phasing.
J. Phonetics, 2014

Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise.
Comput. Speech Lang., 2014

An adaptive post-filtering method producing an artificial Lombard-like effect for intelligibility enhancement of narrowband telephone speech.
Comput. Speech Lang., 2014

Phonetics and Machine Learning: Hierarchical Modelling of Prosody in Statistical Speech Synthesis.
Proceedings of the Statistical Language and Speech Processing, 2014

Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort.
Proceedings of the INTERSPEECH 2014, 2014

Voice source modelling using deep neural networks for statistical parametric speech synthesis.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Wavelets for intonation modeling in HMM speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Acoustic and visual phonetic features in the McGurk effect - an audiovisual speech illusion.
Proceedings of the INTERSPEECH 2013, 2013

Lombard modified text-to-speech synthesis for improved intelligibility: submission for the Hurricane Challenge 2013.
Proceedings of the INTERSPEECH 2013, 2013

Analysis and synthesis of shouted speech.
Proceedings of the INTERSPEECH 2013, 2013

Language background affects the strength of the pitch bias in a duration discrimination task.
Proceedings of the INTERSPEECH 2013, 2013

Comparing glottal-flow-excited statistical parametric speech synthesis methods.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
How far are vowel formants from computed vocal tract resonances?
CoRR, 2012

Effect of noise type and level on focus related fundamental frequency changes.
Proceedings of the INTERSPEECH 2012, 2012

Wideband Parametric Speech Synthesis Using Warped Linear Prediction.
Proceedings of the INTERSPEECH 2012, 2012

Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech.
Proceedings of the INTERSPEECH 2012, 2012

Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction.
Proceedings of the INTERSPEECH 2012, 2012

Intonational speaker verification: A study on parameters and performance under noisy conditions.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

On measuring the intelligibility of synthetic speech in noise - Do we need a realistic noise environment?
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Comparison of post-filtering methods for intelligibility enhancement of telephone speech.
Proceedings of the 20th European Signal Processing Conference, 2012

2011
HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering.
IEEE Trans. Speech Audio Process., 2011

Analysis of HMM-Based Lombard Speech Synthesis.
Proceedings of the INTERSPEECH 2011, 2011

Relative Timing of Bilabial Gesture in Finnish.
Proceedings of the 17th International Congress of Phonetic Sciences, 2011

Estimates for the Measurement and Articulatory Error in MRI Data from Sustained Vowel Production.
Proceedings of the 17th International Congress of Phonetic Sciences, 2011

Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

Recording Speech Sound and Articulation in MRI.
Proceedings of the BIODEVICES 2011, 2011

2010
Comparison of formant enhancement methods for HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Laryngeal voice quality in the expression of focus.
Proceedings of the INTERSPEECH 2010, 2010

2009
New method for delexicalization and its application to prosodic tagging for text-to-speech synthesis.
Proceedings of the INTERSPEECH 2009, 2009

Resources for speech research: present and future infrastructure needs.
Proceedings of the INTERSPEECH 2009, 2009

2008
Evaluation of an Artificial Speech Bandwidth Extension Method in Three Languages.
IEEE Trans. Speech Audio Process., 2008

Deep Syntactic Analysis and Rule Based Accentuation in Text-to-Speech Synthesis.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

HMM-based Finnish text-to-speech system utilizing glottal inverse filtering.
Proceedings of the INTERSPEECH 2008, 2008

2007
Laryngeal voice quality changes in expression of prominence in continuous speech.
Proceedings of the Fifth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2007

2006
Tonal features, intensity, and word order in the perception of prominence.
J. Phonetics, 2006

Word order and tonal shape in the production of focus in short Finnish utterances.
Proceedings of the INTERSPEECH 2006, 2006

2001
Three-dimensional modelling of speech corpora: added value through visualisation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Object-oriented Access to the Estonian Phonetic Database.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Measuring the importance of morphological information for Finnish speech synthesis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Reduced impedance mismatch in speech database access.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
Towards a high quality Finnish talking head.
Proceedings of the Third IEEE Workshop on Multimedia Signal Processing, 1999

Relational vs. object-oriented models for representing speech: a comparison using ANDOSL data.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Modeling the microprosody of pitch and loudness for speech synthesis with neural networks.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Forming generic models of speech for uniform database access.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Speech synthesis using warped linear prediction and neural networks.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1996
Pitch, loudness, and segmental duration correlates: towards a model for the phonetic aspects of Finnish prosody.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

A multilingual phonetic representation and analysis system for different speech databases.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

