Okko Johannes Räsänen

Orcid: 0000-0002-0537-0946

According to our database, Okko Johannes Räsänen authored at least 89 papers between 2008 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Introducing Meta-analysis in the Evaluation of Computational Models of Infant Language Development.
Cogn. Sci., July, 2023

Comparison of End-to-End Neural Network Architectures and Data Augmentation Methods for Automatic Infant Motility Assessment Using Wearable Sensors.
Sensors, April, 2023

Development of a speech emotion recognizer for large-scale child-centered audio recordings from a hospital environment.
Speech Commun., March, 2023

Automatic Assessment of Parkinson's Disease Using Speech Representations of Phonation and Articulation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances.
CoRR, 2023

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models.
CoRR, 2023

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model.
CoRR, 2023

Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research.
CoRR, 2023

On Negative Sampling for Contrastive Audio-Text Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2023

Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System.
Proceedings of the 31st European Signal Processing Conference, 2023

Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

2022
Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition.
Proceedings of the Interspeech 2022, 2022

Unsupervised Audio-Caption Aligning Learns Correspondences Between Individual Sound Events and Textual Phrases.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Language-Independent Approach for Automatic Computation of Vowel Articulation Features in Dysarthric Speech Assessment.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel.
CoRR, 2021

Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? - A computational investigation.
CoRR, 2021

ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition.
CoRR, 2021

Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Zero-Shot Audio Classification with Factored Linear and Nonlinear Acoustic-Semantic Projections.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets.
CoRR, 2020

Unsupervised Discovery of Recurring Speech Patterns Using Probabilistic Adaptive Metrics.
Proceedings of the Interspeech 2020, 2020

Measuring prosodic predictability in children's home language environments.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

2019
SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech.
IEEE Signal Process. Lett., 2019

Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech.
Speech Commun., 2019

Automatic Posture and Movement Tracking of Infants with Wearable Movement Sensors.
CoRR, 2019

Vocal Effort Based Speaking Style Conversion Using Vocoder Features and Parallel Learning.
IEEE Access, 2019

Augmented CycleGANs for Continuous Scale Normal-to-Lombard Speaking Style Conversion.
Proceedings of the Interspeech 2019, 2019

A Computational Model of Early Language Acquisition from Audiovisual Experiences of Young Infants.
Proceedings of the Interspeech 2019, 2019

Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Data Augmentation Strategies for Neural Network F0 Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Comparison of spectral tilt measures for sentence prominence in speech - Effects of dimensionality and adverse noise conditions.
Speech Commun., 2018

Comparison of Syllabification Algorithms and Training Strategies for Robust Word Count Estimation across Different Languages and Recording Conditions.
Proceedings of the Interspeech 2018, 2018

Time-regularized Linear Prediction for Noise-robust Extraction of the Spectral Envelope of Speech.
Proceedings of the Interspeech 2018, 2018

2017
An online model for vowel imitation learning.
Speech Commun., 2017

Comparison of Non-Parametric Bayesian Mixture Models for Syllable Clustering and Zero-Resource Speech Processing.
Proceedings of the Interspeech 2017, 2017

Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs.
Proceedings of the Interspeech 2017, 2017

Evaluation of Spectral Tilt Measures for Sentence Prominence Under Different Noise Conditions.
Proceedings of the Interspeech 2017, 2017

Dirichlet process mixture models for clustering i-vector data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Connecting stimulus-driven attention to the properties of infant-directed speech - Is exaggerated intonation also more surprising?
Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

Blind Phoneme Segmentation With Temporal Prediction Errors.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Sequence Prediction With Sparse Distributed Hyperdimensional Coding Applied to the Analysis of Mobile Phone Use Patterns.
IEEE Trans. Neural Networks Learn. Syst., 2016

3PRO - An unsupervised method for the automatic detection of sentence prominence in speech.
Speech Commun., 2016

Improving Phoneme segmentation with Recurrent Neural Networks.
CoRR, 2016

Perception of Sentence Stress in Speech Correlates With the Temporal Unpredictability of Prosodic Features.
Cogn. Sci., 2016

Analyzing the Contribution of Top-Down Lexical and Bottom-Up Acoustic Cues in the Detection of Sentence Prominence.
Proceedings of the Interspeech 2016, 2016

Analyzing distributional learning of phonemic categories in unsupervised deep neural networks.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

A Cognitive Approach to Modeling Sentence Level Prominence Based on Stimulus Unpredictability.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Statistical Learning of Prosodic Patterns and Reversal of Perceptual Cues for Sentence Prominence.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

2015
Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits.
Comput. Speech Lang., 2015

Weakly-supervised word learning is improved by an active online algorithm.
Proceedings of the INTERSPEECH 2015, 2015

Unsupervised word discovery from speech using automatic segmentation into syllable-like units.
Proceedings of the INTERSPEECH 2015, 2015

Automatic detection of sentence prominence in speech using predictability of word-level acoustic features.
Proceedings of the INTERSPEECH 2015, 2015

Data-driven metric representing the maturation of preterm EEG.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Computational evidence for effects of memory decay, familiarity preference and mutual exclusivity in cross-situational learning.
Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

Cross-situational cues are relevant for early word segmentation.
Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

Generating Hyperdimensional Distributed Representations from Continuous-Valued Multivariate Sensory Input.
Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

Analyzing the Predictability of Lexeme-specific Prosodic Features as a Cue to Sentence Prominence.
Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

2014
Modeling Dependencies in Multiple Parallel Data Streams with Hyperdimensional Computing.
IEEE Signal Process. Lett., 2014

Perception of sentence stress in English infant directed speech.
Proceedings of the INTERSPEECH 2014, 2014

Basic cuts revisited: Temporal segmentation of speech into phone-like units with statistical learning at a pre-linguistic level.
Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Statistical Unpredictability of F0 Trajectories as a Cue to Sentence Stress.
Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

2013
Feedback and imitation by a caregiver guides a virtual infant to learn native phonemes and the skill of speech inversion.
Speech Commun., 2013

Development of a novel robust measure for interhemispheric synchrony in the neonatal EEG: Activation Synchrony Index (ASI).
NeuroImage, 2013

Random subset feature selection in automatic recognition of developmental disorders, affective states, and level of conflict from speech.
Proceedings of the INTERSPEECH 2013, 2013

Automatic self-supervised learning of associations between speech and text.
Proceedings of the INTERSPEECH 2013, 2013

Attention based temporal filtering of sensory signals for data redundancy reduction.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions.
Speech Commun., 2012

A method for noise-robust context-aware pattern discovery and recognition from categorical sequences.
Pattern Recognit., 2012

Modeling spoken language acquisition with a generic cognitive architecture for associative learning.
Proceedings of the INTERSPEECH 2012, 2012

Average Spectrotemporal Structure of Continuous Speech Matches with the Frequency Resolution of Human Hearing.
Proceedings of the INTERSPEECH 2012, 2012

Non-auditory cognitive capabilities in computational modeling of early language acquisition.
Proceedings of the INTERSPEECH 2012, 2012

Feature Selection for Speaker Traits.
Proceedings of the INTERSPEECH 2012, 2012

Context induced merging of synonymous word models in computational modeling of early language acquisition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Hierarchical unsupervised discovery of user context from multivariate sensory data.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Acoustic analysis supports the existence of a single distributional learning mechanism in structural rule learning from an artificial language.
Proceedings of the 34th Annual Meeting of the Cognitive Science Society, 2012

2011
Method for Speech Inversion with Large Scale Statistical Evaluation.
Proceedings of the INTERSPEECH 2011, 2011

Comparison of classifiers in audio and acceleration based context classification in mobile phones.
Proceedings of the 19th European Signal Processing Conference, 2011

2010
Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model.
Proceedings of the INTERSPEECH 2010, 2010

Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events.
Proceedings of the INTERSPEECH 2010, 2010

2009
Learning meaningful units from multimodal input - the effect of interaction strategies.
Proceedings of the Second Workshop on Child, Computer and Interaction, 2009

A comparison and combination of segmental and fixed-frame signal representations in NMF-based word recognition.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Indirect estimation of formant frequencies through mean spectral variance with application to automatic gender recognition.
Proceedings of the Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2009

A noise robust method for pattern discovery in quantized time series: the concept matrix approach.
Proceedings of the INTERSPEECH 2009, 2009

An improved speech segmentation quality measure: the r-value.
Proceedings of the INTERSPEECH 2009, 2009

Self-learning vector quantization for pattern discovery from speech.
Proceedings of the INTERSPEECH 2009, 2009

Do multiple caregivers speed up language acquisition?
Proceedings of the INTERSPEECH 2009, 2009

Discovering keywords from cross-modal input: ecological vs. engineering methods for enhancing acoustic repetitions.
Proceedings of the INTERSPEECH 2009, 2009

2008
Computational language acquisition by statistical bottom-up processing.
Proceedings of the INTERSPEECH 2008, 2008

