Milos Cernak

Orcid: 0000-0002-5569-9491

According to our database, Milos Cernak authored at least 77 papers between 2004 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and a Teacher Model.
CoRR, 2023

Cluster-based pruning techniques for audio data.
CoRR, 2023

Speaker Embeddings as Individuality Proxy for Voice Stress Detection.
CoRR, 2023

ALO-VC: Any-to-any Low-latency One-shot Voice Conversion.
CoRR, 2023

Demo Abstract: In-Ear-Voice - Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms.
Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation, 2023

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms.
Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation, 2023

Personalized Task Load Prediction in Speech Communication.
Proceedings of the IEEE International Conference on Acoustics, 2023

Efficient Speech Quality Assessment Using Self-Supervised Framewise Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
BC-VAD: A Robust Bone Conduction Voice Activity Detection.
CoRR, 2022

Fast accuracy estimation of deep learning based multi-class musical source separation.
Proceedings of the 2022 Northern Lights Deep Learning Workshop, 2022

Application for Real-time Personalized Speaker Extraction.
Proceedings of the Interspeech 2022, 2022

MOSRA: Joint Mean Opinion Score and Room Acoustics Speech Quality Assessment.
Proceedings of the Interspeech 2022, 2022

Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load.
Proceedings of the Interspeech 2022, 2022

PEAF: Learnable Power Efficient Analog Acoustic Features for Audio Recognition.
Proceedings of the Interspeech 2022, 2022

SERAB: A Multi-Lingual Benchmark for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Power efficient analog features for audio recognition.
CoRR, 2021

A Universal Deep Room Acoustics Estimator.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Non-Intrusive Speech Quality Assessment with Transfer Learning and Subject-Specific Scaling.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping.
Proceedings of the HEAR: Holistic Evaluation of Audio Representations, 2021

Word-Level Embeddings for Cross-Task Transfer Learning in Speech Processing.
Proceedings of the 29th European Signal Processing Conference, 2021

AC-VC: Non-Parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Joint Blind Room Acoustic Characterization From Speech And Music Signals Using Convolutional Recurrent Neural Networks.
CoRR, 2020

FastVC: Fast Voice Conversion with non-parallel data.
CoRR, 2020

Deep Speech Inpainting of Time-Frequency Masks.
Proceedings of the Interspeech 2020, 2020

Spiking Neural Networks Trained With Backpropagation for Low Power Neuromorphic Implementation of Voice Activity Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Bin Encoding Training of a Spiking Neural Network Based Voice Activity Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Voice Presentation Attack Detection Using Convolutional Neural Networks.
Proceedings of the Handbook of Biometric Anti-Spoofing, 2019

Speech-VGG: A deep feature extractor for speech processing.
CoRR, 2019

End-to-End Accented Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Open-Vocabulary Keyword Spotting with Audio and Text Embeddings.
Proceedings of the Interspeech 2019, 2019

Evaluating Audiovisual Source Separation in the Context of Video Conferencing.
Proceedings of the Interspeech 2019, 2019

Phone-Attribute Posteriors to Evaluate the Speech of Cochlear Implant Users.
Proceedings of the Interspeech 2019, 2019

2018
Cognitive Speech Coding: Examining the Impact of Cognitive Speech Processing on Speech Compression.
IEEE Signal Process. Mag., 2018

NeuroSpeech.
SoftwareX, 2018

NeuroSpeech: An open-source software for Parkinson's speech analysis.
Digit. Signal Process., 2018

Phonological Posteriors and GRU Recurrent Units to Assess Speech Impairments of Patients with Parkinson's Disease.
Proceedings of the Text, Speech, and Dialogue - 21st International Conference, 2018

Phonological i-Vectors to Detect Parkinson's Disease.
Proceedings of the Text, Speech, and Dialogue - 21st International Conference, 2018

Nasal Speech Sounds Detection Using Connectionist Temporal Classification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Perceptual Information Loss due to Impaired Speech Production.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Characterisation of voice quality of Parkinson's disease using differential phonological posterior features.
Comput. Speech Lang., 2017

Speech vocoding for laboratory phonology.
Comput. Speech Lang., 2017

Bob Speaks Kaldi.
Proceedings of the Interspeech 2017, 2017

Multi-view representation learning via gcca for multimodal analysis of Parkinson's disease.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

On the impact of non-modal phonation on phonological features.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

On structured sparsity of phonological posteriors for linguistic parsing.
Speech Commun., 2016

An Analysis of Rhythmic Staccato-Vocalization Based on Frequency Demodulation for Laughter Detection in Conversational Meetings.
CoRR, 2016

Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

HMM-Based Non-Native Accent Assessment Using Posterior Features.
Proceedings of the Interspeech 2016, 2016

Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody.
Proceedings of the Interspeech 2016, 2016

PhonVoc: A Phonetic and Phonological Vocoding Toolkit.
Proceedings of the Interspeech 2016, 2016

Sound Pattern Matching for Automatic Prosodic Event Detection.
Proceedings of the Interspeech 2016, 2016

Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures.
Proceedings of the Interspeech 2016, 2016

Modeling unvoiced sounds in statistical parametric speech synthesis with a continuous vocoder.
Proceedings of the 24th European Signal Processing Conference, 2016

2015
Incremental Syllable-Context Phonetic Vocoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Residual-Based Excitation with Continuous F0 Modeling in HMM-Based Speech Synthesis.
Proceedings of the Statistical Language and Speech Processing, 2015

Automatic accentedness evaluation of non-native speech using phonetic and sub-phonetic posterior probabilities.
Proceedings of the INTERSPEECH 2015, 2015

Neuromorphic based oscillatory device for incremental syllable boundary detection.
Proceedings of the INTERSPEECH 2015, 2015

An empirical model of emphatic word detection.
Proceedings of the INTERSPEECH 2015, 2015

On compressibility of neural network phonological features for low bit rate speech coding.
Proceedings of the INTERSPEECH 2015, 2015

Phonological vocoding using artificial neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Development of bilingual ASR system for MediaParl corpus.
Proceedings of the INTERSPEECH 2014, 2014

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding.
Proceedings of the INTERSPEECH 2014, 2014

2013
A Simple Continuous Pitch Estimation Algorithm.
IEEE Signal Process. Lett., 2013

Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture.
Proceedings of the INTERSPEECH 2013, 2013

On the (UN)importance of the contextual factors in HMM-based speech synthesis and coding.
Proceedings of the IEEE International Conference on Acoustics, 2013

Automatic Staging of Audio with Emotions.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

2012
Reading companion: the technical and social design of an automated reading tutor.
Proceedings of the Third Workshop on Child, Computer and Interaction, 2012

Robust triphone mapping for acoustic modeling.
Proceedings of the INTERSPEECH 2012, 2012

2011
Rule-Based Triphone Mapping for Acoustic Modeling in Automatic Speech Recognition.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

Effective Triphone Mapping for Acoustic Modeling in Speech Recognition.
Proceedings of the INTERSPEECH 2011, 2011

2010
A Comparison of Decision Tree Classifiers for Automatic Diagnosis of Speech Recognition Errors.
Comput. Informatics, 2010

Diagnostics for Debugging Speech Recognition Systems.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

2006
Unit Selection Speech Synthesis in Noise.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Diagnostics of speech recognition using classification phoneme diagnostic trees.
Proceedings of the Second IASTED International Conference on Computational Intelligence, 2006

2005
TTSBOX: a MATLAB toolbox for teaching text-to-speech synthesis.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Slovak Speech Database for Experiments and Application Building in Unit-Selection Speech Synthesis.
Proceedings of the Text, Speech and Dialogue, 7th International Conference, 2004

