Torbjørn Svendsen

Tor André Myrvoll

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper.

Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions.

Moreno La Quatra

Maria Francesca Turco

Juan Rafael Orozco-Arroyave

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

A Framework for Phoneme-Level Pronunciation Assessment Using CTC.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Collecting Linguistic Resources for Assessing Children's Pronunciation of Nordic Languages.

Anne Marte Haug Olstad

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023

Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children.

Yaroslav Getman

Nhan Phan

Ragheb Al-Ghezi

Ekaterina Voskoboinik

Anna-Riikka Smolander

Sari Ylinen

IEEE Access, 2023

Improving Generalization of Norwegian ASR with Limited Linguistic Resources.

Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

A character-based analysis of impacts of dialects on end-to-end Norwegian ASR.

Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation.

Janine Rugayan

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An Analysis of Goodness of Pronunciation for Child Speech.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Using Modified Adult Speech as Data Augmentation for Child Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Semantically Meaningful Metrics for Norwegian ASR Systems.

Janine Rugayan

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

wav2vec2-based Speech Rating System for Children with Speech Sound Disorder.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Two-Stage Deep Modeling Approach to Articulatory Inversion.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Sequence-to-Sequence Articulatory Inversion Through Time Convolution of Sub-Band Frequency Signals.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transfer Learning of Articulatory Information Through Phone Information.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

A Comparative Study of Deep Learning Techniques on Frame-Level Speech Data Classification.

Circuits Syst. Signal Process., 2019

A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Text-Independent Speaker ID Employing 2D-CNN for Automatic Video Lecture Categorization in a MOOC Setting.

Zenun Kastrati

Torbjørn Karl Svendsen

Arianit Kurti

Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence, 2019

A Study on the Performance Evaluation of Machine Learning Models for Phoneme Classification.

Proceedings of the 2019 11th International Conference on Machine Learning and Computing, 2019

Evaluating Acoustic Feature Maps in 2D-CNN for Speaker Identification.

Vetle Haflan

Torbjørn Karl Svendsen

Proceedings of the 2019 11th International Conference on Machine Learning and Computing, 2019

Text-Independent Speaker ID for Automatic Video Lecture Classification Using Deep Learning.

Zenun Kastrati

Torbjørn Karl Svendsen

Arianit Kurti

Proceedings of the 2019 5th International Conference on Computing and Artificial Intelligence, 2019

2018

Acoustic Feature Comparison for Different Speaking Rates.

Proceedings of the Human-Computer Interaction. Interaction Technologies, 2018

2015

Combining NDHMM and phonetic feature detection for speech recognition.

Jarle Bauck Hamar

Proceedings of the 23rd European Signal Processing Conference, 2015

2014

An artificial neural network approach to automatic speech processing.

Neurocomputing, 2014

2013

A Bottom-Up Modular Search Approach to Large Vocabulary Continuous Speech Recognition.

IEEE Trans. Speech Audio Process., 2013

Universal attribute characterization of spoken languages for automatic spoken language recognition.

Jeremy Reed

Comput. Speech Lang., 2013

Non-negative durational HMM.

Jarle Bauck Hamar

Doddipatla Rama Sanand

Thippur Sreenivas

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2013

Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASR.

D. Rama Sanand

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data.

Dau-Cheng Lyu

IEEE Trans. Speech Audio Process., 2012

2011

iVector Approach to Phonotactic Language Recognition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Frequency-Warped and Stabilized Time-Varying Cepstral Coefficients.

Trond Skogstad

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Pronunciation variation modeling of non-native proper names by discriminative tree search.

Line Adde

Proceedings of the IEEE International Conference on Acoustics, 2011

Multi-site heterogeneous system fusions for the Albayzin 2010 Language Recognition Evaluation.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

On the use of discriminative and non-discriminative pronunciation priors in pronunciation variation modeling of non-native proper names.

Line Adde

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Spontal-N: A Corpus of Interactional Spoken Norwegian.

Proceedings of the International Conference on Language Resources and Evaluation, 2010

NameDat: A Database of English Proper Names Spoken by Native Norwegians.

Line Adde

Proceedings of the International Conference on Language Resources and Evaluation, 2010

A survey on recent progress in the ASAT/SIRKUS paradigm.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Intra-frame variability as a predictor of frame classifiability.

Trond Skogstad

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition.

Jeremy Reed

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A minimum classification error approach to pronunciation variation modeling of non-native proper names.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Experimental studies on continuous speech recognition using neural architectures with "adaptive" hidden activation functions.

Filippo Sorbello

Proceedings of the IEEE International Conference on Acoustics, 2010

The NTNU Concatenative Speech Synthesizer.

Dyre Meen

Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010

2009

Exploring universal attribute characterization of spoken languages for spoken language recognition.

Jeremy Reed

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A phonetic feature based lattice rescoring approach to LVCSR.

Proceedings of the IEEE International Conference on Acoustics, 2009

Lexicon adaptation for subword speech recognition.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

RUNDKAST: an Annotated Norwegian Broadcast News Speech Corpus.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

A penalized logistic regression approach to detection based phone classification.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Toward a detector-based universal phone recognizer.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Towards bottom-up continuous phone recognition.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

FonDat1: A Speech Synthesis Corpus for Norwegian.

Ingunn Amdal

Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

2005

Distributed ASR using speech coder data for efficient feature vector representation.

Trond Skogstad

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Comparing spectral distance measures for join cost optimization in concatenative speech synthesis.

Ingmund Bjørkan

Snorre Farner

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Unit selection synthesis database development using utterance verification.

Ingunn Amdal

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2003

Multilingual phone clustering for recognition of spontaneous Indonesian speech utilising pronunciation modelling techniques.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Cross-lingual pronunciation modelling for Indonesian speech recognition.

Terrence Martin

Sridha Sridharan

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles.

Ingunn Amdal

Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

2001

Fast adaptation using constrained affine transformations with hierarchical priors.

Tor André Myrvoll

Kuldip K. Paliwal

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000

TABOR - a Norwegian spoken dialogue system for bus travel information.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Stochastic modeling of semantic content for use in a spoken dialogue system.

Erik Harborg

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

ASR-based subtitling of live TV-programs for the hearing impaired.

Erik Harborg

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Maximum likelihood modelling of pronunciation variation.

Speech Commun., 1999

On-line captioning of TV-programs for the hearing impaired.

Erik Harborg

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1997

Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition.

Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996

Combined Optimisation of Baseforms and Subword Models for an HMM Based Speech Recogniser.

Proceedings of the Fourth International Symposium on Signal Processing and Its Applications, 1996

1995

Optimizing baseforms for HMM-based speech recognition.

Frank K. Soong

Heiko Purnhagen

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1994

Segmental quantization of speech spectral information.

Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993

Efficient quantization of speech spectral information.

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

COST232: speech recognition over the telephone line.

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

A time-frequency segmental neural network for phoneme recognition.

Anjan Basu

Proceedings of the IEEE International Conference on Acoustics, 1993

1991

ANN-based speech recognition using a preprocessor for non-linear time compression.

P. O. Husoy

Proceedings of the Second European Conference on Speech Communication and Technology, 1991

1990

Automatic alignment of phonemic labels with continuous speech.

Knut Kvale

Proceedings of the First International Conference on Spoken Language Processing, 1990

1989

An improved sub-word based speech recognizer.

Proceedings of the IEEE International Conference on Acoustics, 1989

1987

On the automatic segmentation of speech signals.

Frank K. Soong

Proceedings of the IEEE International Conference on Acoustics, 1987

1986

Multi-dimensional quantization applied to predictive coding of speech.

Proceedings of the IEEE International Conference on Acoustics, 1986

1985

A study of three coders (sub-band, RELP and MPE) for speech with additive white noise.

Kuldip K. Paliwal

Proceedings of the IEEE International Conference on Acoustics, 1985

1984

Tree encoding of the LPC residual.
