Philip C. Woodland

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

DNCASR: End-to-End Training for Speaker-Attributed ASR.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

SkillAggregation: Reference-free LLM-Dependent Aggregation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Graph Neural Networks for Contextual ASR With the Tree-Constrained Pointer Generator.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Decoupled structure for improved adaptability of end-to-end models.

Speech Commun., 2024

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events.

CoRR, 2024

Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning.

CoRR, 2024

CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models.

CoRR, 2024

Automatic Time Alignment Generation For End-to-End ASR Using Acoustic Probability Modelling.

Dongcheng Jiang

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem.

Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

An Improved Empirical Fisher Approximation for Natural Gradient Descent.

Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

SOT Triggered Neural Clustering for Speaker Attributed ASR.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Confidence Estimation for Automatic Detection of Depression and Alzheimer's Disease Based on Clinical Interviews.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation.

Nineli Lashkarashvili

Proceedings of the IEEE International Conference on Acoustics, 2024

FastInject: Injecting Unpaired Text Data into CTC-Based ASR Training.

Proceedings of the IEEE International Conference on Acoustics, 2024

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Modelling Variability in Human Annotator Simulation.

Findings of the Association for Computational Linguistics, 2024

Speech-based Slot Filling using Large Language Models.

Findings of the Association for Computational Linguistics, 2024

Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Combining hybrid DNN-HMM ASR systems with attention-based models using lattice rescoring.

Qiujia Li

Speech Commun., February, 2023

Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Estimating the Uncertainty in Emotion Class Labels With Utterance-Specific Dirichlet Priors.

IEEE Trans. Affect. Comput., 2023

Conditional Diffusion Model for Target Speaker Extraction.

CoRR, 2023

It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation.

CoRR, 2023

Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition.

CoRR, 2023

Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Biased Self-supervised Learning for ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Neural Time Alignment Module for End-to-End Automatic Speech Recognition.

Dongcheng Jiang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Self-Supervised Representations in Speech-Based Depression Detection.

Proceedings of the IEEE International Conference on Acoustics, 2023

End-to-End Spoken Language Understanding with Tree-Constrained Pointer Generator.

Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Learning-Based Source Separation for Meeting Data.

Yuang Li

Proceedings of the IEEE International Conference on Acoustics, 2023

Spectral Clustering-Aware Learning of Embeddings for Speaker Diarisation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Adaptable End-to-End ASR Models Using Replaceable Internal LMs and Residual Softmax.

Proceedings of the IEEE International Conference on Acoustics, 2023

Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression.

William D. Marslen-Wilson

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

On the similarities of representations in artificial and brain neural networks for speech recognition.

Li Su

Frontiers Comput. Neurosci., 2022

Distribution-Based Emotion Recognition in Conversation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Knowledge Distillation for Neural Transducers from Large Self-Supervised Pre-Trained Models.

Xiaoyu Yang

Qiujia Li

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Combination of deep speaker embeddings for diarisation.

Neural Networks, 2021

A distributed optimisation framework combining natural gradient with Hessian-free for discriminative sequence training.

Neural Networks, 2021

Discriminative Neural Clustering for Speaker Diarisation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Residual Energy-Based Models for End-to-End Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Variable Frame Rate Acoustic Models Using Minimum Error Reinforcement Learning.

Dongcheng Jiang

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Emotion Recognition by Fusing Time Synchronous and Time Asynchronous Representations.

Proceedings of the IEEE International Conference on Acoustics, 2021

Content-Aware Speaker Embeddings for Speaker Diarisation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Transformer Language Models with LSTM-Based Cross-Utterance Information Representation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2021

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Tree-Constrained Pointer Generator for End-to-End Contextual Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Cross-Utterance Language Models with Acoustic Error Sampling.

CoRR, 2020

Cosine-Distance Virtual Adversarial Training for Semi-Supervised Speaker-Discriminative Acoustic Embeddings.

Florian L. Kreyssig

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Improved Large-Margin Softmax Loss for Speaker Diarisation.

Yassir Fathullah

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Multi-Span Acoustic Modelling Using Raw Waveform Signals.

Patrick von Platen

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

PyHTK: Python Library and ASR Pipelines for HTK.

Proceedings of the IEEE International Conference on Acoustics, 2019

Speaker Diarisation Using 2D Self-attentive Combination of Embeddings.

Proceedings of the IEEE International Conference on Acoustics, 2019

Integrating Source-Channel and Attention-Based Sequence-to-Sequence Models for Speech Recognition.

Qiujia Li

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Semi-tied Units for Efficient Gating in LSTM and Highway Networks.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Combining Natural Gradient with Hessian Free Methods for Sequence Training.

Adnan Haider

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

High Order Recurrent Neural Networks for Acoustic Modelling.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Improved TDNNs Using Deep Kernels and Frequency Dependent Grid-RNNs.

Florian L. Kreyssig

William D. Marslen-Wilson

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem.

PLoS Comput. Biol., 2017

Joint optimisation of tandem systems using Gaussian mixture density neural network discriminative sequence training.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Sequence training of DNN acoustic models with natural gradient.

Adnan Haider

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Very deep convolutional neural networks for robust speech recognition.

Yanmin Qian

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

DNN speaker adaptation using parameterised sigmoid and ReLU hidden activation functions.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

System combination with log-linear models.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Improved DNN-based segmentation for multi-genre broadcast audio.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

A general artificial neural network extension for HTK.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

I-vector estimation using informative priors for adaptation of deep neural networks.

Penny Karanasou

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Paraphrastic recurrent neural network language models.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Recurrent neural network language model training with noise contrastive estimation for speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving the training and evaluation efficiency of recurrent neural network language models.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Cambridge University transcription systems for the multi-genre broadcast challenge.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The development of the Cambridge University alignment systems for the multi-genre broadcast challenge.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Speaker diarisation and longitudinal linking in multi-genre broadcast data.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Multilingual representations for low resource speech recognition and keyword search.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Investigation of back-off based interpolation between recurrent neural network and n-gram language models.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The MGB challenge: Evaluating multi-genre broadcast media recognition.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Adaptation of deep neural network acoustic models using factorised i-vectors.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Standalone training of context-dependent deep neural network acoustic models.

Proceedings of the IEEE International Conference on Acoustics, 2014

Direct sub-word confidence estimation with hidden-state conditional random fields.

Proceedings of the IEEE International Conference on Acoustics, 2014

Detecting deletions in ASR output.

Proceedings of the IEEE International Conference on Acoustics, 2014

Efficient lattice rescoring using recurrent neural network language models.

Proceedings of the IEEE International Conference on Acoustics, 2014

Paraphrastic neural network language models.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Improving lightly supervised training for broadcast transcription.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Cross-domain paraphrasing for improving language modelling using out-of-domain data.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic Transcription of Multi-genre Media Archives.

Pawel Swietojanski

Proceedings of the First Workshop on Speech, 2013

A confidence-based approach for improving keyword hypothesis scores.

Proceedings of the IEEE International Conference on Acoustics, 2013

System combination and score normalization for spoken term detection.

Proceedings of the IEEE International Conference on Acoustics, 2013

Paraphrastic language models and combination with neural network language models.

Proceedings of the IEEE International Conference on Acoustics, 2013

A high-performance Cantonese keyword search system.

Proceedings of the IEEE International Conference on Acoustics, 2013

Investigation of multilingual deep neural networks for spoken term detection.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Morphological decomposition in Arabic ASR systems.

Comput. Speech Lang., 2012

Transcription of multi-genre media archives using out-of-domain data.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Using Sub-word-level Information for Confidence Estimation with Conditional Random Field Models.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Paraphrastic Language Models.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Complementary Phone Error Training.

Frank Diehl

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

The efficient incorporation of MLP features into automatic speech recognition systems.

Comput. Speech Lang., 2011

Combining Information Sources for Confidence Estimation with CRF Models.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Improving LVCSR System Combination Using Neural Network Language Model Cross Adaptation.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Graphone Model Interpolation and Arabic Pronunciation Generation.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems.

Frank Diehl

Mark John Francis Gales

Marcus Tomalin

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Investigation of acoustic units for LVCSR systems.

Mark John Francis Gales

Jim L. Hieronymus

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Unsupervised training and directed manual transcription for LVCSR.

Speech Commun., 2010

Improved neural network based language modelling and adaptation.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Language model cross adaptation for LVCSR system combination.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Recent improvements to the Cambridge Arabic Speech-to-Text systems.

Proceedings of the IEEE International Conference on Acoustics, 2010

Language model combination and adaptation using weighted finite state transducers.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Unsupervised Adaptation With Discriminative Mapping Transforms.

Kai Yu

IEEE Trans. Speech Audio Process., 2009

Efficient generation and use of MLP features for Arabic speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Use of contexts in language model interpolation and adaptation.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Exploiting Chinese character models to improve speech recognition performance.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Morphological analysis and decomposition for Arabic speech-to-text systems.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Training and adapting MLP features for Arabic speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

MPE-based discriminative linear transforms for speaker adaptation.

Lan Wang

Comput. Speech Lang., 2008

Context dependent language model adaptation.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Unsupervised discriminative adaptation using discriminative mapping transforms.

Kai Yu

Proceedings of the IEEE International Conference on Acoustics, 2008

Phonetic pronunciations for Arabic speech-to-text systems.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio.

Kai Yu

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Unsupervised Training for Mandarin Broadcast News and Conversation Transcription.

Lan Wang

Proceedings of the IEEE International Conference on Acoustics, 2007

Improving Speech Transcription for Mandarin-English Translation.

Proceedings of the IEEE International Conference on Acoustics, 2007

Consensus Network Decoding for Statistical Machine Translation System Combination.

Proceedings of the IEEE International Conference on Acoustics, 2007

Speech Recognition System Combination for Machine Translation.

Abdelkhalek Messaoudi

Proceedings of the IEEE International Conference on Acoustics, 2007

Discriminative language model adaptation for Mandarin broadcast speech transcription and translation.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Development of a phonetic system for large vocabulary Arabic speech recognition.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

Corrections to "Automatic Transcription of Conversational Telephone Speech".

IEEE Trans. Speech Audio Process., 2006

Progress in the CU-HTK broadcast news transcription system.

IEEE Trans. Speech Audio Process., 2006

Unsupervised language model adaptation for Mandarin broadcast conversation transcription.

David Mrva

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Discriminatively Trained Gaussian Mixture Models for Sentence Boundary Detection.

Marcus Tomalin

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

The CU-HTK Mandarin Broadcast News Transcription System.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Automatic transcription of conversational telephone speech.

IEEE Trans. Speech Audio Process., 2005

The Cambridge University March 2005 speaker diarisation system.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Structural metadata research in the EARS program.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Development of the CU-HTK 2004 Broadcast News Transcription Systems.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Development of the CUHTK 2004 Mandarin Conversational Telephone Speech Transcription System.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Training LVCSR Systems on Thousands of Hours of Data.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Automatic capitalisation generation for speech input.

Comput. Speech Lang., 2004

A PLSA-based language model for conversational telephone speech.

David Mrva

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Using VTLN for broadcast news transcription.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

MPE-based discriminative linear transform for speaker adaptation.

Lan Wang

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Development of the 2003 CU-HTK conversational telephone speech transcription system.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Improving broadcast news transcription by lightly supervised discriminative training.

Ho Yin Chan

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

A combined punctuation generation and speech recognition system and its performance enhancement using prosody.

Speech Commun., 2003

Erratum: Language modelling for Russian and English using words and classes [Computer Speech and Language 17 (2003) 87-104].

Comput. Speech Lang., 2003

Language modelling for Russian and English using words and classes.

Comput. Speech Lang., 2003

MMI-MAP and MPE-MAP for acoustic model adaptation.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Discriminative MAP for acoustic model adaptation.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Automatic complexity control for HLDA systems.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Porting: SwitchBoard to the VoiceMail task.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

The development of the HTK Broadcast News transcription system: An overview.

Speech Commun., 2002

Large scale discriminative training of hidden Markov models for speech recognition.

Comput. Speech Lang., 2002

Cluster identification for speaker-environment tracking.

J. T. Wickramaratna

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Maximum mutual information training of hidden Markov models with vector linear predictors.

K. K. Chin

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Minimum Phone Error and I-smoothing for improved discriminative training.

Proceedings of the IEEE International Conference on Acoustics, 2002

Implementation of automatic capitalisation generation systems for speech input.

Proceedings of the IEEE International Conference on Acoustics, 2002

Improved cross-task recognition using MMIE training.

Ricardo de Córdoba

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

Information Retrieval from Unsegmented Broadcast News Audio.

Int. J. Speech Technol., 2001

The use of prosody in a combined system for punctuation generation and speech recognition.

Proceedings of EUROSPEECH 2001 Scandinavia, 2001

Efficient class-based language modelling for very large vocabularies.

Proceedings of the IEEE International Conference on Acoustics, 2001

Improvements in linear transform based speaker adaptation.

Luís Felipe Uebel

Proceedings of the IEEE International Conference on Acoustics, 2001

Improved discriminative training techniques for large vocabulary continuous speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2001

New features in the CU-HTK system for transcription of conversational telephone speech.

Proceedings of the IEEE International Conference on Acoustics, 2001

2000

Spoken document representations for probabilistic retrieval.

Speech Commun., 2000

Spoken Document Retrieval for TREC-9 at Cambridge University.

Proceedings of The Ninth Text REtrieval Conference, 2000

Effects of out of vocabulary words in spoken document retrieval.

SIGIR 2000: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2000

The Cambridge University multimedia document retrieval demo system.

[BibT_eX]

[DOI]

Proceedings of the SIGIR 2000: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2000

Audio Indexing and Retrieval of Complete Broadcast News Shows.

[BibT_eX]

[DOI]

Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2000

Particle-based language modelling.

[BibT_eX]

[DOI]

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A rule-based named entity recognition system for speech input.

[BibT_eX]

[DOI]

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Modelling sub-phone insertions and deletions in continuous speech recognition.

[BibT_eX]

[DOI]

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A method for direct audio search with applications to indexing and retrieval.

[BibT_eX]

[DOI]

Sue E. Johnson

Proceedings of the IEEE International Conference on Acoustics, 2000

Large vocabulary decoding and confidence estimation using word posterior probabilities.

[BibT_eX]

[DOI]

Gunnar Evermann

Proceedings of the IEEE International Conference on Acoustics, 2000

1999

Variable-length category n-gram language models.

[BibT_eX]

[DOI]

Comput. Speech Lang., 1999

A hidden Markov-model-based trainable speech synthesizer.

[BibT_eX]

[DOI]

Robert E. Donovan

Comput. Speech Lang., 1999

Spoken Document Retrieval for TREC-8 at Cambridge University.

[BibT_eX]

[DOI]

Proceedings of The Eighth Text REtrieval Conference, 1999

Improving Retrieval on Imperfect Speech Transcriptions (poster abstract).

[BibT_eX]

[DOI]

Proceedings of the SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999

Improvements in accuracy and speed in the HTK broadcast news transcription system.

[BibT_eX]

[DOI]

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

An investigation into vocal tract length normalisation.

[BibT_eX]

[DOI]

Luís Felipe Uebel

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Dynamic HMM selection for continuous speech recognition.

[BibT_eX]

[DOI]

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Frame discrimination training for HMMs for large vocabulary speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

The Cambridge University spoken document retrieval system.

[BibT_eX]

[DOI]

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

The 1998 HTK system for transcription of conversational telephone speech.

[BibT_eX]

[DOI]

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998

Spoken Document Retrieval For TREC-7 At Cambridge University.

[BibT_eX]

Proceedings of The Seventh Text REtrieval Conference, 1998

Comparison of language modelling techniques for Russian and English.

[BibT_eX]

[DOI]

Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Speaker clustering using direct maximisation of the MLLR-adapted likelihood.

[BibT_eX]

[DOI]

Sue E. Johnson

Segmentation and classification of broadcast news audio.

[BibT_eX]

[DOI]

Experiments in broadcast news transcription.

[BibT_eX]

[DOI]

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Comparison of part-of-speech and automatically derived category-based language models for speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

The use of accent-specific pronunciation dictionaries in acoustic model training.

[BibT_eX]

[DOI]

Jason J. Humphries

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997

MMIE training of large vocabulary recognition systems.

[BibT_eX]

[DOI]

Speech Commun., 1997

Multilingual large vocabulary speech recognition: the European SQALE project.

[BibT_eX]

[DOI]

Herman J. M. Steeneken

Comput. Speech Lang., 1997

Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models.

[BibT_eX]

[DOI]

S. M. Ahadi

Comput. Speech Lang., 1997

Using accent-specific pronunciation modelling for improved large vocabulary continuous speech recognition.

[BibT_eX]

[DOI]

Jason J. Humphries

Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Broadcast news transcription using HTK.

[BibT_eX]

[DOI]

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Experiments in speaker normalisation and adaptation for large vocabulary speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Modelling word-pair relations in a category-based language model.

[BibT_eX]

[DOI]

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996

Mean and variance adaptation within the MLLR framework.

[BibT_eX]

[DOI]

Comput. Speech Lang., 1996

Iterative unsupervised adaptation using maximum likelihood linear regression.

[BibT_eX]

[DOI]

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Discriminative optimisation of large vocabulary recognition systems.

[BibT_eX]

[DOI]

V. Valtchev

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Combination of word-based and category-based language models.

[BibT_eX]

[DOI]

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Using accent-specific pronunciation modelling for robust speech recognition.

[BibT_eX]

[DOI]

Jason J. Humphries

David J. B. Pearce

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation.

[BibT_eX]

[DOI]

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Improving environmental robustness in large vocabulary speech recognition.

[BibT_eX]

[DOI]

Mark John Francis Gales

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Lattice-based discriminative training for large vocabulary speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

A variable-length category-based n-gram language model.

[BibT_eX]

[DOI]

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995

Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models.

[BibT_eX]

[DOI]

C. J. Leggetter

Comput. Speech Lang., 1995

Large vocabulary multilingual speech recognition using HTK.

[BibT_eX]

[DOI]

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Flexible speaker adaptation for large vocabulary speech recognition.

[BibT_eX]

[DOI]

C. J. Leggetter

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Improvements in an HMM-based speech synthesiser.

[BibT_eX]

[DOI]

Robert E. Donovan

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

The 1994 HTK large vocabulary speech recognition system.

[BibT_eX]

[DOI]

Proceedings of the 1995 International Conference on Acoustics, 1995

Automatic speech synthesiser parameter estimation using HMMs.

[BibT_eX]

[DOI]

Robert E. Donovan

Proceedings of the 1995 International Conference on Acoustics, 1995

Rapid speaker adaptation using model prediction.

[BibT_eX]

[DOI]

S. M. Ahadi

Proceedings of the 1995 International Conference on Acoustics, 1995

1994

Spontaneous speech recognition for the credit card corpus using the HTK toolkit.

[BibT_eX]

[DOI]

William J. Byrne

IEEE Trans. Speech Audio Process., 1994

State clustering in hidden Markov model-based continuous speech recognition.

[BibT_eX]

[DOI]

Comput. Speech Lang., 1994

Tree-Based State Tying for High Accuracy Modelling.

[BibT_eX]

[DOI]

J. J. Odell

Proceedings of the Human Language Technology, 1994

A One Pass Decoder Design For Large Vocabulary Recognition.

[BibT_eX]

[DOI]

Proceedings of the Human Language Technology, 1994

A dynamic network decoder design for large vocabulary speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speaker adaptation of continuous density HMMs using multivariate linear regression.

[BibT_eX]

[DOI]

C. J. Leggetter

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Modelling syllable characteristics to improve a large vocabulary continuous speech recogniser.

[BibT_eX]

[DOI]

M. Jones

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Large vocabulary continuous speech recognition using HTK.

[BibT_eX]

[DOI]

Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993

The use of state tying in continuous speech recognition.

[BibT_eX]

[DOI]

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

The HTK tied-state continuous speech recogniser.

[BibT_eX]

[DOI]

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Hidden Markov models using shared vector linear predictors.

[BibT_eX]

[DOI]

B. A. Maxwell

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Using relative duration in large vocabulary speech recognition.

[BibT_eX]

[DOI]

M. Jones

Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Exploiting variable-width features in large vocabulary speech recognition.

[BibT_eX]

[DOI]

M. Jones

Proceedings of the IEEE International Conference on Acoustics, 1993

A wave digital filter model of the entire auditory periphery.

[BibT_eX]

[DOI]

Christian Giguère

Proceedings of the IEEE International Conference on Acoustics, 1993

1992

Hidden Markov models using vector linear prediction and discriminative output distributions.

[BibT_eX]

[DOI]

Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991

Optimising hidden Markov models using discriminative output distributions.

[BibT_eX]

[DOI]

David R. Cole

Proceedings of the 1991 International Conference on Acoustics, 1991

1990

An experimental comparison of connectionist and conventional classification systems on natural data.

[BibT_eX]

[DOI]