Sanjeev Khudanpur

ORCID: 0000-0001-5976-0897

According to our database, Sanjeev Khudanpur authored at least 271 papers between 1997 and 2024.

Bibliography

2024
On Speaker Attribution with SURT.
CoRR, 2024

2023
SURT 2.0: Advances in Transducer-Based Multi-Talker Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

A dilemma of ground truth in noisy speech separation and an approach to lessen the impact of imperfect training data.
Comput. Speech Lang., 2023

Enhancing Code-switching Speech Recognition with Interactive Language Biases.
CoRR, 2023

Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization.
CoRR, 2023

Speech collage: code-switched audio generation by collaging monolingual corpora.
CoRR, 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios.
CoRR, 2023

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation.
CoRR, 2023

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts.
CoRR, 2023

Investigating model performance in language identification: beyond simple error statistics.
CoRR, 2023

JHU IWSLT 2023 Multilingual Speech Translation System Description.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

JHU IWSLT 2023 Dialect Speech Translation System Description.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Crosslingual Handwritten Text Generation Using GANs.
Proceedings of the Document Analysis and Recognition - ICDAR 2023 Workshops, 2023

Reducing Language Confusion for Code-Switching Speech Recognition with Token-Level Language Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Building Keyword Search System from End-To-End ASR Systems.
Proceedings of the IEEE International Conference on Acoustics, 2023

Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2023

EURO: ESPnet Unsupervised ASR Open-Source Toolkit.
Proceedings of the IEEE International Conference on Acoustics, 2023

Clustering Unsupervised Representations as Defense Against Poisoning Attacks on Speech Commands Classification System.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Joint Energy-Based Model for Robust Speech Classification System Against Dirty-Label Backdoor Poisoning Attacks.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Efficient Self-Supervised Learning Representations for Spoken Language Identification.
IEEE J. Sel. Top. Signal Process., 2022

Joint speaker diarization and speech recognition based on region proposal networks.
Comput. Speech Lang., 2022

GPU-accelerated Guided Source Separation for Meeting Transcription.
CoRR, 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser.
CoRR, 2022

Enhance Language Identification using Dual-mode Model with Knowledge Distillation.
CoRR, 2022

Characterizing the Details of Spatial Construction: Cognitive Constraints and Variability.
Cogn. Sci., 2022

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

JHU IWSLT 2022 Dialect Speech Translation System Description.
Proceedings of the 19th International Conference on Spoken Language Translation, 2022

Chunking Defense for Adversarial Attacks on ASR.
Proceedings of the Interspeech 2022, 2022

PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification.
Proceedings of the Interspeech 2022, 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition System using Adversarial Fine-tuning with Denoiser.
Proceedings of the Interspeech 2022, 2022

Injecting Text and Cross-Lingual Supervision in Few-Shot Learning from Self-Supervised Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

Investigating Self-Supervised Learning for Speech Enhancement and Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder With Exact Lattice Generation.
IEEE Signal Process. Lett., 2021

Fine-Grained Activity Recognition for Assembly Videos.
IEEE Robotics Autom. Lett., 2021

Lhotse: a speech data representation library for the modern deep learning ecosystem.
CoRR, 2021

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio.
CoRR, 2021

Adversarial Attacks and Defenses for Speech Recognition Systems.
CoRR, 2021

Learning Policies for Multilingual Training of Neural Machine Translation Systems.
CoRR, 2021

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap.
CoRR, 2021

Learning Feature Weights using Reward Modeling for Denoising Parallel Corpora.
Proceedings of the Sixth Conference on Machine Translation, 2021

Multi-Class Spectral Clustering with Overlaps for Speaker Diarization.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Learning Curricula for Multilingual Neural Machine Translation Training.
Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021

Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speaker Verification-Based Evaluation of Single-Channel Speech Separation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

End-to-End Language Diarization for Bilingual Code-Switching Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Wake Word Detection with Streaming Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2021

Training Noisy Single-Channel Speech Separation with Noisy Oracle Sources: A Large Gap and a Small Step.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Parallelizable Lattice Rescoring Strategy with Neural Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

An Asynchronous WFST-Based Decoder for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Frustratingly Easy Noise-aware Training of Acoustic Models.
CoRR, 2020

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge.
CoRR, 2020

Wake Word Detection with Alignment-Free Lattice-Free MMI.
Proceedings of the Interspeech 2020, 2020

PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR.
Proceedings of the Interspeech 2020, 2020

Neural Language Modeling with Implicit Cache Pointers.
Proceedings of the Interspeech 2020, 2020

Efficient MDI Adaptation for n-Gram Language Models.
Proceedings of the Interspeech 2020, 2020

An Alternative to MFCCs for ASR.
Proceedings of the Interspeech 2020, 2020

OOV Recovery with Efficient 2nd Pass Decoding and Open-vocabulary Word-level RNNLM Rescoring for Hybrid ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

An Empirical Study of Transformer-Based Neural Language Model Adaptation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speaker Diarization with Region Proposal Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Sample Selection for Large-scale MT Discriminative Training.
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, 2020

2019
Analysis of Robustness of Deep Single-Channel Speech Separation Using Corpora Constructed From Multiple Domains.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Toward Computer Vision Systems That Understand Real-World Assembly Processes.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Multi-PLDA Diarization on Children's Speech.
Proceedings of the Interspeech 2019, 2019

Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network.
Proceedings of the Interspeech 2019, 2019

Pretraining by Backtranslation for End-to-End ASR in Low-Resource Settings.
Proceedings of the Interspeech 2019, 2019

The JHU ASR System for VOiCES from a Distance Challenge 2019.
Proceedings of the Interspeech 2019, 2019

State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18.
Proceedings of the Interspeech 2019, 2019

The JHU Speaker Recognition System for the VOiCES 2019 Challenge.
Proceedings of the Interspeech 2019, 2019

Speaker Recognition Benchmark Using the CHiME-5 Corpus.
Proceedings of the Interspeech 2019, 2019

x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition.
Proceedings of the Interspeech 2019, 2019

Optical Character Recognition with Chinese and Korean Character Decomposition.
Proceedings of the Second International Workshop on Machine Learning, 2019

Using ASR Methods for OCR.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Speaker Recognition for Multi-speaker Conversations Using X-vectors.
Proceedings of the IEEE International Conference on Acoustics, 2019

Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System.
Proceedings of the IEEE International Conference on Acoustics, 2019

Bottom-Up Unsupervised Word Discovery via Acoustic Units.
Proceedings of the 2019 IEEE Global Conference on Signal and Information Processing, 2019

Zero-Shot Pronunciation Lexicons for Cross-Language Acoustic Model Transfer.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Probing the Information Encoded in X-Vectors.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Incremental Lattice Determinization for WFST Decoders.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Flat-Start Single-Stage Discriminatively Trained HMM-Based Models for ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Low Latency Acoustic Modeling Using Temporal Convolution and LSTMs.
IEEE Signal Process. Lett., 2018

Low Resource Multi-modal Data Augmentation for End-to-end ASR.
CoRR, 2018

Building Corpora for Single-Channel Speech Separation Across Multiple Domains.
CoRR, 2018

The JHU Speech LOREHLT 2017 System: Cross-Language Transfer for Situation-Frame Detection.
CoRR, 2018

A Teacher-Student Learning Approach for Unsupervised Domain Adaptation of Sequence-Trained ASR Models.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Low-Resource Contextual Topic Identification on Speech.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Improving LF-MMI Using Unconstrained Supervisions for ASR.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Spoken Language Recognition using X-vectors.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Automatic Speech Recognition and Topic Identification from Speech for Almost-Zero-Resource Languages.
Proceedings of the Interspeech 2018, 2018

Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge.
Proceedings of the Interspeech 2018, 2018

Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks.
Proceedings of the Interspeech 2018, 2018

Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition.
Proceedings of the Interspeech 2018, 2018

End-to-end Speech Recognition Using Lattice-free MMI.
Proceedings of the Interspeech 2018, 2018

End-to-end Deep Neural Network Age Estimation.
Proceedings of the Interspeech 2018, 2018

Acoustic Modeling from Frequency Domain Representations of Speech.
Proceedings of the Interspeech 2018, 2018

Output-Gate Projected Gated Recurrent Unit for Speech Recognition.
Proceedings of the Interspeech 2018, 2018

A GPU-based WFST Decoder with Exact Lattice Generation.
Proceedings of the Interspeech 2018, 2018

Neural Network Language Modeling with Letter-Based Features and Importance Sampling.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A Pruned RNNLM Lattice-Rescoring Algorithm for Automatic Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

X-Vectors: Robust DNN Embeddings for Speaker Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Enhancement and Analysis of Conversational Speech: JSALT 2017.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A Time-Restricted Self-Attention Layer for ASR.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Bayesian Models for Unit Discovery on a Very Low Resource Language.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Semi-Supervised Training of Acoustic Models Using Lattice-Free MMI.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Characterizing Performance of Speaker Diarization Systems on Far-Field Speech Using Standard Methods.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Constraints and Development in Children's Block Construction.
Proceedings of the 40th Annual Meeting of the Cognitive Science Society, 2018

2017
A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery.
IEEE Trans. Biomed. Eng., 2017

Using of heterogeneous corpora for training of an ASR system.
CoRR, 2017

Acoustic Data-Driven Lexicon Learning Based on a Greedy Pronunciation Selection Framework.
Proceedings of the Interspeech 2017, 2017

Backstitch: Counteracting Finite-Sample Bias via Negative Steps.
Proceedings of the Interspeech 2017, 2017

The Kaldi OpenKWS System: Improving Low Resource Keyword Search.
Proceedings of the Interspeech 2017, 2017

Deep Neural Network Embeddings for Text-Independent Speaker Verification.
Proceedings of the Interspeech 2017, 2017

Topic Identification for Speech Without ASR.
Proceedings of the Interspeech 2017, 2017

Phone Duration Modeling for LVCSR Using Neural Networks.
Proceedings of the Interspeech 2017, 2017

An Exploration of Dropout with LSTMs.
Proceedings of the Interspeech 2017, 2017

An empirical evaluation of zero resource acoustic unit discovery.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A study on data augmentation of reverberant speech for robust speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Topic identification of spoken documents using unsupervised acoustic unit discovery.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Characterizing spatial construction processes: Toward computational tools to understand cognition.
Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

JHU Kaldi system for Arabic MGB-3 ASR challenge using diarization, audio-transcript alignment and transfer learning.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Investigation of transfer learning for ASR using LF-MMI trained neural networks.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Getting more from automatic transcripts for semi-supervised language modeling.
Comput. Speech Lang., 2016

Query-by-example surgical activity detection.
Int. J. Comput. Assist. Radiol. Surg., 2016

Deep neural network-based speaker embeddings for end-to-end speaker verification.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI.
Proceedings of the Interspeech 2016, 2016

Far-Field ASR Without Parallel Data.
Proceedings of the Interspeech 2016, 2016

Acoustic Modelling from the Signal Domain Using CNNs.
Proceedings of the Interspeech 2016, 2016

Unsupervised surgical data alignment with application to automatic activity annotation.
Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

Highway long short-term memory RNNs for distant speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Adapting ASR for under-resourced languages using mismatched transcriptions.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Context-dependent point process models for keyword search and detection-based ASR.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Acoustic data-driven pronunciation lexicon generation for logographic languages.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging.
Proceedings of the 3rd International Conference on Learning Representations, 2015

A diversity-penalizing ensemble training method for deep learning.
Proceedings of the INTERSPEECH 2015, 2015

Modeling phonetic context with non-random forests for speech recognition.
Proceedings of the INTERSPEECH 2015, 2015

A time delay neural network architecture for efficient modeling of long temporal contexts.
Proceedings of the INTERSPEECH 2015, 2015

Reverberation robust acoustic modeling using i-vectors with time delay neural networks.
Proceedings of the INTERSPEECH 2015, 2015

Semi-supervised maximum mutual information training of deep neural network acoustic models.
Proceedings of the INTERSPEECH 2015, 2015

Audio augmentation for speech recognition.
Proceedings of the INTERSPEECH 2015, 2015

Pronunciation and silence probability modeling for ASR.
Proceedings of the INTERSPEECH 2015, 2015

Structured variability in acoustic realization: a corpus study of voice onset time in American English stops.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Librispeech: An ASR corpus based on public domain audio books.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Combining local and broad topic context to improve term detection.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

A keyword search system using open source software.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Translations of the Callhome Egyptian Arabic corpus for conversational speech translation.
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers, 2014

Low-resource open vocabulary keyword search using point process models.
Proceedings of the INTERSPEECH 2014, 2014

Improving deep neural network acoustic models using generalized maxout networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Limited resource term detection for effective topic identification of speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

Some insights from translating conversational telephone speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

A pitch extraction algorithm tuned for automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Can You Repeat That? Using Word Repetition to Improve Spoken Term Detection.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Online Learning in Tensor Space.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation.
CoRR, 2013

String Motif-Based Description of Tool Motion for Detecting Skill and Gestures in Robotic Surgery.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, 2013

Improved speech-to-text translation with the Fisher and Callhome Spanish-English speech translation corpus.
Proceedings of the 10th International Workshop on Spoken Language Translation: Papers, 2013

Quantifying the value of pronunciation lexicons for keyword search in low resource languages.
Proceedings of the IEEE International Conference on Acoustics, 2013

Using proxies for OOV keywords in the keyword search task.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Revisiting the Case for Explicit Syntactic Information in Language Models.
Proceedings of the Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, 2012

Hallucinating system outputs for discriminative language modeling.
Proceedings of the 2012 Symposium on Machine Learning in Speech and Language Processing, 2012

Sparse Hidden Markov Models for Surgical Gesture Classification and Skill Evaluation.
Proceedings of the Information Processing in Computer-Assisted Interventions, 2012

Phrasal Cohort Based Unsupervised Discriminative Language Modeling.
Proceedings of the INTERSPEECH 2012, 2012

Efficient Structured Language Modeling for Speech Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Semi-Supervised Methods for Improving Keyword Search of Unseen Terms.
Proceedings of the INTERSPEECH 2012, 2012

Fast Syntactic Analysis for Statistical Language Modeling via Substructure Sharing and Uptraining.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Stepwise Optimal Subspace Pursuit for Improving Sparse Recovery.
IEEE Signal Process. Lett., 2011

Unsupervised Arabic Dialect Adaptation with Self-Training.
Proceedings of the INTERSPEECH 2011, 2011

Dirichlet Mixture Models of neural net posteriors for HMM-based speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Learning and inference algorithms for partially observed structured switching vector autoregressive models.
Proceedings of the IEEE International Conference on Acoustics, 2011

Hill climbing on speech lattices: A new rescoring framework.
Proceedings of the IEEE International Conference on Acoustics, 2011

Extensions of recurrent neural network language model.
Proceedings of the IEEE International Conference on Acoustics, 2011

Variational approximation of long-span language models for LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2011

Efficient Subsampling for Training Complex Language Models.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Randomized maximum entropy language models.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Adapting n-gram maximum entropy language models with conditional entropy regularization.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Efficient discriminative training of long-span language models.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Estimating document frequencies in a speech corpus.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Likelihood-Based Semi-Supervised Model Selection With Applications to Speech Processing.
IEEE J. Sel. Top. Signal Process., 2010

Joshua 2.0: A Toolkit for Parsing-Based Machine Translation with Syntax, Semirings, Discriminative Training and Other Goodies.
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 2010

A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

Recurrent neural network based language model.
Proceedings of the INTERSPEECH 2010, 2010

Hypothesis ranking and two-pass approaches for machine translation system combination.
Proceedings of the IEEE International Conference on Acoustics, 2010

Unsupervised Discriminative Language Model Training for Machine Translation using Simulated Confusion Sets.
Proceedings of the COLING 2010, 2010

2009
Updated MINDS report on speech recognition and understanding, Part 2 [DSP Education].
IEEE Signal Process. Mag., 2009

Developments and directions in speech recognition and understanding, Part 1 [DSP Education].
IEEE Signal Process. Mag., 2009

Decoding in Joshua: Open Source, Parsing-Based Machine Translation.
Prague Bull. Math. Linguistics, 2009

Joshua: An Open Source Toolkit for Parsing-Based Machine Translation.
Proceedings of the Fourth Workshop on Statistical Machine Translation, 2009

Web derived pronunciations for spoken term detection.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

Efficient Extraction of Oracle-best Translations from Hypergraphs.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Data-Derived Models for Segmentation with Application to Surgical Assessment and Training.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2009

Unsupervised estimation of the language model scaling factor.
Proceedings of the INTERSPEECH 2009, 2009

Impact of novel sources on content-based image and video retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2009

Web-derived pronunciations.
Proceedings of the IEEE International Conference on Acoustics, 2009

Self-supervised discriminative training of statistical language models.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Variational Decoding for Statistical Machine Translation.
Proceedings of the ACL 2009, 2009

Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation.
Proceedings of the ACL 2009, 2009

2008
A Scalable Decoder for Parsing-Based Machine Translation with Equivalent Language Model State Maintenance.
Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation, 2008

Sequential system combination for machine translation of speech.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Automatic Recognition of Surgical Motions Using Statistical Modeling for Capturing Variability.
Proceedings of the Medicine Meets Virtual Reality 16, 2008

Computation of Csiszár's mutual Information of order α.
Proceedings of the 2008 IEEE International Symposium on Information Theory, 2008

An investigation of acoustic models for multilingual code-switching.
Proceedings of the INTERSPEECH 2008, 2008

Automatically learning speaker-independent acoustic subword units.
Proceedings of the INTERSPEECH 2008, 2008

Sample selection for automatic language identification.
Proceedings of the IEEE International Conference on Acoustics, 2008

Combination of strongly and weakly constrained recognizers for reliable detection of OOVs.
Proceedings of the IEEE International Conference on Acoustics, 2008

Large-scale Discriminative n-gram Language Models for Statistical Machine Translation.
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, 2008

Unsupervised Learning of Acoustic Sub-word Units.
Proceedings of the ACL 2008, 2008

Machine Translation System Combination using ITG-based Alignments.
Proceedings of the ACL 2008, 2008

2007
Comparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation.
Proceedings of the NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation, 2007

Cross-Instance Tuning of Unsupervised Document Clustering Algorithms.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Error Bounds and Improved Probability Estimation using the Maximum Likelihood Set.
Proceedings of the IEEE International Symposium on Information Theory, 2007

Large-scale random forest language models for speech recognition.
Proceedings of the INTERSPEECH 2007, 2007

Iterative Denoising using Jensen-Renyi Divergences with an Application to Unsupervised Document Categorization.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Imperial College and Johns Hopkins University at TRECVID.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Language Modeling with the Maximum Likelihood Set: Complexity Issues and the Back-off Formula.
Proceedings of the 2006 IEEE International Symposium on Information Theory, 2006

Source Adaptation for Improved Content-Based Video Retrieval.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Generative Content Models for Structural Analysis of Medical Abstracts.
Proceedings of the Workshop on Linking Natural Language and Biology, 2006

2005
Maximum Likelihood Set for Estimating a Probability Mass Function.
Neural Comput., 2005

TRECVID 2005 Experiment at Johns Hopkins University: Using Hidden Markov Models for Video Retrieval.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

Hidden Markov models for automatic annotation and content-based retrieval of images and video.
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

Joint visual-text modeling for automatic retrieval of multimedia documents.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Unsupervised classification via decision trees: an information-theoretic perspective.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Lexical triggers and latent semantic analysis for cross-lingual language model adaptation.
ACM Trans. Asian Lang. Inf. Process., 2004

Pronunciation change in conversational speech and its implications for automatic speech recognition.
Comput. Speech Lang., 2004

Mandarin-English Information (MEI): investigating translingual speech retrieval.
Comput. Speech Lang., 2004

Contemporaneous text as side-information in statistical language modeling.
Comput. Speech Lang., 2004

Improving Passage Retrieval Using Interactive Elicitation and Statistical Modeling.
Proceedings of the Thirteenth Text REtrieval Conference, 2004

A Smorgasbord of Features for Statistical Machine Translation.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2004

Cross-lingual latent semantic analysis for language modeling.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Making MIRACLEs: Interactive translingual search for Cebuano and Hindi.
ACM Trans. Asian Lang. Inf. Process., 2003

Transliteration of proper names in cross-language applications.
Proceedings of the SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

Desparately Seeking Cebuano.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

Latent Semantic Information in Maximum Entropy Language Models for Conversational Speech Recognition.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

Language model adaptation using cross-lingual information.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Cross-Lingual Lexical Triggers in Statistical Language Modeling.
Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2003

Transliteration of Proper Names in Cross-Lingual Information Retrieval.
Proceedings of the Workshop on Multilingual and Mixed-language Named Entity Recognition, 2003

2002
Order estimation for a special class of hidden Markov sources and binary renewal processes.
IEEE Trans. Inf. Theory, 2002

Using cross-language cues for story-specific language modeling.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Building a topic-dependent maximum entropy model for very large corpora.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Mandarin-English Information: Investigating Translingual Speech Retrieval.
Proceedings of the First International Conference on Human Language Technology Research, 2001

Robust Knowledge Discovery from Parallel Speech and Text Sources.
Proceedings of the First International Conference on Human Language Technology Research, 2001

Smoothing issues in the structured language model.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

On large vocabulary continuous speech recognition of highly inflectional language - Czech.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Pronunciation modeling by sharing Gaussian densities across phonetic models.
Comput. Speech Lang., 2000

Maximum entropy techniques for exploiting syntactic, semantic and collocational dependencies in language modeling.
Comput. Speech Lang., 2000

Efficient training methods for maximum entropy language modeling.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Syntactic heads in statistical language modeling.
Proceedings of the IEEE International Conference on Acoustics, 2000

Pronunciation ambiguity vs. pronunciation variability in speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2000

Towards language independent acoustic modeling.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Stochastic pronunciation modelling from hand-labelled phonetic corpora.
Speech Commun., 1999

Large Vocabulary Speech Recognition for Read and Broadcast Czech.
Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

Combining nonlocal, syntactic and n-gram dependencies in language modeling.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A maximum entropy language model integrating N-grams and topic dependencies for conversational speech recognition.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Tree-structured models of parameter dependence for rapid adaptation in large vocabulary conversational speech recognition.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Rapid speech recognizer adaptation to new speakers.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
LVCSR rescoring with modified loss functions: a decision theoretic perspective.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Pronunciation modelling using a hand-labelled corpus for conversational speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
Structure and performance of a dependency language model.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

