Andrew Rosenberg

According to our database, Andrew Rosenberg authored at least 105 papers between 2004 and 2024.

Bibliography

2024
High-precision Voice Search Query Correction via Retrievable Speech-text Embeddings.
CoRR, 2024

2023
O-1: Self-training with Oracle and 1-best Hypothesis.
CoRR, 2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech.
CoRR, 2023

Improving Joint Speech-Text Representations Without Alignment.
CoRR, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages.
CoRR, 2023

Understanding Shared Speech-Text Representations.
Proceedings of the IEEE International Conference on Acoustics, 2023

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Mask-Conformer: Augmenting Conformer with Mask-Predict Decoder.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Ask2Mask: Guided Data Selection for Masked Speech Modeling.
IEEE J. Sel. Top. Signal Process., 2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data.
CoRR, 2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Non-Parallel Voice Conversion for ASR Augmentation.
Proceedings of the Interspeech 2022, 2022

Towards Disentangled Speech Representations.
Proceedings of the Interspeech 2022, 2022

MAESTRO: Matched Speech Text Representations through Modality Matching.
Proceedings of the Interspeech 2022, 2022

A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization.
Proceedings of the Interspeech 2022, 2022

Reducing Domain mismatch in Self-supervised speech pre-training.
Proceedings of the Interspeech 2022, 2022

Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Extending Parrotron: An End-to-End, Speech Conversion and Speech Recognition Model for Atypical Speech.
Proceedings of the IEEE International Conference on Acoustics, 2021

Injecting Text in Self-Supervised Speech Pretraining.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020

SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR.
Proceedings of the Interspeech 2020, 2020

Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection.
Proceedings of the Interspeech 2020, 2020

Improving Speech Recognition Using Consistent Predictions on Synthesized Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.
Proceedings of the Interspeech 2019, 2019

Comparison of Data Augmentation and Adaptation Strategies for Code-switched Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speech Recognition with Augmented Synthesized Speech.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Comparing Prosodic Frameworks: Investigating the Acoustic-Symbolic Relationship in ToBI and RaP.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Interpersonal Relationship Labels for the CALLHOME Corpus.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Data Augmentation Improves Recognition of Foreign Accented Speech.
Proceedings of the Interspeech 2018, 2018

Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Measuring the Effect of Linguistic Resources on Prosody Modeling for Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
End-to-End ASR-Free Keyword Search From Speech.
IEEE J. Sel. Top. Signal Process., 2017

Utilizing overt and latent linguistic structure to improve keystroke-based authentication.
Image Vis. Comput., 2017

Recent progress in deep end-to-end models for spoken language processing.
IBM J. Res. Dev., 2017

Bias and Statistical Significance in Evaluating Speech Synthesis with Mean Opinion Scores.
Proceedings of the Interspeech 2017, 2017

Weakly-Supervised Phrase Assignment from Text in a Speech-Synthesis System Using Noisy Labels.
Proceedings of the Interspeech 2017, 2017

Active learning for low-resource speech recognition: Impact of selection size and language modeling data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

End-to-end speech recognition and keyword search on low-resource languages.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Voice-transformation-based data augmentation for prosodic classification.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Knowledge distillation across ensembles of multilingual models for low-resource languages.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Investigating native and non-native English classification and transfer effects using Legendre polynomial coefficient clustering.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
RankDCG: Rank-Ordering Evaluation Measure.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection.
Proceedings of the Interspeech 2016, 2016

Automatically Classifying Self-Rated Personality Scores from Speech.
Proceedings of the Interspeech 2016, 2016

Supervised and unsupervised active learning for automatic speech recognition of low-resource languages.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Hierarchy Prediction in Online Communities.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Utilizing linguistically enhanced keystroke dynamics to predict typist cognition and demographics.
Int. J. Hum. Comput. Stud., 2015

Muddying The Multiword Expression Waters: How Cognitive Demand Affects Multiword Expression Production.
Proceedings of the 11th Workshop on Multiword Expressions, 2015

CUNY Systems for the Query-by-Example Search on Speech Task at MediaEval 2015.
Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Modeling phrasing and prominence using deep recurrent learning.
Proceedings of the INTERSPEECH 2015, 2015

Automatic recognition of unified Parkinson's disease rating from speech with acoustic, i-vector and phonotactic features.
Proceedings of the INTERSPEECH 2015, 2015

Cross-Cultural Production and Detection of Deception from Speech.
Proceedings of the 2015 ACM Workshop on Multimodal Deception Detection, 2015

Improvements to keystroke-based authentication by adding linguistic context.
Proceedings of the IEEE 7th International Conference on Biometrics Theory, 2015

2014
A comparison of multiple methods for rescoring keyword search lists for low resource languages.
Proceedings of the INTERSPEECH 2014, 2014

Strategies for rescoring keyword search results using word-burst and acoustic features.
Proceedings of the INTERSPEECH 2014, 2014

Improving named entity recognition with prosodic features.
Proceedings of the INTERSPEECH 2014, 2014

"was that your mother on the phone?": classifying interpersonal relationships between dialog participants with lexical and acoustic properties.
Proceedings of the INTERSPEECH 2014, 2014

Exploiting vocal-source features to improve ASR accuracy for low-resource languages.
Proceedings of the INTERSPEECH 2014, 2014

Recent improvements in neural network acoustic modeling for LVCSR in low resource languages.
Proceedings of the INTERSPEECH 2014, 2014

Improving deep neural network acoustic modeling for audio corpus indexing under the IARPA Babel program.
Proceedings of the INTERSPEECH 2014, 2014

Continuous authentication with cognition-centric text production and revision features.
Proceedings of the IEEE International Joint Conference on Biometrics, Clearwater, 2014

Rescoring Confusion Networks for Keyword Search.
Proceedings of the IEEE International Conference on Acoustics, 2014

Using word burst analysis to rescore keyword search candidates on low-resource languages.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Scan-Based Evaluation of Continuous Keystroke Authentication Systems.
IT Prof., 2013

Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification.
Comput. Speech Lang., 2013

Modeling prosodic sequences with k-means and Dirichlet process GMMs.
Proceedings of the INTERSPEECH 2013, 2013

"sure, i did the right thing": a system for sarcasm detection in speech.
Proceedings of the INTERSPEECH 2013, 2013

Let me finish: automatic conflict detection using speaker overlap.
Proceedings of the INTERSPEECH 2013, 2013

Detecting laughter and filled pauses using syllable-based features.
Proceedings of the INTERSPEECH 2013, 2013

Cross-language phrase boundary detection.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Visual and semantic interpretability of projections of high dimensional data for classification tasks.
CoRR, 2012

Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Phrase Boundary Assignment from Text in Multiple Domains.
Proceedings of the INTERSPEECH 2012, 2012

Classifying Skewed Data: Importance Weighting to Optimize Average Recall.
Proceedings of the INTERSPEECH 2012, 2012

Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation Models.
Proceedings of the INTERSPEECH 2012, 2012

Rethinking The Corpus: Moving towards Dynamic Linguistic Resources.
Proceedings of the INTERSPEECH 2012, 2012

Power Mean Pyramid Scores for Summarization Evaluation.
Proceedings of the INTERSPEECH 2012, 2012

2011
"What is... Dengue Fever?" - Modeling and Predicting Pronunciation Errors in a Text-to-Speech System.
Proceedings of the INTERSPEECH 2011, 2011

Using Mutual Information to Identify Regions of Analysis for Prosodic Analysis.
Proceedings of the INTERSPEECH 2011, 2011

Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and Nativeness.
Proceedings of the INTERSPEECH 2011, 2011

Intoxication Detection Using Phonetic, Phonotactic and Prosodic Cues.
Proceedings of the INTERSPEECH 2011, 2011

Automated measures for interpretable dimensionality reduction for visual classification: A user study.
Proceedings of the 6th IEEE Conference on Visual Analytics Science and Technology, 2011

Multi-objective Genetic Programming for Visual Analytics.
Proceedings of the Genetic Programming - 14th European Conference, 2011

Evaluating importance of facial expression in American Sign Language and Pidgin Signed English animations.
Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility, 2011

2010
Multi-Objective Genetic Programming Projection Pursuit for Exploratory Data Modeling.
CoRR, 2010

Classification of Prosodic Events using Quantized Contour Modeling.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

AutoBI - a tool for automatic ToBI annotation.
Proceedings of the INTERSPEECH 2010, 2010

Dimensionality reduction using symbolic regression.
Proceedings of the Genetic and Evolutionary Computation Conference, 2010

2009
Charisma perception from text and speech.
Speech Commun., 2009

Detecting Pitch Accents at the Word, Syllable and Vowel Level.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

2008
Speech segmentation and spoken document processing.
IEEE Signal Process. Mag., 2008

Intonational phrases for speech summarization.
Proceedings of the INTERSPEECH 2008, 2008

2007
Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news.
Proceedings of the INTERSPEECH 2007, 2007

Detecting pitch accent using pitch-corrected energy-based predictors.
Proceedings of the INTERSPEECH 2007, 2007

Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis.
Proceedings of the INTERSPEECH 2007, 2007

V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure.
Proceedings of the EMNLP-CoNLL 2007, 2007

2006
Story Segmentation of Broadcast News in English, Mandarin and Arabic.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

On the correlation between energy and pitch accent in read English speech.
Proceedings of the INTERSPEECH 2006, 2006

2005
Acoustic/prosodic and lexical correlates of charismatic speech.
Proceedings of the INTERSPEECH 2005, 2005

2004
Augmenting the kappa statistic to determine interannotator reliability for multiply labeled data points.
Proceedings of HLT-NAACL 2004: Short Papers, Boston, Massachusetts, USA, May 2-7, 2004, 2004

