Yanzhang He

According to our database, Yanzhang He authored at least 63 papers between 2012 and 2024.

Bibliography

2024
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models.
CoRR, 2024

2023
Partial Rewriting for Multi-Stage ASR.
CoRR, 2023

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models.
CoRR, 2023

Massive End-to-end Models for Short Search Queries.
CoRR, 2023

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models.
CoRR, 2023

Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

Conditional Conformer: Improving Speaker Modulation For Single And Multi-User Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Role of Feature Correlation on Quantized Neural Networks.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Efficient Cascaded Streaming ASR System Via Frame Rate Reduction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Context-Aware Neural Confidence Estimation for Rare Word Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Flickering Reduction with Partial Hypothesis Reranking for Streaming ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Closing the Gap Between Single-User and Multi-User VoiceFilter-Lite.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Improving Rare Word Recognition with LM-aware MWER Training.
Proceedings of the Interspeech 2022, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System.
Proceedings of the Interspeech 2022, 2022

Improving Deliberation by Text-Only and Semi-Supervised Training.
Proceedings of the Interspeech 2022, 2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes.
Proceedings of the Interspeech 2022, 2022

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition.
Proceedings of the Interspeech 2022, 2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Turn-Taking Prediction for Natural Conversational Speech.
Proceedings of the Interspeech 2022, 2022

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Large-Scale ASR Domain Adaptation Using Self- and Semi-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Personalized Keyphrase Detection Using Speaker and Environment Information.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Tied & Reduced RNN-T Decoder.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

Learning Word-Level Confidence for Subword End-To-End ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Less is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging.
Proceedings of the IEEE International Conference on Acoustics, 2021

Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Multi-User Voicefilter-Lite via Attentive Speaker Embedding.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Cross-Attention Conformer for Context Modeling in Speech Enhancement for ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer.
Proceedings of the Interspeech 2020, 2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Low Latency Speech Recognition Using End-to-End Prefetching.
Proceedings of the Interspeech 2020, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Towards Fast and Accurate Streaming End-To-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Joint Endpointing and Decoding with End-to-end Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

2017
Streaming small-footprint keyword spotting using sequence-to-sequence models.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Self-adaptive Failure Detector for Peer-to-Peer Distributed System Considering the Link Faults.
Proceedings of the Advanced Parallel Processing Technologies, 2017

2016
Using Pronunciation-Based Morphological Subword Units to Improve OOV Handling in Keyword Search.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

2015
Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition.
Proceedings of the INTERSPEECH 2015, 2015

Deep neural network based spectral feature mapping for robust speech recognition.
Proceedings of the INTERSPEECH 2015, 2015

Improvements on transducing syllable lattice to word lattice for keyword search.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Syllable based keyword search: Transducing syllable lattices to word lattices.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Subword-based modeling for handling OOV words in keyword spotting.
Proceedings of the IEEE International Conference on Acoustics, 2014

Virtual Machine Scheduling Considering Both Computing and Cooling Energy.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Scalability Analysis and Improvement of Hadoop Virtual Cluster with Cost Consideration.
Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA, June 27, 2014

2013
Conditional Random Fields in Speech, Audio, and Language Processing.
Proc. IEEE, 2013

HPACS: A High Privacy and Availability Cloud Storage Platform with Matrix Encryption.
Proceedings of the Advanced Parallel Processing Technologies, 2013

2012
Efficient Segmental Conditional Random Fields for One-Pass Phone Recognition.
Proceedings of the INTERSPEECH 2012, 2012

vHadoop: A Scalable Hadoop Virtual Cluster Platform for MapReduce-Based Parallel Machine Learning with Performance Consideration.
Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

