Shuo-Yiin Chang

According to our database, Shuo-Yiin Chang authored at least 43 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models.
CoRR, 2023

Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

How to Estimate Model Transferability of Pre-Trained Speech Models?
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Text Injection for Capitalization and Turn-Taking Prediction in Speech Models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

UML: A Universal Monolingual Output Layer For Multilingual ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Context-Aware end-to-end ASR Using Self-Attentive Embedding and Tensor Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Towards General-Purpose Text-Instruction-Guided Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Improved Long-Form Speech Recognition By Jointly Modeling The Primary And Non-Primary Speakers.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Intended Query Detection using E2E Modeling for Continued Conversation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Turn-Taking Prediction for Natural Conversational Speech.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving the Fusion of Acoustic and Text Representations in RNN-T.
Proceedings of the IEEE International Conference on Acoustics, 2022


2021
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020

Personal VAD: Speaker-Conditioned Voice Activity Detection.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Low Latency Speech Recognition Using End-to-End Prefetching.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020


Towards Fast and Accurate Streaming End-To-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Deep Learning for Audio Signal Processing.
IEEE J. Sel. Top. Signal Process., 2019

On Neural Phone Recognition of Mixed-Source ECoG Signals.
CoRR, 2019


Joint Endpointing and Decoding with End-to-end Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Unified Endpointer Using Multitask and Multidomain Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Improved End-of-Query Detection for Streaming Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Feature Design for Robust Speech Recognition: Nurture and Nature.
PhD thesis, 2016

2015
On the importance of modeling and robustness for deep neural network feature.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Robust CNN-based speech recognition with Gabor filter kernels.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013
Informative spectro-temporal bottleneck features for noise-robust speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions.
Proceedings of the IEEE International Conference on Acoustics, 2013

Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction.
Proceedings of the IEEE International Conference on Acoustics, 2013

2009
Improved clustered hierarchical tandem system with bottom-up processing.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Data-driven clustered hierarchical tandem system for LVCSR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

