Shuo-Yiin Chang

According to our database, Shuo-Yiin Chang authored at least 42 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study.
CoRR, 2024

2023
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models.
CoRR, 2023

How to Estimate Model Transferability of Pre-Trained Speech Models?
CoRR, 2023

Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR.
CoRR, 2023

UML: A Universal Monolingual Output Layer For Multilingual ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

Context-Aware end-to-end ASR Using Self-Attentive Embedding and Tensor Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Towards General-Purpose Text-Instruction-Guided Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Improved Long-Form Speech Recognition By Jointly Modeling The Primary And Non-Primary Speakers.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification.
Proceedings of the Interspeech 2022, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System.
Proceedings of the Interspeech 2022, 2022

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR.
Proceedings of the Interspeech 2022, 2022

Streaming Intended Query Detection using E2E Modeling for Continued Conversation.
Proceedings of the Interspeech 2022, 2022

Turn-Taking Prediction for Natural Conversational Speech.
Proceedings of the Interspeech 2022, 2022

Improving the Fusion of Acoustic and Text Representations in RNN-T.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020

Personal VAD: Speaker-Conditioned Voice Activity Detection.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Low Latency Speech Recognition Using End-to-End Prefetching.
Proceedings of the Interspeech 2020, 2020

Towards Fast and Accurate Streaming End-To-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Deep Learning for Audio Signal Processing.
IEEE J. Sel. Top. Signal Process., 2019

On Neural Phone Recognition of Mixed-Source ECoG Signals.
CoRR, 2019

Joint Endpointing and Decoding with End-to-end Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Unified Endpointer Using Multitask and Multidomain Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Improved End-of-Query Detection for Streaming Speech Recognition.
Proceedings of the Interspeech 2017, 2017

Endpoint Detection Using Grid Long Short-Term Memory Networks for Streaming Speech Recognition.
Proceedings of the Interspeech 2017, 2017

2016
Feature Design for Robust Speech Recognition: Nurture and Nature.
PhD thesis, 2016

2015
On the importance of modeling and robustness for deep neural network feature.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Robust CNN-based speech recognition with Gabor filter kernels.
Proceedings of the INTERSPEECH 2014, 2014

2013
Informative spectro-temporal bottleneck features for noise-robust speech recognition.
Proceedings of the INTERSPEECH 2013, 2013

The blame game in meeting room ASR: An analysis of feature versus model errors in noisy and mismatched conditions.
Proceedings of the IEEE International Conference on Acoustics, 2013

Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction.
Proceedings of the IEEE International Conference on Acoustics, 2013

2009
Improved clustered hierarchical tandem system with bottom-up processing.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Data-driven clustered hierarchical tandem system for LVCSR.
Proceedings of the INTERSPEECH 2008, 2008

