Chung-Cheng Chiu

Sharfiden Hassen Yusuf

Digit. Signal Process., 2026

2025

Data-Centric Lessons To Improve Speech-Language Pretraining.

CoRR, October, 2025

AXLearn: Modular Large Model Training on Heterogeneous Infrastructure.

CoRR, July, 2025

Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

SLM: Bridge the thin gap between speech and text foundation models.

CoRR, 2023

Efficient Adapters for Giant Speech Models.

CoRR, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages.

CoRR, 2023

Textless Direct Speech-to-Speech Translation with Discrete Speech Representation.

Xinjian Li

Ye Jia

Proceedings of the IEEE International Conference on Acoustics, 2023

SLM: Bridge the Thin Gap Between Speech and Text Foundation Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.

IEEE J. Sel. Top. Signal Process., 2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data.

CoRR, 2022

Self-supervised learning with random-projection quantizer for speech recognition.

Proceedings of the International Conference on Machine Learning, 2022

Improving The Latency And Quality Of Cascaded Encoders.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models.

CoRR, 2021

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Pushing the Limits of Non-Autoregressive Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling.

Proceedings of the 9th International Conference on Learning Representations, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2021

Efficient Knowledge Distillation for RNN-Transducer Models.

Sankaran Panchapagesan

Proceedings of the IEEE International Conference on Acoustics, 2021

Cascaded Encoders for Unifying Streaming and Non-Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data.

Proceedings of the IEEE International Conference on Acoustics, 2021

Cross-Attention Conformer for Context Modeling in Speech Enhancement for ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.

CoRR, 2020

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling.

CoRR, 2020

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.

CoRR, 2020

Improved Noisy Student Training for Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Conformer: Convolution-augmented Transformer for Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

SpecAugment on Large Scale Datasets.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speech Sentiment Analysis via Pre-Trained Features from End-to-End ASR Models.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.

CoRR, 2019

Two-Pass End-to-End Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation.

Proceedings of the IEEE International Conference on Acoustics, 2019

Edge Detection Algorithm Based on Texture Blocks.

Shou-Cih Chen

Proceedings of the IEEE 4th International Conference on Computer and Communication Systems, 2019

Recognizing Long-Form Speech Using Streaming End-to-End Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

A Comparison of End-to-End Models for Long-Form Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Monotonic Infinite Lookback Attention for Simultaneous Machine Translation.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Compression of End-to-End Models.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speech Recognition for Medical Conversations.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Monotonic Chunkwise Attention.

Colin Raffel

Proceedings of the 6th International Conference on Learning Representations, 2018

No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Improving the Performance of Online Neural Transducer Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Learning Hard Alignments with Variational Inference.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

2017

An online sequence-to-sequence model for noisy speech recognition.

CoRR, 2017

A Robust Vision-Based Skyline Detection Algorithm Under Different Weather Conditions.

Yun-Jiun Liu

Jia-Horng Yang

IEEE Access, 2017

Learning online alignments with continuous rewards policy gradient.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Contrast Enhancement Algorithm Based on Gap Adjustment for Histogram Equalization.

Chih-Chung Ting

Sensors, 2016

A skyline detection algorithm for use in different weather and environmental conditions.

Proceedings of the 2016 IEEE International Conference on Electro Information Technology, 2016

2015

Visual Contrast Enhancement Algorithm Based on Histogram Equalization.

Sensors, 2015

Monocular Vision System for Fixed Altitude Flight of Unmanned Aerial Vehicles.

Sensors, 2015

Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees.

Wan-Yu Chang

Jia-Horng Yang

Sensors, 2015

Predicting Co-verbal Gestures: A Deep and Temporal Modeling Approach.

Louis-Philippe Morency

Proceedings of the Intelligent Virtual Agents - 15th International Conference, 2015

2014

Acting the part: the role of gesture on avatar identity.

Proceedings of the Seventh International Conference on Motion in Games, Playa Vista, CA, USA, November 06, 2014

An efficient scan algorithm for block-based connected component labeling.

Wan-Yu Chang

Proceedings of the 22nd Mediterranean Conference on Control and Automation, 2014

Gesture generation with low-dimensional embeddings.

Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014

2012

Subjective Optimization.

Proceedings of the Intelligent Virtual Agents - 12th International Conference, 2012

Personal identification by extracting SIFT features from laser speckle patterns.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Vision-Only Automatic Flight Control for Small UAVs.

Ching-Tung Lo

IEEE Trans. Veh. Technol., 2011

Vision-based Automatic Flight Control for Small UAVs.

Proceedings of the IAPR Conference on Machine Vision Applications (IAPR MVA 2011), 2011

Histogram Enhancement Using Adaptive Segmentation Algorithm.

Proceedings of the IAPR Conference on Machine Vision Applications (IAPR MVA 2011), 2011

How to Train Your Avatar: A Data Driven Approach to Gesture Generation.

Proceedings of the Intelligent Virtual Agents - 11th International Conference, 2011

A style controller for generating virtual human behaviors.

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

2010

A Robust Object Segmentation System Using a Probability-Based Background Extraction Algorithm.

Min-Yu Ku

Li-Wey Liang

IEEE Trans. Circuits Syst. Video Technol., 2010

Automatic Traffic Surveillance System for Vision-Based Vehicle Recognition and Tracking.

Min-Yu Ku

Chun-Yi Wang

J. Inf. Sci. Eng., 2010

Real-Time Front Vehicle Detection Algorithm for an Asynchronous Binocular System.

Ming-liang Chung

Wen-Chung Chen

J. Inf. Sci. Eng., 2010

Automatic Complexity Reduction in Reinforcement Learning.

Von-Wun Soo

Comput. Intell., 2010

Analysis of adverse drug reactions using drug and drug target interactions and graph-based methods.

Artif. Intell. Medicine, 2010

2009

Asynchronous stereo vision system for front-vehicle detection.

Proceedings of the IEEE International Conference on Acoustics, 2009

On the Construction of Initial Basis Function for Efficient Value Function Approximation.

Kuan-Ta Chen

Proceedings of the 2009 International Conference on Artificial Intelligence, 2009

2008

Classifying Proteins Related to Adverse Drug Reactions from Drug Targets Using Support Vector Machines.

Proceedings of the International Conference on Bioinformatics & Computational Biology, 2008

2007

Motorcycle Detection and Tracking System with Occlusion Segmentation.

Min-Yu Ku

Hung-Tsung Chen

Proceedings of the Eighth International Workshop on Image Analysis for Multimedia Interactive Services, 2007

Subgoal Identification for Reinforcement Learning and Planning in Multiagent Problem Solving.

Von-Wun Soo

Proceedings of the Multiagent System Technologies, 5th German Conference, 2007

AI-RPG Toolkit: Towards A Deep Model Implementation for Improvisational Virtual Drama.

Proceedings of the Intelligent Virtual Agents, 7th International Conference, 2007

Probability Analysis on Associations of Adverse Drug Events with Drug-Drug Interactions.

Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 2007

2006

A real-time wavelet-based video compression approach to intelligent video surveillance systems.

Int. J. Comput. Appl. Technol., 2006

2005

Multi-layer segmentation of complex document images.

Int. J. Pattern Recognit. Artif. Intell., 2005

A new region-based segmentation method for complex document image analysis.

Int. J. Comput. Sci. Eng., 2005

A Discriminant Analysis Based Recursive Automatic Thresholding Approach for Image Segmentation.

IEICE Trans. Inf. Syst., 2005

2004

Complex document image segmentation using localized histogram analysis with multi-layer matching and clustering.
