Duc Le

This is a disambiguation page: it lists papers by several people with the same or a similar name.

Bibliography

2026

MELD: Mel-Spectrogram-Based Speech Language Modeling with Discrete Latent Variables.

CoRR, May, 2026

An Intelligent Multi-agent System for Urban Traffic Signal Control Leveraging Reinforcement Learning and Graph Neural Networks.

Proceedings of the Applied Algorithms - Third International Conference, 2026

2025

Latent Speech-Text Transformer.

CoRR, October, 2025

From Visual Explanations to Counterfactual Explanations with Latent Diffusion.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Physics-Informed Ground Reaction Dynamics from Human Motion Capture.

Proceedings of the 17th International Conference on Human System Interaction, 2025

2024

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation.

CoRR, 2024

A Foundation Model for Music Informatics.

Minz Won

Yun-Ning Hung

Duc Le

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

STEMGEN: A Music Generation Model That Listens.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Multi-scale and Multi-level Attention Based on External Knowledge in EHRs.

Duc Le

Bac Le

Proceedings of the Recent Challenges in Intelligent Information and Database Systems, 2024

2023

sCL-ST: Supervised Contrastive Learning With Semantic Transformations for Multiple Lead ECG Arrhythmia Classification.

IEEE J. Biomed. Health Informatics, June, 2023

Seq2seq for Automatic Paraphasia Detection in Aphasic Speech.

CoRR, 2023

Scaling Up Music Information Retrieval Training with Semi-Supervised Learning.

CoRR, 2023

Text Generation with Speech Synthesis for ASR Data Augmentation.

Ethan Campbell-Taylor

Jessie Salas

Irina-Elena Veliche

Xi Chen

CoRR, 2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learning ASR Pathways: A Sparse Multilingual ASR Model.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

ICASSP 2023 Spoken Language Understanding Grand Challenge.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Improving fast-slow Encoder based Transducer with Streaming Deliberation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.

CoRR, 2022

Learning ASR pathways: A sparse multilingual ASR model.

CoRR, 2022

STOP: A dataset for Spoken Task Oriented Semantic Parsing.

CoRR, 2022

STOP: A Dataset for Spoken Task Oriented Semantic Parsing.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Scaling ASR Improves Zero and Few Shot Learning.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming parallel transducer beam search with fast slow cascaded encoders.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deliberation Model for On-Device Spoken Language Understanding.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Neural-FST Class Language Model for End-to-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning.

Morten Olgaard Jensen

Ngan Le

Proceedings of the IEEE-EMBS International Conference on Biomedical and Health Informatics, 2022

2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

CoRR, 2021

Wearable Metasurface-Enabled Quasi-Yagi Antenna for UHF RFID Reader With End-Fire Radiation Along the Forearm.

IEEE Access, 2021

Alignment Restricted Streaming Recurrent Neural Network Transducer.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Improving RNN Transducer Based ASR with Auxiliary Tasks.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Deep Shallow Fusion for RNN-T Personalization.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition.

CoRR, 2020

Weak-Attention Suppression for Transformer Based Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Acoustic Modeling for Hybrid Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

ML-Assisted Monitoring and Characterization of IoT Sensor Networks.

Proceedings of the 2020 IEEE Conference on Evolving and Adaptive Intelligent Systems, 2020

2019

Neural network modeling of monthly salinity variations in oyster reef in Apalachicola Bay in response to freshwater inflow and winds.

Duc Le

Wenrui Huang

Elijah Johnson

Neural Comput. Appl., 2019

Transformer-Transducer: End-to-End Speech Recognition with Self-Attention.

CoRR, 2019

The integrated National NeuroAIDS Tissue Consortium database: a rich platform for neuroHIV research.

Database J. Biol. Databases Curation, 2019

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Automatic quantitative analysis of spontaneous aphasic speech.

Duc Le

Keli Licata

Emily Mower Provost

Speech Commun., 2018

Real-time Air Pollution prediction model based on Spatiotemporal Big data.

Duc Le

CoRR, 2018

Classification of Huntington Disease Using Acoustic and Lexical Features.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Towards Automatic Speech-Language Assessment for Aphasia Rehabilitation.

Duc Le

PhD thesis, 2017

Automatic Paraphasia Detection from Aphasic Speech: A Preliminary Study.

Duc Le

Keli Licata

Emily Mower Provost

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Discretized Continuous Speech Emotion Recognition with Multi-Task Deep Recurrent Neural Network.

Duc Le

Zakaria Aldeneh

Emily Mower Provost

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Automatic Assessment of Speech Intelligibility for Individuals With Aphasia.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Improving Automatic Recognition of Aphasic Speech with AphasiaBank.

Duc Le

Emily Mower Provost

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Wild wild emotion: a multimodal ensemble approach.

Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

2015

Data selection for acoustic emotion recognition: Analyzing and comparing utterance and sub-utterance selection strategies.

Duc Le

Emily Mower Provost

Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014

MuCheck: an extensible tool for mutation testing of Haskell programs.

Duc Le

Mohammad Amin Alipour

Rahul Gopinath

Alex Groce

Proceedings of the International Symposium on Software Testing and Analysis, 2014

Modeling pronunciation, rhythm, and intonation for automatic assessment of speech quality in aphasia rehabilitation.

Duc Le

Emily Mower Provost

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Automatic analysis of speech quality for aphasia treatment.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perception.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Emotion recognition from spontaneous speech using Hidden Markov models with deep belief networks.

Duc Le

Emily Mower Provost

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

The function, and dysfunction, of information sources in learning functional programming.

J. Comput. Sci. Coll., 2012

2011

#ifdef confirmed harmful: Promoting understandable software variation.

Duc Le

Eric Walkingshaw

Martin Erwig

Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, 2011

Support for software variation editing.

Duc Le

Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, 2011

2005

A 10.24GSPS photonic sampled bandpass ΔΣ modulator direct-sampling at 12GHz.

Proceedings of the IEEE 2005 Custom Integrated Circuits Conference, 2005
