Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition.

Ju Lin

Proceedings of the IEEE International Conference on Acoustics, 2024

End-to-End Speech Recognition Contextualization with Large Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

Effective Internal Language Model Training and Fusion for Factorized Transducer Model.

Proceedings of the IEEE International Conference on Acoustics, 2024

Prompting Large Language Models with Speech Recognition Abilities.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data.

CoRR, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.

Xubo Liu

Egor Lakomkin

Konstantinos Vougioukas

CoRR, 2023

Directional Speech Recognition for Speaker Disambiguation and Cross-talk Suppression.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Streaming Audio-Visual Speech Recognition with Alignment Regularization.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.

Xubo Liu

Egor Lakomkin

Konstantinos Vougioukas

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Scaling ASR Improves Zero and Few Shot Learning.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Ego4D: Around the World in 3,000 Hours of Egocentric Video.

Santhosh Kumar Ramakrishnan

Christoph Feichtenhofer

Kiran K. Somasundaram

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video.

Santhosh Kumar Ramakrishnan

Christoph Feichtenhofer

Kiran K. Somasundaram

Giovanni Maria Farinella

CoRR, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

CoRR, 2021

Alignment Restricted Streaming Recurrent Neural Network Transducer.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Deep Shallow Fusion for RNN-T Personalization.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Transformer-Based Acoustic Modeling for Streaming Speech Synthesis.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Two-Stage Approach to Speech Bandwidth Extension.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Contrastive Semi-Supervised Learning for ASR.

Alex Xiao

Christian Fuegen

Abdelrahman Mohamed

Proceedings of the IEEE International Conference on Acoustics, 2021

Memory-Efficient Speech Recognition on Smart Devices.

Proceedings of the IEEE International Conference on Acoustics, 2021

A Time-Domain Convolutional Recurrent Network for Packet Loss Concealment.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Interactive Text-to-Speech via Semi-supervised Style Transfer Learning.

CoRR, 2020

Large Scale Weakly and Semi-Supervised Learning for Low-Resource Video ASR.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Weak-Attention Suppression for Transformer Based Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Interactive Text-to-Speech System via Joint Style Analysis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Acoustic Modeling for Hybrid Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Libri-Light: A Benchmark for ASR with Limited or No Supervision.

Pierre-Emmanuel Mazaré

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Spatial Attention for Far-Field Speech Recognition with Deep Beamforming Neural Networks.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

RNN-T For Latency Controlled ASR With Improved Beam Search.

CoRR, 2019

Transformer-Transducer: End-to-End Speech Recognition with Self-Attention.

CoRR, 2019

Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder.

Proceedings of the IEEE International Conference on Acoustics, 2019

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Towards End-to-end Spoken Language Understanding.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes.

Anurag Kumar

Maksim Khadkevich

Christian Fügen

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2013

Maximum entropy language modeling for Russian ASR.

Proceedings of the 10th International Workshop on Spoken Language Translation: Papers, 2013

Efficient speech transcription through respeaking.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A real-world system for simultaneous translation of German lectures.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2009

A System for Simultaneous Translation of Lectures and Speeches.

Christian Fügen

PhD thesis, 2009

End-to-End Evaluation in Simultaneous Translation.

Proceedings of the EACL 2009, 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, Athens, Greece, March 30, 2009

2008

Spoken language translation.

Alex Waibel

Christian Fügen

IEEE Signal Process. Mag., 2008

2007

Enabling Multimodal Human-Robot Interaction for the Karlsruhe Humanoid Robot.

IEEE Trans. Robotics, 2007

Simultaneous translation of lectures and speeches.

Christian Fügen

Alex Waibel

Muntsin Kolss

Mach. Transl., 2007

The ISL 2007 English speech transcription system for European Parliament speeches.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

The influence of utterance chunking on machine translation performance.

Christian Fügen

Muntsin Kolss

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Speech Translation Enhanced ASR for European Parliament Speeches - On the Influence of ASR Performance on Speech Translation.

Proceedings of the IEEE International Conference on Acoustics, 2007

Simultaneous multispeaker segmentation for automatic meeting recognition.

Kornel Laskowski

Christian Fügen

Tanja Schultz

Proceedings of the 15th European Signal Processing Conference, 2007

2006

The ISL RT-06S Speech-to-Text System.

Proceedings of the Machine Learning for Multimodal Interaction, 2006

Multi-source far-distance microphone selection and combination for automatic transcription of lectures.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Tracking and beamforming for multiple simultaneous speakers with probabilistic data association filters.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Advances in lecture recognition: the ISL RT-06s evaluation system.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Open Domain Speech Recognition & Translation: Lectures and Speeches.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Document driven machine translation enhanced ASR.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Rapid porting of ASR-systems to mobile devices.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Automatically Transcribing Meetings using Distant Microphones.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Natural human-robot interaction using speech, head pose and gestures.

Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 28, 2004

Issues in meeting transcription - the ISL meeting transcription system.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition.

Christian Fügen

Hartwig Holzapfel

Alex Waibel

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Towards language portability in statistical speech translation.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

The 2003 ISL rich transcription system for conversational telephony speech.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2002

Interlingua based statistical machine translation.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Integrating Emotional Cues into a Framework for Dialogue Management.

Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002

Efficient language model lookahead through polymorphic linguistic context assignment.

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

LingWear: A Mobile Tourist Information System.

Proceedings of the First International Conference on Human Language Technology Research, 2001

2000

Integrating dynamic speech modalities into context decision trees.

Christian Fügen

Ivica Rogina

Proceedings of the IEEE International Conference on Acoustics, 2000

Christian Fügen
