Brian Yan

According to our database, Brian Yan authored at least 40 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer.
CoRR, 2024

2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.
J. Open Source Softw., November, 2023

Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).
Dataset, October, 2023

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing.
CoRR, 2023

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study.
CoRR, 2023

Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization.
CoRR, 2023

Speech collage: code-switched audio generation by collaging monolingual corpora.
CoRR, 2023

Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff.
CoRR, 2023

Bayes Risk Transducer: Transducer with Controllable Alignment Prediction.
CoRR, 2023

Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding.
CoRR, 2023

Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning.
CoRR, 2023

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization.
CoRR, 2023

A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks.
CoRR, 2023

CMU's IWSLT 2023 Simultaneous Speech Translation System.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Towards Zero-Shot Code-Switched Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Align, Write, Re-Order: Explainable End-to-End Speech Translation via Operation Sequence Generation.
Proceedings of the IEEE International Conference on Acoustics, 2023

E-Branchformer-Based E2E SLU Toward STOP On-Device Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Pipeline System of ASR and NLU with MLM-based Data Augmentation Toward STOP Low-Resource Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Massively Multilingual ASR with Auxiliary CTC Objectives.
Proceedings of the IEEE International Conference on Acoustics, 2023

Avoid Overthinking in Self-Supervised Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Study on the Integration of Pipeline and E2E SLU Systems for Spoken Semantic Parsing Toward STOP Quality Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History.
Proceedings of the IEEE International Conference on Acoustics, 2023

CTC Alignments Improve Autoregressive Translation.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

2022
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders.
CoRR, 2022

CMU's IWSLT 2022 Dialect Speech Translation System.
Proceedings of the 19th International Conference on Spoken Language Translation, 2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.
Proceedings of the Interspeech 2022, 2022

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation.
Proceedings of the Interspeech 2022, 2022

Two-Pass Low Latency End-to-End Spoken Language Understanding.
Proceedings of the Interspeech 2022, 2022

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2022

ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet.
Proceedings of the IEEE International Conference on Acoustics, 2022

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

ESPnet-ST IWSLT 2021 Offline Speech Translation System.
Proceedings of the 18th International Conference on Spoken Language Translation, 2021

Differentiable Allophone Graphs for Language-Universal Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

