Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

Distributed Compression Coding Based on Convolutional Sparse Coding Using Multiple Key Frames.

Yosuke Higuchi

Muhammad Ajmal Muhaimin Bin Mustafa

Yoshimitsu Kuroki

Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems, 2024

Hierarchical Multi-Task Learning with CTC and Recursive Operation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Parody Detection Using Source-Target Attention with Teacher-Forced Lyrics.

Proceedings of the IEEE International Conference on Acoustics, 2024

Differences Between Singer and Speaker Verification: Training Singer Feature Representation Extractor Utilizing Singing Voice Characteristics.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition.

Yosuke Higuchi

Tetsuji Ogawa

Tetsunori Kobayashi

CoRR, 2023

BECTRA: Transducer-Based End-to-End ASR with BERT-Enhanced Encoder.

Proceedings of the IEEE International Conference on Acoustics, 2023

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss.

Proceedings of the IEEE International Conference on Acoustics, 2023

Mask-CTC-Based Encoder Pre-Training for Streaming End-to-End Speech Recognition.

Proceedings of the 31st European Signal Processing Conference, 2023

Spotting Parodies: Detecting Alignment Collapse Between Lyrics and Singing Voice.

Proceedings of the 31st European Signal Processing Conference, 2023

CTC Alignments Improve Autoregressive Translation.

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Mask-Conformer: Augmenting Conformer with Mask-Predict Decoder.

Yosuke Higuchi

Andrew Rosenberg

Yuan Wang

Murali Karthick Baskar

Bhuvana Ramabhadran

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.

IEEE J. Sel. Top. Signal Process., 2022

A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.

Proceedings of the IEEE International Conference on Acoustics, 2022

Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2022

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021

Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring.

CoRR, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

ORTHROS: Non-autoregressive End-to-end Speech Translation with Dual-decoder.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improved Mask-CTC for Non-Autoregressive End-to-End ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

Recent Developments on ESPnet Toolkit Boosted by Conformer.

Proceedings of the IEEE International Conference on Acoustics, 2021

A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.

Aswin Shanmugam Subramanian

Wangyou Zhang

CoRR, 2020

Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker Embeddings Incorporating Acoustic Conditions for Diarization.

Yosuke Higuchi

Masayuki Suzuki

Gakuto Kurata

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Noise-robust Attention Learning for End-to-End Speech Recognition.

Proceedings of the 28th European Signal Processing Conference, 2020

2019

Speaker Adversarial Training of DPGMM-Based Feature Extractor for Zero-Resource Languages.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Yosuke Higuchi
