Yosuke Higuchi

ORCID: 0000-0003-4500-8957

According to our database, Yosuke Higuchi authored at least 26 papers between 2019 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition.
CoRR, 2023

BECTRA: Transducer-Based End-To-End ASR with BERT-Enhanced Encoder.
Proceedings of the IEEE International Conference on Acoustics, 2023

InterMPL: Momentum Pseudo-Labeling With Intermediate CTC Loss.
Proceedings of the IEEE International Conference on Acoustics, 2023

Mask-CTC-Based Encoder Pre-Training for Streaming End-to-End Speech Recognition.
Proceedings of the 31st European Signal Processing Conference, 2023

Spotting Parodies: Detecting Alignment Collapse Between Lyrics and Singing Voice.
Proceedings of the 31st European Signal Processing Conference, 2023

CTC Alignments Improve Autoregressive Translation.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Mask-Conformer: Augmenting Conformer with Mask-Predict Decoder.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.
IEEE J. Sel. Top. Signal Process., 2022

A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.
Proceedings of the IEEE International Conference on Acoustics, 2022

Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring.
CoRR, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

ORTHROS: Non-Autoregressive End-to-End Speech Translation with Dual-Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improved Mask-CTC for Non-Autoregressive End-to-End ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Recent Developments on ESPnet Toolkit Boosted By Conformer.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.
CoRR, 2020

Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict.
Proceedings of the Interspeech 2020, 2020

Speaker Embeddings Incorporating Acoustic Conditions for Diarization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Noise-robust Attention Learning for End-to-End Speech Recognition.
Proceedings of the 28th European Signal Processing Conference, 2020

2019
Speaker Adversarial Training of DPGMM-Based Feature Extractor for Zero-Resource Languages.
Proceedings of the Interspeech 2019, 2019

