Shigeki Karita

According to our database, Shigeki Karita authored at least 30 papers between 2015 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability.

Adriana Guevara-Rukoz

Heiga Zen

Michiel Bacchiani

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

2024

FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency.

Shigeki Karita

Richard Sproat

Haruko Ishikawa

CoRR, 2023

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

SNRi Target Training for Joint Speech Enhancement and Recognition.

Yuma Koizumi

Shigeki Karita

Arun Narayanan

Sankaran Panchapagesan

Michiel Bacchiani

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Knowledge Transfer from Large-Scale Pretrained Language Models to End-To-End Speech Recognizers.

Yotaro Kubo

Shigeki Karita

Michiel Bacchiani

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

DF-Conformer: Integrated Architecture of Conv-Tasnet and Conformer Using Linear Complexity Self-Attention for Speech Enhancement.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Unsupervised Learning of Disentangled Speech Content and Style Representation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition.

Shigeki Karita

Yotaro Kubo

Michiel Adriaan Unico Bacchiani

Llion Jones

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.

Aswin Shanmugam Subramanian

Wangyou Zhang

CoRR, 2020

Self-Distillation for Improving CTC-Transformer-Based ASR Systems.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ESPnet-ST: All-in-One Speech Translation Toolkit.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

2019

Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration.

Shigeki Karita

Nelson Enrique Yalta Soplin

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End SpeakerBeam for Single Channel Target Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Semi-supervised End-to-end Speech Recognition Using Text-to-speech and Autoencoders.

Proceedings of the IEEE International Conference on Acoustics, 2019

A Comparative Study on Transformer vs RNN in Speech Applications.

Nelson Enrique Yalta Soplin

Ryuichi Yamamoto

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

ESPnet: End-to-End Speech Processing Toolkit.

Nelson Enrique Yalta Soplin

Jahn Heymann

Matthew Wiesner

Nanxin Chen

Adithya Renduchintala

Tsubasa Ochiai

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Semi-Supervised End-to-End Speech Recognition.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Auxiliary Feature Based Adaptation of End-to-end ASR Systems.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Rescoring N-Best Speech Recognition List Based on One-on-One Hypothesis Comparison Using Encoder-Classifier Model.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sequence Training of Encoder-Decoder Model Using Policy Gradient for End-to-End Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Frame-by-Frame Closed-Form Update for Mask-Based Adaptive MVDR Beamforming.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Unfolded Deep Recurrent Convolutional Neural Network with Jump Ahead Connections for Acoustic Modeling.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Forward-Backward Convolutional LSTM for Acoustic Modeling.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming.

Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

2015

Owner authentication for mobile devices using motion gestures based on multi-owner template update.

Proceedings of the 2015 IEEE International Conference on Multimedia & Expo Workshops, 2015

Far-field speech recognition using CNN-DNN-HMM with convolution in time.

Takuya Yoshioka

Shigeki Karita

Tomohiro Nakatani

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
