Shota Horiguchi

Orcid: 0000-0002-3166-4956

According to our database, Shota Horiguchi authored at least 62 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2026

Frontend Token Enhancement for Token-Based Speech Recognition.

CoRR, February, 2026

Microphone array geometry-independent multi-talker distant ASR: NTT system for DASR task of the CHiME-8 challenge.

Comput. Speech Lang., 2026

2025

Dissecting the Segmentation Model of End-to-End Diarization with Vector Clustering.

CoRR, June, 2025

Voice Impression Control in Zero-Shot TTS.

Kenichi Fujita

Shota Horiguchi

Yusuke Ijima

CoRR, June, 2025

Pretraining Multi-Speaker Identification for Neural Speaker Diarization.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Voice Impression Control in Zero-Shot TTS.

Kenichi Fujita

Shota Horiguchi

Yusuke Ijima

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Analysis of Semantic and Acoustic Token Variability Across Speech, Music, and Audio Domains.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Mamba-based Segmentation Model for Speaker Diarization.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Alignment-Free Training for Transducer-based Multi-Talker ASR.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Guided Speaker Embedding.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits.

CoRR, 2024

Recursive Attentive Pooling For Extracting Speaker Embeddings From Multi-Speaker Recordings.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Factor-Conditioned Speaking-Style Captioning.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Streaming Active Learning for Regression Problems Using Regression via Classification.

Shota Horiguchi

Kota Dohi

Yohei Kawaguchi

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

CAPTDURE: Captioned Sound Dataset of Single Sources.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model.

Aoi Ito

Shota Horiguchi

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Synthetic Data Augmentation for ASR with Domain Filtering.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Encoder-Decoder Based Attractors for End-to-End Neural Diarization.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Online Neural Diarization of Unlimited Numbers of Speakers.

CoRR, 2022

Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization.

Natsuo Yamashita

Shota Horiguchi

Takeshi Homma

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models.

Yuki Takashima

Shota Horiguchi

Shinji Watanabe

Leibny Paola García-Perera

Yohei Kawaguchi

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Rethinking Fano's Inequality in Ensemble Learning.

Proceedings of the International Conference on Machine Learning, 2022

Environmental Sound Extraction Using Onomatopoeic Words.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Multi-Channel End-To-End Neural Diarization with Distributed Microphones.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Environmental Sound Extraction Using Onomatopoeia.

CoRR, 2021

Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization.

CoRR, 2021

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap.

CoRR, 2021

Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.

CoRR, 2021

Online End-To-End Neural Diarization with Speaker-Tracing Buffer.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Block-Online Guided Source Separation.

Shota Horiguchi

Yusuke Fujita

Kenji Nagamatsu

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.

Leibny Paola García-Perera

Kenji Nagamatsu

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization.

Leibny Paola García-Perera

Kenji Nagamatsu

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-To-End Speaker Diarization as Post-Processing.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Significance of Softmax-Based Features in Comparison to Distance Metric Learning-Based Features.

Shota Horiguchi

Daiki Ikami

Kiyoharu Aizawa

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Online End-to-End Neural Diarization with Speaker-Tracing Buffer.

CoRR, 2020

Neural Speaker Diarization with Speaker-Wise Chain Rule.

CoRR, 2020

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification.

CoRR, 2020

Hitachi at SemEval-2020 Task 8: Simple but Effective Modality Ensemble for Meme Emotion Recognition.

Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones.

Shota Horiguchi

Yusuke Fujita

Kenji Nagamatsu

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Anticipating the Start of User Interaction for Service Robot in the Wild.

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

2019

Omnidirectional Pedestrian Detection by Rotation Invariant Training.

Masato Tamura

Shota Horiguchi

Tomokazu Murakami

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation.

Shota Horiguchi

Naoyuki Kanda

Kenji Nagamatsu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Neural Speaker Diarization with Permutation-Free Objectives.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

End-to-End Neural Speaker Diarization with Self-Attention.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Personalized Classifier for Food Image Recognition.

IEEE Trans. Multim., 2018

Face-Voice Matching using Cross-modal Embeddings.

Shota Horiguchi

Naoyuki Kanda

Kenji Nagamatsu

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

2016

Food Search Based on User Feedback to Assist Image-based Food Recording Systems.

Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, 2016

The log-normal distribution of the size of objects in daily meal images and its application to the efficient reduction of object proposals.

Shota Horiguchi

Kiyoharu Aizawa

Makoto Ogawa

Proceedings of the 2016 IEEE International Conference on Image Processing, 2016
