Naohiro Tawara

According to our database¹, Naohiro Tawara authored at least 51 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2026

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics.

CoRR, March, 2026

Effect of individual characteristics on impressions of one's own recorded voice.

Hikaru Yanagida

Yusuke Ijima

Naohiro Tawara

Speech Commun., 2026

Microphone array geometry-independent multi-talker distant ASR: NTT system for DASR task of the CHiME-8 challenge.

Comput. Speech Lang., 2026

2025

Dissecting the Segmentation Model of End-to-End Diarization with Vector Clustering.

CoRR, June, 2025

Why is children's ASR so difficult? Analyzing children's phonological error patterns using SSL-based phoneme recognizers.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Pretraining Multi-Speaker Identification for Neural Speaker Diarization.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Mamba-based Segmentation Model for Speaker Diarization.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Guided Speaker Embedding.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model.

Carlos Hernandez-Olivan

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Predictive ASR and Turn-taking Prediction at Once: Towards More Responsive Spoken Dialog System.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Investigating Self-Supervised Learning-Based Front-End for Multi-Channel Replay Attack Detection.

Takuo Yamaguchi

Sayaka Shiota

Naohiro Tawara

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over.

CoRR, 2024

Recursive Attentive Pooling For Extracting Speaker Embeddings From Multi-Speaker Recordings.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Interaural Time Difference Loss for Binaural Target Sound Extraction.

Carlos Hernandez-Olivan

Proceedings of the 18th International Workshop on Acoustic Signal Enhancement, 2024

NTT Speaker Diarization System for CHiME-7: Multi-Domain, Multi-Microphone End-to-End and Vector Clustering Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2024

Discriminative Training of VBx Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Influence of Personal Traits on Impressions of One's Own Voice.

Hikaru Yanagida

Yusuke Ijima

Naohiro Tawara

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

What are differences? Comparing DNN and Human by Their Performance and Characteristics in Speaker Age Estimation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Iterative Shallow Fusion of Backward Language Model for End-To-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Voice or Content? - Exploring Impact of Speech Content on Age Estimation from Voice.

Proceedings of the 31st European Signal Processing Conference, 2023

Coarse-Age Loss: A New Training Method Using Coarse-Age Labeled Data for Speaker Age Estimation.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Multi-Source Domain Generalization Using Domain Attributes for Recurrent Neural Network Language Models.

IEICE Trans. Inf. Syst., 2022

Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Advances in Integration of End-to-End Neural and Clustering-Based Diarization for Real Conversational Speech.

Keisuke Kinoshita

Marc Delcroix

Naohiro Tawara

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Age-VOX-Celeb: Multi-Modal Corpus for Facial and Speech Estimation.

Proceedings of the IEEE International Conference on Acoustics, 2021

BLSTM-Based Confidence Estimation for End-to-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2021

Integrating End-to-End Neural and Clustering-Based Diarization: Getting the Best of Both Worlds.

Keisuke Kinoshita

Marc Delcroix

Naohiro Tawara

Proceedings of the IEEE International Conference on Acoustics, 2021

Robust Speech-Age Estimation Using Local Maximum Mean Discrepancy Under Mismatched Recording Conditions.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Language Model Data Augmentation Based on Text Domain Transfer.

Atsunori Ogawa

Naohiro Tawara

Marc Delcroix

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Frame-Level Phoneme-Invariant Speaker Embedding for Text-Independent Speaker Recognition on Extremely Short Utterances.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving Speaker-Attribute Estimation by Voting Based on Speaker Cluster Information.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Noise-robust Attention Learning for End-to-End Speech Recognition.

Proceedings of the 28th European Signal Processing Conference, 2020

Speaker Age Estimation Using Age-Dependent Insensitive Loss.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Multi-Channel Speech Enhancement Using Time-Domain Convolutional Denoising Autoencoder.

Naohiro Tawara

Tetsunori Kobayashi

Tetsuji Ogawa

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speaker Adversarial Training of DPGMM-Based Feature Extractor for Zero-Resource Languages.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Postfiltering Using an Adversarial Denoising Autoencoder with Noise-aware Training.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Sequential Fish Catch Forecasting Using Bayesian State Space Models.

Proceedings of the 24th International Conference on Pattern Recognition, 2018

Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Language Model Domain Adaptation Via Recurrent Neural Networks with Domain-Shared and Domain-Specific Representations.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Adversarial autoencoder for reducing nonlinear distortion.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Exploiting end of sentences and speaker alternations in language modeling for multiparty conversations.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2015

A comparative study of spectral clustering for i-vector-based speaker clustering under noisy conditions.

Naohiro Tawara

Tetsuji Ogawa

Tetsunori Kobayashi

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013

Blocked Gibbs sampling based multi-scale mixture model for speaker clustering on noisy data.

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2013

2012

Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Fully Bayesian inference of multi-mixture Gaussian model and its evaluation using speaker clustering.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Speaker Clustering Based on Utterance-Oriented Dirichlet Process Mixture Model.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
