Naoyuki Kanda
Orcid: 0000-0002-8628-3288
According to our database, Naoyuki Kanda authored at least 80 papers between 2005 and 2024.
Bibliography
2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription.
CoRR, 2024
2023
Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation.
CoRR, 2023
CoRR, 2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
CoRR, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-To-End Neural Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2023
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE J. Sel. Top. Signal Process., 2022
Comput. Speech Lang., 2022
CoRR, 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition.
CoRR, 2022
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization.
CoRR, 2022
Proceedings of the Interspeech 2022, 2022
Proceedings of the Interspeech 2022, 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition.
Proceedings of the Interspeech 2022, 2022
Proceedings of the Interspeech 2022, 2022
Proceedings of the Interspeech 2022, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
IEEE Signal Process. Lett., 2021
CoRR, 2021
Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings.
Proceedings of the IEEE International Conference on Acoustics, 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Proceedings of the Interspeech 2020, 2020
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers.
Proceedings of the Interspeech 2020, 2020
2019
Proceedings of the Interspeech 2019, 2019
Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.
Proceedings of the Interspeech 2019, 2019
Proceedings of the Interspeech 2019, 2019
Proceedings of the Interspeech 2019, 2019
Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.
Proceedings of the IEEE International Conference on Acoustics, 2019
Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018
Proceedings of the Interspeech 2018, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Minimum Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Investigation of lattice-free maximum mutual information-based acoustic models with sequence-level Kullback-Leibler divergence.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription.
Speech Commun., 2016
Proceedings of the Interspeech 2016, 2016
Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks.
Proceedings of the Interspeech 2016, 2016
2015
Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Open-ended Spoken Language Technology: Studies on Spoken Dialogue Systems and Spoken Document Retrieval Systems.
PhD thesis, 2014
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, 2014
Boundary contraction training for acoustic models based on discrete deep neural networks.
Proceedings of the INTERSPEECH 2014, 2014
2013
Proceedings of the INTERSPEECH 2013, 2013
Multiple index combination for Japanese spoken term detection with optimum index selection based on OOV-region classifier.
Proceedings of the IEEE International Conference on Acoustics, 2013
Elastic spectral distortion for low resource speech recognition with deep neural networks.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
2012
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
A multi-expert model for dialogue and behavior control of conversational robots and agents.
Knowl. Based Syst., 2011
2008
Proceedings of the International Workshop on Multimedia Signal Processing, 2008
2006
Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors.
Proceedings of the SIGDIAL 2006 Workshop, 2006
2005
A two-layer model for behavior and dialogue planning in conversational service robots.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005
Contextual constraints based on dialogue models in database search task for spoken dialogue systems.
Proceedings of the INTERSPEECH 2005, 2005