Kenji Nagamatsu

Leibny Paola García-Perera

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.

Leibny Paola García-Perera

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio-Visual Speech Enhancement Method Conditioned in the Lip Motion and Speaker-Discriminative Embeddings.

Koichiro Ito

Masaaki Yamamoto

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

End-To-End Speaker Diarization as Post-Processing.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Building Multi lingual TTS using Cross Lingual Voice Conversion.

Qinghua Sun

CoRR, 2020

Online End-to-End Neural Diarization with Speaker-Tracing Buffer.

CoRR, 2020

Neural Speaker Diarization with Speaker-Wise Chain Rule.

CoRR, 2020

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification.

CoRR, 2020

Delay Mitigation for Backchannel Prediction in Spoken Dialog System.

Amalia Istiqlali Adiba

Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Meta-Learning for Speech Emotion Recognition Considering Ambiguity of Emotion Labels.

Takuya Fujioka

Takeshi Homma

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Anticipating the Start of User Interaction for Service Robot in the Wild.

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

2019

Addressing Ambiguity of Emotion Labels Through Meta-Learning.

CoRR, 2019

Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Neural Speaker Diarization with Permutation-Free Objectives.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

End-to-End Neural Speaker Diarization with Self-Attention.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Face-Voice Matching using Cross-modal Embeddings.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Fast Multichannel Nonnegative Matrix Factorization with Constraints on Active Source Candidates.

Rintaro Ikeshita

Yohei Kawaguchi

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Lattice-free State-level Minimum Bayes Risk Training of Acoustic Models.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sequence Distillation for Purely Sequence Trained Acoustic Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Local Gaussian model with source-set constraints in audio source separation.

Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Independent vector analysis with frequency range division and prior switching.

Proceedings of the 25th European Signal Processing Conference, 2017

Investigation of lattice-free maximum mutual information-based acoustic models with sequence-level Kullback-Leibler divergence.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2013

Cycle time based multi-goal path optimization for redundant robotic systems.

Iacopo Gentilini

Kenji Shimada

Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

2010

A Pre-Identification Method for Chinese Named Entity Recognition.

J. Softw., 2010

2009

Cascade Chinese Potential Name Recognition.

Proceedings of the International Forum on Information Technology and Applications, 2009

A Hybrid Method of Chinese Prosodic Word Tagging Based on Keyword Anchor and Hidden Markov Model.

Proceedings of the 2009 International Conference on Asian Language Processing, 2009

2006

Scalable Implementation Of Unit Selection Based Text-To-Speech System For Embedded Solutions.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2004

Unit selection using pitch synchronous cross correlation for Japanese concatenative speech synthesis.

Nobuo Nukaga

Ryota Kamoshida

Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

1996

Estimating Point-of-View-based Similarity Using POV Reinforcement and Similarity Propagation.
