Katsunobu Itou

Proceedings of the International Symposium on Multimedia, 2025

Emotion and Genre-Based Music Recommendation System.

Yuecheng Yao

Proceedings of the Thirteenth International Symposium on Computing and Networking, CANDAR 2025, 2025

Developing a Dialogue Generation System to Enhance Communication Skills with Strangers.

Hongzhi Ding

Proceedings of the 7th International Conference on Big-data Service and Intelligent Computation, 2025

2024

Instrumentality Classification Evaluation System for Natural Sounds.

Yuhuan Wang

Proceedings of the IEEE International Symposium on Multimedia, 2024

Homophonic Music Composition Using a GAN and LSTM Pipeline for Melody and Harmony Generation.

Clément Saint-Marc

Proceedings of the IEEE International Symposium on Multimedia, 2024

Speaker Pseudonymization for Japanese Speech Using Duration Embeddings.

Aoi Ito

Proceedings of the IEEE International Symposium on Multimedia, 2024

2022

Homophonic Music Composition Using Pipelined LSTMs for Melody and Harmony Generation.

Clément Saint-Marc

Katsunobu Itou

Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2022

Cross-Lingual Transfer Learning Approach to Phoneme Error Detection via Latent Phonetic Representation.

Jovan M. Dalhouse

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2020

F0 Estimation Using Blind Source Separation for Analyzing Noh Singing.

Atsuki Tamoto

Proceedings of the 30th IEEE International Workshop on Machine Learning for Signal Processing, 2020

2019

Voice authentication by text dependent single utterance for in-car environment.

Atsuki Tamoto

Proceedings of the Tenth International Symposium on Information and Communication Technology, 2019

2018

DNN-Based Near- and Far-Field Source Separation Using Spherical-Harmonic-Analysis-Based Acoustic Features.

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Automatic Electronic Organ Reduction System Based on Melody Clustering Considering Melodic and Instrumental Characteristics.

Daiki Tanaka

Proceedings of the 2018 IEEE International Symposium on Multimedia, 2018

2016

Prominence detection for presentation training system.

Atsushi Kojima

Proceedings of the Seventh Symposium on Information and Communication Technology, 2016

2014

Intra-note segmentation via sticky HMM with DP emission.

Yuma Koizumi

Proceedings of the IEEE International Conference on Acoustics, 2014

2012

Discriminant analysis of the utterance state while singing.

Kentaro Hirayama

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012

2010

Speaker model updating by the conversational sounds in speaker verification.

Keita Yamamuro

Proceedings of the iiWAS'2010, 2010

2009

Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data.

Inf. Media Technol., 2009

The use of acoustically detected filled and silent pauses in spontaneous speech recognition.

Jun Ogata

Masataka Goto

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Effect of the Topic Dependent Translation Models for Patent Translation - Experiment at NTCIR-7.

Takeshi Ito

Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

In-car Speech Data Collection along with Various Multimodal Signals.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

Test Collections for Spoken Document Retrieval from Lecture Audio Data.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

2007

Language model adaptation for fixed phrases by amplifying partial n-gram sequences.

Syst. Comput. Jpn., 2007

Driver Modeling Based on Driving Behavior and Its Evaluation in Driver Identification.

Proc. IEEE, 2007

Non-factoid Question Answering Experiments at NTCIR-6: Towards Answer Type Detection for Realworld Questions.

Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2007

Statistical Machine Translation based Passage Retrieval for Cross-Lingual Question Answering - Experiments at NTCIR-6.

Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2007

A Stochastic Representation of the Dynamics of Sung Melody.

Proceedings of the 8th International Conference on Music Information Retrieval, 2007

Statistical segmentation and recognition of fingertip trajectories for a gesture interface.

Proceedings of the 9th International Conference on Multimodal Interfaces, 2007

2006

LODEM: A system for on-demand video lectures.

Speech Commun., 2006

Driver Identification Using Driving Behavior Signals.

IEICE Trans. Inf. Syst., 2006

Single-Channel Multiple Regression for In-Car Speech Enhancement.

IEICE Trans. Inf. Syst., 2006

Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus.

Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Characterizing in-Car Conversational Speech of Different Dialogue Modes.

Proceedings of the First International Conference on Innovative Computing, Information and Control (ICICIC 2006), 30 August, 2006

Cepstral Analysis of Driving Behavioral Signals for Driver Identification.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Adaptive Regression Based Framework for In-Car Speech Recognition.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Development of Micro-Dodecahedral Loudspeaker for Measuring Head-Related Transfer Functions in The Proximal region.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Construction and Evaluation of a Large In-Car Speech Corpus.

IEICE Trans. Inf. Syst., 2005

Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones.

IEICE Trans. Inf. Syst., 2005

Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Speech Recognition Using Finger Tapping Timings.

IEICE Trans. Inf. Syst., 2005

Cyclone: an encyclopedic web search site.

Proceedings of the 14th international conference on World Wide Web, 2005

Bi-directional Cross Language Question Answering using a Single Monolingual QA System.

Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Question Answering Experiments at NTCIR-5: Acquisition of Answer Evaluation Patterns and Context Processing using Passage Retrieval.

Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Exploiting Anchor Text for the Navigational Web Retrieval at NTCIR-5.

Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Modeling of individualities in driving through spectral analysis of behavioral signals.

Proceedings of the Eighth International Symposium on Signal Processing and Its Applications, 2005

Data collection and evaluation of speech recognition for motorbike riders.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Discrimination between singing and speaking voices.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Subjective and objective quality assessment of regression-enhanced speech in real car environments.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Improved Noise Spectra Estimation and Log-spectral Regression for In-car Speech Recognition.

Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Analysis of a large in-car speech corpus.

Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Two-stage Noise Spectra Estimation and Regression based In-car Speech Recognition using Single Distant Microphone.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Analysis of a large in-car speech corpus and its application to the multimodel ASR.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.

Proceedings of the Life-like characters - tools, affective functions, and applications, 2004

In-Car Speech Recognition Using Distributed Multiple Microphones.

Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Question Answering Using "Common Sense" and Utility Maximization Principle.

Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Experiments on Web Retrieval Driven by Spontaneously Spoken Queries.

Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Collecting Spontaneously Spoken Queries for Information Retrieval.

Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Speech-Recognition Interfaces for Music Information Retrieval: 'Speech Completion' and 'Speech Spotter'.

Proceedings of the ISMIR 2004, 2004

Recent progress of open-source LVCSR engine Julius and Japanese model repository.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Effects of language modeling on speech-driven question answering.

Katsunobu Itou

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpus.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Unsupervised topic adaptation for lecture speech retrieval.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech recognition using synchronization between speech and finger tapping.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Biometric identification using driving behavioral signals.

Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

2003

Speech starter: noise-robust endpoint detection by using filled pauses.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech shift: direct speech-input-mode switching through intentional control of voice pitch.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A cross-media retrieval system for lecture videos.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Building a test collection for speech-driven web retrieval.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Adapting language models for frequent fixed phrases by emphasizing n-gram subsets.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Language Modeling for Multi-Domain Speech-Driven Text Retrieval.

CoRR, 2002

Evaluating Speech-Driven IR in the NTCIR-3 Web Retrieval Task.

Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

Towards Speech-Driven Question Answering: Experiments Using the NTCIR-3 Question Answering Collection.

Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

Continuous Speech Recognition Consortium: an Open Repository for CSR Tools and Models.

Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Producing a Large-scale Encyclopedic Corpus over the Web.

Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Speech completion: on-demand completion assistance using filled pauses for speech input interfaces.

Masataka Goto

Satoru Hayamizu

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Selective back-off smoothing for incorporating grammatical constraints into the n-gram language model.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A Method for Open-Vocabulary Speech-Driven Text Retrieval.

Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002

2001

Jijo-2: An Office Robot that Communicates and Learns.

IEEE Intell. Syst., 2001

Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition.

Proceedings of the Information Retrieval Techniques for Speech Applications [this book is based on the workshop "Information Retrieval Techniques for Speech Applications"], 2001

Spoken Language Interface of the Jijo-2 Office Robot.

Proceedings of the Robotics Research, The Tenth International Symposium, 2001

Real-time sound source localization and separation system and its application to automatic speech recognition.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A structured statistical language model conditioned by arbitrarily abstracted grammatical categories based on GLR parsing.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000

IPA Japanese Dictation Free Software Project.

Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Free software toolkit for Japanese large vocabulary continuous speech recognition.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Semi-automatic language model acquisition without large corpora.

Katsunobu Itou

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

A real-time filled pause detection system for spontaneous speech recognition.

Masataka Goto