Tetsuya Takiguchi

IEEE Access, 2025

A Robot that Supports Collaborative Art Appreciation through Visual Thinking Strategies.

Minori Iwata

Masahiro Shiomi

Proceedings of the 34th IEEE International Conference on Robot and Human Interactive Communication, 2025

Operatic Singing Voice Synthesis From Inexperienced Voice Considering Tempo and Vowel Change.

Proceedings of the MultiMedia Modeling, 2025

Enhancing Proactive Dialogue Systems Through Self-Learning of Reasoning and Action-Planning.

Ryosuke Ito

Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology, 2025

Zero-Shot Learning for Acoustic Event Classification Using an Attribute Vector and Conditional GAN.

Kohei Uehara

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Revisiting WFST-based Hybrid Japanese Speech Recognition System for Individuals with Organic Speech Disorders.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Highly Intelligible Text-to-Speech System Based on Weighted Averaging of Parameters for Individuals with Spinal Muscular Atrophy.

Proceedings of the 27th International ACM SIGACCESS Conference on Computers and Accessibility, 2025

Speaker-dependent Continuous Speech Recognition for Individuals with Cerebral Palsy Using Weighted Finite-State Transducer and Text-to-Speech Synthesis.

Proceedings of the 27th International ACM SIGACCESS Conference on Computers and Accessibility, 2025

Outlier Removal in MEG Data for Imagined Speech Classification.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Reasoning Visualization for Critical Care EEG Classification with Prototypical Part Networks.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Learning Global Evapotranspiration Dataset Corrections from a Water Cycle Closure Supervision.

Remote. Sens., January, 2024

Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling.

IEEE Access, 2024

Dysarthric Speech Recognition Using Pseudo-Labeling, Self-Supervised Feature Learning, and a Joint Multi-Task Learning Approach.

IEEE Access, 2024

Effects of Listening Behaviors of a Social Robot on Adult's Motivation and Performance in Piano Practice.

Ryuto Matsusaka

Masahiro Shiomi

Proceedings of the 33rd IEEE International Conference on Robot and Human Interactive Communication, 2024

Integrating Textual and Financial Time Series Data for Enhanced Forecasting.

Proceedings of the 16th IIAI International Congress on Advanced Applied Informatics, 2024

Training of VITS Model Reflecting the Duration of a Physically Unimpaired Speaker for a Text-to-speech System for a Person with a Stutter.

Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

Speech Recognition for a Person With Cerebral Palsy Using Whisper Fine-Tuned on Japanese and English Dysarthric Speech.

Kirito Haze

Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

Representation Learning Based on Variational Autoencoders for Imagined Speech Classification.

Proceedings of the 32nd European Signal Processing Conference, 2024

Generation of Colored Subtitle Images Based on Emotional Information of Speech Utterances.

Proceedings of the 32nd European Signal Processing Conference, 2024

Attempts on detecting Alzheimer's disease by fine-tuning pre-trained model with Gaze Data.

Proceedings of the 2024 Symposium on Eye Tracking Research and Applications, 2024

Self-supervised learning using unlabeled speech with multiple types of speech disorder for disordered speech recognition.

Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 2024

Individuality-Preserving Speech Synthesis for Spinal Muscular Atrophy with a Tracheotomy.

Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 2024

2023

Speaker-Independent Emotional Voice Conversion via Disentangled Representations.

IEEE Trans. Multim., 2023

Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Reversible designs for extreme memory cost reduction of CNN training.

EURASIP J. Image Video Process., 2023

Rule-based Fact Verification Utilizing Knowledge Graphs.

Yuki Momii

Proceedings of the Workshop, 2023

Zero-Shot Sound Event Classification Using a Sound Attribute Vector with Global and Local Feature Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

EEG Source Estimation Using Deep Prior Without a Subject's Individual Lead Field.

Proceedings of the IEEE International Conference on Acoustics, 2023

Operatic Singing Voice Synthesis Using Diff-SVC.

Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

2022

Phoneme-guided Dysarthric speech conversion With non-parallel data by joint training.

Signal Image Video Process., 2022

Direction of arrival estimation for indoor environments based on acoustic composition model with a single microphone.

Pattern Recognit., 2022

Learn to See Faster: Pushing the Limits of High-Speed Camera with Deep Underexposed Image Denoising.

CoRR, 2022

Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation.

CoRR, 2022

Current Source Localization Using Deep Prior with Depth Weighting.

CoRR, 2022

Building a Knowledge-Based Dialogue System with Text Infilling.

Qiang Xue

Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

MEG Source Localization Using Deep Prior.

Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Comparative Evaluation of Neural Vocoders for Speech Synthesis of Operatic Singing.

Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Adaptation of a Pronunciation Dictionary for Dysarthric Speech Recognition.

Yuya Sawa

Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Data Augmentation for Dysarthric Speech Recognition Based on Text-to-Speech Synthesis.

Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Where Do Humans Build Levees? A Case Study on the Contiguous United States.

Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2022

Speaker-Targeted Audio-Visual Speech Recognition Using a Hybrid CTC/Attention Model with Interference Loss.

Proceedings of the IEEE International Conference on Acoustics, 2022

Binary Attribute Embeddings for Zero-Shot Sound Event Classification.

Proceedings of the 11th IEEE Global Conference on Consumer Electronics, 2022

2021

Multimodal fusion for indoor sound source localization.

Pattern Recognit., 2021

Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation.

EURASIP J. Audio Speech Music. Process., 2021

Full-Band LPCNet: A Real-Time Neural Vocoder for 48 kHz Audio With a CPU.

IEEE Access, 2021

High-Intelligibility Speech Synthesis for Dysarthric Speakers with LPCNet-Based TTS and CycleVAE-Based VC.

Proceedings of the IEEE International Conference on Acoustics, 2021

Data Augmentation Based on Frequency Warping for Recognition of Cleft Palate Speech.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Dysarthric Speech Recognition Based on Deep Metric Learning.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Two-Step Acoustic Model Adaptation for Dysarthric Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Convolutional neural networks Memory optimization Inference with Splitting Image.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

An Investigation of End-to-End Speech Recognition Using Model Adaptation for Dysarthric Speakers.

Yuya Sawa

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Opera Singing Voice Synthesis Considering Vowel Variations.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

FasterRCNN Monitoring of Road Damages: Competition and Deployment.

Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019

Polar Transformation on Image Features for Orientation-Invariant Representations.

IEEE Trans. Multim., 2019

Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Semantic embeddings of generic objects for zero-shot learning.

Tristan Hascoet

EURASIP J. Image Video Process., 2019

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition.

EURASIP J. Audio Speech Music. Process., 2019

Reversible designs for extreme memory cost reduction of CNN training.

CoRR, 2019

Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition.

IEEE Access, 2019

Generation of Objections Using Topic and Claim Information in Debate Dialogue System.

Kazuaki Furumai

Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Assisting human experts in the interpretation of their visual process: A case study on assessing copper surface adhesive potency.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

End-to-end Dysarthric Speech Recognition Using Multiple Databases.

Yuki Takashima

Proceedings of the IEEE International Conference on Acoustics, 2019

Cortical Patterns for Prediction of Subjective Preference Induced by Chords.

Hajime Yano

Seiji Nakagawa

Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

On Zero-Shot Recognition of Generic Objects.

Tristan Hascoet

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

An AI-based approach to auto-analyzing historical handwritten business documents: As applied to the Kanebo database.

J. Comput. Soc. Sci., 2018

Debate Dialog for News Question Answering System 'NetTv'-Debate Based on Claim and Reason Estimation-.

Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Chat Response Generation Based on Semantic Prediction Using Distributed Representations of Words.

Kazuaki Furumai

Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Spectrum Enhancement of Singing Voice Using Deep Learning.

Proceedings of the 2018 IEEE International Symposium on Multimedia, 2018

Sound Recovery Considering the Vibration Direction of an Object in a Video.

Yohei Fuse

Yusuke Yasumi

Proceedings of the 2018 IEEE International Symposium on Multimedia, 2018

Oil Price Forecasting Using Supervised GANs with Continuous Wavelet Transform Features.

Proceedings of the 24th International Conference on Pattern Recognition, 2018

Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Spatiotemporal Characteristics of Cortical Activities Associated with Articulation of Speech Perception.

Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Hybrid Text-to-Speech for Articulation Disorders with a Small Amount of Non-Parallel Data.

Ryuka Nanzaka

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

User's Intention Understanding in Question-Answering System Using Attention-based LSTM.

Yuki Matsuyoshi

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Sound Recovery Using Vibration Modes of the Object in a Video.

Yohei Fuse

Yusuke Yasumi

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Rotation-reversal invariant HOG cascade for facial expression recognition.

Signal Image Video Process., 2017

Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform.

EURASIP J. Audio Speech Music. Process., 2017

Visual-to-speech conversion based on maximum likelihood estimation.

Proceedings of the Fifteenth IAPR International Conference on Machine Vision Applications, 2017

Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Phoneme-Discriminative Features for Dysarthric Speech Conversion.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A Bayesian nonparametric multimodal data modeling framework for video emotion recognition.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Facial Expression Recognition with deep age.

Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Spatiotemporal properties of magnetic fields induced by auditory speech sound imagery and perception.

Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

2016

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine.

Yasuhiro Minami

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

LLC Revisit: Scene Classification with k-Farthest Neighbours.

Katsuyuki Tanaka

IEICE Trans. Inf. Syst., 2016

Multithreading cascade of SURF for facial expression recognition.

EURASIP J. Image Video Process., 2016

Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform.

Zhaojie Luo

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Modeling deep bidirectional relationships for image classification and generation.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Semi-non-negative matrix factorization using alternating direction method of multipliers for voice conversion.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Expression Recognition with Ri-HOG Cascade.

Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

Emotional voice conversion using deep neural networks with MCC and F0 features.

Zhaojie Luo

Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Lip reading using a dynamic feature of lip images and convolutional neural networks.

Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Selection of an optimum random matrix using a genetic algorithm for acoustic feature extraction.

Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

2015

Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars.

ACM Trans. Access. Comput., 2015

Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss.

IPSJ Trans. Comput. Vis. Appl., 2015

A robust SVM classification framework using PSM for multi-class recognition.

EURASIP J. Image Video Process., 2015

Voice conversion using speaker-dependent conditional restricted Boltzmann machine.

EURASIP J. Audio Speech Music. Process., 2015

Multimodal voice conversion based on non-negative matrix factorization.

EURASIP J. Audio Speech Music. Process., 2015

Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization.

EURASIP J. Audio Speech Music. Process., 2015

Many-to-one voice conversion using exemplar-based sparse representation.

Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis.

Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Content-based Image Retrieval Using Rotation-invariant Histograms of Oriented Gradients.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Many-to-many voice conversion based on multiple non-negative matrix factorization.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Word-Error Correction of Continuous Speech Recognition Based on Normalized Relevance Distance.

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Individuality-Preserving Voice Reconstruction for Articulation Disorders Using Text-to-Speech Synthesis.

Reina Ueda

Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09, 2015

Sparse nonlinear representation for voice conversion.

Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Multithreading AdaBoost framework for object recognition.

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Activity-mapping non-negative matrix factorization for exemplar-based voice conversion.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Estimation of object functions using deformable part model.

Yosuke Kitano

Proceedings of the 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision, 2015

Feature extraction using pre-trained convolutive bottleneck nets for dysarthric speech recognition.

Proceedings of the 23rd European Signal Processing Conference, 2015

Noise-robust voice conversion using a small parallel data based on non-negative matrix factorization.

Proceedings of the 23rd European Signal Processing Conference, 2015

Rotation-invariant histograms of oriented gradients for local patch robust representation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Facial expression recognition with multithreaded cascade of rotation-invariant HOG.

Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines.

IEICE Trans. Inf. Syst., 2014

Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization.

IEICE Trans. Inf. Syst., 2014

A preliminary demonstration of exemplar-based voice conversion for articulation disorders using an individuality-preserving dictionary.

EURASIP J. Audio Speech Music. Process., 2014

Individuality-preserving Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization.

Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies, 2014

High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multimodal exemplar-based voice conversion using lip features in noisy environments.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Error correction of automatic speech recognition based on normalized web distance.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

3D-Object Recognition Based on LLC Using Depth Spatial Pyramid.

Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Voice conversion in time-invariant speaker-independent space.

Proceedings of the IEEE International Conference on Acoustics, 2014

Multimodal voice conversion using non-negative matrix factorization in noisy environments.

Proceedings of the IEEE International Conference on Acoustics, 2014

Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary.

Proceedings of the IEEE International Conference on Acoustics, 2014

Exemplar-based emotional voice conversion using non-negative matrix factorization.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

A Robust Learning Framework Using PSM and Ameliorated SVMs for Emotional Recognition.

Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

2013

Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2013

Noise-robust voice conversion based on spectral mapping on sparse space.

Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Robust Feature Extraction to Utterance Fluctuation of Articulation Disorders Based on Random Projection.

Toshiya Yoshioka

Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, 2013

Individuality-Preserving Voice Conversion for Articulation Disorders Using Locality-Constrained NMF.

Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, 2013

High-Frequency Restoration Using Deep Belief Nets for Super-resolution.

Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Event Detection and Recognition Using HMM with Whistle Sounds.

Hiroki Itoh

Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Acoustic feature selection utilizing multiple kernel learning for classification of children with autism spectrum and typically developing children.

Proceedings of the 2013 IEEE/SICE International Symposium on System Integration, 2013

Voice conversion based on Non-negative Matrix Factorization in noisy environments.

Proceedings of the 2013 IEEE/SICE International Symposium on System Integration, 2013

Unknown Object Identification Using Category Visual Words with Rejection Function.

Yuto Tanaka

Proceedings of the 13th IAPR International Conference on Machine Vision Applications, 2013

Robust facial expressions recognition using 3D average face and ameliorated adaboost.

Proceedings of the ACM Multimedia Conference, 2013

Two-step correction of speech recognition errors based on n-gram and long contextual information.

Ryohei Nakatani

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Voice conversion in high-order eigen space using deep belief nets.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Exemplar-based individuality-preserving voice conversion for articulation disorders in noisy environments.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Prediction of unlearned position based on local regression for single-channel talker localization using acoustic transfer function.

Proceedings of the IEEE International Conference on Acoustics, 2013

Sparse representation for outliers suppression in semi-supervised image annotation.

Proceedings of the IEEE International Conference on Acoustics, 2013

Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Exemplar-based voice conversion in noisy environment.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Robust AAM-based audio-visual speech recognition against face direction changes.

Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Super-resolution Using GMM and PLS Regression.

Proceedings of the 2012 IEEE International Symposium on Multimedia, 2012

Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification.

Christophe Garcia

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

3D tracking of soccer players using time-situation graph in monocular image sequence.

Hiroki Itoh

Proceedings of the 21st International Conference on Pattern Recognition, 2012

Acoustic model transformations based on random projections.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A new multiple-kernel-learning weighting method for localizing human brain magnetic activity.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Super-resolution by GMM based conversion using self-reduction image.

Yuki Ogawa

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Generic object recognition by graph structural expression.

Takahiro Hori

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words.

Katsuyuki Tanaka

Proceedings of the AI 2012: Advances in Artificial Intelligence, 2012

Robust feature extraction to utterance fluctuations due to articulation disorders based on sparse expression.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

An adaboost-based weighting method for localizing human brain magnetic activity.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Consonant enhancement for articulation disorders based on non-negative matrix factorization.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Audio-Visual Speech Recognition Based on AAM Parameter and Phoneme Analysis of Visual Feature.

Yuto Komai

Proceedings of the Advances in Image and Video Technology - 5th Pacific Rim Symposium, 2011

Image Annotation with Concept Level Feature Using PLSA+CCA.

Yu Zheng

Proceedings of the Advances in Multimedia Modeling, 2011

Constrained Spectrum Generation Using A Probabilistic Spectrum Envelope for Mixed Music Analysis.

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Single-Channel Head Orientation Estimation Based on Discrimination of Acoustic Transfer Function.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature selection based on Multiple Kernel Learning for single-channel sound source localization using the acoustic transfer function.

Proceedings of the IEEE International Conference on Acoustics, 2011

Generic object recognition using automatic region extraction and dimensional feature integration utilizing multiple kernel learning.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

3D Human Pose Estimation from a Monocular Image Using Model Fitting in Eigenspaces.

J. Softw. Eng. Appl., 2010

Multimodal speech recognition of a person with articulation disorders using AAM and MAF.

Proceedings of the 2010 IEEE International Workshop on Multimedia Signal Processing, 2010

Speech synthesis by modeling harmonics structure with multiple function.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Generic Object Recognition by Tree Conditional Random Field Based on Hierarchical Segmentation.

Takeshi Okumura

Proceedings of the 20th International Conference on Pattern Recognition, 2010

Structuring a gene network using a multiresolution independence test.

Takayuki Yamamoto

Proceedings of the IEEE International Conference on Acoustics, 2010

Evaluation of random-projection-based feature combination on speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

HMM-based separation of acoustic transfer function for single-channel sound source localization.

Proceedings of the IEEE International Conference on Acoustics, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.

Proceedings of the Auditory-Visual Speech Processing, 2010

Gaze Estimation Using Regression Analysis and AAMs Parameters Selected Based on Information Criterion.

Manabu Takatani

Proceedings of the Computer Vision - ACCV 2010 Workshops, 2010

2009

Integration of Metamodel and Acoustic Model for Dysarthric Speech Recognition.

Toshitaka Nakabayashi

J. Multim., 2009

Graph Cuts Segmentation by Using Local Texture Features of Multiresolution Analysis.

Keita Fukuda

IEICE Trans. Inf. Syst., 2009

Single-Channel Talker Localization Based on Discrimination of Acoustic Transfer Functions.

EURASIP J. Adv. Signal Process., 2009

Integrated Phoneme Subspace Method for Speech Feature Extraction.

Hyunsin Park

EURASIP J. Audio Speech Music. Process., 2009

System request detection in human conversation based on multi-resolution Gabor wavelet features.

Tomoyuki Yamagata

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Monaural sound-source-direction estimation using the acoustic transfer function of an active microphone.

Proceedings of the 12th International Conference on Information Fusion, 2009

Human Action Recognition Using HDP by Integrating Motion and Location Information.

Takuya Tonaru

Proceedings of the Computer Vision, 2009

2008

Human-Robot Interface Using System Request Utterance Detection Based on Acoustic Features.

Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Audio-Based Video Editing with Two-Channel Microphone.

Jun Adachi

Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Speaker Independent Phoneme Recognition Based on Fisher Weight Map.

Takashi Muroi

Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Tagging Video Contents with Positive/Negative Interest Based on User's Facial Expression.

Proceedings of the Advances in Multimedia Modeling, 2008

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: Newest Part of the CENSREC Series.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Sudden noise reduction based on GMM with noise power estimation.

Nobuyuki Miyake

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Integration of metamodel and acoustic model for speech recognition.

Toshitaka Nakabayashi

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Object recognition and segmentation using SIFT and Graph Cuts.

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

3D human posture estimation using the HOG features from monocular image.

Katsunori Onishi

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Graph cuts by using local texture features of wavelet coefficient for image segmentation.

Keita Fukuda

Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Digital camera work for soccer video production with event recognition and accurate ball tracking by switching search method.

Kazuki Yano

Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

2007

PCA-Based Speech Enhancement for Distorted Speech Recognition.

J. Multim., 2007

Voice activity detection by lip shape tracking using EBGM.

Proceedings of the 15th International Conference on Multimedia 2007, 2007

System request detection in conversation based on acoustic and speaker alternation features.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Language modeling using PLSA-based topic HMM.

Atsushi Sako

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

PCA-based feature extraction for fluctuation in speaking style of articulation disorders.

Toshitaka Nakabayashi

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech.

IEICE Trans. Inf. Syst., 2006

Phoneme recognition based on fisher weight map to higher-order local auto-correlation.

Shunsuke Kato

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Robust Feature Extraction using Kernel PCA.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Two-Channel-Based Noise Reduction in a Complex Spectrum Plane for Hands-Free Communication System.

Toshiya Ohkubo

Proceedings of the Advances in Multimedia Information Processing, 2005

Recognition of hands-free speech and hand pointing action for conversational TV.

Atsushi Sako

Proceedings of the 13th ACM International Conference on Multimedia, 2005

Situation based speech recognition for structuring baseball live games.

Atsushi Sako

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Improved HMM Separation for Distant-Talking Speech Recognition.

IEICE Trans. Inf. Syst., 2004

Sound Source Localization Using a Profile Fitting Method with Sound Reflectors.

Osamu Ichikawa

IEICE Trans. Inf. Syst., 2004

Acoustic model adaptation using first order prediction for reverberant speech.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2001

HMM-separation-based speech recognition for a distant moving speaker.

Satoshi Nakamura

Kiyohiro Shikano

IEEE Trans. Speech Audio Process., 2001

2000

Model adaptation by HMM decomposition and composition in noisy reverberant environments.

Satoshi Nakamura

Kiyohiro Shikano

Syst. Comput. Jpn., 2000

Speech recognition for a distant moving speaker based on HMM composition and separation.
