Toshio Irino

Speech Commun., February, 2023

GESI: Gammachirp Envelope Similarity Index for Predicting Intelligibility of Simulated Hearing Loss Sounds.

CoRR, 2023

Hearing Impairment Simulator Based on Auditory Excitation Pattern Playback: WHIS.

IEEE Access, 2023

Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Auditory Representation Effective for Estimating Vocal Tract Information.

Shintaro Doan

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Modelling speaker-size discrimination with voiced and unvoiced speech sounds based on the effect of spectral lift.

Speech Commun., 2022

WHIS: Hearing impairment simulator based on the gammachirp auditory filterbank.

CoRR, 2022

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening.

CoRR, 2022

Speech intelligibility of simulated hearing loss sounds and its prediction using the Gammachirp Envelope Similarity Index (GESI).

Honoka Tamaru

Ayako Yamamoto

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Observational and Accelerometer Analysis of Head Movement Patterns in Psychotherapeutic Dialogue.

Sensors, 2021

Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Interactive and Real-Time Acoustic Measurement Tools for Speech Data Acquisition and Presentation: Application of an Extended Member of Time Stretched Pulses.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Mixture of Orthogonal Sequences Made from Extended Time-Stretched Pulses Enables Measurement of Involuntary Voice Fundamental Frequency Response to Pitch Perturbation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Implementation of Interactive Tools for Investigating Fundamental Frequency Response of Voiced Sounds to Auditory Stimulation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

GEDI: Gammachirp envelope distortion index for predicting intelligibility of enhanced speech.

Speech Commun., 2020

Speech Clarity Improvement by Vocal Self-Training Using a Hearing Impairment Simulator and its Correlation with an Auditory Modulation Index.

Soichi Higashiyama

Hanako Yoshigi

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Predicting Intelligibility of Enhanced Speech Using Posteriors Derived from DNN-Based ASR System.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Predicting Speech Intelligibility of Enhanced Speech Using Phone Accuracy of DNN-Based ASR System.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Frequency domain variant of Velvet noise and its application to acoustic measurements.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices.

CoRR, 2018

Multi-resolution Gammachirp Envelope Distortion Index for Intelligibility Prediction of Noisy Speech.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Frequency Domain Variants of Velvet Noise and Their Application to Speech Processing and Synthesis.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Predicting Speech Intelligibility Using a Gammachirp Envelope Distortion Index Based on the Signal-to-Distortion Ratio.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

The Effect of Spectral Tilt on Size Discrimination of Voiced Speech Sounds.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A New Cosine Series Antialiasing Function and its Application to Aliasing-Free Glottal Source Models for Speech and Singing Synthesis.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Auditory Model of Speaker Size Perception for Voiced Speech Sounds.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

How the slope of the speech spectrum affects the perception of speaker size.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Aliasing-free implementation of discrete-time glottal source models and their applications to speech synthesis and F0 extractor evaluation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

Excitation source analysis for high-quality speech manipulation systems based on an interference-free representation of group delay with minimum phase response compensation.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Vocal tract length estimation based on vowels using a database consisting of 385 speakers and a database with MRI-based vocal tract shape information.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Proposal for an Interactive 3D Sound Playback Interface Controlled by User Behavior.

Proceedings of the HCI International 2014 - Posters' Extended Abstracts, 2014

Development of a Mobile Application for Crowdsourcing the Data Collection of Environmental Sounds.

Proceedings of the Human Interface and the Management of Information. Information and Knowledge Design and Evaluation, 2014

Hearing impairment simulator based on compressive gammachirp filter.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

Controlling "shout" expression in a Japanese POP singing performance: analysis and suppression study.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Higher order waveform symmetry measure and its application to periodicity detectors for speech and singing with fine temporal resolution.

Proceedings of the IEEE International Conference on Acoustics, 2013

Vocal tract length estimation for voiced and whispered speech using gammachirp filterbank.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Comparison of performance with voiced and whispered speech in word recognition and mean-formant-frequency discrimination.

Speech Commun., 2012

Deviation measure of waveform symmetry and its application to high-speed and temporally-fine F0 extraction for vocal sound texture manipulation.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Detecting child speaker based on auditory feature vectors for VTL estimation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Modulation transfer function design for a flexible cross synthesis VOCODER based on F0 adaptive spectral envelope recovery.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

An interference-free representation of group delay for periodic signals.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Auditory Filterbank Improves Voice Morphing.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

An interference-free representation of instantaneous frequency of periodic signals and its application to F0 extraction.

Masanori Morise

Proceedings of the IEEE International Conference on Acoustics, 2011

Development of Web-Based Voice Interface to Identify Child Users Based on Automatic Speech Recognition System.

Proceedings of the Human-Computer Interaction. Users and Applications, 2011

Manual and Accelerometer Analysis of Head Nodding Patterns in Goal-oriented Dialogues.

Proceedings of the Human-Computer Interaction. Interaction Techniques and Environments, 2011

2010

Auditory speech processing for scale-shift covariance and its evaluation in automatic speech recognition.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

High-quality and light-weight voice transformation enabling extrapolation without perceptual and objective breakdown.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

A bottom-up procedure to extract periodicity structure of voiced sounds and its application to represent and restoration of pathological voices.

Proceedings of the Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2009

Influences of vowel duration on speaker-size estimation and discrimination.

Chihiro Takeshima

Minoru Tsuzaki

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Observation of empirical cumulative distribution of vowel spectral distances and its application to vowel based voice conversion.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Temporally variable multi-aspect auditory morphing enabling extrapolation without objective and perceptual breakdown.

Proceedings of the IEEE International Conference on Acoustics, 2009

Development of Speech Input Method for Interactive VoiceWeb Systems.

Proceedings of the Human-Computer Interaction. Novel Interaction Methods and Techniques, 2009

2008

A method for fundamental frequency estimation and voicing decision: Application to infant utterances recorded in real acoustical environments.

Speech Commun., 2008

Vowel-based frequency alignment function design and recognition-based time alignment for automatic speech morphing.

Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Speech-to-text input method for web system using JavaScript.

Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Spectral envelope recovery beyond the Nyquist limit for high-quality manipulation of speech sounds.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Discrimination and recognition of scaled word sounds.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Group delay for acoustic event representation and its application for speech aperiodicity analysis.

Proceedings of the 15th European Signal Processing Conference, 2007

2006

Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements.

IEEE Trans. Speech Audio Process., 2006

A Dynamic Compressive Gammachirp Auditory Filterbank.

IEEE Trans. Speech Audio Process., 2006

Automatic assignment of anchoring points on vowel templates for defining correspondence between time-frequency representations of speech samples.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Analyzing dialogue data for real-world emotional speech classification.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Dynamic, Compressive Gammachirp Auditory Filterbank for Perceptual Signal Processing.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Logarithmic temporal processing applied to accurate empirical transfer function measurements in vocal sound propagation.

Masanori Morise

Proceedings of the 14th European Signal Processing Conference, 2006

Speech style conversion based on the statistics of vowel spectrograms and nonlinear frequency mapping.

Proceedings of the 14th European Signal Processing Conference, 2006

2005

Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Speech intelligibility derived from time-frequency and source smearing.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Underlying Principles of a High-quality Speech Manipulation System STRAIGHT and Its Application to Speech Segregation.

Proceedings of the Speech Separation by Humans and Machines, 2005

Speech Segregation Using an Event-synchronous Auditory Image and STRAIGHT.

Proceedings of the Speech Separation by Humans and Machines, 2005

2004

A design of audio-visual talker tracking system based on CSP analysis and frame difference in real noisy environments.

Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Intelligibility of degraded speech from smeared STRAIGHT spectrum.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Algorithm amalgam: morphing waveform based methods, sinusoidal models and STRAIGHT.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Dominance spectrum based v/UV classification and F0 estimation.

Tomohiro Nakatani

Parham Zolfaghari

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech segregation based on fundamental event information using an auditory vocoder.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech segregation using event synchronous auditory vocoder.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

Segregating information about the size and shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform.

Speech Commun., 2002

Robust fundamental frequency estimation against background noise and spectral distortion.

Tomohiro Nakatani

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Evaluation of a speech recognition / generation method based on HMM and STRAIGHT.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Auditory VOCODER: Speech resynthesis from an auditory Mellin representation.

Proceedings of the IEEE International Conference on Acoustics, 2002

2000

Robust fundamental frequency estimation using instantaneous frequencies of harmonic components.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Stabilised wavelet mellin transform: an auditory strategy for normalising sound-source size.

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Noise suppression using a time-varying, analysis/synthesis gammachirp filterbank.

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998

The Gammachirp for Optimal Auditory Filtering.

Proceedings of the Fifth International Conference on Neural Information Processing, 1998

A time-varying, analysis/synthesis auditory filterbank using the gammachirp.

Masashi Unoki

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1996

A 'gammachirp' function as an optimal auditory filter with the Mellin transform.

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1994

A theory of asymmetric intensity enhancement around acoustic transients.

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1993

Signal reconstruction from modified auditory wavelet transform.

IEEE Trans. Signal Process., 1993

1992

Signal reconstruction from modified wavelet transform-An application to auditory signal processing.

Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1990

A Method for Designing Neural Networks Using Nonlinear Multivariate Analysis: Application to Speaker-Independent Vowel Recognition.

Neural Comput., 1990

1988

Vowel-feature extraction from cochlear vibration using neural networks.
