Masashi Unoki
Orcid: 0000-0002-6605-2052
According to our database, Masashi Unoki authored at least 175 papers between 1997 and 2025.
IEEE Access, 2025
Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network.
Speech Commun., 2024
Modeling and Estimation of Vocal Tract and Glottal Source Parameters Using ARMAX-LF Model.
CoRR, 2024
Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks.
CoRR, 2024
Hybrid Transformer Architectures With Diverse Audio Features for Deepfake Speech Classification.
IEEE Access, 2024
IEEE Access, 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2023
A Discriminative Feature Representation Method Based on Cascaded Attention Network With Adversarial Strategy for Speech Emotion Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Personality trait estimation in group discussions using multimodal analysis and speaker embedding.
J. Multimodal User Interfaces, 2023
Computational models of sound-quality metrics using method for calculating loudness with gammatone/gammachirp auditory filterbank.
CoRR, 2023
Blind Estimation of Speech Transmission Index and Room Acoustic Parameters by Using Extended Model of Room Impulse Response Derived From Speech Signals.
IEEE Access, 2023
Anomalous Sound Detection for Industrial Machines Using Acoustical Features Related to Timbral Metrics.
IEEE Access, 2023
IEEE Access, 2023
Speaker Verification Using Distance Based on Principal Component Analysis for Household Scenario Adaptation.
Proceedings of the International Conference on Computing and Communication Technologies, 2023
Proceedings of the International Conference on Computing and Communication Technologies, 2023
Consonant-emphasis Method Incorporating Robust Consonant-section Detection to Improve Intelligibility of Bone-conducted speech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
An Improved Optimal Transport Kernel Embedding Method with Gating Mechanism for Singing Voice Separation and Speaker Identification.
Proceedings of the IEEE International Conference on Acoustics, 2023
Auditory Model Optimization with Wavegram-CNN and Acoustic Parameter Models for Nonintrusive Speech Intelligibility Prediction in Hearing Aids.
Proceedings of the 31st European Signal Processing Conference, 2023
Data-driven Non-uniform Filterbanks Based on F-ratio for Machine Anomalous Sound Detection.
Proceedings of the 31st European Signal Processing Conference, 2023
Incorporating the Digit Triplet Test in A Lightweight Speech Intelligibility Prediction for Hearing Aids.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Increasing Speech Intelligibility by Mimicking Professional Announcers' Voices and Its Physical Correlates.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Contribution of modulation spectral features for cross-lingual speech emotion recognition under noisy reverberant conditions.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Analysis of Spectro-Temporal Modulation Representation for Deep-Fake Speech Detection.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Deepfake-speech Detection with Pathological Features and Multilayer Perceptron Neural Network.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Detection of Brain Network Communities During Natural Speech Comprehension From Functionally Aligned EEG Sources.
Frontiers Comput. Neurosci., 2022
Blind Speech Watermarking Method with Frame Self-Synchronization Based on Spread-Spectrum Using Linear Prediction Residue.
Entropy, 2022
Speaker anonymization by modifying fundamental frequency and x-vector singular value.
Comput. Speech Lang., 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Method for improving the word intelligibility of presented speech using bone-conduction headphones.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Automatic Mean Opinion Score Estimation with Temporal Modulation Features on Gammatone Filterbank for Speech Assessment.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
An Improved Stimulus Reconstruction Method for EEG-Based Short-Time Auditory Attention Detection.
Proceedings of the Neural Information Processing - 29th International Conference, 2022
Bone-conducted Speech Enhancement Using Vector-quantized Variational Autoencoder and Gammachirp Filterbank Cepstral Coefficients.
Proceedings of the 30th European Signal Processing Conference, 2022
Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network.
Proceedings of the 30th European Signal Processing Conference, 2022
Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Using Temporal Modulation Features on Gammatone Auditory Filterbank.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Multi-resolution modulation-filtered cochleagram feature for LSTM-based dimensional emotion recognition from speech.
Neural Networks, 2021
Speech Watermarking Method Using McAdams Coefficient Based on Random Forest Learning.
Entropy, 2021
Frequency-specific Brain Network Dynamics during Perceiving Real Words and Pseudowords.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network.
Proceedings of the IEEE International Conference on Acoustics, 2021
Blind Estimation of Room Acoustic Parameters and Speech Transmission Index using MTF-based CNNs.
Proceedings of the 29th European Signal Processing Conference, 2021
Tampering Detection for Speech Signals Using Synchronization Code and LSF-based Watermarks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Improving Security in McAdams Coefficient-Based Speaker Anonymization by Watermarking Method.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Hybridization of speech information hiding and encryption for double-layer security in speech communication.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Non-Blind Speech Watermarking Method Based on Spread-Spectrum Using Linear Prediction Residue.
IEICE Trans. Inf. Syst., 2020
Speech Emotion Recognition Using 3D Convolutions and Attention-Based Sliding Recurrent Networks With Auditory Front-Ends.
IEEE Access, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Speech Privacy Protection based on Optimal Controlling Estimated Speech Transmission Index in Noisy Reverberant Environments.
Proceedings of the 28th European Signal Processing Conference, 2020
Enhancement of speech intelligibility under noisy reverberant conditions based on modulation spectrum concept.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
IEEE Signal Process. Lett., 2019
Detection of speech tampering using sparse representations and spectral manipulations based information hiding.
Speech Commun., 2019
J. Inf. Hiding Multim. Signal Process., 2019
Estimates of Transmission Characteristics Related to Perception of Bone-Conducted Speech Using Real Utterances and Transcutaneous Vibration on Larynx.
Proceedings of the Speech and Computer - 21st International Conference, 2019
Data Augmentation for Monaural Singing Voice Separation Based on Variational Autoencoder-Generative Adversarial Network.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Inaudible Speech Watermarking Based on Self-compensated Echo-hiding and Sparse Subspace Clustering.
Proceedings of the IEEE International Conference on Acoustics, 2019
Multimodal BigFive Personality Trait Analysis Using Communication Skill Indices and Multiple Discussion Types Dataset.
Proceedings of the Social Computing and Social Media. Design, Human Behavior and Analytics, 2019
Dimensional Emotion Recognition from Speech Using Modulation Spectral Features and Recurrent Neural Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
A Robust Method for Blindly Estimating Speech Transmission Index using Convolutional Neural Network with Temporal Amplitude Envelope.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the Speech and Computer - 20th International Conference, 2018
Digital Audio Watermarking Method Based on Singular Spectrum Analysis with Automatic Parameter Estimation Using a Convolutional Neural Network.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018
Auditory-Inspired End-to-End Speech Emotion Recognition Using 3D Convolutional Recurrent Neural Networks Based on Spectral-Temporal Representation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018
Speech Watermarking Based on Robust Principal Component Analysis and Formant Manipulations.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Method of Estimating Direction of Arrival of Sound Source for Monaural Hearing Based on Temporal Modulation Perception.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Method of Blindly Estimating Speech Transmission Index in Noisy Reverberant Environments.
J. Inf. Hiding Multim. Signal Process., 2017
Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection.
J. Inf. Hiding Multim. Signal Process., 2017
Robust Method for Estimating F0 of Complex Tone Based on Pitch Perception of Amplitude Modulated Signal.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017
Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants.
Proceedings of the 25th European Signal Processing Conference, 2017
Study on method for protecting speech privacy by actively controlling speech transmission index in simulated room.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Speech emotion recognition using multichannel parallel convolutional recurrent neural networks based on gammatone auditory filterbank.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Speech watermarking scheme based on singular-spectrum analysis for tampering detection and identification.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
F0 estimation using empirical mode decomposition and complex cepstrum analysis in reverberant environments.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments.
J. Signal Process. Syst., 2016
Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments.
Speech Commun., 2016
Audio Watermarking Scheme Based on Singular Spectrum Analysis and Psychoacoustic Model with Self-Synchronization.
J. Electr. Comput. Eng., 2016
IEICE Trans. Inf. Syst., 2016
MTF-Based Kalman Filtering with Linear Prediction for Power Envelope Restoration in Noisy Reverberant Environments.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2016
Singular-Spectrum Analysis for Digital Audio Watermarking with Automatic Parameterization and Parameter Estimation.
IEICE Trans. Inf. Syst., 2016
Speech Analysis Method Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2016
Robust Speech Analysis Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition in Noisy Environments.
Proceedings of the Speech and Computer - 18th International Conference, 2016
Robust front-end for speech recognition by human and machine in noisy reverberant environments: The effect of phase information.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Investigations into vowel and consonant structures in articulatory and auditory spaces using Laplacian eigenmaps.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Restoration scheme of instantaneous amplitude and phase using Kalman filter with efficient linear prediction for speech enhancement.
Speech Commun., 2015
Tampering Detection Scheme for Speech Signals using Formant Enhancement based Watermarking.
J. Inf. Hiding Multim. Signal Process., 2015
Robust, Blindly-Detectable, and Semi-Reversible Technique of Audio Watermarking Based on Cochlear Delay Characteristics.
IEICE Trans. Inf. Syst., 2015
Complex tensor factorization in modulation frequency domain for single-channel speech enhancement.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Feasibility of Estimating Direction of Arrival Based on Monaural Modulation Spectrum.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Robust and reliable audio watermarking based on dynamic phase coding and error control coding.
Proceedings of the 23rd European Signal Processing Conference, 2015
Restoration of instantaneous amplitude and phase of speech signal in noisy reverberant environments.
Proceedings of the 23rd European Signal Processing Conference, 2015
An audio watermarking scheme based on automatic parameterized singular-spectrum analysis using differential evolution.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
J. Inf. Hiding Multim. Signal Process., 2014
iDAF-drum: Supporting Practice of Drumstick Control by Exploiting Insignificantly Delayed Auditory Feedback.
Proceedings of the Knowledge, Information and Creativity Support Systems - Selected Papers from KICSS'2014, 2014
Proceedings of the Digital-Forensics and Watermarking - 13th International Workshop, 2014
Proceedings of the Digital-Forensics and Watermarking - 13th International Workshop, 2014
Signal to noise ratio estimation based on an optimal design of subband voice activity detection.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Speech analysis method based on source-filter model using multivariate empirical mode decomposition in log-spectrum domain.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014
Restoration of instantaneous amplitude and phase using Kalman filter for speech enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the 22nd European Signal Processing Conference, 2014
Controlling Tradeoff Between Approximation Accuracy and Complexity of a Smooth Function in a Reproducing Kernel Hilbert Space for Noise Reduction.
IEEE Trans. Signal Process., 2013
Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems, 2013
Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013
Study on Method for Estimating F0 of Steady Complex Tone in Noisy Reverberant Environments.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013
Robust Audio Data Hiding Method Based on Phase of Modulated Complex Lapped Transform.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013
Blind method of estimating speech transmission index from reverberant speech signals.
Proceedings of the 21st European Signal Processing Conference, 2013
Blind method of estimating speech transmission index in room acoustics based on concept of modulation transfer function.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
IMM-based feature compensation robust to slowly time-varying noise and reverberation.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Improvements to Creativity in Singing Abilities Based on Perspective of Studies on Interaction between Speech Production and Auditory Perception.
Proceedings of the Seventh International Conference on Knowledge, 2012
Unified denoising and dereverberation method used in restoration of MTF-based power envelope.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Robust voice activity detection using empirical mode decomposition and modulation spectrum analysis.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2012
Proceedings of the Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2012
Temporal modulation normalization for robust speech feature extraction and recognition.
Multim. Tools Appl., 2011
Embedding Limitations with Digital-audio Watermarking Method Based on Cochlear Delay Characteristics.
J. Inf. Hiding Multim. Signal Process., 2011
Sub-band temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments.
Comput. Speech Lang., 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the Seventh International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2011
Temporal contrast normalization and edge-preserved smoothing of temporal modulation structures of speech for robust speech recognition.
Speech Commun., 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Design of IIR All-Pass Filter Based on Cochlear Delay to Reduce Embedding Limitations.
Proceedings of the Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2010), 2010
IEICE Trans. Inf. Syst., 2009
Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments.
Proceedings of the 3rd International Universal Communication Symposium, 2009
Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Embedding Limitations with Audio-watermarking Method Based on Cochlear-delay Characteristics.
Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009
Temporal contrast normalization and edge-preserved smoothing on temporal modulation structure for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the 17th European Signal Processing Conference, 2009
A comprehensive study on the effects of room reverberation on fundamental frequency estimation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2008), 2008
Comparative evaluations of robust and accurate F0 estimates in reverberant environments.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the 2008 16th European Signal Processing Conference, 2008
Method of LP-based blind restoration for improving intelligibility of bone-conducted speech.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
A robust feature extraction based on the MTF concept for speech recognition in reverberant environment.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis.
Speech Commun., 2005
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
A speech dereverberation method based on the MTF concept using adaptive time-frequency divisions.
Proceedings of the 2004 12th European Signal Processing Conference, 2004
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
Proceedings of the 2003 International Computer Music Conference, 2003
A method based on the MTF concept for dereverberating the power envelope from the reverberant signal.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Speech Commun., 1999
Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997