Longbiao Wang
Orcid: 0000-0002-4005-5036
Affiliations:
- Nagaoka University of Technology
According to our database, Longbiao Wang authored at least 235 papers between 2004 and 2024.
Bibliography
2024
Significance of relative phase features for shouted and normal speech classification.
EURASIP J. Audio Speech Music. Process., December, 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Text-to-Speech for Low-Resource Agglutinative Language With Morphology-Aware Language Model Pre-Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Adversarial Domain Generalized Transformer for Cross-Corpus Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2024
Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network.
Speech Commun., 2024
Towards multimodal sarcasm detection via label-aware graph contrastive learning with back-translation augmentation.
Knowl. Based Syst., 2024
Knowl. Based Syst., 2024
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition.
CoRR, 2024
AIMDiT: Modality Augmentation and Interaction via Multimodal Dimension Transformation for Emotion Recognition in Conversations.
CoRR, 2024
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios.
CoRR, 2024
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge.
CoRR, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
Proceedings of the IEEE International Conference on Acoustics, 2024
G^2SAM: Graph-Based Global Semantic Awareness Method for Multimodal Sarcasm Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Speech Commun., November, 2023
Neurocomputing, October, 2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
CFDRN: A Cognition-Inspired Feature Decomposition and Recombination Network for Dysarthric Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE Signal Process. Lett., 2023
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations.
CoRR, 2023
CoRR, 2023
Auditory Attention Detection in Real-Life Scenarios Using Common Spatial Patterns from EEG.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
SDNet: Stream-attention and Dual-feature Learning Network for Ad-hoc Array Speech Separation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Discrimination of the Different Intents Carried by the Same Text Through Integrating Multimodal Information.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Improving Zero-shot Cross-domain Slot Filling via Transformer-based Slot Semantics Fusion.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Local and Global Context Modeling with Relation Matching Task for Dialog Act Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Multi-Modal Sarcasm Detection Based on Cross-Modal Composition of Inscribed Entity Relations.
Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, 2023
Enhancing Multimodal Alignment with Momentum Augmentation for Dense Video Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Time-Domain Speech Enhancement Assisted by Multi-Resolution Frequency Encoder and Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2023
Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Leveraging Positional-Related Local-Global Dependency for Synthetic Speech Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023
VF-Taco2: Towards Fast and Lightweight Synthesis for Autoregressive Models with Variation Autoencoder and Feature Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Tackling Modality Heterogeneity with Multi-View Calibration Network for Multimodal Sentiment Detection.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Augmenting Affective Dependency Graph via Iterative Incongruity Graph Learning for Sarcasm Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Toward Efficient Processing and Learning With Spikes: New Approaches for Multispike Learning.
IEEE Trans. Cybern., 2022
Learning affective representations based on magnitude and dynamic relative phase information for speech emotion recognition.
Speech Commun., 2022
Emotion Recognition With Multimodal Transformer Fusion Framework Based on Acoustic and Lexical Information.
IEEE Multim., 2022
Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition.
IEEE Multim., 2022
Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling.
EURASIP J. Audio Speech Music. Process., 2022
MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation.
CoRR, 2022
Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion.
CoRR, 2022
Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Self-Distillation Based on High-level Information Supervision for Compressing End-to-End ASR Model.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
TopicKS: Topic-driven Knowledge Selection for Knowledge-grounded Dialogue Generation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Finer-grained Modeling units-based Meta-Learning for Low-resource Tibetan Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Improve emotional speech synthesis quality by learning explicit and implicit representations with semi-supervised training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the International Joint Conference on Neural Networks, 2022
An Improved Stimulus Reconstruction Method for EEG-Based Short-Time Auditory Attention Detection.
Proceedings of the Neural Information Processing - 29th International Conference, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Multi-Stage Graph Representation Learning for Dialogue-Level Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
Cache: Modeling Contribution-Aware Context Hierarchically for Long-Range Dialogue State Tracking.
Proceedings of the IEEE International Conference on Acoustics, 2022
Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2022
Using Multiple Reference Audios and Style Embedding Constraints for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Robust Environmental Sound Recognition With Sparse Key-Point Encoding and Efficient Multispike Learning.
IEEE Trans. Neural Networks Learn. Syst., 2021
Knowl. Based Syst., 2021
Replay attack detection using variable-frequency resolution phase and magnitude features.
Comput. Speech Lang., 2021
Information Sieve: Content Leakage Reduction in End-to-End Prosody For Expressive Speech Synthesis.
CoRR, 2021
Exploiting Explicit and Inferred Implicit Personas for Multi-turn Dialogue Generation.
Proceedings of the Natural Language Processing and Chinese Computing, 2021
A Sentiment Similarity-Oriented Attention Model with Multi-task Learning for Text-Based Emotion Recognition.
Proceedings of the MultiMedia Modeling - 27th International Conference, 2021
Dialogue Act Recognition using Branch Architecture with Attention Mechanism for Imbalanced Data.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Joint Feature Enhancement and Speaker Recognition with Multi-Objective Task-Oriented Network.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Domain-Specific Multi-Agent Dialog Policy Learning in Multi-Domain Task-Oriented Scenarios.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Time-Frequency Representation Learning with Graph Convolutional Network for Dialogue-Level Speech Emotion Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
TacoLPCNet: Fast and Stable TTS by Conditioning LPCNet on Mel Spectrogram Predictions.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Metric Learning Based Feature Representation with Gated Fusion Model for Speech Emotion Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Information Sieve: Content Leakage Reduction in End-to-End Prosody Transfer for Expressive Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the International Joint Conference on Neural Networks, 2021
Proceedings of the Neural Information Processing - 28th International Conference, 2021
Proceedings of the Neural Information Processing - 28th International Conference, 2021
Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS.
Proceedings of the Neural Information Processing - 28th International Conference, 2021
CONSK-GCN: Conversational Semantic- and Knowledge-Oriented Graph Convolutional Network for Multimodal Emotion Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Multimodal Emotion Recognition with Capsule Graph Convolutional Based Representation Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2021
Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network.
Proceedings of the IEEE International Conference on Acoustics, 2021
Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Domain-Adversarial Autoencoder with Attention Based Feature Level Fusion for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 32nd British Machine Vision Conference 2021, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Learning Language and Speaker Information for Code-Switch Speech Synthesis with Limited Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning.
CoRR, 2020
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020
Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Temporal Attention Convolutional Network for Speech Emotion Recognition with Latent Representation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the Neural Information Processing - 27th International Conference, 2020
Hierarchical Interactive Matching Network for Multi-turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the Neural Information Processing - 27th International Conference, 2020
Adversarial Shared-Private Attention Network for Joint Slot Filling and Intent Detection.
Proceedings of the Neural Information Processing - 27th International Conference, 2020
Investigation of Effectively Synthesizing Code-Switched Speech Using Highly Imbalanced Mix-Lingual Data.
Proceedings of the Neural Information Processing - 27th International Conference, 2020
Proceedings of the ICCAI '20: 2020 6th International Conference on Computing and Artificial Intelligence, 2020
A Hierarchical Model for Dialog Act Recognition Considering Acoustic and Lexical Context Information.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
EURASIP J. Audio Speech Music. Process., 2019
Robust Environmental Sound Recognition with Sparse Key-point Encoding and Efficient Multi-spike Learning.
CoRR, 2019
Aust. J. Intell. Inf. Process. Syst., 2019
Replay Attack Detection Using Linear Prediction Analysis-Based Relative Phase Features.
IEEE Access, 2019
Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine.
IEEE Access, 2019
An integrated system for robust gender classification with convolutional restricted Boltzmann machine and spiking neural network.
Proceedings of the IEEE Symposium Series on Computational Intelligence, 2019
Proceedings of the IEEE Symposium Series on Computational Intelligence, 2019
CNN-BLSTM Based Question Detection from Dialogs Considering Phase and Context Information.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Environment-Dependent Attention-Driven Recurrent Convolutional Neural Network for Robust Speech Enhancement.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
A Spiking Neural Network with Distributed Keypoint Encoding for Robust Sound Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2019
Time-Frequency Deep Representation Learning for Speech Emotion Recognition Integrating Self-attention.
Proceedings of the Neural Information Processing - 26th International Conference, 2019
A Fast Convolutional Self-attention Based Speech Dereverberation Method for Robust Speech Recognition.
Proceedings of the Neural Information Processing - 26th International Conference, 2019
NVSRN: A Neural Variational Scaling Reasoning Network for Initiative Response Generation.
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019
Replay Attack Detection Using Magnitude and Phase Information with Attention-based Adaptive Filters.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Robust Sound Event Classification with Local Time-Frequency Information and Convolutional Neural Networks.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2019: Text and Time Series, 2019
A Semi-Supervised Stable Variational Network for Promoting Replier-Consistency in Dialogue Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Effective Training End-to-End ASR systems for Low-resource Lhasa Dialect of Tibetan Language.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Multim. Tools Appl., 2018
Replay Attacks Detection Using Phase and Magnitude Features with Various Frequency Resolutions.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Distant-talking Speech Recognition Based on Multi-objective Learning using Phase and Magnitude-based Feature.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Pitch Synchronized Relative Phase with Peak Error Detection For Noise-robust Speaker Recognition.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Revealing Spatiotemporal Brain Dynamics of Speech Production Based on EEG and Eye Movement.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018
Convolutional Neural Network with Spectrogram and Perceptual Features for Speech Emotion Recognition.
Proceedings of the Neural Information Processing - 25th International Conference, 2018
Proceedings of the Neural Information Processing - 25th International Conference, 2018
A Feature Fusion Method Based on Extreme Learning Machine for Speech Emotion Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018
Interaction-Aware Topic Model for Microblog Conversations through Network Embedding and User Attention.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
Implicit Discourse Relation Recognition using Neural Tensor Network with Interactive Attention and Sparse Learning.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
2017
IEEE J. Sel. Top. Signal Process., 2017
Noise robust voice activity detection using joint phase and magnitude based feature enhancement.
J. Ambient Intell. Humaniz. Comput., 2017
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017
Global Monitoring of Dynamic Functional Interactions in the Brain During Chinese Verbs Perception.
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017
Proceedings of the Neural Information Processing - 24th International Conference, 2017
Proceedings of the Neural Information Processing - 24th International Conference, 2017
Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models.
Proceedings of the Neural Information Processing - 24th International Conference, 2017
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017
Pseudo-pitch-synchronized phase information extraction and its application for robust speaker recognition.
Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017
2016
Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization.
J. Signal Process. Syst., 2016
Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition.
Multim. Tools Appl., 2016
Multim. Tools Appl., 2016
Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And ChannelWeighting.
CoRR, 2016
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
EURASIP J. Adv. Signal Process., 2015
Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification.
EURASIP J. Audio Speech Music. Process., 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
2014
Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array.
IEEE ACM Trans. Audio Speech Lang. Process., 2014
Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation.
EURASIP J. Audio Speech Music. Process., 2014
EURASIP J. Audio Speech Music. Process., 2014
Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014
Distant-talking speech recognition using multi-channel LMS and multiple-step linear prediction.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Single-sided approach to discriminative PLDA training for text-independent speaker verification without using expanded i-vector.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Multi-channel speech enhancement using sparse coding on local time-frequency structures.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Denoising autoencoder and environment adaptation for distant-talking speech recognition with asynchronous speech recording.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
2013
Robust Log-Energy Estimation and its Dynamic Change Enhancement for In-car Speech Recognition.
IEEE Trans. Speech Audio Process., 2013
Improvement of distant-talking speaker identification using bottleneck features of DNN.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Hands-free speaker identification based on spectral subtraction using a multi-channel least mean square approach.
Proceedings of the IEEE International Conference on Acoustics, 2013
Joint sparse representation based cepstral-domain dereverberation for distant-talking speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Speech recognition using blind source separation and dereverberation method for mixed sound of speech and music.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Speaker identification using pseudo pitch synchronized phase information in noisy environments.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
IEEE Trans. Speech Audio Process., 2012
Dereverberation and denoising based on generalized spectral subtraction by multi-channel LMS algorithm using a small-scale microphone array.
EURASIP J. Adv. Signal Process., 2012
Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Dereverberation based on generalized spectral subtraction for distant-talking speaker recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
Distant-talking speaker identification using a reverberation model with various artificial room impulse responses.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
On the use of phase information-based joint factor analysis for speaker verification under channel mismatch condition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm.
IEICE Trans. Inf. Syst., 2011
Evaluation of Hands-Free Large Vocabulary Continuous Speech Recognition by Blind Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011
2010
IEICE Trans. Inf. Syst., 2010
Speaker identification by combining MFCC and phase information in noisy environments.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
High improvement of speaker identification and verification by combining MFCC and phase information.
Proceedings of the IEEE International Conference on Acoustics, 2009
2008
Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN.
IEICE Trans. Inf. Syst., 2008
Blind dereverberation based on CMN and spectral subtraction by multi-channel LMS algorithm.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
2007
Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM.
Speech Commun., 2007
Analysis of effect of compensation parameter estimation for CMN on speech/speaker recognition.
Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Robust Distant Speech Recognition by Combining Position-Dependent CMN with Conventional CMN.
Proceedings of the IEEE International Conference on Acoustics, 2007
2006
Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN.
EURASIP J. Adv. Signal Process., 2006
2005
Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Robust distant speaker recognition based on position dependent cepstral mean normalization.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004