Longbiao Wang

ORCID: 0000-0002-4005-5036

Affiliations:
  • Nagaoka University of Technology


According to our database, Longbiao Wang authored at least 221 papers between 2004 and 2024.

Bibliography

2024
Significance of relative phase features for shouted and normal speech classification.
EURASIP J. Audio Speech Music. Process., December, 2024

Text-to-Speech for Low-Resource Agglutinative Language With Morphology-Aware Language Model Pre-Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge.
CoRR, 2024

G^2SAM: Graph-Based Global Semantic Awareness Method for Multimodal Sarcasm Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Disordered speech recognition considering low resources and abnormal articulation.
Speech Commun., November, 2023

MPP-net: Multi-perspective perception network for dense video captioning.
Neurocomputing, October, 2023

Amer: A New Attribute-Missing Network Embedding Approach.
IEEE Trans. Cybern., 2023

Meta-Generalization for Domain-Invariant Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

CFDRN: A Cognition-Inspired Feature Decomposition and Recombination Network for Dysarthric Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

A CIF-Based Speech Segmentation Method for Streaming E2E ASR.
IEEE Signal Process. Lett., 2023

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations.
CoRR, 2023

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions.
CoRR, 2023

A Refining Underlying Information Framework for Monaural Speech Enhancement.
CoRR, 2023

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models.
CoRR, 2023

Learning Speech Representation From Contrastive Token-Acoustic Pretraining.
CoRR, 2023

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
CoRR, 2023

Rethinking the visual cues in audio-visual speaker extraction.
CoRR, 2023

Local and Global Context Modeling with Relation Matching Task for Dialog Act Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2023

Commonsense Knowledge Enhanced Sentiment Dependency Graph for Sarcasm Detection.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Multi-Modal Sarcasm Detection Based on Cross-Modal Composition of Inscribed Entity Relations.
Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, 2023

Enhancing Multimodal Alignment with Momentum Augmentation for Dense Video Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Stream Attention Based U-Net for L3DAS23 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Noise-Disentanglement Metric Learning for Robust Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Time-Domain Speech Enhancement Assisted by Multi-Resolution Frequency Encoder and Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Positional-Related Local-Global Dependency for Synthetic Speech Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

VF-Taco2: Towards Fast and Lightweight Synthesis for Autoregressive Models with Variation Autoencoder and Feature Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Tackling Modality Heterogeneity with Multi-View Calibration Network for Multimodal Sentiment Detection.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Augmenting Affective Dependency Graph via Iterative Incongruity Graph Learning for Sarcasm Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Toward Efficient Processing and Learning With Spikes: New Approaches for Multispike Learning.
IEEE Trans. Cybern., 2022

Whispered Speech Detection Using Glottal Flow-Based Features.
Symmetry, 2022

Learning affective representations based on magnitude and dynamic relative phase information for speech emotion recognition.
Speech Commun., 2022

Emotion Recognition With Multimodal Transformer Fusion Framework Based on Acoustic and Lexical Information.
IEEE Multim., 2022

Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition.
IEEE Multim., 2022

Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling.
EURASIP J. Audio Speech Music. Process., 2022

MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation.
CoRR, 2022

I4U System Description for NIST SRE'20 CTS Challenge.
CoRR, 2022

Monolingual Recognizers Fusion for Code-switching Speech Recognition.
CoRR, 2022

Talking Head Generation Driven by Speech-Related Facial Action Units and Audio- Based on Multimodal Representation Fusion.
CoRR, 2022

Deep Spectro-temporal Artifacts for Detecting Synthesized Speech.
Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Deep Multi-task Cascaded Acoustic Echo Cancellation and Noise Suppression.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources.
Proceedings of the Interspeech 2022, 2022

Self-Distillation Based on High-level Information Supervision for Compressing End-to-End ASR Model.
Proceedings of the Interspeech 2022, 2022

Hierarchical Tagger with Multi-task Learning for Cross-domain Slot Filling.
Proceedings of the Interspeech 2022, 2022

TopicKS: Topic-driven Knowledge Selection for Knowledge-grounded Dialogue Generation.
Proceedings of the Interspeech 2022, 2022

Language-specific Characteristic Assistance for Code-switching Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction.
Proceedings of the Interspeech 2022, 2022

Finer-grained Modeling units-based Meta-Learning for Low-resource Tibetan Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network.
Proceedings of the Interspeech 2022, 2022

VCSE: Time-Domain Visual-Contextual Speaker Extraction Network.
Proceedings of the Interspeech 2022, 2022

Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection.
Proceedings of the Interspeech 2022, 2022

Improve emotional speech synthesis quality by learning explicit and implicit representations with semi-supervised training.
Proceedings of the Interspeech 2022, 2022

Iterative Sound Source Localization for Unknown Number of Sources.
Proceedings of the Interspeech 2022, 2022

Dual-stream Speech Dereverberation Network Using Long-term and Short-term Cues.
Proceedings of the International Joint Conference on Neural Networks, 2022

An Improved Stimulus Reconstruction Method for EEG-Based Short-Time Auditory Attention Detection.
Proceedings of the Neural Information Processing - 29th International Conference, 2022

Improving Dialogue Generation via Proactively Querying Grounded Knowledge.
Proceedings of the IEEE International Conference on Acoustics, 2022

Learning Domain-Invariant Transformation for Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Joint and Adversarial Training with ASR for Expressive Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Stage Graph Representation Learning for Dialogue-Level Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Cache: Modeling Contribution-Aware Context Hierarchically for Long-Range Dialogue State Tracking.
Proceedings of the IEEE International Conference on Acoustics, 2022

Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2022

Using Multiple Reference Audios and Style Embedding Constraints for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

L-SpEx: Localized Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2022

Domain-Invariant Feature Learning for Cross Corpus Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Robust Environmental Sound Recognition With Sparse Key-Point Encoding and Efficient Multispike Learning.
IEEE Trans. Neural Networks Learn. Syst., 2021

gMatch: Knowledge base question answering via semantic matching.
Knowl. Based Syst., 2021

Replay attack detection using variable-frequency resolution phase and magnitude features.
Comput. Speech Lang., 2021

Using multiple reference audios and style embedding constraints for speech synthesis.
CoRR, 2021

Information Sieve: Content Leakage Reduction in End-to-End Prosody For Expressive Speech Synthesis.
CoRR, 2021

Exploring Deep Learning for Joint Audio-Visual Lip Biometrics.
CoRR, 2021

Exploiting Explicit and Inferred Implicit Personas for Multi-turn Dialogue Generation.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

A Sentiment Similarity-Oriented Attention Model with Multi-task Learning for Text-Based Emotion Recognition.
Proceedings of the MultiMedia Modeling - 27th International Conference, 2021

Dialogue Act Recognition using Branch Architecture with Attention Mechanism for Imbalanced Data.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Spoken Language Understanding with Sememe Knowledge as Domain Knowledge.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Order-aware Pairwise Intoxication Detection.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Joint Feature Enhancement and Speaker Recognition with Multi-Objective Task-Oriented Network.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Domain-Specific Multi-Agent Dialog Policy Learning in Multi-Domain Task-Oriented Scenarios.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Time-Frequency Representation Learning with Graph Convolutional Network for Dialogue-Level Speech Emotion Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

TacoLPCNet: Fast and Stable TTS by Conditioning LPCNet on Mel Spectrogram Predictions.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Metric Learning Based Feature Representation with Gated Fusion Model for Speech Emotion Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Information Sieve: Content Leakage Reduction in End-to-End Prosody Transfer for Expressive Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Modal Emotion Recognition Based On Deep Learning Of EEG And Audio Signals.
Proceedings of the International Joint Conference on Neural Networks, 2021

Simultaneous Progressive Filtering-Based Monaural Speech Enhancement.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

Speech Dereverberation Based on Scale-Aware Mean Square Error Loss.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

CONSK-GCN: Conversational Semantic- and Knowledge-Oriented Graph Convolutional Network for Multimodal Emotion Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Meta-Learning for Cross-Channel Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021

Replay-Attack Detection Using Features With Adaptive Spectro-Temporal Resolution.
Proceedings of the IEEE International Conference on Acoustics, 2021

Multimodal Emotion Recognition with Capsule Graph Convolutional Based Representation Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2021

Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network.
Proceedings of the IEEE International Conference on Acoustics, 2021

Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021

Multi-Stage Speaker Extraction with Utterance and Frame-Level Reference Signals.
Proceedings of the IEEE International Conference on Acoustics, 2021

Domain-Adversarial Autoencoder with Attention Based Feature Level Fusion for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Talking Head Generation with Audio and Speech Related Facial Action Units.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

DeepLip: A Benchmark for Deep Learning-Based Audio-Visual Lip Biometrics.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Learning Language and Speaker Information for Code-Switch Speech Synthesis with Limited Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Spectrograms Fusion-based End-to-end Robust Automatic Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning.
CoRR, 2020

Multiple Knowledge Syncretic Transformer for Natural Dialogue Generation.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Speaker-Aware Speech Emotion Recognition by Fusing Amplitude and Phase Information.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Dynamic Margin Softmax Loss for Speaker Verification.
Proceedings of the Interspeech 2020, 2020

EEG-Based Short-Time Auditory Attention Detection Using Multi-Task Deep Learning.
Proceedings of the Interspeech 2020, 2020

Adversarial Separation Network for Speaker Recognition.
Proceedings of the Interspeech 2020, 2020

ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification.
Proceedings of the Interspeech 2020, 2020

Singing Voice Extraction with Attention-Based Spectrograms Fusion.
Proceedings of the Interspeech 2020, 2020

Temporal Attention Convolutional Network for Speech Emotion Recognition with Latent Representation.
Proceedings of the Interspeech 2020, 2020

Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription.
Proceedings of the Interspeech 2020, 2020

SpEx+: A Complete Time Domain Speaker Extraction Network.
Proceedings of the Interspeech 2020, 2020

Deep Discriminative Embedding with Ranked Weight for Speaker Verification.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Hierarchical Interactive Matching Network for Multi-turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Adversarial Shared-Private Attention Network for Joint Slot Filling and Intent Detection.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Investigation of Effectively Synthesizing Code-Switched Speech Using Highly Imbalanced Mix-Lingual Data.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Amplitude Consistent Enhancement for Speech Dereverberation.
Proceedings of the ICCAI '20: 2020 6th International Conference on Computing and Artificial Intelligence, 2020

A Hierarchical Model for Dialog Act Recognition Considering Acoustic and Lexical Context Information.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speech Emotion Recognition with Local-Global Aware Deep Representation Learning.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

End-to-End Articulatory Modeling for Dysarthric Articulatory Attribute Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Pitch-aware Speaker Extraction Serial Network.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Replay attack detection with auditory filter-based relative phase features.
EURASIP J. Audio Speech Music. Process., 2019

Robust Environmental Sound Recognition with Sparse Key-point Encoding and Efficient Multi-spike Learning.
CoRR, 2019

Static-Dynamically Attentive Variational Network for Dialogue Generation.
Aust. J. Intell. Inf. Process. Syst., 2019

Replay Attack Detection Using Linear Prediction Analysis-Based Relative Phase Features.
IEEE Access, 2019

Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine.
IEEE Access, 2019

An integrated system for robust gender classification with convolutional restricted Boltzmann machine and spiking neural network.
Proceedings of the IEEE Symposium Series on Computational Intelligence, 2019

A Matching Pursuit Approach for Image Classification with Spiking Neural Networks.
Proceedings of the IEEE Symposium Series on Computational Intelligence, 2019

CNN-BLSTM Based Question Detection from Dialogs Considering Phase and Context Information.
Proceedings of the Interspeech 2019, 2019

Environment-Dependent Attention-Driven Recurrent Convolutional Neural Network for Robust Speech Enhancement.
Proceedings of the Interspeech 2019, 2019

A Spiking Neural Network with Distributed Keypoint Encoding for Robust Sound Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2019

Time-Frequency Deep Representation Learning for Speech Emotion Recognition Integrating Self-attention.
Proceedings of the Neural Information Processing - 26th International Conference, 2019

A Fast Convolutional Self-attention Based Speech Dereverberation Method for Robust Speech Recognition.
Proceedings of the Neural Information Processing - 26th International Conference, 2019

NVSRN: A Neural Variational Scaling Reasoning Network for Initiative Response Generation.
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019

Replay Attack Detection Using Magnitude and Phase Information with Attention-based Adaptive Filters.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Multi-spike Approach for Robust Sound Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Robust Sound Event Classification with Local Time-Frequency Information and Convolutional Neural Networks.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2019: Text and Time Series, 2019

A Semi-Supervised Stable Variational Network for Promoting Replier-Consistency in Dialogue Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Effective Training End-to-End ASR systems for Low-resource Lhasa Dialect of Tibetan Language.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Phase and reverberation aware DNN for distant-talking speech enhancement.
Multim. Tools Appl., 2018

Replay Attacks Detection Using Phase and Magnitude Features with Various Frequency Resolutions.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Distant-talking Speech Recognition Based on Multi-objective Learning using Phase and Magnitude-based Feature.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Pitch Synchronized Relative Phase with Peak Error Detection For Noise-robust Speaker Recognition.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Revealing Spatiotemporal Brain Dynamics of Speech Production Based on EEG and Eye Movement.
Proceedings of the Interspeech 2018, 2018

Multiple Phase Information Combination for Replay Attacks Detection.
Proceedings of the Interspeech 2018, 2018

Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network.
Proceedings of the Interspeech 2018, 2018

Integrative Network Embedding via Deep Joint Reconstruction.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Convolutional Neural Network with Spectrogram and Perceptual Features for Speech Emotion Recognition.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

Efficient Multi-spike Learning with Tempotron-Like LTP and PSD-Like LTD.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

A Feature Fusion Method Based on Extreme Learning Machine for Speech Emotion Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Gender-Aware CNN-BLSTM for Speech Emotion Recognition.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

Interaction-Aware Topic Model for Microblog Conversations through Network Embedding and User Attention.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Implicit Discourse Relation Recognition using Neural Tensor Network with Interactive Attention and Sparse Learning.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017
Spoofing Speech Detection Using Modified Relative Phase Information.
IEEE J. Sel. Top. Signal Process., 2017

Noise robust voice activity detection using joint phase and magnitude based feature enhancement.
J. Ambient Intell. Humaniz. Comput., 2017

Prediction of F0 Based on Articulatory Features Using DNN.
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017

Global Monitoring of Dynamic Functional Interactions in the Brain During Chinese Verbs Perception.
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017

Speech Emotion Recognition Considering Local Dynamic Features.
Proceedings of the Studies on Speech Production - 11th International Seminar, 2017

Phonemic Restoration Based on the Movement Continuity of Articulation.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Neuronal Classifier for both Rate and Timing-Based Spike Patterns.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Phase aware deep neural network for noise robust voice activity detection.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Pseudo-pitch-synchronized phase information extraction and its application for robust speaker recognition.
Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

2016
Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization.
J. Signal Process. Syst., 2016

Guest Editorial: Immersive Audio/Visual Systems.
Multim. Tools Appl., 2016

Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition.
Multim. Tools Appl., 2016

Distant-talking accent recognition by combining GMM and DNN.
Multim. Tools Appl., 2016

Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting.
CoRR, 2016

Multi-channel feature adaptation for robust speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Exploring tonal information for Lhasa dialect acoustic modeling.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification.
Proceedings of the Interspeech 2016, 2016

Face recognition with local contourlet combined patterns.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Environment-dependent denoising autoencoder for distant-talking speech recognition.
EURASIP J. Adv. Signal Process., 2015

Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification.
EURASIP J. Audio Speech Music. Process., 2015

Relative phase information for detecting human speech and spoofed speech.
Proceedings of the INTERSPEECH 2015, 2015

Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

A spectrum smoothing method for speaker verification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Speech selection and environmental adaptation for asynchronous speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014
Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation.
EURASIP J. Audio Speech Music. Process., 2014

PLDA in the I-Supervector Space for Text-Independent Speaker Verification.
EURASIP J. Audio Speech Music. Process., 2014

Speaker Identification by Combining Various Vocal Tract and Vocal Source Features.
Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Distant-talking speech recognition using multi-channel LMS and multiple-step linear prediction.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speech enhancement via low-rank matrix decomposition and image based masking.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Single-sided approach to discriminative PLDA training for text-independent speaker verification without using expanded i-vector.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Multi-channel speech enhancement using sparse coding on local time-frequency structures.
Proceedings of the INTERSPEECH 2014, 2014

Log-domain polynomial filters for illumination-robust face recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Denoising autoencoder and environment adaptation for distant-talking speech recognition with asynchronous speech recording.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
Robust Log-Energy Estimation and its Dynamic Change Enhancement for In-car Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Improvement of distant-talking speaker identification using bottleneck features of DNN.
Proceedings of the INTERSPEECH 2013, 2013

Hands-free speaker identification based on spectral subtraction using a multi-channel least mean square approach.
Proceedings of the IEEE International Conference on Acoustics, 2013

Joint sparse representation based cepstral-domain dereverberation for distant-talking speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Frequency-domain dereverberation on speech signal using surround retinex.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Sparse coding for sound event classification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Local consistency preserved coupled mappings for low-resolution face recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Speech recognition using blind source separation and dereverberation method for mixed sound of speech and music.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Speaker identification using pseudo pitch synchronized phase information in noisy environments.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Speaker Identification and Verification by Combining MFCC and Phase Information.
IEEE Trans. Speech Audio Process., 2012

Dereverberation and denoising based on generalized spectral subtraction by multi-channel LMS algorithm using a small-scale microphone array.
EURASIP J. Adv. Signal Process., 2012

Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment.
Proceedings of the INTERSPEECH 2012, 2012

Dereverberation based on generalized spectral subtraction for distant-talking speaker recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Distant-talking speaker identification using a reverberation model with various artificial room impulse responses.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

On the use of phase information-based joint factor analysis for speaker verification under channel mismatch condition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm.
IEICE Trans. Inf. Syst., 2011

Evaluation of Hands-Free Large Vocabulary Continuous Speech Recognition by Blind Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

2010
Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions.
IEICE Trans. Inf. Syst., 2010

Speaker identification by combining MFCC and phase information in noisy environments.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
High improvement of speaker identification and verification by combining MFCC and phase information.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN.
IEICE Trans. Inf. Syst., 2008

Blind dereverberation based on CMN and spectral subtraction by multi-channel LMS algorithm.
Proceedings of the INTERSPEECH 2008, 2008

2007
Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM.
Speech Commun., 2007

Analysis of effect of compensation parameter estimation for CMN on speech/speaker recognition.
Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Speaker recognition by combining MFCC and phase information.
Proceedings of the INTERSPEECH 2007, 2007

Robust Distant Speech Recognition by Combining Position-Dependent CMN with Conventional CMN.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN.
EURASIP J. Adv. Signal Process., 2006

2005
Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique.
Proceedings of the INTERSPEECH 2005, 2005

Robust distant speaker recognition based on position dependent cepstral mean normalization.
Proceedings of the INTERSPEECH 2005, 2005

2004
Robust distant speech recognition based on position dependent CMN.
Proceedings of the INTERSPEECH 2004, 2004

