Xugang Lu

Orcid: 0000-0001-7075-448X

According to our database, Xugang Lu authored at least 145 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Non-Intrusive Neural Quality Assessment Model for Surface Electromyography Signals.
CoRR, 2024

2023
TMS: Temporal multi-scale in time-delay neural network for speaker verification.
Appl. Intell., November, 2023

Self-supervised learning based domain regularization for mask-wearing speaker verification.
Speech Commun., July, 2023

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions.
CoRR, 2023

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition.
CoRR, 2023

Neural domain alignment for spoken language recognition based on optimal transport.
CoRR, 2023

Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR.
CoRR, 2023

Contributions of Jitter and Shimmer in the Voice for Fake Audio Detection.
IEEE Access, 2023

Optimal Transport with a Diversified Memory Bank for Cross-Domain Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Data-driven Non-uniform Filterbanks Based on F-ratio for Machine Anomalous Sound Detection.
Proceedings of the 31st European Signal Processing Conference, 2023

Cross-Modal Alignment With Optimal Transport For CTC-Based ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement.
IEEE Trans. Artif. Intell., 2022

Partial Coupling of Optimal Transport for Spoken Language Identification.
CoRR, 2022

TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding.
CoRR, 2022

An Investigation of Feature Difference Between Child and Adult Voices Using Line Spectral Pairs.
Proceedings of the 2022 5th International Conference on Signal Processing and Machine Learning, 2022

Pronunciation-Aware Unique Character Encoding for RNN Transducer-Based Mandarin Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Transducer-based language embedding for spoken language identification.
Proceedings of the Interspeech 2022, 2022

Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection.
Proceedings of the Interspeech 2022, 2022

Perceptual Contrast Stretching on Target Feature for Speech Enhancement.
Proceedings of the Interspeech 2022, 2022

CS-REP: Making Speaker Verification Networks Embracing Re-Parameterization.
Proceedings of the IEEE International Conference on Acoustics, 2022

Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Coupling a Generative Model With a Discriminative Learning Framework for Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Integrating a joint Bayesian generative model in a discriminative learning framework for speaker verification.
CoRR, 2021

Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

EMA2S: An End-to-End Multimodal Articulatory-to-Speech System.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Unsupervised Neural Adaptation Model Based on Optimal Transport for Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Study of Incorporating Articulatory Movement Information in Speech Enhancement.
Proceedings of the 29th European Signal Processing Conference, 2021

Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Automatic Speech Recognition.
Proceedings of the Speech-to-Speech Translation, 2020

Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement.
IEEE Signal Process. Lett., 2020

Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement.
CoRR, 2020

Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders.
CoRR, 2020

Compensation on x-vector for Short Utterance Spoken Language Identification.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Joint Training End-to-End Speech Recognition Systems with Speaker Attributes.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Investigation of NICT Submission for Short-Duration Speaker Verification Challenge 2020.
Proceedings of the Interspeech 2020, 2020

Incorporating Broad Phonetic Information for Speech Enhancement.
Proceedings of the Interspeech 2020, 2020

Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Robust Unsupervised Neural Machine Translation with Adversarial Denoising Training.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
Deep progressive multi-scale attention for acoustic event classification.
CoRR, 2019

Improving the Intelligibility of Electric and Acoustic Stimulation Speech Using Fully Convolutional Networks Based Speech Enhancement.
CoRR, 2019

Optimal Classifier Parameter Status Selection Based on Bayes Boundary-ness for Multi-ProtoType and Multi-Layer Perceptron Classifiers.
Proceedings of the Integrated Uncertainty in Knowledge Modelling and Decision Making, 2019

Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric.
Proceedings of the Interspeech 2019, 2019

Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection.
Proceedings of the Interspeech 2019, 2019

Incorporating Symbolic Sequential Modeling for Speech Enhancement.
Proceedings of the Interspeech 2019, 2019

Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation.
Proceedings of the Interspeech 2019, 2019

Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese.
Proceedings of the Interspeech 2019, 2019

End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Interactive Learning of Teacher-student Model for Short Utterance Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Study of articulators' contribution and compensation during speech by articulatory speech recognition.
Multim. Tools Appl., 2018

Speech Dereverberation Based on Integrated Deep and Ensemble Learning.
CoRR, 2018

Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification.
Proceedings of the Interspeech 2018, 2018

Temporal Attentive Pooling for Acoustic Event Detection.
Proceedings of the Interspeech 2018, 2018

Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks.
Proceedings of the Interspeech 2018, 2018

Speech Dereverberation Based on Integrated Deep and Ensemble Learning Algorithm.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation.
IEEE Trans. Biomed. Eng., 2017

Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection.
J. Inf. Hiding Multim. Signal Process., 2017

Regularization of neural network model with distance metric learning for i-vector based spoken language identification.
Comput. Speech Lang., 2017

End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks.
CoRR, 2017

Multi-Metrics Learning for Speech Enhancement.
CoRR, 2017

Complex spectrogram enhancement by convolutional neural network with multi-metrics learning.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Conditional Generative Adversarial Nets Classifier for Spoken Language Identification.
Proceedings of the Interspeech 2017, 2017

Semi-supervised ensemble DNN acoustic model training.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Minimum Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Incremental training and constructing the very deep convolutional residual network acoustic models.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Raw waveform-based speech enhancement by fully convolutional networks.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments.
J. Signal Process. Syst., 2016

Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization.
IEEE Signal Process. Lett., 2016

Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription.
Speech Commun., 2016

Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers.
IEICE Trans. Inf. Syst., 2016

Automatic acoustic segmentation in N-best list rescoring for lecture speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Comparison of regularization constraints in deep neural network based speaker adaptation.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A pseudo-task design in multi-task learning deep neural network for speaker recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Incorporating local environment information with ensemble neural networks to robust automatic speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

F0 Contour Analysis Based on Empirical Mode Decomposition for DNN Acoustic Modeling in Mandarin Speech Recognition.
Proceedings of the Interspeech 2016, 2016

Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification.
Proceedings of the Interspeech 2016, 2016

Maximum a posteriori Based Decoding for CTC Acoustic Models.
Proceedings of the Interspeech 2016, 2016

Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks.
Proceedings of the Interspeech 2016, 2016

SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement.
Proceedings of the Interspeech 2016, 2016

Local fisher discriminant analysis for spoken language identification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Bottleneck linear transformation network adaptation for speaker adaptive training-based hybrid DNN-HMM speech recognizer.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Ensemble environment modeling using affine transform group.
Speech Commun., 2015

Sparse representation with temporal max-smoothing for acoustic event detection.
Proceedings of the INTERSPEECH 2015, 2015

Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation.
Proceedings of the INTERSPEECH 2015, 2015

Speaker adaptive training for deep neural networks embedding linear transformation networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014
Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation.
Comput. Speech Lang., 2014

Signal to noise ratio estimation based on an optimal design of subband voice activity detection.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Spectral patch based sparse coding for acoustic event detection.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Mandarin speech recognition using convolution neural network with augmented tone features.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Ensemble modeling of denoising autoencoder for speech spectrum restoration.
Proceedings of the INTERSPEECH 2014, 2014

Speaker Adaptive Training using Deep Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Sparse representation based on a bag of spectral exemplars for acoustic event detection.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speech enhancement using segmental nonnegative matrix factorization.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Controlling Tradeoff Between Approximation Accuracy and Complexity of a Smooth Function in a Reproducing Kernel Hilbert Space for Noise Reduction.
IEEE Trans. Signal Process., 2013

The NICT ASR system for IWSLT 2013.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

Speech enhancement based on deep denoising autoencoder.
Proceedings of the INTERSPEECH 2013, 2013

Speech spectrum restoration based on conditional restricted boltzmann machine.
Proceedings of the INTERSPEECH 2013, 2013

Automatic localization of a language-independent sub-network on deep neural networks trained by multi-lingual speech.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
The NICT ASR system for IWSLT2012.
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Factored recurrent neural network language model in TED lecture transcription.
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Unified denoising and dereverberation method used in restoration of MTF-based power envelope.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Controlling the tradeoff property in a regularization framework for noise reduction.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Acoustic space partition based on broad phonetic class for ensemble acoustic modeling.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Speech restoration based on deep learning autoencoder with layer-wised pretraining.
Proceedings of the INTERSPEECH 2012, 2012

Noise estimation using a constrained sequential HMM in log-spectral domain.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Factored Language Model based on Recurrent Neural Network.
Proceedings of the COLING 2012, 2012

2011
Temporal modulation normalization for robust speech feature extraction and recognition.
Multim. Tools Appl., 2011

Sub-band temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments.
Comput. Speech Lang., 2011

Voice Activity Detection in MTF-Based Power Envelope Restoration.
Proceedings of the INTERSPEECH 2011, 2011

Adaptive Regularization Framework for Robust Voice Activity Detection.
Proceedings of the INTERSPEECH 2011, 2011

2010
Vowel Production Manifold: Intrinsic Factor Analysis of Vowel Articulation.
IEEE Trans. Speech Audio Process., 2010

Temporal contrast normalization and edge-preserved smoothing of temporal modulation structures of speech for robust speech recognition.
Speech Commun., 2010

Speech enhancement as a functional approximation and generalization.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Voice activity detection in a regularized reproducing kernel hilbert space.
Proceedings of the INTERSPEECH 2010, 2010

2009
Speech Enhancement Based on Noise Eigenspace Projection.
IEICE Trans. Inf. Syst., 2009

Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments.
Proceedings of the 3rd International Universal Communication Symposium, 2009

Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments.
Proceedings of the INTERSPEECH 2009, 2009

Temporal contrast normalization and edge-preserved smoothing on temporal modulation structure for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification.
Speech Commun., 2008

Normalization on Temporal Modulation Transfer Function for Robust Speech Recognition.
Proceedings of the ISUC 2008, 2008

Noise Reduction Based Random Matrix Theory.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics.
Proceedings of the INTERSPEECH 2008, 2008

A model based investigation of activation patterns of the tongue muscles for vowel production.
Proceedings of the INTERSPEECH 2008, 2008

2007
A Model-Based Learning Process for Modeling Coarticulation of Human Speech.
IEICE Trans. Inf. Syst., 2007

Dimension reduction for speaker identification based on mutual information.
Proceedings of the INTERSPEECH 2007, 2007

Physiological Feature Extraction for Text Independent Speaker Identification using Non-Uniform Subband Processing.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
A Robust Voice Activity Detection Based on Noise Eigenspace Projection.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Auditory Contrast Spectrum for Robust Speech Recognition.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A simulation based parameter optimization for a coarticulation model.
Proceedings of the INTERSPEECH 2006, 2006

A robust feature extraction based on the MTF concept for speech recognition in reverberant environment.
Proceedings of the INTERSPEECH 2006, 2006

2005
A noise reduction system in arbitrary noise environments and its applications to speech enhancement and speech recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2000
Dominant subspace analysis for auditory spectrum.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
Nonlinear processing in auditory system.
Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP'99), 1999

A New Cochlear Model Based on Adaptive Gain Mechanism.
Proceedings of the Foundations and Tools for Neural Modeling, 1999

Integrating spatial and temporal mechanisms in auditory neural fiber's computational model.
Proceedings of the International Joint Conference Neural Networks, 1999

