Lei Xie
Affiliations:
- Northwestern Polytechnical University, School of Computer Science, Xi'an, China
- The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Hong Kong (2006 - 2007)
- City University of Hong Kong, School of Creative Media, Hong Kong (2004 - 2006)
- Northwestern Polytechnical University, Xi'an, China (PhD 2004)
- Vrije Universiteit Brussel, Department of Electronics and Information Processing, Belgium (2001 - 2002)
According to our database, Lei Xie authored at least 181 papers between 2008 and 2022.
Bibliography
2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Improving data augmentation for low resource speech-to-text translation with diverse paraphrasing.
Neural Networks, 2022
Two-stage streaming keyword detection and localization with multi-scale depthwise temporal convolution.
Neural Networks, 2022
Neural Networks, 2022
CoRR, 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
CoRR, 2022
2021
LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder With Exact Lattice Generation.
IEEE Signal Process. Lett., 2021
Neural Networks, 2021
Effective and direct control of neural TTS prosody by removing interactions between different attributes.
Neural Networks, 2021
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios.
CoRR, 2021
CoRR, 2021
CoRR, 2021
CoRR, 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis.
CoRR, 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder.
CoRR, 2021
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Learn2Sing: Target Speaker Singing Voice Synthesis by Learning from a Singing Teacher.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Fine-Grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Enriching Source Style Transfer in Recognition-Synthesis Based Non-Parallel Voice Conversion.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-End Neural TTS.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Duality Temporal-Channel-Frequency Attention Enhanced Speaker Representation Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
IEEE Trans. Emerg. Top. Comput. Intell., 2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020
Adversarial Feature Learning and Unsupervised Clustering Based Speech Synthesis for Found Data With Acoustic and Textual Noise.
IEEE Signal Process. Lett., 2020
Neural Networks, 2020
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training.
CoRR, 2020
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Proceedings of the Interspeech 2020, 2020
Proceedings of the Interspeech 2020, 2020
Proceedings of the Interspeech 2020, 2020
Proceedings of the Interspeech 2020, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
IEEE ACM Trans. Audio Speech Lang. Process., 2019
IEEE Signal Process. Lett., 2019
Pre-Alignment Guided Attention for Improving Training Efficiency and Model Stability in End-to-End Speech Synthesis.
IEEE Access, 2019
Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context.
IEEE Access, 2019
Proceedings of the Interspeech 2019, 2019
Proceedings of the Interspeech 2019, 2019
Proceedings of the Interspeech 2019, 2019
Unsupervised Adaptation with Adversarial Dropout Regularization for Robust Speech Recognition.
Proceedings of the Interspeech 2019, 2019
Proceedings of the International Conference on Multimodal Interaction, 2019
Enhancing Hybrid Self-attention Structure with Relative-position-aware Bias for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Adversarial Examples for Improving End-to-end Attention-based Small-footprint Keyword Spotting.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System.
Proceedings of the IEEE International Conference on Acoustics, 2019
Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Controlling Emotion Strength with Relative Attribute for End-to-End Speech Synthesis.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Improving Mandarin End-to-End Speech Synthesis by Self-Attention and Learnable Gaussian Bias.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Multiple fixed beamformers with a spacial Wiener-form postfilter for far-field speech recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
J. Signal Process. Syst., 2018
Signal Process., 2018
Unsupervised measure of Chinese lexical semantic similarity using correlated graph model for news story segmentation.
Neurocomputing, 2018
A Refined Query-by-Example Approach to Spoken-Term-Detection on ESL learners' Speech.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search.
Proceedings of the Interspeech 2018, 2018
Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.
Proceedings of the Interspeech 2018, 2018
Proceedings of the Interspeech 2018, 2018
Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition.
Proceedings of the Interspeech 2018, 2018
Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2018
2017
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
IEEE J. Sel. Top. Signal Process., 2017
J. Ambient Intell. Humaniz. Comput., 2017
A hybrid neural network hidden Markov model approach for automatic story segmentation.
J. Ambient Intell. Humaniz. Comput., 2017
Neurocomputing, 2017
Frontiers Comput. Sci., 2017
Frontiers Comput. Sci., 2017
Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion.
Proceedings of the Interspeech 2017, 2017
Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Real-time tracking-by-learning with high-order regularization fusion for big video abstraction.
Signal Process., 2016
Multim. Tools Appl., 2016
Deformable object tracking with spatiotemporal segmentation in big vision surveillance.
Neurocomputing, 2016
An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016
Investigating neural network based query-by-example keyword spotting approach for personalized wake-up word detection in Mandarin Chinese.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information.
Proceedings of the Interspeech 2016, 2016
Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Proceedings of the Interspeech 2016, 2016
Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis.
Proceedings of the Interspeech 2016, 2016
Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection.
Proceedings of the Interspeech 2016, 2016
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016
Approximate search of audio queries by using DTW with phone time boundary and data augmentation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
On the use of I-vectors and average voice model for voice conversion without parallel data.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
IEEE Trans. Multim., 2015
Multiple pedestrian tracking based on couple-states Markov chain with semantic topic learning for video surveillance.
Soft Comput., 2015
Soft Comput., 2015
Soft Comput., 2015
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015
Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.
Proceedings of the INTERSPEECH 2015, 2015
Proceedings of the INTERSPEECH 2015, 2015
Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study.
Proceedings of the INTERSPEECH 2015, 2015
Language independent query-by-example spoken term detection using N-best phone sequences and partial matching.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015
2014
Multim. Tools Appl., 2014
Multimodal joint information processing in human machine interaction: recent advances.
Multim. Tools Appl., 2014
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Experimental study on dereverberation and noise reduction for distant speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection.
Proceedings of the INTERSPEECH 2014, 2014
Proceedings of the INTERSPEECH 2014, 2014
Stereo acoustic echo suppression using widely linear filtering in the frequency domain.
Proceedings of the INTERSPEECH 2014, 2014
Proceedings of the INTERSPEECH 2014, 2014
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014
Unsupervised broadcast news story segmentation using distance dependent Chinese restaurant processes.
Proceedings of the IEEE International Conference on Acoustics, 2014
Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Measuring semantic similarity by contextual word connections in Chinese news story segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Numerical calculation of the head-related transfer functions with Chinese dummy head.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
2012
IEEE Trans. Speech Audio Process., 2012
Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features.
IEICE Trans. Inf. Syst., 2012
Proceedings of the INTERSPEECH 2012, 2012
Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis.
Proceedings of the INTERSPEECH 2012, 2012
Proceedings of the INTERSPEECH 2012, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
Pitch-density-based features and an SVM binary tree approach for multi-class audio classification in broadcast news.
Multim. Syst., 2011
On the effectiveness of subwords for lexical cohesion based story segmentation of Chinese broadcast news.
Inf. Sci., 2011
Proceedings of the INTERSPEECH 2011, 2011
2010
Inf. Sci., 2010
Speech and Auditory Interfaces for Ubiquitous, Immersive and Personalized Applications.
Proceedings of the Symposia and Workshops on Ubiquitous, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the INTERSPEECH 2010, 2010
Proceedings of the INTERSPEECH 2010, 2010
2009
Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models.
J. Vis. Lang. Comput., 2009
Noise robust features for speech/music discrimination in real-time telecommunication.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009
A Subword Normalized Cut Approach to Automatic Story Segmentation of Chinese Broadcast News.
Proceedings of the Information Retrieval Technology, 2009
Proceedings of the Computer Vision, 2009
2008
Proceedings of the Information Retrieval Technology, 2008