Lei Xie

According to our database, Lei Xie authored at least 108 papers between 2008 and 2019.

Bibliography

2019
Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2019

Region Proposal Network Based Small-Footprint Keyword Spotting.
IEEE Signal Process. Lett., 2019

Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context.
IEEE Access, 2019

Adversarial Examples for Improving End-to-end Attention-based Small-footprint Keyword Spotting.
Proceedings of the IEEE International Conference on Acoustics, 2019

Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

An Attention-based Neural Network Approach for Single Channel Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
A Bidirectional LSTM Approach with Word Embeddings for Sentence Boundary Detection.
Signal Processing Systems, 2018

Learning distributed sentence representations for story segmentation.
Signal Processing, 2018

Unsupervised measure of Chinese lexical semantic similarity using correlated graph model for news story segmentation.
Neurocomputing, 2018

Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition.
CoRR, 2018

Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search.
CoRR, 2018

Domain Adversarial Training for Accented Speech Recognition.
CoRR, 2018

Training Augmentation with Adversarial Examples for Robust Speech Recognition.
CoRR, 2018

Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition.
CoRR, 2018

Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search.
Proceedings of the Interspeech 2018, 2018

Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Training Augmentation with Adversarial Examples for Robust Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Domain Adversarial Training for Accented Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Self-validated Story Segmentation of Chinese Broadcast News.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2018

2017
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2017

Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection.
J. Sel. Topics Signal Processing, 2017

A hybrid neural network hidden Markov model approach for automatic story segmentation.
J. Ambient Intelligence and Humanized Computing, 2017

An unsupervised deep domain adaptation approach for robust speech recognition.
Neurocomputing, 2017

Sound image externalization for headphone based real-time 3D audio.
Frontiers Comput. Sci., 2017

Introduction to special section on advances of orange technologies.
Frontiers Comput. Sci., 2017

Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework.
CoRR, 2017

Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion.
Proceedings of the Interspeech 2017, 2017

Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Multilingual bottle-neck feature learning from untranscribed speech.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Topic embedding of sentences for story segmentation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

An end-to-end neural network approach to story segmentation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A segmental DNN/i-vector approach for digit-prompted speaker verification.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Real-time tracking-by-learning with high-order regularization fusion for big video abstraction.
Signal Processing, 2016

A deep bidirectional LSTM approach for video-realistic talking head.
Multimedia Tools Appl., 2016

Deformable object tracking with spatiotemporal segmentation in big vision surveillance.
Neurocomputing, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

The NNI Vietnamese Speech Recognition System for MediaEval 2016.
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Investigating neural network based query-by-example keyword spotting approach for personalized wake-up word detection in Mandarin Chinese.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information.
Proceedings of the Interspeech 2016, 2016

A DNN-HMM Approach to Story Segmentation.
Proceedings of the Interspeech 2016, 2016

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Proceedings of the Interspeech 2016, 2016

Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis.
Proceedings of the Interspeech 2016, 2016

Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection.
Proceedings of the Interspeech 2016, 2016

Deep neural network derived bottleneck features for accurate audio classification.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

Approximate search of audio queries by using DTW with phone time boundary and data augmentation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exemplar-based sparse representation of timbre and prosody for voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

On the use of I-vectors and average voice model for voice conversion without parallel data.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Study on near-field crosstalk cancellation based on least square algorithm.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Tennis Ball Tracking Using a Two-Layered Data Association Approach.
IEEE Trans. Multimedia, 2015

Multiple pedestrian tracking based on couple-states Markov chain with semantic topic learning for video surveillance.
Soft Comput., 2015

NestDE: generic parameters tuning for automatic story segmentation.
Soft Comput., 2015

Topic segmentation on spoken documents using self-validated acoustic cuts.
Soft Comput., 2015

A Waveform Representation Framework for High-quality Statistical Parametric Speech Synthesis.
CoRR, 2015

Online Object Tracking Based on CNN with Metropolis-Hasting Re-Sampling.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

The NNI Query-by-Example System for MediaEval 2015.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.
Proceedings of the INTERSPEECH 2015, 2015

An alternating optimization approach for phase retrieval.
Proceedings of the INTERSPEECH 2015, 2015

Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study.
Proceedings of the INTERSPEECH 2015, 2015

Language independent query-by-example spoken term detection using N-best phone sequences and partial matching.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A density peak clustering approach to unsupervised acoustic subword units discovery.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Fundamental frequency modeling using wavelets for emotional voice conversion.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
A statistical parametric approach to video-realistic text-driven talking avatar.
Multimedia Tools Appl., 2014

Multimodal joint information processing in human machine interaction: recent advances.
Multimedia Tools Appl., 2014

The NNI Query-by-Example System for MediaEval 2014.
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

A hybrid virtual bass system with improved phase vocoder and high efficiency.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Experimental study on dereverberation and noise reduction for distant speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection.
Proceedings of the INTERSPEECH 2014, 2014

A deep neural network approach for sentence boundary detection in broadcast news.
Proceedings of the INTERSPEECH 2014, 2014

Stereo acoustic echo suppression using widely linear filtering in the frequency domain.
Proceedings of the INTERSPEECH 2014, 2014

Speech-driven head motion synthesis using neural networks.
Proceedings of the INTERSPEECH 2014, 2014

An ensemble of deep neural networks for object tracking.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Unsupervised broadcast news story segmentation using distance dependent Chinese restaurant processes.
Proceedings of the IEEE International Conference on Acoustics, 2014

Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Learning optimal features for music transcription.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Multimodal continuous affect recognition based on LSTM and multiple kernel learning.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A two layered data association approach for ball tracking.
Proceedings of the IEEE International Conference on Acoustics, 2013

A tighter lower bound estimate for dynamic time warping.
Proceedings of the IEEE International Conference on Acoustics, 2013

Measuring semantic similarity by contextual word connections in Chinese news story segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Broadcast news story segmentation using latent topics on data manifold.
Proceedings of the IEEE International Conference on Acoustics, 2013

Numerical calculation of the head-related transfer functions with Chinese dummy head.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News.
IEEE Trans. Audio, Speech & Language Processing, 2012

Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features.
IEICE Transactions, 2012

Mask Estimation and Refinement for MFT-based Robust Speaker Verification.
Proceedings of the INTERSPEECH 2012, 2012

Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis.
Proceedings of the INTERSPEECH 2012, 2012

Lexical Story Co-Segmentation of Chinese Broadcast News.
Proceedings of the INTERSPEECH 2012, 2012

Acoustic TextTiling for story segmentation of spoken documents.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Detection of ball hits in a tennis game using audio and visual information.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Pitch-density-based features and an SVM binary tree approach for multi-class audio classification in broadcast news.
Multimedia Syst., 2011

Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation.
Proceedings of the INTERSPEECH 2011, 2011

2010
Cascade Markov random fields for stroke extraction of Chinese characters.
Inf. Sci., 2010

Speech and Auditory Interfaces for Ubiquitous, Immersive and Personalized Applications.
Proceedings of the Symposia and Workshops on Ubiquitous, 2010

Multi-modal feature integration for story boundary detection in broadcast news.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Dual-microphone noise reduction based on semi-blind DUET.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phoneme lattice based TextTiling towards multilingual story segmentation.
Proceedings of the INTERSPEECH 2010, 2010

Maximum lexical cohesion for fine-grained news story segmentation.
Proceedings of the INTERSPEECH 2010, 2010

2009
Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models.
J. Vis. Lang. Comput., 2009

Noise robust features for speech/music discrimination in real-time telecommunication.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

A Subword Normalized Cut Approach to Automatic Story Segmentation of Chinese Broadcast News.
Proceedings of the Information Retrieval Technology, 2009

Multicue Graph Mincut for Image Segmentation.
Proceedings of the Computer Vision, 2009

2008
Multi-Scale TextTiling for Automatic Story Segmentation in Chinese Broadcast News.
Proceedings of the Information Retrieval Technology, 2008

