Lei Xie

Affiliations:
  • Northwestern Polytechnical University, School of Computer Science, Xi'an, China
  • The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Hong Kong (2006 - 2007)
  • City University of Hong Kong, School of Creative Media, Hong Kong (2004 - 2006)
  • Northwestern Polytechnical University, Xi'an, China (PhD 2004)
  • Vrije Universiteit Brussel, Department of Electronics and Information Processing, Belgium (2001 - 2002)


According to our database, Lei Xie authored at least 181 papers between 2008 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Disentangling Style and Speaker Attributes for TTS Style Transfer.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Improving data augmentation for low resource speech-to-text translation with diverse paraphrasing.
Neural Networks, 2022

Two-stage streaming keyword detection and localization with multi-scale depthwise temporal convolution.
Neural Networks, 2022

Noise-robust voice conversion with domain adversarial training.
Neural Networks, 2022

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition.
CoRR, 2022

Conversational Speech Recognition By Learning Conversation-level Characteristics.
CoRR, 2022

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
CoRR, 2022

2021
LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder With Exact Lattice Generation.
IEEE Signal Process. Lett., 2021

Factorized WaveNet for voice conversion with limited data.
Speech Commun., 2021

Cycle consistent network for end-to-end style transfer TTS training.
Neural Networks, 2021

Effective and direct control of neural TTS prosody by removing interactions between different attributes.
Neural Networks, 2021

Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios.
CoRR, 2021

One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation.
CoRR, 2021

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.
CoRR, 2021

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition.
CoRR, 2021

Controllable cross-speaker emotion transfer for end-to-end speech synthesis.
CoRR, 2021

Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis.
CoRR, 2021

Controllable Context-aware Conversational Speech Synthesis.
CoRR, 2021

Improving robustness of one-shot voice conversion with deep discriminative speaker encoder.
CoRR, 2021

Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification.
CoRR, 2021

Multi-Band Melgan: Faster Waveform Generation For High-Quality Text-To-Speech.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Learn2Sing: Target Speaker Singing Voice Synthesis by Learning from a Singing Teacher.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Fine-Grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Conversational End-to-End TTS for Voice Agents.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Optimizing Voice Conversion Network with Cycle Consistency Loss of Speaker Identity.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Context-aware RNNLM Rescoring for Conversational Speech Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Adversarial Training for Multi-domain Speaker Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Accent and Speaker Disentanglement in Many-to-many Voice Conversion.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Controllable Emotion Transfer For End-to-End Speech Synthesis.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Enriching Source Style Transfer in Recognition-Synthesis Based Non-Parallel Voice Conversion.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Improving Performance of Seen and Unseen Speech Style Transfer in End-to-End Neural TTS.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

The Multi-Speaker Multi-Style Voice Cloning Challenge 2021.
Proceedings of the IEEE International Conference on Acoustics, 2021

Wake Word Detection with Streaming Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2021

An Asynchronous WFST-Based Decoder for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Duality Temporal-Channel-Frequency Attention Enhanced Speaker Representation Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Boundary and Context Aware Training for CIF-Based Non-Autoregressive End-to-End ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Target Speaker Extraction for Customizable Query-by-Example Keyword Spotting.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Improving Adversarial Neural Machine Translation for Morphologically Rich Language.
IEEE Trans. Emerg. Top. Comput. Intell., 2020

Fast Query-by-Example Speech Search Using Attention-Based Deep Binary Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Loanword Identification in Low-Resource Languages with Minimal Supervision.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020

Adversarial Feature Learning and Unsupervised Clustering Based Speech Synthesis for Found Data With Acoustic and Textual Noise.
IEEE Signal Process. Lett., 2020

On the localness modeling for the self-attention based end-to-end speech synthesis.
Neural Networks, 2020

Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training.
CoRR, 2020

Conversational End-to-End TTS for Voice Agent.
CoRR, 2020

Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis.
Proceedings of the Interspeech 2020, 2020

Inaudible Adversarial Perturbations for Targeted Attack in Speaker Recognition.
Proceedings of the Interspeech 2020, 2020

Wake Word Detection with Alignment-Free Lattice-Free MMI.
Proceedings of the Interspeech 2020, 2020

Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training.
Proceedings of the Interspeech 2020, 2020

Mining Effective Negative Training Samples for Keyword Spotting.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Time-Domain Neural Network Approach for Speech Bandwidth Extension.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effective Wavenet Adaptation for Voice Conversion with Limited Data.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Region Proposal Network Based Small-Footprint Keyword Spotting.
IEEE Signal Process. Lett., 2019

Pre-Alignment Guided Attention for Improving Training Efficiency and Model Stability in End-to-End Speech Synthesis.
IEEE Access, 2019

Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context.
IEEE Access, 2019

Adversarial Regularization for End-to-End Robust Speaker Verification.
Proceedings of the Interspeech 2019, 2019

Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS.
Proceedings of the Interspeech 2019, 2019

A New GAN-Based End-to-End TTS Training Algorithm.
Proceedings of the Interspeech 2019, 2019

Unsupervised Adaptation with Adversarial Dropout Regularization for Robust Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Deep Audio-visual System for Closed-set Word-level Speech Recognition.
Proceedings of the International Conference on Multimodal Interaction, 2019

Enhancing Hybrid Self-attention Structure with Relative-position-aware Bias for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Pitch-aware Approach to Single-channel Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Adversarial Examples for Improving End-to-end Attention-based Small-footprint Keyword Spotting.
Proceedings of the IEEE International Conference on Acoustics, 2019

Investigating End-to-end Speech Recognition for Mandarin-english Code-switching.
Proceedings of the IEEE International Conference on Acoustics, 2019

Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System.
Proceedings of the IEEE International Conference on Acoustics, 2019

Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

An Attention-based Neural Network Approach for Single Channel Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2019

Controlling Emotion Strength with Relative Attribute for End-to-End Speech Synthesis.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Verifying Deep Keyword Spotting Detection with Acoustic Word Embeddings.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improving Mandarin End-to-End Speech Synthesis by Self-Attention and Learnable Gaussian Bias.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Virtual Adversarial Training for DS-CNN Based Small-Footprint Keyword Spotting.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

WaveNet Factorization with Singular Value Decomposition for Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Incremental Lattice Determinization for WFST Decoders.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Multiple fixed beamformers with a spacial Wiener-form postfilter for far-field speech recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
A Bidirectional LSTM Approach with Word Embeddings for Sentence Boundary Detection.
J. Signal Process. Syst., 2018

Learning distributed sentence representations for story segmentation.
Signal Process., 2018

Unsupervised measure of Chinese lexical semantic similarity using correlated graph model for news story segmentation.
Neurocomputing, 2018

A Refined Query-by-Example Approach to Spoken-Term-Detection on ESL learners' Speech.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search.
Proceedings of the Interspeech 2018, 2018

Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Training Augmentation with Adversarial Examples for Robust Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Domain Adversarial Training for Accented Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Self-validated Story Segmentation of Chinese Broadcast News.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2018

2017
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection.
IEEE J. Sel. Top. Signal Process., 2017

Online object tracking based on BLSTM-RNN with contextual-sequential labeling.
J. Ambient Intell. Humaniz. Comput., 2017

A hybrid neural network hidden Markov model approach for automatic story segmentation.
J. Ambient Intell. Humaniz. Comput., 2017

An unsupervised deep domain adaptation approach for robust speech recognition.
Neurocomputing, 2017

Sound image externalization for headphone based real-time 3D audio.
Frontiers Comput. Sci., 2017

Introduction to special section on advances of orange technologies.
Frontiers Comput. Sci., 2017

Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion.
Proceedings of the Interspeech 2017, 2017

Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Multilingual bottle-neck feature learning from untranscribed speech.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Topic embedding of sentences for story segmentation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

An end-to-end neural network approach to story segmentation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A segmental DNN/i-vector approach for digit-prompted speaker verification.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Real-time tracking-by-learning with high-order regularization fusion for big video abstraction.
Signal Process., 2016

Guest Editorial: Immersive Audio/Visual Systems.
Multim. Tools Appl., 2016

A deep bidirectional LSTM approach for video-realistic talking head.
Multim. Tools Appl., 2016

Deformable object tracking with spatiotemporal segmentation in big vision surveillance.
Neurocomputing, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

The NNI Vietnamese Speech Recognition System for MediaEval 2016.
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Investigating neural network based query-by-example keyword spotting approach for personalized wake-up word detection in Mandarin Chinese.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information.
Proceedings of the Interspeech 2016, 2016

A DNN-HMM Approach to Story Segmentation.
Proceedings of the Interspeech 2016, 2016

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Proceedings of the Interspeech 2016, 2016

Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis.
Proceedings of the Interspeech 2016, 2016

Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection.
Proceedings of the Interspeech 2016, 2016

Deep neural network derived bottleneck features for accurate audio classification.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

Approximate search of audio queries by using DTW with phone time boundary and data augmentation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exemplar-based sparse representation of timbre and prosody for voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

On the training of DNN-based average voice model for speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

On the use of I-vectors and average voice model for voice conversion without parallel data.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Study on near-field crosstalk cancellation based on least square algorithm.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Tennis Ball Tracking Using a Two-Layered Data Association Approach.
IEEE Trans. Multim., 2015

Multiple pedestrian tracking based on couple-states Markov chain with semantic topic learning for video surveillance.
Soft Comput., 2015

NestDE: generic parameters tuning for automatic story segmentation.
Soft Comput., 2015

Topic segmentation on spoken documents using self-validated acoustic cuts.
Soft Comput., 2015

Online Object Tracking Based on CNN with Metropolis-Hasting Re-Sampling.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

The NNI Query-by-Example System for MediaEval 2015.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.
Proceedings of the INTERSPEECH 2015, 2015

An alternating optimization approach for phase retrieval.
Proceedings of the INTERSPEECH 2015, 2015

Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study.
Proceedings of the INTERSPEECH 2015, 2015

Language independent query-by-example spoken term detection using N-best phone sequences and partial matching.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Photo-real talking head with deep bidirectional LSTM.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A density peak clustering approach to unsupervised acoustic subword units discovery.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Fundamental frequency modeling using wavelets for emotional voice conversion.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
A statistical parametric approach to video-realistic text-driven talking avatar.
Multim. Tools Appl., 2014

Multimodal joint information processing in human machine interaction: recent advances.
Multim. Tools Appl., 2014

The NNI Query-by-Example System for MediaEval 2014.
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

A hybrid virtual bass system with improved phase vocoder and high efficiency.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Experimental study on dereverberation and noise reduction for distant speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection.
Proceedings of the INTERSPEECH 2014, 2014

A deep neural network approach for sentence boundary detection in broadcast news.
Proceedings of the INTERSPEECH 2014, 2014

Stereo acoustic echo suppression using widely linear filtering in the frequency domain.
Proceedings of the INTERSPEECH 2014, 2014

Speech-driven head motion synthesis using neural networks.
Proceedings of the INTERSPEECH 2014, 2014

An ensemble of deep neural networks for object tracking.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Unsupervised broadcast news story segmentation using distance dependent Chinese restaurant processes.
Proceedings of the IEEE International Conference on Acoustics, 2014

Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Learning optimal features for music transcription.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Multimodal continuous affect recognition based on LSTM and multiple kernel learning.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A two layered data association approach for ball tracking.
Proceedings of the IEEE International Conference on Acoustics, 2013

A tighter lower bound estimate for dynamic time warping.
Proceedings of the IEEE International Conference on Acoustics, 2013

Measuring semantic similarity by contextual word connections in Chinese news story segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Broadcast news story segmentation using latent topics on data manifold.
Proceedings of the IEEE International Conference on Acoustics, 2013

Numerical calculation of the head-related transfer functions with Chinese dummy head.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News.
IEEE Trans. Speech Audio Process., 2012

Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features.
IEICE Trans. Inf. Syst., 2012

Mask Estimation and Refinement for MFT-based Robust Speaker Verification.
Proceedings of the INTERSPEECH 2012, 2012

Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis.
Proceedings of the INTERSPEECH 2012, 2012

Lexical Story Co-Segmentation of Chinese Broadcast News.
Proceedings of the INTERSPEECH 2012, 2012

Acoustic TextTiling for story segmentation of spoken documents.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Detection of ball hits in a tennis game using audio and visual information.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Pitch-density-based features and an SVM binary tree approach for multi-class audio classification in broadcast news.
Multim. Syst., 2011

On the effectiveness of subwords for lexical cohesion based story segmentation of Chinese broadcast news.
Inf. Sci., 2011

Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation.
Proceedings of the INTERSPEECH 2011, 2011

2010
Cascade Markov random fields for stroke extraction of Chinese characters.
Inf. Sci., 2010

Speech and Auditory Interfaces for Ubiquitous, Immersive and Personalized Applications.
Proceedings of the Symposia and Workshops on Ubiquitous, 2010

Multi-modal feature integration for story boundary detection in broadcast news.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Dual-microphone noise reduction based on semi-blind DUET.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phoneme lattice based texttiling towards multilingual story segmentation.
Proceedings of the INTERSPEECH 2010, 2010

Maximum lexical cohesion for fine-grained news story segmentation.
Proceedings of the INTERSPEECH 2010, 2010

2009
Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models.
J. Vis. Lang. Comput., 2009

Noise robust features for speech/music discrimination in real-time telecommunication.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

A Subword Normalized Cut Approach to Automatic Story Segmentation of Chinese Broadcast News.
Proceedings of the Information Retrieval Technology, 2009

Multicue Graph Mincut for Image Segmentation.
Proceedings of the Computer Vision, 2009

2008
Multi-Scale TextTiling for Automatic Story Segmentation in Chinese Broadcast News.
Proceedings of the Information Retrieval Technology, 2008

