Weiqiang Zhang

CoRR, 2024

2023

Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Symmetric Saliency-Based Adversarial Attack to Speaker Identification.

IEEE Signal Process. Lett., 2023

Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges.

CoRR, 2023

Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection.

CoRR, 2023

Task-Agnostic Structured Pruning of Speech Representation Models.

CoRR, 2023

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model.

CoRR, 2023

Learnable Sparsity Structured Pruning for Acoustic Pre-trained Models.

Proceedings of the 6th International Conference on Signal Processing and Machine Learning, 2023

DistilALHuBERT: A Distilled Parameter Sharing Audio Representation Model.

Proceedings of the 6th International Conference on Signal Processing and Machine Learning, 2023

Expressive Speech-Driven Facial Animation with Controllable Emotions.

Yutong Chen

Junhong Zhao

Proceedings of the IEEE International Conference on Multimedia and Expo Workshops, 2023

2022

Improving Automatic Speech Recognition Performance for Low-Resource Languages With Self-Supervised Models.

Jing Zhao

IEEE J. Sel. Top. Signal Process., 2022

The THUEE System Description for the IARPA OpenASR21 Challenge.

CoRR, 2022

BERT-LID: Leveraging BERT to Improve Spoken Language Identification.

CoRR, 2022

Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

BERT-LID: Leveraging BERT to Improve Spoken Language Identification.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Summary On The ISCSLP 2022 Chinese-English Code-Switching ASR Challenge.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

The THUEE System Description for the IARPA OpenASR21 Challenge.

Proceedings of the Interspeech 2022, 2022

2021

End-to-end keyword search system based on attention mechanism and energy scorer for low resource languages.

Zeyu Zhao

Neural Networks, 2021

Keyword Search Based on Unsupervised Pre-Trained Acoustic Models.

Int. J. Asian Lang. Process., 2021

Timestamp-aligning and keyword-biasing end-to-end ASR front-end for a KWS system.

EURASIP J. Audio Speech Music. Process., 2021

The TNT Team System Descriptions of Cantonese and Mongolian for IARPA OpenASR20.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Automatic Speech Recognition for Low-Resource Languages: The Thuee Systems for the IARPA Openasr20 Evaluation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

MulKINet: Multi-Stage Key-Invariant Convolutional Neural Networks for Accurate and Fast Cover Song Identification.

Chengdi Cao

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2020

End-to-End Keyword Search Based on Attention and Energy Scorer for Low Resource Languages.

Zeyu Zhao

Proceedings of the Interspeech 2020, 2020

THUEE System for NIST SRE19 CTS Challenge.

Proceedings of the Interspeech 2020, 2020

Dynamic Temporal Residual Learning for Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Staged Training Strategy and Multi-Activation for Audio Tagging with Noisy and Sparse Multi-Label Data.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019

Tightness of a New and Enhanced Semidefinite Relaxation for MIMO Detection.

SIAM J. Optim., 2019

THUEE system description for NIST 2019 SRE CTS Challenge.

CoRR, 2019

A Small-Footprint End-to-End KWS System in Low Resources.

Proceedings of the 2nd International Conference on Signal Processing and Machine Learning, 2019

Multi-Task Learning Based End-to-End Speaker Recognition.

Yuxuan Pan

Proceedings of the 2nd International Conference on Signal Processing and Machine Learning, 2019

Singing Voice Separation Based on Deep Regression Neural Network.

Shuqian Yang

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2019

A Fusion Model for Robust Voice Activity Detection.

Guan-Bo Wang

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2019

End-to-End Topic Classification without ASR.

Zexian Dong

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2019

Music Genre Classification Using Duplicated Convolutional Layers in Neural Networks.

Hansi Yang

Proceedings of the Interspeech 2019, 2019

Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection.

Yu-Han Shen

Ke-Xin He

Proceedings of the Interspeech 2019, 2019

Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection.

Ke-Xin He

Yu-Han Shen

Proceedings of the Interspeech 2019, 2019

Multiple Neural Networks with Ensemble Method for Audio Tagging with Noisy Labels and Minimal Supervision.

Kexin He

Yuhan Shen

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

An RNN and CRNN Based Approach to Robust Voice Activity Detection.

Guan-Bo Wang

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Dilated-Gated Convolutional Neural Network with A New Loss Function on Sound Event Detection.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Lattice Based Transcription Loss for End-to-End Speech Recognition.

J. Signal Process. Syst., 2018

Semi-supervised minimum redundancy maximum relevance feature selection for audio classification.

Multim. Tools Appl., 2018

Argument division based branch-and-bound algorithm for unit-modulus constrained complex quadratic programming.

J. Glob. Optim., 2018

Advanced recurrent network-based hybrid acoustic models for low resource speech recognition.

EURASIP J. Audio Speech Music. Process., 2018

An adapted data selection for deep learning-based audio segmentation in multi-genre broadcast channel.

Digit. Signal Process., 2018

SAM-GCNN: A Gated Convolutional Neural Network with Segment-Level Attention Mechanism for Home Activity Monitoring.

Yu-Han Shen

Ke-Xin He

Proceedings of the 2018 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2018

Improved Phonotactic Language Recognition Using Collaborated Language Model.

Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems, 2018

Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Deep neural networks based speaker modeling at different levels of phonetic granularity.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

An LSTM-CTC based verification system for proxy-word based OOV keyword search.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Gated convolutional networks based hybrid acoustic models for low resource speech recognition.

Jian Kang

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Discriminative Boosting Algorithm for Diversified Front-End Phonotactic Language Recognition.

J. Signal Process. Syst., 2016

Semi-supervised feature selection for audio classification based on constraint compensated Laplacian score.

EURASIP J. Audio Speech Music. Process., 2016

Voice activity detection algorithm based on long-term pitch information.

EURASIP J. Audio Speech Music. Process., 2016

The NDSC transcription system for the 2016 multi-genre broadcast challenge.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Application of i-vector in speech and music classification.

Proceedings of the 2016 IEEE International Symposium on Signal Processing and Information Technology, 2016

A speech enhancement algorithm using computational auditory scene analysis with spectral subtraction.

Proceedings of the 2016 IEEE International Symposium on Signal Processing and Information Technology, 2016

Gated recurrent units based hybrid acoustic models for robust speech recognition.

Jian Kang

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Lattice based transcription loss for end-to-end speech recognition.

Jian Kang

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A study of variational method for text-independent speaker recognition.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data.

Proceedings of the Interspeech 2016, 2016

A Novel Discriminative Score Calibration Method for Keyword Search.

Proceedings of the Interspeech 2016, 2016

2015

Multi-resolution time frequency feature and complementary combination for short utterance speaker recognition.

Zhiyi Li

Multim. Tools Appl., 2015

Regularized minimum class variance extreme learning machine for language recognition.

EURASIP J. Audio Speech Music. Process., 2015

THUEE language modeling method for the OpenKWS 2015 evaluation.

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

Convolutional maxout neural networks for speech separation.

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

Neuron sparseness versus connection sparseness in deep neural network for large vocabulary speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

The THUEE system for the openKWS14 keyword search evaluation.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Improved system fusion for keyword search.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Calibration of word posterior estimation in confusion networks for keyword search.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

Efficient One-Pass Decoding with NNLM for Speech Recognition.

IEEE Signal Process. Lett., 2014

Spoken language recognition based on gap-weighted subsequence kernels.

Speech Commun., 2014

Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation.

EURASIP J. Audio Speech Music. Process., 2014

Empirically combining unnormalized NNLM and back-off N-gram for fast N-best rescoring in speech recognition.

EURASIP J. Audio Speech Music. Process., 2014

Homogenous ensemble phonotactic language recognition based on SVM supervector reconstruction.

EURASIP J. Audio Speech Music. Process., 2014

Text-Independent Speaker Verification via State Alignment.

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Multi-scale kernels for short utterance speaker recognition.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speaker verification using Fisher vector.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Discriminative boosting regression backend for phonotactic language recognition.

Wei-Wei Liu

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Phonotactic language recognition based on DNN-HMM acoustic model.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A new fast and memory effective i-vector extraction based on factor analysis of KLD derived GMM supervector.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speaker adaptation based on sparse and low-rank eigenphone matrix estimation.

Proceedings of the INTERSPEECH 2014, 2014

Phonotactic language recognition based on time-gap-weighted lattice kernels.

Wei-Wei Liu

Proceedings of the INTERSPEECH 2014, 2014

Variance regularization of RNNLM for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Improved phonotactic language recognition based on RNN feature reconstruction.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

Rapid speaker adaptation using compressive sensing.

Speech Commun., 2013

Exploiting articulatory features for pitch accent detection.

J. Zhejiang Univ. Sci. C, 2013

Exploiting contextual information for prosodic event detection using auto-context.

EURASIP J. Audio Speech Music. Process., 2013

RNN language model with word clustering and class-based output layer.

EURASIP J. Audio Speech Music. Process., 2013

THU-EE system fusion for the NIST 2012 speaker recognition evaluation.

Proceedings of the INTERSPEECH 2013, 2013

Parallel absolute-relative feature based phonotactic language recognition.

Proceedings of the INTERSPEECH 2013, 2013

Temporal kernel neural network language model.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

THUEE system for the Albayzin 2012 language recognition evaluation.

Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Improving deep neural network acoustic models using unlabeled data.

Meng Cai

Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Compact acoustic modeling based on acoustic manifold using a mixture of factor analyzers.

Wen-Lin Zhang

Bi-Cheng Li

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model.

IEEE Trans. Speech Audio Process., 2012

Complementary combination in i-vector level for language recognition.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Automatic pitch accent detection using auto-context with acoustic features.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

2011

Time-Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition.

IEEE Trans. Speech Audio Process., 2011

Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition.

J. Comput., 2011

Language Recognition Based on Acoustic Diversified Phone Recognizers and Phonotactic Feature Fusion.

IEICE Trans. Inf. Syst., 2011

Robust Audio Fingerprinting Based on Local Spectral Luminance Maxima Scheme.

Yongzhe Shi

Proceedings of the INTERSPEECH 2011, 2011

Speaker adaptation based on speaker-dependent eigenphone estimation.

Wen-Lin Zhang

Bi-Cheng Li

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Multiple Background Models for Speaker Verification.

Yuxiang Shan

Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Multi-feature combination for speaker recognition.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Variant time-frequency cepstral features for speaker recognition.

Proceedings of the INTERSPEECH 2010, 2010

A fast query by humming system based on notes.

Jingzhou Yang

Proceedings of the INTERSPEECH 2010, 2010

Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression.

Sha Meng

Proceedings of the INTERSPEECH 2010, 2010

Integration of Complementary Phone Recognizers for Phonotactic Language Recognition.

Proceedings of the Information Computing and Applications - First International Conference, 2010

2008

An Equalized Heteroscedastic Linear Discriminant Analysis Algorithm.

IEEE Signal Process. Lett., 2008

Fractional Fourier transform based auditory feature for language identification.

Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Channel compensation technology in differential GSV-SVM speaker verification system.

Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

2007

Two-Stage Method for Specific Audio Retrieval.
