Liang He

IEEE Trans. Veh. Technol., May, 2026

DS3: Dual-space sample selection for speaker representation learning with noisy labels.

Pattern Recognit., 2026

2025

DTKD-DL: Dual-teacher knowledge distillation with dual-loops for continuous few-shot relation extraction.

Appl. Soft Comput., 2025

A Joint Network for Singing Melody Extraction from Polyphonic Music with Attention Aggregation and Self-Consistency Training.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Alignment Losses for End-to-End Speaker Diarization.

Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Self-supervised Speaker Verification with Batch-scale Pseudo-labels Correction.

Junxu Wang

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Stable Extended U-Net for Noise-Robust Speaker Verification.

Zonghui Wang

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Noise Supervised Contrastive Learning and Feature-Perturbed for Anomalous Sound Detection.

Shun Huang

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Enhancing Medical Dialogue Generation through Knowledge Refinement and Dynamic Prompt Adjustment.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration.

EURASIP J. Audio Speech Music. Process., December, 2024

FSN: Joint Entity and Relation Extraction Based on Filter Separator Network.

Entropy, February, 2024

Improving Speaker Verification With Noise-Aware Label Ensembling and Sample Selection: Learning and Correcting Noisy Speaker Labels.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

IIFC-Net: A Monaural Speech Enhancement Network With High-Order Information Interaction and Feature Calibration.

IEEE Signal Process. Lett., 2024

LMKG: A large-scale and multi-source medical knowledge graph for intelligent medicine applications.

Knowl. Based Syst., 2024

Prompt for extraction: Multiple templates choice model for event extraction.

Knowl. Based Syst., 2024

One Small and One Large for Document-level Event Argument Extraction.

CoRR, 2024

Scene Text Recognition Via k-NN Attention-Based Decoder and Margin-Based Softmax Loss.

Hongxia Zhang

Minqiang Xu

Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

Simplified Skip-Connected UNet for Robust Speaker Verification Under Noisy Environments.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

LE-CAM++: A Lighter and More Efficient CAM++ for Speaker Verification.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Self-Supervised Speaker Verification with Mini-Batch Prediction Correction.

Junxu Wang

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Speech Topic Classification Based on Multi-Scale and Graph Attention Networks.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

YOLOPitch: A Time-Frequency Dual-Branch YOLO Model for Pitch Estimation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Cross-modal Features Interaction-and-Aggregation Network with Self-consistency Training for Speech Emotion Recognition.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Enhancing Abstractive Dialogue Summarization with Internal Knowledge.

Proceedings of the International Joint Conference on Neural Networks, 2024

Transmission Line Routing Based on Multi-Source Geographic Information and Improved Grey Wolf Algorithm.

Proceedings of the 5th International Conference on Machine Learning and Computer Application, 2024

CSMA-CNER: Multi-modal Chinese NER task with Cross- and Self-Modality Attention.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Speaker Recognition Based on Pre-Trained Model and Deep Clustering.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

A Speaker Recognition Method Based on Stable Learning.

Proceedings of the IEEE International Conference on Acoustics, 2024

Phase Continuity-Aware Self-Attentive Recurrent Network with Adaptive Feature Selection for Robust VAD.

Proceedings of the IEEE International Conference on Acoustics, 2024

Introducing Multilingual Phonetic Information to Speaker Embedding for Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2024

SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual Attention.

Proceedings of the IEEE International Conference on Acoustics, 2024

A Study on Graph Embedding for Speaker Recognition.

Ruida Li

Mengqi Niu

Proceedings of the IEEE International Conference on Acoustics, 2024

Multi-View Speaker Embedding Learning for Enhanced Stability and Discriminability.

Proceedings of the IEEE International Conference on Acoustics, 2024

C-LLM: Learn to Check Chinese Spelling Errors Character by Character.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

Audio-Visual Fusion Based on Interactive Attention for Person Verification.

Sensors, December, 2023

W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision.

EURASIP J. Audio Speech Music. Process., December, 2023

Eval-GCSC: A New Metric for Evaluating ChatGPT's Performance in Chinese Spelling Correction.

CoRR, 2023

Graph Neural Network Backend for Speaker Recognition.

Ruida Li

Mengqi Niu

CoRR, 2023

MAKBQA: Multi-hop Knowledge Base Question Answering System Based on Sensors and Internet Agricultural Data.

Proceedings of the 20th Annual IEEE International Conference on Sensing, 2023

GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System.

Proceedings of the ACM Multimedia Asia 2023, 2023

Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization.

Proceedings of the ACM Multimedia Asia 2023, 2023

A Study on Visualization of Voiceprint Feature.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Dynamic Fully-Connected Layer for Large-Scale Speaker Verification.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Robust Training for Speaker Verification against Noisy Labels.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

CRA-DIFFUSE: Improved Cross-Domain Speech Enhancement Based on Diffusion Model with T-F Domain Pre-Denoising.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Speech Topic Classification Based on Pre-trained and Graph Networks.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

A Joint Network Based on Interactive Attention for Speech Emotion Recognition.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

SKD-NER: Continual Named Entity Recognition via Span-based Knowledge Distillation with Reinforcement Learning.

Yi Chen

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022

Hierarchic Temporal Convolutional Network With Cross-Domain Encoder for Music Source Separation.

IEEE Signal Process. Lett., 2022

A bimodal network based on Audio-Text-Interactional-Attention with ArcFace loss for speech emotion recognition.

Speech Commun., 2022

Multi-stage music separation network with dual-branch attention and hybrid convolution.

J. Intell. Inf. Syst., 2022

OR-Gate: A Noisy Label Filtering Method for Speaker Verification.

CoRR, 2022

I4U System Description for NIST SRE'20 CTS Challenge.

CoRR, 2022

THUEE system description for NIST 2020 SRE CTS challenge.

CoRR, 2022

How to Boost Anti-Spoofing with X-Vectors.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

End-to-end speech topic classification based on pre-trained model WavLM.

Tengfei Cao

Fangjing Niu

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

A Multi-grained based Attention Network for Semi-supervised Sound Event Detection.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Virtual Fully-Connected Layer for a Large-Scale Speaker Verification Dataset.

Proceedings of the Biometric Recognition - 16th Chinese Conference, 2022

2021

End-to-End Cross-Lingual Spoken Language Understanding Model with Multilingual Pretraining.

Xianwei Zhang

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improved LightCNN with Attention Modules for ASV Spoofing Detection.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

2020

Adaptive Multi-Scale Detection of Acoustic Events.

Wenhao Ding

IEEE ACM Trans. Audio Speech Lang. Process., 2020

MTF-CRNN: Multiscale Time-Frequency Convolutional Recurrent Neural Network for Sound Event Detection.

IEEE Access, 2020

Combined Vector Based on Factorized Time-delay Neural Network for Text-Independent Speaker Recognition.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

A Joint Detection-Classification Model for Weakly Supervised Sound Event Detection Using Multi-Scale Attention Method.

Yaoguang Wang

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2020

THUEE System for NIST SRE19 CTS Challenge.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Distance-Dependent Metric Learning.

IEEE Signal Process. Lett., 2019

Latent class model with application to speaker diarization.

EURASIP J. Audio Speech Music. Process., 2019

THUEE system description for NIST 2019 SRE CTS Challenge.

CoRR, 2019

Multi-Scale Time-Frequency Attention for Acoustic Event Detection.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Large Margin Softmax Loss for Speaker Verification.

Yi Liu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Towards Discriminative Representations and Unbiased Predictions: Class-Specific Angular Softmax for Speech Emotion Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-objective Optimization Training of PLDA for Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2019

Geometric Discriminant Analysis for I-vector Based Speaker Verification.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Subtraction-Positive Similarity Learning.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Local Pairwise Linear Discriminant Analysis for Speaker Verification.

IEEE Signal Process. Lett., 2018

Semi-supervised minimum redundancy maximum relevance feature selection for audio classification.

Multim. Tools Appl., 2018

Multiobjective Optimization Training of PLDA for Speaker Verification.

CoRR, 2018

Defect characterization of amorphous silicon thin film solar cell based on low frequency noise.

Sci. China Inf. Sci., 2018

VB-HMM Speaker Diarization with Enhanced and Refined Segment Representation.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Latent Class Model for Single Channel Speaker Diarization.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Exploring a Unified Attention-Based Pooling Framework for Speaker Verification.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Parallel Double Audio Fingerprinting.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Speaker Embedding Extraction with Phonetic Information.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

MTGAN: Speaker Verification through Multitasking Triplet Generative Adversarial Networks.

Wenhao Ding

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Fading channel modelling using single-hidden layer feedforward neural networks.

Multidimens. Syst. Signal Process., 2017

Investigation of Frame Alignments for GMM-based Text-prompted Speaker Verification.

CoRR, 2017

Ivec-PLDA-AHC priors for VB-HMM speaker diarization system.

Proceedings of the 2017 IEEE International Workshop on Signal Processing Systems, 2017

Deep neural networks based speaker modeling at different levels of phonetic granularity.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Comparison of multiple features and modeling methods for text-dependent speaker verification.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Semi-supervised feature selection for audio classification based on constraint compensated Laplacian score.

EURASIP J. Audio Speech Music. Process., 2016

Voice activity detection algorithm based on long-term pitch information.

EURASIP J. Audio Speech Music. Process., 2016

Investigation of Senone-based Long-Short Term Memory RNNs for Spoken Language Recognition.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

A study of variational method for text-independent speaker recognition.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

THU-EE System Description for NIST LRE 2015.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Convolutional maxout neural networks for speech separation.

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

Investigation of bottleneck features and multilingual deep neural networks for speaker verification.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Simultaneous utilization of spectral magnitude and phase information to extract supervectors for speaker verification anti-spoofing.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Stacked bottleneck features for speaker verification.

Yao Tian

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

PRISM: A statistical modeling framework for text-independent speaker verification.

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

2014

Speaker verification using Fisher vector.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Improved multitaper PNCC feature for robust speaker verification.

Yi Liu

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

2013

I-matrix for text-independent speaker recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

THUEE system for the Albayzin 2012 language recognition evaluation.

Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

2012

Complementary combination in i-vector level for language recognition.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Discriminant local information distance preserving projection for text-independent speaker recognition.

Jia Li

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Orthogonal Subspace Combination Based on the Joint Factor Analysis for Text-Independent Speaker Recognition.
