Bing Han

2024

Attention-Based Encoder-Decoder End-to-End Neural Diarization With Embedding Enhancer.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Advancing speaker embedding learning: Wespeaker toolkit for research and production.

Speech Commun., 2024

Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning.

CoRR, 2024

Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching.

CoRR, 2024

VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment.

CoRR, 2024

Improving Anomalous Sound Detection Via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Prototype and Instance Contrastive Learning for Unsupervised Domain Adaptation in Speaker Verification.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Combining Self-Supervised Learning and Adversarial Training Based Domain Adaptation for Speaker Verification.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Semi-Supervised Acoustic Scene Classification with Test-Time Adaptation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2024

Robust Cross-Domain Speaker Verification with Multi-Level Domain Adapters.

Proceedings of the IEEE International Conference on Acoustics, 2024

Exploring Large Scale Pre-Trained Models for Robust Machine Anomalous Sound Detection.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models.

CoRR, 2023

Wespeaker baselines for VoxSRC2023.

CoRR, 2023

Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Dino-Based Self-Supervised Speaker Verification with Progressive Cluster-Aware Training.

Proceedings of the IEEE International Conference on Acoustics, 2023

Exploring Binary Classification Loss for Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022.

CoRR, 2022

The SJTU X-LANCE Lab System for CNSRC 2022.

CoRR, 2022

A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

DF-ResNet: Boosting Speaker Verification Performance with Depth-First Design.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The SJTU System for Multimodal Information Based Speech Processing Challenge 2021.

Proceedings of the IEEE International Conference on Acoustics, 2022

Local Information Modeling with Self-Attention for Speaker Verification.
