Li Liu

CoRR, August, 2025

UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation.

Jinting Wang

Shan Yang

CoRR, June, 2025

AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation.

CoRR, May, 2025

FauForensics: Boosting Audio-Visual Deepfake Detection with Facial Action Units.

CoRR, May, 2025

BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model.

CoRR, February, 2025

Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Learning Class Unique Features in Fine-Grained Visual Classification.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Reliable Imputed-Sample Assisted Vertical Federated Learning.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Fine-portraitist: Visualizing the Speaker's Face Portrait during Speech Listening.

Jinting Wang

Jun Wang

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

MotionComposer: Enhancing Rhythmic Music Generation with Adaptive Retrieval Reference.

Jinting Wang

Jun Wang

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

FHVAC: Feature-Level Hybrid Video Adaptive Configuration for Machine-Centric Live Streaming.

IEEE Trans. Parallel Distributed Syst., May, 2024

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition.

Lei Liu

Haizhou Li

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Less confidence, less forgetting: Learning with a humbler teacher in exemplar-free Class-Incremental learning.

Neural Networks, 2024

New Paradigm of Adversarial Training: Breaking Inherent Trade-Off between Accuracy and Robustness via Dummy Classes.

CoRR, 2024

Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2.

CoRR, 2024

Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion.

Yan Rong

CoRR, 2024

Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning.

Lei Liu

Yawen Cui

CoRR, 2024

Segment Anything for Videos: A Systematic Survey.

CoRR, 2024

A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights.

CoRR, 2024

TIMA: Text-Image Mutual Awareness for Balancing Zero-Shot Adversarial Robustness and Generalization Ability.

Fengji Ma

Hei Victor Cheng

CoRR, 2024

Awesome Multi-modal Object Tracking.

CoRR, 2024

Content-Aware Efficient Learner for Audio-Visual Emotion Recognition.

Guanjie Huang

Weilin Lin

Proceedings of the Social Robotics - 16th International Conference, 2024

Cued Speech-Integrated Audio-Visual Variational Autoencoder for Speech Enhancement.

Lufei Gao

Yan Rong

Proceedings of the Social Robotics - 16th International Conference, 2024

WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Bridge to Non-Barrier Communication: Gloss-Prompted Fine-Grained Cued Speech Gesture Generation with Diffusion Model.

Wentao Lei

Jun Wang

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Leveraging Noisy Labels of Nearest Neighbors for Label Correction and Sample Selection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

WebUAV-3M: A Benchmark for Unveiling the Power of Million-Scale Deep UAV Tracking.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Generating and Weighting Semantically Consistent Sample Pairs for Ultrasound Contrastive Learning.

IEEE Trans. Medical Imaging, May, 2023

Defenses in Adversarial Machine Learning: A Survey.

CoRR, 2023

Realistic Speech-to-Face Generation with Speech-Conditioned Latent Diffusion Model with Face Prior.

CoRR, 2023

A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation.

CoRR, 2023

Robust Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers.

CoRR, 2023

X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models.

Yixiong Chen

Chris Ding

CoRR, 2023

A Comprehensive Survey on Segment Anything Model for Vision and Beyond.

CoRR, 2023

Adversarial Machine Learning: A Systematic Survey of Backdoor Attack, Weight Attack and Adversarial Example.

CoRR, 2023

FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

MetaLR: Meta-tuning of Learning Rates for Transfer Learning in Medical Imaging.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Novel Interpretable and Generalizable Re-synchronization Model for Cued Speech based on a Multi-Cuer Corpus.

Lufei Gao

Shan Huang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Global Balanced Experts for Federated Long-Tailed Learning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Memory-Augmented Contrastive Learning for Talking Head Generation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Two-Stream Joint-Training for Speaker Independent Acoustic-to-Articulatory Inversion.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Spatio-Temporal Structure Consistency for Semi-Supervised Medical Image Classification.

Wentao Lei

Lei Liu

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

TAOTF: A Two-Stage Approximately Orthogonal Training Framework in Deep Neural Networks.

Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2022

Rethinking Two Consensuses of the Transferability in Deep Learning.

CoRR, 2022

MetaLR: Layer-wise Learning Rate based on Meta-Learning for Adaptively Fine-tuning Medical Pre-trained Models.

CoRR, 2022

WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking.

CoRR, 2022

Pre-activation Distributions Expose Backdoor Neurons.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Objective Hand Complexity Comparison between Two Mandarin Chinese Cued Speech Systems.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement.

Proceedings of the Neural Information Processing - 29th International Conference, 2022

Residual-Guided Personalized Speech Synthesis based on Face Image.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Acoustic-to-Articulatory Inversion Based on Speech Decomposition and Auxiliary Feature.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Data-Free Backdoor Removal Based on Channel Lipschitzness.

Proceedings of the Computer Vision - ECCV 2022, 2022

Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model Pretraining.

Proceedings of the Computer Vision - ACCV 2022, 2022

2021

Re-Synchronization Using the Hand Preceding Model for Multi-Modal Fusion in Automatic Continuous Cued Speech Recognition.

IEEE Trans. Multim., 2021

A hybrid framework for brain tissue segmentation in magnetic resonance images.

Int. J. Imaging Syst. Technol., 2021

Research on distributed logistics scheduling method for workshop production based on hybrid particle swarm optimisation.

Xiangli Xu

Int. J. Manuf. Technol. Manag., 2021

USCL: Pretraining Deep Ultrasound Image Diagnosis Model Through Video Contrastive Representation Learning.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

Multi-Modal Active Learning For Automatic Liver Fibrosis Diagnosis Based On Ultrasound Shear Wave Elastography.

Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021

Cross-Modal Knowledge Distillation Method for Automatic Cued Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

An Attention Self-Supervised Contrastive Learning Based Three-Stage Model for Hand Shape Feature Representation in Cued Speech.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Ninth Visual Object Tracking VOT2021 Challenge Results.

Joni-Kristian Kämäräinen

Mohamed H. Abdelpakey

Alireza Memarmoghadam

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Self-Supervised Depth Estimation Via Implicit Cues from Videos.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Effective Sample Pair Generation for Ultrasound Video Contrastive Representation Learning.

CoRR, 2020

Towards Class-Specific Unit.

CoRR, 2020

Attention-based Residual Speech Portrait Model for Speech to Face Generation.

CoRR, 2020

Self-Supervised Joint Learning Framework of Depth Estimation via Implicit Cues.

CoRR, 2020

A New Re-synchronization Method based Multi-modal Fusion for Automatic Continuous Cued Speech Recognition.

Xiao-Ping (Steven) Zhang

CoRR, 2020

Semi-Supervised Active Learning for COVID-19 Lung Ultrasound Multi-symptom Classification.

Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020

Three-Dimensional Lip Motion Network for Text-Independent Speaker Recognition.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019

Hierarchical Clustering Based Band Selection Algorithm for Hyperspectral Face Recognition.

IEEE Access, 2019

A Light-Weight Context-Aware Self-Attention Model for Skin Lesion Segmentation.

Proceedings of the PRICAI 2019: Trends in Artificial Intelligence, 2019

Automatic Detection of the Temporal Segmentation of Hand Movements in British English Cued Speech.

Jianze Li

Xiao-Ping (Steven) Zhang

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A novel resynchronization procedure for hand-lips fusion applied to continuous French Cued Speech recognition.

Xiao-Ping (Steven) Zhang

Proceedings of the 27th European Signal Processing Conference, 2019

2018

Visual Recognition of Continuous Cued Speech Using a Tandem CNN-HMM Approach.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Automatic Temporal Segmentation of Hand Movements for Hand Positions Recognition in French Cued Speech.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Inner lips feature extraction based on CLNF with hybrid dynamic template for Cued Speech.

EURASIP J. Image Video Process., 2017

Automatic dynamic template tracking of inner lips based on CLNF.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Inner Lips Parameter Estimation based on Adaptive Ellipse Model.

Proceedings of the 14th International Conference on Auditory-Visual Speech Processing, 2017

2016

Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine.

Comput. Math. Methods Medicine, 2016

Extraction automatique de contour de lèvre à partir du modèle CLNF (Automatic lip contour extraction using CLNF model).
