Zhaoxin Fan
Orcid: 0000-0002-6324-1712
According to our database, Zhaoxin Fan authored at least 87 papers between 2016 and 2026.
Bibliography
2026
Phys-EdiGAN: A privacy-preserving method for editing physiological signals in facial videos.
Pattern Recognit., 2026
Unveiling hidden vulnerabilities in digital human generation via adversarial attacks.
Pattern Recognit., 2026
2025
CoRR, August, 2025
CoRR, August, 2025
Pose-RFT: Enhancing MLLMs for 3D Pose Generation via Hybrid Action Reinforcement Fine-Tuning.
CoRR, August, 2025
CoRR, August, 2025
SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting.
CoRR, June, 2025
RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks.
CoRR, June, 2025
DS-TTS: Zero-Shot Speaker Style Adaptation from Voice Clips via Dynamic Dual-Style Feature Modulation.
CoRR, June, 2025
CoRR, May, 2025
AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars.
CoRR, May, 2025
MatchDance: Collaborative Mamba-Transformer Architecture Matching for High-Quality 3D Dance Synthesis.
CoRR, May, 2025
TinyAlign: Boosting Lightweight Vision-Language Models by Mitigating Modal Alignment Bottlenecks.
CoRR, May, 2025
Black-box Adversaries from Latent Space: Unnoticeable Attacks on Human Pose and Shape Estimation.
CoRR, May, 2025
Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation.
CoRR, May, 2025
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline.
CoRR, April, 2025
CoRR, March, 2025
STAMICS: Splat, Track And Map with Integrated Consistency and Semantics for Dense RGB-D SLAM.
CoRR, March, 2025
ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis.
CoRR, March, 2025
DH-RAG: A Dynamic Historical Context-Powered Retrieval-Augmented Generation Method for Multi-Turn Dialogue.
CoRR, February, 2025
CoRR, February, 2025
TinyLLaVA-Video: A Simple Framework of Small-scale Large Multimodal Models for Video Understanding.
CoRR, January, 2025
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
2024
MonoSIM: Simulating Learning Behaviors of Heterogeneous Point Cloud Object Detectors for Monocular 3-D Object Detection.
IEEE Trans. Instrum. Meas., 2024
A novel transformer autoencoder for multi-modal emotion recognition with incomplete data.
Neural Networks, 2024
Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images.
CoRR, 2024
CoheDancers: Enhancing Interactive Group Dance Generation through Music-Driven Coherence Decomposition.
CoRR, 2024
Score and Distribution Matching Policy: Advanced Accelerated Visuomotor Policies via Matched Distillation.
CoRR, 2024
CoRR, 2024
LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details.
CoRR, 2024
VGG-Tex: A Vivid Geometry-Guided Facial Texture Estimation Model for High Fidelity Monocular 3D Face Reconstruction.
CoRR, 2024
Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation.
CoRR, 2024
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer.
CoRR, 2024
CoRR, 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing.
CoRR, 2024
Idea-2-3D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs.
CoRR, 2024
CoRR, 2024
AS-FIBA: Adaptive Selective Frequency-Injection for Backdoor Attack on Deep Face Restoration.
CoRR, 2024
Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning.
CoRR, 2024
Multi-dimensional Fusion and Consistency for Semi-supervised Medical Image Segmentation.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024
STDG: Semi-Teacher-Student Training Paradigm for Depth-guided One-stage Scene Graph Generation.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024
BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024
ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024
ESTGN: Enhanced Self-Mined Text Guided Super-Resolution Network for Superior Image Super Resolution.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview.
ACM Comput. Surv., 2023
STDG: Semi-Teacher-Student Training Paradigram for Depth-guided One-stage Scene Graph Generation.
CoRR, 2023
CoRR, 2023
SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
GIDP: Learning a Good Initialization and Inducing Descriptor Post-enhancing for Large-scale Place Recognition.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
CoRR, 2022
CoRR, 2022
Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022
Unsupervised Multi-Task Learning for 3D Subtomogram Image Alignment, Clustering and Segmentation.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022
Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image.
Proceedings of the Computer Vision - ECCV 2022, 2022
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation.
CoRR, 2021
Attentive Rotation Invariant Convolution for Point Cloud-based Large Scale Place Recognition.
CoRR, 2021
SVT-Net: A Super Light-Weight Network for Large Scale Place Recognition using Sparse Voxel Transformers.
CoRR, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Comput. Graph. Forum, 2020
Comput. Graph. Forum, 2020
DAGC: Employing Dual Attention and Graph Convolution for Point Cloud based Place Recognition.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020
Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020
2016
Int. J. Parallel Program., 2016