Yinfeng Yu

ORCID: 0000-0003-3089-4140

According to our database, Yinfeng Yu authored at least 46 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence.
CoRR, April, 2026

Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition.
CoRR, April, 2026

Generalizable Audio-Visual Navigation via Binaural Difference Attention and Action Transition Prediction.
CoRR, April, 2026

Reliability-Aware Geometric Fusion for Robust Audio-Visual Navigation.
CoRR, April, 2026

Spatial-Aware Conditioned Fusion for Audio-Visual Navigation.
CoRR, April, 2026

Audio Spatially-Guided Fusion for Audio-Visual Navigation.
CoRR, April, 2026

Residual Cross-Modal Fusion Networks for Audio-Visual Navigation.
CoRR, January, 2026

Beyond textual knowledge: Leveraging multimodal knowledge bases for enhancing vision-and-language navigation.
Inf. Process. Manag., 2026

2025
Audio-Guided Visual Perception for Audio-Visual Navigation.
CoRR, October, 2025

Advancing Audio-Visual Navigation Through Multi-Agent Collaboration in 3D Environments.
CoRR, September, 2025

FSDENet: A Frequency and Spatial Domains-Based Detail Enhancement Network for Remote Sensing Semantic Segmentation.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2025

EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

Iterative Residual Cross-Attention Mechanism: An Integrated Approach for Audio-Visual Navigation Tasks.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

FGHFN: High-Resolution Fusion Network with Frequency-Domain Guidance for Remote Sensing Semantic Segmentation.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

DP-GaussTalk: Dual-Path Audio-Driven Feature Fusion for 3D Gaussian-Based Talking Head Synthesis.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

ECTSpeech: Enhancing Efficient Speech Synthesis via Easy Consistency Tuning.
Proceedings of the 7th ACM International Conference on Multimedia in Asia, 2025

Phoneme-Controlled LLM with Self-Supervised Speech Prompts for Mispronunciation Detection.
Proceedings of the 7th ACM International Conference on Multimedia in Asia, 2025

DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

DGFNet: End-to-End Audio-Visual Source Separation Based on Dynamic Gating Fusion.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Leveraging Label Potential for Enhanced Multimodal Emotion Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2025

AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis.
Proceedings of the International Joint Conference on Neural Networks, 2025

PGSTalker: Real-Time Audio-Driven Talking Head Generation via 3D Gaussian Splatting with Pixel-Aware Density Control.
Proceedings of the Neural Information Processing - 32nd International Conference, 2025

Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation.
Proceedings of the Neural Information Processing - 32nd International Conference, 2025

Magnitude-Phase Dual-Path Speech Enhancement Network based on Self-Supervised Embedding and Perceptual Contrast Stretch Boosting.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

A Speech Enhancement Method Based on Training Lifetime Knowledge Distillation.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Landmark-Guided Knowledge for Vision-and-Language Navigation.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Modality-Invariant Bidirectional Temporal Representation Distillation Network for Missing Multimodal Sentiment Analysis.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Audio-Driven Talking Head Generation with Emotion Based on FLAME Geometry Model.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2025, 2025

Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation.
Proceedings of the ECAI 2025 - 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy, 2025

LGFormer: A Local-Global Dynamic Attention Window Transformer for Speech Emotion Recognition.
Proceedings of the 28th International Conference on Computer Supported Cooperative Work in Design, 2025

2024
Nonlinear Regularization Decoding Method for Speech Recognition.
Sensors, June, 2024

Heterogeneous Space Fusion and Dual-Dimension Attention: A New Paradigm for Speech Enhancement.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024

BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network Based on Self-Supervised Embedding.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024

VNet: A GAN-Based Multi-Tier Discriminator Network for Speech Synthesis Vocoders.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024

Collaborative Transformer Decoder Method for Uyghur Speech Recognition in-Vehicle Environment.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

ECMISM: Speech Recognition via Enhancing Conformer Models with Innovative Scoring Matrices.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

PCQ: Emotion Recognition in Speech via Progressive Channel Querying.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2024

2023
Edge-Guided Camouflaged Object Detection via Multi-Level Feature Integration.
Sensors, July, 2023

Echo-Enhanced Embodied Visual Navigation.
Neural Comput., May, 2023

Measuring Acoustics with Collaborative Multiple Agents.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

SRTNET: Time Domain Speech Enhancement via Stochastic Refinement.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022
Sound Adversarial Audio-Visual Navigation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Pay Self-Attention to Audio-Visual Navigation.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
WeaveNet: End-to-End Audiovisual Sentiment Analysis.
Proceedings of the Cognitive Systems and Information Processing, 2021

2019
Multi-Spectral Image Change Detection Based on Band Selection and Single-Band Iterative Weighting.
IEEE Access, 2019

