Fei Ma

ORCID: 0009-0002-5388-9125

Affiliations:

Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China
Tsinghua University, Tsinghua-Berkeley Shenzhen Institute, DSIT Research Center, Shenzhen, China

According to our database, Fei Ma authored at least 54 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Human Motion Video Generation: A Survey.

IEEE Trans. Pattern Anal. Mach. Intell., November, 2025

TheraMind: A Strategic and Adaptive Agent for Longitudinal Psychological Counseling.

CoRR, October, 2025

Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation.

CoRR, October, 2025

Rethinking Efficient and Effective Point-Based Networks for Event Camera Classification and Regression.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks for Enhanced Action Understanding.

CoRR, August, 2025

Enhancing Long Video Question Answering with Scene-Localized Frame Grouping.

CoRR, August, 2025

Universal Visuo-Tactile Video Understanding for Embodied Interaction.

CoRR, May, 2025

Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling.

CoRR, May, 2025

EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models.

CoRR, February, 2025

A Review of Human Emotion Synthesis Based on Generative Technology.

IEEE Trans. Affect. Comput., 2025

MLLM-TA: Leveraging Multimodal Large Language Models for Precise Temporal Video Grounding.

IEEE Signal Process. Lett., 2025

Generative technology for human emotion recognition: A scoping review.

Inf. Fusion, 2025

VisualRWKV-HM: Enhancing linear visual-language models via hybrid mixing.

Inf. Fusion, 2025

Exploring Embodied Multimodal Large Models: Development, datasets, and future directions.

Inf. Fusion, 2025

Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

CCIS-DIFF: A Generative Model with Stable Diffusion Prior for Controlled Colonoscopy Image Synthesis.

Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025

Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

GaussianPU: Color Point Cloud Upsampling via 3D Gaussian Splatting.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

VideoHumanMIB: Unlocking Appearance Decoupling for Video Human Motion In-betweening.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Inter3D: A Benchmark and Strong Baseline for Human-Interactive 3D Object Reconstruction.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Object Isolated Attention for Consistent Story Visualization.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

UniSync: A Unified Framework for Audio-Visual Synchronization.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

DictAvatar: Expressive Facial Avatar Reconstruction with Facial Feature Dictionary.

Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

DEP-SLAM: A Dynamic Environment Perception SLAM System with Large Language Models.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Training-Free Language-Guided Video Summarization via Multi-Grained Saliency Scoring.

Proceedings of the Computational Visual Media - 13th International Conference, 2025

VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

ReMask-Animate: Refined Character Image Animation Using Mask-Guided Adapters.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Subgraph Invariant Learning Towards Large-Scale Graph Node Classification.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Your blush gives you away: detecting hidden mental states with remote photoplethysmography and thermal imaging.

PeerJ Comput. Sci., 2024

Frequency-aware Event Cloud Network.

CoRR, 2024

SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing.

CoRR, 2024

GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting.

CoRR, 2024

Enhancing and Accelerating Large Language Models via Instruction-Aware Contextual Compression.

CoRR, 2024

Learn To Learn More Precisely.

CoRR, 2024

Generative Technology for Human Emotion Recognition: A Scope Review.

CoRR, 2024

Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba.

CoRR, 2024

SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

CodeSwap: Symmetrically Face Swapping Based on Prior Codebook.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Language-Driven Navigation Strategy Integrating Semantic Maps and Large Language Models.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

2023

STAN: Spatial-Temporal Awareness Network for Temporal Action Detection.

Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, 2023

2022

Poster Abstract: Representation Learning from Multimodal Sensor Data with Maximally Correlated Autoencoders.

Proceedings of the 21st ACM/IEEE International Conference on Information Processing in Sensor Networks, 2022

2021

Maximum Likelihood Estimation for Multimodal Learning with Missing Modality.

CoRR, 2021

A Semi-supervised Learning Approach for Visual Question Answering based on Maximal Correlation.

Sikai Yin, Fei Ma, Shao-Lun Huang

Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics, 2021

An Efficient Approach for Audio-Visual Emotion Recognition With Missing Labels And Missing Modalities.

Fei Ma, Shao-Lun Huang, Lin Zhang

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Semi-Supervised Multimodal Image Translation for Missing Modality Imputation.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Person Recognition with HGR Maximal Correlation on Multimodal Data.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019

Unsupervised anomaly detection via generative adversarial networks: poster abstract.

Proceedings of the 18th International Conference on Information Processing in Sensor Networks, 2019

Info-Detection: An Information-Theoretic Approach to Detect Outlier.

Proceedings of the Neural Information Processing - 26th International Conference, 2019

An End-to-End Learning Approach for Multimodal Emotion Recognition: Extracting Common and Private Information.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

2018

Multimodal Emotion Recognition by extracting common and modality-specific information.

Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018

Speech Emotion Recognition via Attention-based DNN from Multi-Task Learning.

Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018

Real-Time Emotion Detection via E-See.

Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018
