Fei Ma

Orcid: 0009-0002-5388-9125

Affiliations:
  • Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China
  • Tsinghua University, Tsinghua-Berkeley Shenzhen Institute, DSIT Research Center, Shenzhen, China


According to our database, Fei Ma authored at least 70 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Trajectory Tracking Control of Fully Actuated Hexarotor UAVs With Adaptive Iterative Learning: From Theory to Application.
IEEE Trans. Ind. Electron., May, 2026

AffectAgent: Collaborative Multi-Agent Reasoning for Retrieval-Augmented Multimodal Emotion Recognition.
CoRR, April, 2026

LottieGPT: Tokenizing Vector Animation for Autoregressive Generation.
CoRR, April, 2026

T2SGrid: Temporal-to-Spatial Gridification for Video Temporal Grounding.
CoRR, March, 2026

Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning.
CoRR, February, 2026

E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis.
CoRR, January, 2026

Restoring neural radiance fields performance under adverse weather conditions.
Eng. Appl. Artif. Intell., 2026

TheraMind: A Strategic and Adaptive Agent for Longitudinal Psychological Counseling.
Proceedings of the ACM Web Conference 2026, 2026

A Principle-Driven Adaptive Policy for Group Cognitive Stimulation Dialogue for Elderly with Cognitive Impairment.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

D-GARA: A Dynamic Benchmarking Framework for GUI Agent Robustness in Real-World Anomalies.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Human Motion Video Generation: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2025

TheraMind: A Strategic and Adaptive Agent for Longitudinal Psychological Counseling.
CoRR, October, 2025

Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation.
CoRR, October, 2025

Rethinking Efficient and Effective Point-Based Networks for Event Camera Classification and Regression.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks for Enhanced Action Understanding.
CoRR, August, 2025

Enhancing Long Video Question Answering with Scene-Localized Frame Grouping.
CoRR, August, 2025

EMER-Ranker: Learning to Rank Emotion Descriptions in the Absence of Ground Truth.
CoRR, July, 2025

Universal Visuo-Tactile Video Understanding for Embodied Interaction.
CoRR, May, 2025

Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling.
CoRR, May, 2025

EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models.
CoRR, February, 2025

A Review of Human Emotion Synthesis Based on Generative Technology.
IEEE Trans. Affect. Comput., 2025

MLLM-TA: Leveraging Multimodal Large Language Models for Precise Temporal Video Grounding.
IEEE Signal Process. Lett., 2025

Generative technology for human emotion recognition: A scoping review.
Inf. Fusion, 2025

VisualRWKV-HM: Enhancing linear visual-language models via hybrid mixing.
Inf. Fusion, 2025

Exploring Embodied Multimodal Large Models: Development, datasets, and future directions.
Inf. Fusion, 2025

HoloTrace: LLM-based Bidirectional Causal Knowledge Graph for Edge-Cloud Video Anomaly Detection.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

OnlineHOI: Towards Online Human-Object Interaction Generation and Perception.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

CCIS-DIFF: A Generative Model with Stable Diffusion Prior for Controlled Colonoscopy Image Synthesis.
Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025

Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

GaussianPU: Color Point Cloud Upsampling via 3D Gaussian Splatting.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

VideoHumanMIB: Unlocking Appearance Decoupling for Video Human Motion In-betweening.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Active Multimodal Distillation for Few-shot Action Recognition.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Inter3D: A Benchmark and Strong Baseline for Human-Interactive 3D Object Reconstruction.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Object Isolated Attention for Consistent Story Visualization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

UniSync: A Unified Framework for Audio-Visual Synchronization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

DictAvatar: Expressive Facial Avatar Reconstruction with Facial Feature Dictionary.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

DEP-SLAM: A Dynamic Environment Perception SLAM System with Large Language Models.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

ABM++: Learning Generalizable Manipulation Policies with a Mask-Guided World Model.
Proceedings of the ECAI 2025 - 28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy, 2025

Training-Free Language-Guided Video Summarization via Multi-Grained Saliency Scoring.
Proceedings of the Computational Visual Media - 13th International Conference, 2025

VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

ReMask-Animate: Refined Character Image Animation Using Mask-Guided Adapters.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Subgraph Invariant Learning Towards Large-Scale Graph Node Classification.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Dependency-Aware Microservice Deployment for Edge Computing: A Deep Reinforcement Learning Approach With Network Representation.
IEEE Trans. Mob. Comput., December, 2024

Your blush gives you away: detecting hidden mental states with remote photoplethysmography and thermal imaging.
PeerJ Comput. Sci., 2024

Frequency-aware Event Cloud Network.
CoRR, 2024

SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing.
CoRR, 2024

GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting.
CoRR, 2024

Enhancing and Accelerating Large Language Models via Instruction-Aware Contextual Compression.
CoRR, 2024

Learn To Learn More Precisely.
CoRR, 2024

Generative Technology for Human Emotion Recognition: A Scope Review.
CoRR, 2024

Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba.
CoRR, 2024

SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

CodeSwap: Symmetrically Face Swapping Based on Prior Codebook.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Language-Driven Navigation Strategy Integrating Semantic Maps and Large Language Models.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

2023
STAN: Spatial-Temporal Awareness Network for Temporal Action Detection.
Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, 2023

2022
Poster Abstract: Representation Learning from Multimodal Sensor Data with Maximally Correlated Autoencoders.
Proceedings of the 21st ACM/IEEE International Conference on Information Processing in Sensor Networks, 2022

2021
Maximum Likelihood Estimation for Multimodal Learning with Missing Modality.
CoRR, 2021

A Semi-supervised Learning Approach for Visual Question Answering based on Maximal Correlation.
Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics, 2021

An Efficient Approach for Audio-Visual Emotion Recognition With Missing Labels And Missing Modalities.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Semi-Supervised Multimodal Image Translation for Missing Modality Imputation.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Person Recognition with HGR Maximal Correlation on Multimodal Data.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019
Unsupervised anomaly detection via generative adversarial networks: poster abstract.
Proceedings of the 18th International Conference on Information Processing in Sensor Networks, 2019

Info-Detection: An Information-Theoretic Approach to Detect Outlier.
Proceedings of the Neural Information Processing - 26th International Conference, 2019

An End-to-End Learning Approach for Multimodal Emotion Recognition: Extracting Common and Private Information.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

2018
Multimodal Emotion Recognition by extracting common and modality-specific information.
Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018

Speech Emotion Recognition via Attention-based DNN from Multi-Task Learning.
Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018

Real-Time Emotion Detection via E-See.
Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, SenSys 2018, 2018

