Dongmei Jiang

Orcid: 0000-0002-6238-8499

According to our database, Dongmei Jiang authored at least 158 papers between 2006 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Efficient Adversarial Training via Criticality-Aware Fine-Tuning.
CoRR, April, 2026

EnergyAction: Unimanual to Bimanual Composition with Energy-Based Models.
CoRR, March, 2026

FedDAAM: Federated Domain Adversarial Learning With Attention Mechanism for Privacy Preserving Multimodal Depression Assessment.
IEEE Trans. Circuits Syst. Video Technol., February, 2026

Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation.
CoRR, February, 2026

DreamAssemble: Complex Multi-Object Text-to-3D Generation via Multi-Density Neural Fields.
IEEE Trans. Image Process., 2026

Appearance- and Relation-Aware Parallel Graph Attention Fusion Network for Facial Expression Recognition.
IEEE Trans. Affect. Comput., 2026

Facial Action Units Generation via Cross-Modality Attention Fusion and Calibrated Denoising.
IEEE Signal Process. Lett., 2026

AlignMamba-2: Enhancing multimodal fusion and sentiment analysis with modality-aware Mamba.
Pattern Recognit., 2026

Dual-Attention based prompt generation and catalyzing for instance-wise continual learning.
Pattern Recognit., 2026

Multi-modal prompt learning for facial expression recognition: Leveraging emojis and large language models.
Inf. Fusion, 2026

FDA-CAPMA: Federated domain adaptation with co-activation pattern and multimodal mamba for fMRI depression detection.
Inf. Fusion, 2026

Modeling speaker-specific long-term context for emotion recognition in conversation.
Inf. Fusion, 2026

RA3-FDA: Resource-adaptive federated domain adaptation with dual heterogeneity awareness for EEG-based depression detection.
Expert Syst. Appl., 2026

Bolster Hallucination Detection via Prompt-Guided Data Augmentation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
ARPGNet: Appearance- and Relation-aware Parallel Graph Attention Fusion Network for Facial Expression Recognition.
CoRR, November, 2025

Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization.
CoRR, September, 2025

FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models.
CoRR, August, 2025

HumanPCR: Probing MLLM Capabilities in Diverse Human-Centric Scenes.
CoRR, August, 2025

Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills.
CoRR, June, 2025

Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts.
CoRR, June, 2025

Prompt Customization for Continual Learning.
IEEE Trans. Artif. Intell., May, 2025

Harmony: A Unified Framework for Modality Incremental Learning.
CoRR, April, 2025

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation.
CoRR, March, 2025

GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning.
Briefings Bioinform., March, 2025

CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation.
CoRR, January, 2025

ExpLLM: Towards Chain of Thought for Facial Expression Recognition.
IEEE Trans. Multim., 2025

Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection.
IEEE Trans. Multim., 2025

AVES: An Audio-Visual Emotion Stream Dataset for Temporal Emotion Detection.
IEEE Trans. Affect. Comput., 2025

DDC: Dynamic distribution calibration for few-shot learning under multi-scale representation.
Knowl. Based Syst., 2025

DS-Det: Single-Query Paradigm and Attention Disentangled Learning for Flexible Object Detection.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Open-Det: An Efficient Learning Framework for Open-Ended Detection.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

DTAD: A Distribution-Transformed Supervised Anomaly Detection Method.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

PolaFormer: Polarity-aware Linear Attention for Vision Transformers.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Comprehensive Perturbation Consistency for Semi-Supervised Change Detection in Remote Sensing Images.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Learning Hierarchical Continuous Dynamics for Facial Action Unit Intensity Estimation.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2025

Transferable Adversarial Face Attack with Text Controlled Attribute.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Unsupervised Degradation Representation Aware Transform for Real-World Blind Image Super-Resolution.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Facial Action Unit Representation Based on Self-Supervised Learning With Ensembled Priori Constraints.
IEEE Trans. Image Process., 2024

Directional Spatial and Spectral Attention Network (DSSA Net) for EEG-based emotion recognition.
Frontiers Neurorobotics, 2024

A single frame and multi-frame joint network for 360-degree panorama video super-resolution.
Eng. Appl. Artif. Intell., 2024

CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs.
CoRR, 2024

Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment.
CoRR, 2024

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.
CoRR, 2024

Enhancing the Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought.
CoRR, 2024

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment.
Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 2024

MLP-DINO: Category Modeling and Query Graphing with Deep MLP for Object Detection.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Facial Action Unit Detection with the Semantic Prompt.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Disentangled Task Representation Learning for Offline Meta Reinforcement Learning.
Proceedings of the IEEE International Conference on Agents, 2024

The Second Visual Object Tracking Segmentation VOTS2024 Challenge Results.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

CricaVPR: Cross-Image Correlation-Aware Representation Learning for Visual Place Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Deep Homography Estimation for Visual Place Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Cross-view adaptive graph attention network for dynamic facial expression recognition.
Multim. Syst., October, 2023

HiT-MST: Dynamic facial expression recognition with hierarchical transformers and multi-scale spatiotemporal aggregation.
Inf. Sci., October, 2023

A Bayesian Filtering Framework for Continuous Affect Recognition From Facial Images.
IEEE Trans. Multim., 2023

Region Attentive Action Unit Intensity Estimation With Uncertainty Weighted Multi-Task Learning.
IEEE Trans. Affect. Comput., 2023

Efficient spatiotemporal context modeling for action recognition.
Neurocomputing, 2023

Benign Shortcut for Debiasing: Fair Visual Recognition via Intervention with Shortcut Features.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Towards Adaptable Graph Representation Learning: An Adaptive Multi-Graph Contrastive Transformer.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Semi-Supervised Multimodal Emotion Recognition with Class-Balanced Pseudo-labeling.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Strip-MLP: Efficient Token Interaction for Vision MLP.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Relate Auditory Speech to EEG by Shallow-Deep Attention-Based Network.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Context-Aware EEG-Based Perceived Stress Recognition based on Emotion Transition Paradigm.
Proceedings of the 11th International Conference on Affective Computing and Intelligent Interaction, ACII 2023, 2023

2022
Leveraging the Deep Learning Paradigm for Continuous Affect Estimation from Facial Expressions.
IEEE Trans. Affect. Comput., 2022

A multi-scale multi-attention network for dynamic facial expression recognition.
Multim. Syst., 2022

Uncertainty-Aware Semi-Supervised Learning of 3D Face Rigging from Single Image.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2021
Monocular 3D Facial Expression Features for Continuous Affect Recognition.
IEEE Trans. Multim., 2021

Transformer Encoder With Multi-Modal Multi-Head Attention for Continuous Affect Recognition.
IEEE Trans. Multim., 2021

Integrating Deep and Shallow Models for Multi-Modal Depression Analysis - Hybrid Architectures.
IEEE Trans. Affect. Comput., 2021

A Lightweight Object Detection Framework for Remote Sensing Images.
Remote. Sens., 2021

Efficient Spatialtemporal Context Modeling for Action Recognition.
CoRR, 2021

Action Unit Driven Facial Expression Synthesis from a Single Image with Patch Attentive GAN.
Comput. Graph. Forum, 2021

Aspect-based Sentiment Analysis with Weighted Relational Graph Attention Network.
Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

Temporal Attentive Adversarial Domain Adaption for Cross Cultural Affect Recognition.
Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

Positional-Spectral-Temporal Attention in 3D Convolutional Neural Networks for EEG Emotion Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
An efficient model-level fusion approach for continuous affect recognition from audiovisual signals.
Neurocomputing, 2020

Adaptive dictionary learning based on local configuration pattern for face recognition.
EURASIP J. Adv. Signal Process., 2020

Detection and diagnosis of myocarditis in young patients using ECG analysis based on artificial neural networks.
Computing, 2020

Emotion recognition from spatiotemporal EEG representations with hybrid convolutional recurrent neural networks via wearable multi-channel headset.
Comput. Commun., 2020

Feature Augmenting Networks for Improving Depression Severity Estimation From Speech Signals.
IEEE Access, 2020

Learning Salient Segments for Speech Emotion Recognition Using Attentive Temporal Pooling.
IEEE Access, 2020

MEMOS: A Multi-modal Emotion Stream Database for Temporal Spontaneous Emotional State Detection.
Proceedings of the Companion Publication of the 2020 International Conference on Multimodal Interaction, 2020

2019
Automatic Depression Analysis Using Dynamic Facial Appearance Descriptor and Dirichlet Process Fisher Encoding.
IEEE Trans. Multim., 2019

A video prediction approach for animating single face image.
Multim. Tools Appl., 2019

Continuous affect recognition with weakly supervised learning.
Multim. Tools Appl., 2019

A Common Spatial Pattern and Wavelet Packet Decomposition Combined Method for EEG-Based Emotion Recognition.
J. Adv. Comput. Intell. Intell. Informatics, 2019

Automatic Face Recognition Based on Sparse Representation and Extended Transfer Learning.
IEEE Access, 2019

A Multimodal Framework for State of Mind Assessment with Sentiment Pre-classification.
Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, 2019

Efficient Spatial Temporal Convolutional Features for Audiovisual Continuous Affect Recognition.
Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, 2019

FACS3D-Net: 3D Convolution based Spatiotemporal Representation for Action Unit Detection.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

Audio Visual Multimodal Classification of Bipolar Disorder Episodes.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2019

2018
Exploiting Structured Sparsity for Hyperspectral Anomaly Detection.
IEEE Trans. Geosci. Remote. Sens., 2018

Leveraging the Bayesian Filtering Paradigm for Vision-Based Facial Affective State Estimation.
IEEE Trans. Affect. Comput., 2018

Hierarchical sparse coding framework for speech emotion recognition.
Speech Commun., 2018

Structured Background Modeling for Hyperspectral Anomaly Detection.
Sensors, 2018

A New Motor Imagery EEG Classification Method FB-TRCSP+RF Based on CSP and Random Forest.
IEEE Access, 2018

An Improved Camouflage Target Detection Using Hyperspectral Image Based on Block-Diagonal and Low-Rank Representation.
Proceedings of the Pattern Recognition and Computer Vision - First Chinese Conference, 2018

Bipolar Disorder Recognition with Histogram Features of Arousal and Body Gestures.
Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop, 2018

The Design of Immersion Acrophobia Adjuvant Therapy System (IAATS).
Proceedings of the Digital TV and Multimedia Communication - 15th International Forum, 2018

An Extended Common Spatial Pattern Framework for EEG-Based Emotion Classification.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2018

2017
Hybrid Depression Classification and Estimation from Audio Video and Text Information.
Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, CA, USA, October 23, 2017

Multimodal Measurement of Depression Using Deep Learning Models.
Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, CA, USA, October 23, 2017

A hybrid PAPR reduction approach for the IM/DD optical OFDM communications.
Proceedings of the 2017 IEEE/CIC International Conference on Communications in China, 2017

DCNN and DNN based multi-modal depression recognition.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

2016
Hybrid precoding with compressive sensing based limited feedback in massive MIMO systems.
Trans. Emerg. Telecommun. Technol., 2016

Efficient Convolutional Auto-Encoding via Random Convexification and Frequency-Domain Minimization.
CoRR, 2016

A multiCell visual tracking algorithm using multi-task particle swarm optimization for low-contrast image sequences.
Appl. Intell., 2016

A Hybrid ACO-ACM Based Approach for Multi-cell Image Segmentation.
Proceedings of the Advances in Swarm Intelligence, 7th International Conference, 2016

Decision Tree Based Depression Classification from Audio Video and Language Information.
Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, 2016

Hyperspectral anomaly detection using background learning and structured sparse representation.
Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium, 2016

Deep neural network and switching Kalman filter based continuous affect recognition.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

Joint Relay Processing and Power Control for Two-Way Relay Networks Under Individual SINR Constraints.
Proceedings of the Communications and Networking, 2016

Audio Visual Recognition of Spontaneous Emotions In-the-Wild.
Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

MEC 2016: The Multimodal Emotion Recognition Challenge of CCPR 2016.
Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

2015
Relevance units machine based dimensional and continuous speech emotion prediction.
Multim. Tools Appl., 2015

Multi-class Object Recognition and Segmentation Based on Multi-feature Fusion Modeling.
Proceedings of the 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015

A Hybrid Multi-Cell Tracking Approach with Level Set Evolution and Ant Colony Optimization.
Proceedings of the Advances in Swarm and Computational Intelligence, 2015

Multimodal Affective Dimension Prediction Using Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

3D emotional facial animation synthesis with factored conditional Restricted Boltzmann Machines.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Multimodal dimensional affect recognition using deep bidirectional long short-term memory recurrent neural networks.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Monocular 3D facial information retrieval for automated facial expression analysis.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Multimodal depression recognition with dynamic visual and audio cues.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Framework for combination aware AU intensity recognition.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features.
Multim. Tools Appl., 2014

Objectifying Facial Expressivity Assessment of Parkinson's Patients: Preliminary Study.
Comput. Math. Methods Medicine, 2014

Physiological Signal Processing for Emotional Feature Extraction.
Proceedings of the PhyCS 2014, 2014

Speech-driven head motion synthesis using neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Coefficients optimization in femtocell utility function for distributed utility-based SINR adaption algorithm.
Proceedings of the Sixth International Conference on Ubiquitous and Future Networks, 2014

Multimodal continuous affect recognition based on LSTM and multiple kernel learning.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
Multiuser two-way relay processing and power control methods for cognitive radio networks.
Wirel. Commun. Mob. Comput., 2013

Speech driven photo-realistic face animation with mouth and jaw dynamics.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Hybrid Deep Neural Network-Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

2012
Power-efficient resource allocation with QoS guarantees for TDMA fading channels.
Wirel. Commun. Mob. Comput., 2012

Joint precoding and power allocation for multiuser transmission in MIMO relay networks.
Int. J. Commun. Syst., 2012

Dimensional emotion driven facial expression synthesis based on the multi-stream DBN model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Photo-realistic visual speech synthesis based on AAM features and an articulatory DBN model with constrained asynchrony.
Proceedings of the Auditory-Visual Speech Processing, 2011

Audio Visual Emotion Recognition Based on Triple-Stream Dynamic Bayesian Network Models.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

Kalman Filter-Based Facial Emotional Expression Recognition.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

2010
Multi-modal feature integration for story boundary detection in broadcast news.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Audio visual speech recognition based on multi-stream DBN models with Articulatory Features.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Dual-microphone noise reduction based on semi-blind DUET.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Realistic mouth animation based on an articulatory DBN model with constrained asynchrony.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

2009
Manifold Analysis for Subject Independent Dynamic Emotion Recognition in Video Sequences.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

Video Realistic Mouth Animation Based on an Audio Visual DBN Model with Articulatory Features and Constrained Asynchrony.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

A Visual Silence Detector Constraining Speech Source Separation.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

Audio-Visual Emotion Recognition Based on a DBN Model with Constrained Asynchrony.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

2008
Speech driven realistic mouth animation based on multi-modal unit selection.
J. Multimodal User Interfaces, 2008

Accurate visual speech synthesis based on diviseme unit selection and concatenation.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

2007
Multi-stream Asynchrony Modeling for Audio-Visual Speech Recognition.
Proceedings of the Ninth IEEE International Symposium on Multimedia, 2007

A Novel DBN Model for Large Vocabulary Continuous Speech Recognition and Phone Segmentation.
Proceedings of the International Conference on Artificial Intelligence and Pattern Recognition, 2007

2006
Personalization of internet telephony services for presence with SIP and extended CPL.
Comput. Commun., 2006

DBN Based Models for Audio-Visual Speech Analysis and Recognition.
Proceedings of the Advances in Multimedia Information Processing, 2006

