Dan Guo

Orcid: 0000-0003-2594-254X

Affiliations:
  • Hefei University of Technology, Hefei, China


According to our database, Dan Guo authored at least 144 papers between 2010 and 2025.

Bibliography

2025
MSPhys: multiscale fusing-based diffusion model for remote physiological measurement.
Mach. Vis. Appl., September, 2025

Alleviating Confirmation Bias in Learning with Noisy Labels via Two-Network Collaboration.
ACM Trans. Intell. Syst. Technol., August, 2025

Emotion Separation and Recognition From a Facial Expression by Generating the Poker Face With Vision Transformers.
IEEE Trans. Comput. Soc. Syst., August, 2025

Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment.
CoRR, August, 2025

CLASP: Cross-modal Salient Anchor-based Semantic Propagation for Weakly-supervised Dense Audio-Visual Event Localization.
CoRR, August, 2025

Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation.
CoRR, August, 2025

Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering.
CoRR, August, 2025

Online Micro-gesture Recognition Using Data Augmentation and Spatial-Temporal Attention.
CoRR, July, 2025

MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion.
CoRR, July, 2025

Facial Depression Estimation via Multi-Cue Contrastive Learning.
IEEE Trans. Circuits Syst. Video Technol., June, 2025

SSAM: Self-Supervised Association Modeling for Test-Time Adaption.
CoRR, June, 2025

Towards Energy-efficient Audio-visual Classification via Multimodal Interactive Spiking Neural Network.
ACM Trans. Multim. Comput. Commun. Appl., May, 2025

Temporal Boundary Awareness Network for Repetitive Action Counting.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

Introduction to the Special Issue on Deep Learning for Robust Human Body Language Understanding.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation.
IEEE Trans. Comput. Soc. Syst., April, 2025

Multi-Objective Convex Quantization for Efficient Model Compression.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

Audio-Visual Segmentation with Semantics.
Int. J. Comput. Vis., April, 2025

EmoSEM: Segment and Explain Emotion Stimuli in Visual Art.
CoRR, April, 2025

Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering.
CoRR, April, 2025

Ensemble Prototype Network For Weakly Supervised Temporal Action Localization.
IEEE Trans. Neural Networks Learn. Syst., March, 2025

A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli.
CoRR, March, 2025

An Active Multi-Target Domain Adaptation Strategy: Progressive Class Prototype Rectification.
IEEE Trans. Multim., 2025

Repetitive Action Counting With Hybrid Temporal Relation Modeling.
IEEE Trans. Multim., 2025

Text-Infused Audio-Visual Video Parsing with Semantic-Aware Multimodal Contrastive Learning.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Linguistics-Vision Monotonic Consistent Network for Sign Language Production.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Towards Open-Vocabulary Audio-Visual Event Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Patch-level Sounding Object Tracking for Audio-Visual Question Answering.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Multimodal Graph Causal Embedding for Multimedia-Based Recommendation.
IEEE Trans. Knowl. Data Eng., December, 2024

Depth Matters: Spatial Proximity-Based Gaze Cone Generation for Gaze Following in Wild.
ACM Trans. Multim. Comput. Commun. Appl., November, 2024

Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-Wise Pseudo Labeling.
Int. J. Comput. Vis., November, 2024

Seeking False Hard Negatives for Graph Contrastive Learning.
IEEE Trans. Circuits Syst. Video Technol., August, 2024

Channel-Wise Interactive Learning for Remote Heart Rate Estimation From Facial Video.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Dual-Path TokenLearner for Remote Photoplethysmography-Based Physiological Measurement With Facial Videos.
IEEE Trans. Comput. Soc. Syst., June, 2024

Graph Pooling Inference Network for Text-based VQA.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024

Visual-linguistic-stylistic Triple Reward for Cross-lingual Image Captioning.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024

FedSH: Towards Privacy-Preserving Text-Based Person Re-Identification.
IEEE Trans. Multim., 2024

Active Factor Graph Network for Group Activity Recognition.
IEEE Trans. Image Process., 2024

Emotional Video Captioning With Vision-Based Emotion Interpretation Network.
IEEE Trans. Image Process., 2024

Benchmarking Micro-Action Recognition: Dataset, Methods, and Applications.
IEEE Trans. Circuits Syst. Video Technol., 2024

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights.
CoRR, 2024

Moderating the Generalization of Score-based Generative Model.
CoRR, 2024

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation.
CoRR, 2024

Grounding is All You Need? Dual Temporal Grounding for Video Dialog.
CoRR, 2024

Scene-Text Grounding for Text-Based Video Question Answering.
CoRR, 2024

MMAD: Multi-label Micro-Action Detection in Videos.
CoRR, 2024

Micro-gesture Online Recognition using Learnable Query Points.
CoRR, 2024

A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+.
CoRR, 2024

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report.
CoRR, 2024

Robust video question answering via contrastive cross-modality representation learning.
Sci. China Inf. Sci., 2024

Joint Spatial-Temporal Modeling and Contrastive Learning for Self-supervised Heart Rate Measurement.
Proceedings of the 3rd Vision-based Remote Physiological Signal Sensing Challenge & Workshop (RePSS 2024) co-located with the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024

AMTN: Attention-Enhanced Multimodal Temporal Network for Humor Detection.
Proceedings of the 5th on Multimodal Sentiment Analysis Challenge and Workshop: Social Perception and Humor, 2024

Repetitive Action Counting with Feature Interaction Enhancement and Adaptive Gate Fusion.
Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024

Cluster-Phys: Facial Clues Clustering Towards Efficient Remote Physiological Measurement.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Maskable Retentive Network for Video Moment Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MAC 2024: Micro-Action Analysis Grand Challenge.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Micro-gesture Online Recognition using Learnable Query Points.
Proceedings of IJCAI 2024 Workshop&Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2024) co-located with 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024

Prototype Learning for Micro-gesture Classification.
Proceedings of IJCAI 2024 Workshop&Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2024) co-located with 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024

Label-Anticipated Event Disentanglement for Audio-Visual Video Parsing.
Proceedings of the Computer Vision - ECCV 2024, 2024

Training A Small Emotional Vision Language Model for Visual Art Comprehension.
Proceedings of the Computer Vision - ECCV 2024, 2024

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Frequency Decoupling for Motion Magnification Via Multi-Level Isomorphic Architecture.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Data-Free Quantization via Pseudo-label Filtering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Syntax-Controllable Video Captioning with Tree-Structural Syntax Augmentation.
Proceedings of the 2024 2nd Asia Conference on Computer Vision, 2024

Towards Understanding Future: Consistency Guided Probabilistic Modeling for Action Anticipation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

KPA-Tracker: Towards Robust and Real-Time Category-Level Articulated Object 6D Pose Tracking.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Transformer-Based Visual Grounding with Cross-Modality Interaction.
ACM Trans. Multim. Comput. Commun. Appl., November, 2023

Micro-expression recognition with attention mechanism and region enhancement.
Multim. Syst., October, 2023

ViGT: proposal-free video grounding with a learnable token in the transformer.
Sci. China Inf. Sci., October, 2023

Spatiotemporal contrastive modeling for video moment retrieval.
World Wide Web (WWW), July, 2023

Contrastive Positive Sample Propagation Along the Audio-Visual Event Line.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation.
ACM Trans. Inf. Syst., April, 2023

LCSNet: End-to-end Lipreading with Channel-aware Feature Selection.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Joint Multi-Grained Popularity-Aware Graph Convolution Collaborative Filtering for Recommendation.
IEEE Trans. Comput. Soc. Syst., February, 2023

Global Temporal Difference Network for Action Recognition.
IEEE Trans. Multim., 2023

Contextual Attention Network for Emotional Video Captioning.
IEEE Trans. Multim., 2023

Multimodal Graph Contrastive Learning for Multimedia-Based Recommendation.
IEEE Trans. Multim., 2023

Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA.
IEEE Trans. Image Process., 2023

Memorial GAN With Joint Semantic Optimization for Unpaired Image Captioning.
IEEE Trans. Cybern., 2023

Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding.
CoRR, 2023

Dual-path TokenLearner for Remote Photoplethysmography-based Physiological Measurement with Facial Videos.
CoRR, 2023

Improving Audio-Visual Video Parsing with Pseudo Visual Labels.
CoRR, 2023

Multimodal Counterfactual Learning Network for Multimedia-based Recommendation.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Exploiting Diverse Feature for Multimodal Sentiment Analysis.
Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

Emotion-Prior Awareness Network for Emotional Video Captioning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Data Augmentation for Human Behavior Analysis in Multi-Person Conversations.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Multi-modality Fusion for Emotion Recognition in Videos.
Proceedings of IJCAI-2023 Workshop&Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2023) co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

Joint Skeletal and Semantic Embedding Loss for Micro-gesture Classification.
Proceedings of IJCAI-2023 Workshop&Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2023) co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

2022
Graph-Based Multimodal Sequential Embedding for Sign Language Translation.
IEEE Trans. Multim., 2022

Context-Aware Graph Inference With Knowledge Distillation for Visual Dialog.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Visual feature synthesis with semantic reconstructor for traditional and generalized zero-shot object classification.
Int. J. Intell. Syst., 2022

MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation.
CoRR, 2022

Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Audio-Visual Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
DDFPN: Context enhanced network for object detection.
Future Gener. Comput. Syst., 2021

Pairwise VLAD Interaction Network for Video Question Answering.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Distilling Dynamic Spatial Relation Network for Human Pose Estimation.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Proposal-Free Video Grounding with Contextual Pyramid Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Shuffle Scheduling for MapReduce Jobs Based on Periodic Network Status.
IEEE/ACM Trans. Netw., 2020

Hierarchical Recurrent Deep Fusion Using Adaptive Clip Summarization for Sign Language Translation.
IEEE Trans. Image Process., 2020

Textual-Visual Reference-Aware Attention Network for Visual Dialog.
IEEE Trans. Image Process., 2020

Unsupervised video summarization via clustering validity index.
Multim. Tools Appl., 2020

Recurrent Relational Memory Network for Unsupervised Image Captioning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

AOPNet: Anchor Offset Prediction Network for Temporal Action Proposal Generation.
Proceedings of the IEEE International Conference on Signal Processing, 2020

Iterative Context-Aware Graph Inference for Visual Dialog.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Cross-Modality Retrieval by Joint Correlation Learning.
ACM Trans. Multim. Comput. Commun. Appl., 2019

DADNet: Dilated-Attention-Deformable ConvNet for Crowd Counting.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Dual Visual Attention Network for Visual Dialog.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Dense Temporal Convolution Network for Sign Language Translation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Parallel Temporal Encoder For Sign Language Translation.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018
Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2018

Co-occurrence pattern mining based on a biological approximation scoring matrix.
Pattern Anal. Appl., 2018

Connectionist Temporal Fusion for Sign Language Translation.
Proceedings of the 2018 ACM Multimedia Conference, 2018

Hierarchical LSTM for Sign Language Translation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Improved marching tetrahedra algorithm based on hierarchical signed distance field and multi-scale depth map fusion for 3D reconstruction.
J. Vis. Commun. Image Represent., 2017

Contraflow-constrained evacuation route planning.
Proceedings of the 13th International Conference on Natural Computation, 2017

2016
Complex-query web image search with concept-based relevance estimation.
World Wide Web, 2016

Parametric and nonparametric residual vector quantization optimizations for ANN search.
Neurocomputing, 2016

Motion-compensated frame interpolation with weighted motion estimation and hierarchical vector refinement.
Neurocomputing, 2016

Sign language recognition based on adaptive HMMs with data augmentation.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Frequent Pattern Mining Based on Approximate Edit Distance Matrix.
Proceedings of the IEEE First International Conference on Data Science in Cyberspace, 2016

Max-Flow Rate Priority Algorithm for Evacuation Route Planning.
Proceedings of the IEEE First International Conference on Data Science in Cyberspace, 2016

2013
MAIL: mining sequential patterns with wildcards.
Int. J. Data Min. Bioinform., 2013

Pattern matching with wildcards and gap-length constraints based on a centrality-degree graph.
Appl. Intell., 2013

Flexible Pattern Matching with Gap-Length and One-Off Conditions.
Proceedings of the 25th IEEE International Conference on Tools with Artificial Intelligence, 2013

2012
Online pattern matching with wildcards.
Proceedings of the 2012 IEEE International Conference on Granular Computing, 2012

2011
A Bit-Parallel Algorithm for Sequential Pattern Matching with Wildcards.
Cybern. Syst., 2011

2010
Sequential Pattern Mining with Wildcards.
Proceedings of the 22nd IEEE International Conference on Tools with Artificial Intelligence, 2010

