Xuecheng Wu

This is a disambiguation page: it lists papers by several people with the same or a similar name.

Bibliography

2026

MooD: Perception-Enhanced Efficient Affective Image Editing via Continuous Valence-Arousal Modeling.

CoRR, May, 2026

A Benchmark of Microvideos for Public Opinion Analysis.

IEEE Trans. Comput. Soc. Syst., April, 2026

AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning.

CoRR, April, 2026

AIM-Bench: Benchmarking and Improving Affective Image Manipulation via Fine-Grained Hierarchical Control.

CoRR, April, 2026

EPIR: An Efficient Patch Tokenization, Integration and Representation Framework for Micro-expression Recognition.

CoRR, April, 2026

DiffVC: A Non-autoregressive Framework Based on Diffusion Model for Video Captioning.

CoRR, April, 2026

TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning.

CoRR, April, 2026

FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing.

CoRR, March, 2026

ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation.

CoRR, March, 2026

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering.

CoRR, February, 2026

Dolphin-v2: Universal Document Parsing via Scalable Anchor Prompting.

CoRR, February, 2026

RePose: A Real-Time 3D Human Pose Estimation and Biomechanical Analysis Framework for Rehabilitation.

CoRR, January, 2026

Disentangling Hardness from Noise: An Uncertainty-Driven Model-Agnostic Framework for Long-Tailed Remote Sensing Classification.

CoRR, January, 2026

A Trustworthy Method for Multimodal Emotion Recognition.

Big Data Min. Anal., 2026

2025

Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control.

CoRR, December, 2025

DARTs: A Dual-Path Robust Framework for Anomaly Detection in High-Dimensional Multivariate Time Series.

CoRR, December, 2025

V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction.

CoRR, November, 2025

Improving Multimodal Sentiment Analysis via Modality Optimization and Dynamic Primary Modality Selection.

CoRR, November, 2025

Code-driven Number Sequence Calculation: Enhancing the inductive Reasoning Abilities of Large Language Models.

CoRR, October, 2025

A Survey of Inductive Reasoning for Large Language Models.

CoRR, October, 2025

Scalable Audio-Visual Masked Autoencoders for Efficient Affective Video Facial Analysis.

CoRR, September, 2025

Towards Comprehensive Interactive Change Understanding in Remote Sensing: A Large-scale Dataset and Dual-granularity Enhanced VLM.

CoRR, September, 2025

End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection.

CoRR, September, 2025

Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning.

CoRR, September, 2025

TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training.

CoRR, August, 2025

AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition.

CoRR, August, 2025

eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos.

CoRR, August, 2025

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs.

CoRR, June, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment.

..., Trong-Hieu Nguyen-Mau, ..., Minh-Khoa Le-Phan, ..., Hai-Dang Nguyen, Minh-Triet Tran, ...

CoRR, May, 2025

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations.

CoRR, May, 2025

Incipient fault detection and process monitoring of thermal power plant pulverizing system based on deep representation learning.

Trans. Inst. Meas. Control, 2025

Holographic Airborne for Cloud Particle Imager (HACPI): Development and Applications to Icing Wind Tunnels.

IEEE Trans. Instrum. Meas., 2025

Airborne Interferometric Cloud Particle Imager: Instrumentation and Applications to Icing Wind Tunnel.

IEEE Trans. Instrum. Meas., 2025

Rainbow Airborne Cloud Particle Imager: Instrumentation and Applications for Supercooled Droplet Temperature Measurement.

IEEE Trans. Instrum. Meas., 2025

Multi-objective optimization of regional energy systems with exergy efficiency and user satisfaction dynamics.

..., Qiongbing Xiong, ...

Sustain. Comput. Informatics Syst., 2025

Self-Distillation Based Multi-task Learning Model For Stylus Input Latency Compensation MHCI027.

Proc. ACM Hum. Comput. Interact., 2025

Affective Video Content Analysis: Decade Review and New Perspectives.

Big Data Min. Anal., 2025

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations.

Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

HOLA: Enhancing Audio-visual Deepfake Detection via Hierarchical Contextual Aggregations and Efficient Pre-training.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

DDSE: A Decoupled Dual-Stream Enhanced Framework for Multimodal Sentiment Analysis with Text-Centric SSM.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

DSACap: Enhancing Visual-Semantic Alignment with Diffusion-based Framework for Image Captioning.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Towards Emotion Analysis in Short-form Videos: A Large-Scale Dataset and Baseline.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

MirrorDiff: Learning Mirror Diffusion for Image Captioning via Regeneration.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

PTSR: A Unified Patch Tokenization, Selection and Representation Framework for Efficient Micro-expression Recognition.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

TACR-YOLO: A Real-time Detection Framework for Abnormal Human Behaviors Enhanced with Coordinate and Task-Aware Representations.

Proceedings of the International Joint Conference on Neural Networks, 2025

InfoSyncNet: Information Synchronization Temporal Convolutional Network for Visual Speech Recognition.

Proceedings of the International Joint Conference on Neural Networks, 2025

FAMNet: Integrating 2D and 3D Features for Micro-expression Recognition via Multi-task Learning and Hierarchical Attention.

Proceedings of the International Joint Conference on Neural Networks, 2025

MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models.

..., Kongcheng Zhang, ...

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment.

..., Trong-Hieu Nguyen-Mau, ..., Minh-Khoa Le-Phan, ..., Hai-Dang Nguyen, Minh-Triet Tran, ...

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024

Spectroscopic Techniques and Hydrogen-Sensitive Compounds: A New Horizon in Hydrogen Detection.

..., Chenghang Zheng, ...

Sensors, May, 2024

Research on Irregular Flight Recovery Strategy Under Different Flight Route Types With Big Data Computing.

Int. J. Inf. Technol. Syst. Approach, 2024

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations.

CoRR, 2024

Building Robust Video-Level Deepfake Detection via Audio-Visual Local-Global Interactions.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Temporal Semantic Scoring Path Aware Multi-embedding Sequential Recommendation.

Proceedings of the Neural Information Processing - 31st International Conference, 2024

2023

eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos.

CoRR, 2023

Emotion Recognition by Video: A review.

CoRR, 2023

2022

Discrimination and Measurement of Droplet and Ice Crystal by Combining Digital Inline Holography With Interferometric Particle Imaging With Single Color Camera.

IEEE Trans. Instrum. Meas., 2022

A Method for Medical Microscopic Images' Sharpness Evaluation Based on NSST and Variance by Combining Time and Frequency Domains.

Sensors, 2022

Research on Mask Wearing Detection of Natural Population Based on Improved YOLOv4.

CoRR, 2022

ICANet: A Method of Short Video Emotion Recognition Driven by Multimodal Data.

CoRR, 2022
