Xuecheng Wu

This is a disambiguation page: it lists papers by multiple people with the same or a similar name.

Bibliography

2026
MooD: Perception-Enhanced Efficient Affective Image Editing via Continuous Valence-Arousal Modeling.
CoRR, May, 2026

A Benchmark of Microvideos for Public Opinion Analysis.
IEEE Trans. Comput. Soc. Syst., April, 2026

AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning.
CoRR, April, 2026

AIM-Bench: Benchmarking and Improving Affective Image Manipulation via Fine-Grained Hierarchical Control.
CoRR, April, 2026

EPIR: An Efficient Patch Tokenization, Integration and Representation Framework for Micro-expression Recognition.
CoRR, April, 2026

DiffVC: A Non-autoregressive Framework Based on Diffusion Model for Video Captioning.
CoRR, April, 2026

TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning.
CoRR, April, 2026

FED-Bench: A Cross-Granular Benchmark for Disentangled Evaluation of Facial Expression Editing.
CoRR, March, 2026

ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation.
CoRR, March, 2026

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering.
CoRR, February, 2026

Dolphin-v2: Universal Document Parsing via Scalable Anchor Prompting.
CoRR, February, 2026

RePose: A Real-Time 3D Human Pose Estimation and Biomechanical Analysis Framework for Rehabilitation.
CoRR, January, 2026

Disentangling Hardness from Noise: An Uncertainty-Driven Model-Agnostic Framework for Long-Tailed Remote Sensing Classification.
CoRR, January, 2026

A Trustworthy Method for Multimodal Emotion Recognition.
Big Data Min. Anal., 2026

2025
Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control.
CoRR, December, 2025

DARTs: A Dual-Path Robust Framework for Anomaly Detection in High-Dimensional Multivariate Time Series.
CoRR, December, 2025

V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction.
CoRR, November, 2025

Improving Multimodal Sentiment Analysis via Modality Optimization and Dynamic Primary Modality Selection.
CoRR, November, 2025

Code-driven Number Sequence Calculation: Enhancing the Inductive Reasoning Abilities of Large Language Models.
CoRR, October, 2025

A Survey of Inductive Reasoning for Large Language Models.
CoRR, October, 2025

Scalable Audio-Visual Masked Autoencoders for Efficient Affective Video Facial Analysis.
CoRR, September, 2025

Towards Comprehensive Interactive Change Understanding in Remote Sensing: A Large-scale Dataset and Dual-granularity Enhanced VLM.
CoRR, September, 2025

End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection.
CoRR, September, 2025

Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning.
CoRR, September, 2025

TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training.
CoRR, August, 2025

AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition.
CoRR, August, 2025

eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos.
CoRR, August, 2025

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs.
CoRR, June, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment.
CoRR, May, 2025

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations.
CoRR, May, 2025

Incipient fault detection and process monitoring of thermal power plant pulverizing system based on deep representation learning.
Trans. Inst. Meas. Control, 2025

Holographic Airborne for Cloud Particle Imager (HACPI): Development and Applications to Icing Wind Tunnels.
IEEE Trans. Instrum. Meas., 2025

Airborne Interferometric Cloud Particle Imager: Instrumentation and Applications to Icing Wind Tunnel.
IEEE Trans. Instrum. Meas., 2025

Rainbow Airborne Cloud Particle Imager: Instrumentation and Applications for Supercooled Droplet Temperature Measurement.
IEEE Trans. Instrum. Meas., 2025

Multi-objective optimization of regional energy systems with exergy efficiency and user satisfaction dynamics.
Sustain. Comput. Informatics Syst., 2025

Self-Distillation Based Multi-task Learning Model For Stylus Input Latency Compensation.
Proc. ACM Hum. Comput. Interact., 2025

Affective Video Content Analysis: Decade Review and New Perspectives.
Big Data Min. Anal., 2025

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2025

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

HOLA: Enhancing Audio-visual Deepfake Detection via Hierarchical Contextual Aggregations and Efficient Pre-training.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

DDSE: A Decoupled Dual-Stream Enhanced Framework for Multimodal Sentiment Analysis with Text-Centric SSM.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

DSACap: Enhancing Visual-Semantic Alignment with Diffusion-based Framework for Image Captioning.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Towards Emotion Analysis in Short-form Videos: A Large-Scale Dataset and Baseline.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

MirrorDiff: Learning Mirror Diffusion for Image Captioning via Regeneration.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

PTSR: A Unified Patch Tokenization, Selection and Representation Framework for Efficient Micro-expression Recognition.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

TACR-YOLO: A Real-time Detection Framework for Abnormal Human Behaviors Enhanced with Coordinate and Task-Aware Representations.
Proceedings of the International Joint Conference on Neural Networks, 2025

InfoSyncNet: Information Synchronization Temporal Convolutional Network for Visual Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2025

FAMNet: Integrating 2D and 3D Features for Micro-expression Recognition via Multi-task Learning and Hierarchical Attention.
Proceedings of the International Joint Conference on Neural Networks, 2025

MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024
Spectroscopic Techniques and Hydrogen-Sensitive Compounds: A New Horizon in Hydrogen Detection.
Sensors, May, 2024

Research on Irregular Flight Recovery Strategy Under Different Flight Route Types With Big Data Computing.
Int. J. Inf. Technol. Syst. Approach, 2024

3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations.
CoRR, 2024

Building Robust Video-Level Deepfake Detection via Audio-Visual Local-Global Interactions.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Temporal Semantic Scoring Path Aware Multi-embedding Sequential Recommendation.
Proceedings of the Neural Information Processing - 31st International Conference, 2024

2023
eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos.
CoRR, 2023

Emotion Recognition by Video: A review.
CoRR, 2023

2022
Discrimination and Measurement of Droplet and Ice Crystal by Combining Digital Inline Holography With Interferometric Particle Imaging With Single Color Camera.
IEEE Trans. Instrum. Meas., 2022

A Method for Medical Microscopic Images' Sharpness Evaluation Based on NSST and Variance by Combining Time and Frequency Domains.
Sensors, 2022

Research on Mask Wearing Detection of Natural Population Based on Improved YOLOv4.
CoRR, 2022

ICANet: A Method of Short Video Emotion Recognition Driven by Multimodal Data.
CoRR, 2022

