Shengeng Tang

Orcid: 0000-0001-6313-2543

According to our database, Shengeng Tang authored at least 56 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
OmniVL-Guard Pro: A Tool-Augmented Agent for Omnibus Vision-Language Forensics.
CoRR, May, 2026

CanonSLR: Canonical-View Guided Multi-View Continuous Sign Language Recognition.
CoRR, April, 2026

CFLip: Generalizing Lipreading to Unseen Speakers by Learning Common Features.
IEEE Trans. Comput. Soc. Syst., February, 2026

OmniVL-Guard: Towards Unified Vision-Language Forgery Detection and Grounding via Balanced RL.
CoRR, February, 2026

Wi-CBR: Salient-aware Adaptive WiFi Sensing for Cross-domain Behavior Recognition.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Open-World 3D Scene Graph Generation for Retrieval-Augmented Reasoning.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

LinProVSR: Linguistics-Knowledge Guided Progressive Disambiguation Network for Visual Speech Recognition.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Accelerating Controllable Generation via Hybrid-grained Cache.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection.
CoRR, December, 2025

NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results.
CoRR, October, 2025

Towards Robust and Generalizable Continuous Space-Time Video Super-Resolution with Events.
CoRR, October, 2025

Alleviating Confirmation Bias in Learning with Noisy Labels via Two-Network Collaboration.
ACM Trans. Intell. Syst. Technol., August, 2025

SplitGaussian: Reconstructing Dynamic Scenes via Visual Geometry Decomposition.
CoRR, August, 2025

Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation.
CoRR, August, 2025

Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering.
CoRR, August, 2025

SignAligner: Harmonizing Complementary Pose Modalities for Coherent Sign Language Generation.
CoRR, June, 2025

Wi-CBR: WiFi-based Cross-domain Behavior Recognition via Multimodal Collaborative Awareness.
CoRR, June, 2025

Temporal Boundary Awareness Network for Repetitive Action Counting.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

Gloss-driven Conditional Diffusion Models for Sign Language Production.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

The Tenth NTIRE 2025 Image Denoising Challenge Report.
CoRR, April, 2025

Text-Driven Diffusion Model for Sign Language Production.
CoRR, March, 2025

Leveraging vision-language prompts for real-world image restoration and enhancement.
Comput. Vis. Image Underst., 2025

Efficient Vision Language Model Fine-tuning for Text-based Person Anomaly Search.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, 2025

Mixture of Multimodal Adapters for Sentiment Analysis.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

StgcDiff: Spatial-Temporal Graph Condition Diffusion for Sign Language Transition Generation.
Proceedings of the 3rd International Workshop on Deep Multimodal Generation and Retrieval, 2025

Towards Fine-Grained Emotion Understanding via Skeleton-Based Micro-Gesture Recognition.
Proceedings of IJCAI-2025 Workshop & Challenge on Human Behavior Analysis for Emotion Understanding (MiGA 2025), 2025

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Knowledge Swapping via Learning and Unlearning.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Linguistics-Vision Monotonic Consistent Network for Sign Language Production.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SLRTP2025 Sign Language Production Challenge: Methodology, Results and Future Work.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025


Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Patch-level Sounding Object Tracking for Audio-Visual Question Answering.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Intermediary-Generated Bridge Network for RGB-D Cross-Modal Re-Identification.
ACM Trans. Intell. Syst. Technol., December, 2024

Emotional Video Captioning With Vision-Based Emotion Interpretation Network.
IEEE Trans. Image Process., 2024

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation.
CoRR, 2024

Modality Alignment Meets Federated Broadcasting.
CoRR, 2024

Dataset Distillers Are Good Label Denoisers In the Wild.
CoRR, 2024

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing.
CoRR, 2024

Micro-gesture Online Recognition using Learnable Query Points.
CoRR, 2024

A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+.
CoRR, 2024

Comprehensive Survey on Person Identification: Queries, Methods, and Datasets.
Proceedings of the 1st ICMR Workshop on Multimedia Object Re-Identification, 2024

Micro-gesture Online Recognition using Learnable Query Points.
Proceedings of IJCAI 2024 Workshop & Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2024) co-located with 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024


2023
Emotion-Prior Awareness Network for Emotional Video Captioning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Graph-Based Multimodal Sequential Embedding for Sign Language Translation.
IEEE Trans. Multim., 2022

Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2019
Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

