Shengeng Tang

Orcid: 0000-0001-6313-2543

According to our database, Shengeng Tang authored at least 44 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Alleviating Confirmation Bias in Learning with Noisy Labels via Two-Network Collaboration.
ACM Trans. Intell. Syst. Technol., August, 2025

SplitGaussian: Reconstructing Dynamic Scenes via Visual Geometry Decomposition.
CoRR, August, 2025

Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation.
CoRR, August, 2025

Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering.
CoRR, August, 2025

StgcDiff: Spatial-Temporal Graph Condition Diffusion for Sign Language Transition Generation.
CoRR, June, 2025

Towards Fine-Grained Emotion Understanding via Skeleton-Based Micro-Gesture Recognition.
CoRR, June, 2025

SignAligner: Harmonizing Complementary Pose Modalities for Coherent Sign Language Generation.
CoRR, June, 2025

Wi-CBR: WiFi-based Cross-domain Behavior Recognition via Multimodal Collaborative Awareness.
CoRR, June, 2025

Temporal Boundary Awareness Network for Repetitive Action Counting.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

Gloss-driven Conditional Diffusion Models for Sign Language Production.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

The Tenth NTIRE 2025 Image Denoising Challenge Report.
CoRR, April, 2025

Text-Driven Diffusion Model for Sign Language Production.
CoRR, March, 2025

Knowledge Swapping via Learning and Unlearning.
CoRR, February, 2025

Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning.
CoRR, February, 2025

Leveraging vision-language prompts for real-world image restoration and enhancement.
Comput. Vis. Image Underst., 2025

Efficient Vision Language Model Fine-tuning for Text-based Person Anomaly Search.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, 2025

Mixture of Multimodal Adapters for Sentiment Analysis.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Linguistics-Vision Monotonic Consistent Network for Sign Language Production.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SLRTP2025 Sign Language Production Challenge: Methodology, Results and Future Work.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025


Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Patch-level Sounding Object Tracking for Audio-Visual Question Answering.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Intermediary-Generated Bridge Network for RGB-D Cross-Modal Re-Identification.
ACM Trans. Intell. Syst. Technol., December, 2024

Emotional Video Captioning With Vision-Based Emotion Interpretation Network.
IEEE Trans. Image Process., 2024

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation.
CoRR, 2024

Modality Alignment Meets Federated Broadcasting.
CoRR, 2024

Dataset Distillers Are Good Label Denoisers In the Wild.
CoRR, 2024

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing.
CoRR, 2024

Micro-gesture Online Recognition using Learnable Query Points.
CoRR, 2024

A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+.
CoRR, 2024

Comprehensive Survey on Person Identification: Queries, Methods, and Datasets.
Proceedings of the 1st ICMR Workshop on Multimedia Object Re-Identification, 2024

Micro-gesture Online Recognition using Learnable Query Points.
Proceedings of IJCAI 2024 Workshop&Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA 2024) co-located with 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024


2023
Emotion-Prior Awareness Network for Emotional Video Captioning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Graph-Based Multimodal Sequential Embedding for Sign Language Translation.
IEEE Trans. Multim., 2022

Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2019
Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

