Jitao Sang

Orcid: 0000-0002-0699-3205

Affiliations:
  • Beijing Jiaotong University, Beijing, China


According to our database, Jitao Sang authored at least 110 papers between 2012 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
GAROD: Delve into Gradient-Based Attribution Reliability for Out-of-Distribution Detection.
ACM Trans. Multim. Comput. Commun. Appl., March, 2026

HulluEdit: Single-Pass Evidence-Consistent Subspace Editing for Mitigating Hallucinations in Large Vision-Language Models.
CoRR, February, 2026

ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations.
Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, 2026

Membership Inference Attack Against Large Language Model-Based Recommendation Systems: A New Distillation-Based Paradigm.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Look before Transcription: End-to-End SlideASR with Visually-Anchored Policy Optimization.
CoRR, October, 2025

Towards Robust Recommendation: A Review and an Adversarial Robustness Evaluation Library.
IEEE Trans. Knowl. Data Eng., September, 2025

Backdoor for Debias: Mitigating Model Bias With Backdoor Attack-Based Artificial Bias.
IEEE Trans. Circuits Syst. Video Technol., August, 2025

NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models.
CoRR, June, 2025

Debiased Prompt Tuning in Vision-Language Model without Annotations.
CoRR, March, 2025

Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration.
CoRR, February, 2025

MF-CLIP: Leveraging CLIP as Surrogate Models for No-Box Adversarial Attacks.
IEEE Trans. Inf. Forensics Secur., 2025

A Disguised Wolf Is More Harmful Than a Toothless Tiger: Adaptive and Malicious Code Injection Backdoor Attack Leveraging User Behavior as Triggers.
IEEE Trans. Inf. Forensics Secur., 2025

Prescribing the right remedy: Mitigating hallucinations in large vision-language models via targeted instruction tuning.
Inf. Sci., 2025

Exploring the Privacy Protection Capabilities of Chinese Large Language Models.
IEEE Multim., 2025

A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems.
Proceedings of the ACM on Web Conference 2025, 2025

SILLM4Rec: Self-Improving with Chain of Thought Enhanced Preference Optimization for Multimodal Recommendation.
Proceedings of the 7th ACM International Conference on Multimedia in Asia, 2025

Debiased Prompt Tuning for Vision-Language Models without Annotations.
Proceedings of the International Joint Conference on Neural Networks, 2025

Anyattack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Adaptive Adversarial Logits Pairing.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

TIF: Threshold Interception and Fusion for Compact and Fine-Grained Visual Attribution.
IEEE Trans. Multim., 2024

Debiasing Vison-Language Models with Text-Only Training.
CoRR, 2024

AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models.
CoRR, 2024

KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions.
CoRR, 2024

DenoiseReID: Denoising Model for Representation Learning of Person Re-Identification.
CoRR, 2024

Semantically consistent Video-to-Audio Generation using Multimodal Language Large Model.
CoRR, 2024

Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning.
CoRR, 2024

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception.
CoRR, 2024

How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Staying in the Cat-and-Mouse Game: Towards Black-box Adversarial Example Detection.
Proceedings of the 2nd International Workshop on Deep Multimodal Generation and Retrieval, 2024

Poisoning for Debiasing: Fair Recognition via Eliminating Bias Uncovered in Data Poisoning.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Adversarial Prompt Tuning for Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Low-mid adversarial perturbation against unauthorized face recognition system.
Inf. Sci., November, 2023

Knowledge Graph-Enhanced Sampling for Conversational Recommendation System.
IEEE Trans. Knowl. Data Eng., October, 2023

Attention, Please! Adversarial Defense via Activation Rectification and Preservation.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Towards a multimodal human activity dataset for healthcare.
Multim. Syst., 2023

Debiasing backdoor attack: A benign application of backdoor attack in eliminating data bias.
Inf. Sci., 2023

Language-assisted Vision Model Debugger: A Sample-Free Approach to Finding Bugs.
CoRR, 2023

An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation.
CoRR, 2023

Evaluation and Analysis of Hallucination in Large Vision-Language Models.
CoRR, 2023

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility.
CoRR, 2023

Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks.
CoRR, 2023

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias.
CoRR, 2023

Benign Shortcut for Debiasing: Fair Visual Recognition via Intervention with Shortcut Features.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Echoes: Unsupervised Debiasing via Pseudo-bias Labeling in an Echo Chamber.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Improved Visual Fine-tuning with Natural Language Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Unlearnable Clusters: Towards Label-Agnostic Unlearnable Examples.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

ImageNet Pre-training Also Transfers Non-robustness.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Image-Based Personality Questionnaire Design.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Learning to Learn a Cold-start Sequential Recommender.
ACM Trans. Inf. Syst., 2022

Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment.
CoRR, 2022

Fair Visual Recognition via Intervention with Proxy Features.
CoRR, 2022

FairCLIP: Social Bias Elimination based on Attribute Prototype Learning and Representation Neutralization.
CoRR, 2022

JPEG Compression-Resistant Low-Mid Adversarial Perturbation against Unauthorized Face Recognition System.
CoRR, 2022

Debiasing Backdoor Attack: A Benign Application of Backdoor Attack in Eliminating Data Bias.
CoRR, 2022

Towards Adversarial Attack on Vision-Language Pre-training Models.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Benign Adversarial Attack: Tricking Models for Goodness.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Non-generative Generalized Zero-shot Learning via Task-correlated Disentanglement and Controllable Samples Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Knowledge-driven Egocentric Multimodal Activity Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Robust CAPTCHAs Towards Malicious OCR.
IEEE Trans. Multim., 2021

Metadata Connector: Exploiting Hashtag and Tag for Cross-OSN Event Search.
IEEE Trans. Multim., 2021

Knowledge Graph-enhanced Sampling for Conversational Recommender System.
CoRR, 2021

Benign Adversarial Attack: Tricking Algorithm for Goodness.
CoRR, 2021

Pre-training also Transfers Non-Robustness.
CoRR, 2021

Trustworthy Multimedia Analysis.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
Meta-path Augmented Sequential Recommendation with Contextual Co-attention Network.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Locality-constrained discrete graph hashing.
Neurocomputing, 2020

Beyond Literal Visual Modeling: Understanding Image Metaphor Based on Literal-Implied Concept Mapping.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Adversarial Privacy-preserving Filter.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards Accuracy-Fairness Paradox: Adversarial Example-based Data Augmentation for Visual Debiasing.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
A Generalization Theory based on Independent and Task-Identically Distributed Assumption.
CoRR, 2019

Blessing in Disguise: Designing Robust Turing Test by Employing Algorithm Unrobustness.
CoRR, 2019

Multimodal Attribute and Feature Embedding for Activity Recognition.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Multi-source User Attribute Inference based on Hierarchical Auto-encoder.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Comprehensive Event Storyline Generation from Microblogs.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Explainable Interaction-driven User Modeling over Knowledge Graph for Sequential Recommendation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

2018
Understanding Dynamic Cross-OSN Associations for Cold-Start Recommendation.
IEEE Trans. Multim., 2018

Bundled Local Features for Image Representation.
IEEE Trans. Circuits Syst. Video Technol., 2018

Social Relationship Labeling Based on Multimodal Behaviors and Social Interactions.
IEEE Multim., 2018

Attention, Please! Adversarial Defense via Attention Rectification and Preservation.
CoRR, 2018

CSAN: Contextual Self-Attention Network for User Sequential Recommendation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

2017
Who Are Your "Real" Friends: Analyzing and Distinguishing Between Offline and Online Friendships From Social Multimedia Data.
IEEE Trans. Multim., 2017

A Demo for Image-Based Personality Test.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Towards SMP Challenge: Stacking of Diverse Models for Social Image Popularity Prediction.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Hashtag-centric Immersive Search on Social Media.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

2016
A Unified Video Recommendation by Cross-Network User Modeling.
ACM Trans. Multim. Comput. Commun. Appl., 2016

Folksonomy-Based Visual Ontology Construction and Its Applications.
IEEE Trans. Multim., 2016

基于关联规则挖掘的跨网络知识关联及协同应用 (Association Rules Mining Based Cross-network Knowledge Association and Collaborative Applications).
计算机科学 (Computer Science), 2016

2015
Learning Feature Hierarchies: A Layer-Wise Tag-Embedded Approach.
IEEE Trans. Multim., 2015

YouTube Video Promotion by Cross-Network Association: @Britney to Advertise Gangnam Style.
IEEE Trans. Multim., 2015

Word-of-Mouth Understanding: Entity-Centric Multimodal Aspect-Opinion Mining in Social Media.
IEEE Trans. Multim., 2015

Relational User Attribute Inference in Social Media.
IEEE Trans. Multim., 2015

Activity Sensor: Check-In Usage Mining for Local Recommendation.
ACM Trans. Intell. Syst. Technol., 2015

Unified YouTube Video Recommendation via Cross-network Collaboration.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

2014
Twitter is Faster: Personalized Time-Aware Video Recommendation from Twitter to YouTube.
ACM Trans. Multim. Comput. Commun. Appl., 2014

A Unified Framework of Latent Feature Learning in Social Media.
IEEE Trans. Multim., 2014

Mining Cross-network Association for YouTube Video Promotion.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

2013
Interaction Design for Mobile Visual Search.
IEEE Trans. Multim., 2013

Latent feature learning in social media network.
Proceedings of the ACM Multimedia Conference, 2013

Tag-aware image classification via Nested Deep Belief nets.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Friend transfer: Cold-start friend recommendation with cross-platform transfer learning of social knowledge.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

User-Oriented Social Analysis across Social Media Sites.
Proceedings of the New Trends in Image Analysis and Processing - ICIAP 2013, 2013

2012
Probabilistic sequential POIs recommendation via check-in data.
Proceedings of the SIGSPATIAL 2012 International Conference on Advances in Geographic Information Systems (formerly known as GIS), 2012

