Xing Xu

Yin Zhang

IEEE Trans. Neural Networks Learn. Syst., January, 2025

Joint Objective and Subjective Fuzziness Denoising for Multimodal Sentiment Analysis.

IEEE Trans. Fuzzy Syst., January, 2025

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization.

IEEE Trans. Multim., 2025

Mitigating Hallucinations in Large Vision-Language Models via Reasoning Uncertainty-Guided Refinement.

IEEE Trans. Multim., 2025

UCPM: Uncertainty-Guided Cross-Modal Retrieval With Partially Mismatched Pairs.

IEEE Trans. Image Process., 2025

AnoOnly: Semi-supervised anomaly detection with the only loss on anomalies.

Expert Syst. Appl., 2025

Self-adaptive uncertainty modeling and relation reasoning for cross-modal egocentric human action recognition in sports.

Comput. Electr. Eng., 2025

MSCRS: Multi-modal Semantic Graph Prompt Learning Framework for Conversational Recommender Systems.

Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

Geometric Gradient Divergence Modulation for Imbalanced Multimodal Learning.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

SaP-Bot: A Multimodal Large-Language Model for End-to-End Same-Product Identification.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Multimodal Time Series Alignment for Error Detection in Human Robot Interactions.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Composed Query-Based Event Retrieval in Video Corpus with Multimodal Episodic Perceptron.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Heterogeneous Graph Embedding for Multimodal Multi-Label Emotion Recognition.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Causal Intervention with Active Learning for Large Vision-Language Models in Egocentric Contexts.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Noise Mitigation for Unsupervised Cross-Domain Image Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Cross-Modal Task Verification via Hypergraph-based Sequential Matching.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Social Optimum Assisted Gradient Modulation for Imbalanced Multimodal Learning.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Probabilistic Embeddings with Causal Constraint for Error Detection in Egocentric Procedural Videos.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Egocentric Online Action Segmentation with Behavior-Centred Feature Augmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Importance Sampling Facilitates Ensemble Adversarial Transferability.

Proceedings of the Databases Theory and Applications, 2025

From Observation to Understanding: Front-Door Adjustments with Uncertainty Calibration for Enhancing Egocentric Reasoning in LVLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

BatchNorm-Based Weakly Supervised Video Anomaly Detection.

IEEE Trans. Circuits Syst. Video Technol., December, 2024

Fuzzy Multimodal Graph Reasoning for Human-Centric Instructional Video Grounding.

IEEE Trans. Fuzzy Syst., September, 2024

CR-FPN: channel relation feature pyramid network for object detection.

Wirel. Networks, July, 2024

Representation separation adversarial networks for cross-modal retrieval.

Wirel. Networks, July, 2024

Cross-Modal Attention Preservation with Self-Contrastive Learning for Composed Query-Based Image Retrieval.

ACM Trans. Multim. Comput. Commun. Appl., June, 2024

Complex Relation Embedding for Scene Graph Generation.

IEEE Trans. Neural Networks Learn. Syst., June, 2024

SDN: Semantic Decoupling Network for Temporal Language Grounding.

IEEE Trans. Neural Networks Learn. Syst., May, 2024

Multi-Grained Attention Network With Mutual Exclusion for Composed Query-Based Image Retrieval.

IEEE Trans. Circuits Syst. Video Technol., April, 2024

Relation-Aggregated Cross-Graph Correlation Learning for Fine-Grained Image-Text Retrieval.

IEEE Trans. Neural Networks Learn. Syst., February, 2024

Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image-Text Matching.

IEEE Trans. Cybern., February, 2024

Ecarnet: enhanced clue-ambiguity reasoning network for multimodal fake news detection.

Multim. Syst., February, 2024

Runge-Kutta Guided Feature Augmentation for Few-Sample Learning.

IEEE Trans. Multim., 2024

Zero-Shot Video Moment Retrieval With Angular Reconstructive Text Embeddings.

IEEE Trans. Multim., 2024

Boosting Adversarial Training with Hardness-Guided Attack Strategy.

IEEE Trans. Multim., 2024

Semantics Disentangling for Cross-Modal Retrieval.

IEEE Trans. Image Process., 2024

Coreset Learning-Based Sparse Black-Box Adversarial Attack for Video Recognition.

IEEE Trans. Inf. Forensics Secur., 2024

Multi-Scale Temporal Difference Transformer for Video-Text Retrieval.

Ni Wang

Dongliang Liao

CoRR, 2024

UGNCL: Uncertainty-Guided Noisy Correspondence Learning for Efficient Cross-Modal Matching.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Unsupervised Cross-Domain Image Retrieval with Semantic-Attended Mixture-of-Experts.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Enhanced Experts with Uncertainty-Aware Routing for Multimodal Sentiment Analysis.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

PTAN: Principal Token-aware Adjacent Network for Compositional Temporal Grounding.

Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

Temporal Self-Paced Proposal Learning for Weakly-Supervised Video Moment Retrieval and Highlight Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Domain Prompt Learning Framework for Real Image Dehazing.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Diverse Embedding Modeling with Adaptive Noise Filter for Text-based Person Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Adaptive Uncertainty-Based Learning for Text-Based Person Retrieval.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Composition-Aware Image Steganography Through Adversarial Self-Generated Supervision.

IEEE Trans. Neural Networks Learn. Syst., November, 2023

Less is Better: Exponential Loss for Cross-Modal Matching.

IEEE Trans. Circuits Syst. Video Technol., September, 2023

Hypercomplex context guided interaction modeling for scene graph generation.

Pattern Recognit., September, 2023

Category Alignment Adversarial Learning for Cross-Modal Retrieval.

IEEE Trans. Knowl. Data Eng., May, 2023

Quaternion Relation Embedding for Scene Graph Generation.

IEEE Trans. Multim., 2023

OMGH: Online Manifold-Guided Hashing for Flexible Cross-Modal Retrieval.

IEEE Trans. Multim., 2023

Self-Supervised Fine-Grained Cycle-Separation Network (FSCN) for Visual-Audio Separation.

IEEE Trans. Multim., 2023

Quaternion Representation Learning for cross-modal matching.

Knowl. Based Syst., 2023

TFUN: Trilinear Fusion Network for Ternary Image-Text Retrieval.

Inf. Fusion, 2023

IRPSM-net: Information retention pyramid stereo matching network.

Int. J. Comput. Sci. Math., 2023

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection.

CoRR, 2023

MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models.

CoRR, 2023

AnoOnly: Semi-Supervised Anomaly Detection without Loss on Normal Data.

CoRR, 2023

Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

DCEL: Deep Cross-modal Evidential Learning for Text-Based Person Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Faster Video Moment Retrieval with Point-Level Supervision.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Taking a Part for the Whole: An Archetype-agnostic Framework for Voice-Face Association.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Joint Searching and Grounding: Multi-Granularity Video Content Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Zero-shot Sketch-based Image Retrieval with Adaptive Balanced Discriminability and Generalizability.

Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization.

Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

Region-Aware Semantic Consistency for Unsupervised Domain-Adaptive Semantic Segmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Information Selection-based Domain Adaptation from Black-box Predictors.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Label-Semantic-Enhanced Online Hashing for Efficient Cross-modal Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Progressive Event Alignment Network for Partial Relevant Video Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Special Issue on Synthetic Media on the Web.

World Wide Web, 2022

Mind the Remainder: Taylor's Theorem View on Recurrent Neural Networks.

IEEE Trans. Neural Networks Learn. Syst., 2022

Cross-Modal Dynamic Networks for Video Moment Retrieval With Text Query.

IEEE Trans. Multim., 2022

View-Invariant Human Action Recognition Via View Transformation Network (VTN).

IEEE Trans. Multim., 2022

Semantic-Aligned Attention With Refining Feature Embedding for Few-Shot Image Classification.

IEEE Trans. Intell. Transp. Syst., 2022

Cognitive Memory-Guided AutoEncoder for Effective Intrusion Detection in Internet of Things.

IEEE Trans. Ind. Informatics, 2022

Med-BERT: A Pretraining Framework for Medical Records Named Entity Recognition.

IEEE Trans. Ind. Informatics, 2022

Learning Cross-Modal Common Representations by Private-Shared Subspaces Separation.

IEEE Trans. Cybern., 2022

Flow-Edge Guided Unsupervised Video Object Segmentation.

IEEE Trans. Circuits Syst. Video Technol., 2022

Action-Centric Relation Transformer Network for Video Question Answering.

IEEE Trans. Circuits Syst. Video Technol., 2022

Modeling Two-Stream Correspondence for Visual Sound Separation.

IEEE Trans. Circuits Syst. Video Technol., 2022

Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Universal Weighting Metric Learning for Cross-Modal Retrieval.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Semantic guided knowledge graph for large-scale zero-shot learning.

J. Vis. Commun. Image Represent., 2022

Comprehensive Framework of Early and Late Fusion for Image-Sentence Retrieval.

IEEE Multim., 2022

Query-based black-box attack against medical image segmentation model.

Future Gener. Comput. Syst., 2022

Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning.

CoRR, 2022

Thunder: Thumbnail based Fast Lightweight Image Denoising Network.

CoRR, 2022

I-WKNN: Fast-speed and high-accuracy WIFI positioning for intelligent sports stadiums.

Comput. Electr. Eng., 2022

Language-enhanced object reasoning networks for video moment retrieval with text query.

Comput. Electr. Eng., 2022

Learning discriminative representations via variational self-distillation for cross-view geo-localization.

Comput. Electr. Eng., 2022

Structure-Aware Semantic-Aligned Network for Universal Cross-Domain Retrieval.

Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Multimodal Disentanglement Variational AutoEncoders for Zero-Shot Cross-Modal Retrieval.

Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Point to Rectangle Matching for Image Text Retrieval.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Rethinking Open-World Object Detection in Autonomous Driving Scenarios.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ARRA: Absolute-Relative Ranking Attack against Image Retrieval.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition.

Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Accelerated Sign Hunter: A Sign-based Black-box Attack via Branch-Prune Strategy and Stabilized Hierarchical Search.

Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Instance-Level Semantic Alignment for Zero-Shot Cross-Modal Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

GTLR: Graph-Based Transformer with Language Reconstruction for Video Paragraph Grounding.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Detach and Enhance: Learning Disentangled Cross-modal Latent Representation for Efficient Face-Voice Association and Matching.

Proceedings of the IEEE International Conference on Data Mining, 2022

Semi-supervised Video Paragraph Grounding with Contrastive Encoder.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TVT: Three-Way Vision Transformer through Multi-Modal Hypersphere Learning for Zero-Shot Sketch-Based Image Retrieval.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Radial Graph Convolutional Network for Visual Question Generation.

IEEE Trans. Neural Networks Learn. Syst., 2021

Interclass-Relativity-Adaptive Metric Learning for Cross-Modal Matching and Beyond.

IEEE Trans. Multim., 2021

Exploiting Subspace Relation in Semantic Labels for Cross-Modal Hashing.

IEEE Trans. Knowl. Data Eng., 2021

Adversarial Attack Against Urban Scene Segmentation for Autonomous Vehicles.

IEEE Trans. Ind. Informatics, 2021

Deep Fuzzy Hashing Network for Efficient Image Retrieval.

IEEE Trans. Fuzzy Syst., 2021

Few-shot prototype alignment regularization network for document image layout segementation.

Pattern Recognit., 2021

5G-Network-Enabled Smart Ambulance: Architecture, Application, and Evaluation.

IEEE Netw., 2021

Toward Effective Intrusion Detection Using Log-Cosh Conditional Variational Autoencoder.

IEEE Internet Things J., 2021

Adaptive Square Attack: Fooling Autonomous Cars With Adversarial Traffic Signs.

IEEE Internet Things J., 2021

Heterogeneous data fusion for predicting mild cognitive impairment conversion.

Inf. Fusion, 2021

A Cognitive Memory-Augmented Network for Visual Anomaly Detection.

IEEE CAA J. Autom. Sinica, 2021

I-WKNN: Fast-Speed and High-Accuracy WIFI Positioning for Intelligent Stadiums.

CoRR, 2021

Hybrid Fusion with Intra- and Cross-Modality Attention for Image-Recipe Retrieval.

Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Semantic Enhanced Cross-modal GAN for Zero-shot Learning.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Extracting Useful Knowledge from Noisy Web Images via Data Purification for Fine-Grained Recognition.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Video Representation Learning with Graph Contrastive Augmentation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Meta Self-Paced Learning for Cross-Modal Matching.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Disentangled Representation Learning and Enhancement Network for Single Image De-Raining.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Relationship-Preserving Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Vision-guided Music Source Separation via a Fine-grained Cycle-Separation Network.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

CAA: Candidate-Aware Aggregation for Temporal Action Detection.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Hierarchal Channel Attention for Fine-grained Visual Classification.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multi-scale Dynamic Network for Temporal Action Detection.

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Cross-Modal Image-Recipe Retrieval via Intra- and Inter-Modality Hybrid Fusion.

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

PoseGTAC: Graph Transformer Encoder-Decoder with Atrous Convolution for 3D Human Pose Estimation.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Feature Space Targeted Attacks by Statistic Alignment.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Graph Convolutional Hourglass Networks for Skeleton-Based Action Recognition.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Efficient Online Label Consistent Hashing for Large-Scale Cross-Modal Retrieval.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Attention-Based Relation Reasoning Network for Video-Text Retrieval.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Combine Early and Late Fusion Together: A Hybrid Fusion Framework for Image-Text Matching.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Multimodal Transformer Networks with Latent Interaction for Audio-Visual Event Localization.

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

From General to Specific: Informative Scene Graph Generation via Balance Adjustment.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Cross-Modal Attention With Semantic Consistence for Image-Text Matching.

IEEE Trans. Neural Networks Learn. Syst., 2020

Temporal Reasoning Graph for Activity Recognition.

IEEE Trans. Image Process., 2020

A Context Knowledge Map Guided Coarse-to-Fine Action Recognition.

IEEE Trans. Image Process., 2020

Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval.

IEEE Trans. Cybern., 2020

A Novel Vehicle Tracking ID Switches Algorithm for Driving Recording Sensors.

Sensors, 2020

Similarity preserving feature generating networks for zero-shot learning.

Neurocomputing, 2020

Question-Led object attention for visual question answering.

Neurocomputing, 2020

Unified Binary Generative Adversarial Network for Image Retrieval and Compression.

Int. J. Comput. Vis., 2020

Cognitive visual anomaly detection with constrained latent representations for industrial inspection robot.

Appl. Soft Comput., 2020

Correlated Features Synthesis and Alignment for Zero-shot Cross-modal Retrieval.

Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

3D Self-Attention for Unsupervised Video Quantization.

Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Scene graph generation via multi-relation classification and cross-modal attention coordinator.

Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Graph-based variational auto-encoder for generalized zero-shot learning.

Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Self-supervised adversarial learning for cross-modal retrieval.

Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Multi-level expression guided attention network for referring expression comprehension.

Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Optimization-based Adversarial Perturbations for Attacking Sequential Recognition Models.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Hearing like Seeing: Improving Voice-Face Interactions and Associations via Adversarial Deep Semantic Matching Network.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

CC-LSTM: Cross and Conditional Long-Short Time Memory for Video Captioning.

Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

Ocean: A Dual Learning Approach For Generalized Zero-Shot Sketch-Based Image Retrieval.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Fooled by Imagination: Adversarial Attack to Image Captioning Via Perturbation in Complex Domain.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Universal Weighting Metric Learning for Cross-Modal Matching.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Deep adversarial metric learning for cross-modal retrieval.

World Wide Web, 2019

Fusion by synthesizing: A multi-view deep neural network for zero-shot recognition.

Signal Process., 2019

Word-to-region attention network for visual question answering.

Multim. Tools Appl., 2019

Learning one-to-many stylised Chinese character transformation and generation by generative adversarial networks.

IET Image Process., 2019

Cooperative Cross-Stream Network for Discriminative Action Representation.

CoRR, 2019

Generative Reconstructive Hashing for Incomplete Video Analysis.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Learnable Aggregating Net with Diversity Learning for Video Question Answering.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning to create multi-stylized Chinese character fonts by generative adversarial networks.

Proceedings of the ACM Turing Celebration Conference - China, 2019

Template-Based Math Word Problem Solvers with Recursive Neural Networks.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Deliberate Attention Networks for Image Captioning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Perceptual Pyramid Adversarial Networks for Text-to-Image Synthesis.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Recognition and Detection of Two-Person Interactive Actions Using Automatically Selected Skeleton Features.

IEEE Trans. Hum. Mach. Syst., 2018

One-shot learning based pattern transition map for action early recognition.

Signal Process., 2018

Zero-shot learning via discriminative representation extraction.

Pattern Recognit. Lett., 2018

Semantic binary coding for visual recognition via joint concept-attribute modelling.

Multim. Tools Appl., 2018

FDCNet: filtering deep convolutional network for marine organism classification.

Multim. Tools Appl., 2018

Domain Invariant Subspace Learning for Cross-Modal Retrieval.

Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Cumulative Nets for Edge Detection.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval.

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Dual Learning for Visual Question Generation.

Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Domain separation network for cross-modal retrieval.

Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018

Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation.

Proceedings of the Database Systems for Advanced Applications, 2018

Deep Region Hashing for Generic Instance Search from Images.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Binary Generative Adversarial Networks for Image Retrieval.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Video Captioning With Attention-Based LSTM and Semantic Consistency.

IEEE Trans. Multim., 2017

Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval.

IEEE Trans. Image Process., 2017

Large-scale image retrieval with supervised sparse hashing.

Neurocomputing, 2017

Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources.

Neurocomputing, 2017

Exploiting score distribution for heterogenous feature fusion in image classification.

Neurocomputing, 2017

Supervised hashing with adaptive discrete optimization for multimedia retrieval.

Neurocomputing, 2017

Deep Region Hashing for Efficient Large-scale Instance Search from Images.

CoRR, 2017

Wound intensity correction and segmentation with convolutional neural networks.

Concurr. Comput. Pract. Exp., 2017

Non-Linear Matrix Completion for Social Image Tagging.

IEEE Access, 2017

Spatial Verification via Compact Words for Mobile Instance Search.

Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

A System for Spatiotemporal Anomaly Localization in Surveillance Videos.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Adversarial Cross-Modal Retrieval.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Transductive Visual-Semantic Embedding for Zero-shot Learning.

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Attribute hashing for zero-shot image retrieval.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Asymmetric sparse hashing.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Unsupervised cross-modal retrieval through adversarial learning.

Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Exploiting Concept Correlation with Attributes for Semantic Binary Representation Learning.

Proceedings of the Internet Multimedia Computing and Service, 2017

Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.

Proceedings of the Databases Theory and Applications, 2017

2016

Learning multi-task local metrics for image annotation.

Multim. Tools Appl., 2016

Underwater image enhancement method using weighted guided trigonometric filtering and artificial light correction.

J. Vis. Commun. Image Represent., 2016

Learning unified binary codes for cross-modal retrieval via latent semantic hashing.

Neurocomputing, 2016

Combining multi-representation for multimedia event detection using co-training.

Neurocomputing, 2016

Bidirectional Long-Short Term Memory for Video Description.

CoRR, 2016

Cross-modal Retrieval with Label Completion.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Attention-based LSTM with Semantic Consistency for Videos Captioning.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Bidirectional Long-Short Term Memory for Video Description.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Discriminant Cross-modal Hashing.

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Underwater image descattering and quality assessment.

Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Multi-cue Information Fusion for Two-Layer Activity Recognition.

Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

2015

Semi-supervised Coupled Dictionary Learning for Cross-modal Retrieval in Internet Images and Texts.

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Coupled dictionary learning and feature mapping for cross-modal retrieval.

Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Query expansion with pairwise learning in object retrieval challenge.

Proceedings of the 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision, 2015

2014

Tag completion with defective tag assignments via image-tag re-weighting.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

MLIA at ImageCLEF 2014 Scalable Concept Image Annotation Challenge.

Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Exploring Image Specific Structured Loss for Image Annotation with Incomplete Labelling.

Proceedings of the Computer Vision - ACCV 2014, 2014

2013

Latent topic model for image annotation by modeling topic correlation.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Image Annotation by Learning Label-Specific Distance Metrics.

Proceedings of the Image Analysis and Processing - ICIAP 2013, 2013

Correlated topic model for image annotation.
