Jingkuan Song

ORCID: 0000-0002-2549-8322

According to our database, Jingkuan Song authored at least 274 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Utilizing Greedy Nature for Multimodal Conditional Image Synthesis in Transformers.
IEEE Trans. Multim., 2024

Memory-Based Augmentation Network for Video Captioning.
IEEE Trans. Multim., 2024

DMH-CL: Dynamic Model Hardness Based Curriculum Learning for Complex Pose Estimation.
IEEE Trans. Multim., 2024

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning.
IEEE Trans. Image Process., 2024

Exploring Hierarchical Information in Hyperbolic Space for Self-Supervised Image Hashing.
IEEE Trans. Image Process., 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models.
CoRR, 2024

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model.
CoRR, 2024

Training-Free Semantic Video Composition via Pre-trained Diffusion Model.
CoRR, 2024

F³-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
KE-RCNN: Unifying Knowledge-Based Reasoning Into Part-Level Attribute Parsing.
IEEE Trans. Cybern., November, 2023

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Continuous cross-modal hashing.
Pattern Recognit., October, 2023

Less is Better: Exponential Loss for Cross-Modal Matching.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

Semisupervised Network Embedding With Differentiable Deep Quantization.
IEEE Trans. Neural Networks Learn. Syst., August, 2023

On the Imaginary Wings: Text-Assisted Complex-Valued Fusion Network for Fine-Grained Visual Classification.
IEEE Trans. Neural Networks Learn. Syst., August, 2023

Complementarity-Aware Space Learning for Video-Text Retrieval.
IEEE Trans. Circuits Syst. Video Technol., August, 2023

Deep debiased contrastive hashing.
Pattern Recognit., July, 2023

Learning visual question answering on controlled semantic noisy labels.
Pattern Recognit., June, 2023

Transferable and differentiable discrete network embedding for multi-domains with hierarchical knowledge distillation.
Inf. Sci., June, 2023

Label-Guided Generative Adversarial Network for Realistic Image Synthesis.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Heterogeneous Knowledge Network for Visual Dialog.
IEEE Trans. Circuits Syst. Video Technol., February, 2023

Label-Affinity Self-Adaptive Central Similarity Hashing for Image Retrieval.
IEEE Trans. Multim., 2023

AMANet: Adaptive Multi-Path Aggregation for Learning Human 2D-3D Correspondences.
IEEE Trans. Multim., 2023

Revisiting Multi-Codebook Quantization.
IEEE Trans. Image Process., 2023

From Global to Local: Multi-Scale Out-of-Distribution Detection.
IEEE Trans. Image Process., 2023

Spherical Centralized Quantization for Fast Image Retrieval.
IEEE Trans. Image Process., 2023

End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for Text-Video Retrieval.
IEEE Trans. Image Process., 2023

Toward a Unified Transformer-Based Framework for Scene Graph Generation and Human-Object Interaction Detection.
IEEE Trans. Image Process., 2023

State-Aware Compositional Learning Toward Unbiased Training for Scene Graph Generation.
IEEE Trans. Image Process., 2023

End-to-end Image Captioning via Visual Region Aggregation and Dual-level Collaboration.
Int. J. Softw. Informatics, 2023

Context-based Transfer and Efficient Iterative Learning for Unbiased Scene Graph Generation.
CoRR, 2023

ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval.
CoRR, 2023

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control.
CoRR, 2023

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.
CoRR, 2023

Towards Redundancy-Free Sub-networks in Continual Learning.
CoRR, 2023

MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation.
CoRR, 2023

BatchNorm-based Weakly Supervised Video Anomaly Detection.
CoRR, 2023

Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection.
CoRR, 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments.
CoRR, 2023

Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks.
CoRR, 2023

DePT: Decoupled Prompt Tuning.
CoRR, 2023

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection.
CoRR, 2023

CIParsing: Unifying Causality Properties into Multiple Human Parsing.
CoRR, 2023

Informative Scene Graph Generation via Debiasing.
CoRR, 2023

Part-Aware Transformer for Generalizable Person Re-identification.
CoRR, 2023

CageViT: Convolutional Activation Guided Efficient Vision Transformer.
CoRR, 2023

Boosting Adversarial Attacks by Leveraging Decision Boundary Information.
CoRR, 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Precise Target-Oriented Attack against Deep Hashing-based Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

HCMA '23: 4th International Workshop on Human-Centric Multimedia Analysis.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Style-Controllable Generalized Person Re-identification.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CUCL: Codebook for Unsupervised Continual Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

A Closer Look at Few-shot Classification Again.
Proceedings of the International Conference on Machine Learning, 2023

Towards Boosting Black-Box Attack Via Sharpness-Aware.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

EANet: Towards Lightweight Human Pose Estimation With Effective Aggregation Network.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

DETA: Denoised Task Adaptation for Few-Shot Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Part-Aware Transformer for Generalizable Person Re-identification.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Prototype-Based Embedding Network for Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Relation-aware aggregation network with auxiliary guidance for text-based person search.
World Wide Web, 2022

Scenario-Aware Recurrent Transformer for Goal-Directed Video Captioning.
ACM Trans. Multim. Comput. Commun. Appl., 2022

AgeGAN++: Face Aging and Rejuvenation With Dual Conditional GANs.
IEEE Trans. Multim., 2022

Push & Pull: Transferable Adversarial Examples With Attentive Attack.
IEEE Trans. Multim., 2022

Improving Image Similarity Learning by Adding External Memory.
IEEE Trans. Knowl. Data Eng., 2022

Video Question Answering With Prior Knowledge and Object-Sensitive Learning.
IEEE Trans. Image Process., 2022

Continual Referring Expression Comprehension via Dual Modular Memorization.
IEEE Trans. Image Process., 2022

Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering.
IEEE Trans. Image Process., 2022

Relation Regularized Scene Graph Generation.
IEEE Trans. Cybern., 2022

Progressive Meta-Learning With Curriculum.
IEEE Trans. Circuits Syst. Video Technol., 2022

KTN: Knowledge Transfer Network for Learning Multiperson 2D-3D Correspondences.
IEEE Trans. Circuits Syst. Video Technol., 2022

Text-instance graph: Exploring the relational semantics for text-based visual question answering.
Pattern Recognit., 2022

MCFL: multi-label contrastive focal loss for deep imbalanced pedestrian attribute recognition.
Neural Comput. Appl., 2022

Hyperbolic Hierarchical Contrastive Hashing.
CoRR, 2022

RepParser: End-to-End Multiple Human Parsing with Representative Parts.
CoRR, 2022

KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences.
CoRR, 2022

Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation.
CoRR, 2022

FedMed-GAN: Federated Multi-Modal Unsupervised Brain Image Synthesis.
CoRR, 2022

A Lower Bound of Hash Codes' Performance.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Natural Color Fool: Towards Boosting Black-box Unrestricted Attacks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Free-Lunch for Cross-Domain Few-Shot Learning: Style-Aware Episodic Training with Robust Contrastive Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

HCMA'22: 3rd International Workshop on Human-Centric Multimedia Analysis.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Progressive Tree-Structured Prototype Network for End-to-End Image Captioning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Prompting for Multi-Modal Tracking.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dynamic Scene Graph Generation via Temporal Prior Inference.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Skeleton-based Action Recognition via Adaptive Cross-Form Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Class Gradient Projection For Continual Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

S2 Transformer for Image Captioning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Deep Category-Aware Hashing for Object Retrieval in Multi-Label Image.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Learning to Generate Scene Graph from Head to Tail.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

MKE-GCN: Multi-Modal Knowledge Embedded Graph Convolutional Network for Skeleton-Based Action Recognition in the Wild.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Support-Set Based Multi-Modal Representation Enhancement for Video Captioning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Multi-Scale Graph Attention Network for Scene Graph Generation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Context Gating with Multi-Level Ranking Learning for Visual Dialog.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Frequency Domain Model Augmentation for Adversarial Attack.
Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Open-Vocabulary Scene Graph Generation with Prompt-Based Finetuning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Dual-Fused Modality-Aware Representations for RGBD Tracking.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Meta Distribution Alignment for Generalizable Person Re-Identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-Grained Predicates Learning for Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Rethinking Spatial Invariance of Convolutional Networks for Object Counting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-supervised Label-Visual Correlation Hashing for Multi-label Image Retrieval.
Proceedings of the Web and Big Data - 6th International Joint Conference, 2022

2021
High-order nonlocal Hashing for unsupervised cross-modal retrieval.
World Wide Web, 2021

Rich Visual Knowledge-Based Augmentation Network for Visual Question Answering.
IEEE Trans. Neural Networks Learn. Syst., 2021

BATCH: A Scalable Asymmetric Discrete Cross-Modal Hashing.
IEEE Trans. Knowl. Data Eng., 2021

Learning Efficient Hash Codes for Fast Graph-Based Data Similarity Retrieval.
IEEE Trans. Image Process., 2021

Introduction to the Special Issue on Learning-based Support for Data Science Applications.
Trans. Data Sci., 2021

GuessWhich? Visual dialog with attentive memory network.
Pattern Recognit., 2021

Unsupervised deep hashing with node representation for image retrieval.
Pattern Recognit., 2021

Explainable deep learning for efficient and robust pattern recognition: A survey of recent developments.
Pattern Recognit., 2021

Verification mechanism to obtain an elaborate answer span in machine reading comprehension.
Neurocomputing, 2021

Part-level attention networks for cross-domain person re-identification.
IET Image Process., 2021

Technical Report: Disentangled Action Parsing Networks for Accurate Part-level Action Parsing.
CoRR, 2021

Fast Gradient Non-sign Methods.
CoRR, 2021

Unsupervised Domain-adaptive Hash for Networks.
CoRR, 2021

Semi-supervised Network Embedding with Differentiable Deep Quantisation.
CoRR, 2021

Semantic Compositional Learning for Low-shot Scene Graph Generation.
CoRR, 2021

Staircase Sign Method for Boosting Adversarial Attacks.
CoRR, 2021

Cross-Domain Person Re-Identification Based on Feature Fusion.
IEEE Access, 2021

Extracting Useful Knowledge from Noisy Web Images via Data Purification for Fine-Grained Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Curriculum-Based Meta-learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency for Image-Text Matching.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A System for Interactive and Intelligent AD Auxiliary Screening.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic-aware Transfer with Instance-adaptive Parsing for Crowded Scenes Pose Estimation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Fully Functional Image Manipulation Using Scene Graphs in A Bounding-Box Free Way.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Camera-Agnostic Person Re-Identification via Adversarial Disentangling Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

HUMA'21: 2nd International Workshop on Human-centric Multimedia Analysis.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Exploring Contextual-Aware Representation and Linguistic-Diverse Expression for Visual Dialog.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Towards Unsupervised Deformable-Instances Image-to-Image Translation.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Feature Space Targeted Attacks by Statistic Alignment.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

SKANet: Structured Knowledge-Aware Network for Visual Dialog.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Hierarchical Representation Network With Auxiliary Tasks For Video Captioning.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Exploiting Scene Graphs for Human-Object Interaction Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

From General to Specific: Informative Scene Graph Generation via Balance Adjustment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RSGNet: Relation based Skeleton Graph Network for Crowded Scenes Pose Estimation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A low cost and un-cancelled laplace noise based differential privacy algorithm for spatial decompositions.
World Wide Web, 2020

Spatio-Temporal Attention Networks for Action Recognition and Detection.
IEEE Trans. Multim., 2020

Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval.
IEEE Trans. Cybern., 2020

Fast large scale deep face search.
Pattern Recognit. Lett., 2020

Binary neural networks: A survey.
Pattern Recognit., 2020

Play and rewind: Context-aware video temporal action proposals.
Pattern Recognit., 2020

Hierarchical LSTMs with Adaptive Attention for Visual Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image.
Neural Comput. Appl., 2020

Fused GRU with semantic-temporal attention for video captioning.
Neurocomputing, 2020

Question-Led object attention for visual question answering.
Neurocomputing, 2020

Unified Binary Generative Adversarial Network for Image Retrieval and Compression.
Int. J. Comput. Vis., 2020

Patch-wise++ Perturbation for Adversarial Targeted Attacks.
CoRR, 2020

3D Self-Attention for Unsupervised Video Quantization.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

EvoGAN: an evolutionary GAN for face aging and rejuvenation.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

KTN: Knowledge Transfer Network for Multi-person DensePose Estimation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

HUMA'20: 1st International Workshop on Human-Centric Multimedia Analysis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

One-shot Scene Graph Generation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Bottom-up and Top-down: Bidirectional Additive Net for Edge Detection.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Deep Self-Taught Graph Embedding Hashing With Pseudo Labels For Image Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Patch-Wise Attack for Fooling Deep Neural Network.
Proceedings of the Computer Vision - ECCV 2020, 2020

Forward and Backward Information Retention for Accurate Binary Neural Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Salience-Guided Cascaded Suppression Network for Person Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Graph Attention Based Proposal 3D ConvNets for Action Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

SNEQ: Semi-Supervised Attributed Network Embedding with Attention-Based Quantisation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning.
IEEE Trans. Neural Networks Learn. Syst., 2019

Learning Match Kernels on Grassmann Manifolds for Action Recognition.
IEEE Trans. Image Process., 2019

Deep Self-Taught Hashing for Image Retrieval.
IEEE Trans. Cybern., 2019

Towards Accurate Georeferenced Video Search With Camera Field of View Modeling.
IEEE Trans. Circuits Syst. Video Technol., 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Network.
CoRR, 2019

Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

BraidNet: Braiding Semantics and Details for Accurate Human Parsing.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Learnable Aggregating Net with Diversity Learning for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Adaptive Multi-Path Aggregation for Human DensePose Estimation in the Wild.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Localizing Unseen Activities in Video via Image Query.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Recurrent Quantization for Generating Sequential Binary Codes.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

The Research of Chinese Ethnical Face Recognition Based on Deep Learning.
Proceedings of the Web and Big Data, 2019

Model of Charging Stations Construction and Electric Vehicles Development Prediction.
Proceedings of the Web and Big Data, 2019

A Framework for Image Dark Data Assessment.
Proceedings of the Web and Big Data - Third International Joint Conference, 2019

Boundary Detector Encoder and Decoder with Soft Attention for Video Captioning.
Proceedings of the Web and Big Data, 2019

Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Structured Two-Stream Attention Network for Video Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Deliberate Attention Networks for Image Captioning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Perceptual Pyramid Adversarial Networks for Text-to-Image Synthesis.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Indexing Techniques for Multimedia Data Retrieval.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Local and Global Structure Preservation for Robust Unsupervised Spectral Feature Selection.
IEEE Trans. Knowl. Data Eng., 2018

NAIS: Neural Attentive Item Similarity Model for Recommendation.
IEEE Trans. Knowl. Data Eng., 2018

Cross-Paced Representation Learning With Partial Curricula for Sketch-Based Image Retrieval.
IEEE Trans. Image Process., 2018

Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.
IEEE Trans. Image Process., 2018

Quantization-based hashing: a general framework for scalable image and video retrieval.
Pattern Recognit., 2018

A Survey on Learning to Hash.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Hidden semantic hashing for fast retrieval over large scale document collection.
Multim. Tools Appl., 2018

Multiple hierarchical deep hashing for large scale image retrieval.
Multim. Tools Appl., 2018

Deep appearance and motion learning for egocentric activity recognition.
Neurocomputing, 2018

EFUI: An ensemble framework using uncertain inference for pornographic image recognition.
Neurocomputing, 2018

Cumulative Nets for Edge Detection.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Session details: Vision-1 (Machine Learning).
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Examine before You Answer: Multi-task Learning with Adaptive-attentions for Multiple-choice VQA.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

3D Image-based Indoor Localization Joint With WiFi Positioning.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Conditional GANs for Face Aging and Rejuvenation.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Coarse-to-fine Image Co-segmentation with Intra and Inter Rank Constraints.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Complementary Binary Quantization for Joint Multiple Indexing.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Learning for Visual Question Generation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

MathDQN: Solving Arithmetic Word Problems via Deep Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Region Hashing for Generic Instance Search from Images.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Binary Generative Adversarial Networks for Image Retrieval.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Guest Editorial: Large-Scale Multimedia Data Retrieval, Classification, and Understanding.
IEEE Trans. Multim., 2017

Bilinear Optimized Product Quantization for Scalable Visual Content Analysis.
IEEE Trans. Image Process., 2017

Beyond Frame-level CNN: Saliency-Aware 3-D CNN With LSTM for Video Action Recognition.
IEEE Signal Process. Lett., 2017

Learning in high-dimensional multimedia data: the state of the art.
Multim. Syst., 2017

Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources.
Neurocomputing, 2017

Graph self-representation method for unsupervised feature selection.
Neurocomputing, 2017

Real-time social media retrieval with spatial, temporal and social constraints.
Neurocomputing, 2017

A novel low-rank hypergraph feature selection for multi-view classification.
Neurocomputing, 2017

Supervised hashing with adaptive discrete optimization for multimedia retrieval.
Neurocomputing, 2017

Binary Generative Adversarial Networks for Image Retrieval.
CoRR, 2017

From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning.
CoRR, 2017

Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search.
CoRR, 2017

Deep Region Hashing for Efficient Large-scale Instance Search from Images.
CoRR, 2017

Classification by Retrieval: Binarizing Data and Classifiers.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Deep Discrete Hashing with Self-supervised Pairwise Labels.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Exploring Consistent Preferences: Discrete Hashing with Pair-Exemplar for Scalable Landmark Search.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary Emphysema: The MESA COPD Study.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Generative method to discover emphysema subtypes with unsupervised learning using lung macroscopic patterns (LMPS): The MESA COPD study.
Proceedings of the 14th IEEE International Symposium on Biomedical Imaging, 2017

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Synchronization-Inspired Co-Clustering and Its Application to Gene Expression Data.
Proceedings of the 2017 IEEE International Conference on Data Mining, 2017

Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.
Proceedings of the Databases Theory and Applications, 2017

Event Video Mashup: From Hundreds of Videos to Minutes of Skeleton.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Active domain adaptation with noisy labels for multimedia analysis.
World Wide Web, 2016

A Distance-Computation-Free Search Scheme for Binary Code Databases.
IEEE Trans. Multim., 2016

Web Video Event Recognition by Semantic Analysis From Ubiquitous Documents.
IEEE Trans. Image Process., 2016

Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.
IEEE Trans. Image Process., 2016

A Fast Optimization Method for General Binary Code Learning.
IEEE Trans. Image Process., 2016

Multi-view multi-label learning for image annotation.
Multim. Tools Appl., 2016

Towards optimal VLAD for human action recognition from still images.
Image Vis. Comput., 2016

Deep and fast: Deep learning hashing with semi-supervised graph construction.
Image Vis. Comput., 2016

Cross-modal Retrieval with Label Completion.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Academic Coupled Dictionary Learning for Sketch-based Image Retrieval.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Attention-based LSTM with Semantic Consistency for Videos Captioning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multi-Paced Dictionary Learning for cross-domain retrieval and recognition.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Multi-cue Information Fusion for Two-Layer Activity Recognition.
Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

Graph-without-cut: An Ideal Graph Learning for Image Segmentation.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Compact Image Fingerprint Via Multiple Kernel Hashing.
IEEE Trans. Multim., 2015

Optimized Cartesian K-Means.
IEEE Trans. Knowl. Data Eng., 2015

Supervised feature learning via l₂-norm regularized logistic regression for 3D object recognition.
Neurocomputing, 2015

Deep Self-taught Hashing for Image Retrieval.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Zero-shot Image Categorization by Image Correlation Exploration.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Localize Me Anywhere, Anytime: A Multi-task Point-Retrieval Approach.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Optimal graph learning with partial tags and multiple features for image and video annotation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection.
Proceedings of the British Machine Vision Conference 2015, 2015

2014
Effective Hashing for Searching Large-scale Multimedia Databases.
PhD thesis, 2014

Robust Hashing With Local Models for Approximate Similarity Search.
IEEE Trans. Cybern., 2014

Hashing for Similarity Search: A Survey.
CoRR, 2014

Minimizing dataset bias: Discriminative multi-task sparse coding through shared subspace learning for image classification.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013
Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis.
IEEE Trans. Multim., 2013

Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval.
IEEE Trans. Multim., 2013

Inter-media hashing for large-scale retrieval from heterogeneous data sources.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Effective hashing for large-scale multimedia search.
Proceedings of the 2013 SIGMOD/PODS Ph.D. Symposium, New York, NY, USA, June 23, 2013, 2013

2011
UQMSG Experiments for TRECVID 2011.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Multiple feature hashing for real-time large scale near-duplicate video retrieval.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

