Jingkuan Song

Orcid: 0000-0002-2549-8322

According to our database¹, Jingkuan Song authored at least 352 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

Distribution-to-Points Matching for Image Text Retrieval.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2026

Generalizable Egocentric Task Verification via Cross-Modal Hybrid Hypergraph Matching.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2026

Causal-Inspired Fourier Representation Learning for Wearable IMUs and Egocentric Action Recognition.

IEEE Trans. Circuits Syst. Video Technol., May, 2026

A Closer Look at Conditional Prompt Tuning for Vision-Language Models.

Int. J. Comput. Vis., May, 2026

Janus-LoRA: A Balanced Low-Rank Adaptation for Continual Learning.

CoRR, May, 2026

FGIM: a Fast Graph-based Indexes Merging Framework for Approximate Nearest Neighbor Search.

CoRR, March, 2026

Language-Grounded Decoupled Action Representation for Robotic Manipulation.

CoRR, March, 2026

Structure-aware Prompt Adaptation from Seen to Unseen for Open-Vocabulary Compositional Zero-Shot Learning.

CoRR, March, 2026

TIMI: Training-Free Image-to-3D Multi-Instance Generation with Spatial Fidelity.

CoRR, March, 2026

Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols.

CoRR, March, 2026

Beyond the Majority: Long-tail Imitation Learning for Robotic Manipulation.

CoRR, February, 2026

Sim-and-Human Co-training for Data-Efficient and Generalizable Robotic Manipulation.

CoRR, January, 2026

From One-to-One to Many-to-Many: Dynamic Cross-Layer Injection for Deep Vision-Language Fusion.

CoRR, January, 2026

SeMv-3D: Toward Concurrency of Semantic and Multi-View Consistency in General Text-to-3D Generation.

IEEE Trans. Image Process., 2026

A2VAD: Attribute-augmented prompt learning for weakly supervised video anomaly detection.

Pattern Recognit., 2026

Privacy preserving person re-identification via anonymizing diffusion model.

Pattern Recognit., 2026

Debiased Orthogonal Boundary-Driven Efficient Noise Mitigation.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Learning to Curate Context: Jointly Optimizing Retrieval and Prediction for Multimodal Social Media Popularity.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

De-biased Natural Language Egocentric Task Verification via Prototypical Evidence Learning.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Hyper-Opinion Vagueness Quantification for Robust Multimodal Learning.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

MiVLA: Towards Generalizable Vision-Language-Action Model with Human-Robot Mutual Imitation Pre-training.

CoRR, December, 2025

Pseudo-Label Refinement for Robust Wheat Head Segmentation via Two-Stage Hybrid Training.

CoRR, December, 2025

Reversible Inversion for Training-Free Exemplar-guided Image Editing.

CoRR, December, 2025

Reliable Few-Shot Learning Under Dual Noises.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2025

A Survey on Efficient Vision-Language-Action Models.

CoRR, October, 2025

Approximate Nearest Neighbor Search of Large Scale Vectors on Distributed Storage.

CoRR, October, 2025

FlexAC: Towards Flexible Control of Associative Reasoning in Multimodal Large Language Models.

CoRR, October, 2025

Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation.

CoRR, October, 2025

More Than One Teacher: Adaptive Multi-Guidance Policy Optimization for Diverse Exploration.

CoRR, October, 2025

An Empirical Analysis of VLM-based OOD Detection: Mechanisms, Advantages, and Sensitivity.

CoRR, September, 2025

SINDI: an Efficient Index for Approximate Maximum Inner Product Search on Sparse Vectors.

CoRR, September, 2025

VSAG: An Optimized Search Framework for Graph-based Approximate Nearest Neighbor Search.

Proc. VLDB Endow., August, 2025

Learning Generalizable and Efficient Image Watermarking via Hierarchical Two-Stage Optimization.

CoRR, August, 2025

Dynamic Pattern Alignment Learning for Pretraining Lightweight Human-Centric Vision Models.

CoRR, August, 2025

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation.

CoRR, August, 2025

Informative Scene Graph Generation via Debiasing.

Int. J. Comput. Vis., July, 2025

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism.

CoRR, July, 2025

Reliable Few-shot Learning under Dual Noises.

CoRR, June, 2025

EnhanceGraph: A Continuously Enhanced Graph-based Index for High-dimensional Approximate Nearest Neighbor Search.

CoRR, June, 2025

InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning.

CoRR, May, 2025

Policy Contrastive Decoding for Robotic Foundation Models.

CoRR, May, 2025

Towards Generalized and Training-Free Text-Guided Semantic Manipulation.

CoRR, April, 2025

Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation.

CoRR, March, 2025

Scale-Aware Pre-Training for Human-Centric Visual Perception: Enabling Lightweight and Generalizable Models.

CoRR, March, 2025

MSFlow: Multiscale Flow-Based Framework for Unsupervised Anomaly Detection.

IEEE Trans. Neural Networks Learn. Syst., February, 2025

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization.

IEEE Trans. Multim., 2025

Mitigating Hallucinations in Large Vision-Language Models via Reasoning Uncertainty-Guided Refinement.

IEEE Trans. Multim., 2025

Text-Video Retrieval With Global-Local Semantic Consistent Learning.

IEEE Trans. Image Process., 2025

AICL: Action In-Context Learning for Text-to-Video Generation.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Cross-Modal Task Verification via Hypergraph-based Sequential Matching.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Importance Sampling Facilitates Ensemble Adversarial Transferability.

Proceedings of the Databases Theory and Applications, 2025

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

From Observation to Understanding: Front-Door Adjustments with Uncertainty Calibration for Enhancing Egocentric Reasoning in LVLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Improving Multimodal Social Media Popularity Prediction via Selective Retrieval Knowledge Augmentation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

BatchNorm-Based Weakly Supervised Video Anomaly Detection.

IEEE Trans. Circuits Syst. Video Technol., December, 2024

Ump: Unified Modality-Aware Prompt Tuning for Text-Video Retrieval.

IEEE Trans. Circuits Syst. Video Technol., November, 2024

Overcoming Data Deficiency for Multi-Person Pose Estimation.

IEEE Trans. Neural Networks Learn. Syst., August, 2024

SPT: Spatial Pyramid Transformer for Image Captioning.

IEEE Trans. Circuits Syst. Video Technol., June, 2024

Utilizing Greedy Nature for Multimodal Conditional Image Synthesis in Transformers.

IEEE Trans. Multim., 2024

Memory-Based Augmentation Network for Video Captioning.

IEEE Trans. Multim., 2024

Boosting Adversarial Training with Hardness-Guided Attack Strategy.

IEEE Trans. Multim., 2024

DMH-CL: Dynamic Model Hardness Based Curriculum Learning for Complex Pose Estimation.

IEEE Trans. Multim., 2024

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning.

IEEE Trans. Image Process., 2024

Exploring Hierarchical Information in Hyperbolic Space for Self-Supervised Image Hashing.

IEEE Trans. Image Process., 2024

CPI-Parser: Integrating Causal Properties Into Multiple Human Parsing.

IEEE Trans. Image Process., 2024

Allowing Supervision in Unsupervised Deformable-Instances Image-to-Image Translation.

IEEE Trans. Circuits Syst. Video Technol., 2024

GT23D-Bench: A Comprehensive General Text-to-3D Generation Benchmark.

CoRR, 2024

SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors.

CoRR, 2024

One-step Noisy Label Mitigation.

CoRR, 2024

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct.

CoRR, 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models.

CoRR, 2024

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model.

CoRR, 2024

Unsupervised Cross-Domain Image Retrieval with Semantic-Attended Mixture-of-Experts.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

CoIN: A Benchmark of Continual Instruction Tuning for Multimodel Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MPT: Multi-grained Prompt Tuning for Text-Video Retrieval.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SI-BiViT: Binarizing Vision Transformers with Spatial Interaction.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MagicVFX: Visual Effects Synthesis in Just Minutes.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Effective and Efficient Few-shot Fine-tuning for Vision Transformers.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Pedestrian Attributes Recognition for UAV-Human.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

BFD: Binarized Frequency-enhanced Distillation for Vision Transformer.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Training-Free Semantic Video Composition via Pre-trained Diffusion Model.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

HCMA'24: The 5th International Workshop on Human-centric Multimedia Analysis Summary.

Proceedings of the 5th International Workshop on Human-centric Multimedia Analysis, 2024

RoScenes: A Large-Scale Multi-view 3D Dataset for Roadside Perception.

Proceedings of the Computer Vision - ECCV 2024, 2024

Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection.

Proceedings of the Computer Vision - ECCV 2024, 2024

DePT: Decoupled Prompt Tuning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ProS: Prompting-to-Simulate Generalized Knowledge for Universal Cross-Domain Retrieval.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

F³-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

KE-RCNN: Unifying Knowledge-Based Reasoning Into Part-Level Attribute Parsing.

IEEE Trans. Cybern., November, 2023

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation.

IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Continuous cross-modal hashing.

Pattern Recognit., October, 2023

Less is Better: Exponential Loss for Cross-Modal Matching.

IEEE Trans. Circuits Syst. Video Technol., September, 2023

Semisupervised Network Embedding With Differentiable Deep Quantization.

IEEE Trans. Neural Networks Learn. Syst., August, 2023

On the Imaginary Wings: Text-Assisted Complex-Valued Fusion Network for Fine-Grained Visual Classification.

IEEE Trans. Neural Networks Learn. Syst., August, 2023

Complementarity-Aware Space Learning for Video-Text Retrieval.

IEEE Trans. Circuits Syst. Video Technol., August, 2023

Deep debiased contrastive hashing.

Pattern Recognit., July, 2023

Learning visual question answering on controlled semantic noisy labels.

Pattern Recognit., June, 2023

Transferable and differentiable discrete network embedding for multi-domains with hierarchical knowledge distillation.

Inf. Sci., June, 2023

Label-Guided Generative Adversarial Network for Realistic Image Synthesis.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Heterogeneous Knowledge Network for Visual Dialog.

IEEE Trans. Circuits Syst. Video Technol., February, 2023

Label-Affinity Self-Adaptive Central Similarity Hashing for Image Retrieval.

IEEE Trans. Multim., 2023

AMANet: Adaptive Multi-Path Aggregation for Learning Human 2D-3D Correspondences.

IEEE Trans. Multim., 2023

Revisiting Multi-Codebook Quantization.

IEEE Trans. Image Process., 2023

From Global to Local: Multi-Scale Out-of-Distribution Detection.

IEEE Trans. Image Process., 2023

Spherical Centralized Quantization for Fast Image Retrieval.

IEEE Trans. Image Process., 2023

End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for Text-Video Retrieval.

IEEE Trans. Image Process., 2023

Toward a Unified Transformer-Based Framework for Scene Graph Generation and Human-Object Interaction Detection.

IEEE Trans. Image Process., 2023

State-Aware Compositional Learning Toward Unbiased Training for Scene Graph Generation.

IEEE Trans. Image Process., 2023

End-to-end Image Captioning via Visual Region Aggregation and Dual-level Collaboration.

Int. J. Softw. Informatics, 2023

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control.

CoRR, 2023

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.

CoRR, 2023

Towards Redundancy-Free Sub-networks in Continual Learning.

CoRR, 2023

MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation.

CoRR, 2023

Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection.

CoRR, 2023

Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks.

CoRR, 2023

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection.

CoRR, 2023

CIParsing: Unifying Causality Properties into Multiple Human Parsing.

CoRR, 2023

Part-Aware Transformer for Generalizable Person Re-identification.

CoRR, 2023

CageViT: Convolutional Activation Guided Efficient Vision Transformer.

CoRR, 2023

Boosting Adversarial Attacks by Leveraging Decision Boundary Information.

CoRR, 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Precise Target-Oriented Attack against Deep Hashing-based Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

CHAIN: Exploring Global-Local Spatio-Temporal Information for Improved Self-Supervised Video Hashing.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

HCMA '23: 4th International Workshop on Human-Centric Multimedia Analysis.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Style-Controllable Generalized Person Re-identification.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

CUCL: Codebook for Unsupervised Continual Learning.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

A Closer Look at Few-shot Classification Again.

Proceedings of the International Conference on Machine Learning, 2023

Towards Boosting Black-Box Attack Via Sharpness-Aware.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

EANet: Towards Lightweight Human Pose Estimation With Effective Aggregation Network.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

DETA: Denoised Task Adaptation for Few-Shot Learning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Part-Aware Transformer for Generalizable Person Re-identification.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Prototype-Based Embedding Network for Scene Graph Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Relation-aware aggregation network with auxiliary guidance for text-based person search.

World Wide Web, 2022

Scenario-Aware Recurrent Transformer for Goal-Directed Video Captioning.

ACM Trans. Multim. Comput. Commun. Appl., 2022

AgeGAN++: Face Aging and Rejuvenation With Dual Conditional GANs.

IEEE Trans. Multim., 2022

Push & Pull: Transferable Adversarial Examples With Attentive Attack.

IEEE Trans. Multim., 2022

Improving Image Similarity Learning by Adding External Memory.

Xinjian Gao, Tingting Mu, John Yannis Goulermas, Jingkuan Song, Meng Wang

IEEE Trans. Knowl. Data Eng., 2022

Video Question Answering With Prior Knowledge and Object-Sensitive Learning.

IEEE Trans. Image Process., 2022

Continual Referring Expression Comprehension via Dual Modular Memorization.

IEEE Trans. Image Process., 2022

Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering.

IEEE Trans. Image Process., 2022

Relation Regularized Scene Graph Generation.

IEEE Trans. Cybern., 2022

Progressive Meta-Learning With Curriculum.

IEEE Trans. Circuits Syst. Video Technol., 2022

KTN: Knowledge Transfer Network for Learning Multiperson 2D-3D Correspondences.

IEEE Trans. Circuits Syst. Video Technol., 2022

Text-instance graph: Exploring the relational semantics for text-based visual question answering.

Pattern Recognit., 2022

MCFL: multi-label contrastive focal loss for deep imbalanced pedestrian attribute recognition.

Neural Comput. Appl., 2022

Hyperbolic Hierarchical Contrastive Hashing.

CoRR, 2022

RepParser: End-to-End Multiple Human Parsing with Representative Parts.

CoRR, 2022

KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences.

CoRR, 2022

Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation.

CoRR, 2022

FedMed-GAN: Federated Multi-Modal Unsupervised Brain Image Synthesis.

CoRR, 2022

A Lower Bound of Hash Codes' Performance.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Natural Color Fool: Towards Boosting Black-box Unrestricted Attacks.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Free-Lunch for Cross-Domain Few-Shot Learning: Style-Aware Episodic Training with Robust Contrastive Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

HCMA'22: 3rd International Workshop on Human-Centric Multimedia Analysis.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Progressive Tree-Structured Prototype Network for End-to-End Image Captioning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Prompting for Multi-Modal Tracking.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dynamic Scene Graph Generation via Temporal Prior Inference.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Skeleton-based Action Recognition via Adaptive Cross-Form Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Class Gradient Projection For Continual Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

S2 Transformer for Image Captioning.

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Deep Category-Aware Hashing for Object Retrieval in Multi-Label Image.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Learning to Generate Scene Graph from Head to Tail.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

MKE-GCN: Multi-Modal Knowledge Embedded Graph Convolutional Network for Skeleton-Based Action Recognition in the Wild.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Support-Set Based Multi-Modal Representation Enhancement for Video Captioning.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Multi-Scale Graph Attention Network for Scene Graph Generation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Context Gating with Multi-Level Ranking Learning for Visual Dialog.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Frequency Domain Model Augmentation for Adversarial Attack.

Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Open-Vocabulary Scene Graph Generation with Prompt-Based Finetuning.

Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Dual-Fused Modality-Aware Representations for RGBD Tracking.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Meta Distribution Alignment for Generalizable Person Re-Identification.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-Grained Predicates Learning for Scene Graph Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Rethinking Spatial Invariance of Convolutional Networks for Object Counting.

Alexander G. Hauptmann

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-supervised Label-Visual Correlation Hashing for Multi-label Image Retrieval.

Proceedings of the Web and Big Data - 6th International Joint Conference, 2022

2021

High-order nonlocal Hashing for unsupervised cross-modal retrieval.

World Wide Web, 2021

Rich Visual Knowledge-Based Augmentation Network for Visual Question Answering.

IEEE Trans. Neural Networks Learn. Syst., 2021

BATCH: A Scalable Asymmetric Discrete Cross-Modal Hashing.

IEEE Trans. Knowl. Data Eng., 2021

Learning Efficient Hash Codes for Fast Graph-Based Data Similarity Retrieval.

IEEE Trans. Image Process., 2021

Introduction to the Special Issue on Learning-based Support for Data Science Applications.

Ke Zhou, Jingkuan Song

Trans. Data Sci., 2021

GuessWhich? Visual dialog with attentive memory network.

Pattern Recognit., 2021

Unsupervised deep hashing with node representation for image retrieval.

Pattern Recognit., 2021

Explainable deep learning for efficient and robust pattern recognition: A survey of recent developments.

Pattern Recognit., 2021

Verification mechanism to obtain an elaborate answer span in machine reading comprehension.

Neurocomputing, 2021

Part-level attention networks for cross-domain person re-identification.

IET Image Process., 2021

Technical Report: Disentangled Action Parsing Networks for Accurate Part-level Action Parsing.

CoRR, 2021

Fast Gradient Non-sign Methods.

CoRR, 2021

Unsupervised Domain-adaptive Hash for Networks.

CoRR, 2021

Semi-supervised Network Embedding with Differentiable Deep Quantisation.

CoRR, 2021

Semantic Compositional Learning for Low-shot Scene Graph Generation.

CoRR, 2021

Staircase Sign Method for Boosting Adversarial Attacks.

CoRR, 2021

Cross-Domain Person Re-Identification Based on Feature Fusion.

IEEE Access, 2021

Extracting Useful Knowledge from Noisy Web Images via Data Purification for Fine-Grained Recognition.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Curriculum-Based Meta-learning.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency for Image-Text Matching.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A System for Interactive and Intelligent AD Auxiliary Screening.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic-aware Transfer with Instance-adaptive Parsing for Crowded Scenes Pose Estimation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Fully Functional Image Manipulation Using Scene Graphs in A Bounding-Box Free Way.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Camera-Agnostic Person Re-Identification via Adversarial Disentangling Learning.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

HUMA'21: 2nd International Workshop on Human-centric Multimedia Analysis.

[BibT_eX]

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Exploring Contextual-Aware Representation and Linguistic-Diverse Expression for Visual Dialog.

[BibT_eX]

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Towards Unsupervised Deformable-Instances Image-to-Image Translation.

[BibT_eX]

[DOI]

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Feature Space Targeted Attacks by Statistic Alignment.

[BibT_eX]

[DOI]

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

SKANet: Structured Knowledge-Aware Network for Visual Dialog.

[BibT_eX]

[DOI]

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Hierarchical Representation Network With Auxiliary Tasks For Video Captioning.

[BibT_eX]

[DOI]

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Exploiting Scene Graphs for Human-Object Interaction Detection.

[BibT_eX]

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

From General to Specific: Informative Scene Graph Generation via Balance Adjustment.

[BibT_eX]

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RSGNet: Relation based Skeleton Graph Network for Crowded Scenes Pose Estimation.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

A low cost and un-cancelled laplace noise based differential privacy algorithm for spatial decompositions.

[BibT_eX]

[DOI]

World Wide Web, 2020

Spatio-Temporal Attention Networks for Action Recognition and Detection.

[BibT_eX]

[DOI]

IEEE Trans. Multim., 2020

Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval.

[BibT_eX]

[DOI]

IEEE Trans. Cybern., 2020

Fast large scale deep face search.

[BibT_eX]

[DOI]

Pattern Recognit. Lett., 2020

Binary neural networks: A survey.

[BibT_eX]

[DOI]

Pattern Recognit., 2020

Play and rewind: Context-aware video temporal action proposals.

[BibT_eX]

[DOI]

Pattern Recognit., 2020

Hierarchical LSTMs with Adaptive Attention for Visual Captioning.

[BibT_eX]

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image.

[BibT_eX]

[DOI]

Neural Comput. Appl., 2020

Fused GRU with semantic-temporal attention for video captioning.

[BibT_eX]

[DOI]

Neurocomputing, 2020

Question-Led object attention for visual question answering.

[BibT_eX]

[DOI]

Neurocomputing, 2020

Unified Binary Generative Adversarial Network for Image Retrieval and Compression.

[BibT_eX]

[DOI]

Int. J. Comput. Vis., 2020

Patch-wise++ Perturbation for Adversarial Targeted Attacks.

[BibT_eX]

[DOI]

CoRR, 2020

3D Self-Attention for Unsupervised Video Quantization.

[BibT_eX]

[DOI]

Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

EvoGAN: an evolutionary GAN for face aging and rejuvenation.

[BibT_eX]

[DOI]

Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

KTN: Knowledge Transfer Network for Multi-person DensePose Estimation.

[BibT_eX]

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

HUMA'20: 1st International Workshop on Human-Centric Multimedia Analysis.

[BibT_eX]

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

One-shot Scene Graph Generation.

[BibT_eX]

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis.

[BibT_eX]

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Bottom-up and Top-down: Bidirectional Additive Net for Edge Detection.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Deep Self-Taught Graph Embedding Hashing With Pseudo Labels For Image Retrieval.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Patch-Wise Attack for Fooling Deep Neural Network.

[BibT_eX]

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

Forward and Backward Information Retention for Accurate Binary Neural Networks.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Salience-Guided Cascaded Suppression Network for Person Re-Identification.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Graph Attention Based Proposal 3D ConvNets for Action Detection.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

SNEQ: Semi-Supervised Attributed Network Embedding with Attention-Based Quantisation.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning.

[BibT_eX]

[DOI]

IEEE Trans. Neural Networks Learn. Syst., 2019

Learning Match Kernels on Grassmann Manifolds for Action Recognition.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2019

Deep Self-Taught Hashing for Image Retrieval.

[BibT_eX]

[DOI]

IEEE Trans. Cybern., 2019

Towards Accurate Georeferenced Video Search With Camera Field of View Modeling.

[BibT_eX]

[DOI]

IEEE Trans. Circuits Syst. Video Technol., 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Network.

[BibT_eX]

[DOI]

CoRR, 2019

Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking.

[BibT_eX]

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

BraidNet: Braiding Semantics and Details for Accurate Human Parsing.

[BibT_eX]

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Learnable Aggregating Net with Diversity Learning for Video Question Answering.

[BibT_eX]

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Adaptive Multi-Path Aggregation for Human DensePose Estimation in the Wild.

[BibT_eX]

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Localizing Unseen Activities in Video via Image Query.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Recurrent Quantization for Generating Sequential Binary Codes.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Networks.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

The Research of Chinese Ethnical Face Recognition Based on Deep Learning.

[BibT_eX]

[DOI]

Proceedings of the Web and Big Data, 2019

Model of Charging Stations Construction and Electric Vehicles Development Prediction.

[BibT_eX]

[DOI]

Qilong Zhang

Zheyong Qiu

Jingkuan Song

Proceedings of the Web and Big Data, 2019

A Framework for Image Dark Data Assessment.

[BibT_eX]

[DOI]

Proceedings of the Web and Big Data - Third International Joint Conference, 2019

Boundary Detector Encoder and Decoder with Soft Attention for Video Captioning.

[BibT_eX]

[DOI]

Tangming Chen

Qike Zhao

Jingkuan Song

Proceedings of the Web and Big Data, 2019

Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Structured Two-Stream Attention Network for Video Question Answering.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Deliberate Attention Networks for Image Captioning.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Perceptual Pyramid Adversarial Networks for Text-to-Image Synthesis.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Indexing Techniques for Multimedia Data Retrieval.

[BibT_eX]

[DOI]

Jingkuan Song

Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Local and Global Structure Preservation for Robust Unsupervised Spectral Feature Selection.

[BibT_eX]

[DOI]

IEEE Trans. Knowl. Data Eng., 2018

NAIS: Neural Attentive Item Similarity Model for Recommendation.

[BibT_eX]

[DOI]

IEEE Trans. Knowl. Data Eng., 2018

Cross-Paced Representation Learning With Partial Curricula for Sketch-Based Image Retrieval.

[BibT_eX]

[DOI]

Dan Xu

Xavier Alameda-Pineda

Jingkuan Song

Elisa Ricci

Nicu Sebe

IEEE Trans. Image Process., 2018

Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2018

Quantization-based hashing: a general framework for scalable image and video retrieval.

[BibT_eX]

[DOI]

Pattern Recognit., 2018

A Survey on Learning to Hash.

[BibT_eX]

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2018

Hidden semantic hashing for fast retrieval over large scale document collection.

[BibT_eX]

[DOI]

Multim. Tools Appl., 2018

Multiple hierarchical deep hashing for large scale image retrieval.

[BibT_eX]

[DOI]

Multim. Tools Appl., 2018

Deep appearance and motion learning for egocentric activity recognition.

[BibT_eX]

[DOI]

Neurocomputing, 2018

EFUI: An ensemble framework using uncertain inference for pornographic image recognition.

[BibT_eX]

[DOI]

Neurocomputing, 2018

Cumulative Nets for Edge Detection.

[BibT_eX]

[DOI]

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Session details: Vision-1 (Machine Learning).

[BibT_eX]

[DOI]

Jingkuan Song

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning.

[BibT_eX]

[DOI]

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Examine before You Answer: Multi-task Learning with Adaptive-attentions for Multiple-choice VQA.

[BibT_eX]

[DOI]

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval.

[BibT_eX]

[DOI]

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

3D Image-based Indoor Localization Joint With WiFi Positioning.

[BibT_eX]

[DOI]

Guoyu Lu

Jingkuan Song

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Conditional GANs for Face Aging and Rejuvenation.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Coarse-to-fine Image Co-segmentation with Intra and Inter Rank Constraints.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Complementary Binary Quantization for Joint Multiple Indexing.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Learning for Visual Question Generation.

[BibT_eX]

[DOI]

Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

MathDQN: Solving Arithmetic Word Problems via Deep Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Region Hashing for Generic Instance Search from Images.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Binary Generative Adversarial Networks for Image Retrieval.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Guest Editorial: Large-Scale Multimedia Data Retrieval, Classification, and Understanding.

[BibT_eX]

[DOI]

IEEE Trans. Multim., 2017

Bilinear Optimized Product Quantization for Scalable Visual Content Analysis.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2017

Beyond Frame-level CNN: Saliency-Aware 3-D CNN With LSTM for Video Action Recognition.

[BibT_eX]

[DOI]

IEEE Signal Process. Lett., 2017

Learning in high-dimensional multimedia data: the state of the art.

[BibT_eX]

[DOI]

Multim. Syst., 2017

Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources.

[BibT_eX]

[DOI]

Neurocomputing, 2017

Graph self-representation method for unsupervised feature selection.

[BibT_eX]

[DOI]

Neurocomputing, 2017

Real-time social media retrieval with spatial, temporal and social constraints.

[BibT_eX]

[DOI]

Neurocomputing, 2017

A novel low-rank hypergraph feature selection for multi-view classification.

[BibT_eX]

[DOI]

Neurocomputing, 2017

Supervised hashing with adaptive discrete optimization for multimedia retrieval.

[BibT_eX]

[DOI]

Neurocomputing, 2017

Binary Generative Adversarial Networks for Image Retrieval.

[BibT_eX]

[DOI]

Jingkuan Song

CoRR, 2017

From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning.

[BibT_eX]

[DOI]

CoRR, 2017

Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search.

[BibT_eX]

[DOI]

CoRR, 2017

Deep Region Hashing for Efficient Large-scale Instance Search from Images.

[BibT_eX]

[DOI]

CoRR, 2017

Classification by Retrieval: Binarizing Data and Classifiers.

[BibT_eX]

[DOI]

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Deep Discrete Hashing with Self-supervised Pairwise Labels.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Exploring Consistent Preferences: Discrete Hashing with Pair-Exemplar for Scalable Landmark Search.

[BibT_eX]

[DOI]

Proceedings of the 2017 ACM on Multimedia Conference, 2017

Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary Emphysema: The MESA COPD Study.

[BibT_eX]

[DOI]

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Generative method to discover emphysema subtypes with unsupervised learning using lung macroscopic patterns (LMPS): The MESA COPD study.

[BibT_eX]

[DOI]

Proceedings of the 14th IEEE International Symposium on Biomedical Imaging, 2017

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Synchronization-Inspired Co-Clustering and Its Application to Gene Expression Data.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Conference on Data Mining, 2017

Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Computer Vision, 2017

Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.

[BibT_eX]

[DOI]

Proceedings of the Databases Theory and Applications, 2017

Event Video Mashup: From Hundreds of Videos to Minutes of Skeleton.

[BibT_eX]

[DOI]

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Active domain adaptation with noisy labels for multimedia analysis.

[BibT_eX]

[DOI]

Gaowen Liu

Yan Yan

Ramanathan Subramanian

Jingkuan Song

Guoyu Lu

Nicu Sebe

World Wide Web, 2016

A Distance-Computation-Free Search Scheme for Binary Code Databases.

[BibT_eX]

[DOI]

IEEE Trans. Multim., 2016

Web Video Event Recognition by Semantic Analysis From Ubiquitous Documents.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2016

Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2016

A Fast Optimization Method for General Binary Code Learning.

[BibT_eX]

[DOI]

IEEE Trans. Image Process., 2016

Multi-view multi-label learning for image annotation.

[BibT_eX]

[DOI]

Multim. Tools Appl., 2016

Towards optimal VLAD for human action recognition from still images.

[BibT_eX]

[DOI]

Image Vis. Comput., 2016

Deep and fast: Deep learning hashing with semi-supervised graph construction.

[BibT_eX]

[DOI]

Image Vis. Comput., 2016

Cross-modal Retrieval with Label Completion.

[BibT_eX]

[DOI]

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Academic Coupled Dictionary Learning for Sketch-based Image Retrieval.

[BibT_eX]

[DOI]

Dan Xu

Xavier Alameda-Pineda

Jingkuan Song

Elisa Ricci

Nicu Sebe

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration.

[BibT_eX]

[DOI]

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Attention-based LSTM with Semantic Consistency for Videos Captioning.

[BibT_eX]

[DOI]

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multi-Paced Dictionary Learning for cross-domain retrieval and recognition.

[BibT_eX]

[DOI]

Dan Xu

Jingkuan Song

Xavier Alameda-Pineda

Elisa Ricci

Nicu Sebe

Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Multi-cue Information Fusion for Two-Layer Activity Recognition.

[BibT_eX]

[DOI]

Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

Graph-without-cut: An Ideal Graph Learning for Image Segmentation.

[BibT_eX]

[DOI]

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Compact Image Fingerprint Via Multiple Kernel Hashing.

[BibT_eX]

[DOI]

IEEE Trans. Multim., 2015

Optimized Cartesian K-Means.

[BibT_eX]

[DOI]

IEEE Trans. Knowl. Data Eng., 2015

Supervised feature learning via l2-norm regularized logistic regression for 3D object recognition.

[BibT_eX]

[DOI]

Neurocomputing, 2015

Deep Self-taught Hashing for Image Retrieval.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Zero-shot Image Categorization by Image Correlation Exploration.

[BibT_eX]

[DOI]

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Localize Me Anywhere, Anytime: A Multi-task Point-Retrieval Approach.

[BibT_eX]

[DOI]

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Optimal graph learning with partial tags and multiple features for image and video annotation.

[BibT_eX]

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection.

[BibT_eX]

[DOI]

Proceedings of the British Machine Vision Conference 2015, 2015

2014

Effective Hashing for Searching Large-scale Multimedia Databases.

[BibT_eX]

[DOI]

Jingkuan Song

PhD thesis, 2014

Robust Hashing With Local Models for Approximate Similarity Search.

[BibT_eX]

[DOI]

IEEE Trans. Cybern., 2014

Hashing for Similarity Search: A Survey.

[BibT_eX]

[DOI]

CoRR, 2014

Minimizing dataset bias: Discriminative multi-task sparse coding through shared subspace learning for image classification.

[BibT_eX]

[DOI]

Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013

Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis.

[BibT_eX]

[DOI]

Alexander G. Hauptmann

IEEE Trans. Multim., 2013

Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval.

[BibT_eX]

[DOI]

IEEE Trans. Multim., 2013

Inter-media hashing for large-scale retrieval from heterogeneous data sources.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Effective hashing for large-scale multimedia search.

[BibT_eX]

[DOI]

Jingkuan Song

Proceedings of the 2013 SIGMOD/PODS Ph.D. Symposium, New York, NY, USA, June 23, 2013

2011

UQMSG Experiments for TRECVID 2011.

[BibT_eX]

[DOI]

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Multiple feature hashing for real-time large scale near-duplicate video retrieval.

[BibT_eX]

[DOI]

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
