Changsheng Xu

Orcid: 0000-0001-8343-9665

According to our database, Changsheng Xu authored at least 626 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow 2014, "For contributions to multimedia content analysis".

Bibliography

2024
MGDCF: Distance Learning via Markov Graph Diffusion for Neural Collaborative Filtering.
IEEE Trans. Knowl. Data Eng., July, 2024

Exploring the Temporal Consistency of Arbitrary Style Transfer: A Channelwise Perspective.
IEEE Trans. Neural Networks Learn. Syst., June, 2024

Multimodal Imbalance-Aware Gradient Modulation for Weakly-Supervised Audio-Visual Video Parsing.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Multi-object Tracking with Spatial-Temporal Tracklet Association.
ACM Trans. Multim. Comput. Commun. Appl., May, 2024

NExT-OOD: Overcoming Dual Multiple-Choice VQA Biases.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Feature Disentanglement Network: Multi-Object Tracking Needs More Differentiated Features.
ACM Trans. Multim. Comput. Commun. Appl., March, 2024

Incremental Audio-Visual Fusion for Person Recognition in Earthquake Scene.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

Adaptive Adversarial Logits Pairing.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

TIF: Threshold Interception and Fusion for Compact and Fine-Grained Visual Attribution.
IEEE Trans. Multim., 2024

Semantic Distance Adversarial Learning for Text-to-Image Synthesis.
IEEE Trans. Multim., 2024

CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding.
IEEE Trans. Multim., 2024

Recovering Generalization via Pre-Training-Like Knowledge Distillation for Out-of-Distribution Visual Question Answering.
IEEE Trans. Multim., 2024

Snippet-to-Prototype Contrastive Consensus Network for Weakly Supervised Temporal Action Localization.
IEEE Trans. Multim., 2024

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification.
IEEE Trans. Multim., 2024

Learning Multi-Expert Distribution Calibration for Long-Tailed Video Classification.
IEEE Trans. Multim., 2024

Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection.
IEEE Trans. Image Process., 2024

Few-shot Incremental Learning with Textual Knowledge Embedding by Visual-language Model.
Int. J. Softw. Informatics, 2024

Cluster-Aware Similarity Diffusion for Instance Retrieval.
CoRR, 2024

SEP: Self-Enhanced Prompt Tuning for Visual-Language Model.
CoRR, 2024

Libra: Building Decoupled Vision System on Large Language Models.
CoRR, 2024

HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
CoRR, 2024

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion.
CoRR, 2024

Dance-to-Music Generation with Encoder-based Textual Inversion of Diffusion Models.
CoRR, 2024

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion.
CoRR, 2024

Hierarchical Prompts for Rehearsal-free Continual Learning.
CoRR, 2024

Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition.
CoRR, 2024

T^3RD: Test-Time Training for Rumor Detection on Social Media.
Proceedings of the ACM on Web Conference 2024, 2024

LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Part-Aware Prompt Tuning for Weakly Supervised Referring Expression Grounding.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

Three Heads Are Better than One: Complementary Experts for Long-Tailed Semi-supervised Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Music Style Transfer with Time-Varying Inversion of Diffusion Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models.
ACM Trans. Graph., December, 2023

Vectorized Evidential Learning for Weakly-Supervised Temporal Action Localization.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Uncertainty-Aware Dual-Evidential Learning for Weakly-Supervised Temporal Action Localization.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning.
ACM Trans. Graph., October, 2023

Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Contrastive Multi-Modal Knowledge Graph Representation Learning.
IEEE Trans. Knowl. Data Eng., September, 2023

Debiased Video-Text Retrieval via Soft Positive Sample Calibration.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

Dual Instance-Consistent Network for Cross-Domain Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Emotion-Aware Music Driven Movie Montage.
J. Comput. Sci. Technol., June, 2023

CrossRectify: Leveraging disagreement for semi-supervised object detection.
Pattern Recognit., May, 2023

Integrating Multi-Label Contrastive Learning With Dual Adversarial Graph Neural Networks for Cross-Modal Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Weakly-Supervised Video Object Grounding via Causal Intervention.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Adaptive Text Denoising Network for Image Caption Editing.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Knowledge-Enhanced Attributed Multi-Task Learning for Medicine Recommendation.
ACM Trans. Inf. Syst., January, 2023

Balance-Aware Grid Collage for Small Image Collections.
IEEE Trans. Vis. Comput. Graph., 2023

Dual Scene Graph Convolutional Network for Motivation Prediction.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Counterfactual Scenario-relevant Knowledge-enriched Multi-modal Emotion Reasoning.
ACM Trans. Multim. Comput. Commun. Appl., 2023

User Cold-Start Recommendation via Inductive Heterogeneous Graph Neural Network.
ACM Trans. Inf. Syst., 2023

Dual Structural Knowledge Interaction for Domain Adaptation.
IEEE Trans. Multim., 2023

Robust Video-Text Retrieval Via Noisy Pair Calibration.
IEEE Trans. Multim., 2023

Many Hands Make Light Work: Transferring Knowledge From Auxiliary Tasks for Video-Text Retrieval.
IEEE Trans. Multim., 2023

Weakly-Supervised Video Object Grounding via Learning Uni-Modal Associations.
IEEE Trans. Multim., 2023

User-Guided Personalized Image Aesthetic Assessment Based on Deep Reinforcement Learning.
IEEE Trans. Multim., 2023

SMNet: Synchronous Multi-Scale Low Light Enhancement Network With Local and Global Concern.
IEEE Trans. Multim., 2023

Zero-Shot Predicate Prediction for Scene Graph Parsing.
IEEE Trans. Multim., 2023

Learning Scene-Aware Spatio-Temporal GNNs for Few-Shot Early Action Prediction.
IEEE Trans. Multim., 2023

Spatial-Temporal Exclusive Capsule Network for Open Set Action Recognition.
IEEE Trans. Multim., 2023

Learning Dual-Routing Capsule Graph Neural Network for Few-Shot Video Classification.
IEEE Trans. Multim., 2023

Heterogeneous Graph Contrastive Learning Network for Personalized Micro-Video Recommendation.
IEEE Trans. Multim., 2023

Reducing Vision-Answer Biases for Multiple-Choice VQA.
IEEE Trans. Image Process., 2023

Category Knowledge-Guided Parameter Calibration for Few-Shot Object Detection.
IEEE Trans. Image Process., 2023

SPA^2Net: Structure-Preserved Attention Activated Network for Weakly Supervised Object Localization.
IEEE Trans. Image Process., 2023

Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking.
CoRR, 2023

MotionCrafter: One-Shot Motion Customization of Diffusion Models.
CoRR, 2023

TCP: Textual-based Class-aware Prompt tuning for Visual-Language Model.
CoRR, 2023

Test-time Adaptive Vision-and-Language Navigation.
CoRR, 2023

Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation.
CoRR, 2023

A Survey on Interpretable Cross-modal Reasoning.
CoRR, 2023

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection.
CoRR, 2023

Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks.
CoRR, 2023

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing.
CoRR, 2023

Multi-modal Queried Object Detection in the Wild.
CoRR, 2023

ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation.
CoRR, 2023

Camera-Incremental Object Re-Identification with Identity Knowledge Evolution.
CoRR, 2023

CLIP-VG: Self-paced Curriculum Adapting of CLIP via Exploiting Pseudo-Language Labels for Visual Grounding.
CoRR, 2023

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias.
CoRR, 2023

Region-Aware Diffusion for Zero-shot Text-driven Image Editing.
CoRR, 2023

Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples.
CoRR, 2023

Open-World Social Event Classification.
Proceedings of the ACM Web Conference 2023, 2023

Multi-modal Queried Object Detection in the Wild.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

C2MR: Continual Cross-Modal Retrieval for Streaming Multi-modal Data.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

AffectFAL: Federated Active Affective Computing with Non-IID Data.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Video Entailment via Reaching a Structure-Aware Cross-modal Consensus.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Weakly-supervised Video Scene Graph Generation via Unbiased Cross-modal Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Leveraging Attribute Knowledge for Open-set Action Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Fine-grained Primitive Representation Learning for Compositional Zero-shot Classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Variational Causal Inference Network for Explanatory Visual Question Answering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

VQACL: A Novel Visual Question Answering Continual Learning Setting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Inversion-based Style Transfer with Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual-Language Prompt Tuning with Knowledge-Guided Context Optimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Unlearnable Clusters: Towards Label-Agnostic Unlearnable Examples.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Tell, Imagine, and Search: End-to-end Learning for Composing Text and Image to Image Retrieval.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Towards Corruption-Agnostic Robust Domain Adaptation.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Domain-invariant Graph for Adaptive Semi-supervised Domain Adaptation.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Image-Based Personality Questionnaire Design.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Special Section on Edge-AI for Connected Living.
ACM Trans. Internet Techn., 2022

Learning to Learn a Cold-start Sequential Recommender.
ACM Trans. Inf. Syst., 2022

Seek Common Ground While Reserving Differences: A Model-Agnostic Module for Noisy Domain Adaptation.
IEEE Trans. Multim., 2022

Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning.
IEEE Trans. Multim., 2022

Weakly-Supervised Facial Expression Recognition in the Wild With Noisy Data.
IEEE Trans. Multim., 2022

Multi-Modal Meta Multi-Task Learning for Social Media Rumor Detection.
IEEE Trans. Multim., 2022

Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning.
IEEE Trans. Multim., 2022

DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering.
IEEE Trans. Multim., 2022

Adaptive Label-Aware Graph Convolutional Networks for Cross-Modal Retrieval.
IEEE Trans. Multim., 2022

The Model May Fit You: User-Generalized Cross-Modal Retrieval.
IEEE Trans. Multim., 2022

Intra-Domain Consistency Enhancement for Unsupervised Person Re-Identification.
IEEE Trans. Multim., 2022

Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition.
IEEE Trans. Multim., 2022

Heterogeneous Hierarchical Feature Aggregation Network for Personalized Micro-Video Recommendation.
IEEE Trans. Multim., 2022

Geometry Sensitive Cross-Modal Reasoning for Composed Query Based Image Retrieval.
IEEE Trans. Image Process., 2022

Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization.
IEEE Trans. Image Process., 2022

Margin-Based Adversarial Joint Alignment Domain Adaptation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Joint Expression Synthesis and Representation Learning for Facial Expression Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2022

Multi-Object Tracking With Spatial-Temporal Topology-Based Detector.
IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Video Moment Retrieval Without a Single Annotated Video.
IEEE Trans. Circuits Syst. Video Technol., 2022

Learning Semantic-Aware Spatial-Temporal Attention for Interpretable Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2022

A unified framework for multi-modal federated learning.
Neurocomputing, 2022

TCKGE: Transformers with contrastive learning for knowledge graph embedding.
Int. J. Multim. Inf. Retr., 2022

Prototype local-global alignment network for image-text retrieval.
Int. J. Multim. Inf. Retr., 2022

Transformers in computational visual media: A survey.
Comput. Vis. Media, 2022

Non-dominated sorting based multi-page photo collage.
Comput. Vis. Media, 2022

SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification.
CoRR, 2022

Inversion-Based Creativity Transfer with Diffusion Models.
CoRR, 2022

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization.
CoRR, 2022

Learning Muti-expert Distribution Calibration for Long-tailed Video Classification.
CoRR, 2022

MGDCF: Distance Learning via Markov Graph Diffusion for Neural Collaborative Filtering.
CoRR, 2022

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding.
CoRR, 2022

Mitigating the Mutual Error Amplification for Semi-Supervised Object Detection.
CoRR, 2022

Robustly Recognizing Irregular Scene Text by Rectifying Principle Irregularities.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022

Comprehensive Relationship Reasoning for Composed Query Based Image Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Adaptive Transformer-Based Conditioned Variational Autoencoder for Incomplete Social Event Classification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Adaptive Anti-Bottleneck Multi-Modal Graph Learning Network for Personalized Micro-video Recommendation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Quantification of Artist Representativity within an Art Movement.
Proceedings of the IEEE International Conference on Multimedia and Expo Workshops, 2022

Multi-Modal Learning with Text Merging for TEXTVQA.
Proceedings of the IEEE International Conference on Acoustics, 2022

Dual-Evidential Learning for Weakly-supervised Temporal Action Localization.
Proceedings of the Computer Vision - ECCV 2022, 2022

DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dynamic Scene Graph Generation via Anticipatory Pre-training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

StyTr^2: Image Style Transfer with Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Content-Based Visual Summarization for Image Collections.
IEEE Trans. Vis. Comput. Graph., 2021

Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks for Fake News Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Health Status Prediction with Local-Global Heterogeneous Behavior Graph.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Distribution Aligned Multimodal and Multi-domain Image Stylization.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Part-based Structured Representation Learning for Person Re-identification.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Knowledge-driven Egocentric Multimodal Activity Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Dynamic Graph Learning Convolutional Networks for Semi-supervised Classification.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Multimodal Disentangled Domain Adaption for Social Media Event Rumor Detection.
IEEE Trans. Multim., 2021

Adversarial Multimodal Network for Movie Story Question Answering.
IEEE Trans. Multim., 2021

Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval.
IEEE Trans. Multim., 2021

Emotion Knowledge Driven Video Highlight Detection.
IEEE Trans. Multim., 2021

Density-Aware Multi-Task Learning for Crowd Counting.
IEEE Trans. Multim., 2021

Heterogeneous Community Question Answering via Social-Aware Multi-Modal Co-Attention Convolutional Matching.
IEEE Trans. Multim., 2021

Learning Dual-Pooling Graph Neural Networks for Few-Shot Video Classification.
IEEE Trans. Multim., 2021

Unsupervised Video Summarization via Relation-Aware Assignment Learning.
IEEE Trans. Multim., 2021

Metadata Connector: Exploiting Hashtag and Tag for Cross-OSN Event Search.
IEEE Trans. Multim., 2021

Exploring the Representativity of Art Paintings.
IEEE Trans. Multim., 2021

HAPGN: Hierarchical Attentive Pooling Graph Network for Point Cloud Segmentation.
IEEE Trans. Multim., 2021

Attention-Based Multi-Source Domain Adaptation.
IEEE Trans. Image Process., 2021

Joint Person Objectness and Repulsion for Person Search.
IEEE Trans. Image Process., 2021

TEST: Triplet Ensemble Student-Teacher Model for Unsupervised Person Re-Identification.
IEEE Trans. Image Process., 2021

SAN: Selective Alignment Network for Cross-Domain Pedestrian Detection.
IEEE Trans. Image Process., 2021

Multi-Target Multi-Camera Tracking With Optical-Based Pose Association.
IEEE Trans. Circuits Syst. Video Technol., 2021

PEN: Pose-Embedding Network for Pedestrian Detection.
IEEE Trans. Circuits Syst. Video Technol., 2021

Unified Cross-domain Classification via Geometric and Statistical Adaptations.
Pattern Recognit., 2021

Learning to Model Relationships for Zero-Shot Video Classification.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

SiamCPN: Visual tracking with the Siamese center-prediction network.
Comput. Vis. Media, 2021

Dual Cluster Contrastive learning for Person Re-Identification.
CoRR, 2021

SSAGCN: Social Soft Attention Graph Convolution Network for Pedestrian Trajectory Prediction.
CoRR, 2021

Contrastive Adaptive Propagation Graph Neural Networks for Efficient Graph Learning.
CoRR, 2021

GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation.
CoRR, 2021

Contrastive Proposal Extension with LSTM Network for Weakly Supervised Object Detection.
CoRR, 2021

StyTr^2: Unbiased Image Style Transfer with Transformers.
CoRR, 2021

Hierarchical Multi-modal Contextual Attention Network for Fake News Detection.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Few-shot Egocentric Multimodal Activity Recognition.
Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Hierarchical Multi-Task Learning for Diagram Question Answering with Multi-Modal Transformer.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Zero-shot Video Emotion Recognition via Multimodal Protagonist-aware Transformer Network.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Efficient Graph Deep Learning in TensorFlow with tf_geometric.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Weakly-Supervised Video Object Grounding via Stable Context Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Global Relation-Aware Attention Network for Image-Text Retrieval.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Text Style Transfer With Decorative Elements.
Proceedings of the 4th IEEE International Conference on Multimedia Information Processing and Retrieval, 2021

Meta-Learning Causal Feature Selection for Stable Prediction.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Diving Into The Relations: Leveraging Semantic and Visual Structures For Video Moment Retrieval.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Active Universal Domain Adaptation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fast Video Moment Retrieval.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dual Adversarial Graph Neural Networks for Multi-label Cross-modal Retrieval.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Arbitrary Video Style Transfer via Multi-Channel Correlation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Meta-path Augmented Sequential Recommendation with Contextual Co-attention Network.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Knowledge-Based Topic Model for Multi-Modal Social Event Analysis.
IEEE Trans. Multim., 2020

Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval.
IEEE Trans. Multim., 2020

CI-GNN: Building a Category-Instance Graph for Zero-Shot Video Classification.
IEEE Trans. Multim., 2020

Knowledge-aware Attentive Wasserstein Adversarial Dialogue Response Generation.
ACM Trans. Intell. Syst. Technol., 2020

A Unified Deep Model for Joint Facial Expression Recognition, Face Synthesis, and Face Alignment.
IEEE Trans. Image Process., 2020

Geometry Guided Pose-Invariant Facial Expression Recognition.
IEEE Trans. Image Process., 2020

Self-Supervised Feature Augmentation for Large Image Object Detection.
IEEE Trans. Image Process., 2020

Cross-domain personalized image captioning.
Multim. Tools Appl., 2020

Editorial.
Multim. Syst., 2020

Asymmetric multi-stage CNNs for small-scale pedestrian detection.
Neurocomputing, 2020

Multimodal graph convolutional networks for high quality content recognition.
Neurocomputing, 2020

Discriminative multimodal embedding for event classification.
Neurocomputing, 2020

Relative coordinates constraint for face alignment.
Neurocomputing, 2020

Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation.
CoRR, 2020

MMCGAN: Generative Adversarial Network with Explicit Manifold Prior.
CoRR, 2020

Adaptive Adversarial Logits Pairing.
CoRR, 2020

IEEE Access Special Section Editorial: Mobile Multimedia for Healthcare.
IEEE Access, 2020

Multi-hop Interactive Cross-Modal Retrieval.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Structured Neural Motifs: Scene Graph Parsing via Enhanced Context.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Beyond Literal Visual Modeling: Understanding Image Metaphor Based on Literal-Implied Concept Mapping.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Destylization of text with decorative elements.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Local structure alignment guided domain adaptation with few source samples.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Joint Attribute Manipulation and Modality Alignment Learning for Composing Text and Image to Image Retrieval.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-modal Multi-relational Feature Aggregation Network for Medical Knowledge Representation Learning.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-modal Attentive Graph Pooling Model for Community Question Answer Matching.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Arbitrary Style Transfer via Multi-Adaptation Network.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Fake News Detection via Knowledge-driven Multimodal Graph Convolutional Networks.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020

Multi-attribute Guided Painting Generation.
Proceedings of the 3rd IEEE Conference on Multimedia Information Processing and Retrieval, 2020

Category-Level Adversarial Self-Ensembling for Domain Adaptation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Autosoccer: An Automatic Soccer Live Broadcasting Generator.
Proceedings of the 2020 IEEE International Conference on Multimedia & Expo Workshops, 2020

Dynamic Refinement Network for Oriented and Densely Packed Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

GAEAT: Graph Auto-Encoder Attention Networks for Knowledge Graph Completion.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Image Captioning by Asking Questions.
ACM Trans. Multim. Comput. Commun. Appl., 2019

A^2CMHNE: Attention-Aware Collaborative Multimodal Heterogeneous Network Embedding.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Deep Multi-Modality Adversarial Networks for Unsupervised Domain Adaptation.
IEEE Trans. Multim., 2019

Deep Representation Learning With Part Loss for Person Re-Identification.
IEEE Trans. Image Process., 2019

SMART: Joint Sampling and Regression for Visual Tracking.
IEEE Trans. Image Process., 2019

Depth Information Guided Crowd Counting for complex crowd scenes.
Pattern Recognit. Lett., 2019

Robust Structural Sparse Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Learning Multi-Task Correlation Particle Filters for Visual Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Multi-modal max-margin supervised topic model for social event analysis.
Multim. Tools Appl., 2019

Selective clustering for representative paintings selection.
Multim. Tools Appl., 2019

Video Highlight Detection via Region-Based Deep Ranking Model.
Int. J. Pattern Recognit. Artif. Intell., 2019

DR²-Net: Deep Residual Reconstruction Network for image compressive sensing.
Neurocomputing, 2019

A Generalization Theory based on Independent and Task-Identically Distributed Assumption.
CoRR, 2019

Unpaired Images based Generator Architecture for Facial Expression Recognition.
Proceedings of the 2019 IEEE Visual Communications and Image Processing, 2019

Balance-Based Photo Posting.
Proceedings of the SIGGRAPH Asia 2019 Posters, 2019

Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data.
Proceedings of the PRICAI 2019: Trends in Artificial Intelligence, 2019

Sentiment-Aware Multi-modal Recommendation on Tourist Attractions.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

Multi-modal Knowledge-aware Hierarchical Attention Network for Explainable Medical Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Multi-modal Knowledge-aware Event Memory Network for Social Media Rumor Detection.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Biomedia ACM MM Grand Challenge 2019: Using Data Enhancement to Solve Sample Unbalance.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Adaptive Feature Fusion via Graph Neural Network for Person Re-identification.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Market2Dish: A Health-aware Food Recommendation System.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Explainable Interaction-driven User Modeling over Knowledge Graph for Sequential Recommendation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Hierarchical Graph Semantic Pooling Network for Multi-modal Community Question Answer Matching.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Exploring Feature Representation and Training Strategies in Temporal Action Localization.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Multimodal Latent Factor Model with Language Constraint for Predicate Detection.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Graph Convolutional Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Text2Video: An End-to-end Learning Framework for Expressing Text With Videos.
IEEE Trans. Multim., 2018

Deep-Structured Event Modeling for User-Generated Photos.
IEEE Trans. Multim., 2018

Understanding Dynamic Cross-OSN Associations for Cold-Start Recommendation.
IEEE Trans. Multim., 2018

Online Multimodal Multiexpert Learning for Social Event Tracking.
IEEE Trans. Multim., 2018

Cross-Domain Collaborative Learning via Discriminative Nonparametric Bayesian Model.
IEEE Trans. Multim., 2018

Correlation Particle Filter for Visual Tracking.
IEEE Trans. Image Process., 2018

P2T: Part-to-Target Tracking via Deep Regression Learning.
IEEE Trans. Image Process., 2018

Social Relationship Labeling Based on Multimodal Behaviors and Social Interactions.
IEEE Multim., 2018

Advances in Next-Generation Networking Technologies for Smart Healthcare.
IEEE Commun. Mag., 2018

IEEE Access Special Section Editorial: Advances of Multisensory Services and Technologies for Healthcare in Smart Cities.
IEEE Access, 2018

Facial Expression Recognition in the Wild: A Cycle-Consistent Adversarial Attention Transfer Approach.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Learning Multimodal Taxonomy via Variational Deep Graph Embedding and Clustering.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

A Unified Framework for Multimodal Domain Adaptation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

A Unified Generative Adversarial Framework for Image Generation and Person Re-identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

CSAN: Contextual Self-Attention Network for User Sequential Recommendation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Attentive Interactive Convolutional Matching for Community Question Answering in Social Multimedia.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Watch, Think and Attend: End-to-End Video Classification via Dynamic Knowledge Evolution Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

A Standalone Demo for Quiz Game "Describe and Guess".
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Attribute-Assisted Domain Transfer from Image to Sketch.
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Scene Recognition via Bi-enhanced Knowledge Space Learning.
Proceedings of the New Trends in Computer Technologies and Applications, 2018

Entity Competition Network for Video Classification.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Learning semantic topics for domain-adapted textual knowledge transfer.
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Joint Pose and Expression Modeling for Facial Expression Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Representation Learning of Knowledge Graphs with Entity Attributes and Multimedia Descriptions.
Proceedings of the Fourth IEEE International Conference on Multimedia Big Data, 2018

2017
Exploiting Social-Mobile Information for Location Visualization.
ACM Trans. Intell. Syst. Technol., 2017

Deep Relative Tracking.
IEEE Trans. Image Process., 2017

A discriminative graph inferring framework towards weakly supervised image parsing.
Multim. Syst., 2017

Cross-media analysis and reasoning: advances and directions.
Frontiers Inf. Technol. Electron. Eng., 2017

Cloud-Based Multimedia Services for healthcare and other related applications.
Future Gener. Comput. Syst., 2017

Learning explicit video attributes from mid-level representation for video captioning.
Comput. Vis. Image Underst., 2017

Understanding Deep Learning Generalization by Maximum Entropy.
CoRR, 2017

Impact of Next-Generation Mobile Technologies on IoT-Cloud Convergence.
IEEE Commun. Mag., 2017

Video Highlight Detection via Deep Ranking Modeling.
Proceedings of the Image and Video Technology - 8th Pacific-Rim Symposium, 2017

A Demo for Image-Based Personality Test.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Demographic Attribute Inference from Social Multimedia Behaviors: A Cross-OSN Approach.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Multi-Modal Knowledge Representation Learning via Webly-Supervised Relationships Mining.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Towards SMP Challenge: Stacking of Diverse Models for Social Image Popularity Prediction.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

A Unified Personalized Video Recommendation via Dynamic Recurrent Neural Networks.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Hashtag-centric Immersive Search on Social Media.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

A Generic Framework for Social Event Analysis.
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Multi-task Correlation Particle Filter for Robust Object Tracking.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Semantic Feature Mining for Video Event Understanding.
ACM Trans. Multim. Comput. Commun. Appl., 2016

A Unified Video Recommendation by Cross-Network User Modeling.
ACM Trans. Multim. Comput. Commun. Appl., 2016

Deep Relative Attributes.
IEEE Trans. Multim., 2016

Multi-Modal Event Topic Model for Social Event Analysis.
IEEE Trans. Multim., 2016

Folksonomy-Based Visual Ontology Construction and Its Applications.
IEEE Trans. Multim., 2016

STCAPLRS: A Spatial-Temporal Context-Aware Personalized Location Recommendation System.
ACM Trans. Intell. Syst. Technol., 2016

Robust Visual Tracking via Exclusive Context Modeling.
IEEE Trans. Cybern., 2016

Note onset detection based on sparse decomposition.
Multim. Tools Appl., 2016

An incremental probabilistic model for temporal theme analysis of landmarks.
Multim. Syst., 2016

Association Rules Mining Based Cross-network Knowledge Association and Collaborative Applications.
计算机科学 (Computer Science), 2016

Modelling Temporal Information Using Discrete Fourier Transform for Video Classification.
CoRR, 2016

Visual BFI: An Exploratory Study for Image-Based Personality Test.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Abnormal Event Discovery in User Generated Photos.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multi-modal Multi-view Topic-opinion Mining for Social Event Analysis.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Social Multimedia Mining: From Special to General.
Proceedings of the IEEE International Symposium on Multimedia, 2016

The Visual Object Tracking VOT2016 Challenge Results.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Structural Correlation Filter for Robust Visual Tracking.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Boosted Multifeature Learning for Cross-Domain Transfer.
ACM Trans. Multim. Comput. Commun. Appl., 2015

Cross-Platform Emerging Topic Detection and Elaboration from Multimedia Streams.
ACM Trans. Multim. Comput. Commun. Appl., 2015

Learning Feature Hierarchies: A Layer-Wise Tag-Embedded Approach.
IEEE Trans. Multim., 2015

Automatic Visual Concept Learning for Social Event Understanding.
IEEE Trans. Multim., 2015

Cross-Domain Feature Learning in Multimedia.
IEEE Trans. Multim., 2015

YouTube Video Promotion by Cross-Network Association: @Britney to Advertise Gangnam Style.
IEEE Trans. Multim., 2015

Knowing Verb From Object: Retagging With Transfer Learning on Verb-Object Concept Images.
IEEE Trans. Multim., 2015

Cross-OSN User Modeling by Homogeneous Behavior Quantification and Local Social Regularization.
IEEE Trans. Multim., 2015

Cross-Platform Multi-Modal Topic Modeling for Personalized Inter-Platform Recommendation.
IEEE Trans. Multim., 2015

Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval.
IEEE Trans. Multim., 2015

Word-of-Mouth Understanding: Entity-Centric Multimodal Aspect-Opinion Mining in Social Media.
IEEE Trans. Multim., 2015

Relational User Attribute Inference in Social Media.
IEEE Trans. Multim., 2015

Latent Support Vector Machine Modeling for Sign Language Recognition with Kinect.
ACM Trans. Intell. Syst. Technol., 2015

Activity Sensor: Check-In Usage Mining for Local Recommendation.
ACM Trans. Intell. Syst. Technol., 2015

Max-Confidence Boosting With Uncertainty for Visual Tracking.
IEEE Trans. Image Process., 2015

Joint Local and Global Consistency on Interdocument and Interword Relationships for Co-Clustering.
IEEE Trans. Cybern., 2015

Accumulated reconstruction error vector (AREV): a semantic representation for cross-media retrieval.
Multim. Tools Appl., 2015

Multi-object tracking via MHT with multiple information fusion in surveillance video.
Multim. Syst., 2015

A new discriminative coding method for image classification.
Multim. Syst., 2015

The 21st International Conference on MultiMedia Modeling.
IEEE Multim., 2015

Seamlessly Integrating Effective Links with Attributes for Networked Data Classification.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2015

Cross-Domain Collaborative Learning in Social Multimedia.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Unified YouTube Video Recommendation via Cross-network Collaboration.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Meta-Path based Nonnegative Matrix Factorization for clustering on multi-type relational data.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Robust gender classification on unconstrained face images.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

Structural Sparse Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Matching-CNN meets KNN: Quasi-parametric human parsing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A Probabilistic Framework for Temporal User Modeling on Microblogs.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

On Analyzing the 'Variety' of Big Social Multimedia.
Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015, 2015

2014
Cross-Domain Multi-Event Tracking via CO-PMHT.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Social Event Classification via Boosted Multimodal Supervised Latent Dirichlet Allocation.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Discovering Geo-Informative Attributes for Location Recognition and Exploration.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Twitter is Faster: Personalized Time-Aware Video Recommendation from Twitter to YouTube.
ACM Trans. Multim. Comput. Commun. Appl., 2014

A Unified Framework of Latent Feature Learning in Social Media.
IEEE Trans. Multim., 2014

Mobile Landmark Search with 3D Models.
IEEE Trans. Multim., 2014

Topic-Sensitive Influencer Mining in Interest-Based Social Media Networks via Hypergraph Learning.
IEEE Trans. Multim., 2014

Snap & Play: Auto-Generated Personalized Find-the-Difference Game.
ACM Trans. Intell. Syst. Technol., 2014

CAMHID: Camera Motion Histogram Descriptor and Its Application to Cinematographic Shot Classification.
IEEE Trans. Circuits Syst. Video Technol., 2014

Inductive hierarchical nonnegative graph embedding for "verb-object" image classification.
Mach. Vis. Appl., 2014

Preface: Internet multimedia computing and service.
Multim. Tools Appl., 2014

Multimodal Spatio-Temporal Theme Modeling for Landmark Analysis.
IEEE Multim., 2014

Mining Cross-network Association for YouTube Video Promotion.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Social multimedia computing.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Boosted Multi-modal Supervised Latent Dirichlet Allocation for Social Event Classification.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Scene and viewpoint based visual summarization for landmarks.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Multi-Modal Supervised Latent Dirichlet Allocation for Event Classification in Social Media.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Partial Occlusion Handling for Visual Tracking via Robust Part Matching.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Social influence analysis and application on multimedia sharing websites.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Enhancing news organization for convenient retrieval and browsing.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Interaction Design for Mobile Visual Search.
IEEE Trans. Multim., 2013

Script-to-Movie: A Computational Framework for Story Movie Composition.
IEEE Trans. Multim., 2013

Spectral Hashing With Semantically Consistent Graph for Image Indexing.
IEEE Trans. Multim., 2013

Cross-Space Affinity Learning with Its Application to Movie Recommendation.
IEEE Trans. Knowl. Data Eng., 2013

General Subspace Learning With Corrupted Training Data Via Graph Embedding.
IEEE Trans. Image Process., 2013

Mining Semantic Context Information for Intelligent Video Surveillance of Traffic Scenes.
IEEE Trans. Ind. Informatics, 2013

Discriminative Exemplar Coding for Sign Language Recognition With Kinect.
IEEE Trans. Cybern., 2013

Hierarchical affective content analysis in arousal and valence dimensions.
Signal Process., 2013

Self-taught dimensionality reduction on the high-dimensional small-sized data.
Pattern Recognit., 2013

M⁴L: Maximum margin Multi-instance Multi-cluster Learning for scene modeling.
Pattern Recognit., 2013

MLRank: Multi-correlation Learning to Rank for image annotation.
Pattern Recognit., 2013

Guest editorial: selected papers from ICIMCS 2011.
Multim. Syst., 2013

Web-Scale Near-Duplicate Search: Techniques and Applications.
IEEE Multim., 2013

Multi-cue Based Multi-target Tracking with Boosted MHT.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Graph-Guided Fusion Penalty Based Sparse Coding for Image Classification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Verb-Object Concepts Image Classification via Hierarchical Nonnegative Graph Embedding.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Landmark History Visualization.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Paint the City Colorfully: Location Visualization from Multiple Themes.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Latent feature learning in social media network.
Proceedings of the ACM Multimedia Conference, 2013

GIANT: geo-informative attributes for location recognition and exploration.
Proceedings of the ACM Multimedia Conference, 2013

Social event detection with robust high-order co-clustering.
Proceedings of the International Conference on Multimedia Retrieval, 2013

Tag-aware image classification via Nested Deep Belief nets.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Label localization with weakly spatial constrained graph propagation.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Friend transfer: Cold-start friend recommendation with cross-platform transfer learning of social knowledge.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Personalized video recommendation based on cross-platform user modeling.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Label localization by appearance guided graph inferring.
Proceedings of the IEEE International Conference on Image Processing, 2013

Latent support vector machine for sign language recognition with Kinect.
Proceedings of the IEEE International Conference on Image Processing, 2013

Locality discriminative coding for image classification.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

User-Oriented Social Analysis across Social Media Sites.
Proceedings of the New Trends in Image Analysis and Processing - ICIAP 2013, 2013

Low-Rank Sparse Coding for Image Classification.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Object Tracking by Occlusion Detection via Structured Sparse Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
A Generic Framework for Video Annotation via Semi-Supervised Learning.
IEEE Trans. Multim., 2012

Special Section on Object and Event Classification in Large-Scale Video Collections.
IEEE Trans. Multim., 2012

Enhanced 3-D Modeling for Landmark Image Classification.
IEEE Trans. Multim., 2012

Learn to Personalized Image Search From the Photo Sharing Websites.
IEEE Trans. Multim., 2012

User-Aware Image Tag Refinement via Ternary Semantic Analysis.
IEEE Trans. Multim., 2012

Robust Face-Name Graph Matching for Movie Character Identification.
IEEE Trans. Multim., 2012

Weakly Supervised Graph Propagation Towards Collective Image Parsing.
IEEE Trans. Multim., 2012

Inductive Robust Principal Component Analysis.
IEEE Trans. Image Process., 2012

Intelligent multimedia interactivity.
Pattern Recognit. Lett., 2012

Dimensionality reduction by Mixed Kernel Canonical Correlation Analysis.
Pattern Recognit., 2012

@ICT: attention-based virtual content insertion.
Multim. Syst., 2012

Societally connected multimedia across cultures.
J. Zhejiang Univ. Sci. C, 2012

Faceted Subtopic Retrieval: Exploiting the Topic Hierarchy via a Multi-modal Framework.
J. Multim., 2012

What Happened Near Big Ben: Event-Driven Landmark Mining from Flickr.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

Personalized Celebrity Video Search Based on Cross-Space Mining.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

Right buddy makes the difference: an early exploration of social relation analysis in multimedia applications.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Street-to-shop: cross-scenario clothing retrieval via parts alignment and auxiliary set.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Hi, magic closet, tell me what to wear!
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Multimedia news digger on emerging topics from social streams.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Ordinal preserving projection: a novel dimensionality reduction method for image ranking.
Proceedings of the International Conference on Multimedia Retrieval, 2012

Saliency Aware Locality-preserving Coding for Image Classification.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Extended MHT algorithm for multiple object tracking.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Kinect-based visual communication system.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Chat with illustration: a chat system with visual aids.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Probabilistic sequential POIs recommendation via check-in data.
Proceedings of the SIGSPATIAL 2012 International Conference on Advances in Geographic Information Systems (formerly known as GIS), 2012

Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Browse by chunks: Topic mining and organizing on web-scale social media.
ACM Trans. Multim. Comput. Commun. Appl., 2011

Boosted Exemplar Learning for Action Recognition and Annotation.
IEEE Trans. Circuits Syst. Video Technol., 2011

Frame Fusion for Video Copy Detection.
IEEE Trans. Circuits Syst. Video Technol., 2011

Boosted multi-class semi-supervised learning for human action recognition.
Pattern Recognit., 2011

Audio-visual large-scale video copy detection.
Int. J. Comput. Math., 2011

Boosting part-sense multi-feature learners toward effective object detection.
Comput. Vis. Image Underst., 2011

Generative Group Activity Analysis with Quaternion Descriptor.
Proceedings of the Advances in Multimedia Modeling, 2011

A Visualized Communication System Using Cross-Media Semantic Association.
Proceedings of the Advances in Multimedia Modeling, 2011

Learning "verb-object" concepts for semantic image annotation.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Landmark recognition and retrieval: from 2D to 3D.
Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding, 2011

Exploiting user information for image tag refinement.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Snap & play: auto-generate personalized find-the-difference mobile game.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

News contextualization with geographic and visual information.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Robust movie character identification and the sensitivity analysis.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Descriptive local feature groups for image classification.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Using context saliency for movie shot classification.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Copy detection towards semantic mining for video retrieval.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

One step beyond bags of features: Visual categorization using components.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Image classification by non-negative sparse coding, low-rank and sparse decomposition.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

TVParser: An automatic TV video parsing method.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Size Adaptive Selection of Most Informative Features.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Introduction to the best papers of ACM multimedia 2009.
ACM Trans. Multim. Comput. Commun. Appl., 2010

Building topographic subspace model with transfer learning for sparse representation.
Neurocomputing, 2010

Cross-media retrieval: state-of-the-art and open issues.
Int. J. Multim. Intell. Secur., 2010

Personalized Sports Video Customization Using Content and Context Analysis.
Int. J. Digit. Multim. Broadcast., 2010

Discovering Phrase-Level Lexicon for Image Annotation.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

Using Scripts for Affective Content Retrieval.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

Visual Attention Based Motion Object Detection and Trajectory Tracking.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

Personalized Sports Video Customization for Mobile Devices.
Proceedings of the Advances in Multimedia Modeling, 2010

Extended CBIR via Learning Semantics of Query Image.
Proceedings of the Advances in Multimedia Modeling, 2010

A generic framework for event detection in various video domains.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Landmark image classification using 3D point clouds.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Character-based movie summarization.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

The third eye: mining the visual cognition across multi-language communities.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Fast feature selection and training for AdaBoost-based concept detection with large scale datasets.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Extracting Key Sub-trajectory Features for Supervised Tactic Detection in Sports Video.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Video based 3D reconstruction using spatio-temporal attention analysis.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Event based news video people classification and ranking using multimodality features.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

A close-up detection method for movies.
Proceedings of the International Conference on Image Processing, 2010

Visual attention based small object segmentation in natural images.
Proceedings of the International Conference on Image Processing, 2010

Human action recognition via multi-view learning.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Adaptive local hyperplanes for MTV affective analysis.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Feature selection under learning to rank model for multimedia retrieve.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Hausdorff matching based SVD-covariance descriptor for object tracking.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Coherent bag-of audio words model for efficient large-scale video copy detection.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

Image Classification Using Spatial Pyramid Coding and Visual Word Reweighting.
Proceedings of the Computer Vision - ACCV 2010, 2010

2009
Event Tactic Analysis Based on Broadcast Sports Video.
IEEE Trans. Multim., 2009

Character Identification in Feature-Length Films Using Global Face-Name Matching.
IEEE Trans. Multim., 2009

Effective Annotation and Search for Video Blogs with Integration of Context and Content Analysis.
IEEE Trans. Multim., 2009

Personalized retrieval of sports video based on multi-modal analysis and user preference acquisition.
Multim. Tools Appl., 2009

Sports Video Analysis: Semantics Extraction, Editorial Content Creation and Adaptation.
J. Multim., 2009

A Hierarchical Semantics-Matching Approach for Sports Video Annotation.
Proceedings of the Advances in Multimedia Information Processing, 2009

A Novel Role-Based Movie Scene Segmentation Method.
Proceedings of the Advances in Multimedia Information Processing, 2009

Naming faces in films using hypergraph matching.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Multi-view multi-label active learning for image classification.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Context saliency based image summarization.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Web image retrieval via learning semantics of query image.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Advertise gently - in-image advertising with low intrusiveness.
Proceedings of the International Conference on Image Processing, 2009

2008
Audio keywords generation for sports video analysis.
ACM Trans. Multim. Comput. Commun. Appl., 2008

Using Webcast Text for Semantic Event Detection in Broadcast Sports Video.
IEEE Trans. Multim., 2008

A Novel Framework for Semantic Annotation and Personalized Retrieval of Sports Video.
IEEE Trans. Multim., 2008

Automatic composition of broadcast sports video.
Multim. Syst., 2008

A generic virtual content insertion system based on visual attention analysis.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Collaborate ball and player trajectory extraction in broadcast soccer video.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Automatic character identification in feature-length films.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Automatic semantic annotation for video blogs.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Lower attentive region detection for virtual content insertion in broadcast video.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Personalization of media and its attention service applications.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Event tactic analysis based on player and ball trajectory in broadcast video.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

2007
Content-adaptive digital music watermarking based on music structure analysis.
ACM Trans. Multim. Comput. Commun. Appl., 2007

Human Behavior Analysis for Highlight Ranking in Broadcast Racket Sports Video.
IEEE Trans. Multim., 2007

Generation of Personalized Music Sports Video Using Multimodal Cues.
IEEE Trans. Multim., 2007

Automatic TV Logo Detection, Tracking and Removal in Broadcast Video.
Proceedings of the Advances in Multimedia Modeling, 2007

Trajectory based event tactics analysis in broadcast sports video.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Region-based visual attention analysis with its application in image browsing on small displays.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Personalized retrieval of sports video.
Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007

Semantic Event Extraction from Basketball Games using Multi-Modal Analysis.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Semantic Analysis and Personalization for Mobile Media Applications.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

The Demo: A Real-Time Score Detection and Recognition Approach in Broadcast Basketball Sports Video.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

A Real-Time Score Detection and Recognition Approach for Broadcast Basketball Video.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Outlier Detection from Pooled Data for Image Retrieval System Evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Automatic summarization of music videos.
ACM Trans. Multim. Comput. Commun. Appl., 2006

Trajectory-Based Ball Detection and Tracking in Broadcast Soccer Video.
IEEE Trans. Multim., 2006

Nonparametric motion characterization for robust classification of camera motion patterns.
IEEE Trans. Multim., 2006

Player action recognition in broadcast tennis video with applications to semantic analysis of sports game.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Live sports event detection based on broadcast video and web-casting text.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Segmentation, categorization, and identification of commercial clips from TV streams using multimodal analysis.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

Action Recognition in Broadcast Tennis Video.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Automatic Sports Video Genre Classification using Pseudo-2D-HMM.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Local Motion Analysis and Its Application in Video based Swimming Style Recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Efficient Relevance Feedback Using Semi-supervised Kernel-specified K-means Clustering.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Reliable Video Clock Time Recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Automatic Multi-Player Detection and Tracking in Broadcast Sports Video using Support Vector Machine and Particle Filter.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Identify Sports Video Shots with "Happy" or "Sad" Emotions.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Fully and Semi-Automatic Music Sports Video Composition.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Automatic Content Placement in Sports Highlights.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Predominant Vocal Pitch Detection in Polyphonic Music.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

An Automatic Classification System Applied in Medical Images.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Matlab-Based Simulation of System Stability In Frequency-Field Analysis.
Proceedings of the First International Conference on Innovative Computing, Information and Control (ICICIC 2006), 30 August, 2006

A Mid-Level Scene Change Representation Via Audiovisual Alignment.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Video Clock Time Recognition Based on Temporal Periodic Pattern Change of the Digit Characters.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Action Recognition in Broadcast Tennis Video Using Optical Flow and Support Vector Machine.
Proceedings of the Computer Vision in Human-Computer Interaction, 2006

Two-stage SVM for Medical Image Annotation.
Proceedings of the Working Notes for CLEF 2006 Workshop co-located with the 10th European Conference on Digital Libraries (ECDL 2006), 2006

Stripe: Image Feature Based on a New Grid Method and Its Application in ImageCLEF.
Proceedings of the Information Retrieval Technology, 2006

2005
A unified framework for semantic shot classification in sports video.
IEEE Trans. Multim., 2005

Automatic music classification and summarization.
IEEE Trans. Speech Audio Process., 2005

Shot-Level Camera Motion Estimation Based on a Parametric Model.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

Automatic music video summarization based on audio-visual-text analysis and alignment.
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

MultiPRE: a novel framework with multiple parallel retrieval engines for content-based image retrieval.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Automatic generation of personalized music sports video.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Real time advertisement insertion in baseball video based on advertisement effect.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

A unified framework for semantic shot representation of sports video.
Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2005

Automatic mobile sports highlights.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Periodicity Detection of Local Motion.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A Mid-level Visual Concept Generation Framework for Sports Analysis.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Replay Scene Classification in Soccer Video Using Web Broadcast Text.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Content-based medical image retrieval using dynamically optimized regional features.
Proceedings of the 2005 International Conference on Image Processing, 2005

Peg-free Human Hand Shape Analysis and Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Soccer replay detection using scene transition structure analysis.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Automatic music summarization based on music structure analysis.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Combining Multilevel Visual Features for Medical Image Retrieval in ImageCLEFmed 2005.
Proceedings of the Working Notes for CLEF 2005 Workshop co-located with the 9th European Conference on Digital Libraries (ECDL 2005), 2005

Combining Visual Features for Medical Image Retrieval and Annotation.
Proceedings of the Accessing Multilingual Information Repositories, 2005

Report on the Annotation Task in ImageCLEFmed 2005.
Proceedings of the Working Notes for CLEF 2005 Workshop co-located with the 9th European Conference on Digital Libraries (ECDL 2005), 2005

2004
Fast and Robust Short Video Clip Search for Copy Detection.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Implanting Virtual Advertisement into Broadcast Soccer Video.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

HMM-Based Audio Keyword Generation.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Archiving Tennis Video Clips Based on Tactics Information.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Automatic Sports Highlights Extraction with Content Augmentation.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Semantic Region Detection in Acoustic Music Signals.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

A robust and accumulator-free ellipse hough transform.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Audio keyword generation for sports video analysis.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Automatic replay generation for soccer video broadcasting.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Content-based music structure analysis with applications to music semantics understanding.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Fast and robust video clip search using index structure.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Nonparametric motion model.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Nonparametric motion model with applications to camera motion pattern classification.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Fast and robust short video clip search using an index structure.
Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2004

Efficient Multimodal Features for Automatic Soccer Highlight Generation.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Singer Identification Based on Vocal and Instrumental Models.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Visual Keywords Labeling in Soccer Video.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

A robust Hough-based algorithm for partial ellipse detection in broadcast soccer video.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Automatically summarize musical audio using adaptive clustering.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Sports highlight detection from keyword sequences using HMM.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Robust soccer highlight generation with a novel dominant-speech feature extractor.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Unsupervised classification of music genre using hidden Markov model.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Singing voice detection using twice-iterated composite Fourier transform.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Event detection based on non-broadcast sports video.
Proceedings of the 2004 International Conference on Image Processing, 2004

A new approach to automatic music video summarization.
Proceedings of the 2004 International Conference on Image Processing, 2004

Goal detection in soccer video using audio/visual keywords.
Proceedings of the 2004 International Conference on Image Processing, 2004

Mean shift based nonparametric motion characterization.
Proceedings of the 2004 International Conference on Image Processing, 2004

Automatic music summarization in compressed domain.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Mean shift based video segment representation and applications to replay detection.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Fault-induced attack on semi-fragile image authentication schemes.
Proceedings of the Visual Communications and Image Processing 2003, 2003

Trajectory-based ball detection and tracking with applications to semantic analysis of broadcast soccer video.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Robust goal-mouth detection for virtual content insertion.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Real-time goal-mouth detection in MPEG soccer video.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

Nonparametric color characterization using mean shift.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

A mid-level representation framework for semantic sports video analysis.
Proceedings of the Eleventh ACM International Conference on Multimedia, 2003

An SVM-based classification approach to musical audio.
Proceedings of the ISMIR 2003, 2003

A ball tracking framework for broadcast soccer video.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Creating audio keywords for event detection in soccer video.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

The security flaws in some authentication watermarking schemes.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Automatically generating summaries for musical video.
Proceedings of the 2003 International Conference on Image Processing, 2003

Musical genre classification using support vector machines.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

A fusion scheme of visual and auditory modalities for event detection in sports video.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Real-time camera field-view tracking in soccer video.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Robust and efficient content-based digital audio watermarking.
Multim. Syst., 2002

Counterfeiting Attack on a Lossless Authentication Watermarking Scheme.
Proceedings of the Visualisation 2002, 2002

Image Mosaics Based on Homogenous Coordinates.
Proceedings of the Visualisation 2002, 2002

Support Vector Machine Learning for Music Discrimination.
Proceedings of the Advances in Multimedia Information Processing, 2002

Statistical Analysis of Musical Instruments.
Proceedings of the Advances in Multimedia Information Processing, 2002

Efficient Object-Based Stream Authentication.
Proceedings of the Progress in Cryptology, 2002

Automatic music summarization based on temporal, spectral and cepstral features.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

2001
Content Protection and Usage Control for Digital Music.
Proceedings of the First International Conference on WEB Delivering of Music (WEDELMUSIC '01), 2001

Web-based Image Authentication Using Invisible Fragile Watermark.
Proceedings of the Visualisation 2001, 2001

Pitch Tracking and Melody Slope Matching for Song Retrieval.
Proceedings of the Advances in Multimedia Information Processing, 2001

Copyright Protection for WAV-Table Synthesis Audio Using Digital Watermarking.
Proceedings of the Advances in Multimedia Information Processing, 2001

Digital audio watermarking based-on multiple-bit hopping and human auditory system.
Proceedings of the 9th ACM International Conference on Multimedia 2001, Ottawa, Ontario, Canada, September 30, 2001

Content-based Retrieval for Digital Audio and Music.
Proceedings of the Fifth IASTED International Conference Internet and Multimedia Systems and Applications (IMSA 2001), 2001

Web-based Protection and Secure Distribution for Digital Music.
Proceedings of the Fifth IASTED International Conference Internet and Multimedia Systems and Applications (IMSA 2001), 2001

Melody Curve Processing For Music Retrieval.
Proceedings of the 2001 IEEE International Conference on Multimedia and Expo, 2001

A Robust And Fast Watermarking Scheme For Compressed Audio.
Proceedings of the 2001 IEEE International Conference on Multimedia and Expo, 2001

2000
Audio registration and its application in digital watermarking.
Proceedings of the Security and Watermarking of Multimedia Contents II, 2000

Content-Based Watermarking for Compressed Audio.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2000

1999
A Visual Processing system for Facial Prediction.
Proceedings of the Visual Information and Information Systems, 1999

A robust digital audio watermarking technique.
Proceedings of the ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications, 1999

Digital audio watermarking and its application in multimedia database.
Proceedings of the ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications, 1999

