Yongdong Zhang

Orcid: 0000-0002-1151-1792

Affiliations:
  • University of Science and Technology of China, National Engineering Laboratory for Brain-inspired Intelligence Technology and Application, Hefei, China
  • Chinese Academy of Sciences, Institute of Computing Technology, Key Laboratory of Intelligent Information Processing, Beijing, China
  • University of the Chinese Academy of Sciences, Beijing, China
  • Tianjin University, China (PhD 2002)


According to our database, Yongdong Zhang authored at least 532 papers between 2004 and 2024.

Bibliography

2024
Exploring Visual Relationships via Transformer-based Graphs for Enhanced Image Captioning.
ACM Trans. Multim. Comput. Commun. Appl., May, 2024

Causal Incremental Graph Convolution for Recommender System Retraining.
IEEE Trans. Neural Networks Learn. Syst., April, 2024

Learning to Supervise Knowledge Retrieval Over a Tree Structure for Visual Question Answering.
IEEE Trans. Multim., 2024

CDCM: ChatGPT-Aided Diversity-Aware Causal Model for Interactive Recommendation.
IEEE Trans. Multim., 2024

Balanced Classification: A Unified Framework for Long-Tailed Object Detection.
IEEE Trans. Multim., 2024

Event-Aware Retrospective Learning for Knowledge-Based Image Captioning.
IEEE Trans. Multim., 2024

Counterfactual Visual Dialog: Robust Commonsense Knowledge Learning From Unbiased Training.
IEEE Trans. Multim., 2024

Prototype-Augmented Self-Supervised Generative Network for Generalized Zero-Shot Learning.
IEEE Trans. Image Process., 2024

Decoupled Cross-Modal Phrase-Attention Network for Image-Sentence Matching.
IEEE Trans. Image Process., 2024

Efficient Dynamic Correspondence Network.
IEEE Trans. Image Process., 2024

Robust and Generalized Physical Adversarial Attacks via Meta-GAN.
IEEE Trans. Inf. Forensics Secur., 2024

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations.
CoRR, 2024

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization.
CoRR, 2024

Alleviating Structural Distribution Shift in Graph Anomaly Detection.
CoRR, 2024

Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification.
CoRR, 2024

Task-Adaptive Prompted Transformer for Cross-Domain Few-Shot Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Bootstrapping Large Language Models for Radiology Report Generation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Discriminative Feature Mining Based on Frequency Information and Metric Learning for Face Forgery Detection.
IEEE Trans. Knowl. Data Eng., December, 2023

Long-Short Range Adaptive Transformer With Dynamic Sampling for 3D Object Detection.
IEEE Trans. Circuits Syst. Video Technol., December, 2023

Dynamic Keypoint Detection Network for Image Matching.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Meta semi-supervised medical image segmentation with label hierarchy.
Health Inf. Sci. Syst., December, 2023

Ridge-Regression-Induced Robust Graph Relational Network.
IEEE Trans. Cybern., September, 2023

Constructing Spatio-Temporal Graphs for Face Forgery Detection.
ACM Trans. Web, August, 2023

Rumor detection with self-supervised learning on texts and social graph.
Frontiers Comput. Sci., August, 2023

Task-Aware Weakly Supervised Object Localization With Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

CatGCN: Graph Convolutional Networks With Categorical Node Features.
IEEE Trans. Knowl. Data Eng., April, 2023

Neighborhood-Adaptive Multi-Cluster Ranking for Deep Metric Learning.
IEEE Trans. Circuits Syst. Video Technol., April, 2023

Uncertainty Guided Collaborative Training for Weakly Supervised and Unsupervised Temporal Action Localization.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Multi-task hourglass network for online automatic diagnosis of developmental dysplasia of the hip.
World Wide Web (WWW), March, 2023

Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Addressing Confounding Feature Issue for Causal Recommendation.
ACM Trans. Inf. Syst., 2023

Unified Adaptive Relevance Distinguishable Attention Network for Image-Text Matching.
IEEE Trans. Multim., 2023

S$^{2}$-Net: Semantic and Saliency Attention Network for Person Re-Identification.
IEEE Trans. Multim., 2023

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection.
IEEE Trans. Multim., 2023

Learning Cross-Channel Representations for Semantic Segmentation.
IEEE Trans. Multim., 2023

Multi-Scale Fine-Grained Alignments for Image and Sentence Matching.
IEEE Trans. Multim., 2023

Intra-Class Adaptive Augmentation With Neighbor Correction for Deep Metric Learning.
IEEE Trans. Multim., 2023

What is the Real Need for Scene Text Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties.
IEEE Trans. Image Process., 2023

Prototypical Matching Networks for Video Object Segmentation.
IEEE Trans. Image Process., 2023

Hierarchical Shape-Consistent Transformer for Unsupervised Point Cloud Shape Correspondence.
IEEE Trans. Image Process., 2023

On the Calibration of Large Language Models and Alignment.
CoRR, 2023

Causality is all you need.
CoRR, 2023

ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences.
CoRR, 2023

Promoting Generalization for Exact Solvers via Adversarial Instance Augmentation.
CoRR, 2023

Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning.
CoRR, 2023

A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability.
CoRR, 2023

Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction.
CoRR, 2023

T2IW: Joint Text to Image & Watermark Generation.
CoRR, 2023

A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design.
CoRR, 2023

MCDAN: a Multi-scale Context-enhanced Dynamic Attention Network for Diffusion Prediction.
CoRR, 2023

MomentDiff: Generative Video Moment Retrieval from Random to Real.
CoRR, 2023

DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation.
CoRR, 2023

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts.
CoRR, 2023

kNN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference.
CoRR, 2023

Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution.
CoRR, 2023

Addressing Heterophily in Graph Anomaly Detection: A Perspective of Graph Spectrum.
Proceedings of the ACM Web Conference 2023, 2023

Alleviating Structural Distribution Shift in Graph Anomaly Detection.
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

Reformulating CTR Prediction: Learning Invariant Feature Interactions for Recommendation.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

MomentDiff: Generative Video Moment Retrieval from Random to Real.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Frequency-based Zero-Shot Learning with Phase Augmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Masked Text Modeling: A Self-Supervised Pre-training Method for Scene Text Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

My Brother Helps Me: Node Injection Based Adversarial Attack on Social Bot Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CARIS: Context-Aware Referring Image Segmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

High Fidelity Face Swapping via Semantics Disentanglement and Structure Enhancement.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Improving Rumor Detection by Class-based Adversarial Domain Adaptation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

RAIRNet: Region-Aware Identity Rectification for Face Forgery Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

De Novo Molecular Generation via Connection-aware Motif Mining.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SADE: A Self-Adaptive Expert for Multi-Dataset Question Answering.
Proceedings of the IEEE International Conference on Acoustics, 2023

On the Calibration of Large Language Models and Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Grammatical Error Correction via Mixed-Grained Weighted Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Crossing the Gap: Domain Generalization for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Semantic Relationship among Instances for Image-Text Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Dynamic Generative Targeted Attacks with Pattern Injection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

S2ynRE: Two-stage Self-training with Synthetic data for Low-resource Relation Extraction.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Exploring Stroke-Level Modifications for Scene Text Editing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Reliability Enhancement for VR Delivery in Mobile-Edge Empowered Dual-Connectivity Sub-6 GHz and mmWave HetNets.
IEEE Trans. Wirel. Commun., 2022

Toward Region-Aware Attention Learning for Scene Graph Generation.
IEEE Trans. Neural Networks Learn. Syst., 2022

Multimodal Learning for Temporally Coherent Talking Face Generation With Articulator Synergy.
IEEE Trans. Multim., 2022

Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning.
IEEE Trans. Multim., 2022

Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.
IEEE Trans. Multim., 2022

Focus Your Attention: A Focal Attention for Multimodal Learning.
IEEE Trans. Multim., 2022

Online Residual Quantization Via Streaming Data Correlation Preserving.
IEEE Trans. Multim., 2022

Mixed Dish Recognition With Contextual Relation and Domain Alignment.
IEEE Trans. Multim., 2022

Dynamic-Aware Federated Learning for Face Forgery Video Detection.
ACM Trans. Intell. Syst. Technol., 2022

PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition.
IEEE Trans. Image Process., 2022

Adversarial Transformers for Weakly Supervised Object Localization.
IEEE Trans. Image Process., 2022

Diverse Complementary Part Mining for Weakly Supervised Object Localization.
IEEE Trans. Image Process., 2022

Visible-Infrared Person Re-Identification With Modality-Specific Memory Network.
IEEE Trans. Image Process., 2022

Deep Fourier Ranking Quantization for Semi-Supervised Image Retrieval.
IEEE Trans. Image Process., 2022

Semantically Similarity-Wise Dual-Branch Network for Scene Graph Generation.
IEEE Trans. Circuits Syst. Video Technol., 2022

High-Order Interaction Learning for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Region-Aware Image Captioning via Interaction Learning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Bilateral Temporal Re-Aggregation for Weakly-Supervised Video Object Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Adaptive Spatial Location With Balanced Loss for Video Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Self-Supervised Synthesis Ranking for Deep Metric Learning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Semi-Supervised Text Detection With Accurate Pseudo-Labels.
IEEE Signal Process. Lett., 2022

Context-Aware Visual Policy Network for Fine-Grained Image Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Attention-guided transformation-invariant attack for black-box adversarial examples.
Int. J. Intell. Syst., 2022

Explainable Sparse Knowledge Graph Completion via High-order Graph Reasoning Network.
CoRR, 2022

Interpolative Distillation for Unifying Biased and Debiased Recommendation.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fine-tuning with Multi-modal Entity Prompts for News Image Captioning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Finding the Host from the Lesion by Iteratively Mining the Registration Graph.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dual Part Discovery Network for Zero-Shot Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Addressing Unmeasured Confounder for Recommendation with Sensitivity Analysis.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Detecting Tampered Scene Text in the Wild.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022

Cross-Modality Transformer for Visible-Infrared Person Re-Identification.
Proceedings of the Computer Vision - ECCV 2022, 2022

Negative-Aware Attention Framework for Image-Text Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Motion-modulated Temporal Fragment Alignment Network For Few-Shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Partial Class Activation Attention for Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Abusive Language Detection with Graph based Multi-task Learning.
Proceedings of the IEEE International Conference on Big Data, 2022

Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Blockchain Based Secure Data Aggregation and Distributed Power Dispatching for Microgrids.
IEEE Trans. Smart Grid, 2021

Depth Image Denoising Using Nuclear Norm and Learning Graph Model.
ACM Trans. Multim. Comput. Commun. Appl., 2021

R-Net: A Relationship Network for Efficient and Accurate Scene Text Detection.
IEEE Trans. Multim., 2021

Domain-Oriented Semantic Embedding for Zero-Shot Learning.
IEEE Trans. Multim., 2021

A Mutually Attentive Co-Training Framework for Semi-Supervised Recognition.
IEEE Trans. Multim., 2021

Adaptively Clustering-Driven Learning for Visual Relationship Detection.
IEEE Trans. Multim., 2021

Hip Landmark Detection With Dependency Mining in Ultrasound Image.
IEEE Trans. Medical Imaging, 2021

Self-Supervised Attention Mechanism for Pediatric Bone Age Assessment With Efficient Weak Annotation.
IEEE Trans. Medical Imaging, 2021

Local Correspondence Network for Weakly Supervised Temporal Sentence Grounding.
IEEE Trans. Image Process., 2021

Multi-Scale Structure-Aware Network for Weakly Supervised Temporal Action Detection.
IEEE Trans. Image Process., 2021

CGNet: A Light-Weight Context Guided Network for Semantic Segmentation.
IEEE Trans. Image Process., 2021

Consistency Graph Modeling for Semantic Correspondence.
IEEE Trans. Image Process., 2021

Review and Arrange: Curriculum Learning for Natural Language Understanding.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Dual Optical Path Based Adaptive Compressive Sensing Imaging System.
Sensors, 2021

PRRNet: Pixel-Region relation network for face forgery detection.
Pattern Recognit., 2021

ROBP a robust border-peeling clustering using Cauchy kernel.
Inf. Sci., 2021

A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks.
CoRR, 2021

Cross-Modal Attention Consistency for Video-Audio Unsupervised Learning.
CoRR, 2021

AttriMeter: An Attribute-guided Metric Interpreter for Person Re-Identification.
CoRR, 2021

Causal Intervention for Leveraging Popularity Bias in Recommendation.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

TDI TextSpotter: Taking Data Imbalance into Account in Scene Text Spotting.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Mask and Predict: Multi-step Reasoning for Scene Graph Generation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Triangle-Reward Reinforcement Learning: A Visual-Linguistic Semantic Alignment for Image Captioning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Dynamic Inconsistency-aware DeepFake Video Detection.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Task-aware Part Mining Network for Few-Shot Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Foreground Activation Maps for Weakly Supervised Object Localization.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Meta-Attack: Class-agnostic and Model-agnostic Physical Adversarial Attack.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Explainable Person Re-Identification with Attribute-guided Metric Distillation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Lesion-Aware Transformers for Diabetic Retinopathy Grading.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Action Unit Memory Network for Weakly Supervised Temporal Action Localization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Semantic-guided Reinforced Region Embedding for Generalized Zero-Shot Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Deep Metric Learning with Self-Supervised Ranking.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A User-Centric Handover Scheme for Ultra-Dense LEO Satellite Networks.
IEEE Wirel. Commun. Lett., 2020

Energy Efficiency and Traffic Offloading Optimization in Integrated Satellite/Terrestrial Radio Access Networks.
IEEE Trans. Wirel. Commun., 2020

Robust Deep Co-Saliency Detection With Group Semantic and Pyramid Attention.
IEEE Trans. Neural Networks Learn. Syst., 2020

Corrections to "STAT: Spatial-Temporal Attention Mechanism for Video Captioning".
IEEE Trans. Multim., 2020

STAT: Spatial-Temporal Attention Mechanism for Video Captioning.
IEEE Trans. Multim., 2020

3D Room Layout Estimation From a Single RGB Image.
IEEE Trans. Multim., 2020

Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning.
IEEE Trans. Multim., 2020

Bidirectional Attention-Recognition Model for Fine-Grained Object Classification.
IEEE Trans. Multim., 2020

Misshapen Pelvis Landmark Detection With Local-Global Feature Learning for Diagnosing Developmental Dysplasia of the Hip.
IEEE Trans. Medical Imaging, 2020

Unlocking Author Power: On the Exploitation of Auxiliary Author-Retweeter Relations for Predicting Key Retweeters.
IEEE Trans. Knowl. Data Eng., 2020

Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition.
IEEE Trans. Image Process., 2020

Self-Supervised Agent Learning for Unsupervised Cross-Domain Person Re-Identification.
IEEE Trans. Image Process., 2020

Panoramic Light Field From Hand-Held Video and Its Sampling for Real-Time Rendering.
IEEE Trans. Circuits Syst. Video Technol., 2020

Robust non-negative matrix factorization with multiple correntropy-induced hypergraph regularizer.
Signal Process., 2020

Perspective-Adaptive Convolutions for Scene Parsing.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

An ICN/SDN-Based Network Architecture and Efficient Content Retrieval for Future Satellite-Terrestrial Integrated Networks.
IEEE Netw., 2020

Saliency Prediction Network for 360° Videos.
IEEE J. Sel. Top. Signal Process., 2020

Global context and boundary structure-guided network for cross-modal organ segmentation.
Inf. Process. Manag., 2020

Reliability Enhancement for VR Delivery in Mobile-Edge Empowered Dual-Connectivity μWave-mmWave HetNets.
CoRR, 2020

Bilinear Graph Neural Network with Node Interactions.
CoRR, 2020

Graph Convolution Machine for Context-aware Recommender System.
CoRR, 2020

How to Retrain Recommender System?: A Sequential Meta-Learning Method.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Hierarchical Granularity Transfer Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

A Feature Generalization Framework for Social Media Popularity Prediction.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Part-Aware Interactive Learning for Scene Graph Generation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Dual Context-Aware Refinement Network for Person Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

A Cross-modality and Progressive Person Search System.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Rich Attention for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Overcoming Language Priors with Self-supervised Learning for Visual Question Answering.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Bilinear Graph Neural Network with Neighbor Interactions.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Self-Supervised Domain-Aware Generative Network for Generalized Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Multi-Modality Cross Attention Network for Image and Sentence Matching.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Graph Structured Network for Image-Text Matching.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Curriculum Learning for Natural Language Understanding.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

CircleNet for Hip Landmark Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Temporal Netgrid Model-Based Dynamic Routing in Large-Scale Small Satellite Networks.
IEEE Trans. Veh. Technol., 2019

Spatiotemporal-Textual Co-Attention Network for Video Question Answering.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Convolutional Attention Networks for Scene Text Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Dense 3D-Convolutional Neural Network for Person Re-Identification in Videos.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Cross-Modality Bridging and Knowledge Transferring for Image Understanding.
IEEE Trans. Multim., 2019

Double-Bit Quantization and Index Hashing for Nearest Neighbor Search.
IEEE Trans. Multim., 2019

Deep Representation Learning With Part Loss for Person Re-Identification.
IEEE Trans. Image Process., 2019

Multi-Domain and Multi-Task Learning for Human Action Recognition.
IEEE Trans. Image Process., 2019

Asymmetric GAN for Unpaired Image-to-Image Translation.
IEEE Trans. Image Process., 2019

Dual-Stream Recurrent Neural Network for Video Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2019

Dynamic Resource Allocation for Streaming Scalable Videos in SDN-Aided Dense Small-Cell Networks.
IEEE Trans. Commun., 2019

Automated pulmonary nodule detection in CT images using deep convolutional neural networks.
Pattern Recognit., 2019

Real-time indoor scene reconstruction with Manhattan assumption.
Multim. Tools Appl., 2019

Scene-adaptive coded aperture imaging.
Multim. Tools Appl., 2019

Detection and tracking based tubelet generation for video object detection.
J. Vis. Commun. Image Represent., 2019

DR$^{2}$-Net: Deep Residual Reconstruction Network for image compressive sensing.
Neurocomputing, 2019

Scheduled Differentiable Architecture Search for Visual Recognition.
CoRR, 2019

Consensus Feature Network for Scene Parsing.
CoRR, 2019

Dense Scale Network for Crowd Counting.
CoRR, 2019

Freely Explore the Scene with 360° Field of View.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2019

Relational Collaborative Filtering: Modeling Multiple Item Relations for Recommendation.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

WaveCSN: Cascade Segmentation Network for Hip Landmark Detection.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Adaptive Bilinear Pooling for Fine-grained Representation Learning.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Domain-Specific Embedding Network for Zero-Shot Recognition.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Deep Adversarial Graph Attention Convolution Network for Text-Based Person Search.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Focus Your Attention: A Bidirectional Focal Attention Network for Image-Text Matching.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Mixed-dish Recognition with Contextual Relation Networks.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

ACE-Net: Biomedical Image Segmentation with Augmented Contracting and Expansive Paths.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Deep Cascaded Attention Network for Multi-task Brain Tumor Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Misshapen Pelvis Landmark Detection by Spatial Local Correlation Mining for Diagnosing Developmental Dysplasia of the Hip.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Extract Bone Parts Without Human Prior: End-to-end Convolutional Neural Network for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Spatiotemporal Breast Mass Detection Network (MD-Net) in 4D DCE-MRI Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

DSRN: A Deep Scale Relationship Network for Scene Text Detection.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Boundary Perception Guidance: A Scribble-Supervised Semantic Segmentation Approach.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning to Draw Text in Natural Images with Conditional Adversarial Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Semi-supervised User Profiling with Heterogeneous Graph Attention Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Real-time Indoor Scene Reconstruction with RGBD and Inertial Input.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

MLTS: A Multi-Language Scene Text Spotter.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Semantic-Embedding and Shape-Aware U-Net for Ultrasound Eyeball Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Accurate Segmentation of Synaptic Cleft with Contour Growing Concatenated with a Convnet.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Near-infrared Image Guided Neural Networks for Color Image Denoising.
Proceedings of the IEEE International Conference on Acoustics, 2019

APE-GAN: Adversarial Perturbation Elimination with GAN.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Two-Stream Mutual Attention Network for Semi-Supervised Biomedical Segmentation with Noisy Labels.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Spherical Superpixel Segmentation.
IEEE Trans. Multim., 2018

A Fast Uyghur Text Detector for Complex Background Images.
IEEE Trans. Multim., 2018

GLA: Global-Local Attention for Image Description.
IEEE Trans. Multim., 2018

Supervised Hash Coding With Deep Neural Network for Environment Perception of Intelligent Vehicles.
IEEE Trans. Intell. Transp. Syst., 2018

Effective Uyghur Language Text Detection in Complex Background Images for Traffic Prompt Identification.
IEEE Trans. Intell. Transp. Syst., 2018

AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.
IEEE Trans. Image Process., 2018

Implicit Negative Sub-Categorization and Sink Diversion for Object Detection.
IEEE Trans. Image Process., 2018

Variable aperture panoramic imaging.
Multim. Tools Appl., 2018

Region similarity arrangement for large-scale image retrieval.
Neurocomputing, 2018

Eigenobject-wise saliency detection based on manifold ranking.
Neurocomputing, 2018

CGNet: A Light-weight Context Guided Network for Semantic Segmentation.
CoRR, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
CoRR, 2018

Potential of Attention Mechanism for Classification of Optical Coherence Tomography Images.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Image Denoising with Local Dense and Adaptive Global Residual Networks.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Temporal-Contextual Attention Network for Video-Based Person Re-identification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Style Separation and Synthesis via Generative Adversarial Networks.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Post Tuned Hashing: A New Approach to Indexing High-dimensional Data.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Context-Aware Visual Policy Network for Sequence-Level Image Captioning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Automated Pulmonary Nodule Detection: High Sensitivity with Few Candidates.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

Distortion-aware CNNs for Spherical Images.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

High Resolution Feature Recovering for Accelerating Urban Scene Parsing.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Multi-Level Policy and Reward Reinforcement Learning for Image Captioning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Panoramic Light Field Video Acquisition.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Semantic Preserving Hash Coding Through VAE-GAN.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Compressive hyperspectral imaging mask optimization.
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018

Not All Words Are Equal: Video-specific Information Loss for Video Captioning.
Proceedings of the British Machine Vision Conference 2018, 2018

Image Captioning Based on Adaptive Balancing Loss.
Proceedings of the Fourth IEEE International Conference on Multimedia Big Data, 2018

2017
Trip Outfits Advisor: Location-Oriented Clothing Recommendation.
IEEE Trans. Multim., 2017

Object Localization Based on Proposal Fusion.
IEEE Trans. Multim., 2017

Novel Visual and Statistical Image Features for Microblogs News Verification.
IEEE Trans. Multim., 2017

Sparse Online Learning of Image Similarity.
ACM Trans. Intell. Syst. Technol., 2017

Multi-modal tag localization for mobile video search.
Multim. Syst., 2017

High-level background prior based salient object detection.
J. Vis. Commun. Image Represent., 2017

DSP: Discriminative Spatial Part modeling for Fine-Grained Visual Categorization.
Image Vis. Comput., 2017

HDIdx: High-dimensional indexing for efficient approximate nearest neighbor search.
Neurocomputing, 2017

Kernelized product quantization.
Neurocomputing, 2017

Deep Representation Learning with Part Loss for Person Re-Identification.
CoRR, 2017

DR²-Net: Deep Residual Reconstruction Network for Image Compressive Sensing.
CoRR, 2017

AE-GAN: adversarial eliminating with GAN.
CoRR, 2017

Rumor Detection on Twitter Pertaining to the 2016 U.S. Presidential Election.
CoRR, 2017

Detection and Analysis of 2016 US Presidential Election Related Rumors on Twitter.
Proceedings of the Social, Cultural, and Behavioral Modeling, 2017

Multi-scale Convolutional Neural Networks for Non-blind Image Deconvolution.
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017

One-Shot Fine-Grained Instance Retrieval.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Learning Multimodal Attention LSTM Networks for Video Captioning.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Multimodal Fusion with Recurrent Neural Networks for Rumor Detection on Microblogs.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Sequential Prediction of Social Media Popularity with Deep Temporal Context Networks.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Large-scale person re-identification as retrieval.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Deep saliency map estimation of hand-crafted features.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Scale-Adaptive Convolutions for Scene Parsing.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Image Caption with Global-Local Attention.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Deep Fusion of Multiple Semantic Cues for Complex Event Recognition.
IEEE Trans. Image Process., 2016

Coarse-to-Fine Description for Fine-Grained Visual Categorization.
IEEE Trans. Image Process., 2016

Web Image Search Re-Ranking With Click-Based Similarity and Typicality.
IEEE Trans. Image Process., 2016

On application-unbiased benchmarking of web videos from a social network perspective.
Multim. Tools Appl., 2016

Adaptive weighted imbalance learning with application to abnormal activity recognition.
Neurocomputing, 2016

Web video topics discovery and structuralization with social network.
Neurocomputing, 2016

Boosted Near-miss Under-sampling on SVM ensembles for concept detection in large-scale imbalanced datasets.
Neurocomputing, 2016

Image Credibility Analysis with Effective Domain Transferred Deep Networks.
CoRR, 2016

Drug-target interaction prediction: databases, web servers and computational models.
Briefings Bioinform., 2016

Efficient Perceptual Region Detector Based on Object Boundary.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Time Matters: Multi-scale Temporalization of Social Media Popularity.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Region similarity arrangement for image retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Unfolding Temporal Dynamics: Predicting Social Media Popularity Using Multi-scale Temporal Decomposition.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

News Verification by Exploiting Conflicting Social Viewpoints in Microblogs.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Enhancing Video Event Recognition Using Automatically Constructed Semantic-Visual Knowledge Base.
IEEE Trans. Multim., 2015

Full-Space Local Topology Extraction for Cross-Modal Retrieval.
IEEE Trans. Image Process., 2015

Click-boosting multi-modality graph-based reranking for image search.
Multim. Syst., 2015

An efficient concept detection system via sparse ensemble learning.
Neurocomputing, 2015

Automatic foreground segmentation using light field images.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

Orientational Spatial Part Modeling for Fine-Grained Visual Categorization.
Proceedings of the 2015 IEEE International Conference on Mobile Services, MS 2015, New York City, NY, USA, June 27, 2015

Scalable logo recognition based on compact sparse dictionary for mobile devices.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

MCG-ICT at MediaEval 2015: Verifying Multimedia Use with a Two-Level Classification Model.
Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Online Learning to Rank for Content-Based Image Retrieval.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Large visual words for large scale image classification.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Lenselet image compression scheme based on subaperture images streaming.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Multi-task deep visual-semantic embedding for video thumbnail selection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

SOLAR: Scalable Online Learning Algorithms for Ranking.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
A Prior-Free Weighting Scheme for Binary Code Ranking.
IEEE Trans. Multim., 2014

Contextual Query Expansion for Image Retrieval.
IEEE Trans. Multim., 2014

Instant Mobile Video Search With Layered Audio-Video Indexing and Progressive Transmission.
IEEE Trans. Multim., 2014

A Simple and Efficient Re-Scrambling Scheme for DTV Programs.
IEEE Trans. Multim., 2014

Community Discovery from Social Media by Low-Rank Matrix Recovery.
ACM Trans. Intell. Syst. Technol., 2014

A Unified Geolocation Framework for Web Videos.
ACM Trans. Intell. Syst. Technol., 2014

Scalable Similarity Search With Topology Preserving Hashing.
IEEE Trans. Image Process., 2014

Image Search Reranking With Query-Dependent Click-Based Relevance Feedback.
IEEE Trans. Image Process., 2014

Efficient Parallel Framework for HEVC Motion Estimation on Many-Core Processors.
IEEE Trans. Circuits Syst. Video Technol., 2014

A Highly Parallel Framework for HEVC Coding Unit Partitioning Tree Decision on Many-core Processors.
IEEE Signal Process. Lett., 2014

Pedestrian detection based on sparse coding and transfer learning.
Mach. Vis. Appl., 2014

Efficient binary code indexing with pivot based locality sensitive clustering.
Multim. Tools Appl., 2014

Salient region detection for complex background images using integrated features.
Inf. Sci., 2014

Semi-supervised learning via sparse model.
Neurocomputing, 2014

Representative selection based on sparse modeling.
Neurocomputing, 2014

Encoder combined video moving object detection.
Neurocomputing, 2014

FSpH: Fitted spectral hashing for efficient similarity search.
Comput. Vis. Image Underst., 2014

Rescue Tail Queries: Learning to Image Search Re-rank via Click-wise Multimodal Fusion.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Deep Learning for Content-Based Image Retrieval: A Comprehensive Study.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Monte Carlo Sampling based Salient Region Detection.
Proceedings of the International Conference on Multimedia Retrieval, 2014

A Representative Local Region Detector Based On Color-Contrast-MSER.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Real-Time Scene Text Detection Based on Stroke Model.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Salient region detection: Integrate both global and local cues.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Image compressed sensing reconstruction with 3D transform domain collaborative filtering.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Representative local features mining for large-scale near-duplicates retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

News Credibility Evaluation on Microblog with a Hierarchical Propagation Model.
Proceedings of the 2014 IEEE International Conference on Data Mining, 2014

SOML: Sparse Online Metric Learning with Application to Image Retrieval.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Accurate Estimation of Human Body Orientation From RGB-D Sensors.
IEEE Trans. Cybern., 2013

An improved method of locality sensitive hashing for indexing large-scale and high-dimensional features.
Signal Process., 2013

Accurate off-line query expansion for large-scale mobile visual search.
Signal Process., 2013

Robust common visual pattern discovery using graph matching.
J. Vis. Commun. Image Represent., 2013

Robust human body segmentation based on part appearance and spatial constraint.
Neurocomputing, 2013

Highly parallel mode decision method for HEVC.
Proceedings of the 30th Picture Coding Symposium, 2013

Motion Estimation for Video Coding Based on Subspace Pursuit.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

COGE: A Novel Binary Feature Descriptor Exploring Anisotropy and Non-uniformity.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Distribution-Aware Locality Sensitive Hashing.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Efficient HEVC to H.264/AVC Transcoding with Fast Intra Mode Decision.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

VTrans: A Distributed Video Transcoding Platform.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Topology preserving hashing for similarity search.
Proceedings of the ACM Multimedia Conference, 2013

What are the distance metrics for local features?
Proceedings of the ACM Multimedia Conference, 2013

LAVES: an instant mobile video search system based on layered audio-video indexing.
Proceedings of the ACM Multimedia Conference, 2013

Listen, look, and gotcha: instant video search with mobile phones by layered audio-video indexing.
Proceedings of the ACM Multimedia Conference, 2013

Data driven multi-index hashing.
Proceedings of the IEEE International Conference on Image Processing, 2013

Spatial HOG based TV logo detection.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

Fast mode decision algorithm for intra prediction in HEVC.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

Click-boosting random walk for image search reranking.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

A video copy detection algorithm combining local feature's robustness and global feature's speed.
Proceedings of the IEEE International Conference on Acoustics, 2013

Efficient Parallel Framework for HEVC Deblocking Filter on Many-Core Platform.
Proceedings of the 2013 Data Compression Conference, 2013

Highly Parallel Framework for HEVC Motion Estimation on Many-Core Platform.
Proceedings of the 2013 Data Compression Conference, 2013

Binary Code Ranking with Weighted Hamming Distance.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Efficient Parallel Framework for H.264/AVC Deblocking Filter on Many-Core Platform.
IEEE Trans. Multim., 2012

Web Video Geolocation by Geotagged Social Resources.
IEEE Trans. Multim., 2012

Improved total variation minimization method for compressive sensing by intra-prediction.
Signal Process., 2012

Exploring probabilistic localized video representation for human action recognition.
Multim. Tools Appl., 2012

Exploring multi-modality structure for cross domain adaptation in video concept annotation.
Neurocomputing, 2012

An Incremental Clustering based codebook construction in video copy detection.
Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, 2012

Query Range Sensitive Probability Guided Multi-probe Locality Sensitive Hashing.
Proceedings of the 13th ACIS International Conference on Software Engineering, 2012

Efficient Partial Decoding Scheme for Intra Frame in H.264/AVC Stream.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

RGB-D Based Multi-attribute People Search in Intelligent Visual Surveillance.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Finding Suits in Images of People.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

A method for detecting salient regions using integrated features.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Data Independent Method of Constructing Distributed LSH for Large-Scale Dynamic High-Dimensional Indexing.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Visual stem mapping and Geometric Tense coding for Augmented Visual Vocabulary.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Graph-based multi-space semantic correlation propagation for video retrieval.
Vis. Comput., 2011

Efficient Feature Detection and Effective Post-Verification for Large Scale Near-Duplicate Image Search.
IEEE Trans. Multim., 2011

Robust Spatial Matching for Object Retrieval and Its Parallel Implementation on GPU.
IEEE Trans. Multim., 2011

Localized Multiple Kernel Learning for Realistic Human Action Recognition in Videos.
IEEE Trans. Circuits Syst. Video Technol., 2011

Tracking Web Video Topics: Discovery, Visualization, and Monitoring.
IEEE Trans. Circuits Syst. Video Technol., 2011

Towards hierarchical context: unfolding visual community potential for interactive video retrieval.
Multim. Tools Appl., 2011

Web video retagging.
Multim. Tools Appl., 2011

Restricted H.264/AVC video coding for privacy protected video scrambling.
J. Vis. Commun. Image Represent., 2011

Mining concise and distinctive affine-stable features for object detection in large corpus.
Int. J. Comput. Math., 2011

A pivot-based filtering algorithm for enhancing query performance of LSH.
Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Compressive sensing based video scrambling for privacy protection.
Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Fusing Audio-Words with Visual Features for Pornographic Video Detection.
Proceedings of the IEEE 10th International Conference on Trust, 2011

Perceptual Motivated Coding Strategy for Quality Consistency.
Proceedings of the Advances in Multimedia Modeling, 2011

Parallel Deblocking Filter for H.264/AVC on the TILERA Many-Core Systems.
Proceedings of the Advances in Multimedia Modeling, 2011

Efficient approximate nearest neighbor search with integrated binary codes.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Common visual pattern discovery via graph matching.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Leveraging collective wisdom for web video retrieval through heterogeneous community discovery.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Personalized portraits ranking.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Pairwise weak geometric consistency for large scale image search.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

Parallel deblocking filter for H.264/AVC implemented on Tile64 platform.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Hollow TV logo detection.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Local geometric consistency constraint for image retrieval.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Human skin detection in images by MSER analysis.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

A pseudo relevance feedback based cross domain video concept detection.
Proceedings of the ICIMCS 2011, 2011

Efficient Video Coding Optimization Using a Novel Perceptual Distortion Model.
Proceedings of the 2011 Data Compression Conference (DCC 2011), 2011

2010
Multiview Spectral Embedding.
IEEE Trans. Syst. Man Cybern. Part B, 2010

Automatic Detection and Analysis of Player Action in Moving Background Sports Video Sequences.
IEEE Trans. Circuits Syst. Video Technol., 2010

Visual quality assessment for web videos.
J. Vis. Commun. Image Represent., 2010

Context-oriented web video tag recommendation.
Proceedings of the 19th International Conference on World Wide Web, 2010

Known-Item Search by MCG-ICT-CAS.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Multi-modal query expansion for web video search.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Compressive video sensing based on user attention model.
Proceedings of the Picture Coding Symposium, 2010

Sensing Geographical Impact Factor of Multimedia News Events for Localized Retrieval and News Filtering.
Proceedings of the Advances in Multimedia Modeling, 2010

Data-oriented locality sensitive hashing.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Visual security evaluation for video encryption.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Tag transformer.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Web video categorization based on Wikipedia categories and content-duplicated open resources.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Trajectory-based visualization of web video topics.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Explicit and implicit concept-based video retrieval with bipartite graph propagation model.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Adult Image Detection Combining BoVW Based on Region of Interest and Color Moments.
Proceedings of the Intelligent Information Processing V, 2010

Parallel spatial matching for object retrieval implemented on GPU.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

A distribution based video representation for human action recognition.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Restricted H.264/AVC video coding for privacy region scrambling.
Proceedings of the International Conference on Image Processing, 2010

GPU-based fast scale invariant interest point detector.
Proceedings of the IEEE International Conference on Acoustics, 2010

Fast and robust spatial matching for object retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2010

Effective and Efficient Image Copy Detection Based on GPU.
Proceedings of the Trends and Topics in Computer Vision, 2010

Hierarchical feedback algorithm based on visual community discovery for interactive video retrieval.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

Affine Stable Characteristic based sample expansion for object detection.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
On defining affinity graph for spectral clustering through ranking on manifolds.
Neurocomputing, 2009

A density-based method for adaptive LDA model selection.
Neurocomputing, 2009

TRECVID 2009 of MCG-ICT-CAS.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

Multimedia Evidence Fusion for Video Concept Detection via OWA Operator.
Proceedings of the Advances in Multimedia Modeling, 2009

Localizing volumetric motion for action recognition in realistic videos.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Pornprobe: an LDA-SVM based pornography detection system.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Google challenge: incremental-learning for web video categorization on robust semantic feature space.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Personalized movie recommendation.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Distribution-based concept selection for concept-based video retrieval.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Locally non-negative linear structure learning for interactive image retrieval.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Pseudo relevance feedback with incremental learning for high level feature detection.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

KNSC: A novel local classification method for multimedia semantic analysis.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Localizing and recognizing action unit using position information of local feature.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Confusion network based Video OCR post-processing approach.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Logo detection based on spatial-spectral saliency and partial spatial context.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Motion region-based trajectory analysis and re-ranking for video retrieval.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

VideoMap: an interactive video retrieval system of MCG-ICT-CAS.
Proceedings of the 8th ACM International Conference on Image and Video Retrieval, 2009

2008
Personalized multimedia web summarizer for tourist.
Proceedings of the 17th International Conference on World Wide Web, 2008

A Hierarchical Scheme for Rapid Video Copy Detection.
Proceedings of the 9th IEEE Workshop on Applications of Computer Vision (WACV 2008), 2008

An Innovative Model of Tempo and Its Application in Action Scene Detection for Movie Analysis.
Proceedings of the 9th IEEE Workshop on Applications of Computer Vision (WACV 2008), 2008

Local Separability Assessment: A Novel Feature Selection Method for Multimedia Applications.
Proceedings of the Advances in Multimedia Information Processing, 2008

Synopsis Alignment: Importing External Text Information for Multi-model Movie Analysis.
Proceedings of the Advances in Multimedia Information Processing, 2008

A More Topologically Stable Locally Linear Embedding Algorithm Based on R*-Tree.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2008

A statistical framework for replay detection in soccer video.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

Object-based Image Retrieval with Attention Analysis and Spatial Re-ranking.
Proceedings of the Intelligent Information Processing IV, 2008

Local Subspace-Based Denoising for Shot Boundary Detection.
Proceedings of the New Frontiers in Applied Artificial Intelligence, 2008

Document Clustering Based on Spectral Clustering and Non-negative Matrix Factorization.
Proceedings of the New Frontiers in Applied Artificial Intelligence, 2008

Invariant visual patterns for video copy detection.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

An Innovative Tempo Model for Movie Content Analysis.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2008

Web video recommendation and long tail discovering.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Human attention model for semantic scene analysis in movies.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Object retrieval based on spatially frequent items with informative patches.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

A hierarchical framework for movie content analysis: Let computers watch films like humans.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2008

Adaptive multiple feedback strategies for interactive video search.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

A Novel Image Text Extraction Method Based on K-Means Clustering.
Proceedings of the 7th IEEE/ACIS International Conference on Computer and Information Science, 2008

Attention Model Based SIFT Keypoints Filtration for Image Retrieval.
Proceedings of the 7th IEEE/ACIS International Conference on Computer and Information Science, 2008

2007
Selection of the most efficient tile size in tile-based cylinder panoramic video coding and transmission.
Vis. Comput., 2007

Improvement on Rate-Distortion Performance of H.264 Rate Control in Low Bit Rate.
IEEE Trans. Circuits Syst. Video Technol., 2007

Format-Independent Motion Content Description based on Spatiotemporal Visual Sensitivity.
IEEE Trans. Consumer Electron., 2007

Complexity controllable DCT for real-time H.264 encoder.
J. Vis. Commun. Image Represent., 2007

Secure and Incidental Distortion Tolerant Digital Signature for Image Authentication.
J. Comput. Sci. Technol., 2007

TRECVID 2007 High-Level Feature Extraction By MCG-ICT-CAS.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

TRECVID 2007 Search Tasks by NUS-ICT.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

LDA-Based Retrieval Framework for Semantic News Video Retrieval.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

Multi-modal Interview Concept Detection for Rushes Exploitation.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications) - RIAO 2007, 8th International Conference, Carnegie Mellon University, Pittsburgh, PA, USA, May 30, 2007

A Fast Global Motion Estimation Method for Panoramic Video Coding.
Proceedings of the Advances in Multimedia Information Processing, 2007

A Lexicon-Guided LSI Method for Semantic News Video Retrieval.
Proceedings of the Advances in Multimedia Information Processing, 2007

Visual Features Extraction Through Spatiotemporal Slice Analysis.
Proceedings of the Advances in Multimedia Modeling, 2007

Automatic Detection and Recognition of Athlete Actions in Diving Video.
Proceedings of the Advances in Multimedia Modeling, 2007

Segregated feedback with performance-based adaptive sampling for interactive news video retrieval.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Panoramic Video Coding Using Affine Motion Compensated Prediction.
Proceedings of the Multimedia Content Analysis and Mining, International Workshop, 2007

Highlights extraction in soccer videos based on goal-mouth detection.
Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Adaptive Selection of Motion Models for Panoramic Video Coding.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

The Most Efficient Tile Size in Tile-Based Cylinder Panoramic Video Coding and its Selection Under Restriction of Bandwidth.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

A Novel Anchorperson Detection Algorithm Based on Spatio-temporal Slice.
Proceedings of the 14th International Conference on Image Analysis and Processing (ICIAP 2007), 2007

Automatic Video-based Analysis of Athlete Action.
Proceedings of the 14th International Conference on Image Analysis and Processing (ICIAP 2007), 2007

Rate Control Algorithm for MPEG-2 to H.264/AVC Transcoding.
Proceedings of the Pattern Recognition and Image Analysis, Third Iberian Conference, 2007

Retrieval Method for Video Content in Different Format Based on Spatiotemporal Features.
Proceedings of the Advances in Information Retrieval, 2007

Active learning approach to interactive spatio-temporal news video retrieval.
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

Statistical Framework for Shot Segmentation and Classification in Sports Video.
Proceedings of the Computer Vision, 2007

2006
Optimum bit allocation and rate control for H.264/AVC.
IEEE Trans. Circuits Syst. Video Technol., 2006

Motion adaptive deinterlacing with accurate motion detection and anti-aliasing interpolation filter.
IEEE Trans. Consumer Electron., 2006

Motion adaptive deinterlacing of video data with texture detection.
IEEE Trans. Consumer Electron., 2006

TRECVID 2006 Rushes Exploitation by CAS MCG.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

A Novel Method for Spoken Text Feature Extraction in Semantic Video Retrieval.
Proceedings of the Advances in Multimedia Information Processing, 2006

Improvements on Rate-Distortion Performance of H.264 Rate Control in Low Bit Rate Video Coding.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

An New Coefficients Transform Matrix for the Transform Domain MPEG-2 TO H.264/AVC Transcoding.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Combining Template Matching and Model Fitting for Human Body Segmentation and Tracking with Applications to Sports Training.
Proceedings of the Image Analysis and Recognition, Third International Conference, 2006

A Fast Scheme for Converting DCT Coefficients to H.264/AVC Integer Transform Coefficients.
Proceedings of the Image Analysis and Recognition, Third International Conference, 2006

Fast Picture and Macroblock Level Adaptive Frame/Field Coding for H.264.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems 2006, 2006

2005
High throughput and low memory access sub-pixel interpolation architecture for H.264/AVC HDTV decoder.
IEEE Trans. Consumer Electron., 2005

SSF fingerprint for image authentication: an incidental distortion resistant scheme.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Fast inter frame encoding based on modes pre-decision in H.264.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Efficient quantization step selection scheme for I-frame in rate-constrained video coding.
Proceedings of the 2005 International Conference on Image Processing, 2005

Compact and Robust Image Hashing.
Proceedings of the Computational Science and Its Applications, 2005

Compact and Robust Fingerprints Using DCT Coefficients of Key Blocks.
Proceedings of the Pattern Recognition and Image Analysis, Second Iberian Conference, 2005

Replay Scene Based Sports Video Abstraction.
Proceedings of the Fuzzy Systems and Knowledge Discovery, Second International Conference, 2005

2004
Semantic and Structural Analysis of TV Diving Programs.
J. Comput. Sci. Technol., 2004

Efficient block size selection for MPEG-2 to H.264 transcoding.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Fast 4*4 intra-prediction mode selection for H.264.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

An automatic segmentation algorithm for moving objects in video sequences under multi-constraints.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Ontology Based Sports Video Annotation and Summary.
Proceedings of the Content Computing, Advanced Workshop on Content Computing, 2004

