Wengang Zhou

ORCID: 0000-0003-1690-9836

Affiliations:
  • University of Science and Technology of China, Department of Electronic Engineering and Information Science, MCC Lab, Hefei, China
  • Hefei Comprehensive National Science Center, Institute of Artificial Intelligence, Hefei, China
  • Chinese Academy of Sciences, Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Hefei, China
  • University of Texas at San Antonio, Department of Computer Science, San Antonio, TX, USA (2011 - 2013)


According to our database, Wengang Zhou authored at least 398 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
PCVR: a pre-trained contextualized visual representation for DNA sequence classification.
BMC Bioinform., December, 2025

Multi-scale count-task guided feature enhancement face detection.
Multim. Syst., October, 2025

Revisit Weakly Supervised Hashing With Deep Multi-Modal Foundation Models.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

HandNeRF++: Modeling Animatable Interacting Hands With Neural Radiance Fields.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

DocScanner: Robust Document Image Rectification with Progressive Learning.
Int. J. Comput. Vis., August, 2025

Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning.
CoRR, August, 2025

Goal Discovery with Causal Capacity for Efficient Reinforcement Learning.
CoRR, August, 2025

DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding.
CoRR, August, 2025

SinKD: Sinkhorn Distance Minimization for Knowledge Distillation.
IEEE Trans. Neural Networks Learn. Syst., July, 2025

I²MD: 3D Action Representation Learning with Inter- and Intra-Modal Mutual Distillation.
Int. J. Comput. Vis., July, 2025

Robust Multimodal Large Language Models Against Modality Conflict.
CoRR, July, 2025

RESIST: Rationale-Enhanced and Reward Model-Based End-to-End Social Influence Dialogue System.
ACM Trans. Multim. Comput. Commun. Appl., June, 2025

SinDiffusion: Learning a Diffusion Model From a Single Natural Image.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025

GaussNav: Gaussian Splatting for Visual Navigation.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025

Self-Classification Enhancement and Correction for Weakly Supervised Object Detection.
CoRR, May, 2025

Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks.
CoRR, May, 2025

Bias Fitting to Mitigate Length Bias of Reward Model in RLHF.
CoRR, May, 2025

Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering.
CoRR, May, 2025

LayoutEnc: Leveraging Enhanced Layout Representations for Transformer-based Complex Scene Synthesis.
ACM Trans. Multim. Comput. Commun. Appl., April, 2025

Motion-Aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction.
IEEE Trans. Circuits Syst. Video Technol., April, 2025

Long-Term Feature Extraction via Frequency Prediction for Efficient Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

AAGS: Appearance-Aware 3D Gaussian Splatting with Unconstrained Photo Collections.
Multim. Syst., April, 2025

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models.
CoRR, April, 2025

DISA: Disentangled Dual-Branch Framework for Affordance-Aware Human Insertion.
ACM Trans. Multim. Comput. Commun. Appl., March, 2025

Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression.
CoRR, March, 2025

Cross-Modal Consistency Learning for Sign Language Recognition.
CoRR, March, 2025

Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models.
CoRR, March, 2025

DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models.
CoRR, March, 2025

Model Evolution Framework with Genetic Algorithm for Multi-Task Reinforcement Learning.
CoRR, February, 2025

Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling.
CoRR, February, 2025

DeepEraser: Deep Iterative Context Mining for Generic Text Eraser.
IEEE Trans. Multim., 2025

Adaptive Bit Selection for Scalable Deep Hashing.
IEEE Trans. Image Process., 2025

Recovering Permuted Sequential Features for Effective Reinforcement Learning.
Neural Networks, 2025

TIMAR: Transition-informed representation for sample-efficient multi-agent reinforcement learning.
Neural Networks, 2025

Adaptive Confidence-aware Preference-based Reinforcement Learning with Noisy Feedback.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, 2025

Single-Source Dual-Stream Representation Learning for DNA Sequence Classification.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

EG4D: Explicit Generation of 4D Object without Score Distillation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Uni-Sign: Toward Unified Sign Language Understanding at Scale.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

AMF: Adaptive Modality Fusion for Zero-Shot Composed Image Retrieval.
Proceedings of the 6th Workshop on Intelligent Cross-Data Analysis and Retrieval, 2025

CSD: Cross-Modal Similarity Distillation for Zero-Shot Composed Image Retrieval.
Proceedings of the 6th Workshop on Intelligent Cross-Data Analysis and Retrieval, 2025

Leveraging Visual Captions for Enhanced Zero-Shot HOI Detection.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Cross-Modal Consistency Learning for Sign Language Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SLRTP2025 Sign Language Production Challenge: Methodology, Results and Future Work.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

SmartEraser: Remove Anything from Images using Masked-Region Guidance.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Detector-free Image Matching with Lightweight Backbone and Feature Filtering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

I2VGuard: Safeguarding Images against Misuse in Diffusion-based Image-to-Video Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Incremental Transformer: Efficient Encoder for Incremented Text Over MRC and Conversation Tasks.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Controllable Style Arithmetic with Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
DanZero+: Dominating the GuanDan Game Through Reinforcement Learning.
IEEE Trans. Games, December, 2024

SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection.
ACM Trans. Multim. Comput. Commun. Appl., November, 2024

MASA: Motion-Aware Masked Autoencoder With Semantic Alignment for Sign Language Recognition.
IEEE Trans. Circuits Syst. Video Technol., November, 2024

Reconstruction-Free Image Compression for Machine Vision via Knowledge Transfer.
ACM Trans. Multim. Comput. Commun. Appl., October, 2024

Toward On-Demand Transmission: Joint Feature and Image Coding With Reversible Neural Networks.
IEEE Trans. Circuits Syst. Video Technol., October, 2024

Recurrent Generic Contour-Based Instance Segmentation With Progressive Learning.
IEEE Trans. Circuits Syst. Video Technol., September, 2024

Full DouZero+: Improving DouDizhu AI by Opponent Modeling, Coach-Guided Training and Bidding Learning.
IEEE Trans. Games, September, 2024

MCMARL: Parameterizing Value Function via Mixture of Categorical Distributions for Multi-Agent Reinforcement Learning.
IEEE Trans. Games, September, 2024

Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization.
Int. J. Comput. Vis., September, 2024

CLIP2GAN: Toward Bridging Text With the Latent Space of GANs.
IEEE Trans. Circuits Syst. Video Technol., August, 2024

Coordinate-aligned multi-camera collaboration for active multi-object tracking.
Multim. Syst., August, 2024

Optimizing Camera Motion with MCTS and Target Motion Modeling in Multi-Target Active Object Tracking.
ACM Trans. Multim. Comput. Commun. Appl., July, 2024

Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video.
ACM Trans. Multim. Comput. Commun. Appl., June, 2024

Rethinking Supervision in Document Unwarping: A Self-Consistent Flow-Free Approach.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Detect Any Shadow: Segment Anything for Video Shadow Detection.
IEEE Trans. Circuits Syst. Video Technol., May, 2024

DaFIR: Distortion-Aware Representation Learning for Fisheye Image Rectification.
IEEE Trans. Circuits Syst. Video Technol., May, 2024

CTDS: Centralized Teacher With Decentralized Student for Multiagent Reinforcement Learning.
IEEE Trans. Games, March, 2024

Towards Codebook-Free Deep Probabilistic Quantization for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Progressive Recurrent Network for shadow removal.
Comput. Vis. Image Underst., January, 2024

Guest Editorial Introduction to the Issue on Pre-Trained Models for Multi-Modality Understanding.
IEEE Trans. Multim., 2024

Structure Similarity Preservation Learning for Asymmetric Image Retrieval.
IEEE Trans. Multim., 2024

Video Demoiréing With Deep Temporal Color Embedding and Video-Image Invertible Consistency.
IEEE Trans. Multim., 2024

Prior-Aware Cross Modality Augmentation Learning for Continuous Sign Language Recognition.
IEEE Trans. Multim., 2024

Deep Unrestricted Document Image Rectification.
IEEE Trans. Multim., 2024

Progressive Similarity Preservation Learning for Deep Scalable Product Quantization.
IEEE Trans. Multim., 2024

Learning 3D Shape Latent for Point Cloud Completion.
IEEE Trans. Multim., 2024

Multi-Granularity Matching Transformer for Text-Based Person Search.
IEEE Trans. Multim., 2024

Self-Supervised Representation Learning With Spatial-Temporal Consistency for Sign Language Recognition.
IEEE Trans. Image Process., 2024

RL-LLM-DT: An Automatic Decision Tree Generation Method Based on RL Evaluation and LLM Enhancement.
CoRR, 2024

ROOT: VLM based System for Indoor Scene Understanding and Beyond.
CoRR, 2024

MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning.
CoRR, 2024

StreetSurfGS: Scalable Urban Street Surface Reconstruction with Planar-based Gaussian Splatting.
CoRR, 2024

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding.
CoRR, 2024

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation.
CoRR, 2024

Scaling up Multimodal Pre-training for Sign Language Understanding.
CoRR, 2024

RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation.
CoRR, 2024

Text-Animator: Controllable Visual Text Video Generation.
CoRR, 2024

Learning Generalizable Human Motion Generator with Reinforcement Learning.
CoRR, 2024

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding.
CoRR, 2024

Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding.
CoRR, 2024

DocPedia: unleashing the power of large multimodal model in the frequency domain for versatile document understanding.
Sci. China Inf. Sci., 2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Refining Video-Based Person Re-Identification: An Integrated Framework with Facial and Body Cues.
Proceedings of the 1st ICMR Workshop on Multimedia Object Re-Identification, 2024

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Progressive Multi-modal Conditional Prompt Tuning.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

Remember the Past for Better Future: Memory-Augmented Offline RL.
Proceedings of the International Joint Conference on Neural Networks, 2024

Temporal State Prediction and Sequence Recovery for Multi-agent Reinforcement Learning.
Proceedings of the Neural Information Processing - 31st International Conference, 2024

Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Cross-Lingual Transfer for Natural Language Inference via Multilingual Prompt Translator.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Heredity-aware Child Face Image Generation with Latent Space Disentanglement.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

Exploring GPT-4 Vision for Text-to-Image Synthesis Evaluation.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

FOREST2SEQ: Revitalizing Order Prior for Sequential Indoor Scene Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

Instance-Aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Sinkhorn Distance Minimization for Knowledge Distillation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Semi-Supervised Spoken Language Glossification.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Revisiting Open-Set Panoptic Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

SUF: Stabilized Unconstrained Fine-Tuning for Offline-to-Online Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Weakly Supervised Hashing with Reconstructive Cross-modal Attention.
ACM Trans. Multim. Comput. Commun. Appl., November, 2023

TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Improving Deep Reinforcement Learning With Mirror Loss.
IEEE Trans. Games, September, 2023

SignBERT+: Hand-Model-Aware Self-Supervised Pre-Training for Sign Language Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

Exploring the diversity and invariance in yourself for visual pre-training task.
Pattern Recognit., July, 2023

Unsupervised Person Re-Identification With Wireless Positioning Under Weak Scene Labeling.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Masked Contrastive Representation Learning for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

FI-WSOD: Foreground Information Guided Weakly Supervised Object Detection.
IEEE Trans. Multim., 2023

Hash Bit Selection With Reinforcement Learning for Image Retrieval.
IEEE Trans. Multim., 2023

Improving Person Re-Identification With Multi-Cue Similarity Embedding and Propagation.
IEEE Trans. Multim., 2023

Coherent Image Animation Using Spatial-Temporal Correspondence.
IEEE Trans. Multim., 2023

Collaborative Multilingual Continuous Sign Language Recognition: A Unified Framework.
IEEE Trans. Multim., 2023

Deep Graph Convolutional Quantization Networks for Image Retrieval.
IEEE Trans. Multim., 2023

Model-Aware Pre-Training for Radial Distortion Rectification.
IEEE Trans. Image Process., 2023

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs.
CoRR, 2023

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding.
CoRR, 2023

State Sequences Prediction via Fourier Transform for Representation Learning.
CoRR, 2023

I²MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation.
CoRR, 2023

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding.
CoRR, 2023

Exploring Effective Mask Sampling Modeling for Neural Image Compression.
CoRR, 2023

Learning Transferable Pedestrian Representation from Multimodal Information Supervision.
CoRR, 2023

Discriminative Experience Replay for Efficient Multi-agent Reinforcement Learning.
CoRR, 2023

Recurrent Contour-based Instance Segmentation with Progressive Learning.
CoRR, 2023

End-to-end Action Quality Assessment with Action Parsing Transformer.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Learning robust representation for reinforcement learning with distractions by reward sequence prediction.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Multi-Agent First Order Constrained Optimization in Policy Space.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

State Sequences Prediction via Fourier Transform for Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Hierarchical Multi-Agent Skill Discovery.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CLIP4HOI: Towards Adapting CLIP for Practical Zero-Shot HOI Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Text-Only Training for Visual Storytelling.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Dual-view Molecular Pre-training.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Q-SAT: Value Factorization with Self-Attention for Deep Multi-Agent Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2023

MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Robust Person Re-Identification with Wireless Signals.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

DocMAE: Document Image Rectification via Self-supervised Representation Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

𝒪-GNN: incorporating ring priors into molecular modeling.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Making Better Decision by Directly Planning in Continuous Control.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A General Rank Preserving Framework for Asymmetric Image Retrieval.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Sign Language Translation with Iterative Prototype.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DIRE for Diffusion-Generated Image Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Masked Motion Predictors are Strong 3D Action Representation Learners.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Asymmetric Feature Fusion for Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AltFreezing for More General Video Face Forgery Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

HandNeRF: Neural Radiance Fields for Animatable Interacting Hands.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AnchorFormer: Point Cloud Completion from Discriminative Nodes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DanZero: Mastering GuanDan Game with Reinforcement Learning.
Proceedings of the IEEE Conference on Games, 2023

Mastering Curling with RL-revised Decision Tree.
Proceedings of the IEEE Conference on Games, 2023

Sample Efficient Reinforcement Learning with Double Importance Sampling Weight Clipping.
Proceedings of the IEEE Conference on Games, 2023

Implementing First-Person Shooter Game AI in WILD-SCAV with Rule-Enhanced Deep Reinforcement Learning.
Proceedings of the IEEE Conference on Games, 2023

Multi-Agent Reinforcement Learning with Safety Layer for Active Voltage Control.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Hybrid and Collaborative Passage Reranking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

BEST: BERT Pre-training for Sign Language Recognition with Coupling Tokenization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Low-Light Video Enhancement with Synthetic Event Guidance.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Spatial-Temporal Multi-Cue Network for Sign Language Recognition and Translation.
IEEE Trans. Multim., 2022

Conditional Sentence Generation and Cross-Modal Reranking for Sign Language Translation.
IEEE Trans. Multim., 2022

Learning Temporal-Correlated and Channel-Decorrelated Siamese Networks for Visual Tracking.
IEEE Trans. Multim., 2022

Deep Enhanced Weakly-Supervised Hashing With Iterative Tag Refinement.
IEEE Trans. Multim., 2022

Weakly Supervised Temporal Adjacent Network for Language Grounding.
IEEE Trans. Multim., 2022

Multi-Modal Context Propagation for Person Re-Identification With Wireless Positioning.
IEEE Trans. Multim., 2022

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations.
IEEE Trans. Multim., 2022

Direct Molecular Conformation Generation.
Trans. Mach. Learn. Res., 2022

Anti-Distractor Active Object Tracking in 3D Environments.
IEEE Trans. Circuits Syst. Video Technol., 2022

Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents.
Frontiers Inf. Technol. Electron. Eng., 2022

CLIP2GAN: Towards Bridging Text with the Latent Space of GANs.
CoRR, 2022

Semantic Image Synthesis via Diffusion Models.
CoRR, 2022

Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods.
CoRR, 2022

Multi-Target Active Object Tracking with Monte Carlo Tree Search and Target Motion Modeling.
CoRR, 2022

Learning Enriched Illuminants for Cross and Single Sensor Color Constancy.
CoRR, 2022

CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning.
CoRR, 2022

DQMIX: A Distributional Perspective on Multi-Agent Reinforcement Learning.
CoRR, 2022

Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization.
CoRR, 2022

Direct Molecular Conformation Generation.
CoRR, 2022

PolyTracker: Progressive Contour Regression for Multiple Object Tracking and Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022

LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Hand-Object Interaction Image Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Unified 2D and 3D Pre-Training of Molecular Representations.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Hardware-Oriented Shallow Joint Demosaicing and Denoising.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

MVP: Multimodality-Guided Visual Pre-training.
Proceedings of the Computer Vision - ECCV 2022, 2022

CMD: Self-supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-modal Sign Language Spotting by Multi/One-Shot Learning.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

TAPE: Task-Agnostic Prior Embedding for Image Restoration.
Proceedings of the Computer Vision - ECCV 2022, 2022

CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

Geometric Representation Learning for Document Image Rectification.
Proceedings of the Computer Vision - ECCV 2022, 2022

Contextual Similarity Distillation for Asymmetric Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Uformer: A General U-Shaped Transformer for Image Restoration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Domain-Agnostic Prior for Transfer Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

Mastering the Game of 3v3 Snakes with Rule-Enhanced Multi-Agent Reinforcement Learning.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

Learning Token-Based Representation for Image Retrieval.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
MFECN: Multi-level Feature Enhanced Cumulative Network for Scene Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Affinity Derivation for Accurate Instance Segmentation.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Residual Refinement Network with Attribute Guidance for Precise Saliency Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Global-Local Enhancement Network for NMF-Aware Sign Language Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Progressive Learning of Low-Precision Networks for Image Classification.
IEEE Trans. Multim., 2021

Progressive Unsupervised Person Re-Identification by Tracklet Association With Spatio-Temporal Regularization.
IEEE Trans. Multim., 2021

Collaborative Image Relevance Learning for Visual Re-Ranking.
IEEE Trans. Multim., 2021

Single Shot Video Object Detector.
IEEE Trans. Multim., 2021

Deep Relation Embedding for Cross-Modal Retrieval.
IEEE Trans. Image Process., 2021

Learning Diverse Models for End-to-End Ensemble Tracking.
IEEE Trans. Image Process., 2021

An End-to-End Foreground-Aware Network for Person Re-Identification.
IEEE Trans. Image Process., 2021

MINet: Meta-Learning Instance Identifiers for Video Object Detection.
IEEE Trans. Image Process., 2021

MCFD: A Hardware-Efficient Noniterative Multicue Fusion Demosaicing Algorithm.
IEEE Trans. Circuits Syst. Video Technol., 2021

Semantic Boundary Detection With Reinforcement Learning for Continuous Sign Language Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2021

Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination.
IEEE Trans. Circuits Syst. Video Technol., 2021

From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection.
IEEE Trans. Circuits Syst. Video Technol., 2021

Unsupervised Deep Representation Learning for Real-Time Tracking.
Int. J. Comput. Vis., 2021

State Representation Learning With Adjacent State Consistency Loss for Deep Reinforcement Learning.
IEEE Multim., 2021

Heredity-aware Child Face Image Generation with Latent Space Disentanglement.
CoRR, 2021

Dual-view Molecule Pre-training.
CoRR, 2021

Contextual Similarity Aggregation with Self-attention for Visual Re-ranking.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Cross-modal Joint Prediction and Alignment for Composed Query Image Retrieval.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic Scalable Image Compression with Cross-Layer Priors.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Random Sampling Weights Allocation Update for Deep Reinforcement Learning.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

Attentive Update of Multi-Critic for Deep Reinforcement Learning.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

IOT: Instance-wise Layer Reordering for Transformer Structures.
Proceedings of the 9th International Conference on Learning Representations, 2021

Learning Deep Local Features with Multiple Dynamic Attentions for Large-Scale Image Retrieval.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Instance-wise Hard Negative Example Generation for Contrastive Learning in Unpaired Image-to-Image Translation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Joint Inductive and Transductive Learning for Video Object Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TransVG: End-to-End Visual Grounding with Transformers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Improving Sign Language Translation With Monolingual Data by Sign Back-Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Model-Aware Gesture-to-Gesture Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Instance Mining with Class Feature Banks for Weakly Supervised Object Detection.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Contrastive Transformation for Self-supervised Correspondence Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Hand-Model-Aware Sign Language Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
AB-LSTM: Attention-based Bidirectional LSTM Model for Scene Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Single-stage Instance Segmentation.
ACM Trans. Multim. Comput. Commun. Appl., 2020

MV2Flow: Learning Motion Representation for Fast Compressed Video Action Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Neighborhood Pyramid Preserving Hashing.
IEEE Trans. Multim., 2020

Real-Time Correlation Tracking Via Joint Model Compression and Transfer.
IEEE Trans. Image Process., 2020

Hierarchical Recurrent Deep Fusion Using Adaptive Clip Summarization for Sign Language Translation.
IEEE Trans. Image Process., 2020

Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving.
CoRR, 2020

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations.
CoRR, 2020

Can Semantic Labels Assist Self-Supervised Visual Representation Learning?
CoRR, 2020

Masked Contrastive Representation Learning for Reinforcement Learning.
CoRR, 2020

Global-local Enhancement Network for NMFs-aware Sign Language Recognition.
CoRR, 2020

Boosting Continuous Sign Language Recognition via Cross Modality Augmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

State Representation Learning For Effective Deep Reinforcement Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Contextual Adversarial Attacks For Object Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Incorporating BERT into Neural Machine Translation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Wavelet-Based Dual-Branch Network for Image Demoiréing.
Proceedings of the Computer Vision - ECCV 2020, 2020

The Eighth Visual Object Tracking VOT2020 Challenge Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Transformation GAN for Unsupervised Image Synthesis and Representation Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

POST: POlicy-Based Switch Tracking.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Attentive Experience Replay.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Relation-Guided Spatial Attention and Temporal Refinement for Video-Based Person Re-Identification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Deep Scalable Supervised Quantization by Self-Organizing Map.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Reliable Re-Detection for Long-Term Tracking.
IEEE Trans. Circuits Syst. Video Technol., 2019

Attention-Based 3D-CNNs for Large-Vocabulary Sign Language Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

Multiple complementary inverted indexing based on multiple metrics.
Multim. Tools Appl., 2019

Multi-tracker fusion via adaptive outlier detection.
Multim. Tools Appl., 2019

Scene text detection with fully convolutional neural networks.
Multim. Tools Appl., 2019

Exploiting weak mask representation with convolutional neural networks for accurate object tracking.
Multim. Tools Appl., 2019

Progressive Learning of Low-Precision Networks.
CoRR, 2019

WIDER Face and Pedestrian Challenge 2018: Methods and Results.
CoRR, 2019

Dynamic Pseudo Label Decoding for Continuous Sign Language Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Dynamic Cascaded Regression Network with Reinforcement Learning for Robust Face Alignment.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Learning Motion-Aware Policies for Robust Visual Tracking.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Knowledge Distillation with Category-Aware Attention and Discriminant Logit Losses.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Continuous Sign Language Recognition via Reinforcement Learning.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Relation Distillation Networks for Video Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Unsupervised Deep Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Iterative Alignment Network for Continuous Sign Language Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Deep Grammatical Multi-classifier for Continuous Sign Language Recognition.
Proceedings of the Fifth IEEE International Conference on Multimedia Big Data, 2019

Soft Contextual Data Augmentation for Neural Machine Translation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Spatial and Temporal Mutual Promotion for Video-Based Person Re-Identification.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Re$^2$EMA: Regularized and Reinitialized Exponential Moving Average for Target Model Update in Object Tracking.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2018

A General Framework for Linear Distance Preserving Hashing.
IEEE Trans. Image Process., 2018

Assessing Image Retrieval Quality at the First Glance.
IEEE Trans. Image Process., 2018

Retrieval Oriented Deep Feature Learning With Complementary Supervision Mining.
IEEE Trans. Image Process., 2018

Collaborative Index Embedding for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Low-Latency Human Action Recognition with Weighted Multi-Region Convolutional Neural Network.
CoRR, 2018

Visual Attribute-augmented Three-dimensional Convolutional Neural Network for Enhanced Human Action Recognition.
CoRR, 2018

Convolutional Neural Networks with Generalized Attentional Pooling for Action Recognition.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Effective Similarity Measurement for Video-based Person Re-identification.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Retrieval Across Optical and SAR Images with Deep Neural Network.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Residual Compression Network for Faster Correlation Tracking.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Scalable Bag of Selected Deep Features for Visual Instance Retrieval.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Connectionist Temporal Fusion for Sign Language Translation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Cascaded Feature Augmentation with Diffusion for Image Retrieval.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Improving Deep Neural Network Sparsity through Decorrelation Regularization.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Adaptive Layerwise Quantization for Deep Neural Network Compression.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Online Filter Weakening and Pruning for Efficient Convnets.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Weighted Multi-Region Convolutional Neural Network for Action Recognition With Low-Latency Online Prediction.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Enhanced Action Recognition With Visual Attribute-Augmented 3D Convolutional Neural Network.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Robust Object Tracking Via Part-Based Correlation Particle Filter.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Major-Subordinate-Task Learning for Image Orientation Estimation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Online Filter Clustering and Pruning for Efficient Convnets.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Facial Expression Recognition with Data Augmentation and Compact Feature Learning.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Cascaded Deep Convolutional Neural Network for Robust Face Alignment.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Affinity Derivation and Graph Merge for Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2018, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Multi-Cue Correlation Filters for Robust Visual Tracking.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Video-Based Sign Language Recognition Without Temporal Segmentation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Hierarchical LSTM for Sign Language Translation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Picking Neural Activations for Fine-Grained Recognition.
IEEE Trans. Multim., 2017

Local residual similarity for image re-ranking.
Inf. Sci., 2017

Recent Advance in Content-based Image Retrieval: A Literature Survey.
CoRR, 2017

No-Reference Image Quality Assessment Based on Internal Generative Mechanism.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Deep Supervised Quantization by Self-Organizing Map.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Quasi rate distortion optimization for binary hashing.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Orientation Estimation Network.
Proceedings of the Image and Graphics - 9th International Conference, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2016
Scalable Object Retrieval with Compact Image Representation from Generic Object Regions.
ACM Trans. Multim. Comput. Commun. Appl., 2016

Democratic Diffusion Aggregation for Image Retrieval.
IEEE Trans. Multim., 2016

Effective Active Skeleton Representation for Low Latency Human Action Recognition.
IEEE Trans. Multim., 2016

Robust Blur Kernel Estimation for License Plate Images From Fast Moving Vehicles.
IEEE Trans. Image Process., 2016

Fused One-vs-All Features With Semantic Alignments for Fine-Grained Visual Categorization.
IEEE Trans. Image Process., 2016

Making Residual Vector Distribution Uniform for Distinctive Image Representation.
IEEE Trans. Circuits Syst. Video Technol., 2016

Scalable Feature Matching by Dual Cascaded Scalar Quantization for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

A no-reference Image sharpness metric based on structural information using sparse representation.
Inf. Sci., 2016

Compressive tracking with adaptive color feature selection and foreground modeling.
Proceedings of the 2016 Visual Communications and Image Processing, 2016

Sparse Matrix Based Hashing for Approximate Nearest Neighbor Search.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Sign Language Recognition with Multi-modal Features.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Respiration Motion State Estimation on 4D CT Rib Cage Images.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Sign Language Recognition Based on Trajectory Modeling with HMMs.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Linear Distance Preserving Pseudo-Supervised and Unsupervised Hashing.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Chinese sign language recognition with adaptive HMM.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Sign language recognition with long short-term memory.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Sign language recognition based on adaptive HMMS with data augmentation.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Improve Visual Tracking by End-to-end Multi-Tracker Selection.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Adaptively Weighted Graph Fusion for Image Retrieval.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Picking Deep Filter Responses for Fine-Grained Image Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Uniting Keypoints: Local Visual Information Fusion for Large-Scale Image Search.
IEEE Trans. Multim., 2015

BSIFT: Toward Data-Independent Codebook for Large Scale Image Search.
IEEE Trans. Image Process., 2015

Heterogeneous Graph Propagation for Large-Scale Web Image Search.
IEEE Trans. Image Process., 2015

Visual word expansion and BSIFT verification for large-scale image search.
Multim. Syst., 2015

Attribute Mining for Scalable 3D Human Action Recognition.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Fast Democratic Aggregation and Query Fusion for Image Search.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Sign Language Recognition using 3D convolutional neural networks.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Rank-aware graph fusion with contextual dissimilarity measurement for image retrieval.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Scalable local feature matching without visual codebook training.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

SOM: Semantic obviousness metric for image quality assessment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A new system for Chinese sign language recognition.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Sign language recognition using real-sense.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

2014
Towards Codebook-Free: Scalable Cascaded Hashing for Mobile Image Search.
IEEE Trans. Multim., 2014

Cross-Indexing of Binary SIFT Codes for Large-Scale Image Search.
IEEE Trans. Image Process., 2014

Contextual Hashing for Large-Scale Image Search.
IEEE Trans. Image Process., 2014

Encoding Spatial Context for Large-Scale Partial-Duplicate Web Image Retrieval.
J. Comput. Sci. Technol., 2014

Fast and accurate near-duplicate image search with affinity propagation on the ImageWeb.
Comput. Vis. Image Underst., 2014

Fused one-vs-all mid-level features for fine-grained visual categorization.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

A Threshold-based HMM-DTW Approach for Continuous Sign Language Recognition.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Evaluation on the Impact of Image Quality on Image Retrieval.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Search by Detection: Object-Level Feature for Image Retrieval.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
SIFT match verification by geometric coding for large-scale partial-duplicate web image search.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Scale based region growing for scene text detection.
Proceedings of the ACM Multimedia Conference, 2013

2012
Principal Visual Word Discovery for Automatic License Plate Detection.
IEEE Trans. Image Process., 2012

Exploring tag relevance for image tag re-ranking.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

Scalar quantization for large scale image search.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Image tag re-ranking by coupled probability transition.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Embedding spatial context information into inverted file for large-scale image retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Query expansion enhancement by fast binary matching.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Attribute-assisted reranking for web image retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Binary SIFT: towards efficient feature matching verification for image search.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

2011
Latent visual context learning for web image applications.
Pattern Recognit., 2011

Building descriptive and discriminative visual codebook for large-scale image applications.
Multim. Tools Appl., 2011

Modeling spatial and semantic cues for large-scale near-duplicated image retrieval.
Comput. Vis. Image Underst., 2011

Large scale image search with geometric coding.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

2010
Active contours with selective local or global segmentation: A new formulation and level set method.
Image Vis. Comput., 2010

Large scale partially duplicated web image retrieval.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Spatial coding for large scale partial-duplicate web image search.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Canonical Image Selection by Visual Context Learning.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Large scale partial-duplicate image retrieval with bi-space quantization and geometric consistency.
Proceedings of the IEEE International Conference on Acoustics, 2010

Latent visual context analysis for image re-ranking.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
3D neuron dendritic spine detection and dendrite reconstruction.
Int. J. Comput. Aided Eng. Technol., 2009

Visual block link analysis for image re-ranking.
Proceedings of the First International Conference on Internet Multimedia Computing and Service, 2009

2008
3D Dendrite Reconstruction and Spine Identification.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2008

