Jingdong Wang

Affiliations:
  • Baidu, AI Group, Sunnyvale, CA, USA
  • Microsoft Research Asia, Beijing, China (former)
  • Hong Kong University of Science and Technology, Hong Kong (PhD 2007)
  • Tsinghua University, Department of Automation, Beijing, China (1997 - 2004)


According to our database, Jingdong Wang authored at least 341 papers between 2003 and 2024.

Bibliography

2024
Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective.
Int. J. Comput. Vis., February, 2024

Recent advances in artificial intelligence generated content.
Frontiers Inf. Technol. Electron. Eng., January, 2024

Context Autoencoder for Self-supervised Representation Learning.
Int. J. Comput. Vis., January, 2024

Collaborative Position Reasoning Network for Referring Image Segmentation.
CoRR, 2024

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition.
CoRR, 2024

MS-DETR: Efficient DETR Training with Mixed Supervision.
CoRR, 2024

2023
Guest editorial: special issue on human pose estimation and its applications.
Mach. Vis. Appl., November, 2023

Structured Knowledge Distillation for Dense Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields.
IEEE Robotics Autom. Lett., 2023

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection.
CoRR, 2023

Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation.
CoRR, 2023

A Survey of Reasoning with Foundation Models.
CoRR, 2023

GIR: 3D Gaussian Inverse Rendering for Relightable Scene Factorization.
CoRR, 2023

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future.
CoRR, 2023

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition.
CoRR, 2023

Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis.
CoRR, 2023

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
CoRR, 2023

GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding.
CoRR, 2023

Disentangled Representation Learning with Transmitted Information Bottleneck.
CoRR, 2023

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception.
CoRR, 2023

Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection.
CoRR, 2023

Accelerating Vision Transformers Based on Heterogeneous Attention Patterns.
CoRR, 2023

PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement.
CoRR, 2023

Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation.
CoRR, 2023

Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification.
CoRR, 2023

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation.
CoRR, 2023

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation.
CoRR, 2023

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation.
CoRR, 2023

Learning Implicit Entity-object Relations by Bidirectional Generative Alignment for Multimodal NER.
CoRR, 2023

Multimodal Adaptation of CLIP for Few-Shot Action Recognition.
CoRR, 2023

Enhancing Your Trained DETRs with Box Refinement.
CoRR, 2023

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation.
CoRR, 2023

Vision Transformer with Attention Map Hallucination and FFN Compaction.
CoRR, 2023

Rethinking the Open-Loop Evaluation of End-to-End Autonomous Driving in nuScenes.
CoRR, 2023

Multi-Modal 3D Object Detection by Box Matching.
CoRR, 2023

Exploring Effective Factors for Improving Visual In-Context Learning.
CoRR, 2023

Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models.
CoRR, 2023

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box.
CoRR, 2023

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation.
CoRR, 2023

Understanding Self-Supervised Pretraining with Part-Aware Representation Learning.
CoRR, 2023

Efficient Video Portrait Reenactment via Grid-based Codebook.
Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DAC-DETR: Divide the Attention Layers and Conquer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

GridFormer: Towards Accurate Table Structure Recognition via Grid Prediction.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Graph Contrastive Learning for Skeleton-based Action Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

What Can Simple Arithmetic Operations Do for Temporal Modeling?
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Forward Flow for Novel View Synthesis of Dynamic Scenes.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UATVR: Uncertainty-Adaptive Text-Video Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

σ-Adaptive Decoupled Prototype for Few-Shot Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Augmentation Matters: A Simple-Yet-Effective Approach to Semi-Supervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Instance-Specific and Model-Adaptive Supervision for Semi-Supervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PSVT: End-to-End Multi-Person 3D Pose and Shape Estimation with Progressive Video Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cyclically Disentangled Feature Translation for Face Anti-spoofing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Robust Video Portrait Reenactment via Personalized Representation Quantization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Distillation-Guided Residual Learning for Binary Convolutional Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., 2022

Guest Editorial: Introduction to the Special Section on Fine-Grained Visual Categorization.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Few-Shot Image and Sentence Matching via Aligned Cross-Modal Memory.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition.
CoRR, 2022

CAE v2: Context Autoencoder with CLIP Target.
CoRR, 2022

Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining.
CoRR, 2022

It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training.
CoRR, 2022

NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields.
CoRR, 2022

TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers.
CoRR, 2022

Group DETR: Fast Training Convergence with Decoupled One-to-Many Label Assignment.
CoRR, 2022

Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption.
CoRR, 2022

Conditional DETR V2: Efficient Detection Transformer with Box Queries.
CoRR, 2022

Towards Lightweight Super-Resolution with Dual Regression Learning.
CoRR, 2022

MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining.
CoRR, 2022

Efficient Video Segmentation Models with Per-frame Inference.
CoRR, 2022

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers.
Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022

Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Delving into Sequential Patches for Deepfake Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

On the Connection between Local Attention and Dynamic Depth-wise Convolution.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Versatile Neural Architectures by Propagating Network Codes.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Fatigue Life Evaluation of Rubber Tyred Gantry Crane based on Minner Criterion.
Proceedings of the 2nd International Conference on Control and Intelligent Robotics, 2022

StyleSwap: Style-Based Generator Empowers Robust Face Swapping.
Proceedings of the Computer Vision - ECCV 2022, 2022

UFO: Unified Feature Optimization.
Proceedings of the Computer Vision - ECCV 2022, 2022

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022

Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection.
Proceedings of the Computer Vision - ECCV 2022, 2022

GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

DaViT: Dual Attention Vision Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Action Quality Assessment with Temporal Parsing Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

Human-Object Interaction Detection via Disentangled Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Implicit Sample Extension for Unsupervised Person Re-Identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few-Shot Font Generation by Learning Fine-Grained Local Styles.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few-Shot Head Swapping in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Expressive Talking Head Generation with Granular Audio-Visual Control.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MixFormer: Mixing Features across Windows and Dimensions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Learning to Segment Video Object With Accurate Boundaries.
IEEE Trans. Multim., 2021

Group Reidentification with Multigrained Matching and Integration.
IEEE Trans. Cybern., 2021

Deep High-Resolution Representation Learning for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Content-aware convolutional neural networks.
Neural Networks, 2021

OCNet: Object Context for Semantic Segmentation.
Int. J. Comput. Vis., 2021

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search.
CoRR, 2021

Whole Brain Segmentation with Full Volume Neural Network.
CoRR, 2021

HRFormer: High-Resolution Transformer for Dense Prediction.
CoRR, 2021

Realistic Image Synthesis with Configurable 3D Scene Layouts.
CoRR, 2021

Cross-Modal Attention Consistency for Video-Audio Unsupervised Learning.
CoRR, 2021

Demystifying Local Vision Transformer: Sparse Connectivity, Weight Sharing, and Dynamic Weight.
CoRR, 2021

Whole brain segmentation with full volume neural network.
Comput. Medical Imaging Graph., 2021

HRFormer: High-Resolution Vision Transformer for Dense Predict.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Results of the NeurIPS'21 Challenge on Billion-Scale Approximate Nearest Neighbor Search.
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Hybrid Network Compression via Meta-Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Admix: Enhancing the Transferability of Adversarial Attacks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Conditional DETR for Fast Training Convergence.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Lite-HRNet: A Lightweight High-Resolution Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Boosting Adversarial Transferability through Enhanced Momentum.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Parameter Distribution Balanced CNNs.
IEEE Trans. Neural Networks Learn. Syst., 2020

Guest Editorial Multimedia Computing With Interpretable Machine Learning.
IEEE Trans. Multim., 2020

Semantic Image Segmentation by Scale-Adaptive Networks.
IEEE Trans. Image Process., 2020

Improving Person Re-Identification With Iterative Impression Aggregation.
IEEE Trans. Image Process., 2020

Object Detection in Videos by High Quality Object Linking.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

S4Net: Single stage salient-instance segmentation.
Comput. Vis. Media, 2020

Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates.
CoRR, 2020

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective.
Proceedings of the 37th International Conference on Machine Learning, 2020

SegFix: Model-Agnostic Boundary Refinement for Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Object-Contextual Representations for Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Efficient Semantic Video Segmentation with Per-Frame Inference.
Proceedings of the Computer Vision - ECCV 2020, 2020

Weakly-Supervised Action Localization by Generative Attention Modeling.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Balanced Decoupled Spatial Convolution for CNNs.
IEEE Trans. Neural Networks Learn. Syst., 2019

Learning Attentional Recurrent Neural Network for Visual Tracking.
IEEE Trans. Multim., 2019

Automatic Ensemble Diffusion for 3D Shape and Image Retrieval.
IEEE Trans. Image Process., 2019

A Bilinear Ranking SVM for Knowledge Based Relation Prediction and Classification.
IEEE Trans. Big Data, 2019

Composite Quantization.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Ordinal Constraint Binary Coding for Approximate Nearest Neighbor Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Joint salient object detection and existence prediction.
Frontiers Comput. Sci., 2019

Bottom-up Higher-Resolution Networks for Multi-Person Pose Estimation.
CoRR, 2019

Interlaced Sparse Self-Attention for Semantic Segmentation.
CoRR, 2019

MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019

Beyond Intra-modality Discrepancy: A Comprehensive Survey of Heterogeneous Person Re-identification.
CoRR, 2019

Group Re-Identification with Multi-grained Matching and Integration.
CoRR, 2019

High-Resolution Representations for Labeling Pixels and Regions.
CoRR, 2019

Disparity-preserved Deep Cross-platform Association for Cross-platform Video Recommendation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Cross View Fusion for 3D Human Pose Estimation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Global-Local Temporal Representations for Video Person Re-Identification.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Structured Knowledge Distillation for Semantic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Deep High-Resolution Representation Learning for Human Pose Estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Face Alignment With Deep Regression.
IEEE Trans. Neural Networks Learn. Syst., 2018

A Survey on Learning to Hash.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Multi-Dimensional Sparse Models.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Multiview Cross-Media Hashing with Semantic Consistency.
IEEE Multim., 2018

Accelerating Deep Neural Networks with Spatial Bottleneck Modules.
CoRR, 2018

OCNet: Object Context Network for Scene Parsing.
CoRR, 2018

IGCV2: Interleaved Structured Sparse Convolutional Neural Networks.
CoRR, 2018

Object Detection in Videos by Short and Long Range Object Linking.
CoRR, 2018

On the Large-Scale Transferability of Convolutional Neural Networks.
Proceedings of the Trends and Applications in Knowledge Discovery and Data Mining, 2018

Weakly Supervised Dense Event Captioning in Videos.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Group Re-Identification: Leveraging and Integrating Multi-Grain Information.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Triplet Quantization.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Convolutional Neural Networks with Merge-and-Run Mappings.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Rethinking ReLU to Train Better CNNs.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Feature Incay for Representation Regularization.
Proceedings of the 6th International Conference on Learning Representations, 2018

Part-Aligned Bilinear Representations for Person Re-identification.
Proceedings of the Computer Vision - ECCV 2018, 2018

Interleaved Structured Sparse Convolutional Neural Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Global Versus Localized Generative Adversarial Nets.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks.
Proceedings of the British Machine Vision Conference 2018, 2018

Decoupled Convolutions for CNNs.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Learning Correspondence Structures for Person Re-Identification.
IEEE Trans. Image Process., 2017

Multi-Timescale Collaborative Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Special issue on intelligent urban computing with big data.
Mach. Vis. Appl., 2017

Towards Reversal-Invariant Image Representation.
Int. J. Comput. Vis., 2017

Salient Object Detection: A Discriminative Regional Feature Integration Approach.
Int. J. Comput. Vis., 2017

Exemplar-Guided Similarity Learning on Polynomial Kernel Feature Map for Person Re-identification.
Int. J. Comput. Vis., 2017

Training Better CNNs Requires to Rethink ReLU.
CoRR, 2017

Interleaved Group Convolutions for Deep Neural Networks.
CoRR, 2017

Orthogonal and Idempotent Transformations for Learning Deep Neural Networks.
CoRR, 2017

Finding the Secret of CNN Parameter Layout under Strict Size Constraint.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Mixture Factorized Ornstein-Uhlenbeck Processes for Time-Series Forecasting.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Deeply-Learned Part-Aligned Representations for Person Re-identification.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Interleaved Group Convolutions.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Human Pose Estimation Using Global and Local Normalization.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Ensemble Diffusion for Retrieval.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Generalized Deep Transfer Networks for Knowledge Propagation in Heterogeneous Domains.
ACM Trans. Multim. Comput. Commun. Appl., 2016

Dual Low-Rank Pursuit: Learning Salient Features for Saliency Detection.
IEEE Trans. Neural Networks Learn. Syst., 2016

A Distance-Computation-Free Search Scheme for Binary Code Databases.
IEEE Trans. Multim., 2016

Weakly Supervised Metric Learning for Traffic Sign Recognition in a LIDAR-Equipped Vehicle.
IEEE Trans. Intell. Transp. Syst., 2016

A Diffusion and Clustering-Based Approach for Finding Coherent Motions and Understanding Crowd Scenes.
IEEE Trans. Image Process., 2016

Joint Multilabel Classification With Community-Aware Label Graph Learning.
IEEE Trans. Image Process., 2016

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection.
IEEE Trans. Image Process., 2016

Incorporating visual adjectives for image classification.
Neurocomputing, 2016

Accurate Image Search with Multi-Scale Contextual Evidences.
Int. J. Comput. Vis., 2016

Detection of Co-salient Objects by Looking Deep and Wide.
Int. J. Comput. Vis., 2016

Good Practice in CNN Feature Transfer.
CoRR, 2016

On the Connection of Deep Fusion to Ensembling.
CoRR, 2016

Deeply-Fused Nets.
CoRR, 2016

Self-Paced Cross-Modal Subspace Matching.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Fast Nearest Neighbor Search in the Hamming Space.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Binary Optimized Hashing.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

MARS: A Video Benchmark for Large-Scale Person Re-Identification.
Proceedings of the Computer Vision - ECCV 2016, 2016

Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons.
Proceedings of the Computer Vision - ECCV 2016, 2016

Collaborative Quantization for Cross-Modal Similarity Search.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

InterActive: Inter-Layer Activeness Propagation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

DisturbLabel: Regularizing CNN on the Loss Layer.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Supervised Quantization for Similarity Search.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Fine-Grained Image Search.
IEEE Trans. Multim., 2015

Optimized Cartesian K-Means.
IEEE Trans. Knowl. Data Eng., 2015

Exploratory Product Image Search With Circle-to-Search Interaction.
IEEE Trans. Circuits Syst. Video Technol., 2015

Guest Editorial: Big Media Data: Understanding, Search, and Mining (Part 2).
IEEE Trans. Big Data, 2015

Guest Editorial: Big Media Data: Understanding, Search, and Mining.
IEEE Trans. Big Data, 2015

Guest Editorial: Ad Hoc Web Multimedia Analysis with Limited Supervision.
Multim. Tools Appl., 2015

Group $K$-Means.
CoRR, 2015

Deep kinship verification.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Quantized Correlation Hashing for Fast Cross-Modal Search.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Scalable Person Re-identification: A Benchmark.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

RIDE: Reversal Invariant Descriptor Enhancement.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Person Re-Identification with Correspondence Structure Learning.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Sparse composite quantization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Co-saliency detection via looking deep and wide.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Similarity learning on an explicit polynomial kernel feature map for person re-identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Fast Neighborhood Graph Search Using Cartesian Concatenation.
Proceedings of the Multimedia Data Mining and Analytics - Disruptive Innovation, 2015

Fast Approximate K-Means via Cluster Closures.
Proceedings of the Multimedia Data Mining and Analytics - Disruptive Innovation, 2015

2014
Personalized Video Recommendation through Graph Propagation.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Browse-to-Search: Interactive Exploratory Search with Visual Entities.
ACM Trans. Inf. Syst., 2014

Regularized Tree Partitioning and Its Application to Unsupervised Image Segmentation.
IEEE Trans. Image Process., 2014

Trinary-Projection Trees for Approximate Nearest Neighbor Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Image tag refinement by regularized latent Dirichlet allocation.
Comput. Vis. Image Underst., 2014

Low-rank SIFT: An Affine Invariant Feature for Place Recognition.
CoRR, 2014

Hashing for Similarity Search: A Survey.
CoRR, 2014

Deep Regression for Face Alignment.
CoRR, 2014

Salient Object Detection: A Discriminative Regional Feature Integration Approach.
CoRR, 2014

Inner Product Similarity Search using Compositional Codes.
CoRR, 2014

Transductive 3D Shape Segmentation using Sparse Reconstruction.
Comput. Graph. Forum, 2014

Optimized Distances for Binary Code Ranking.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Composite Quantization for Approximate Nearest Neighbor Search.
Proceedings of the 31st International Conference on Machine Learning, 2014

Low-rank SIFT: An affine invariant feature for place recognition.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Finding Coherent Motions and Semantic Regions in Crowd Scenes: A Diffusion and Clustering Approach.
Proceedings of the Computer Vision - ECCV 2014, 2014

Orientational Pyramid Matching for Recognizing Indoor Scenes.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

How Fashion Talks: Clothing-Region-Based Gender Recognition.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2014

2013
Interactive Multimodal Visual Search on Mobile Device.
IEEE Trans. Multim., 2013

Structure-Sensitive Superpixels via Geodesic Distance.
Int. J. Comput. Vis., 2013

Hybrid Affinity Propagation.
CoRR, 2013

Scalable k-NN graph construction.
CoRR, 2013

Order preserving hashing for approximate nearest neighbor search.
Proceedings of the ACM Multimedia Conference, 2013

Image search by graph-based label propagation with image representation from DNN.
Proceedings of the ACM Multimedia Conference, 2013

Clickage: towards bridging semantic and intent gaps via mining click logs of search engines.
Proceedings of the ACM Multimedia Conference, 2013

Fixed-Point Model For Structured Labeling.
Proceedings of the 30th International Conference on Machine Learning, 2013

Two dimensional synthesis sparse model.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Two dimensional analysis sparse model.
Proceedings of the IEEE International Conference on Image Processing, 2013

Learning CRFs for Image Parsing with Adaptive Subgradient Descent.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Fast Neighborhood Graph Search Using Cartesian Concatenation.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Online Robust Non-negative Dictionary Learning for Visual Tracking.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Supervised Kernel Descriptors for Visual Recognition.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Salient Object Detection: A Discriminative Regional Feature Integration Approach.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
An interactive approach to semantic modeling of indoor scenes with an RGBD camera.
ACM Trans. Graph., 2012

Correction to "Bayesian Visual Reranking".
IEEE Trans. Multim., 2012

Recommending Flickr groups with social topic model.
Inf. Retr., 2012

Color filter for image search.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Scalable similar image search by joint indices.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Query-driven iterated neighborhood graph search for large scale indexing.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Similar image search with a tiny bag-of-delegates representation.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Browse-to-search.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Personalized video recommendation through tripartite graph propagation.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Contextual Dominant Color Name Extraction for Web Image Search.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, 2012

A Probabilistic Approach to Robust Matrix Factorization.
Proceedings of the Computer Vision - ECCV 2012, 2012

Scalable k-NN graph construction for visual descriptors.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Salient object detection for searched web images via global saliency.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Fast approximate k-means via cluster closures.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Image search results refinement via outlier detection using deep contexts.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Bayesian Visual Reranking.
IEEE Trans. Multim., 2011

Interactive Image Search by Color Map.
ACM Trans. Intell. Syst. Technol., 2011

A transductive multi-label learning approach for video concept detection.
Pattern Recognit., 2011

Learning to Detect a Salient Object.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

Interactive browsing via diversified visual summarization for image search results.
Multim. Syst., 2011

Discriminative Sketch-based 3D Model Retrieval via Robust Shape Matching.
Comput. Graph. Forum, 2011

Document clustering with universum.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Hybrid image summarization.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

JIGSAW: interactive mobile visual search with multimodal queries.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Web-scale image search by color sketch.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Robust visual reranking via sparsity and ranking constraints.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Contextual image search.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Complementary hashing for approximate nearest neighbor search.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Multi-task low-rank affinity pursuit for image segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2011

A non-convex relaxation approach to sparse dictionary learning.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Automatic salient object segmentation based on context and shape prior.
Proceedings of the British Machine Vision Conference, 2011

2010
Interactive image search by 2D semantic map.
Proceedings of the 19th International Conference on World Wide Web, 2010

Image search by concept map.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Dynamic Video Collage.
Proceedings of the Advances in Multimedia Modeling, 2010

Learning to combine multi-resolution spatially-weighted co-occurrence matrices for image representation.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Optimizing kd-trees for scalable visual descriptor indexing.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Picture Collage.
IEEE Trans. Multim., 2009

Linear Neighborhood Propagation and Its Applications.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Graph-based semi-supervised learning with multiple labels.
J. Vis. Commun. Image Represent., 2009

Tag refinement by regularized LDA.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Summarizing tagged image collections by cross-media representativeness voting.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008
MSRA at TRECVID 2008: High-Level Feature Extraction and Automatic Search.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Semi-Supervised Classification with Universum.
Proceedings of the SIAM International Conference on Data Mining, 2008

Bayesian video search reranking.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Finding image exemplars using fast sparse affinity propagation.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Transductive multi-label learning for video concept detection.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

Graph-based semi-supervised learning with multi-label.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Optimized video scene segmentation.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Transductive video annotation via local learnable kernel classifier.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Augmented tree partitioning for interactive image segmentation.
Proceedings of the International Conference on Image Processing, 2008

Maximum Margin Clustering with Pairwise Constraints.
Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), 2008

Joint multi-label multi-instance learning for image classification.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Normalized tree partitioning for image segmentation.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
Image-based tree modeling.
ACM Trans. Graph., 2007

Face recognition using spectral features.
Pattern Recognit., 2007

Image-Based Modeling by Joint Segmentation.
Int. J. Comput. Vis., 2007

Joint Affinity Propagation for Multiple View Segmentation.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

2006
Image-based plant modeling.
ACM Trans. Graph., 2006

Semi-Supervised Classification Using Linear Neighborhood Propagation.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

Picture Collage.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

2005
Visual object recognition using probabilistic kernel subspace similarity.
Pattern Recognit., 2005

2004
Probabilistic tangent subspace: a unified view.
Proceedings of the Machine Learning, 2004

Multi-view EM algorithm and its application to color image segmentation.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

2003
Kernel GMM and its application to image binarization.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Color Image Segmentation: Kernel Do the Feature Space.
Proceedings of the Machine Learning: ECML 2003, 2003

Kernel Trick Embedded Gaussian Mixture Model.
Proceedings of the Algorithmic Learning Theory, 14th International Conference, 2003

