Xiaogang Wang

Affiliations:
  • Chinese University of Hong Kong, Department of Electrical Engineering, CUHK-SenseTime Joint Laboratory, Hong Kong
  • Massachusetts Institute of Technology, Cambridge, MA, USA (PhD 2009)


According to our database, Xiaogang Wang authored at least 345 papers between 2002 and 2024.

Bibliography

2024
Structured Domain Adaptation With Online Relation Regularization for Unsupervised Person Re-ID.
IEEE Trans. Neural Networks Learn. Syst., January, 2024

2023
A Holistically-Guided Decoder for Deep Representation Learning With Applications to Semantic Segmentation and Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

3D Object Detection for Autonomous Driving: A Comprehensive Survey.
Int. J. Comput. Vis., August, 2023

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection.
Int. J. Comput. Vis., 2023

Digital Life Project: Autonomous 3D Characters with Social Intelligence.
CoRR, 2023

CoNe: Contrast Your Neighbours for Supervised Image Classification.
CoRR, 2023

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow.
CoRR, 2023

ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process.
CoRR, 2023

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory.
CoRR, 2023

Topology Reasoning for Driving Scenes.
CoRR, 2023

Edge Preserving Implicit Surface Representation of Point Clouds.
CoRR, 2023

A Unified Conditional Framework for Diffusion-based Image Restoration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Ternary Weight Networks.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Siamese Image Modeling for Self-Supervised Vision Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Robust Self-Supervised LiDAR Odometry Via Representative Structure Discovery and 3D Inherent Error Modeling.
IEEE Robotics Autom. Lett., 2022

Probabilistic Graph Attention Network With Conditional Kernels for Pixel-Wise Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network.
Int. J. Comput. Vis., 2022

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information.
CoRR, 2022

Demystify Transformers & Convolutions in Modern Image Deep Networks.
CoRR, 2022

ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild.
CoRR, 2022

Pose for Everything: Towards Category-Agnostic Pose Estimation.
CoRR, 2022

No Attention is Needed: Grouped Spatial-temporal Shift for Simple and Efficient Video Restorers.
CoRR, 2022

3D Object Detection for Autonomous Driving: A Review and New Outlooks.
CoRR, 2022

Siamese Image Modeling for Self-Supervised Vision Representation Learning.
CoRR, 2022

Relational Self-Supervised Learning.
CoRR, 2022

Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Dynamic Token Normalization improves Vision Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

ViTAS: Vision Transformer Architecture Search.
Proceedings of the Computer Vision - ECCV 2022, 2022

Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space.
Proceedings of the Computer Vision - ECCV 2022, 2022

Frozen CLIP Models are Efficient Video Learners.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Degradation Representations for Image Deblurring.
Proceedings of the Computer Vision - ECCV 2022, 2022

IDR: Self-Supervised Image Denoising via Iterative Data Refinement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning a Structured Latent Space for Unsupervised Point Cloud Completion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

GreedyNASv2: Greedier Search with a Greedy Path Filter.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Person Re-Identification With Deep Kronecker-Product Matching and Group-Shuffling Random Walk.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Dynamic Token Normalization Improves Vision Transformer.
CoRR, 2021

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks.
CoRR, 2021

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition.
CoRR, 2021

INTERN: A New Learning Paradigm Towards General Vision.
CoRR, 2021

Vision Transformer Architecture Search.
CoRR, 2021

Scalable Transformers for Neural Machine Translation.
CoRR, 2021

Decoupled Spatial-Temporal Transformer for Video Inpainting.
CoRR, 2021

Fixing the Teacher-Student Knowledge Discrepancy in Distillation.
CoRR, 2021

ReSSL: Relational Self-Supervised Learning with Weak Augmentation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution.
Proceedings of the 38th International Conference on Machine Learning, 2021

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Proceedings of the 9th International Conference on Learning Representations, 2021

Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Weakly Supervised Contrastive Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Rethinking Noise Synthesis and Modeling in Raw Denoising.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning with Privileged Tasks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fast Convergence of DETR with Spatially Modulated Co-Attention.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Visually Informed Binaural Audio Generation without Binaural Audios.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Semantic Scene Completion via Integrating Instances and Scene In-the-Loop.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
HMS-Net: Hierarchical Multi-Scale Sparsity-Invariant Network for Sparse Depth Completion.
IEEE Trans. Image Process., 2020

SSN: Learning Sparse Switchable Normalization via SparsestMax.
Int. J. Comput. Vis., 2020

Deep Learning for Generic Object Detection: A Survey.
Int. J. Comput. Vis., 2020

A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection.
CoRR, 2020

End-to-End Object Detection with Adaptive Clustering Transformer.
CoRR, 2020

Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation.
CoRR, 2020

Gradient Regularized Contrastive Learning for Continual Domain Adaptation.
CoRR, 2020

1st Place Solutions for OpenImage2019 - Object Detection and Instance Segmentation.
CoRR, 2020

Channel Equilibrium Networks for Learning Deep Representation.
Proceedings of the 37th International Conference on Machine Learning, 2020

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Adapting Object Detectors with Conditional Domain Normalization.
Proceedings of the Computer Vision - ECCV 2020, 2020

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions.
Proceedings of the Computer Vision - ECCV 2020, 2020

Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

3D Human Mesh Regression With Dense Correspondence.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Revisiting the Sibling Head in Object Detector.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Density-Aware Feature Embedding for Face Clustering.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Robust Superpixel-Guided Attentional Adversarial Attack.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

KPNet: Towards Minimal Face Detector.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
SCAN: Self-and-Collaborative Attention Network for Video Person Re-Identification.
IEEE Trans. Image Process., 2019

Deep Continuous Conditional Random Fields With Asymmetric Inter-Object Constraints for Online Multi-Object Tracking.
IEEE Trans. Circuits Syst. Video Technol., 2019

Progressively diffused networks for semantic visual parsing.
Pattern Recognit., 2019

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Monocular Depth Estimation Using Multi-Scale Continuous CRFs as Sequential Deep Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Guest Editors' Introduction to the Special Section on Compact and Efficient Feature Representation and Learning in Computer Vision.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

LCrowdV: Generating labeled videos for pedestrian detectors training and crowd behavior learning.
Neurocomputing, 2019

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection.
Int. J. Comput. Vis., 2019

An incremental model transfer method for complex process fault diagnosis.
IEEE CAA J. Autom. Sinica, 2019

Part-A² Net: 3D Part-Aware and Aggregation Neural Network for Object Detection from Point Cloud.
CoRR, 2019

Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation.
CoRR, 2019

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot.
CoRR, 2019

DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images.
CoRR, 2019

Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Feature Intertwiner for Object Detection.
Proceedings of the 7th International Conference on Learning Representations, 2019

Vision-Infused Deep Audio Inpainting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Interpolated Convolutional Networks for 3D Point Cloud Understanding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Deep Self-Learning From Noisy Labels.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Multi-Modality Latent Interaction Network for Visual Question Answering.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

P2SGrad: Refined Gradients for Optimizing Deep Face Models.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Context and Attribute Grounded Dense Captioning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Semantics Disentangling for Text-To-Image Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

SSN: Learning Sparse Switchable Normalization via SparsestMax.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Video Generation From Single Semantic Label Map.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Conditional Adversarial Generative Flow for Controllable Image Synthesis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Finding Task-Relevant Features for Few-Shot Learning by Category Traversal.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Group-Wise Correlation Stereo Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Unsupervised Cross-Spectral Stereo Matching by Learning to Synthesize.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Gradient Harmonized Single-Stage Detector.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Structure Learning for Deep Neural Networks Based on Multiobjective Optimization.
IEEE Trans. Neural Networks Learn. Syst., 2018

Crowd Tracking by Group Structure Evolution.
IEEE Trans. Circuits Syst. Video Technol., 2018

T-CNN: Tubelets With Convolutional Neural Networks for Object Detection From Videos.
IEEE Trans. Circuits Syst. Video Technol., 2018

Deep Learning for Visual Understanding: Part 2 [From the Guest Editors].
IEEE Signal Process. Mag., 2018

Crafting GBD-Net for Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Jointly Learning Deep Features, Deformable Parts, Occlusion and Classification for Pedestrian Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

HMS-Net: Hierarchical Multi-scale Sparsity-invariant Network for Sparse Depth Completion.
CoRR, 2018

Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association.
CoRR, 2018

Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation.
CoRR, 2018

FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Person Re-identification with Deep Similarity-Guided Graph Neural Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

Transductive Centroid Projection for Semi-supervised Large-Scale Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data.
Proceedings of the Computer Vision - ECCV 2018, 2018

Factorizable Net: An Efficient Subgraph-Based Framework for Scene Graph Generation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Neural Network Encapsulation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Learning Monocular Depth by Distilling Cross-Domain Stereo Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

Question-Guided Hybrid Convolution for Visual Question Answering.
Proceedings of the Computer Vision - ECCV 2018, 2018

Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association.
Proceedings of the Computer Vision - ECCV 2018, 2018

3D Human Pose Estimation in the Wild by Adversarial Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Eliminating Background-Bias for Robust Person Re-Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Avatar-Net: Multi-Scale Zero-Shot Style Transfer by Feature Decoration.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

End-to-End Deep Kronecker-Product Matching for Person Re-Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep Group-Shuffling Random Walk for Person Re-Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

FaceID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Visual Question Generation as Dual Task of Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Video Person Re-Identification With Competitive Snippet-Similarity Aggregation and Co-Attentive Snippet Embedding.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Group Consistent Similarity Learning via Deep CRF for Person Re-Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Exploring Disentangled Feature Representation Beyond Face Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Spatial as Deep: Spatial CNN for Traffic Scene Understanding.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Co-Attending Free-Form Regions and Detections With Multi-Modal Multiplicative Feature Embedding for Visual Question Answering.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Visual Importance and Distortion Guided Deep Image Quality Assessment Framework.
IEEE Trans. Multim., 2017

Learning Scene-Independent Group Descriptors for Crowd Understanding.
IEEE Trans. Circuits Syst. Video Technol., 2017

Crowded Scene Understanding by Deeply Learned Volumetric Slices.
IEEE Trans. Circuits Syst. Video Technol., 2017

Deep Learning for Visual Understanding [From the Guest Editors].
IEEE Signal Process. Mag., 2017

Local binary features for texture classification: Taxonomy and experimental study.
Pattern Recognit., 2017

Person Re-Identification by Saliency Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

L₀ Regularized Stationary-Time Estimation for Crowd Analysis.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

A Comprehensive Study on Cross-View Gait Based Human Identification with Deep CNNs.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Editorial: Deep Learning for Computer Vision.
Comput. Vis. Image Underst., 2017

Rethinking Feature Discrimination and Polymerization for Large-scale Recognition.
CoRR, 2017

Visual Question Generation as Dual Task of Visual Question Answering.
CoRR, 2017

Progressively Diffused Networks for Semantic Image Segmentation.
CoRR, 2017

Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision.
CoRR, 2017

Learning Chained Deep Features and Classifiers for Cascade in Object Detection.
CoRR, 2017

Learning Deep Features via Congenerous Cosine Loss for Person Recognition.
CoRR, 2017

Scene Graph Generation from Objects, Phrases and Caption Regions.
CoRR, 2017

ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection.
CoRR, 2017

Zoom Out-and-In Network with Recursive Training for Object Proposal.
CoRR, 2017

Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Unconstrained Fashion Landmark Detection via Hierarchical Recurrent Transformer Networks.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Learning Feature Pyramids for Human Pose Estimation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-Temporal Path Proposals.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Chained Cascade Network for Object Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Deep Dual Learning for Semantic Image Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Recurrent Scale Approximation for Object Detection in CNN.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Identity-Aware Textual-Visual Matching with Latent Co-attention.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Scene Graph Generation from Objects, Phrases and Region Captions.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Online Multi-object Tracking Using CNN-Based Single Object Tracker with Spatial-Temporal Attention Mechanism.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning Cross-Modal Deep Representations for Robust Pedestrian Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Joint Detection and Identification Feature Learning for Person Search.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning Object Interactions and Descriptions for Semantic Image Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Residual Attention Network for Image Classification.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Person Search with Natural Language Description.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

ViP-CNN: Visual Phrase Guided Convolutional Neural Network.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Object Detection in Videos with Tubelet Proposal Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Multi-context Attention for Human Pose Estimation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Learning for Scene-Independent Crowd Analysis.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

2016
Data-Driven Crowd Understanding: A Baseline for a Large-Scale Crowd Dataset.
IEEE Trans. Multim., 2016

Bridging Music and Image via Cross-Modal Ranking Analysis.
IEEE Trans. Multim., 2016

Exemplar-AMMs: Recognizing Crowd Movements From Pedestrian Trajectories.
IEEE Trans. Multim., 2016

Pedestrian Behavior Modeling From Stationary Crowds With Applications to Intelligent Surveillance.
IEEE Trans. Image Process., 2016

Median Robust Extended Local Binary Pattern for Texture Classification.
IEEE Trans. Image Process., 2016

Partial Occlusion Handling in Pedestrian Detection With a Deep Model.
IEEE Trans. Circuits Syst. Video Technol., 2016

Hybrid Deep Learning for Face Verification.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

A survey on heterogeneous face recognition: Sketch, infra-red, 3D and low-resolution.
Image Vis. Comput., 2016

Magnetic Resonance Fingerprinting with compressed sensing and distance metric learning.
Neurocomputing, 2016

Learning Mutual Visibility Relationship for Pedestrian Detection with a Deep Model.
Int. J. Comput. Vis., 2016

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks.
CoRR, 2016

End-to-End Deep Learning for Person Search.
CoRR, 2016

Convolutional neural networks with low-rank regularization.
Proceedings of the 4th International Conference on Learning Representations, 2016

Factors in Finetuning Deep Model for object detection.
CoRR, 2016

Real-time Sign Language Recognition with Guided Deep Convolutional Neural Networks.
Proceedings of the 2016 Symposium on Spatial User Interaction, 2016

CRF-CNN: Modeling Structured Information in Human Pose Estimation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Emerging Topics in Learning from Noisy and Missing Data.
Proceedings of the 2016 ACM Conference on Multimedia, 2016

Multi-Bias Non-linear Activation in Deep Neural Networks.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Crossing-Line Crowd Counting with Two-Phase Deep Neural Networks.
Proceedings of the Computer Vision - ECCV 2016, 2016

Gated Bi-directional CNN for Object Detection.
Proceedings of the Computer Vision - ECCV 2016, 2016

Pedestrian Behavior Understanding and Prediction with Deep Neural Networks.
Proceedings of the Computer Vision - ECCV 2016, 2016

Learnable Histogram: Statistical Context Features for Deep Neural Networks.
Proceedings of the Computer Vision - ECCV 2016, 2016

Fashion Landmark Detection in the Wild.
Proceedings of the Computer Vision - ECCV 2016, 2016

LCrowdV: Generating Labeled Videos for Simulation-Based Crowd Behavior Learning.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Evaluation of LBP and Deep Texture Descriptors with a New Robustness Benchmark.
Proceedings of the Computer Vision - ECCV 2016, 2016

End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

STCT: Sequentially Training Convolutional Networks for Visual Tracking.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Sparsifying Neural Network Connections for Face Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Slicing Convolutional Neural Network for Crowd Video Understanding.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Factors in Finetuning Deep Model for Object Detection with Long-Tail Distribution.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Object Detection from Video Tubelets with Convolutional Neural Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Structured Feature Learning for Pose Estimation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Face Model Compression by Distilling Knowledge from Neurons.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Single-Pedestrian Detection Aided by Two-Pedestrian Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents.
Int. J. Comput. Vis., 2015

Window-Object Relationship Guided Representation Learning for Generic Object Detections.
CoRR, 2015

DeepID3: Face Recognition with Very Deep Neural Networks.
CoRR, 2015

Pedestrian Travel Time Estimation in Crowded Scenes.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Visual Tracking with Fully Convolutional Networks.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Deep Learning Strong Parts for Pedestrian Detection.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning Deep Representation with Large-Scale Attributes.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Deep Learning Face Attributes in the Wild.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Multi-task Recurrent Neural Network for Immediacy Prediction.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Saliency detection by multi-context deep learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Cross-scene crowd counting via deep convolutional neural networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Understanding pedestrian behaviors from stationary crowd groups.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning from massive noisy labeled data for image classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Pedestrian detection aided by deep learning semantic tasks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Deeply learned face representations are sparse, selective, and robust.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Deeply learned attributes for crowded scene understanding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

DeepID-Net: Deformable deep convolutional neural networks for object detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Person Re-identification: System Design and Evaluation Overview.
Proceedings of the Person Re-Identification, 2014

Measuring Crowd Collectiveness.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Scene-Specific Pedestrian Detection for Static Video Surveillance.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Web Image Re-Ranking Using Query-Specific Semantic Signatures.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Deep Learning Multi-View Representation for Face Recognition.
CoRR, 2014

Recover Canonical-View Faces in the Wild with Deep Neural Networks.
CoRR, 2014

Deep Learning Face Representation by Joint Identification-Verification.
CoRR, 2014

DeepID-Net: multi-stage and deformable deep convolutional neural networks for object detection.
CoRR, 2014

Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification.
CoRR, 2014

Real-time sign language recognition using RGBD stream: spatial-temporal feature exploration.
Proceedings of the 2nd ACM Symposium on Spatial User Interaction, 2014

Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Deep Learning Face Representation by Joint Identification-Verification.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Fusing Music and Video Modalities Using Multi-timescale Shared Representations.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

MRF denoising with compressed sensing and adaptive filtering.
Proceedings of the IEEE 11th International Symposium on Biomedical Imaging, 2014

Profiling stationary crowd groups.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Crowd Tracking with Dynamic Evolution of Group Structures.
Proceedings of the Computer Vision - ECCV 2014, 2014

Deep Learning of Scene-Specific Classifier for Pedestrian Detection.
Proceedings of the Computer Vision - ECCV 2014, 2014

Learning Mid-level Filters for Person Re-identification.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

L0 Regularized Stationary Time Estimation for Crowd Group Analysis.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Deep Learning Face Representation from Predicting 10,000 Classes.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Scene-Independent Group Profiling in Crowd.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Multi-source Deep Learning for Human Pose Estimation.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Switchable Deep Network for Pedestrian Detection.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

DeepReID: Deep Filter Pairing Neural Network for Person Re-identification.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Simplifying HOG arithmetic for speedy hardware realization.
Proceedings of the 2014 IEEE Asia Pacific Conference on Circuits and Systems, 2014

2013
Content-Based Photo Quality Assessment.
IEEE Trans. Multim., 2013

Learning Semantic Signatures for 3D Object Retrieval.
IEEE Trans. Multim., 2013

Counting Vehicles from Semantic Regions.
IEEE Trans. Intell. Transp. Syst., 2013

Image Transformation Based on Learning Dictionaries across Image Spaces.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Anchor concept graph distance for web image re-ranking.
Proceedings of the ACM Multimedia Conference, 2013

Dimensionality Reduction with Generalized Linear Models.
Proceedings of the IJCAI 2013, 2013

Deep Learning Identity-Preserving Face Space.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Person Re-identification by Salience Matching.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Multi-stage Contextual Deep Learning for Pedestrian Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Visual Semantic Complex Network for Web Images.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Joint Deep Learning for Pedestrian Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2013

A Deep Sum-Product Architecture for Robust Facial Attributes Analysis.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Pedestrian Parsing via Deep Decompositional Network.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Measuring Crowd Collectiveness.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Unsupervised Salience Learning for Person Re-identification.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Deep Convolutional Network Cascade for Facial Point Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Modeling Mutual Visibility Relationship in Pedestrian Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Single-Pedestrian Detection Aided by Multi-pedestrian Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Locally Aligned Feature Transforms across Views.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
IntentSearch: Capturing User Intention for One-Click Internet Image Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Cross matching of music and image.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Joint semantic segmentation by searching for compatible-competitive references.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Coherent Filtering: Detecting Coherent Motions from Crowd Clutters.
Proceedings of the Computer Vision - ECCV 2012, 2012

Graph Degree Linkage: Agglomerative Clustering on a Directed Graph.
Proceedings of the Computer Vision - ECCV 2012, 2012

Understanding collective crowd behaviors: Learning a Mixture model of Dynamic pedestrian-Agents.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Transferring a generic pedestrian detector towards specific scenes.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

A discriminative deep model for pedestrian detection with occlusion handling.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Hierarchical face parsing via deep learning.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Synthesizing oil painting surface geometry from a single photograph.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Human Reidentification with Transferred Metric Learning.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
Tractography segmentation using a hierarchical Dirichlet processes mixture model.
NeuroImage, 2011

Trajectory Analysis and Semantic Region Modeling Using Nonparametric Hierarchical Bayesian Models.
Int. J. Comput. Vis., 2011

3D object retrieval with semantic attributes.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Optical flow estimation using learned sparse model.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Random field topic model for semantic region analysis in crowded scenes from tracklets.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Coupled information-theoretic encoding for face photo-sketch recognition.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Query-specific visual semantic spaces for web image re-ranking.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Correspondence-Free Activity Analysis and Scene Modeling in Multiple Camera Views.
IEEE Trans. Pattern Anal. Mach. Intell., 2010

Lighting and Pose Robust Face Sketch Synthesis.
Proceedings of the Computer Vision - ECCV 2010, 2010

2009
Learning motion patterns using hierarchical Bayesian models.
PhD thesis, 2009

Face Photo-Sketch Synthesis and Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Unsupervised Activity Perception in Crowded and Complicated Scenes Using Hierarchical Bayesian Models.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

2008
Correspondence-free multi-camera activity analysis and scene modeling.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Trajectory analysis and semantic region modeling using a nonparametric Bayesian model.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
Spatial Latent Dirichlet Allocation.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Unsupervised Activity Perception by Hierarchical Bayesian Models.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Multi-class object tracking algorithm that handles fragmentation and grouping.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Random Sampling for Subspace Face Recognition.
Int. J. Comput. Vis., 2006

Learning Semantic Scene Models by Trajectory Analysis.
Proceedings of the Computer Vision, 2006

2005
Hallucinating face by eigentransformation.
IEEE Trans. Syst. Man Cybern. Part C, 2005

Subspace Analysis Using Random Mixture Models.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

2004
Face sketch recognition.
IEEE Trans. Circuits Syst. Video Technol., 2004

A Unified Framework for Subspace Face Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2004

Experimental Study on Multiple LDA Classifier Combination for High Dimensional Data Classification.
Proceedings of the Multiple Classifier Systems, 5th International Workshop, 2004

Bayesian Face Recognition Based on Gaussian Mixture Models.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Improving indoor and outdoor face recognition using unified subspace analysis and Gabor features.
Proceedings of the 2004 International Conference on Image Processing, 2004

Hallucinating Face by Eigentransformation with Distortion Reduction.
Proceedings of the Biometric Authentication, First International Conference, 2004

Using Random Subspace to Combine Multiple Features for Face Recognition.
Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition (FGR 2004), 2004

Dual-Space Linear Discriminant Analysis for Face Recognition.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

Random Sampling LDA for Face Recognition.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

2003
Bayesian face recognition using Gabor features.
Proceedings of the 2003 ACM SIGMM Workshop on Biometrics Methods and Applications, 2003

Unified Subspace Analysis for Face Recognition.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

Face Sketch Synthesis and Recognition.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

An improved Bayesian face recognition algorithm in PCA subspace.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Face Hallucination and Recognition.
Proceedings of the Audio- and Video-Based Biometric Person Authentication, 2003

2002
Face photo recognition using sketch.
Proceedings of the 2002 International Conference on Image Processing, 2002

