Yu Qiao

ORCID: 0000-0002-1889-2567

Affiliations:
  • Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, China
  • University of Tokyo, Graduate School of Information Science and Technology, Japan (former)
  • University of Electro-Communications, Tokyo, Japan (PhD 2006)

According to our database, Yu Qiao authored at least 369 papers between 2003 and 2024.

Bibliography

2024
Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Temporally consistent video colorization with deep feature propagation and self-regularization learning.
Comput. Vis. Media, April, 2024

Dual Masked Modeling for Weakly-Supervised Temporal Boundary Discovery.
IEEE Trans. Multim., 2024

Attentive Snippet Prompting for Video Retrieval.
IEEE Trans. Multim., 2024

Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection.
IEEE Trans. Image Process., 2024

MixStyle Neural Networks for Domain Generalization and Adaptation.
Int. J. Comput. Vis., 2024

2023
Evaluating the Generalization Ability of Super-Resolution Networks.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

UniFormer: Unifying Convolution and Self-Attention for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Hybrid token transformer for deep face recognition.
Pattern Recognit., July, 2023

Blind Image Super-Resolution: A Survey and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

COCAS+: Large-Scale Clothes-Changing Person Re-Identification With Clothes Templates.
IEEE Trans. Circuits Syst. Video Technol., April, 2023

Domain Generalization: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

ActFloor-GAN: Activity-Guided Adversarial Networks for Human-Centric Floorplan Design.
IEEE Trans. Vis. Comput. Graph., March, 2023

Towards robustness and generalization of point cloud representation: A geometry coding method and a large-scale object-level dataset.
Comput. Vis. Media, February, 2023

Protein engineering via Bayesian optimization-guided evolutionary algorithm and robotic experiments.
Briefings Bioinform., January, 2023

Blind Image Restoration Based on Cycle-Consistent Network.
IEEE Trans. Multim., 2023

Region-Aware Arbitrary-Shaped Text Detection With Progressive Fusion.
IEEE Trans. Multim., 2023

Very Lightweight Photo Retouching Network With Conditional Sequential Modulation.
IEEE Trans. Multim., 2023

Character-Aware Sampling and Rectification for Scene Text Recognition.
IEEE Trans. Multim., 2023

Dual Relation Network for Scene Text Recognition.
IEEE Trans. Multim., 2023

M-BEV: Masked BEV Perception for Robust Autonomous Driving.
CoRR, 2023

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption.
CoRR, 2023

Harvest Video Foundation Models via Efficient Post-Pretraining.
CoRR, 2023

A Comparative Study of Image Restoration Networks for General Backbone Network Design.
CoRR, 2023

Unifying Image Processing as Visual Prompting Question Answering.
CoRR, 2023

HAT: Hybrid Attention Transformer for Image Restoration.
CoRR, 2023

Towards Efficient SDRTV-to-HDRTV by Learning from Image Formation.
CoRR, 2023

SEAL: A Framework for Systematic Evaluation of Real-World Super-Resolution.
CoRR, 2023

MGMAE: Motion Guided Masking for Video Masked Autoencoding.
CoRR, 2023

Networks are Slacking Off: Understanding Generalization Problem in Image Deraining.
CoRR, 2023

VideoLLM: Modeling Video Sequence with Large Language Models.
CoRR, 2023

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model.
CoRR, 2023

Long-Term Rhythmic Video Soundtracker.
CoRR, 2023

Unmasked Teacher: Towards Training-Efficient Video Foundation Models.
CoRR, 2023

Aleth-NeRF: Low-light Condition View Synthesis with Concealing Fields.
CoRR, 2023

Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners.
CoRR, 2023

Learning Discriminative Feature Representation for Open Set Action Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Parallelizable Simple Recurrent Units with Hierarchical Memory.
Proceedings of the Neural Information Processing - 30th International Conference, 2023

UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Activating More Pixels in Image Super-Resolution Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DegAE: A New Pretraining Paradigm for Low-Level Vision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization.
IEEE Trans. Image Process., 2022

Robust Image Forgery Detection Against Transmission Over Online Social Networks.
IEEE Trans. Inf. Forensics Secur., 2022

Temporal Weighting Appearance-Aligned Network for Nighttime Video Retrieval.
IEEE Signal Process. Lett., 2022

Unsupervised person re-identification with multi-label learning guided self-paced clustering.
Pattern Recognit., 2022

RankSRGAN: Super Resolution Generative Adversarial Networks With Learning to Rank.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Interactive Multi-Dimension Modulation for Image Restoration.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Author Correction: Development and clinical deployment of a smartphone-based visual field deep learning system for glaucoma detection.
npj Digit. Medicine, 2022

Joint 3D facial shape reconstruction and texture completion from a single image.
Comput. Vis. Media, 2022

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders.
CoRR, 2022

Diff-Font: Diffusion Model for Robust One-Shot Font Generation.
CoRR, 2022

InternVideo: General Video Foundation Models via Generative and Discriminative Learning.
CoRR, 2022

Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE.
CoRR, 2022

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer.
CoRR, 2022

Low-Resolution Action Recognition for Tiny Actions Challenge.
CoRR, 2022

Collaboration of Pre-trained Models Makes Better Few-shot Learner.
CoRR, 2022

Vision-Centric BEV Perception: A Survey.
CoRR, 2022

CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm.
CoRR, 2022

Illumination Adaptive Transformer.
CoRR, 2022

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results.
CoRR, 2022

POS-BERT: Point Cloud One-Stage BERT Pre-Training.
CoRR, 2022

MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection.
CoRR, 2022

Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning.
CoRR, 2022

CP-Net: Contour-Perturbed Reconstruction Network for Self-Supervised Point Cloud Learning.
CoRR, 2022

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning.
CoRR, 2022

Asynchronous feature regularization and cross-modal distillation for OCT based glaucoma diagnosis.
Comput. Biol. Medicine, 2022

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Cycle-Consistent Learning for Weakly Supervised Semantic Segmentation.
Proceedings of the HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, 2022

Visual Knowledge Graph for Human Action Reasoning in Videos.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

VideoPipe 2022 Challenge: Real-World Video Understanding for Urban Pipe Inspection.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Self-slimmed Vision Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

Efficient Image Super-Resolution Using Vast-Receptive-Field Attention.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Recurrent Bilinear Optimization for Binary Neural Networks.
Proceedings of the Computer Vision, 2022

PointCLIP: Point Cloud Understanding by CLIP.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Blueprint Separable Residual Network for Efficient Image Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Reflash Dropout in Image Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross Domain Object Detection by Target-Perceived Dual Branch Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Unleashing the Potential of Vision-Language Models for Long-Tailed Visual Recognition.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

You Only Need 90K Parameters to Adapt Light: a Light Weight Transformer for Image Enhancement and Exposure Correction.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Wildfish++: A Comprehensive Fish Benchmark for Multimedia Research.
IEEE Trans. Multim., 2021

Deep Relation Transformer for Diagnosing Glaucoma With Optical Coherence Tomography and Visual Field Function.
IEEE Trans. Medical Imaging, 2021

Domain Adaptive Ensemble Learning.
IEEE Trans. Image Process., 2021

Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos.
IEEE Trans. Image Process., 2021

Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2021

Multi-view self-supervised learning for 3D facial texture reconstruction from single image.
Image Vis. Comput., 2021

TTPP: Temporal Transformer with Progressive Prediction for efficient action anticipation.
Neurocomputing, 2021

A Comprehensive Review of Group Activity Recognition in Videos.
Int. J. Autom. Comput., 2021

Multi-View Partial (MVP) Point Cloud Challenge 2021 on Completion and Registration: Methods and Results.
CoRR, 2021

A Simple Long-Tailed Recognition Baseline via Vision-Language Model.
CoRR, 2021

MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video.
CoRR, 2021

CLIP-Adapter: Better Vision-Language Models with Feature Adapters.
CoRR, 2021

Discovering "Semantics" in Super-Resolution Networks.
CoRR, 2021

Transferable Knowledge-Based Multi-Granularity Aggregation Network for Temporal Action Localization: Submission to ActivityNet Challenge 2021.
CoRR, 2021

TSI: Temporal Saliency Integration for Video Action Recognition.
CoRR, 2021

Multiple Domain Experts Collaborative Learning: Multi-Source Domain Generalization For Person Re-Identification.
CoRR, 2021

FineAction: A Fined Video Dataset for Temporal Action Localization.
CoRR, 2021

Neighbourhood-guided Feature Reconstruction for Occluded Person Re-Identification.
CoRR, 2021

NTIRE 2021 Challenge on Perceptual Image Quality Assessment.
CoRR, 2021

Self-speculation of clinical features based on knowledge distillation for accurate ocular disease classification.
Biomed. Signal Process. Control., 2021

Multi-label ocular disease classification with a dense correlation deep neural network.
Biomed. Signal Process. Control., 2021

Group Shift Pointwise Convolution for Volumetric Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

A Novel Hybrid Convolutional Neural Network for Accurate Organ Segmentation in 3D Head and Neck CT Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

Collaborative Multi-View Convolutions With Gating For Accurate And Fast Volumetric Medical Image Segmentation.
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021

Domain Generalization with MixStyle.
Proceedings of the 9th International Conference on Learning Representations, 2021

CT-Net: Channel Tensorization Network for Video Classification.
Proceedings of the 9th International Conference on Learning Representations, 2021

Digging into Uncertainty in Self-supervised Multi-view Stereo.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A New Journey from SDRTV to HDRTV.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised Object Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Temporal Context Aggregation Network for Temporal Action Proposal Refinement.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Detecting Human-Object Interaction via Fabricated Compositional Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Affordance Transfer Learning for Human-Object Interaction Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

NTIRE 2021 Challenge on Perceptual Image Quality Assessment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

HDRUNet: Single Image HDR Reconstruction With Denoising and Dequantization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Toward Interactive Modulation for Photo-Realistic Image Restoration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Investigate Indistinguishable Points in Semantic Segmentation of 3D Point Cloud.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

MP-Mono: Monocular 3D Detection Using Multiple Priors for Autonomous Driving.
Proceedings of the International Conference on 3D Vision, 2021

2020
FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures.
IEEE Trans. Parallel Distributed Syst., 2020

Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition.
IEEE Trans. Image Process., 2020

Progressive Object Transfer Detection.
IEEE Trans. Image Process., 2020

DID: Disentangling-Imprinting-Distilling for Continuous Low-Shot Detection.
IEEE Trans. Image Process., 2020

Learning label correlations for multi-label image recognition with graph networks.
Pattern Recognit. Lett., 2020

Development and clinical deployment of a smartphone-based visual field deep learning system for glaucoma detection.
npj Digit. Medicine, 2020

Finding hard faces with better proposals and classifier.
Mach. Vis. Appl., 2020

Cascade multi-head attention networks for action recognition.
Comput. Vis. Image Underst., 2020

Product image recognition with guidance learning and noisy supervision.
Comput. Vis. Image Underst., 2020

Collaborative Distillation in the Parameter and Spectrum Domains for Video Action Recognition.
CoRR, 2020

Exploring Multi-Scale Feature Propagation and Communication for Image Super Resolution.
CoRR, 2020

A Comprehensive Study on Temporal Modeling for Online Action Detection.
CoRR, 2020

SIAT-3DFE: A High-Resolution 3D Facial Expression Dataset.
IEEE Access, 2020

Dense Correlation Network for Automated Multi-Label Ocular Disease Detection with Paired Color Fundus Photographs.
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Classification of Ocular Diseases Employing Attention-Based Unilateral and Bilateral Feature Weighting and Fusion.
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Learning Discriminative Representation For Facial Expression Recognition From Uncertainties.
Proceedings of the IEEE International Conference on Image Processing, 2020

Efficient Image Super-Resolution Using Pixel Attention.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax.
Proceedings of the Computer Vision - ECCV 2020, 2020

Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

MEAD: A Large-Scale Audio-Visual Dataset for Emotional Talking-Face Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020

AIM 2020 Challenge on Video Temporal Super-Resolution.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Suppressing Mislabeled Data via Grouping and Self-attention.
Proceedings of the Computer Vision - ECCV 2020, 2020

Enhanced Quadratic Video Interpolation.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Learning to Predict Context-Adaptive Convolution for Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Visual Compositional Learning for Human-Object Interaction Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Conditional Sequential Modulation for Efficient Global Image Retouching.
Proceedings of the Computer Vision - ECCV 2020, 2020

Interactive Multi-dimension Modulation with Dynamic Controllable Residual Learning for Image Restoration.
Proceedings of the Computer Vision - ECCV 2020, 2020

Mining Inter-Video Proposal Relations for Video Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Suppressing Uncertainties for Large-Scale Facial Expression Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Fast Texture Synthesis via Pseudo Optimizer.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

SmallBigNet: Integrating Core and Contextual Views for Video Classification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Multiple Transfer Learning and Multi-label Balanced Training Strategies for Facial AU Detection In the Wild.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Adaptive Dilated Network With Self-Correction Supervision for Counting.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

aDMSCN: A Novel Perspective for User Intent Prediction in Customer Service Bots.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Learning Attentive Pairwise Interaction for Fine-Grained Classification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Context-Transformer: Tackling Object Confusion for Few-Shot Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Geometry Sharing Network for 3D Point Cloud Classification and Segmentation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Dynamic Sampling Network for Semantic Segmentation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

FD-GAN: Generative Adversarial Networks with Fusion-Discriminator for Single Image Dehazing.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Mutual Component Convolutional Neural Networks for Heterogeneous Face Recognition.
IEEE Trans. Image Process., 2019

A Literature Review: Geometric Methods and Their Applications in Human-Related Analysis.
Sensors, 2019

Dual-supervised attention network for deep cross-modal hashing.
Pattern Recognit. Lett., 2019

Temporal Segment Networks for Action Recognition in Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

DeepDeblur: text image recovery from blur to sharp.
Multim. Tools Appl., 2019

Pedestrian detection with unsupervised multispectral feature learning using deep neural networks.
Inf. Fusion, 2019

A Comprehensive Study on Center Loss for Deep Face Recognition.
Int. J. Comput. Vis., 2019

Multi-Dimension Modulation for Image Restoration with Dynamic Controllable Residual Learning.
CoRR, 2019

Learning Category Correlations for Multi-label Image Recognition with Graph Networks.
CoRR, 2019

Product Image Recognition with Guidance Learning and Noisy Supervision.
CoRR, 2019

Correction to: Automatic differentiation of Glaucoma visual field from non-glaucoma visual field using deep convolutional neural network.
BMC Medical Imaging, 2019

Robust Text Line Detection in Equipment Nameplate Images.
Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics, 2019

The Equipment Nameplate Dataset for Scene Text Detection and Recognition.
Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics, 2019

Orientation Robust Scene Text Recognition in Natural Scene.
Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics, 2019

AnoPCN: Video Anomaly Detection via Deep Predictive Coding Network.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Intelligent Glaucoma Diagnosis Via Active Learning And Adversarial Data Augmentation.
Proceedings of the 16th IEEE International Symposium on Biomedical Imaging, 2019

Prostate Segmentation using 2D Bridged U-net.
Proceedings of the International Joint Conference on Neural Networks, 2019

Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition.
Proceedings of the International Conference on Multimodal Interaction, 2019

Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression.
Proceedings of the International Conference on Multimodal Interaction, 2019

Exploring Regularizations with Face, Body and Image Cues for Group Cohesion Prediction.
Proceedings of the International Conference on Multimodal Interaction, 2019

Visual-Textual Sentiment Analysis in Product Reviews.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Frame Attention Networks for Facial Expression Recognition in Videos.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

RankSRGAN: Generative Adversarial Networks With Ranker for Image Super-Resolution.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

DF2Net: A Dense-Fine-Finer Network for Detailed 3D Face Reconstruction.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Dynamic Multi-Scale Filters for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

PA3D: Pose-Action 3D Machine for Video Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Adaptive Pyramid Context Network for Semantic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Suppressing Model Overfitting for Image Super-Resolution Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Residual Compensation Networks for Heterogeneous Face Recognition.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Real-Time Action Recognition With Deeply Transferred Motion Vector CNNs.
IEEE Trans. Image Process., 2018

Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos.
IEEE Trans. Image Process., 2018

Deep embedding convolutional neural network for synthesizing CT image from T1-Weighted MR image.
Medical Image Anal., 2018

Transferring Deep Object and Scene Representations for Event Recognition in Still Images.
Int. J. Comput. Vis., 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
CoRR, 2018

W-net: Bridged U-net for 2D Medical Image Segmentation.
CoRR, 2018

Structured Triplet Learning with POS-tag Guided Attention for Visual Question Answering.
CoRR, 2018

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward.
CoRR, 2018

Automatic differentiation of Glaucoma visual field from non-glaucoma visual filed using deep convolutional neural network.
BMC Medical Imaging, 2018

Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

WildFish: A Large Benchmark for Fish Recognition in the Wild.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

StripNet: Towards Topology Consistent Strip Structure Segmentation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Visual Field Based Automatic Diagnosis of Glaucoma Using Deep Convolutional Neural Network.
Proceedings of the Computational Pathology and Ophthalmic Medical Image Analysis, 2018

A Multi-task Learning Approach for Image Captioning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Deep Recurrent Multi-instance Learning with Spatio-temporal Features for Engagement Intensity Prediction.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Cascade Attention Networks For Group Emotion Recognition with Face, Body and Image Cues.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Super-Identity Convolutional Neural Network for Face Hallucination.
Proceedings of the Computer Vision - ECCV 2018, 2018

SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters.
Proceedings of the Computer Vision - ECCV 2018, 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries.
Proceedings of the Computer Vision - ECCV 2018, 2018

Temporal Hallucinating for Action Recognition With Few Still Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

FOTS: Fast Oriented Text Spotting With a Unified Network.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

An End-to-End TextSpotter With Explicit Alignment and Attention.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

RDS-Denoiser: a Detail-preserving Convolutional Neural Network for Image Denoising.
Proceedings of the IEEE International Conference on Cyborg and Bionic Systems, 2018

Boosting up Scene Text Detectors with Guided CNN.
Proceedings of the British Machine Vision Conference 2018, 2018

Deep Reinforcement Learning for Unsupervised Video Summarization With Diversity-Representativeness Reward.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

LSTD: A Low-Shot Transfer Detector for Object Detection.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition.
IEEE Trans. Image Process., 2017

Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs.
IEEE Trans. Image Process., 2017

Locally Supervised Deep Hybrid Model for Scene Recognition.
IEEE Trans. Image Process., 2017

Improving scale invariant feature transform with local color contrastive descriptor for image classification.
J. Electronic Imaging, 2017

A robust coherent point drift approach based on rotation invariant shape context.
Neurocomputing, 2017

Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI.
Neurocomputing, 2017

Learning multiple local binary descriptors for image matching.
Neurocomputing, 2017

Group emotion recognition with individual facial emotion CNNs and global image based CNNs.
Proceedings of the 19th ACM International Conference on Multimodal Interaction, 2017

Depth driven people counting using deep region proposal network.
Proceedings of the IEEE International Conference on Information and Automation, 2017

Detecting Faces Using Inside Cascaded Contextual CNN.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Range Loss for Deep Face Recognition with Long-Tailed Training Data.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Single Shot Text Detector with Regional Attention.
Proceedings of the IEEE International Conference on Computer Vision, 2017

RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Marine Animal Detection and Recognition with Advanced Deep Learning Models.
Proceedings of the Working Notes of CLEF 2017, 2017

Dual Learning for Cross-domain Image Captioning.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Orientation-Aware Text Proposals Network for Scene Text Detection.
Proceedings of the Biometric Recognition - 12th Chinese Conference, 2017

Sparse Deep Transfer Learning for Convolutional Neural Network.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Text-Attentional Convolutional Neural Network for Scene Text Detection.
IEEE Trans. Image Process., 2016

Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks.
IEEE Signal Process. Lett., 2016

Adaptive Part-Level Model Knowledge Transfer for Gender Classification.
IEEE Signal Process. Lett., 2016

MoFAP: A Multi-level Representation for Action Recognition.
Int. J. Comput. Vis., 2016

Reference-omitted affine soft correspondence algorithm.
IET Image Process., 2016

Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice.
Comput. Vis. Image Underst., 2016

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks.
CoRR, 2016

Range Loss for Deep Face Recognition with Long-tail.
CoRR, 2016

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016.
CoRR, 2016

Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images.
CoRR, 2016

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network.
CoRR, 2016

Locally-Supervised Deep Hybrid Model for Scene Recognition.
CoRR, 2016

Shenzhen Institutes of Advanced Technology, CAS, China at TRECVID INS 2016.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Deep rehabilitation gait learning for modeling knee joints of lower-limb exoskeleton.
Proceedings of the 2016 IEEE International Conference on Robotics and Biomimetics, 2016

Deep face attributes recognition using spatial transformer network.
Proceedings of the IEEE International Conference on Information and Automation, 2016

DeepWriter: A Multi-stream Deep CNN for Text-Independent Writer Identification.
Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition, 2016

Codebook enhancement of VLAD representation for visual recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Human action recognition with DeepAction Kernel Gaussian Process.
Proceedings of the 2016 International Conference on Advanced Robotics and Mechatronics, 2016

A Discriminative Feature Learning Approach for Deep Face Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Detecting Text in Natural Image with Connectionist Text Proposal Network.
Proceedings of the Computer Vision - ECCV 2016, 2016

Real-Time Action Recognition with Enhanced Motion Vector CNNs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Gender and Smile Classification Using Deep Convolutional Neural Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Actionness Estimation Using Hybrid Fully Convolutional Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Reading Scene Text in Deep Convolutional Sequences.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Local Multi-Grouped Binary Descriptor With Ring-Based Pooling Configuration and Optimization.
IEEE Trans. Image Process., 2015

On feature-specific parameter learning in conditional random field-based approach for interactive object segmentation.
J. Electronic Imaging, 2015

Towards Good Practices for Very Deep Two-Stream ConvNets.
CoRR, 2015

Object-Scene Convolutional Neural Networks for Event Recognition in Images.
CoRR, 2015

Places205-VGGNet Models for Scene Recognition.
CoRR, 2015

Text-Attentional Convolutional Neural Networks for Scene Text Detection.
CoRR, 2015

Local Color Contrastive Descriptor for Image Classification.
CoRR, 2015

Boosting Optical Character Recognition: A Super-Resolution Approach.
CoRR, 2015

Deep classification of vehicle makers and models: The effectiveness of pre-training and data enhancement.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Road segmentation via iterative deep analysis.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Fast single image dehazing through Edge-Guided Interpolated Filter.
Proceedings of the 14th IAPR International Conference on Machine Vision Applications, 2015

MIL: Music Exploration and Visualization via Lyric and Image.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Better Exploiting OS-CNNs for Better Event Recognition in Images.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

Object-Scene Convolutional Neural Networks for event recognition in images.
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

Exploring Fisher vector and deep networks for action spotting.
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

Action recognition with trajectory-pooled deep-convolutional descriptors.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Latent Hierarchical Model of Temporal Structure for Complex Activity Classification.
IEEE Trans. Image Process., 2014

Common Feature Discriminant Analysis for Matching Infrared Face Images to Optical Face Images.
IEEE Trans. Image Process., 2014

Large Margin Dimensionality Reduction for Action Similarity Labeling.
IEEE Signal Process. Lett., 2014

Bayesian salient object detection based on saliency driven clustering.
Signal Process. Image Commun., 2014

Pairwise Rotation Invariant Co-Occurrence Local Binary Pattern.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Motion boundary based sampling and 3D co-occurrence descriptors for action recognition.
Image Vis. Comput., 2014

Robust visual tracking based on local kernelized representation.
Proceedings of the 2014 IEEE International Conference on Robotics and Biomimetics, 2014

A Joint Evaluation of Dictionary Learning and Feature Encoding for Action Recognition.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Saliency detection via foreground rendering and background exclusion.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Saliency driven clustering for salient object detection.
Proceedings of the IEEE International Conference on Acoustics, 2014

Saliency detection based on extended boundary prior with foci of attention.
Proceedings of the IEEE International Conference on Acoustics, 2014

Video Action Detection with Relational Dynamic-Poselets.
Proceedings of the Computer Vision - ECCV 2014, 2014

Action Recognition with Stacked Fisher Vectors.
Proceedings of the Computer Vision - ECCV 2014, 2014

Boosting VLAD with Supervised Dictionary Learning and High-Order Statistics.
Proceedings of the Computer Vision - ECCV 2014, 2014

Action and Gesture Temporal Spotting with Super Vector Representation.
Proceedings of the Computer Vision - ECCV 2014 Workshops, 2014

Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees.
Proceedings of the Computer Vision - ECCV 2014, 2014

Multi-view Super Vector for Action Recognition.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
One-class support vector machine-assisted robust tracking.
J. Electronic Imaging, 2013

Unsupervised optimal phoneme segmentation: theory and experimental evaluation.
IET Signal Process., 2013

A Study on Unsupervised Dictionary Learning and Feature Encoding for Action Classification.
CoRR, 2013

Multi-feature canonical correlation analysis for face photo-sketch image retrieval.
Proceedings of the ACM Multimedia Conference, 2013

Salient Object Segmentation Based on Automatic Labeling.
Proceedings of the Neural Information Processing - 20th International Conference, 2013

An active contour model based on multiple boundary measures.
Proceedings of the IEEE International Conference on Image Processing, 2013

Affine SoftAssign with bidirectional distance for point matching.
Proceedings of the IEEE International Conference on Image Processing, 2013

A semantic model for video based face recognition.
Proceedings of the IEEE International Conference on Information and Automation, 2013

LTD: Local Ternary Descriptor for image matching.
Proceedings of the IEEE International Conference on Information and Automation, 2013

Exploring dense trajectory feature and encoding methods for human interaction recognition.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

Mining Motion Atoms and Phrases for Complex Action Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Motionlets: Mid-level 3D Parts for Human Motion Recognition.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Exploring Cross-Channel Texture Correlation for Color Texture Classification.
Proceedings of the British Machine Vision Conference, 2013

Multi-scale Joint Encoding of Local Binary Patterns for Texture and Material Classification.
Proceedings of the British Machine Vision Conference, 2013

Exploring Motion Boundary based Sampling and Spatial-Temporal Context Descriptors for Action Recognition.
Proceedings of the British Machine Vision Conference, 2013

2012
Automatic music video generation: cross matching of music and image.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Voice conversion using Bayesian mixture of Probabilistic Linear Regressions and dynamic kernel features.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion.
Proceedings of the INTERSPEECH 2012, 2012

Learning geodesic CRF model for image segmentation.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Person re-identification across multi-camera system based on local descriptors.
Proceedings of the Sixth International Conference on Distributed Smart Cameras, 2012

One-Class SVM assisted accurate tracking.
Proceedings of the Sixth International Conference on Distributed Smart Cameras, 2012

A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
Regularized Maximum Likelihood Linear Regression Adaptation for Computer-Assisted Language Learning Systems.
IEICE Trans. Inf. Syst., 2011

A Study on Bag of Gaussian Model with Application to Voice Conversion.
Proceedings of the INTERSPEECH 2011, 2011

Gesture Design of Hand-to-Speech Converter Derived from Speech-to-Hand Converter Based on Probabilistic Integration Model.
Proceedings of the INTERSPEECH 2011, 2011

Knowledge-Based Segmentation of Spine and Ribs from Bone Scintigraphy.
Proceedings of the Neural Information Processing - 18th International Conference, 2011

Adaptive Region Growing Based on Boundary Measures.
Proceedings of the Neural Information Processing - 18th International Conference, 2011

Adaptive Detection of Hotspots in Thoracic Spine from Bone Scintigraphy.
Proceedings of the Neural Information Processing - 18th International Conference, 2011

Structure-constrained distribution matching using quadratic programming and its application to pronunciation evaluation.
Proceedings of the First Asian Conference on Pattern Recognition, 2011

2010
A study on invariance of f-divergence and its application to speech recognition.
IEEE Trans. Signal Process., 2010

Speech Structure and Its Application to Robust Speech Processing.
New Gener. Comput., 2010

Face recognition based on gradient Gabor feature and Efficient Kernel Fisher analysis.
Neural Comput. Appl., 2010

Dialect-based speaker classification using speaker-invariant dialect features.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Integration of multilayer regression analysis with structure-based pronunciation assessment.
Proceedings of the INTERSPEECH 2010, 2010

Regularized-MLLR speaker adaptation for computer-assisted language learning system.
Proceedings of the INTERSPEECH 2010, 2010

HMM-based sequence-to-frame mapping for voice conversion.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A Theory of Phase Singularities for Image Representation and its Applications to Object Tracking and Image Matching.
IEEE Trans. Image Process., 2009

Optimal event search using a structural cost function - improvement of structure to speech conversion.
Proceedings of the INTERSPEECH 2009, 2009

On invariant structural representation for speech recognition: theoretical validation and experimental improvement.
Proceedings of the INTERSPEECH 2009, 2009

Structural analysis of dialects, sub-dialects and sub-sub-dialects of Chinese.
Proceedings of the INTERSPEECH 2009, 2009

Analysis and utilization of MLLR speaker adaptation technique for learners' pronunciation evaluation.
Proceedings of the INTERSPEECH 2009, 2009

Speech generation from hand gestures based on space mapping.
Proceedings of the INTERSPEECH 2009, 2009

Affine invariant features and their application to speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Mixture of Probabilistic Linear Regressions: A unified view of GMM-based mapping techniques.
Proceedings of the IEEE International Conference on Acoustics, 2009

Free hand sketch understanding using SVMs-chain modeling for spatial and temporal patterns.
Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation, 2009

A study on Hidden Structural Model and its application to labeling sequences.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
f-divergence is a generalized invariant measure between distributions.
Proceedings of the INTERSPEECH 2008, 2008

Metric learning for unsupervised phoneme segmentation.
Proceedings of the INTERSPEECH 2008, 2008

Face recognition based on Gradient Gabor feature.
Proceedings of the International Conference on Image Processing, 2008

Phase singularities for image representation and matching.
Proceedings of the IEEE International Conference on Acoustics, 2008

Unsupervised optimal phoneme segmentation: Objectives, algorithm and comparisons.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Optimal Euler Circuit of Maximum Contiguous Cost.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2007

Offline Signature Verification Using Online Handwriting Registration.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Random discriminant structure analysis for automatic recognition of connected vowels.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
A Framework Toward Restoration of Writing Order from Single-Stroked Handwriting Image.
IEEE Trans. Pattern Anal. Mach. Intell., 2006

Recover Writing Trajectory from Multiple Stroked Image Using Bidirectional Dynamic Search.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Affine Invariant Dynamic Time Warping and its Application to Online Rotated Handwriting Recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Recovering Drawing Order from Offline Handwritten Image Using Direction Context and Optimal Euler Path.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
A Novel Approach to Recover Writing Order From Single Stroke Offline Handwritten Images.
Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR 2005), 29 August, 2005

2004
Recovering dynamic information from static handwritten images.
Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition, 2004

2003
Vehicle Detection on Highway Based on Direction-Fractal Dimension.
Proceedings of the Wavelet Analysis and Its Applications, 2003

