Hongtao Xie

ORCID: 0000-0002-6249-5315

According to our database, Hongtao Xie authored at least 166 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CDistNet: Perceiving Multi-domain Character Distance for Robust Text Recognition.
Int. J. Comput. Vis., February, 2024

Balanced Classification: A Unified Framework for Long-Tailed Object Detection.
IEEE Trans. Multim., 2024

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations.
CoRR, 2024

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Discriminative Feature Mining Based on Frequency Information and Metric Learning for Face Forgery Detection.
IEEE Trans. Knowl. Data Eng., December, 2023

Meta semi-supervised medical image segmentation with label hierarchy.
Health Inf. Sci. Syst., December, 2023

Constructing Spatio-Temporal Graphs for Face Forgery Detection.
ACM Trans. Web, August, 2023

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Neighborhood-Adaptive Multi-Cluster Ranking for Deep Metric Learning.
IEEE Trans. Circuits Syst. Video Technol., April, 2023

Multi-task hourglass network for online automatic diagnosis of developmental dysplasia of the hip.
World Wide Web (WWW), March, 2023

Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection.
IEEE Trans. Multim., 2023

Learning Cross-Channel Representations for Semantic Segmentation.
IEEE Trans. Multim., 2023

What is the Real Need for Scene Text Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties.
IEEE Trans. Image Process., 2023

Prototypical Matching Networks for Video Object Segmentation.
IEEE Trans. Image Process., 2023

Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction.
CoRR, 2023

MomentDiff: Generative Video Moment Retrieval from Random to Real.
CoRR, 2023

MomentDiff: Generative Video Moment Retrieval from Random to Real.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Frequency-based Zero-Shot Learning with Phase Augmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Masked Text Modeling: A Self-Supervised Pre-training Method for Scene Text Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CARIS: Context-Aware Referring Image Segmentation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

High Fidelity Face Swapping via Semantics Disentanglement and Structure Enhancement.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

RAIRNet: Region-Aware Identity Rectification for Face Forgery Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Difference-Aware Iterative Reasoning Network for Key Relation Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

An Online/Offline Power Data Sharing System Based on Blockchain.
Proceedings of the 2023 International Conference on Communication Network and Machine Learning, 2023

Exploring Stroke-Level Modifications for Scene Text Editing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Multimodal Learning for Temporally Coherent Talking Face Generation With Articulator Synergy.
IEEE Trans. Multim., 2022

Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.
IEEE Trans. Multim., 2022

Online Residual Quantization Via Streaming Data Correlation Preserving.
IEEE Trans. Multim., 2022

Dynamic-Aware Federated Learning for Face Forgery Video Detection.
ACM Trans. Intell. Syst. Technol., 2022

PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition.
IEEE Trans. Image Process., 2022

Deep Fourier Ranking Quantization for Semi-Supervised Image Retrieval.
IEEE Trans. Image Process., 2022

Bilateral Temporal Re-Aggregation for Weakly-Supervised Video Object Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Self-Supervised Synthesis Ranking for Deep Metric Learning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Semi-Supervised Text Detection With Accurate Pseudo-Labels.
IEEE Signal Process. Lett., 2022

Attention-guided transformation-invariant attack for black-box adversarial examples.
Int. J. Intell. Syst., 2022

Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dual Part Discovery Network for Zero-Shot Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Geometry Aligned Variational Transformer for Image-conditioned Layout Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Weakly Supervised Pediatric Bone Age Assessment Using Ultrasonic Images via Automatic Anatomical RoI Detection.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Detecting Tampered Scene Text in the Wild.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022

Partial Class Activation Attention for Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Neighborhood-Adaptive Structure Augmented Metric Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
R-Net: A Relationship Network for Efficient and Accurate Scene Text Detection.
IEEE Trans. Multim., 2021

Domain-Oriented Semantic Embedding for Zero-Shot Learning.
IEEE Trans. Multim., 2021

A Mutually Attentive Co-Training Framework for Semi-Supervised Recognition.
IEEE Trans. Multim., 2021

Hip Landmark Detection With Dependency Mining in Ultrasound Image.
IEEE Trans. Medical Imaging, 2021

Self-Supervised Attention Mechanism for Pediatric Bone Age Assessment With Efficient Weak Annotation.
IEEE Trans. Medical Imaging, 2021

PRRNet: Pixel-Region relation network for face forgery detection.
Pattern Recognit., 2021

Hierarchical multi-view context modelling for 3D object classification and retrieval.
Inf. Sci., 2021

A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks.
CoRR, 2021

Cross-Modal Attention Consistency for Video-Audio Unsupervised Learning.
CoRR, 2021

Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval.
CoRR, 2021

TDI TextSpotter: Taking Data Imbalance into Account in Scene Text Spotting.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Cluster and Scatter: A Multi-grained Active Semi-supervised Learning Framework for Scalable Person Re-identification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

End-to-end Boundary Exploration for Weakly-supervised Semantic Segmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Dynamic Inconsistency-aware DeepFake Video Detection.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Global Characteristic Guided Landmark Detection for Genu Valgus and Varus Diagnosis.
Proceedings of the Image and Graphics - 11th International Conference, 2021

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Semantic-guided Reinforced Region Embedding for Generalized Zero-Shot Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Robust Deep Co-Saliency Detection With Group Semantic and Pyramid Attention.
IEEE Trans. Neural Networks Learn. Syst., 2020

Bidirectional Attention-Recognition Model for Fine-Grained Object Classification.
IEEE Trans. Multim., 2020

Misshapen Pelvis Landmark Detection With Local-Global Feature Learning for Diagnosing Developmental Dysplasia of the Hip.
IEEE Trans. Medical Imaging, 2020

Mining Spatial-Temporal Similarity for Visual Tracking.
IEEE Trans. Image Process., 2020

Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition.
IEEE Trans. Image Process., 2020

Global context and boundary structure-guided network for cross-modal organ segmentation.
Inf. Process. Manag., 2020

Hierarchical Granularity Transfer Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improving Brain Tumor Segmentation with Dilated Pseudo-3D Convolution and Multi-direction Fusion.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Law Is Order: Protecting Multimedia Network Transmission by Game Theory and Mechanism Design.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-Features Fusion and Decomposition for Age-Invariant Face Recognition.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Rich Attention for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Real-World Automatic Makeup via Identity Preservation Makeup Net.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation.
Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Graph Structured Network for Image-Text Matching.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Curriculum Learning for Natural Language Understanding.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

CircleNet for Hip Landmark Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Convolutional Attention Networks for Scene Text Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Double-Bit Quantization and Index Hashing for Nearest Neighbor Search.
IEEE Trans. Multim., 2019

Automated pulmonary nodule detection in CT images using deep convolutional neural networks.
Pattern Recognit., 2019

Supervised deep hashing for image content security.
Multim. Tools Appl., 2019

Name-face association with web facial image supervision.
Multim. Syst., 2019

Distributed data-dependent locality sensitive hashing.
Int. J. High Perform. Comput. Netw., 2019

Adaptive Alignment Network for Person Re-identification.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

WaveCSN: Cascade Segmentation Network for Hip Landmark Detection.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Adaptive Bilinear Pooling for Fine-grained Representation Learning.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Question-Aware Tube-Switch Network for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Domain-Specific Embedding Network for Zero-Shot Recognition.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

ACE-Net: Biomedical Image Segmentation with Augmented Contracting and Expansive Paths.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Deep Cascaded Attention Network for Multi-task Brain Tumor Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Misshapen Pelvis Landmark Detection by Spatial Local Correlation Mining for Diagnosing Developmental Dysplasia of the Hip.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Extract Bone Parts Without Human Prior: End-to-end Convolutional Neural Network for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

DSRN: A Deep Scale Relationship Network for Scene Text Detection.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning to Draw Text in Natural Images with Conditional Adversarial Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Semi-supervised User Profiling with Heterogeneous Graph Attention Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

MLTS: A Multi-Language Scene Text Spotter.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Semantic-Embedding and Shape-Aware U-Net for Ultrasound Eyeball Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Accurate Segmentation of Synaptic Cleft with Contour Growing Concatenated with a Convnet.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Robust Deep Co-Saliency Detection with Group Semantic.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
A Fast Uyghur Text Detector for Complex Background Images.
IEEE Trans. Multim., 2018

Supervised Hash Coding With Deep Neural Network for Environment Perception of Intelligent Vehicles.
IEEE Trans. Intell. Transp. Syst., 2018

Effective Uyghur Language Text Detection in Complex Background Images for Traffic Prompt Identification.
IEEE Trans. Intell. Transp. Syst., 2018

Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.
Neuroinformatics, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
CoRR, 2018

Potential of Attention Mechanism for Classification of Optical Coherence Tomography Images.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Temporal-Contextual Attention Network for Video-Based Person Re-identification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Uyghur Text Localization with Fast Component Detection.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Convolutional Nets for Pulmonary Nodule Detection and Classification.
Proceedings of the Knowledge Science, Engineering and Management, 2018

2017
Triple-Bit Quantization with Asymmetric Distance for Image Content Security.
Mach. Vis. Appl., 2017

Robust and parallel Uyghur text localization in complex background images.
Mach. Vis. Appl., 2017

Residual domain dictionary learning for compressed sensing video recovery.
Multim. Tools Appl., 2017

Detecting Uyghur text in complex background images with convolutional neural network.
Multim. Tools Appl., 2017

RICS-DFA: a space and time-efficient signature matching algorithm with Reduced Input Character Set.
Concurr. Comput. Pract. Exp., 2017

Uyghur Language Text Detection in Complex Background Images Using Enhanced MSERs.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

CPMF: A collective pairwise matrix factorization model for upcoming event recommendation.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Supervised deep quantization for efficient image search.
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Double-bit quantization and weighting for nearest neighbor search.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A new dataset for hand gesture estimation.
Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing, 2017

2016
Triple-Bit Quantization with Asymmetric Distance for Nearest Neighbor Search.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Context-Oriented Name-Face Association in Web Videos.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Robust Uyghur Text Localization in Complex Background Images.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

2015
Corrigendum to "Fast and scalable lock methods for video coding on many-core architecture" [J. Visual Communication and Image Representation 25 (7) (2014) 1758-1762].
J. Vis. Commun. Image Represent., 2015

Fast Search with Data-Oriented Multi-Index Hashing for Multimedia Data.
KSII Trans. Internet Inf. Syst., 2015

Fast approximate matching of binary codes with distinctive bits.
Frontiers Comput. Sci., 2015

Hierarchical Encoding of Binary Descriptors for Image Matching.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Data-oriented multi-index hashing.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

2014
Contextual Query Expansion for Image Retrieval.
IEEE Trans. Multim., 2014

Extracting salient region for pornographic image detection.
J. Vis. Commun. Image Represent., 2014

Fast and scalable lock methods for video coding on many-core architecture.
J. Vis. Commun. Image Represent., 2014

Fusing audio vocabulary with visual features for pornographic video detection.
Future Gener. Comput. Syst., 2014

Data-Dependent Locality Sensitive Hashing.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Fast Search of Binary Codes with Distinctive Bits.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Video to Article Hyperlinking by Multiple Tag Property Exploration.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

The study of methods for post-pruning decision trees based on comprehensive evaluation standard.
Proceedings of the 11th International Conference on Fuzzy Systems and Knowledge Discovery, 2014

Distributed online similarity search in high dimensional space.
Proceedings of the International Conference on Big Data and Smart Computing, BIGCOMP 2014, 2014

2013
Robust common visual pattern discovery using graph matching.
J. Vis. Commun. Image Represent., 2013

2012
Application research of Web3D technology in three-dimensional show of oil well pitshaft information.
Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, 2012

2011
Efficient Feature Detection and Effective Post-Verification for Large Scale Near-Duplicate Image Search.
IEEE Trans. Multim., 2011

Common visual pattern discovery via graph matching.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Pairwise weak geometric consistency for large scale image search.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

Local geometric consistency constraint for image retrieval.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

2010
GPU-based fast scale invariant interest point detector.
Proceedings of the IEEE International Conference on Acoustics, 2010

Effective and Efficient Image Copy Detection Based on GPU.
Proceedings of the Trends and Topics in Computer Vision, 2010

2008
Stereo effect of image converted from planar.
Inf. Sci., 2008

