Hongtao Xie

ORCID: 0000-0002-6249-5315

Affiliations:

University of Science and Technology of China, Hefei, China

According to our database, Hongtao Xie authored at least 165 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting.

IEEE Trans. Pattern Anal. Mach. Intell., May, 2026

Few-shot bone age assessment via contrast relation adaptation.

Int. J. Mach. Learn. Cybern., May, 2026

Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement.

CoRR, March, 2026

RegionRAG: Region-level Retrieval-Augmented Generation for Visual Document Understanding.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement.

CoRR, December, 2025

Alternating Perception-Reasoning for Hallucination-Resistant Video Understanding.

CoRR, November, 2025

RegionRAG: Region-level Retrieval-Augumented Generation for Visually-Rich Documents.

CoRR, October, 2025

Attention-driven frequency-based Zero-Shot Learning with phase augmentation.

Int. J. Mach. Learn. Cybern., August, 2025

Distilling Multi-Level Semantic Cues Across Multi-Modalities for Face Forgery Detection.

IEEE Trans. Circuits Syst. Video Technol., May, 2025

ReferSAM: Unleashing Segment Anything Model for Referring Image Segmentation.

IEEE Trans. Circuits Syst. Video Technol., May, 2025

From Evaluation to Defense: Advancing Safety in Video Large Language Models.

CoRR, May, 2025

A Detail-Aware Transformer to Generalizable Face Forgery Detection.

IEEE Trans. Circuits Syst. Video Technol., April, 2025

DHVT: Dynamic Hybrid Vision Transformer for Small Dataset Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

Self-refined variational transformer for image-conditioned layout generation.

Int. J. Mach. Learn. Cybern., March, 2025

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.

CoRR, March, 2025

What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Coverage of MLLMs.

CoRR, February, 2025

Exploiting Pre-Trained Language Models for Black-Box Attack against Knowledge Graph Embeddings.

ACM Trans. Knowl. Discov. Data, January, 2025

Leveraging Concise Concepts With Probabilistic Modeling for Interpretable Visual Recognition.

IEEE Trans. Multim., 2025

Masked Text Pre-Training for Scene Text Detection.

IEEE Trans. Multim., 2025

High Fidelity Face Swapping via Facial Texture and Structure Consistency Mining.

IEEE Trans. Multim., 2025

Denoised and Dynamic Alignment Enhancement for Zero-Shot Learning.

IEEE Trans. Image Process., 2025

IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

IGD: Instructional Graphic Design With Multimodal Layer Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

CLIP-Adapted Region-to-Text Learning for Generative Open-Vocabulary Semantic Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

IDseq: Decoupled and Sequentially Detecting and Grounding Multi-Modal Media Manipulation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

STIDNet: Identity-Aware Face Forgery Detection With Spatiotemporal Knowledge Distillation.

IEEE Trans. Comput. Soc. Syst., August, 2024

CDistNet: Perceiving Multi-domain Character Distance for Robust Text Recognition.

Int. J. Comput. Vis., February, 2024

Semantic-Enhanced Proxy-Guided Hashing for Long-Tailed Image Retrieval.

IEEE Trans. Multim., 2024

Balanced Classification: A Unified Framework for Long-Tailed Object Detection.

IEEE Trans. Multim., 2024

Towards Discriminative Feature Generation for Generalized Zero-Shot Learning.

IEEE Trans. Multim., 2024

IEIRNet: Inconsistency Exploiting Based Identity Rectification for Face Forgery Detection.

IEEE Trans. Multim., 2024

Exploring Bi-Level Inconsistency via Blended Images for Generalizable Face Forgery Detection.

IEEE Trans. Inf. Forensics Secur., 2024

Cascade Semantic Prompt Alignment Network for Image Captioning.

IEEE Trans. Circuits Syst. Video Technol., 2024

DCFP: Distribution Calibrated Filter Pruning for Lightweight and Accurate Long-Tail Semantic Segmentation.

IEEE Trans. Circuits Syst. Video Technol., 2024

Generalizable Speech Spoofing Detection Against Silence Trimming With Data Augmentation and Multi-Task Meta-Learning.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions.

CoRR, 2024

Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing.

CoRR, 2024

Hallucination Mitigation Prompts Long-term Video Understanding.

CoRR, 2024

Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing.

CoRR, 2024

Control-Talker: A Rapid-Customization Talking Head Generation Method for Multi-Condition Control and High-Texture Enhancement.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Leveraging Text Localization for Scene Text Removal via Text-Aware Masked Image Modeling.

Proceedings of the Computer Vision - ECCV 2024, 2024

AlignZeg: Mitigating Objective Misalignment for Zero-Shot Semantic Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

OTE: Exploring Accurate Scene Text Recognition Using One Token.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffAM: Diffusion-Based Adversarial Makeup Transfer for Facial Privacy Protection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Knowledge Context Modeling with Pre-trained Language Models for Contrastive Knowledge Graph Completion.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Discriminative Feature Mining Based on Frequency Information and Metric Learning for Face Forgery Detection.

IEEE Trans. Knowl. Data Eng., December, 2023

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Neighborhood-Adaptive Multi-Cluster Ranking for Deep Metric Learning.

IEEE Trans. Circuits Syst. Video Technol., April, 2023

Multi-task hourglass network for online automatic diagnosis of developmental dysplasia of the hip.

World Wide Web (WWW), March, 2023

Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection.

ACM Trans. Multim. Comput. Commun. Appl., February, 2023

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection.

IEEE Trans. Multim., 2023

Learning Cross-Channel Representations for Semantic Segmentation.

IEEE Trans. Multim., 2023

What is the Real Need for Scene Text Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties.

IEEE Trans. Image Process., 2023

Prototypical Matching Networks for Video Object Segmentation.

IEEE Trans. Image Process., 2023

MomentDiff: Generative Video Moment Retrieval from Random to Real.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Frequency-based Zero-Shot Learning with Phase Augmentation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Masked Text Modeling: A Self-Supervised Pre-training Method for Scene Text Detection.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

CARIS: Context-Aware Referring Image Segmentation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

High Fidelity Face Swapping via Semantics Disentanglement and Structure Enhancement.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

RAIRNet: Region-Aware Identity Rectification for Face Forgery Detection.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Difference-Aware Iterative Reasoning Network for Key Relation Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Exploring Stroke-Level Modifications for Scene Text Editing.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Multimodal Learning for Temporally Coherent Talking Face Generation With Articulator Synergy.

Lingyun Yu

Hongtao Xie

Yongdong Zhang

IEEE Trans. Multim., 2022

Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.

IEEE Trans. Multim., 2022

Online Residual Quantization Via Streaming Data Correlation Preserving.

IEEE Trans. Multim., 2022

Dynamic-Aware Federated Learning for Face Forgery Video Detection.

ACM Trans. Intell. Syst. Technol., 2022

PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition.

IEEE Trans. Image Process., 2022

Deep Fourier Ranking Quantization for Semi-Supervised Image Retrieval.

IEEE Trans. Image Process., 2022

Bilateral Temporal Re-Aggregation for Weakly-Supervised Video Object Segmentation.

IEEE Trans. Circuits Syst. Video Technol., 2022

Self-Supervised Synthesis Ranking for Deep Metric Learning.

IEEE Trans. Circuits Syst. Video Technol., 2022

Semi-Supervised Text Detection With Accurate Pseudo-Labels.

IEEE Signal Process. Lett., 2022

Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dual Part Discovery Network for Zero-Shot Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Geometry Aligned Variational Transformer for Image-conditioned Layout Generation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Weakly Supervised Pediatric Bone Age Assessment Using Ultrasonic Images via Automatic Anatomical RoI Detection.

Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Detecting Tampered Scene Text in the Wild.

Proceedings of the Computer Vision - ECCV 2022, 2022

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval.

Proceedings of the Computer Vision - ECCV 2022, 2022

Partial Class Activation Attention for Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Neighborhood-Adaptive Structure Augmented Metric Learning.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

R-Net: A Relationship Network for Efficient and Accurate Scene Text Detection.

IEEE Trans. Multim., 2021

Domain-Oriented Semantic Embedding for Zero-Shot Learning.

IEEE Trans. Multim., 2021

A Mutually Attentive Co-Training Framework for Semi-Supervised Recognition.

IEEE Trans. Multim., 2021

Hip Landmark Detection With Dependency Mining in Ultrasound Image.

IEEE Trans. Medical Imaging, 2021

Self-Supervised Attention Mechanism for Pediatric Bone Age Assessment With Efficient Weak Annotation.

Chuanbin Liu

Hongtao Xie

Yongdong Zhang

IEEE Trans. Medical Imaging, 2021

PRRNet: Pixel-Region relation network for face forgery detection.

Pattern Recognit., 2021

Hierarchical multi-view context modelling for 3D object classification and retrieval.

Inf. Sci., 2021

A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks.

CoRR, 2021

TDI TextSpotter: Taking Data Imbalance into Account in Scene Text Spotting.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

End-to-end Boundary Exploration for Weakly-supervised Semantic Segmentation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Dynamic Inconsistency-aware DeepFake Video Detection.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Global Characteristic Guided Landmark Detection for Genu Valgus and Varus Diagnosis.

Proceedings of the Image and Graphics - 11th International Conference, 2021

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Query-Memory Re-Aggregation for Weakly-supervised Video Object Segmentation.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Semantic-guided Reinforced Region Embedding for Generalized Zero-Shot Learning.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Robust Deep Co-Saliency Detection With Group Semantic and Pyramid Attention.

IEEE Trans. Neural Networks Learn. Syst., 2020

Bidirectional Attention-Recognition Model for Fine-Grained Object Classification.

IEEE Trans. Multim., 2020

Misshapen Pelvis Landmark Detection With Local-Global Feature Learning for Diagnosing Developmental Dysplasia of the Hip.

IEEE Trans. Medical Imaging, 2020

Mining Spatial-Temporal Similarity for Visual Tracking.

IEEE Trans. Image Process., 2020

Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition.

IEEE Trans. Image Process., 2020

Hierarchical Granularity Transfer Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Law Is Order: Protecting Multimedia Network Transmission by Game Theory and Mechanism Design.

Chuanbin Liu

Youliang Tian

Hongtao Xie

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-Features Fusion and Decomposition for Age-Invariant Face Recognition.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Rich Attention for Pediatric Bone Age Assessment.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Real-World Automatic Makeup via Identity Preservation Makeup Net.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Hierarchical Consistency and Refinement for Semi-supervised Medical Segmentation.

Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Graph Structured Network for Image-Text Matching.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Curriculum Learning for Natural Language Understanding.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

CircleNet for Hip Landmark Detection.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Convolutional Attention Networks for Scene Text Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2019

Double-Bit Quantization and Index Hashing for Nearest Neighbor Search.

IEEE Trans. Multim., 2019

Automated pulmonary nodule detection in CT images using deep convolutional neural networks.

Pattern Recognit., 2019

Supervised deep hashing for image content security.

Multim. Tools Appl., 2019

WaveCSN: Cascade Segmentation Network for Hip Landmark Detection.

Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Adaptive Bilinear Pooling for Fine-grained Representation Learning.

Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Domain-Specific Embedding Network for Zero-Shot Recognition.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Misshapen Pelvis Landmark Detection by Spatial Local Correlation Mining for Diagnosing Developmental Dysplasia of the Hip.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Extract Bone Parts Without Human Prior: End-to-end Convolutional Neural Network for Pediatric Bone Age Assessment.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

DSRN: A Deep Scale Relationship Network for Scene Text Detection.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning to Draw Text in Natural Images with Conditional Adversarial Networks.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

MLTS: A Multi-Language Scene Text Spotter.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Semantic-Embedding and Shape-Aware U-Net for Ultrasound Eyeball Segmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Accurate Segmentation of Synaptic Cleft with Contour Growing Concatenated with a Convnet.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Robust Deep Co-Saliency Detection with Group Semantic.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

A Fast Uyghur Text Detector for Complex Background Images.

IEEE Trans. Multim., 2018

Supervised Hash Coding With Deep Neural Network for Environment Perception of Intelligent Vehicles.

IEEE Trans. Intell. Transp. Syst., 2018

Effective Uyghur Language Text Detection in Complex Background Images for Traffic Prompt Identification.

IEEE Trans. Intell. Transp. Syst., 2018

Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.

Neuroinformatics, 2018

Potential of Attention Mechanism for Classification of Optical Coherence Tomography Images.

Proceedings of the IEEE Visual Communications and Image Processing, 2018

Uyghur Text Localization with Fast Component Detection.

Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Convolutional Nets for Pulmonary Nodule Detection and Classification.

Proceedings of the Knowledge Science, Engineering and Management, 2018

2017

Triple-Bit Quantization with Asymmetric Distance for Image Content Security.

Degang Xu

Hongtao Xie

Chenggang Yan

Mach. Vis. Appl., 2017

Robust and parallel Uyghur text localization in complex background images.

Mach. Vis. Appl., 2017

Detecting Uyghur text in complex background images with convolutional neural network.

Multim. Tools Appl., 2017

Uyghur Language Text Detection in Complex Background Images Using Enhanced MSERs.

Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Supervised deep quantization for efficient image search.

Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Double-bit quantization and weighting for nearest neighbor search.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A new dataset for hand gesture estimation.

Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing, 2017

2015

Fast approximate matching of binary codes with distinctive bits.

Chenggang Clarence Yan

Frontiers Comput. Sci., 2015

Hierarchical Encoding of Binary Descriptors for Image Matching.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

2014

Extracting salient region for pornographic image detection.

Chenggang Clarence Yan

J. Vis. Commun. Image Represent., 2014
