Hongtao Xie
Orcid: 0009-0005-5407-2581
According to our database, Hongtao Xie authored at least 218 papers between 2008 and 2025.
Bibliography
2025
Int. J. Mach. Learn. Cybern., August, 2025
Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design.
CoRR, August, 2025
GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation.
CoRR, July, 2025
Distilling Multi-Level Semantic Cues Across Multi-Modalities for Face Forgery Detection.
IEEE Trans. Circuits Syst. Video Technol., May, 2025
IEEE Trans. Circuits Syst. Video Technol., May, 2025
CoRR, May, 2025
IEEE Trans. Circuits Syst. Video Technol., April, 2025
IEEE Trans. Pattern Anal. Mach. Intell., April, 2025
Int. J. Mach. Learn. Cybern., March, 2025
Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.
CoRR, March, 2025
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability.
CoRR, March, 2025
What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Coverage of MLLMs.
CoRR, February, 2025
THGS: Lifelike Talking Human Avatar Synthesis From Monocular Video Via 3D Gaussian Splatting.
Comput. Graph. Forum, February, 2025
Exploiting Pre-Trained Language Models for Black-Box Attack against Knowledge Graph Embeddings.
ACM Trans. Knowl. Discov. Data, January, 2025
Leveraging Concise Concepts With Probabilistic Modeling for Interpretable Visual Recognition.
IEEE Trans. Multim., 2025
IEEE Trans. Image Process., 2025
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
IDseq: Decoupled and Sequentially Detecting and Grounding Multi-Modal Media Manipulation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
STIDNet: Identity-Aware Face Forgery Detection With Spatiotemporal Knowledge Distillation.
IEEE Trans. Comput. Soc. Syst., August, 2024
Int. J. Comput. Vis., February, 2024
IEEE Trans. Multim., 2024
IEEE Trans. Multim., 2024
IEEE Trans. Multim., 2024
IEIRNet: Inconsistency Exploiting Based Identity Rectification for Face Forgery Detection.
IEEE Trans. Multim., 2024
Exploring Bi-Level Inconsistency via Blended Images for Generalizable Face Forgery Detection.
IEEE Trans. Inf. Forensics Secur., 2024
IEEE Trans. Circuits Syst. Video Technol., 2024
DCFP: Distribution Calibrated Filter Pruning for Lightweight and Accurate Long-Tail Semantic Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2024
Generalizable Speech Spoofing Detection Against Silence Trimming With Data Augmentation and Multi-Task Meta-Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Comput. Vis. Image Underst., 2024
A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions.
CoRR, 2024
CoRR, 2024
Pistis-RAG: A Scalable Cascading Framework Towards Trustworthy Retrieval-Augmented Generation.
CoRR, 2024
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing.
CoRR, 2024
TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Control-Talker: A Rapid-Customization Talking Head Generation Method for Multi-Condition Control and High-Texture Enhancement.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Leveraging Text Localization for Scene Text Removal via Text-Aware Masked Image Modeling.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Knowledge Context Modeling with Pre-trained Language Models for Contrastive Knowledge Graph Completion.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Discriminative Feature Mining Based on Frequency Information and Metric Learning for Face Forgery Detection.
IEEE Trans. Knowl. Data Eng., December, 2023
Health Inf. Sci. Syst., December, 2023
ACM Trans. Web, August, 2023
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023
IEEE Trans. Circuits Syst. Video Technol., April, 2023
Multi-task hourglass network for online automatic diagnosis of developmental dysplasia of the hip.
World Wide Web (WWW), March, 2023
ACM Trans. Multim. Comput. Commun. Appl., February, 2023
IEEE Trans. Multim., 2023
IEEE Trans. Multim., 2023
What is the Real Need for Scene Text Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties.
IEEE Trans. Image Process., 2023
IEEE Trans. Image Process., 2023
Learning Complete Topology-Aware Correlations Between Relations for Inductive Link Prediction.
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Masked Text Modeling: A Self-Supervised Pre-training Method for Scene Text Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the 2023 International Conference on Communication Network and Machine Learning, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Multimodal Learning for Temporally Coherent Talking Face Generation With Articulator Synergy.
IEEE Trans. Multim., 2022
Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.
IEEE Trans. Multim., 2022
IEEE Trans. Multim., 2022
ACM Trans. Intell. Syst. Technol., 2022
PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition.
IEEE Trans. Image Process., 2022
IEEE Trans. Image Process., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
IEEE Signal Process. Lett., 2022
Int. J. Intell. Syst., 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Weakly Supervised Pediatric Bone Age Assessment Using Ultrasonic Images via Automatic Anatomical RoI Detection.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
IEEE Trans. Multim., 2021
IEEE Trans. Multim., 2021
IEEE Trans. Medical Imaging, 2021
Self-Supervised Attention Mechanism for Pediatric Bone Age Assessment With Efficient Weak Annotation.
IEEE Trans. Medical Imaging, 2021
Pattern Recognit., 2021
Hierarchical multi-view context modelling for 3D object classification and retrieval.
Inf. Sci., 2021
A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks.
CoRR, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Cluster and Scatter: A Multi-grained Active Semi-supervised Learning Framework for Scalable Person Re-identification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Proceedings of the Image and Graphics - 11th International Conference, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
IEEE Trans. Neural Networks Learn. Syst., 2020
IEEE Trans. Multim., 2020
Misshapen Pelvis Landmark Detection With Local-Global Feature Learning for Diagnosing Developmental Dysplasia of the Hip.
IEEE Trans. Medical Imaging, 2020
IEEE Trans. Image Process., 2020
IEEE Trans. Image Process., 2020
Global context and boundary structure-guided network for cross-modal organ segmentation.
Inf. Process. Manag., 2020
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Improving Brain Tumor Segmentation with Dilated Pseudo-3D Convolution and Multi-direction Fusion.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020
Law Is Order: Protecting Multimedia Network Transmission by Game Theory and Mechanism Design.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020
ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
ACM Trans. Multim. Comput. Commun. Appl., 2019
IEEE Trans. Multim., 2019
Automated pulmonary nodule detection in CT images using deep convolutional neural networks.
Pattern Recognit., 2019
Int. J. High Perform. Comput. Netw., 2019
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019
Proceedings of the 27th ACM International Conference on Multimedia, 2019
Proceedings of the 27th ACM International Conference on Multimedia, 2019
ACE-Net: Biomedical Image Segmentation with Augmented Contracting and Expansive Paths.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019
Misshapen Pelvis Landmark Detection by Spatial Local Correlation Mining for Diagnosing Developmental Dysplasia of the Hip.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019
Extract Bone Parts Without Human Prior: End-to-end Convolutional Neural Network for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019
Accurate Segmentation of Synaptic Cleft with Contour Growing Concatenated with a Convnet.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Supervised Hash Coding With Deep Neural Network for Environment Perception of Intelligent Vehicles.
IEEE Trans. Intell. Transp. Syst., 2018
Effective Uyghur Language Text Detection in Complex Background Images for Traffic Prompt Identification.
IEEE Trans. Intell. Transp. Syst., 2018
Neuroinformatics, 2018
CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
CoRR, 2018
Potential of Attention Mechanism for Classification of Optical Coherence Tomography Images.
Proceedings of the IEEE Visual Communications and Image Processing, 2018
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018
CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018
Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018
Proceedings of the Knowledge Science, Engineering and Management, 2018
2017
Mach. Vis. Appl., 2017
Mach. Vis. Appl., 2017
Multim. Tools Appl., 2017
Detecting Uyghur text in complex background images with convolutional neural network.
Multim. Tools Appl., 2017
RICS-DFA: a space and time-efficient signature matching algorithm with Reduced Input Character Set.
Concurr. Comput. Pract. Exp., 2017
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017
CPMF: A collective pairwise matrix factorization model for upcoming event recommendation.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing, 2017
2016
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016
2015
Corrigendum to "Fast and scalable lock methods for video coding on many-core architecture" [J. Visual Communication and Image Representation 25(7) (2014) 1758-1762].
J. Vis. Commun. Image Represent., 2015
KSII Trans. Internet Inf. Syst., 2015
Frontiers Comput. Sci., 2015
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015
2014
J. Vis. Commun. Image Represent., 2014
J. Vis. Commun. Image Represent., 2014
Future Gener. Comput. Syst., 2014
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014
The study of methods for post-pruning decision trees based on comprehensive evaluation standard.
Proceedings of the 11th International Conference on Fuzzy Systems and Knowledge Discovery, 2014
Proceedings of the International Conference on Big Data and Smart Computing, BIGCOMP 2014, 2014
2013
J. Vis. Commun. Image Represent., 2013
2012
Application research of Web3D technology in three-dimensional show of oil well pitshaft information.
Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, 2012
2011
Efficient Feature Detection and Effective Post-Verification for Large Scale Near-Duplicate Image Search.
IEEE Trans. Multim., 2011
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011
Proceedings of the 18th IEEE International Conference on Image Processing, 2011
2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the Trends and Topics in Computer Vision, 2010
2008