Hongfa Wang

Orcid: 0000-0001-8230-9471

According to our database, Hongfa Wang authored at least 53 papers between 2008 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization.

CoRR, May, 2026

Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation.

Int. J. Comput. Vis., March, 2026

VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Global and Local Semantic Completion Learning for Vision-Language Pre-Training.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2025

Object-AVEdit: An Object-level Audio-Visual Editing Model.

CoRR, October, 2025

Efficient Quantification of Multimodal Interaction at Sample Level.

Zequn Yang

Hongfa Wang

Di Hu

CoRR, June, 2025

BalanceBenchmark: A Survey for Multimodal Imbalance Learning.

CoRR, February, 2025

Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions With Multi-Level Representations.

IEEE Access, 2025

Efficient Quantification of Multimodal Interaction at Sample Level.

Zequn Yang

Hongfa Wang

Di Hu

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Instructseg: Unifying Instructed Visual Segmentation with Multi-Modal Large Language Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Inverse-Like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling.

IEEE Trans. Image Process., 2024

HunyuanVideo: A Systematic Framework For Large Video Generative Models.

CoRR, 2024

Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation.

CoRR, 2024

Video-Language Alignment via Spatio-Temporal Graph Transformer.

CoRR, 2024

Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control.

CoRR, 2024

Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts.

CoRR, 2024

Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation.

Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Unsupervised Cross-Modal Hashing With Modality-Interaction.

IEEE Trans. Circuits Syst. Video Technol., September, 2023

Deep Cross-Modal Proxy Hashing.

IEEE Trans. Knowl. Data Eng., July, 2023

Unsupervised Hashing with Semantic Concept Mining.

Proc. ACM Manag. Data, 2023

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.

CoRR, 2023

Img2Vec: A Teacher of High Token-Diversity Helps Masked AutoEncoders.

CoRR, 2023

MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

MAP: Modality-Agnostic Uncertainty-Aware Vision-Language Pre-training Model.

CoRR, 2022

Adaptive Perception Transformer for Temporal Action Localization.

CoRR, 2022

DPTNet: A Dual-Path Transformer Architecture for Scene Text Detection.

CoRR, 2022

Boosting Multi-Modal E-commerce Attribute Value Extraction via Unified Learning Scheme and Dynamic Range Minimization.

CoRR, 2022

Egocentric Video-Language Pretraining @ Ego4D Challenge 2022.

CoRR, 2022

Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022.

CoRR, 2022

Egocentric Video-Language Pretraining.

CoRR, 2022

HunYuan_tvr for Text-Video Retrievial.

CoRR, 2022

Egocentric Video-Language Pretraining.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CAliC: Accurate and Efficient Image-Text Retrieval via Contrastive Alignment and Visual Contexts Modeling.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Deep Unsupervised Hashing with Latent Semantic Components.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Detecting Text in Scene and Traffic Guide Panels With Attention Anchor Mechanism.

IEEE Trans. Intell. Transp. Syst., 2021

Correction to: Prediction of urban water accumulation points and water accumulation process based on machine learning.

Earth Sci. Informatics, 2021

Prediction of urban water accumulation points and water accumulation process based on machine learning.

Earth Sci. Informatics, 2021

Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

HAM: Hidden Anchor Mechanism for Scene Text Detection.

IEEE Trans. Image Process., 2020

Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Detecting Advertising Materials via Multi-Scale Instance Segmentation Network.

Aust. J. Intell. Inf. Process. Syst., 2019

2017

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition.

CoRR, 2017

2014

Weighted Cache Location Problem with Identical Servers.

Hongfa Wang

Wei Ding

J. Appl. Math., 2014

2013

An Approach to Online Recommendation of Products with High Price-Performance Ratios Based on a Customized Price-Dominance Relationship.

Hongfa Wang

Chen Xing

J. Softw., 2013

On the 2-MRS Problem in a Tree with Unreliable Edges.

J. Appl. Math., 2013

2010

Genetic Algorithm-Based Evaluation Model of Teaching Quality.

Proceedings of the Third International Symposium on Intelligent Information Technology and Security Informatics, 2010

2008

Singular Points Detection Based on Zero-Pole Model in Fingerprint Images.

IEEE Trans. Pattern Anal. Mach. Intell., 2008
