Hao Feng

Orcid: 0000-0001-8127-6639

Affiliations:
  • University of Science and Technology of China, Department of Electronic Engineering and Information Science, CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Hefei, China


According to our database, Hao Feng authored at least 37 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DocScanner: Robust Document Image Rectification with Progressive Learning.
Int. J. Comput. Vis., August, 2025

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement.
CoRR, August, 2025

Post-Completion Learning for Language Models.
CoRR, July, 2025

MCP-Zero: Active Tool Discovery for Autonomous LLM Agents.
CoRR, June, 2025

Advancing Sequential Numerical Prediction in Autoregressive Models.
CoRR, May, 2025

WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
CoRR, May, 2025

Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models.
CoRR, March, 2025

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning.
CoRR, January, 2025

DeepEraser: Deep Iterative Context Mining for Generic Text Eraser.
IEEE Trans. Multim., 2025

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Recurrent Generic Contour-Based Instance Segmentation With Progressive Learning.
IEEE Trans. Circuits Syst. Video Technol., September, 2024

Rethinking Supervision in Document Unwarping: A Self-Consistent Flow-Free Approach.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Progressive Recurrent Network for Shadow Removal.
Comput. Vis. Image Underst., January, 2024

Deep Unrestricted Document Image Rectification.
IEEE Trans. Multim., 2024

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding.
CoRR, 2024

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation.
CoRR, 2024

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding.
CoRR, 2024

RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation.
CoRR, 2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy.
CoRR, 2024

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering.
CoRR, 2024

TextSquare: Scaling up Text-Centric Visual Instruction Tuning.
CoRR, 2024

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding.
CoRR, 2024

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding.
Sci. China Inf. Sci., 2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Progressive Multi-modal Conditional Prompt Tuning.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

2023
Model-Aware Pre-Training for Radial Distortion Rectification.
IEEE Trans. Image Process., 2023

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs.
CoRR, 2023

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding.
CoRR, 2023

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding.
CoRR, 2023

Recurrent Contour-based Instance Segmentation with Progressive Learning.
CoRR, 2023

DocMAE: Document Image Rectification via Self-supervised Representation Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Sign Language Translation with Iterative Prototype.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
PolyTracker: Progressive Contour Regression for Multiple Object Tracking and Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022

Geometric Representation Learning for Document Image Rectification.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
