Jingqun Tang
Orcid: 0000-0003-2577-0119
According to our database, Jingqun Tang authored at least 38 papers between 2022 and 2026.
Bibliography
2026
ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation.
CoRR, March, 2026
CoRR, March, 2026
CoRR, February, 2026
TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering.
CoRR, February, 2026
CoRR, February, 2026
DTP: A Simple yet Effective Distracting Token Pruning Framework for Vision-Language Action Models.
CoRR, January, 2026
Proceedings of the Companion Proceedings of the ACM Web Conference 2026, 2026
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CME-CAD: Heterogeneous Collaborative Multi-Expert Reinforcement Learning for CAD Code Generation.
CoRR, December, 2025
Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control.
CoRR, December, 2025
Resolving Evidence Sparsity: Agentic Context Engineering for Long-Document Understanding.
CoRR, November, 2025
ChineseVideoBench: Benchmarking Multi-modal Large Models for Chinese Video Question Answering.
CoRR, November, 2025
Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning.
CoRR, September, 2025
Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning.
CoRR, May, 2025
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
CoRR, May, 2025
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
MINDEV: Multi-modal Integrated Diffusion Framework for Video Reconstruction from EEG Signals.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
A Bounding Box is Worth One Token - Interleaving Layout and Text in a Large Language Model for Document Understanding.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
CoRR, 2024
DocPedia: unleashing the power of large multimodal model in the frequency domain for versatile document understanding.
Sci. China Inf. Sci., 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding.
CoRR, 2023
2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022
Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022