Conghui He
Orcid: 0000-0001-8697-695X
According to our database1,
Conghui He
authored at least 154 papers
between 2014 and 2025.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2025
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving.
CoRR, August, 2025
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation.
CoRR, August, 2025
Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions.
CoRR, August, 2025
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning.
CoRR, July, 2025
CoRR, July, 2025
CoRR, July, 2025
Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models.
CoRR, June, 2025
CoRR, June, 2025
GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition.
CoRR, June, 2025
CoRR, June, 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning.
CoRR, June, 2025
CoRR, May, 2025
Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering.
CoRR, May, 2025
CoRR, May, 2025
CoRR, May, 2025
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models.
CoRR, May, 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges.
CoRR, April, 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis.
CoRR, April, 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.
CoRR, April, 2025
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding.
CoRR, April, 2025
CoRR, April, 2025
CoRR, March, 2025
PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model.
CoRR, March, 2025
CoRR, March, 2025
CoRR, March, 2025
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation.
CoRR, March, 2025
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer.
CoRR, March, 2025
CoRR, February, 2025
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More.
CoRR, February, 2025
CoRR, February, 2025
CoRR, February, 2025
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation.
CoRR, February, 2025
CoRR, January, 2025
Fine-grained building function recognition with street-view images and GIS map data via geometry-aware semi-supervised learning.
Int. J. Appl. Earth Obs. Geoinformation, 2025
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
Exploring the user guidance for more accurate building segmentation from high-resolution remote sensing images.
Int. J. Appl. Earth Obs. Geoinformation, February, 2024
IEEE Trans. Multim., 2024
IEEE Trans. Geosci. Remote. Sens., 2024
CoRR, 2024
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions.
CoRR, 2024
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.
CoRR, 2024
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation.
CoRR, 2024
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction.
CoRR, 2024
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction.
CoRR, 2024
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception.
CoRR, 2024
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models.
CoRR, 2024
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search.
CoRR, 2024
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning.
CoRR, 2024
SkyDiffusion: Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm.
CoRR, 2024
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models.
CoRR, 2024
CoRR, 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024
CoRR, 2024
CoRR, 2024
FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation.
CoRR, 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.
CoRR, 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024
How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.
Sci. China Inf. Sci., 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.
CoRR, 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023
MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models.
CoRR, 2023
WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.
CoRR, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark.
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Proceedings of the UIST '20 Adjunct: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data.
Remote. Sens., 2019
A Real-Time Tree Crown Detection Approach for Large-Scale Remote Sensing Images on FPGAs.
Remote. Sens., 2019
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019
2018
Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2018
Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018
swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.
Proceedings of the IEEE International Conference on Cluster Computing, 2018
2017
IEEE Trans. Computers, 2017
18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios.
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the International Conference on Field Programmable Technology, 2017
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017
2016
Comput. Geosci., 2016
Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016
2014
Global-Scale Associations of Vegetation Phenology with Rainfall and Temperature at a High Spatio-Temporal Resolution.
Remote. Sens., 2014