Conghui He

Orcid: 0000-0001-8697-695X

According to our database, Conghui He authored at least 173 papers between 2014 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild.

CoRR, November, 2025

RxnCaption: Reformulating Reaction Diagram Parsing as Visual Prompt Guided Captioning.

CoRR, November, 2025

OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation.

CoRR, October, 2025

Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs.

CoRR, October, 2025

Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data?

CoRR, October, 2025

BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception.

CoRR, October, 2025

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs.

CoRR, October, 2025

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning.

CoRR, October, 2025

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation.

CoRR, October, 2025

LLM/Agent-as-Data-Analyst: A Survey.

CoRR, September, 2025

Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents.

CoRR, September, 2025

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing.

CoRR, September, 2025

ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning.

CoRR, September, 2025

From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature.

CoRR, September, 2025

Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning.

CoRR, August, 2025

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers.

CoRR, August, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency.

CoRR, August, 2025

Intern-S1: A Scientific Multimodal Foundation Model.

CoRR, August, 2025

Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving.

CoRR, August, 2025

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation.

CoRR, August, 2025

Geoint-R1: Formalizing Multimodal Geometric Reasoning with Dynamic Auxiliary Constructions.

CoRR, August, 2025

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning.

CoRR, July, 2025

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs.

CoRR, July, 2025

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once.

CoRR, July, 2025

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models.

CoRR, June, 2025

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos.

CoRR, June, 2025

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition.

CoRR, June, 2025

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification.

CoRR, June, 2025

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning.

CoRR, June, 2025

Shifting AI Efficiency From Model-Centric to Data-Centric Compression.

CoRR, May, 2025

A Survey of LLM ⨉ DATA.

CoRR, May, 2025

Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering.

CoRR, May, 2025

IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment.

CoRR, May, 2025

Not All Documents Are What You Need for Extracting Instruction Tuning Data.

CoRR, May, 2025

SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models.

CoRR, May, 2025

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges.

CoRR, April, 2025

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis.

CoRR, April, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

CoRR, April, 2025

FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding.

CoRR, April, 2025

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation.

CoRR, April, 2025

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework.

CoRR, March, 2025

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model.

CoRR, March, 2025

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion.

CoRR, March, 2025

LEGION: Learning to Ground and Explain for Synthetic Image Detection.

CoRR, March, 2025

Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation.

CoRR, March, 2025

MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer.

CoRR, March, 2025

Unsupervised Topic Models are Data Mixers for Pre-training Language Models.

CoRR, February, 2025

Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More.

CoRR, February, 2025

A Comprehensive Survey on Imbalanced Data Learning.

CoRR, February, 2025

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models.

CoRR, February, 2025

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT.

CoRR, February, 2025

GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation.

CoRR, February, 2025

WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages.

CoRR, January, 2025

Fine-grained building function recognition with street-view images and GIS map data via geometry-aware semi-supervised learning.

Int. J. Appl. Earth Obs. Geoinformation, 2025

GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MLLM-DataEngine: Closing the Loop of Multimodal Instruction Tuning Data Generation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Beyond Multimodal Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Conical Visual Concentration for Efficient Large Vision-Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Dataset Distillation with Neural Characteristic Function: A Minmax Perspective.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

OpenHuEval: Evaluating Large Language Model on Hungarian Specifics.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?

Proceedings of the Findings of the Association for Computational Linguistics, 2025

MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

LEMMA: Learning from Errors for MatheMatical Advancement in LLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement.

Maosongcao Maosongcao

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

A Strategic Coordination Framework of Small LMs Matches Large LMs in Data Synthesis.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Exploring the user guidance for more accurate building segmentation from high-resolution remote sensing images.

Int. J. Appl. Earth Obs. Geoinformation, February, 2024

DropQueries: A Simple Way to Discover Comprehensive Segment Representations.

IEEE Trans. Multim., 2024

Weakly Supervised 3-D Building Reconstruction From Monocular Remote Sensing Images.

IEEE Trans. Geosci. Remote. Sens., 2024

Accelerating Diffusion Transformers with Dual Feature Caching.

CoRR, 2024

Where am I? Cross-View Geo-localization with Natural Language Descriptions.

CoRR, 2024

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions.

CoRR, 2024

Chimera: Improving Generalist Model with Domain-Specific Experts.

CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation.

CoRR, 2024

Can LLMs be Good Graph Judger for Knowledge Graph Construction?

CoRR, 2024

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction.

Qintong Zhang

Victor Shea-Jay Huang

CoRR, 2024

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction.

CoRR, 2024

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception.

CoRR, 2024

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.

CoRR, 2024

Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models.

CoRR, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction.

CoRR, 2024

BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search.

CoRR, 2024

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.

CoRR, 2024

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation.

CoRR, 2024

CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis.

CoRR, 2024

Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning.

CoRR, 2024

SkyDiffusion: Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm.

CoRR, 2024

Synth-Empathy: Towards High-Quality Synthetic Empathy Data.

CoRR, 2024

SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models.

CoRR, 2024

OpenDataLab: Empowering General Artificial Intelligence with Open Datasets.

CoRR, 2024

Navigating the Data Trading Crossroads: An Interdisciplinary Survey.

CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

CoRR, 2024

KeyVideoLLM: Towards Large-scale Video Keyframe Selection.

CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

CoRR, 2024

DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data.

CoRR, 2024

A Survey of Multimodal Large Language Model from A Data-centric Perspective.

CoRR, 2024

FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models.

CoRR, 2024

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition.

CoRR, 2024

H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model.

CoRR, 2024

InternLM2 Technical Report.

et al.

CoRR, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.

CoRR, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.

CoRR, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset.

CoRR, 2024

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation.

CoRR, 2024

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.

CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.

CoRR, 2024

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.

Sci. China Inf. Sci., 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LongWanjuan: Towards Systematic Measurement for Long Text Quality.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-Training.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network.

Proceedings of the Computer Vision - ECCV 2024, 2024

MMBench: Is Your Multi-modal Model an All-Around Player?

Proceedings of the Computer Vision - ECCV 2024, 2024

Parrot Captions Teach CLIP to Spot Text.

Proceedings of the Computer Vision - ECCV 2024, 2024

ShareGPT4V: Improving Large Multi-modal Models with Better Captions.

Proceedings of the Computer Vision - ECCV 2024, 2024

SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

VIGC: Visual Instruction Generation and Correction.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.

CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.

CoRR, 2023

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models.

CoRR, 2023

MLLM-DataEngine: An Iterative Refinement Approach for MLLM.

CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.

CoRR, 2023

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model.

CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SEPT: Towards Scalable and Efficient Visual Pre-training.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Unified Interactive Image Matting.

CoRR, 2022

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

INTERN: A New Learning Paradigm Towards General Vision.

CoRR, 2021

Influence Selection for Active Learning.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

3D Building Reconstruction from Monocular Remote Sensing Images.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Joint Semantic-geometric Learning for Polygonal Building Segmentation.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-based Point Clouds.

Proceedings of the UIST '20 Adjunct: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020

2019

Optimizing Finite Volume Method Solvers on Nvidia GPUs.

IEEE Trans. Parallel Distributed Syst., 2019

Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data.

Remote. Sens., 2019

A Real-Time Tree Crown Detection Approach for Large-Scale Remote Sensing Images on FPGAs.

Remote. Sens., 2019

Finding Mutual X at WeChat-Scale Social Network in Ten Minutes.

Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018

Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight.

Proceedings of the International Conference for High Performance Computing, 2018

Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.

Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017

A Fully-Pipelined Hardware Design for Gaussian Mixture Models.

IEEE Trans. Computers, 2017

18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios.

Proceedings of the International Conference for High Performance Computing, 2017

An FPGA-based tree crown detection approach for remote sensing images.

Proceedings of the International Conference on Field Programmable Technology, 2017

Exploring the potential of reconfigurable platforms for order book update.

Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Accelerating Financial Market Server through Hybrid List Design (Abstract Only).

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

A Nanosecond-Level Hybrid Table Design for Financial Market Data Generators.

Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

2016

A time-space domain stereo finite difference method for 3D scalar wave propagation.

Comput. Geosci., 2016

Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.

Proceedings of the International Conference for High Performance Computing, 2016

2014

Global-Scale Associations of Vegetation Phenology with Rainfall and Temperature at a High Spatio-Temporal Resolution.

Remote. Sens., 2014
