Gongwei Chen

Orcid: 0000-0002-0634-6075

According to our database, Gongwei Chen authored at least 30 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation.
CoRR, August, 2025

Enhancing Diffusion-based Dataset Distillation via Adversary-Guided Curriculum Sampling.
CoRR, August, 2025

PUMA: Layer-Pruned Language Model for Efficient Unified Multimodal Retrieval with Modality-Adaptive Learning.
CoRR, July, 2025

Less is More: Empowering GUI Agent with Context-Aware Simplification.
CoRR, July, 2025

Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills.
CoRR, June, 2025

Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts.
CoRR, June, 2025

D2AF: A Dual-Driven Annotation and Filtering Framework for Visual Grounding.
CoRR, May, 2025

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers.
CoRR, January, 2025

Spa-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding.
CoRR, 2024

Enhancing the Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought.
CoRR, 2024

Deep learning model for the automated detection and classification of central canal and neural foraminal stenosis upon cervical spine magnetic resonance imaging.
BMC Medical Imaging, 2024

MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Composite Object Relation Modeling for Few-Shot Scene Recognition.
IEEE Trans. Image Process., 2023

2022
An Accurate and Efficient Large-Scale Regression Method Through Best Friend Clustering.
IEEE Trans. Parallel Distributed Syst., 2022

Amorphous Region Context Modeling for Scene Recognition.
IEEE Trans. Multim., 2022

2021
An Accurate and Efficient Large-scale Regression Method through Best Friend Clustering.
CoRR, 2021

See More for Scene: Pairwise Consistency Learning for Scene Classification.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Learning Scene Attribute for Scene Recognition.
IEEE Trans. Multim., 2020

Image Representations With Spatial Object-to-Object Relations for RGB-D Scene Recognition.
IEEE Trans. Image Process., 2020

Scene Recognition With Prototype-Agnostic Scene Layout.
IEEE Trans. Image Process., 2020

2019
Deep Patch Representations with Shared Codebook for Scene Classification.
ACM Trans. Multim. Comput. Commun. Appl., 2019

MUCH: Mutual Coupling Enhancement of Scene Recognition and Dense Captioning.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Scene Recognition with Comprehensive Regions Graph Modeling.
Proceedings of the Image and Graphics - 10th International Conference, 2019

