Xiaoxin Chen

Affiliations:
  • Vivo AI Lab, Shenzhen, China


According to our database, Xiaoxin Chen authored at least 36 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning.
CoRR, April, 2026

Caption First, VQA Second: Knowledge Density, Not Task Format, Drives Multimodal Scaling.
CoRR, April, 2026

Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents.
CoRR, April, 2026

UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents.
CoRR, February, 2026

Bi-directional Bias Attribution: Debiasing Large Language Models without Modifying Prompts.
CoRR, February, 2026

LENS: Learning to Segment Anything with Unified Reinforced Reasoning.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning.
CoRR, November, 2025

BlueLM-2.5-3B Technical Report.
CoRR, July, 2025

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents.
CoRR, May, 2025

PixelHacker: Image Inpainting with Structural and Semantic Consistency.
CoRR, April, 2025

Progressive Visual Prompt Learning with Contrastive Feature Re-formation.
Int. J. Comput. Vis., February, 2025

FCGhead: Fully Controllable Gaussian Human Heads from Monocular Videos.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2025

Predictive Data Selection: The Data That Predicts Is the Data That Teaches.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

ControlAR: Controllable Image Generation with Autoregressive Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

GenieBlue: Integrating Both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

GTA: Supervised-Guided Reinforcement Learning for Text Classification with Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

SmartBench: Is Your LLM Truly a Good Chinese Smartphone Assistant?
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

EdgeInfinite: A Memory-Efficient Infinite-Context Transformer for Edge Devices.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), 2025

2024
A Learning Rate Path Switching Training Paradigm for Version Updates of Large Language Models.
CoRR, 2024

Efficient Test-Time Prompt Tuning for Vision-Language Models.
CoRR, 2024

EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model.
CoRR, 2024

FAGhead: Fully Animate Gaussian Head from Monocular Videos.
CoRR, 2024

DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

A Learning Rate Path Switching Training Paradigm for Version Updates of Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
ImageBind-LLM: Multi-modality Instruction Tuning.
CoRR, 2023

DPL: Decoupled Prompt Learning for Vision-Language Models.
CoRR, 2023

Progressive Visual Prompt Learning with Contrastive Feature Re-formation.
CoRR, 2023

Real-Time Image Demoiréing on Mobile Devices.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2021
EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game.
CoRR, 2019

