Yafei Wen

According to our database, Yafei Wen authored at least 18 papers between 2014 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning.
CoRR, April, 2026

Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents.
CoRR, April, 2026

UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents.
CoRR, February, 2026

2025
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning.
CoRR, November, 2025

BlueLM-2.5-3B Technical Report.
CoRR, July, 2025

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

GenieBlue: Integrating Both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
TerDiT: Ternary Diffusion Models with Transformers.
CoRR, 2024

DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
ImageBind-LLM: Multi-modality Instruction Tuning.
CoRR, 2023

Real-Time Image Demoiréing on Mobile Devices.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2021
EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

2014
Viewpoint-Aware Representation for Sketch-Based 3D Model Retrieval.
IEEE Signal Process. Lett., 2014

Sketch-Based 3D Model Retrieval via Multi-feature Fusion.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Learning Flexible Binary Code for Linear Projection Based Hashing with Random Forest.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

