Peng Xia

ORCID: 0000-0003-2676-9128

Affiliations:
  • University of North Carolina at Chapel Hill, NC, USA
  • Monash University, Faculty of Information Technology, Melbourne, VIC, Australia (PhD 2024)
  • Soochow University, School of Computer Science and Technology, Suzhou, China (former)


According to our database, Peng Xia authored at least 34 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration.
CoRR, May, 2026

ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents.
CoRR, May, 2026

EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents.
CoRR, May, 2026

ClawArena: Benchmarking AI Agents in Evolving Information Environments.
CoRR, April, 2026

Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory.
CoRR, April, 2026

MetaClaw: Just Talk - An Agent That Meta-Learns and Evolves in the Wild.
CoRR, March, 2026

SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read.
CoRR, February, 2026

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning.
CoRR, February, 2026

Reliable and Responsible Foundation Models: A Comprehensive Survey.
CoRR, February, 2026

MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution.
CoRR, February, 2026

SimpleMem: Efficient Lifelong Memory for LLM Agents.
CoRR, January, 2026

2025
Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs.
CoRR, December, 2025

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning.
CoRR, November, 2025

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding.
CoRR, March, 2025

MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation.
CoRR, February, 2025

Reliable and Responsible Foundation Models.
Trans. Mach. Learn. Res., 2025

MJ-Video: Benchmarking and Rewarding Video Generation with Fine-Grained Video Preference.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Anyprefer: An Agentic Framework for Preference Data Synthesis.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Towards Realistic Semi-supervised Medical Image Classification.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification.
CoRR, 2024

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

TP-DRSeg: Improving Diabetic Retinopathy Lesion Segmentation with Explicit Text-Prompts Assisted SAM.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition.
CoRR, 2023

NurViD: A Large Expert-Level Video Database for Nursing Procedure Activity Understanding.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Chinese grammatical error correction based on knowledge distillation.
CoRR, 2022

2019
Topic model with incremental vocabulary based on Belief Propagation.
Knowl. Based Syst., 2019

