Bo Wang

ORCID: 0000-0003-0526-0533

Affiliations:
  • Fudan University, Shanghai, China


According to our database, Bo Wang authored at least 18 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping.
CoRR, April, 2026

BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning.
CoRR, March, 2026

Explicit Multi-head Attention for Inter-head Interaction in Large Language Models.
CoRR, January, 2026

TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization.
CoRR, January, 2026

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development.
CoRR, January, 2026

Time-Frequency Token Advantage Clipping for Training Efficient Large Reasoning Model.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Multi-hop Reasoning via Early Knowledge Alignment.
CoRR, December, 2025

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections.
CoRR, July, 2025

MOSS-MED: A Family of Multimodal Models Serving Medical Image Analysis.
ACM Trans. Manag. Inf. Syst., 2025

BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective.
CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.
CoRR, 2024

BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments.
CoRR, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle.
CoRR, 2024

In-Memory Learning: A Declarative Learning Framework for Large Language Models.
CoRR, 2024

Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

