Bo Wang

ORCID: 0000-0003-0526-0533

Affiliations:

Fudan University, Shanghai, China

According to our database, Bo Wang authored at least 18 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping.

CoRR, April, 2026

BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning.

CoRR, March, 2026

Explicit Multi-head Attention for Inter-head Interaction in Large Language Models.

CoRR, January, 2026

TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization.

CoRR, January, 2026

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development.

CoRR, January, 2026

Time-Frequency Token Advantage Clipping for Training Efficient Large Reasoning Model.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Multi-hop Reasoning via Early Knowledge Alignment.

CoRR, December, 2025

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections.

CoRR, July, 2025

MOSS-MED: A Family of Multimodal Models Serving Medical Image Analysis.

ACM Trans. Manag. Inf. Syst., 2025

BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective.

CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments.

CoRR, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle.

CoRR, 2024

In-Memory Learning: A Declarative Learning Framework for Large Language Models.

CoRR, 2024

Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
