Wenbo Su

ORCID: 0000-0003-3465-8284

According to our database, Wenbo Su authored at least 54 papers between 2014 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DESIGNER: Design-Logic-Guided Multidisciplinary Data Synthesis for LLM Reasoning.
CoRR, August, 2025

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning.
CoRR, August, 2025

RecGPT Technical Report.
CoRR, July, 2025

Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics.
CoRR, July, 2025

Comparative Analysis and Optimization of Magnetic Field Energy Harvesters Based on Split Three-Phase Power Line Joint Energy Harvesting.
IEEE Trans. Ind. Informatics, June, 2025

Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems.
CoRR, June, 2025

Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library.
CoRR, June, 2025

USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models.
CoRR, May, 2025

Weight Spectra Induced Efficient Model Adaptation.
CoRR, May, 2025

Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models.
CoRR, May, 2025

NAN: A Training-Free Solution to Coefficient Estimation in Model Merging.
CoRR, May, 2025

Think-J: Learning to Think for Generative LLM-as-a-Judge.
CoRR, May, 2025

A Comprehensive Survey on Long Context Language Modeling.
CoRR, March, 2025

Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation.
CoRR, March, 2025

ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph.
CoRR, March, 2025

ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models.
CoRR, February, 2025

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
CoRR, February, 2025

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering.
CoRR, February, 2025

AIR: Complex Instruction Generation via Automatic Iterative Refinement.
CoRR, February, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

ChineseSimpleVQA - "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models.
CoRR, February, 2025

Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models.
CoRR, February, 2025

Unlocking Scaling Law in Industrial Recommendation Systems with a Three-step Paradigm based Large User Model.
CoRR, February, 2025

MIM: Multi-modal Content Interest Modeling Paradigm for User Behavior Modeling.
CoRR, February, 2025

ProgCo: Program Helps Self-Correction of Large Language Models.
CoRR, January, 2025

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models.
CoRR, 2024

WiS Platform: Enhancing Evaluation of LLM-Based Multi-Agent Systems Through Game-Based Analysis.
CoRR, 2024

Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models.
CoRR, 2024

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment.
CoRR, 2024

M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation.
CoRR, 2024

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models.
CoRR, 2024

R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models.
CoRR, 2024

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models.
CoRR, 2024

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

DDK: Distilling Domain Knowledge for Efficient Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

E2-LLM: Efficient and Extreme Length Extension of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Sensitivity Analysis of Free-Standing Columnar Magnetic Field Energy Harvester for Powering Wireless Monitoring Sensors.
IEEE Trans. Circuits Syst. I Regul. Pap., November, 2023

Antisaturation and Power Decoupling Control of Multiwinding Energy Harvester Based on Magnetomotive Force Compensation.
IEEE Trans. Ind. Informatics, October, 2023

2022
GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Model.
CoRR, 2022

GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2018
Visualizing and Understanding Deep Neural Networks in CTR Prediction.
Proceedings of the SIGIR 2018 Workshop On eCommerce co-located with the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), 2018

2015
SLA-Aware Tenant Placement and Dynamic Resource Provision in SaaS.
Proceedings of the 2015 IEEE International Conference on Web Services, 2015

Modeling and Analysis of Availability in Multi-Tenant SaaS.
Proceedings of the 24th International Conference on Computer Communication and Networks, 2015

2014
Modeling and Analysis of Availability for SaaS Multi-tenant Architecture.
Proceedings of the 8th IEEE International Symposium on Service Oriented System Engineering, 2014

