Wenxuan Wang
Orcid: 0000-0002-9803-8204
Affiliations:
- Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong (PhD 2023)
According to our database, Wenxuan Wang authored at least 76 papers between 2020 and 2025.
Bibliography
2025
CoRR, August, 2025
Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications.
CoRR, August, 2025
CoRR, July, 2025
CoRR, May, 2025
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training.
CoRR, May, 2025
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards.
CoRR, May, 2025
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
CoRR, May, 2025
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning.
CoRR, April, 2025
STShield: Single-Token Sentinel for Real-Time Jailbreak Detection in Large Language Models.
CoRR, March, 2025
CoRR, March, 2025
VisFactor: Benchmarking Fundamental Visual Cognition in Multimodal Large Language Models.
CoRR, February, 2025
CoRR, February, 2025
VLMs as GeoGuessr Masters: Exceptional Performance, Hidden Biases, and Privacy Risks.
CoRR, February, 2025
Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries.
CoRR, February, 2025
CoRR, January, 2025
Neurocomputing, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
EAGLE: Expert-Guided Self-Enhancement for Preference Alignment in Pathology Large Vision-Language Model.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs.
CoRR, 2024
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking.
CoRR, 2024
CoRR, 2024
Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach.
CoRR, 2024
CoRR, 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments.
CoRR, 2024
Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models.
CoRR, 2024
CoRR, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How.
Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.
Dataset, November, 2023
IEEE Trans. Software Eng., April, 2023
CoRR, 2023
CoRR, 2023
CoRR, 2023
CoRR, 2023
ChatGPT an ENFJ, Bard an ISTJ: Empirical Study on Personalities of Large Language Models.
CoRR, 2023
Constructing Effective In-Context Demonstration for Code Intelligence Tasks: An Empirical Study.
CoRR, 2023
CoRR, 2023
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023
An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software.
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023
Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023
Proceedings of the 45th IEEE/ACM International Conference on Software Engineering, 2023
ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Tencent's Multilingual Machine Translation System for WMT22 Large-Scale African Languages.
Proceedings of the Seventh Conference on Machine Translation, 2022
Proceedings of the ISSTA '22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, South Korea, July 18, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.
Dataset, December, 2021
2020
Proceedings of the 28th International Conference on Computational Linguistics, 2020