Shihan Dou

ORCID: 0009-0002-6013-3035

According to our database, Shihan Dou authored at least 67 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Intern-S1: A Scientific Multimodal Foundation Model.
CoRR, August, 2025

LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
CoRR, August, 2025

VRPO: Rethinking Value Modeling for Robust RL Training under Noisy Supervision.
CoRR, August, 2025

Mitigating Attention Hacking in Preference-Based Reward Modeling via Interaction Distillation.
CoRR, August, 2025

SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents.
CoRR, August, 2025

Pre-Trained Policy Discriminators are General Reward Models.
CoRR, July, 2025

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation.
CoRR, June, 2025

Progressive Mastery: Customized Curriculum Learning with Guided Prompting for Mathematical Reasoning.
CoRR, June, 2025

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving.
CoRR, June, 2025

Compression Hacking: A Supplementary Perspective on Informatics Metric of Language Models from Geometric Distortion.
CoRR, May, 2025

Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment.
CoRR, May, 2025

Improving RL Exploration for LLM Reasoning through Retrospective Replay.
CoRR, April, 2025

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning.
CoRR, April, 2025

Detecting Essence Code Clones via Information Theoretic Analysis.
CoRR, February, 2025

Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric.
CoRR, February, 2025

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training.
CoRR, February, 2025

The rise and potential of large language model based agents: a survey.
Sci. China Inf. Sci., 2025

RMB: Comprehensively benchmarking reward models in LLM alignment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Revisiting Jailbreaking for Large Language Models: A Representation Engineering Perspective.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts.
Findings of the Association for Computational Linguistics, 2025

Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Lost in the Context: Insufficient and Distracted Attention to Contexts in Preference Modeling.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

DocFusion: A Unified Framework for Document Parsing Tasks.
Findings of the Association for Computational Linguistics, 2025

Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), 2025

2024
COCL: An Intelligent Framework for Enhancing Deep Learning-Based Vulnerability Detection.
IEEE Trans. Ind. Informatics, March, 2024

CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection.
Proc. ACM Softw. Eng., 2024

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision.
CoRR, 2024

Multi-Programming Language Sandbox for LLMs.
CoRR, 2024

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study.
CoRR, 2024

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance.
CoRR, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle.
CoRR, 2024

MetaRM: Shifted Distributions Alignment via Meta-Learning.
CoRR, 2024

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models.
CoRR, 2024

CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models.
CoRR, 2024

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution.
CoRR, 2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback.
CoRR, 2024

MouSi: Poly-Visual-Expert Vision-Language Models.
CoRR, 2024

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback.
CoRR, 2024

Secrets of RLHF in Large Language Models Part II: Reward Modeling.
CoRR, 2024

ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios.
CoRR, 2024

CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing.
Natural Language Processing and Chinese Computing, 2024

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment.
CoRR, 2023

The Rise and Potential of Large Language Model Based Agents: A Survey.
CoRR, 2023

Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey.
CoRR, 2023

Secrets of RLHF in Large Language Models Part I: PPO.
CoRR, 2023

CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing.
CoRR, 2023

Gitor: Scalable Code Clone Detection by Building Global Sample Graph.
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023

Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback.
Findings of the Association for Computational Linguistics: EMNLP 2023

Detecting Adversarial Samples through Sharpness of Loss Landscape.
Findings of the Association for Computational Linguistics: ACL 2023

On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection.
Findings of the Association for Computational Linguistics: ACL 2023

DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective.
CoRR, 2022

VulCNN: An Image-inspired Scalable Vulnerability Detection System.
Proceedings of the 44th IEEE/ACM International Conference on Software Engineering, 2022

Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

MINER: Improving Out-of-Vocabulary Named Entity Recognition from an Information Theoretic Perspective.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
IntDroid: Android Malware Detection Based on API Intimacy Analysis.
ACM Trans. Softw. Eng. Methodol., 2021

Boosting the Capability of Intelligent Vulnerability Detection by Training in a Human-Learning Manner.
CoRR, 2021

Obfuscation-resilient Android Malware Analysis Based on Contrastive Learning.
CoRR, 2021

2020
SCDetector: Software Functional Clone Detection Based on Semantic Tokens Analysis.
Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020
