Cunxiang Wang

Orcid: 0000-0002-3023-8082

According to our database, Cunxiang Wang authored at least 44 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge.
CoRR, August, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models.
CoRR, August, 2025

Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future.
CoRR, August, 2025

Exploring the Evolution of Physics Cognition in Video Generation: A Survey.
CoRR, March, 2025

StepMathAgent: A Step-Wise Agent for Evaluating Mathematical Processes through Tree-of-Error.
CoRR, March, 2025

CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

How Likely Do LLMs with CoT Mimic Human Reasoning?
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Unlocking Recursive Thinking of LLMs: Alignment via Refinement.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

HPSS: Heuristic Prompting Strategy Search for LLM Evaluators.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

LongSafety: Evaluating Long-Context Safety of Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Training Language Model to Critique for Better Refinement.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
A Survey on Evaluation of Large Language Models.
ACM Trans. Intell. Syst. Technol., June, 2024

Long²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall.
CoRR, 2024

Nash CoT: Multi-Path Inference with Preference Equilibrium.
CoRR, 2024

NovelQA: A Benchmark for Long-Range Novel Question Answering.
CoRR, 2024

Knowledge Conflicts for LLMs: A Survey.
CoRR, 2024

LLMs with Chain-of-Thought Are Non-Causal Reasoners.
CoRR, 2024

SQL-CRAFT: Text-to-SQL through Interactive Refinement and Enhanced Reasoning.
CoRR, 2024

Self-DC: When to retrieve and When to generate? Self Divide-and-Conquer for Compositional Unknown Questions.
CoRR, 2024

RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Nash CoT: Multi-Path Inference with Preference Equilibrium.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Knowledge Conflicts for LLMs: A Survey.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LONG²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity.
CoRR, 2023

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization.
CoRR, 2023

Evaluating Open Question Answering Evaluation.
CoRR, 2023

Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

Evaluating Open-QA Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TRAMS: Training-free Memory Selection for Long-range Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Exploiting Abstract Meaning Representation for Open-Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
On Effectively Learning of Knowledge in Continual Pre-training.
CoRR, 2022

2021
Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA?
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Commonsense Knowledge Graph Reasoning by Selection or Generation? Why?
CoRR, 2020

SemEval-2020 Task 4: Commonsense Validation and Explanation.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

2019
Domain Representation for Knowledge Graph Embedding.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2017
Embedding Syntactic Tree Structures into CNN Architecture for Relation Classification.
Proceedings of the Knowledge Graph and Semantic Computing. Language, Knowledge, and Intelligence, 2017

