Can Xu

ORCID: 0000-0002-1949-5715

Affiliations:
  • Microsoft, Beijing, China


According to our database, Can Xu authored at least 81 papers between 2017 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Do Phone-Use Agents Respect Your Privacy?
CoRR, April, 2026

Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models.
CoRR, March, 2026

OffSeeker: Online Reinforcement Learning Is Not All You Need for Deep Research Agents.
CoRR, January, 2026

RubricBench: Aligning Model-Generated Rubrics with Human Standards.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent.
CoRR, December, 2025

Zero Reinforcement Learning Towards General Domains.
CoRR, October, 2025

RetriEVAL: Evaluating Text Generation with Contextualized Lexical Match.
Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining, 2025

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Towards a Unified Paradigm: Integrating Recommendation Systems as a New Language in Large Models.
CoRR, 2024

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation.
CoRR, 2024

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena.
CoRR, 2024

A Survey on Knowledge Distillation of Large Language Models.
CoRR, 2024

Leveraging Large Language Models for NLG Evaluation: A Survey.
CoRR, 2024

WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

WizardCoder: Empowering Code Large Language Models with Evol-Instruct.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Automatic Instruction Evolving for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Re-Reading Improves Reasoning in Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Leveraging Large Language Models for NLG Evaluation: Advances and Challenges.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

ADAM: Dense Retrieval Distillation with Adaptive Dark Examples.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Synergistic Interplay between Search and Large Language Models for Information Retrieval.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Fine-Grained Distillation for Long Document Retrieval.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Re-Reading Improves Reasoning in Language Models.
CoRR, 2023

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct.
CoRR, 2023

Knowledge Refinement via Interaction Between Search Engines and Large Language Models.
CoRR, 2023

Augmented Large Language Models with Parametric Knowledge Guiding.
CoRR, 2023

Self-Supervised Multi-Modal Sequential Recommendation.
CoRR, 2023

WizardLM: Empowering Large Language Models to Follow Complex Instructions.
CoRR, 2023

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval.
CoRR, 2023

LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval.
Proceedings of the ACM Web Conference 2023, 2023

UnifieR: A Unified Retriever for Large-Scale Retrieval.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Investigating the Learning Behaviour of In-Context Learning: A Comparison with Supervised Learning.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

Iterative Proposal Refinement for Weakly-Supervised Video Grounding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Robust Ranker for Text Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Fine-Grained Distillation for Long Document Retrieval.
CoRR, 2022

Adam: Dense Retrieval Distillation with Adaptive Dark Examples.
CoRR, 2022

KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Few-Shot NLP.
CoRR, 2022

UnifieR: A Unified Retriever for Large-Scale Retrieval.
CoRR, 2022

Unsupervised Cross-Domain Adaptation for Response Selection Using Self-Supervised and Adversarial Training.
Proceedings of the WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21, 2022

Stylized Knowledge-Grounded Dialogue Generation via Disentangled Template Rewriting.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Small Changes Make Big Differences: Improving Multi-turn Response Selection in Dialogue Systems via Fine-Grained Contrastive Learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

RecInDial: A Unified Framework for Conversational Recommendation with Pretrained Language Models.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Multimodal Dialogue Response Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Finetuning Large-Scale Pre-trained Language Models for Conversational Recommendation with Knowledge Graph.
CoRR, 2021

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Neural Templates for Recommender Dialogue System.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Learning to Ground Visual Objects for Visual Dialog.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Maria: A Visual Experience Powered Conversational Agent.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Open Domain Dialogue Generation with Latent Images.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Open Domain Dialogue Generation with Latent Images.
CoRR, 2020

Zero-Resource Knowledge-Grounded Dialogue Generation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Low-Resource Knowledge-Grounded Dialogue Generation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Knowledge-Grounded Dialogue Generation with Pre-trained Language Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

StyleDGPT: Stylized Response Generation with Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019
A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots.
Comput. Linguistics, 2019

Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019

A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Low-Resource Response Generation with Template Prior.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Neural Response Generation with Meta-words.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Improving Matching Models with Contextualized Word Representations for Multi-turn Response Selection in Retrieval-based Chatbots.
CoRR, 2018

Towards Explainable and Controllable Open Domain Dialogue Generation with Dialogue Acts.
CoRR, 2018

Playing 20 Question Game with Policy-Based Reinforcement Learning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Neural Response Generation With Dynamic Vocabularies.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Knowledge Enhanced Hybrid Neural Network for Text Matching.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Neural Response Generation with Dynamic Vocabularies.
CoRR, 2017

Beihang at the NTCIR-13 STC-2 Task.
Proceedings of the 13th NTCIR Conference, 2017

