Xiang Yue

ORCID: 0000-0003-4547-1685

According to our database, Xiang Yue authored at least 102 papers between 2015 and 2025.

Bibliography

2025
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning.
CoRR, July, 2025

Deep learning for recognition and detection of plant diseases and pests.
Neural Comput. Appl., June, 2025

OpusLM: A Family of Open Unified Speech Language Models.
CoRR, June, 2025

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation.
CoRR, June, 2025

Temporal Sampling for Forgotten Reasoning in LLMs.
CoRR, May, 2025

Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs.
CoRR, May, 2025

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think.
CoRR, May, 2025

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time.
CoRR, April, 2025

VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge.
CoRR, April, 2025

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations.
CoRR, April, 2025

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators.
CoRR, March, 2025

Overtrained Language Models Are Harder to Fine-Tune.
CoRR, March, 2025

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search.
CoRR, March, 2025

ESPnet-SpeechLM: An Open Speech Language Model Toolkit.
CoRR, February, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

Demystifying Long Chain-of-Thought Reasoning in LLMs.
CoRR, February, 2025

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate.
CoRR, January, 2025

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos.
CoRR, January, 2025

Aligning Instruction Tuning with Pre-training.
CoRR, January, 2025

Long-context LLMs Struggle with Long In-context Learning.
Trans. Mach. Learn. Res., 2025

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

SimulBench: Evaluating Language Models with Creative Simulation Tasks.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MixEval-X: Any-to-any Evaluations from Real-world Data Mixture.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Harnessing Webpage UIs for Text-Rich Visual Understanding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Evaluating Vision-Language Models as Evaluators in Path Planning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LIME: Less Is More for MLLM Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Small Models Struggle to Learn from Strong Reasoners.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Evaluating Language Models as Synthetic Data Generators.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning.
CoRR, 2024

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale.
CoRR, 2024

Teach Multimodal LLMs to Comprehend Electrocardiographic Images.
CoRR, 2024

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures.
CoRR, 2024

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
CoRR, 2024

Visual Perception in Text Strings.
CoRR, 2024

SimulBench: Evaluating Language Models with Creative Simulation Tasks.
CoRR, 2024

LIME: Less Is More for MLLM Evaluation.
CoRR, 2024

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization.
CoRR, 2024

Long Context Alignment with Short Instructions and Synthesized Positions.
CoRR, 2024

MAmmoTH2: Scaling Instructions from the Web.
CoRR, 2024

MuPT: A Generative Symbolic Music Pretrained Transformer.
CoRR, 2024

VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
CoRR, 2024

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models.
CoRR, 2024

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding.
CoRR, 2024

MAmmoTH2: Scaling Instructions from the Web.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

TableLlama: Towards Open Large Generalist Models for Tables.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Data Engineering for Scaling Language Models to 128K Context.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Machine Unlearning of Pre-trained Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

AttributionBench: How Hard is Automatic Attribution Evaluation?
Proceedings of the Findings of the Association for Computational Linguistics, 2024

VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Vibration suppression of collaborative robot based on modified trajectory planning.
Ind. Robot, 2023

Can ChatGPT Defend the Truth? Automatic Dialectical Evaluation Elicits LLMs' Deficiencies in Reasoning.
CoRR, 2023

Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy.
Proceedings of the ACM Web Conference 2023, 2023

Roll Up Your Sleeves: Working with a Collaborative and Engaging Task-Oriented Dialogue System.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023

A Deep-Learning-Based Foreign Object Inspection System Design for Overhead Power Transmission Lines.
Proceedings of the International Conference on Advanced Robotics and Mechatronics, 2023

Automatic Evaluation of Attribution by Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass.
Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023

Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe.
CoRR, 2022

Bootstrapping a User-Centered Task-Oriented Dialogue System.
CoRR, 2022

Improving Stability of Line Inspection Robot During Crossing Jumper Lines With a Centroid Adjustment Mechanism.
IEEE Access, 2022

Automatic Obstacle-Crossing Planning for a Transmission Line Inspection Robot Based on Multisensor Fusion.
IEEE Access, 2022

Synthetic Question Value Estimation for Domain Adaptation of Question Answering.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2022

2021
Tensor decomposition with relational constraints for predicting multiple types of microRNA-disease associations.
Briefings Bioinform., 2021

COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021

Differential Privacy for Text Analytics via Natural Text Sanitization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval.
CoRR, 2020

Practical Annotation Strategies for Question Answering Datasets.
CoRR, 2020

Toward Making the Most of Context in Neural Machine Translation.
CoRR, 2020

Graph embedding on biomedical networks: methods, applications and evaluations.
Bioinform., 2020

Towards Making the Most of Context in Neural Machine Translation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Clinical Phrase Mining with Language Models.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation.
Proceedings of the 3rd Clinical Natural Language Processing Workshop, 2020

2019
Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition.
BMC Bioinform., 2019

SurfCon: Synonym Discovery on Privacy-Aware Clinical Data.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

LncRNA-miRNA interaction prediction from the heterogeneous network through graph embedding ensemble learning.
Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine, 2019

2018
SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions.
PLoS Comput. Biol., 2018

Manifold regularized matrix factorization for drug-drug interaction prediction.
J. Biomed. Informatics, 2018

Predicting drug-disease associations by using similarity constrained matrix factorization.
BMC Bioinform., 2018

Sequence-based bacterial small RNAs prediction using ensemble learning strategies.
BMC Bioinform., 2018

Prediction of Drug-Disease Associations and Their Effects by Signed Network-Based Nonnegative Matrix Factorization.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018

2017
A unified frame of predicting side effects of drugs by using linear neighborhood similarity.
BMC Syst. Biol., 2017

Predicting drug-disease associations based on the known association bipartite network.
Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine, 2017

Predicting small RNAs in bacteria via sequence learning ensemble method.
Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine, 2017

2016
Semantic-based Feature Extraction Method for Documents (基于语义的文档特征提取研究方法).
计算机科学 (Computer Science), 2016

2015
Research on a novel inspection robot mechanism for power transmission lines.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

