Yaojie Lu

Orcid: 0000-0002-5842-7715

Affiliations:
  • Institute of Software, Chinese Academy of Sciences, China


According to our database, Yaojie Lu authored at least 98 papers between 2015 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
PaperRegister: Boosting Flexible-grained Paper Search via Hierarchical Register Indexing.
CoRR, August, 2025

LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
CoRR, August, 2025

RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing.
CoRR, July, 2025

Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction.
CoRR, July, 2025

RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback.
CoRR, July, 2025

Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns.
IEEE Trans. Comput. Soc. Syst., June, 2025

EmbedAgent: Benchmarking Large Language Models in Embedded System Development.
CoRR, June, 2025

ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch.
CoRR, June, 2025

Across Programming Language Silos: A Study on Cross-Lingual Retrieval-augmented Code Generation.
CoRR, June, 2025

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers.
CoRR, April, 2025

Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models.
CoRR, March, 2025

The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models.
CoRR, March, 2025

Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch.
CoRR, February, 2025

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing.
CoRR, February, 2025

SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency.
CoRR, February, 2025

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models.
CoRR, February, 2025

PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides.
CoRR, January, 2025

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models.
CoRR, January, 2025

Transferable Post-training via Inverse Value Learning.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Aligning Retrieval with Reader Needs: Reader-Centered Passage Selection for Open-Domain Question Answering.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Improved Sparse Upcycling for Instruction Tuning.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

CRUXEVAL-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Sparse Latents Steer Retrieval-Augmented Generation.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

The Linguistic Connectivities Within Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

READoc: A Unified Benchmark for Realistic Document Structured Extraction.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

From Informal to Formal - Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering.
CoRR, 2024

Aligning Large Language Models via Self-Steering Optimization.
CoRR, 2024

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models.
CoRR, 2024

Multi-Facet Counterfactual Learning for Content Quality Evaluation.
CoRR, 2024

Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
CoRR, 2024

READoc: A Unified Benchmark for Realistic Document Structured Extraction.
CoRR, 2024

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic.
CoRR, 2024

CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution.
CoRR, 2024

Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models.
CoRR, 2024

On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation.
CoRR, 2024

Towards Scalable Automated Alignment of LLMs: A Survey.
CoRR, 2024

URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression.
CoRR, 2024

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect.
CoRR, 2024

Self-Retrieval: Building an Information Retrieval System with One Large Language Model.
CoRR, 2024

Self-Retrieval: End-to-End Information Retrieval with One Large Language Model.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Chain-of-Rewrite: Aligning Question and Documents for Open-Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Seg2Act: Global Context-aware Action Generation for Document Logical Structuring.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Executing Natural Language-Described Algorithms with Large Language Models: An Investigation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Beyond Full Fine-tuning: Harnessing the Power of LoRA for Multi-Task Instruction Tuning.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

ChatGPT Is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Few-shot Named Entity Recognition via Superposition Concept Discrimination.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Pattern Shifting or Knowledge Losing? A Forgetting Perspective for Understanding the Effect of Instruction Fine-Tuning.
Proceedings of the Chinese Computational Linguistics - 23rd China National Conference, 2024

SoFA: Shielded On-the-fly Alignment via Priority Rule Following.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

XMC-Agent : Dynamic Navigation over Scalable Hierarchical Index for Incremental Extreme Multi-label Classification.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Debiasing In-Context Learning by Instructing LLMs How to Follow Demonstrations.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Open Grounded Planning: Challenges and Benchmark Construction.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

REInstruct: Building Instruction Data from Unlabeled Corpus.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Harvesting Event Schemas from Large Language Models.
CoRR, 2023

A Drop of Ink Makes a Million Think: The Spread of False Information in Large Language Models.
CoRR, 2023

ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models.
CoRR, 2023

Testing Coreference Resolution Systems without Labeled Test Sets.
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023

Document Information Extraction via Global Tagging.
Proceedings of the Chinese Computational Linguistics - 22nd China National Conference, 2023

Learning In-context Learning for Named Entity Recognition.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Universal Information Extraction as Unified Semantic Matching.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
End-to-end neural event coreference resolution.
Artif. Intell., 2022

ISCAS at SemEval-2022 Task 10: An Extraction-Validation Pipeline for Structured Sentiment Analysis.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Unified Structure Generation for Universal Information Extraction.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Procedural Text Understanding via Scene-Wise Evolution.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
From Discourse to Narrative: Knowledge Projection for Event Relation Extraction.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders.
Neural Process. Lett., 2020

A Rigourous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
CoRR, 2020

ISCAS at SemEval-2020 Task 5: Pre-trained Transformers for Counterfactual Statement Modeling.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Syntactic and Semantic-driven Learning for Open Information Extraction.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019
Iterative Dual Domain Adaptation for Neural Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Gazetteer-Enhanced Attentive Neural Networks for Named Entity Recognition.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Cost-sensitive Regularization for Label Confusion-aware Event Detection.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Exploring Implicit Semantic Constraints for Bilingual Word Embeddings.
Neural Process. Lett., 2018

Cross-lingual implicit discourse relation recognition with co-training.
Frontiers Inf. Technol. Electron. Eng., 2018

A Word Embedding Transfer Model for Robust Text Categorization.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2018

Nugget Proposal Networks for Chinese Event Detection.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Adaptive Scaling for Sparse Detection in Information Extraction.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Variational Recurrent Neural Machine Translation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
ISCAS_Sogou at TAC-KBP 2017.
Proceedings of the 2017 Text Analysis Conference, 2017

2015
Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

