Zhenghao Liu

Orcid: 0000-0003-0083-3224

Affiliations:
  • Northeastern University, School of Computer Science and Engineering, Shenyang, China
  • Tsinghua University, Department of Computer Science and Technology, State Key Laboratory of Intelligent Technology and Systems, Beijing, China (PhD 2021)


According to our database, Zhenghao Liu authored at least 93 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Tailored Definitions With Easy Reach: Complexity-Controllable Definition Generation.
IEEE Trans. Big Data, August, 2025

ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization.
CoRR, June, 2025

KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs.
CoRR, June, 2025

KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG.
CoRR, June, 2025

A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings.
CoRR, May, 2025

ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation.
CoRR, May, 2025

EULER: Enhancing the Reasoning Ability of Large Language Models through Error-Induced Learning.
CoRR, May, 2025

ConsRec: Denoising Sequential Recommendation through User-Consistent Preference Modeling.
CoRR, May, 2025

Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning.
CoRR, May, 2025

Multi-Evidence Based Fact Verification via A Confidential Graph Neural Network.
IEEE Trans. Big Data, April, 2025

UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation.
CoRR, April, 2025

Building a Coding Assistant via the Retrieval-Augmented Language Model.
ACM Trans. Inf. Syst., March, 2025

HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization.
CoRR, February, 2025

Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts.
CoRR, February, 2025

LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences.
CoRR, February, 2025

PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning.
CoRR, February, 2025

PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths.
CoRR, February, 2025

Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search.
CoRR, February, 2025

Enhancing the Patent Matching Capability of Large Language Models via the Memory Graph.
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Advancing LLM Reasoning Generalists with Preference Trees.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network.
IEEE Trans. Knowl. Data Eng., September, 2024

KBAlign: Efficient Self Adaptation on Specific Knowledge Bases.
CoRR, 2024

Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation.
CoRR, 2024

Enhancing the Code Debugging Ability of LLMs via Communicative Agent Based Data Refinement.
CoRR, 2024

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework.
CoRR, 2024

PersLLM: A Personified Training Approach for Large Language Models.
CoRR, 2024

Advancing LLM Reasoning Generalists with Preference Trees.
CoRR, 2024

Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression.
CoRR, 2024

Cleaner Pretraining Corpus Curation with Neural Web Scraping.
CoRR, 2024

From Text to CQL: Bridging Natural Language and Corpus Search Engine.
CoRR, 2024

ActiveRAG: Revealing the Treasures of Knowledge via Active Learning.
CoRR, 2024

OMGEval: An Open Multilingual Generative Evaluation Benchmark for Large Language Models.
CoRR, 2024

Modeling User Viewing Flow using Large Language Models for Article Recommendation.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024

Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Chameleon: Towards Update-Efficient Learned Indexing for Locally Skewed Data.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Fusion-in-T5: Unifying Variant Signals for Simple and Effective Document Ranking with Attention Fusion.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

MCTS: A Multi-Reference Chinese Text Simplification Dataset.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Exploring the Potential of Dimension Reduction in Building Efficient Dense Retrieval Systems.
Proceedings of the Information Retrieval - 30th China Conference, 2024

MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

MMPDRec: A Denoising Model for Knowledge Concepts Recommendation Using Metapaths.
Proceedings of the Web Information Systems and Applications, 2024

Knowledge-Aware Self-supervised Educational Resources Recommendation.
Proceedings of the Web Information Systems and Applications, 2024

2023
Is Chinese Spelling Check ready? Understanding the correction behavior in real-world scenarios.
AI Open, January, 2023

INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing.
CoRR, 2023

Distributionally Robust Unsupervised Dense Retrieval Training on Web Graphs.
CoRR, 2023

Unlock Multi-Modal Capability of Dense Retrieval via Visual Module Plugin.
CoRR, 2023

Fusion-in-T5: Unifying Document Ranking Signals for Improved Information Retrieval.
CoRR, 2023

OpenMatch-v2: An All-in-one Multi-Modality PLM-based Information Retrieval Toolkit.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Text Matching Improves Sequential Recommendation by Reducing Popularity Biases.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Leveraging Prefix Transfer for Multi-Intent Text Revision.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Universal Multi-Modality Retrieval with One Unified Embedding Space.
CoRR, 2022

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning.
CoRR, 2022

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
YACLC: A Chinese Learner Corpus with Multidimensional Annotation.
CoRR, 2021

OpenMatch: An Open-Source Package for Information Retrieval.
CoRR, 2021

Few-Shot Conversational Dense Retrieval.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

OpenMatch: An Open Source Library for Neu-IR Research.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Capturing Global Informativeness in Open Domain Keyphrase Extraction.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

More Robust Dense Retrieval with Contrastive Dual Learning.
Proceedings of the ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval, 2021

TIAGE: A Benchmark for Topic-Shift Aware Dialog Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Meta Adaptive Neural Ranking with Contrastive Synthetic Supervision.
CoRR, 2020

CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search.
CoRR, 2020

Toward Cross-Lingual Definition Generation for Language Learners.
CoRR, 2020

Joint Keyphrase Chunking and Salience Ranking with BERT.
CoRR, 2020

Coreferential Reasoning Learning for Language Representation.
CoRR, 2020

Selective Weak Supervision for Neural Information Retrieval.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Text Style Transfer via Learning Style Instance Supported Latent Space.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Coreferential Reasoning Learning for Language Representation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Generalizing Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Fine-grained Fact Verification with Kernel Graph Attention Network.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Conversation Generation with Concept Flow.
CoRR, 2019

Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network.
CoRR, 2019

Kernel Graph Attention Network for Fact Verification.
CoRR, 2019

Understanding the Behaviors of BERT in Ranking.
CoRR, 2019

Explore Entity Embedding Effectiveness in Entity Retrieval.
Proceedings of the Chinese Computational Linguistics - 18th China National Conference, 2019

DocRED: A Large-Scale Document-Level Relation Extraction Dataset.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Entity-Duet Neural Ranking: Understanding the Role of Knowledge Graph Semantics in Neural Information Retrieval.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Neural Parse Combination.
J. Comput. Sci. Technol., 2017

