Shi Yu

Orcid: 0000-0001-6335-1076

Affiliations:
  • Tsinghua University, Department of Computer Science and Technology, Beijing, China


According to our database, Shi Yu authored at least 30 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG.
CoRR, June, 2025

ConsRec: Denoising Sequential Recommendation through User-Consistent Preference Modeling.
CoRR, May, 2025

UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation.
CoRR, April, 2025

Building a Coding Assistant via the Retrieval-Augmented Language Model.
ACM Trans. Inf. Syst., March, 2025

LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences.
CoRR, February, 2025

Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search.
CoRR, February, 2025

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slips.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Craw4LLM: Efficient Web Crawling for LLM Pretraining.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
KBAlign: Efficient Self Adaptation on Specific Knowledge Bases.
CoRR, 2024

Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation.
CoRR, 2024

Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts.
CoRR, 2024

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework.
CoRR, 2024

Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression.
CoRR, 2024

ActiveRAG: Revealing the Treasures of Knowledge via Active Learning.
CoRR, 2024

Fusion-in-T5: Unifying Variant Signals for Simple and Effective Document Ranking with Attention Fusion.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
Fusion-in-T5: Unifying Document Ranking Signals for Improved Information Retrieval.
CoRR, 2023

Rethinking Dense Retrieval's Few-Shot Ability.
CoRR, 2023

OpenMatch-v2: An All-in-one Multi-Modality PLM-based Information Retrieval Toolkit.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Text Matching Improves Sequential Recommendation by Reducing Popularity Biases.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning.
CoRR, 2022

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

2021
Few-Shot Conversational Dense Retrieval.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

2020
CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search.
CoRR, 2020

Few-Shot Generative Conversational Query Rewriting.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

