Xinyu Zhang
Orcid: 0009-0009-0756-8110
Affiliations:
- University of Waterloo, David R. Cheriton School of Computer Science, Canada
According to our database, Xinyu Zhang authored at least 44 papers between 2020 and 2025.
Links
Online presence:
- on orcid.org
Bibliography
2025
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent.
CoRR, August, 2025
CoRR, June, 2025
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval.
CoRR, May, 2025
Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts?
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Rank-Without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models.
Proceedings of the Advances in Information Retrieval, 2025
The Impact of Incidental Multilingual Text on Cross-Lingual Transfer in Monolingual Retrieval.
Proceedings of the Advances in Information Retrieval, 2025
2024
ACM Trans. Inf. Syst., March, 2024
Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models.
CoRR, 2024
Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM.
CoRR, 2024
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
CELI: Simple yet Effective Approach to Enhance Out-of-Domain Generalization of Cross-Encoders.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Multi-Objective Forward Reasoning and Multi-Reward Backward Refinement for Product Review Summarization.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Trans. Assoc. Comput. Linguistics, 2023
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation.
CoRR, 2023
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations.
CoRR, 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution.
CoRR, 2023
Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction.
CoRR, 2023
Overview of the CIRAL Track at FIRE 2023: Cross-lingual Information Retrieval for African Languages.
Proceedings of the Working Notes of FIRE 2023, 2023
Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: Industry Track, 2023
2022
CoRR, 2022
Better Than Whitespace: Information Retrieval for Languages without Custom Tokenizers.
CoRR, 2022
Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval.
Proceedings of the Thirty-First Text REtrieval Conference, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking.
Proceedings of the Advances in Information Retrieval, 2022
2021
Comparing Score Aggregation Approaches for Document Retrieval with Pretrained Transformers.
Proceedings of the Advances in Information Retrieval, 2021
Approach Zero and Anserini at the CLEF-2021 ARQMath Track: Applying Substructure Search and BM25 on Operator Tree Path Tokens.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st to 24th, 2021
2020
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020
H2oloo at TREC 2020: When all you got is a hammer... Deep Learning, Health Misinformation, and Precision Medicine.
Proceedings of the Twenty-Ninth Text REtrieval Conference, 2020
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, 2020
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020