Ruochen Zhang

Affiliations:
  • Brown University, Providence, RI, USA


According to our database, Ruochen Zhang authored at least 23 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability.
CoRR, June, 2025

Paths Not Taken: Understanding and Mending the Multilingual Factual Recall Pipeline.
CoRR, May, 2025

Crosslingual Reasoning through Test-Time Scaling.
CoRR, May, 2025

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance.
CoRR, March, 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
CoRR, March, 2025

TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning.
CoRR, February, 2025

Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Sense.
CoRR, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.
CoRR, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.
CoRR, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MINERS: Multilingual Language Models as Semantic Retrievers.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Re-Evaluating Evaluation for Multilingual Summarization.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

CroCoSum: A Benchmark Dataset for Cross-Lingual Code-Switched Summarization.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers.
CoRR, 2023

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.
CoRR, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2021
SOCCER: An Information-Sparse Discourse State Tracking Collection in the Sports Commentary Domain.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

2019
Brown University at TREC Deep Learning 2019.
Proceedings of the Twenty-Eighth Text REtrieval Conference, 2019
