Ruochen Zhang
Affiliations: Brown University, Providence, RI, USA
According to our database, Ruochen Zhang authored at least 23 papers between 2019 and 2025.
  
  
Bibliography
  2025
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability.
    
  
    CoRR, June, 2025
    
  
    CoRR, May, 2025
    
  
Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance.
    
  
    CoRR, March, 2025
    
  
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
    
  
    CoRR, March, 2025
    
  
TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning.
    
  
    CoRR, February, 2025
    
  
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses.
    
  
    Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
    
  
The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling.
    
  
    Proceedings of the Thirteenth International Conference on Learning Representations, 2025
    
  
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
    
  
    Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
    
  
  2024
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Sense.
    
  
    CoRR, 2024
    
  
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.
    
  
    CoRR, 2024
    
  
    CoRR, 2024
    
  
    Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
    
  
    Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
    
  
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.
    
  
    Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
    
  
    Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
    
  
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
    
  
  2023
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.
    
  
    CoRR, 2023
    
  
    Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
    
  
  2021
SOCCER: An Information-Sparse Discourse State Tracking Collection in the Sports Commentary Domain.
    
  
    Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
    
  
    Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021
    
  
  2019
    Proceedings of the Twenty-Eighth Text REtrieval Conference, 2019