Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation.

Pierre Andrews

Mikel Artetxe

Mariano Coria Meglioli

Nathanial Paul Ekberg

Albert Ventayol-Boada

Shireen Yates

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

LCFO: Long Context and Long Form Output Dataset and Benchmarking.

Marta R. Costa-jussà

Pierre Andrews

Mariano Coria Meglioli

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Large Concept Models: Language Modeling in a Sentence Representation Space.

CoRR, 2024

Y-NQ: English-Yorùbá Evaluation dataset for Open-Book Reading Comprehension and Text Generation.

CoRR, 2024

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset.

CoRR, 2024

Linguini: A benchmark for language-agnostic linguistic reasoning.

CoRR, 2024

Towards Massive Multilingual Holistic Bias.

CoRR, 2024

Towards Red Teaming in Multimodal and Multilingual Translation.

Christophe Ropers

David Dale

Prangthip Hansanti

Gabriel Mejia Gonzalez

Cristian Canton-Ferrer

Pierre Andrews

Marta R. Costa-jussà

CoRR, 2024

Speech Data from Radio Broadcasts for Low Resource Languages.

Bismarck Bamfo Odoom

Leibny Paola García-Perera

Proceedings of the 21st International Conference on Spoken Language Translation, 2024

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector.

Marta R. Costa-jussà

Mariano Coria Meglioli

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Seamless: Multilingual Expressive and Streaming Speech Translation.

Loïc Barrault

Yu-An Chung

Mariano Coria Meglioli

David Dale

Ning Dong

Mark Duppenthaler

Paul-Ambroise Duquenne

Kaushik Ram Sadagopan

Gabriel Mejia Gonzalez

CoRR, 2023

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation.

Seamless Communication

Loïc Barrault

Yu-An Chung

Mariano Coria Meglioli

David Dale

Ning Dong

Paul-Ambroise Duquenne

Kaushik Ram Sadagopan

Gabriel Mejia Gonzalez

CoRR, 2023

The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages.

Proceedings of the Eighth Conference on Machine Translation, 2023

HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Toxicity in Multilingual Machine Translation at Scale.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Multilingual Holistic Bias: Extending Descriptors and Patterns to Unveil Demographic Biases in Languages at Scale.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022

Toxicity in Multilingual Machine Translation at Scale.

CoRR, 2022

No Language Left Behind: Scaling Human-Centered Machine Translation.

CoRR, 2022

Christophe Ropers
