Christophe Ropers

According to our database, Christophe Ropers authored at least 25 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech.
CoRR, March, 2026

Omnilingual MT: Machine Translation for 1,600 Languages.
CoRR, March, 2026

2025
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages.
CoRR, November, 2025

Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset.
CoRR, June, 2025

BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation.
CoRR, February, 2025

On the Role of Speech Data in Reducing Toxicity Detection Bias.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

LCFO: Long Context and Long Form Output Dataset and Benchmarking.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Large Concept Models: Language Modeling in a Sentence Representation Space.
CoRR, 2024

Y-NQ: English-Yorùbá Evaluation dataset for Open-Book Reading Comprehension and Text Generation.
CoRR, 2024

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset.
CoRR, 2024

Linguini: A benchmark for language-agnostic linguistic reasoning.
CoRR, 2024

Towards Massive Multilingual Holistic Bias.
CoRR, 2024

Towards Red Teaming in Multimodal and Multilingual Translation.
CoRR, 2024

Speech Data from Radio Broadcasts for Low Resource Languages.
Proceedings of the 21st International Conference on Spoken Language Translation, 2024

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Seamless: Multilingual Expressive and Streaming Speech Translation.
CoRR, 2023

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation.
CoRR, 2023

The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages.
Proceedings of the Eighth Conference on Machine Translation, 2023

HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Toxicity in Multilingual Machine Translation at Scale.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Multilingual Holistic Bias: Extending Descriptors and Patterns to Unveil Demographic Biases in Languages at Scale.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Toxicity in Multilingual Machine Translation at Scale.
CoRR, 2022

No Language Left Behind: Scaling Human-Centered Machine Translation.
CoRR, 2022

