Matthias Gallé

Orcid: 0000-0001-5677-5911

According to our database, Matthias Gallé authored at least 45 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
LLMCRIT: Teaching Large Language Models to Use Criteria.
CoRR, 2024

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs.
CoRR, 2024

2022
BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model.
CoRR, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

Speeding Up Entmax.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics.
J. Artif. Intell. Res., 2021

Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP.
CoRR, 2021

Unsupervised and Distributional Detection of Machine-Generated Text.
CoRR, 2021

On the Evaluation of Machine Translation for Terminology Consistency.
CoRR, 2021

Findings of the WMT Shared Task on Machine Translation Using Terminologies.
Proceedings of the Sixth Conference on Machine Translation, 2021

Multilingual Unsupervised Neural Machine Translation with Denoising Adapters.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Self-Supervised and Controlled Multi-Document Opinion Summarization.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, 2021

2020
Monolingual Adapters for Zero-Shot Neural Machine Translation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

A Multilingual Neural Machine Translation Model for Biomedical Data.
Proceedings of the 1st Workshop on NLP for COVID-19 @ EMNLP 2020, Online, December 2020, 2020

2019
Character-based NMT with Transformer.
CoRR, 2019

Joint Semantic and Distributional Word Representations with Multi-Graph Embeddings.
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing, 2019

Investigating the Effectiveness of BPE: The Power of Shorter Sequences.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

To Annotate or Not? Predicting Performance Drop under Domain Shift.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
xkcd-repeats: A new taxonomy of repeats defined by their context diversity.
J. Discrete Algorithms, 2018

2017
Enriching how-to guides with actionable phrases and linked data.
Web Intell., 2017

Context-aware selection of multi-modal conversational fillers in human-robot dialogues.
Proceedings of the 26th IEEE International Symposium on Robot and Human Interactive Communication, 2017

Robots as Conversational Mediators.
Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, 2017

A Maximum Matching Algorithm for Basis Selection in Spectral Learning.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Discriminating between similar languages in Twitter using label propagation.
CoRR, 2016

Joint Event Detection and Entity Resolution: a Virtuous Cycle.
CoRR, 2016

Multi-view pattern matching.
CoRR, 2016

Enriching How-to Guides by Linking Actionable Phrases.
Proceedings of the 25th International Conference on World Wide Web, 2016

The Generalized Smallest Grammar Problem.
Proceedings of the 13th International Conference on Grammatical Inference, 2016

2015
Review of: Algorithms on Strings by Maxime Crochemore, Christophe Hancart and Thierry Lecroq.
SIGACT News, 2015

"Roles for the Boys?": Mining Cast Lists for Gender and Role Distributions over Time.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

Reconstructing Textual Documents from n-grams.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

2014
Review of Bayesian reasoning and machine learning by David Barber.
SIGACT News, 2014

On Context-Diverse Repeats and Their Incremental Computation.
Proceedings of the Language and Automata Theory and Applications, 2014

Boilerplate Detection and Recoding.
Proceedings of the Advances in Information Retrieval, 2014

2013
Review of grammatical inference: learning automata and grammars by Colin de la Higuera.
SIGACT News, 2013

Who broke the news?: an analysis on first reports of news events.
Proceedings of the 22nd International World Wide Web Conference, 2013

The bag-of-repeats representation of documents.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

2012
Searching for smallest grammars on large sequences and application to DNA.
J. Discrete Algorithms, 2012

Full and Mini-batch Clustering of News Articles with Star-EM.
Proceedings of the Advances in Information Retrieval, 2012

2011
Searching for Compact Hierarchical Structures in DNA by means of the Smallest Grammar Problem.
PhD thesis, 2011

The Smallest Grammar Problem as Constituents Choice and Minimal Grammar Parsing.
Algorithms, 2011

2010
Choosing Word Occurrences for the Smallest Grammar Problem.
Proceedings of the Language and Automata Theory and Applications, 2010

A New Tree Distance Metric for Structural Comparison of Sequences.
Proceedings of the Structure Discovery in Biology: Motifs, Networks & Phylogenies, 06.06., 2010

2009
In-Place Update of Suffix Array while Recoding Words.
Int. J. Found. Comput. Sci., 2009

