Daniel Simig

According to our database, Daniel Simig authored at least 12 papers between 2021 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Evaluating end-to-end entity linking on domain-specific knowledge bases: Learning about ancient technologies from museum collections.
CoRR, 2023

SemDeDup: Data-efficient learning at web-scale through semantic deduplication.
CoRR, 2023

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

D4: Improving LLM Pretraining via Document De-Duplication and Diversification.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Understanding In-Context Learning via Supportive Pretraining Data.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization.
CoRR, 2022

Text Characterization Toolkit.
CoRR, 2022

OPT: Open Pre-trained Transformer Language Models.
CoRR, 2022

Text Characterization Toolkit (TCT).
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Few-shot Learning with Multilingual Generative Language Models.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Open Vocabulary Extreme Classification Using Generative Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Few-shot Learning with Multilingual Language Models.
CoRR, 2021

