Avi Shmidman

According to our database, Avi Shmidman authored at least 16 papers between 2016 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs.
CoRR, February, 2026

2025
NeoDictaBERT: Pushing the Frontier of BERT models for Hebrew.
CoRR, October, 2025

Splintering Nonconcatenative Languages for Better Tokenization.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities.
CoRR, 2024

MRL Parsing Without Tears: The Case of Hebrew.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Introducing DictaLM - A Large Generative Language Model for Modern Hebrew.
CoRR, 2023

DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew.
CoRR, 2023

Do Pretrained Contextual Language Models Distinguish between Hebrew Homograph Analyses?
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022
Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All.
CoRR, 2022

Introducing BEREL: BERT Embeddings for Rabbinic-Encoded Language.
CoRR, 2022

2020
FAST: Fast and Accurate Synoptic Texts.
Digit. Scholarsh. Humanit., 2020

A Novel Challenge Set for Hebrew Morphological Disambiguation and Diacritics Restoration.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Nakdan: Professional Hebrew Diacritizer.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

2019
Studying the history of the Arabic language: language technology and a large-scale historical corpus.
Lang. Resour. Evaluation, 2019

2018
Identification of Parallel Passages Across a Large Hebrew/Aramaic Corpus.
J. Data Min. Digit. Humanit., 2018

2016
Shamela: A Large-Scale Historical Arabic Corpus.
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities, 2016

