Craig W. Schmidt

According to our database, Craig W. Schmidt authored at least 11 papers between 2000 and 2026.

Bibliography

2026
Tokenisation via Convex Relaxations.
CoRR, May, 2026

Faster Superword Tokenization.
CoRR, April, 2026

The Effect of Scripts and Formats on LLM Numeracy.
CoRR, January, 2026

2025
Entropy-Driven Pre-Tokenization for Byte-Pair Encoding.
CoRR, June, 2025

Boundless Byte Pair Encoding: Breaking the Pre-tokenization Barrier.
CoRR, April, 2025

How Much is Enough? The Diminishing Returns of Tokenization Training Data.
CoRR, February, 2025

2024
SEC-QA: A Systematic Evaluation Corpus for Financial QA.
CoRR, 2024

Greed is All You Need: An Evaluation of Tokenizer Inference Methods.
CoRR, 2024

Tokenization Is More Than Compression.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2019
Improving a tf-idf weighted document vector embedding.
CoRR, 2019

2000
The exact overall time distribution of a project with uncertain task durations.
Eur. J. Oper. Res., 2000

