Jillian Bommarito

According to our database, Jillian Bommarito authored at least 5 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models.
CoRR, April, 2025

Precise Legal Sentence Boundary Detection for Retrieval at Scale: NUPunkt and CharBoundary.
CoRR, April, 2025

KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications.
CoRR, March, 2025

Towards Best Practices for Open Datasets for LLM Training.
CoRR, January, 2025

2023
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities.
CoRR, 2023
