Taishi Nakamura
According to our database, Taishi Nakamura authored at least 15 papers between 2024 and 2025.
Bibliography
2025
MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources.
CoRR, September, 2025
Open-sci-ref-0.01: open and reproducible reference baselines for language model and dataset comparison.
CoRR, September, 2025
CoRR, August, 2025
Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models.
CoRR, March, 2025
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search.
CoRR, March, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs.
CoRR, 2024
CoRR, 2024
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs.
CoRR, 2024
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities.
CoRR, 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order.
CoRR, 2024