Lester James V. Miranda

Orcid: 0000-0002-7872-6464

According to our database, Lester James V. Miranda authored at least 23 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
FilBench: Can LLMs Understand and Generate Filipino?
CoRR, August, 2025

R3: Robust Rubric-Agnostic Reward Models.
CoRR, May, 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
CoRR, March, 2025

MMTEB: Massive Multilingual Text Embedding Benchmark.
CoRR, February, 2025

2 OLMo 2 Furious.
CoRR, January, 2025

RewardBench: Evaluating Reward Models for Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

M-RewardBench: Evaluating Reward Models in Multilingual Settings.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training.
CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.
CoRR, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.
CoRR, 2024


Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024


2023
calamanCy: A Tagalog Natural Language Processing Toolkit.
CoRR, 2023

Developing a Named Entity Recognition Dataset for Tagalog.
CoRR, 2023

2022
Multi hash embeddings in spaCy.
CoRR, 2022

2019
Geomancer: An Open-Source Framework for Geospatial Feature Engineering.
CoRR, 2019

2018
PySwarms: a research toolkit for Particle Swarm Optimization in Python.
J. Open Source Softw., 2018

Feature Extraction Using a Mutually-Competitive Autoencoder for Protein Function Prediction.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2018

A Deep Learning Approach Based on Stacked Denoising Autoencoders for Protein Function Prediction.
Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference, 2018

