Luckeciano Carvalho Melo

Orcid: 0000-0003-2599-6265

According to our database¹, Luckeciano Carvalho Melo authored at least 15 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Learning Without Critics? Revisiting GRPO in Classical Reinforcement Learning Environments.

Bryan L. M. de Oliveira

Felipe Vieira Frujeri

Marcos P. C. M. Queiroz

Luana G. B. Martins

Telma W. L. Soares

Luckeciano C. Melo

CoRR, November, 2025

Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning.

Luckeciano C. Melo

Alessandro Abate

Yarin Gal

CoRR, October, 2025

InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context.

Bryan L. M. de Oliveira

Luana G. B. Martins

Bruno Brandão

Luckeciano C. Melo

CoRR, February, 2025

Uncertainty-Aware Step-wise Verification with Generative Reward Models.

Zihuiwen Ye

Luckeciano Carvalho Melo

CoRR, February, 2025

Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning.

Bryan Lincoln Marques de Oliveira

Luana Guedes Barros Martins

Bruno Brandão

Murilo Lopes da Luz

Telma Woerle de Lima Soares

Luckeciano Carvalho Melo

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

Temporal-Difference Variational Continual Learning.

Luckeciano C. Melo

Alessandro Abate

Yarin Gal

CoRR, 2024

Deep Bayesian Active Learning for Preference Modeling in Large Language Models.

Luckeciano Carvalho Melo

Panagiotis Tigas

Alessandro Abate

Yarin Gal

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2022

Multiagent Reinforcement Learning for Strategic Decision Making and Control in Robotic Soccer Through Self-Play.

Marcos R. O. A. Máximo

IEEE Access, 2022

Transformers are Meta-Reinforcement Learners.

Luckeciano C. Melo

Proceedings of the International Conference on Machine Learning, 2022

2021

Learning Humanoid Robot Running Motions with Symmetry Incentive through Proximal Policy Optimization.

Luckeciano Carvalho Melo

Dicksiano Carvalho Melo

Marcos Ricardo Omena de Albuquerque Máximo

J. Intell. Robotic Syst., 2021

2020

Contextual Meta-Bandit for Recommender Systems Selection.

Marlesson R. O. Santana

Luckeciano C. Melo

Fernando H. F. Camargo

Proceedings of the RecSys 2020: Fourteenth ACM Conference on Recommender Systems, 2020

MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces.

Marlesson R. O. Santana

Luckeciano C. Melo

Fernando H. F. Camargo

Proceedings of the 20th International Conference on Data Mining Workshops, 2020

2019

Bottom-Up Meta-Policy Search.

Luckeciano Carvalho Melo

Marcos Ricardo Omena Albuquerque Máximo

Adilson Marques da Cunha

CoRR, 2019

Learning Humanoid Robot Motions Through Deep Neural Networks.

Luckeciano Carvalho Melo

Marcos Ricardo Omena Albuquerque Máximo

Adilson Marques da Cunha

CoRR, 2019

Learning Humanoid Robot Running Skills through Proximal Policy Optimization.

Luckeciano Carvalho Melo

Marcos Ricardo Omena Albuquerque Máximo

Proceedings of the Latin American Robotics Symposium, 2019
