We stand with Ukraine

Johan Ferret

According to our database¹, Johan Ferret authored at least 23 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number² of three.
Erdős number³ of three.

Bibliography

2025

On Teacher Hacking in Language Model Distillation.

[DOI]

[...], Daniele Calandriello, [...], [...], [...], Alexandre Ramé, Mathieu Blondel

Proceedings of the Forty-second International Conference on Machine Learning, 2025

BOND: Aligning LLMs with Best-of-N Distillation.

[DOI]

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Diversity-Rewarded CFG Distillation.

[DOI]

Geoffrey Cideron, Andrea Agostinelli, [...], [...], [...], [...], [...], Alexandre Ramé

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning.

[DOI]

Eduardo Pignatelli, [...], [...], [...], Hado van Hasselt, [...]

Trans. Mach. Learn. Res., 2024

Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL.

[DOI]

Eduardo Pignatelli, [...], Tim Rocktäschel, Edward Grefenstette, Davide Paglieri, [...], [...]

CoRR, 2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning.

[DOI]

CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.

[DOI]

CoRR, 2024

WARP: On the Benefits of Weight Averaged Rewarded Policies.

[DOI]

Alexandre Ramé, [...], [...], [...], Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, [...], Arthur Douillard, [...]

CoRR, 2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models.

[DOI]

Aleksandar Botev, [...], Samuel L. Smith, Anushan Fernando, George-Cristian Muraru, [...], Leonard Berrada, [...], Pier Giuseppe Sessa, [...], Léonard Hussenot, [...], [...], [...], [...], Kathleen Kenealy, [...], [...], Surya Bhupatiraju, [...], [...], Morgane Rivière, Mihir Sanjay Kale, [...], [...], [...], [...], [...], [...], Srivatsan Srinivasan, Guillaume Desjardins, [...], [...], [...], [...], [...], Sebastian Borgeaud, [...], [...], Antonia Paterson, [...], [...], [...], Nesh Devanathan, [...], [...], [...], Luiz Gustavo Martins, [...], David Huntsperger, [...], [...], [...], [...], [...], Zoubin Ghahramani, Clément Farabet, Koray Kavukcuoglu, [...], [...], [...], Nando de Freitas

CoRR, 2024

Direct Language Model Alignment from Online AI Feedback.

[DOI]

[...], [...], [...], [...], [...], Felipe Llinares, Alexandre Ramé, [...], [...], [...], [...], Mathieu Blondel

CoRR, 2024

WARM: On the Benefits of Weight Averaged Reward Models.

[DOI]

Alexandre Ramé, [...], Léonard Hussenot, [...], Geoffrey Cideron, [...], [...]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback.

[DOI]

[...], [...], [...], [...], [...], [...], [...], [...], [...], Abhinav Rastogi, Sushant Prakash

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning.

[DOI]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.

[DOI]

[...], [...], [...], [...], Geoffrey Cideron, [...], [...], [...], Léonard Hussenot, [...], [...], Sabela Ramos Garea, [...], [...], [...], [...], Avinatan Hassidim, Olivier Pietquin, [...]

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

On actions that matter: credit assignment and interpretability in reinforcement learning. (De l'importance des actions: assignation de crédit et interprétabilité pour l'apprentissage par renforcement).

[DOI]

PhD thesis, 2022

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act.

[DOI]

[...], [...], Olivier Pietquin, [...]

CoRR, 2022

Lazy-MDPs: Towards Interpretable RL by Learning When to Act.

[DOI]

[...], [...], Olivier Pietquin, [...]

Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2021

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences.

[DOI]

[...], Nathan Grinsztajn, [...], [...]

CoRR, 2021

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning.

[DOI]

Nathan Grinsztajn, [...], Olivier Pietquin, [...], [...]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Adversarially Guided Actor-Critic.

[DOI]

Yannis Flet-Berliac, [...], Olivier Pietquin, [...], [...]

Proceedings of the 9th International Conference on Learning Representations, 2021

Self-Imitation Advantage Learning.

[DOI]

[...], Olivier Pietquin, [...]

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020

Self-Attentional Credit Assignment for Transfer in Reinforcement Learning.

[DOI]

[...], Raphaël Marinier, [...], Olivier Pietquin

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019

Credit Assignment as a Proxy for Transfer in Reinforcement Learning.

[DOI]

[...], Raphaël Marinier, [...], Olivier Pietquin

CoRR, 2019
