Yannis Flet-Berliac

According to our database, Yannis Flet-Berliac authored at least 17 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Averaging log-likelihoods in direct alignment.
CoRR, 2024

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion.
CoRR, 2024

OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators.
CoRR, 2024

2023
PASTA: Pretrained Action-State Transformer Agents.
CoRR, 2023

Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Model-Based Offline Reinforcement Learning with Local Misspecification.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics.
CoRR, 2022

Offline policy optimization with eligible actions.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Sample-Efficient Deep Reinforcement Learning for Control, Exploration and Safety. (Apprentissage par renforcement profond efficace pour le contrôle, l'exploration et la sûreté).
PhD thesis, 2021

Learning Value Functions in Deep Policy Gradients using Residual Variance.
Proceedings of the 9th International Conference on Learning Representations, 2021

Adversarially Guided Actor-Critic.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients.
CoRR, 2020

Only Relevant Information Matters: Filtering Out Noisy Samples To Boost RL.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019
High-Dimensional Control Using Generalized Auxiliary Tasks.
CoRR, 2019

Samples are not all useful: Denoising policy gradient updates using variance.
CoRR, 2019

2017
Hearables in Hearing Care: Discovering Usage Patterns Through IoT Devices.
Proceedings of the Universal Access in Human-Computer Interaction. Human and Technological Environments, 2017

