Alessandro Lazaric

Orcid: 0000-0002-8970-413X

Affiliations:
  • Meta AI, France


According to our database, Alessandro Lazaric authored at least 134 papers between 2006 and 2024.

Bibliography

2024
Simple Ingredients for Offline Reinforcement Learning.
CoRR, 2024

Reinforcement Learning with Options and State Representation.
CoRR, 2024

2023
Group Fairness in Reinforcement Learning.
Trans. Mach. Learn. Res., 2023

Layered State Discovery for Incremental Autonomous Exploration.
Proceedings of the International Conference on Machine Learning, 2023

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Contextual bandits with concave rewards, and an application to fair ranking.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path.
Proceedings of the International Conference on Algorithmic Learning Theory, 2023

On the Complexity of Representation Learning in Contextual Linear Bandits.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Sketched Newton-Raphson.
SIAM J. Optim., 2022

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler.
CoRR, 2022

Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning.
CoRR, 2022

Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times.
Proceedings of the International Conference on Machine Learning, 2022

Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping.
Proceedings of the Conference on Robot Learning, 2022

A general sample complexity analysis of vanilla policy gradient.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Adaptive Multi-Goal Exploration.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Top K Ranking for Multi-Armed Bandit with Noisy Evaluations.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Differentially Private Exploration in Reinforcement Learning with Linear Representation.
CoRR, 2021

Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching.
CoRR, 2021

A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs.
CoRR, 2021

A Unified Framework for Conservative Exploration.
CoRR, 2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Provably Efficient Sample Collection Strategy for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning with Prototypical Representations.
Proceedings of the 38th International Conference on Machine Learning, 2021

Leveraging Good Representations in Linear Contextual Bandits.
Proceedings of the 38th International Conference on Machine Learning, 2021

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model.
Proceedings of the Algorithmic Learning Theory, 2021

2020
Improved Analysis of UCRL2 with Empirical Bernstein Inequality.
CoRR, 2020

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization.
CoRR, 2020

Concentration Inequalities for Multinoulli Random Variables.
CoRR, 2020

Active Model Estimation in Markov Decision Processes.
Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adversarial Attacks on Linear Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Near Optimal Policies with Low Inherent Bellman Error.
Proceedings of the 37th International Conference on Machine Learning, 2020

No-Regret Exploration in Goal-Oriented Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Meta-learning with Stochastic Linear Bandits.
Proceedings of the 37th International Conference on Machine Learning, 2020

Near-linear time Gaussian process optimization with adaptive batching and resparsification.
Proceedings of the 37th International Conference on Machine Learning, 2020

Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation.
Proceedings of the 37th International Conference on Machine Learning, 2020

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Novel Confidence-Based Algorithm for Structured Bandits.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A single algorithm for both restless and rested rotting bandits.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Conservative Exploration in Reinforcement Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Improved Algorithms for Conservative Exploration in Bandits.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.
CoRR, 2019

Limiting Extrapolation in Linear Approximate Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Regret Bounds for Learning State Representations in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret.
Proceedings of the Conference on Learning Theory, 2019

Active Exploration in Markov Decision Processes.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Rotting bandits are no harder than stochastic ones.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Word-order Biases in Deep-agent Emergent Communication.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes.
CoRR, 2018

Fighting Boredom in Recommender Systems with Linear Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Improved Large-Scale Graph Learning through Ridge Spectral Sparsification.
Proceedings of the 35th International Conference on Machine Learning, 2018

Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Experimental results: Reinforcement Learning of POMDPs using Spectral Methods.
CoRR, 2017

Regret Minimization in MDPs with Options without Prior Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Efficient Second-Order Online Kernel Learning with Adaptive Embedding.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Active Learning for Accurate Estimation of Linear Models.
Proceedings of the 34th International Conference on Machine Learning, 2017

Second-Order Kernel Online Convex Optimization with Adaptive Sketching.
Proceedings of the 34th International Conference on Machine Learning, 2017

Exploration-Exploitation in MDPs with Options.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Trading off Rewards and Errors in Multi-Armed Bandits.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Distributed Adaptive Sampling for Kernel Matrix Approximation.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Thompson Sampling for Linear-Quadratic Control Problems.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Linear Thompson Sampling Revisited.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Parallel Higher Order Alternating Least Square for Tensor Recommender System.
Proceedings of the Workshops of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Analysis of Classification-based Policy Iteration Algorithms.
J. Mach. Learn. Res., 2016

Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning.
CoRR, 2016

Analysis of Kelner and Levin graph sparsification algorithm for a streaming setting.
CoRR, 2016

Reinforcement Learning of Contextual MDPs using Spectral Methods.
CoRR, 2016

Reinforcement Learning of POMDP's using Spectral Methods.
CoRR, 2016

Analysis of Nyström method with sequential ridge leverage scores.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies.
Proceedings of the 29th Conference on Learning Theory, 2016

Reinforcement Learning of POMDPs using Spectral Methods.
Proceedings of the 29th Conference on Learning Theory, 2016

Improved Learning Complexity in Combinatorial Pure Exploration Bandits.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Sparse multi-task reinforcement learning.
Intelligenza Artificiale, 2015

Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits.
CoRR, 2015

Truthful learning mechanisms for multi-slot sponsored search auctions with externalities.
Artif. Intell., 2015

The replacement bootstrap for dependent data.
Proceedings of the IEEE International Symposium on Information Theory, 2015

Direct Policy Iteration with Demonstrations.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

2014
Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback.
CoRR, 2014

Best-Arm Identification in Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Exploiting easy data in online optimization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Online Stochastic Optimization under Correlated Bandit Feedback.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Regret Bounds for Reinforcement Learning with Policy Advice.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Sequential Transfer in Multi-armed Bandit with Finite Set of Models.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012
Finite-sample analysis of least-squares policy iteration.
J. Mach. Learn. Res., 2012

Learning with stochastic inputs and adversarial outputs.
J. Comput. Syst. Sci., 2012

A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities.
Proceedings of the 13th ACM Conference on Electronic Commerce, 2012

Risk-Aversion in Multi-armed Bandits.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

A Dantzig Selector Approach to Temporal Difference Learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

Semi-Supervised Apprenticeship Learning.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

A truthful learning mechanism for multi-slot sponsored search auctions with externalities.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Conservative and Greedy Approaches to Classification-Based Policy Iteration.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

Transfer in Reinforcement Learning: A Framework and a Survey.
Proceedings of the Reinforcement Learning, 2012

Least-Squares Methods for Policy Iteration.
Proceedings of the Reinforcement Learning, 2012

2011
Transfer from Multiple MDPs.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Multi-Bandit Best Arm Identification.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Finite-Sample Analysis of Lasso-TD.
Proceedings of the 28th International Conference on Machine Learning, 2011

Classification-based Policy Iteration with a Critic.
Proceedings of the 28th International Conference on Machine Learning, 2011

Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

2010
Finite-sample Analysis of Bellman Residual Minimization.
Proceedings of the 2nd Asian Conference on Machine Learning, 2010

LSTD with Random Projections.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Finite-Sample Analysis of LSTD.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Analysis of a Classification-based Policy Iteration Algorithm.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Bayesian Multi-Task Reinforcement Learning.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009
Reinforcement distribution in fuzzy Q-learning.
Fuzzy Sets Syst., 2009

Workshop summary: On-line learning with limited feedback.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Hybrid Stochastic-Adversarial On-line Learning.
Proceedings of the COLT 2009, 2009

2008
Improving Batch Reinforcement Learning Performance through Transfer of Samples.
Proceedings of the STAIRS 2008, 2008

Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot.
Proceedings of the Artificial Intelligence in Theory and Practice II, 2008

Transfer of samples in batch reinforcement learning.
Proceedings of the Machine Learning, 2008

On the usefulness of opponent modeling: the Kuhn Poker case study.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Transfer of task representation in reinforcement learning using policy-based proto-value functions.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Towards Automated Bargaining in Electronic Markets: A Partially Two-Sided Competition Model.
Proceedings of the Agent-Mediated Electronic Commerce and Trading Agent Design and Analysis, 2008

2007
Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Piecewise constant reinforcement learning for robotic applications.
Proceedings of the ICINCO 2007, 2007

Reinforcement learning in extensive form games with incomplete information: the bargaining case study.
Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions.
Proceedings of the AI*IA 2007: Artificial Intelligence and Human-Oriented Computing, 2007

Bifurcation Analysis of Reinforcement Learning Agents in the Selten's Horse Game.
Proceedings of the Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, 2007

2006
Incremental Skill Acquisition for Self-motivated Learning Animats.
Proceedings of the From Animals to Animats 9, 2006

Learning to cooperate in multi-agent social dilemmas.
Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), 2006

