Gergely Neu

ORCID: 0000-0001-6287-3796

Affiliations:
  • Pompeu Fabra University, DTIC, Barcelona, Spain


According to our database, Gergely Neu authored at least 70 papers between 2007 and 2025.

Bibliography

2025
Linear Bandits with Non-i.i.d. Noise.
CoRR, May, 2025

Inverse Q-Learning Done Right: Offline Imitation Learning in Q^π-Realizable MDPs.
CoRR, May, 2025

Distances for Markov chains from sample streams.
CoRR, May, 2025

Confidence Sequences for Generalized Linear Models via Regret Analysis.
CoRR, April, 2025

Optimistically Optimistic Exploration for Provably Efficient Infinite-Horizon Reinforcement and Imitation Learning.
Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Generalization bounds for mixing processes via delayed online-to-PAC conversions.
Proceedings of the International Conference on Algorithmic Learning Theory, 2025

Offline RL via Feature-Occupancy Gradient Ascent.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

Online-to-PAC generalization bounds under graph-mixing dependencies.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024
On the Hardness of Learning from Censored and Nonstationary Demand.
INFORMS J. Optim., 2024

Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Optimistic Information Directed Sampling.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, 2024

Adversarial Contextual Bandits Go Kernelized.
Proceedings of the International Conference on Algorithmic Learning Theory, 2024

Importance-Weighted Offline Learning Done Right.
Proceedings of the International Conference on Algorithmic Learning Theory, 2024

Offline Primal-Dual Reinforcement Learning for Linear MDPs.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Online-to-PAC Conversions: Generalization Bounds via Regret Analysis.
CoRR, 2023

First- and Second-Order Bounds for Adversarial Linear Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Optimistic Planning by Regularized Dynamic Programming.
Proceedings of the International Conference on Machine Learning, 2023

Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization.
Proceedings of the International Conference on Algorithmic Learning Theory, 2023

Online Learning with Off-Policy Feedback.
Proceedings of the International Conference on Algorithmic Learning Theory, 2023

Nonstochastic Contextual Combinatorial Bandits.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Sufficient Exploration for Convex Q-learning.
CoRR, 2022

Proximal Point Imitation Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Generalization Bounds via Convex Analysis.
Proceedings of the Conference on Learning Theory, 2022

Convex Analytic Theory for Convex Q-Learning.
Proceedings of the 61st IEEE Conference on Decision and Control, 2022

2021
Robustness and risk management via distributional dynamic programming.
CoRR, 2021

Learning to maximize global influence from local observations.
CoRR, 2021

Online learning in MDPs with linear function approximation and bandit feedback.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent.
Proceedings of the Conference on Learning Theory, 2021

Convex Q-Learning.
Proceedings of the 2021 American Control Conference, 2021

Logistic Q-Learning.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
A Unifying View of Optimism in Episodic Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Faster saddle-point optimization for solving large-scale Markov decision processes.
Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Fast Rates for Online Prediction with Abstention.
Proceedings of the Conference on Learning Theory, 2020

Efficient and robust algorithms for adversarial linear contextual bandits.
Proceedings of the Conference on Learning Theory, 2020

Algorithmic Learning Theory 2020: Preface.
Proceedings of the Algorithmic Learning Theory, 2020

2019
Potential and pitfalls of Multi-Armed Bandits for decentralized Spatial Reuse in WLANs.
J. Netw. Comput. Appl., 2019

Collaborative Spatial Reuse in wireless networks via selfish Multi-Armed Bandits.
Ad Hoc Networks, 2019

Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Beating SGD Saturation with Tail-Averaging and Minibatching.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Bandit Principal Component Analysis.
Proceedings of the Conference on Learning Theory, 2019

Online Influence Maximization with Local Observations.
Proceedings of the Algorithmic Learning Theory, 2019

2018
Wireless Optimisation via Convex Bandits: Unlicensed LTE/WiFi Coexistence.
Proceedings of the 2018 Workshop on Network Meets AI & ML, 2018

Iterate Averaging as Regularization for Stochastic Gradient Descent.
Proceedings of the Conference On Learning Theory, 2018

2017
On the Hardness of Inventory Management with Censored Demand Data.
CoRR, 2017

A unified view of entropy-regularized Markov decision processes.
CoRR, 2017

Boltzmann Exploration Done Right.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Algorithmic Stability and Hypothesis Complexity.
Proceedings of the 34th International Conference on Machine Learning, 2017

Fast rates for online learning in Linearly Solvable Markov Decision Processes.
Proceedings of the 30th Conference on Learning Theory, 2017

2016
Importance Weighting Without Importance Weights: An Efficient Algorithm for Combinatorial Semi-Bandits.
J. Mach. Learn. Res., 2016

Online learning with Erdős-Rényi side-observation graphs.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Online Learning with Noisy Side Observations.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Random-Walk Perturbations for Online Combinatorial Optimization.
IEEE Trans. Inf. Theory, 2015

Explore no more: Improved high-probability regret bounds for non-stochastic bandits.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

First-order regret bounds for combinatorial semi-bandits.
Proceedings of The 28th Conference on Learning Theory, 2015

2014
Near-Optimal Rates for Limited-Delay Universal Lossy Source Coding.
IEEE Trans. Inf. Theory, 2014

Online Markov Decision Processes Under Bandit Feedback.
IEEE Trans. Autom. Control., 2014

Online learning in MDPs with side information.
CoRR, 2014

Exploiting easy data in online optimization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Online combinatorial optimization with stochastic decision sets and adversarial losses.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Efficient learning by implicit exploration in bandit problems with side observations.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Online tanulás nemstacionárius Markov döntési folyamatokban (Online learning in nonstationary Markov decision processes).
PhD thesis, 2013

Online learning in episodic Markovian decision processes by relative entropy policy search.
Proceedings of the Advances in Neural Information Processing Systems 26: Annual Conference on Neural Information Processing Systems 2013, 2013

Prediction by random-walk perturbation.
Proceedings of the COLT 2013, 2013

An Efficient Algorithm for Learning with Semi-bandit Feedback.
Proceedings of the Algorithmic Learning Theory - 24th International Conference, 2013

2012
The adversarial stochastic shortest path problem with unknown transition probabilities.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

2010
The Online Loop-free Stochastic Shortest-Path Problem.
Proceedings of the COLT 2010, 2010

2009
Training parsers by inverse reinforcement learning.
Mach. Learn., 2009

2007
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods.
Proceedings of the UAI 2007, 2007
