Benjamin Van Roy

According to our database, Benjamin Van Roy authored at least 95 papers between 1995 and 2019.

Bibliography

2019
Provably Efficient Reinforcement Learning with Aggregated States.
CoRR, 2019

Comments on the Du-Kakade-Wang-Yang Lower Bounds.
CoRR, 2019

Behaviour Suite for Reinforcement Learning.
CoRR, 2019

Information-Theoretic Confidence Bounds for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On the Performance of Thompson Sampling on Logistic Bandits.
Proceedings of the Conference on Learning Theory, 2019

2018
Learning to Optimize via Information-Directed Sampling.
Operations Research, 2018

A Tutorial on Thompson Sampling.
Foundations and Trends in Machine Learning, 2018

An Information-Theoretic Analysis of Thompson Sampling for Large Action Spaces.
CoRR, 2018

Satisficing in Time-Sensitive Bandit Learning.
CoRR, 2018

An Information-Theoretic Analysis for Thompson Sampling with Many Actions.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Scalable Coordinated Exploration in Concurrent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Coordinated Exploration in Concurrent Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization.
Math. Oper. Res., 2017

Learning to Price with Reference Effects.
CoRR, 2017

On Optimistic versus Randomized Exploration in Reinforcement Learning.
CoRR, 2017

Gaussian-Dirichlet Posterior Dominance in Sequential Learning.
CoRR, 2017

Deep Exploration via Randomized Value Functions.
CoRR, 2017

Time-Sensitive Bandit Learning and Satisficing Thompson Sampling.
CoRR, 2017

A Tutorial on Thompson Sampling.
CoRR, 2017

Ensemble Sampling.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Conservative Contextual Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
An Information-Theoretic Analysis of Thompson Sampling.
J. Mach. Learn. Res., 2016

On Lower Bounds for Regret in Reinforcement Learning.
CoRR, 2016

Posterior Sampling for Reinforcement Learning Without Episodes.
CoRR, 2016

Conservative Contextual Linear Bandits.
CoRR, 2016

Deep Exploration via Bootstrapped DQN.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Generalization and Exploration via Randomized Value Functions.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Adaptive Execution: Exploration and Learning of Price Impact.
Operations Research, 2015

Bootstrapped Thompson Sampling and Deep Exploration.
CoRR, 2015

2014
Learning to Optimize via Posterior Sampling.
Math. Oper. Res., 2014

Directed Principal Component Analysis.
Operations Research, 2014

Generalization and Exploration via Randomized Value Functions.
CoRR, 2014

Near-optimal Regret Bounds for Reinforcement Learning in Factored MDPs.
CoRR, 2014

Model-based Reinforcement Learning and the Eluder Dimension.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Near-optimal Reinforcement Learning in Factored MDPs.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Learning a factor model via regularized PCA.
Machine Learning, 2013

A Tractable POMDP for a Class of Sequencing Problems.
CoRR, 2013

Efficient Exploration and Value Function Generalization in Deterministic Systems.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Eluder Dimension and the Sample Complexity of Optimistic Exploration.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

(More) Efficient Reinforcement Learning via Posterior Sampling.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012
Intermediated Blind Portfolio Auctions.
Management Science, 2012

Directed Time Series Regression for Control.
CoRR, 2012

A Hybrid Method for Distance Metric Learning.
CoRR, 2012

Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2011
Industry dynamics: Foundations for models with an infinite number of firms.
J. Economic Theory, 2011

Resource Allocation via Message Passing.
INFORMS Journal on Computing, 2011

2010
Convergence of min-sum message-passing for convex optimization.
IEEE Trans. Information Theory, 2010

Universal reinforcement learning.
IEEE Trans. Information Theory, 2010

Manipulation Robustness of Collaborative Filtering.
Management Science, 2010

Computational Methods for Oblivious Equilibrium.
Operations Research, 2010

Investment and Market Structure in Industries with Congestion.
Operations Research, 2010

Dynamic Pricing with a Prior on Market Response.
Operations Research, 2010

On Regression-Based Stopping Times.
Discrete Event Dynamic Systems, 2010

2009
Convergence of min-sum message passing for quadratic optimization.
IEEE Trans. Information Theory, 2009

Manipulation Robustness of Collaborative Filtering Systems.
CoRR, 2009

Manipulation-resistant collaborative filtering systems.
Proceedings of the 2009 ACM Conference on Recommender Systems, 2009

Directed Regression.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

2008
Capacity of the Trapdoor Channel With Feedback.
IEEE Trans. Information Theory, 2008

Reputation markets.
Proceedings of the ACM SIGCOMM 2008 Workshop on Economics of Networked Systems, 2008

2007
A short proof of optimality for the MIN cache replacement algorithm.
Inf. Process. Lett., 2007

Capacity and Zero-Error Capacity of the Chemical Channel with Feedback.
Proceedings of the IEEE International Symposium on Information Theory, 2007

2006
Consensus Propagation.
IEEE Trans. Information Theory, 2006

Approximation algorithms for dynamic resource allocation.
Oper. Res. Lett., 2006

Performance Loss Bounds for Approximate Value Iteration with State Aggregation.
Math. Oper. Res., 2006

A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees.
Math. Oper. Res., 2006

A Nonparametric Approach to Multiproduct Pricing.
Operations Research, 2006

A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning.
Discrete Event Dynamic Systems, 2006

Convergence of the Min-Sum Message Passing Algorithm for Quadratic Optimization.
CoRR, 2006

2005
Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

TD(0) Leads to Better Policies than Approximate Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

A universal scheme for learning.
Proceedings of the 2005 IEEE International Symposium on Information Theory, 2005

2004
On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming.
Math. Oper. Res., 2004

Making Eigenvector-Based Reputation Systems Robust to Collusion.
Proceedings of the Algorithms and Models for the Web-Graph: Third International Workshop, 2004

Solitaire: Man Versus Machine.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

2003
The Linear Programming Approach to Approximate Dynamic Programming.
Operations Research, 2003

Decentralized decision-making in a large team with local information.
Games and Economic Behavior, 2003

Self-learning control of finite Markov chains: A.S. Poznyak, K. Najim, E. Gómez-Ramírez, Marcel Dekker, New York, 2000, $150, pp 298, ISBN 0-8247-9249-X.
Automatica, 2003

Distributed Optimization in Adaptive Networks.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

2002
On Average Versus Discounted Reward Temporal-Difference Learning.
Machine Learning, 2002

Approximate Linear Programming for Average-Cost Dynamic Programming.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

2001
Regression methods for pricing complex American-style options.
IEEE Trans. Neural Networks, 2001

An analysis of belief propagation on the turbo decoding graph with Gaussian densities.
IEEE Trans. Information Theory, 2001

A Tractable POMDP for Dynamic Sequencing with Applications to Personalized Internet Content Provision.
Proceedings of the UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, 2001

Approximate Dynamic Programming via Linear Programming.
Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

2000
Fixed Points of Approximate Value Iteration and Temporal-Difference Learning.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

1999
Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives.
IEEE Trans. Automat. Contr., 1999

Average cost temporal-difference learning.
Automatica, 1999

An Analysis of Turbo Decoding with Gaussian Densities.
Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

1998
Learning and value function approximation in complex decision processes.
PhD thesis, 1998

1996
Feature-Based Methods for Large Scale Dynamic Programming.
Machine Learning, 1996

Approximate Solutions to Optimal Stopping Problems.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

Analysis of Temporal-Difference Learning with Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995
Stable Linear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

