Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

(More) Efficient Reinforcement Learning via Posterior Sampling.

Ian Osband

Daniel Russo

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012

Intermediated Blind Portfolio Auctions.

Michael Padilla

Manag. Sci., 2012

Directed Time Series Regression for Control

Yi-Hao Kao

CoRR, 2012

A Hybrid Method for Distance Metric Learning.

CoRR, 2012

Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems.

Morteza Ibrahimi

Adel Javanmard

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2011

Industry dynamics: Foundations for models with an infinite number of firms.

C. Lanier Benkard

J. Econ. Theory, 2011

Resource Allocation via Message Passing.

INFORMS J. Comput., 2011

2010

Convergence of min-sum message-passing for convex optimization.

IEEE Trans. Inf. Theory, 2010

Universal reinforcement learning.

Vivek F. Farias

Tsachy Weissman

IEEE Trans. Inf. Theory, 2010

Manipulation Robustness of Collaborative Filtering.

Manag. Sci., 2010

Computational Methods for Oblivious Equilibrium.

C. Lanier Benkard

Oper. Res., 2010

Investment and Market Structure in Industries with Congestion.

Ramesh Johari

Oper. Res., 2010

Dynamic Pricing with a Prior on Market Response.

Vivek F. Farias

Oper. Res., 2010

On Regression-Based Stopping Times.

Discret. Event Dyn. Syst., 2010

2009

Convergence of min-sum message passing for quadratic optimization.

IEEE Trans. Inf. Theory, 2009

Manipulation Robustness of Collaborative Filtering Systems

CoRR, 2009

Manipulation-resistant collaborative filtering systems.

Proceedings of the 2009 ACM Conference on Recommender Systems, 2009

Directed Regression.

Yi-Hao Kao

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009

2008

Capacity of the Trapdoor Channel With Feedback.

IEEE Trans. Inf. Theory, 2008

Reputation markets.

Proceedings of the ACM SIGCOMM 2008 Workshop on Economics of Networked Systems, 2008

2007

A short proof of optimality for the MIN cache replacement algorithm.

Inf. Process. Lett., 2007

Capacity and Zero-Error Capacity of the Chemical Channel with Feedback.

Proceedings of the IEEE International Symposium on Information Theory, 2007

2006

Approximation algorithms for dynamic resource allocation.

Vivek F. Farias

Oper. Res. Lett., 2006

Performance Loss Bounds for Approximate Value Iteration with State Aggregation.

Math. Oper. Res., 2006

A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees.

Math. Oper. Res., 2006

A Nonparametric Approach to Multiproduct Pricing.

Peter W. Glynn

Oper. Res., 2006

Convergence of the Min-Sum Message Passing Algorithm for Quadratic Optimization

CoRR, 2006

2005

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games.

C. Lanier Benkard

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

TD(0) Leads to Better Policies than Approximate Value Iteration.

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Consensus Propagation.

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

A universal scheme for learning.

Proceedings of the 2005 IEEE International Symposium on Information Theory, 2005

2004

On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming.

Math. Oper. Res., 2004

Making Eigenvector-Based Reputation Systems Robust to Collusion.

Proceedings of the Algorithms and Models for the Web-Graph: Third International Workshop, 2004

Solitaire: Man Versus Machine.

Persi Diaconis

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees.

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

2003

The Linear Programming Approach to Approximate Dynamic Programming.

Oper. Res., 2003

Decentralized decision-making in a large team with local information.

Games Econ. Behav., 2003

Self-learning control of finite Markov chains: A.S. Poznyak, K. Najim, E. Gómez-Ramírez, Marcel Dekker, New York, 2000, $150, pp 298, ISBN 0-8247-9249-X.

Autom., 2003

Distributed Optimization in Adaptive Networks.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

On constraint sampling in the linear programming approach to approximate linear programming.

Proceedings of the 42nd IEEE Conference on Decision and Control, 2003

2002

On Average Versus Discounted Reward Temporal-Difference Learning.

Mach. Learn., 2002

Approximate Linear Programming for Average-Cost Dynamic Programming.

Proceedings of the Advances in Neural Information Processing Systems 15, 2002

2001

Regression methods for pricing complex American-style options.

IEEE Trans. Neural Networks, 2001

An analysis of belief propagation on the turbo decoding graph with Gaussian densities.

IEEE Trans. Inf. Theory, 2001

A Tractable POMDP for Dynamic Sequencing with Applications to Personalized Internet Content Provision.

UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, 2001

Approximate Dynamic Programming via Linear Programming.

Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal Difference Learning.

David Choi

Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

2000

Fixed Points of Approximate Value Iteration and Temporal-Difference Learning.

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

The optimal harvesting of environmental bads.

Nathaniel Keohane

Richard Zeckhauser

Proceedings of the 39th IEEE Conference on Decision and Control, 2000

Approximate value iteration with randomized policies.

Proceedings of the 39th IEEE Conference on Decision and Control, 2000

1999

Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives.

IEEE Trans. Autom. Control., 1999

Average cost temporal-difference learning.
