Sham M. Kakade

Affiliations:
  • University of Washington, Department of Statistics, Seattle, WA, USA
  • Microsoft Research New England, Cambridge, MA, USA
  • Toyota Technological Institute at Chicago, IL, USA
  • University of Pennsylvania, Department of Statistics, Philadelphia, PA, USA
  • University College London, Gatsby Computational Neuroscience Unit, UK


According to our database, Sham M. Kakade authored at least 200 papers between 1999 and 2022.

Bibliography

2022
Robust Aggregation for Federated Learning.
IEEE Trans. Signal Process., 2022

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity.
CoRR, 2022

The Role of Coverage in Online Reinforcement Learning.
CoRR, 2022

Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms.
CoRR, 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift.
CoRR, 2022

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit.
CoRR, 2022

Matryoshka Representations for Adaptive Deployment.
CoRR, 2022

A Sharp Characterization of Linear Estimators for Offline Policy Evaluation.
CoRR, 2022

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime.
CoRR, 2022

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression.
Proceedings of the International Conference on Machine Learning, 2022

Understanding Contrastive Learning Requires Incorporating Inductive Biases.
Proceedings of the International Conference on Machine Learning, 2022

Sparsity in Partially Controllable Linear Systems.
Proceedings of the International Conference on Machine Learning, 2022

Inductive Biases and Variable Creation in Self-Attention Mechanisms.
Proceedings of the International Conference on Machine Learning, 2022

Multi-Stage Episodic Control for Strategic Exploration in Text Games.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Anti-Concentrated Confidence Bonuses For Scalable Exploration.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift.
J. Mach. Learn. Res., 2021

On Nonconvex Optimization for Machine Learning: Gradients, Stochasticity, and Saddle Points.
J. ACM, 2021

The Statistical Complexity of Interactive Decision Making.
CoRR, 2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization.
CoRR, 2021

A Short Note on the Relationship of Information Gain and Eluder Dimension.
CoRR, 2021

Koopman Spectrum Nonlinear Regulator and Provably Efficient Online Learning.
CoRR, 2021

An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap.
CoRR, 2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Robust and differentially private mean estimation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Going Beyond Linear RL: Sample Efficient Neural Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Gone Fishing: Neural Active Learning with Fisher Embeddings.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Instabilities of Offline RL with Pre-Trained Neural Representation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL.
Proceedings of the 38th International Conference on Machine Learning, 2021

How Important is the Train-Validation Split in Meta-Learning?
Proceedings of the 38th International Conference on Machine Learning, 2021

What are the Statistical Limits of Offline RL with Linear Function Approximation?
Proceedings of the 9th International Conference on Learning Representations, 2021

Optimal Regularization can Mitigate Double Descent.
Proceedings of the 9th International Conference on Learning Representations, 2021

Few-Shot Learning via Learning the Representation, Provably.
Proceedings of the 9th International Conference on Learning Representations, 2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression.
Proceedings of the Conference on Learning Theory, 2021

2020
Stochastic Subgradient Method Converges on Tame Functions.
Found. Comput. Math., 2020

PACT: Privacy-Sensitive Protocols And Mechanisms for Mobile Contact Tracing.
IEEE Data Eng. Bull., 2020

Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
CoRR, 2020

PACT: Privacy Sensitive Protocols and Mechanisms for Mobile Contact Tracing.
CoRR, 2020

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Is Long Horizon RL More Difficult Than Short Horizon RL?
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Robust Meta-learning for Mixed Linear Regression with Small Batches.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Information Theoretic Regret Bounds for Online Nonlinear Control.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

The Implicit and Explicit Regularization Effects of Dropout.
Proceedings of the 37th International Conference on Machine Learning, 2020

Soft Threshold Weight Reparameterization for Learnable Sparsity.
Proceedings of the 37th International Conference on Machine Learning, 2020

Meta-learning for Mixed Linear Regression.
Proceedings of the 37th International Conference on Machine Learning, 2020

Calibration, Entropy Rates, and Memory in Language Models.
Proceedings of the 37th International Conference on Machine Learning, 2020

Provable Representation Learning for Imitation Learning via Bi-level Optimization.
Proceedings of the 37th International Conference on Machine Learning, 2020

Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
Proceedings of the 8th International Conference on Learning Representations, 2020

Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal.
Proceedings of the Conference on Learning Theory, 2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes.
Proceedings of the Conference on Learning Theory, 2020

The Nonstochastic Control Problem.
Proceedings of the Algorithmic Learning Theory, 2020

Leverage Score Sampling for Faster Accelerated Regression and ERM.
Proceedings of the Algorithmic Learning Theory, 2020

2019
Robust Aggregation for Federated Learning.
CoRR, 2019

Optimal Estimation of Change in a Population of Parameters.
CoRR, 2019

On the Optimality of Sparse Model-Based Planning for Markov Decision Processes.
CoRR, 2019

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure.
CoRR, 2019

Stochastic Gradient Descent Escapes Saddle Points Efficiently.
CoRR, 2019

A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm.
CoRR, 2019

The Illusion of Change: Correcting for Biases in Change Inference for Sparse, Societal-Scale Data.
Proceedings of the World Wide Web Conference, 2019

Meta-Learning with Implicit Gradients.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Coupled Recurrent Models for Polyphonic Music Composition.
Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Maximum Likelihood Estimation for Learning Populations of Parameters.
Proceedings of the 36th International Conference on Machine Learning, 2019

Provably Efficient Maximum Entropy Exploration.
Proceedings of the 36th International Conference on Machine Learning, 2019

Online Meta-Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

Online Control with Adversarial Disturbances.
Proceedings of the 36th International Conference on Machine Learning, 2019

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control.
Proceedings of the 7th International Conference on Learning Representations, 2019

Open Problem: Do Good Algorithms Necessarily Query Bad Points?
Proceedings of the Conference on Learning Theory, 2019

2018
Provably Correct Automatic Subdifferentiation for Qualified Programs.
CoRR, 2018

Global Convergence of Policy Gradient Methods for Linearized Control Problems.
CoRR, 2018

Prediction with a short memory.
Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018

A Smoother Way to Train Structured Prediction Models.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Provably Correct Automatic Sub-Differentiation for Qualified Programs.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization.
Proceedings of the 2018 Information Theory and Applications Workshop, 2018

Recovering Structured Probability Matrices.
Proceedings of the 9th Innovations in Theoretical Computer Science Conference, 2018

Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator.
Proceedings of the 35th International Conference on Machine Learning, 2018

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines.
Proceedings of the 6th International Conference on Learning Representations, 2018

Invariances and Data Augmentation for Supervised Music Transcription.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Accelerating Stochastic Gradient Descent for Least Squares Regression.
Proceedings of the Conference On Learning Theory, 2018

2017
Parallelizing Stochastic Gradient Descent for Least Squares Regression: Mini-batching, Averaging, and Model Misspecification.
J. Mach. Learn. Res., 2017

Accelerating Stochastic Gradient Descent.
CoRR, 2017

Learning Overcomplete HMMs.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards Generalization and Simplicity in Continuous Control.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

How to Escape Saddle Points Efficiently.
Proceedings of the 34th International Conference on Machine Learning, 2017

Learning Features of Music From Scratch.
Proceedings of the 5th International Conference on Learning Representations, 2017

A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares).
Proceedings of the 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, 2017

Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Minimal Realization Problems for Hidden Markov Models.
IEEE Trans. Signal Process., 2016

Canonical Correlation Analysis for Analyzing Sequences of Medical Billing Codes.
CoRR, 2016

Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging.
CoRR, 2016

Matching Matrix Bernstein with Little Memory: Near-Optimal Finite Sample Guarantees for Oja's Algorithm.
CoRR, 2016

Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Faster Eigenvector Computation via Shift-and-Invert Preconditioning.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Streaming PCA: Matching Matrix Bernstein and Near-Optimal Finite Sample Guarantees for Oja's Algorithm.
Proceedings of the 29th Conference on Learning Theory, 2016

2015
When are overcomplete topic models identifiable? uniqueness of tensor tucker decompositions with structured sparsity.
J. Mach. Learn. Res., 2015

Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation.
CoRR, 2015

Computing Matrix Squareroot via Non Convex Local Search.
CoRR, 2015

A Spectral Algorithm for Latent Dirichlet Allocation.
Algorithmica, 2015

Learning Mixtures of Gaussians in High Dimensions.
Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, 2015

Super-Resolution Off the Grid.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Convergence Rates of Active Learning for Maximum Likelihood Estimation.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization.
Proceedings of the 32nd International Conference on Machine Learning, 2015

A Linear Dynamical System Model for Text.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Competing with the Empirical Risk Minimizer in a Single Pass.
Proceedings of The 28th Conference on Learning Theory, 2015

Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT).
Proceedings of the Algorithmic Learning Theory - 26th International Conference, 2015

2014
Tensor decompositions for learning latent variable models.
J. Mach. Learn. Res., 2014

A tensor approach to learning mixed membership community models.
J. Mach. Learn. Res., 2014

Least Squares Revisited: Scalable Approaches for Multi-class Prediction.
Proceedings of the 31st International Conference on Machine Learning, 2014

Minimal realization problem for Hidden Markov Models.
Proceedings of the 52nd Annual Allerton Conference on Communication, 2014

2013
Stochastic Convex Optimization with Bandit Feedback.
SIAM J. Optim., 2013

A risk comparison of ordinary least squares vs ridge regression.
J. Mach. Learn. Res., 2013

Optimal Dynamic Mechanism Design and the Virtual-Pivot Mechanism.
Oper. Res., 2013

Learning mixtures of spherical gaussians: moment methods and spectral decompositions.
Proceedings of the Innovations in Theoretical Computer Science, 2013

Learning Linear Bayesian Networks with Latent Variables.
Proceedings of the 30th International Conference on Machine Learning, 2013

A Tensor Spectral Approach to Learning Mixed Membership Community Models.
Proceedings of the COLT 2013, 2013

2012
Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting.
IEEE Trans. Inf. Theory, 2012

Domain Adaptation: A Small Sample Statistical Approach.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Regularization Techniques for Learning with Matrices.
J. Mach. Learn. Res., 2012

Random Design Analysis of Ridge Regression.
Proceedings of the COLT 2012, 2012

(weak) Calibration is Computationally Hard.
Proceedings of the COLT 2012, 2012

Towards Minimax Policies for Online Linear Optimization with Bandit Feedback.
Proceedings of the COLT 2012, 2012

A Method of Moments for Mixture Models and Hidden Markov Models.
Proceedings of the COLT 2012, 2012

A spectral algorithm for learning Hidden Markov Models.
J. Comput. Syst. Sci., 2012

Analysis of a randomized approximation scheme for matrix multiplication
CoRR, 2012

Learning Gaussian Mixture Models: Moment Methods and Spectral Decompositions
CoRR, 2012

Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation
CoRR, 2012

Learning High-Dimensional Mixtures of Graphical Models
CoRR, 2012

Identifiability and Unmixing of Latent Parse Trees.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, 2012

Learning Mixtures of Tree Graphical Models.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, 2012

2011
Robust Matrix Decomposition With Sparse Corruptions.
IEEE Trans. Inf. Theory, 2011

Optimal dynamic mechanism design via a virtual VCG mechanism.
SIGecom Exch., 2011

Preface.
Proceedings of the COLT 2011, 2011

Domain Adaptation with Coupled Subspaces.
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

A tail inequality for quadratic forms of subgaussian random vectors
CoRR, 2011

An Analysis of Random Design Linear Regression
CoRR, 2011

Domain Adaptation: Overfitting and Small Sample Statistics
CoRR, 2011

Dimension-free tail inequalities for sums of random matrices.
CoRR, 2011

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Spectral Methods for Learning Multivariate Latent Tree Structure.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010
Guest editorial: special issue on learning theory.
Mach. Learn., 2010

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Robust Matrix Decomposition with Outliers
CoRR, 2010

Learning from Logged Implicit Exploration Data
CoRR, 2010

An Optimal Dynamic Mechanism for Multi-Armed Bandit Processes
CoRR, 2010

Learning from Logged Implicit Exploration Data.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009
Playing Games with Approximation Algorithms.
SIAM J. Comput., 2009

Online Markov Decision Processes.
Math. Oper. Res., 2009

Gaussian Process Bandits without Regret: An Experimental Design Approach
CoRR, 2009

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity
CoRR, 2009

Applications of strong convexity--strong smoothness duality to learning with matrices
CoRR, 2009

The price of truthfulness for pay-per-click auctions.
Proceedings of the 10th ACM Conference on Electronic Commerce (EC-2009), 2009

Multi-Label Prediction via Compressed Sensing.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Multi-view clustering via canonical correlation analysis.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Information Consistency of Nonparametric Gaussian Process Methods.
IEEE Trans. Inf. Theory, 2008

Deterministic calibration and Nash equilibrium.
J. Comput. Syst. Sci., 2008

Mind the Duality Gap: Logarithmic regret algorithms for online optimization.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

On the Generalization Ability of Online Strongly Convex Programming Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Efficient bandit algorithms for online multiclass prediction.
Proceedings of the Machine Learning, 2008

An Information Theoretic Framework for Multi-view Learning.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

Stochastic Linear Optimization under Bandit Feedback.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

High-Probability Regret Bounds for Bandit Online Linear Optimization.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

2007
Maximum Entropy Correlated Equilibria.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

The Price of Bandit Information for Online Optimization.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

The Value of Observation for Monitoring Dynamic Systems.
Proceedings of the IJCAI 2007, 2007

Leveraging archival video for building face datasets.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Multi-view Regression Via Canonical Correlation Analysis.
Proceedings of the Learning Theory, 20th Annual Conference on Learning Theory, 2007

2006
(In)Stability properties of limit order dynamics.
Proceedings of the 7th ACM Conference on Electronic Commerce (EC-2006), 2006

Calibration via Regression.
Proceedings of the 2006 IEEE Information Theory Workshop, 2006

Cover trees for nearest neighbor.
Proceedings of the Machine Learning, 2006

2005
Planning in POMDPs Using Multiplicity Automata.
Proceedings of the UAI '05, 2005

Worst-Case Bounds for Gaussian Process Models.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

From Batch to Transductive Online Learning.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Reinforcement Learning in POMDPs Without Resets.
Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Trading in Markovian Price Models.
Proceedings of the Learning Theory, 18th Annual Conference on Learning Theory, 2005

2004
Competitive algorithms for VWAP and limit order trading.
Proceedings of the 5th ACM Conference on Electronic Commerce (EC-2004), 2004

Online Bounds for Bayesian Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Economic Properties of Social Networks.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Experts in a Markov Decision Process.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Graphical Economics.
Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004

2003
Correlated equilibria in graphical games.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Policy Search by Dynamic Programming.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Exploration in Metric State Spaces.
Proceedings of the Machine Learning, 2003

2002
Dopamine: generalization and bonuses.
Neural Networks, 2002

Opponent interactions between serotonin and dopamine.
Neural Networks, 2002

Competitive Analysis of the Explore/Exploit Tradeoff.
Proceedings of the Machine Learning, 2002

An Alternate Objective Function for Markovian Fields.
Proceedings of the Machine Learning, 2002

Approximately Optimal Approximate Reinforcement Learning.
Proceedings of the Machine Learning, 2002

2001
A Natural Policy Gradient.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Optimizing Average Reward Using Discounted Rewards.
Proceedings of the Computational Learning Theory, 2001

2000
Dopamine Bonuses.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Explaining Away in Weight Space.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

1999
Acquisition in Autoshaping.
Proceedings of the Advances in Neural Information Processing Systems 12, [NIPS Conference, Denver, Colorado, USA, November 29, 1999

