# Satinder P. Singh

According to our database, Satinder P. Singh authored at least 189 papers between 1991 and 2018.

## Bibliography

2018

Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Challenges in the Trustworthy Pursuit of Maintenance Commitments Under Uncertainty.

Proceedings of the 20th International Trust Workshop co-located with AAMAS/IJCAI/ECAI/ICML 2018, 2018

On Querying for Safe Optimality in Factored Markov Decision Processes.

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Markov Decision Processes with Continuous Side Information.

Proceedings of the Algorithmic Learning Theory, 2018

2017

Markov Decision Processes with Continuous Side Information.

CoRR, 2017

Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making.

CoRR, 2017

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning.

CoRR, 2017

Repeated Inverse Reinforcement Learning.

CoRR, 2017

Repeated Inverse Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning.

Proceedings of the 34th International Conference on Machine Learning, 2017

A Stackelberg Game Model for Botnet Data Exfiltration.

Proceedings of the Decision and Game Theory for Security - 8th International Conference, 2017

Multi-Stage Attack Graph Security Games: Heuristic Strategies, with Empirical Game-Theoretic Analysis.

Proceedings of the 2017 Workshop on Moving Target Defense, 2017

Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making.

Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, 2017

Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes.

Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, 2017

2016

Multi-task seizure detection: addressing intra-patient variation in seizure morphologies.

Machine Learning, 2016

Control of Memory, Active Perception, and Action in Minecraft.

CoRR, 2016

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games.

CoRR, 2016

Towards Resolving Unidentifiability in Inverse Reinforcement Learning.

CoRR, 2016

Gradient Methods for Stackelberg Games.

Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Commitment Semantics for Sequential Decision Making under Reward Uncertainty.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

On Structural Properties of MDPs that Bound Loss Due to Shallow Planning.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

The Dependence of Effective Planning Horizon on Model Accuracy.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Control of Memory, Active Perception, and Action in Minecraft.

Proceedings of the 33rd International Conference on Machine Learning, 2016

On the Trustworthy Fulfillment of Commitments.

Proceedings of the Autonomous Agents and Multiagent Systems - AAMAS 2016 Workshops, Best Papers, 2016

On the Trustworthy Fulfillment of Commitments.

Proceedings of the 18th International Workshop on Trust in Agent Societies co-located with the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), 2016

Improving Predictive State Representations via Gradient Descent.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Action-Conditional Video Prediction using Deep Networks in Atari Games.

CoRR, 2015

Action-Conditional Video Prediction using Deep Networks in Atari Games.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Abstraction Selection in Model-based Reinforcement Learning.

Proceedings of the 32nd International Conference on Machine Learning, 2015

The Dependence of Effective Planning Horizon on Model Accuracy.

Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

Low-Rank Spectral Learning with Weighted Loss Functions.

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

Spectral Learning of Predictive State Representations with Insufficient Statistics.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Optimal Rewards for Cooperative Agents.

IEEE Trans. Autonomous Mental Development, 2014

Learning to Make Predictions In Partially Observable Environments Without a Generative Model.

CoRR, 2014

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Ecologically valid long-term mood monitoring of individuals with bipolar disorder using speech.

Proceedings of the IEEE International Conference on Acoustics, 2014

Improving UCT planning via approximate homomorphisms.

Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014

Low-Rank Spectral Learning.

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, 2014

Characterizing EVOI-Sufficient k-Response Query Sets in Decision Problems.

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, 2014

Computing Solutions in Infinite-Horizon Discounted Adversarial Patrolling Games.

Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling, 2014

Computationally Rational Saccadic Control: An Explanation of Spillover Effects Based on Sampling from Noisy Perception and Memory.

Proceedings of the Fifth Workshop on Cognitive Modeling and Computational Linguistics, 2014

Evaluating Trauma Patients: Addressing Missing Covariates with Joint Optimization.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

Predicting Postoperative Atrial Fibrillation from Independent ECG Components.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

The Adaptive Nature of Eye Movements in Linguistic Tasks: How Payoff and Architecture Shape Speed-Accuracy Trade-Offs.

topiCS, 2013

Approximate Planning for Factored POMDPs using Belief State Simplification.

CoRR, 2013

On the Complexity of Policy Iteration.

CoRR, 2013

Nash Convergence of Gradient Dynamics in Iterated General-Sum Games.

CoRR, 2013

Fast Planning in Stochastic Games.

CoRR, 2013

Graphical Models for Game Theory.

CoRR, 2013

Reward Mapping for Transfer in Long-Lived Agents.

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Linking Context to Evaluation in the Design of Safety Critical Interfaces.

Proceedings of the Human-Computer Interaction. Human-Centred Design Approaches, Methods, Tools, and Environments, 2013

2012

Predictive State Representations: A New Theory for Modeling Dynamical Systems.

CoRR, 2012

Predictive Linear-Gaussian Models of Stochastic Dynamical Systems.

CoRR, 2012

Knowledge Combination in Graphical Multiagent Model.

CoRR, 2012

Variance-Based Rewards for Approximate Bayesian Reinforcement Learning.

CoRR, 2012

Optimal Coordinated Planning Amongst Self-Interested Agents with Private State.

CoRR, 2012

Reports of the AAAI 2011 Conference Workshops.

AI Magazine, 2012

Lossy stochastic game abstraction with bounds.

Proceedings of the ACM Conference on Electronic Commerce, 2012

Optimal rewards in multiagent teams.

Proceedings of the 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics, 2012

Planning and evaluating multiagent influences under reward uncertainty.

Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Learning and predicting dynamic networked behavior with graphical multiagent models.

Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Strong mitigation: nesting search for good policies within search for good reward.

Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Security Games with Limited Surveillance: An Initial Report.

Proceedings of the Game Theory for Security, 2012

Computing Stackelberg Equilibria in Discounted Stochastic Games.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

Security Games with Limited Surveillance.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

IP Geolocation in Metropolitan Areas.

PhD thesis, 2011

Learning to Make Predictions In Partially Observable Environments Without a Generative Model.

J. Artif. Intell. Res., 2011

ATTac-2000: An Adaptive Autonomous Bidding Agent.

CoRR, 2011

Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System.

CoRR, 2011

Modeling Information Diffusion in Networks with Unobserved Links.

Proceedings of the PASSAT/SocialCom 2011, Privacy, 2011

IP geolocation in metropolitan areas.

Proceedings of the SIGMETRICS 2011, 2011

Comparing action-query strategies in semi-autonomous agents.

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents.

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Comparing Action-Query Strategies in Semi-Autonomous Agents.

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010

Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective.

IEEE Trans. Autonomous Mental Development, 2010

Dynamic Incentive Mechanisms.

AI Magazine, 2010

Variance-Based Rewards for Approximate Bayesian Reinforcement Learning.

Proceedings of the UAI 2010, 2010

Reward Design via Online Gradient Ascent.

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Internal Rewards Mitigate Agent Boundedness.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Selecting Operator Queries Using Expected Myopic Gain.

Proceedings of the 2010 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2010

Linear options.

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

History-dependent graphical multiagent models.

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

2009

Maintaining Predictions over Time without a Model.

Proceedings of the IJCAI 2009, 2009

Learning Graphical Game Models.

Proceedings of the IJCAI 2009, 2009

Transfer via soft homomorphisms.

Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

SarsaLandmark: an algorithm for learning in POMDPs with landmarks.

Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

2008

Knowledge Combination in Graphical Multiagent Models.

Proceedings of the UAI 2008, 2008

Simple Local Models for Complex Dynamical Systems.

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Building Incomplete but Accurate Models.

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Predictive Linear-Gaussian Models of Dynamical Systems with Vector-Valued Actions and Observations.

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Efficiently learning linear-linear exponential family predictive representations of state.

Proceedings of the Machine Learning, 2008

Approximate predictive state representations.

Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

2007

Learning payoff functions in infinite games.

Machine Learning, 2007

DaNaLIX: a domain-adaptive natural language interface for querying XML.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Exponential Family Predictive Representations of State.

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Relational Knowledge with Predictive State Representations.

Proceedings of the IJCAI 2007, 2007

An Experts Algorithm for Transfer Learning.

Proceedings of the IJCAI 2007, 2007

On discovery and learning of models with predictive representations of state for agents with continuous actions and observations.

Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

Constraint satisfaction algorithms for graphical games.

Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

Abstraction in Predictive State Representations.

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

Enabling Domain-Awareness for a Generic Natural Language Interface.

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006

Cobot in LambdaMOO: An Adaptive Social Statistics Agent.

Autonomous Agents and Multi-Agent Systems, 2006

Optimal Coordinated Planning Amongst Self-Interested Agents with Private State.

Proceedings of the UAI '06, 2006

Predictive state representations with options.

Proceedings of the Machine Learning, 2006

Kernel Predictive Linear Gaussian models for nonlinear stochastic dynamical systems.

Proceedings of the Machine Learning, 2006

Predictive linear-Gaussian models of controlled stochastic dynamical systems.

Proceedings of the Machine Learning, 2006

Mixtures of Predictive Linear Gaussian Models for Nonlinear, Stochastic Dynamical Systems.

Proceedings of the Proceedings, 2006

Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains.

Proceedings of the Proceedings, 2006

2005

Strategic Interactions in a Supply Chain Game.

Computational Intelligence, 2005

Reports on the 2004 AAAI Fall Symposia.

AI Magazine, 2005

Predictive Linear-Gaussian Models of Stochastic Dynamical Systems.

Proceedings of the UAI '05, 2005

Off-policy Learning with Options and Recognizers.

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Learning Payoff Functions in Infinite Games.

Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Combining Memory and Landmarks with Predictive State Representations.

Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Learning predictive state representations in dynamical systems without reset.

Proceedings of the Machine Learning, 2005

Planning in Models that Combine Memory with Predictive Representations of State.

Proceedings of the Proceedings, 2005

2004

Value-driven procurement in the TAC supply chain game.

SIGecom Exchanges, 2004

Predictive State Representations: A New Theory for Modeling Dynamical Systems.

Proceedings of the UAI '04, 2004

Computing approximate bayes-nash equilibria in tree-games of incomplete information.

Proceedings of the Proceedings 5th ACM Conference on Electronic Commerce (EC-2004), 2004

Intrinsically Motivated Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Approximately Efficient Online Mechanism Design.

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Planning with predictive state representations.

Proceedings of the 2004 International Conference on Machine Learning and Applications, 2004

Adaptive cognitive orthotics: combining reinforcement learning and constraint-based temporal reasoning.

Proceedings of the Machine Learning, 2004

Learning and discovery of predictive state representations in dynamical systems with reset.

Proceedings of the Machine Learning, 2004

Strategic Interactions in the TAC 2003 Supply Chain Tournament.

Proceedings of the Computers and Games, 4th International Conference, 2004

Distributed Feedback Control for Decision Making on Supply Chains.

Proceedings of the Fourteenth International Conference on Automated Planning and Scheduling (ICAPS 2004), 2004

2003

A Nonlinear Predictive State Representation.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

An MDP-Based Approach to Online Mechanism Design.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Learning Predictive State Representations.

Proceedings of the Machine Learning, 2003

2002

Introduction.

Machine Learning, 2002

Near-Optimal Reinforcement Learning in Polynomial Time.

Machine Learning, 2002

Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System.

J. Artif. Intell. Res., 2002

CobotDS: A Spoken Dialogue System for Chat.

Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, July 28, 2002

2001

ATTac-2000: An Adaptive Autonomous Bidding Agent.

J. Artif. Intell. Res., 2001

FAucS: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents.

Proceedings of the Electronic Commerce, Second International Workshop, 2001

Graphical Models for Game Theory.

Proceedings of the UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, 2001

Predictive Representations of State.

Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games.

Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

Cobot: A Social Reinforcement Learning Agent.

Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

ATTac-2000: an adaptive autonomous bidding agent.

Proceedings of the Fifth International Conference on Autonomous Agents, 2001

A social reinforcement learning agent.

Proceedings of the Fifth International Conference on Autonomous Agents, 2001

2000

Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms.

Machine Learning, 2000

Nash Convergence of Gradient Dynamics in General-Sum Games.

Proceedings of the UAI '00: Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, Stanford University, Stanford, California, USA, June 30, 2000

Fast Planning in Stochastic Games.

Proceedings of the UAI '00: Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, Stanford University, Stanford, California, USA, June 30, 2000

Reinforcement Learning for 3 vs. 2 Keepaway.

Proceedings of the RoboCup 2000: Robot Soccer World Cup IV, 2000

Eligibility Traces for Off-Policy Policy Evaluation.

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

A Boosting Approach to Topic Spotting on Subdialogues.

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Bias-Variance Error Bounds for Temporal Difference Updates.

Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT 2000), June 28, 2000

Automatic Optimization of Dialogue Management.

Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System.

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

Cobot in LambdaMOO: A Social Statistics Agent.

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

1999

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.

Artif. Intell., 1999

Approximate Planning for Factored POMDPs using Belief State Simplification.

Proceedings of the UAI '99: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Stockholm, Sweden, July 30, 1999

On the Complexity of Policy Iteration.

Proceedings of the UAI '99: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Stockholm, Sweden, July 30, 1999

Policy Gradient Methods for Reinforcement Learning with Function Approximation.

Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

Reinforcement Learning for Spoken Dialogue Systems.

Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

1998

Analytical Mean Squared Error Curves for Temporal Difference Learning.

Machine Learning, 1998

Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes.

Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Improved Switching among Temporally Abstract Actions.

Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Optimizing Admission Control while Ensuring Quality of Service in Multimedia Networks via Reinforcement Learning.

Intra-Option Learning about Temporally Abstract Actions.

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes.

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Near-Optimal Reinforcement Learning in Polynomial Time.

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Theoretical Results on Reinforcement Learning with Temporally Abstract Options.

Proceedings of the Machine Learning: ECML-98, 1998

1997

How to Dynamically Merge Markov Decision Processes.

Proceedings of the Advances in Neural Information Processing Systems 10, 1997

1996

Reinforcement Learning with Replacing Eligibility Traces.

Machine Learning, 1996

Analytical Mean Squared Error Curves in Temporal Difference Learning.

Proceedings of the Advances in Neural Information Processing Systems 9, 1996

Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems.

Proceedings of the Advances in Neural Information Processing Systems 9, 1996

Predicting Lifetimes in Dynamically Allocated Memory.

Proceedings of the Advances in Neural Information Processing Systems 9, 1996

Learning Curve Bounds for a Markov Decision Process with Undiscounted Rewards.

Proceedings of the Ninth Annual Conference on Computational Learning Theory, 1996

1995

Learning to Act Using Real-Time Dynamic Programming.

Artif. Intell., 1995

Improving Policies without Measuring Merits.

Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Markov Decision Processes in Large State Spaces.

Proceedings of the Eighth Annual Conference on Computational Learning Theory, 1995

1994

On the Convergence of Stochastic Iterative Dynamic Programming Algorithms.

Neural Computation, 1994

An Upper Bound on the Loss from Approximate Optimal-Value Functions.

Machine Learning, 1994

Reinforcement Learning with Soft State Aggregation.

Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems.

Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Learning Without State-Estimation in Partially Observable Markovian Decision Processes.

Proceedings of the Machine Learning, 1994

Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes.

Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA, USA, July 31, 1994

1993

Robust Reinforcement Learning in Motion Planning.

Proceedings of the Advances in Neural Information Processing Systems 6, 1993

Convergence of Stochastic Iterative Dynamic Programming Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 6, 1993

1992

Transfer of Learning by Composing Solutions of Elemental Sequential Tasks.

Machine Learning, 1992

Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models.

Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), 1992

Reinforcement Learning with a Hierarchy of Abstract Models.

Proceedings of the 10th National Conference on Artificial Intelligence, 1992

1991

The Efficient Learning of Multiple Task Sequences.

Proceedings of the Advances in Neural Information Processing Systems 4, 1991

A Cortico-Cerebellar Model that Learns to Generate Distributed Motor Commands to Control a Kinematic Arm.

Proceedings of the Advances in Neural Information Processing Systems 4, 1991

Transfer of Learning Across Compositions of Sequential Tasks.

Proceedings of the Eighth International Workshop (ML91), 1991