Michael L. Littman

Peter Norvig

Commun. ACM, February, 2023

Meta-learning Parameterized Skills.

Proceedings of the International Conference on Machine Learning, 2023

Coarse-Grained Smoothness for Reinforcement Learning in Metric Spaces.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Computably Continuous Reinforcement-Learning Objectives Are PAC-Learnable.

Cambridge Yang

Michael Littman

Michael Carbin

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Helping Users Debug Trigger-Action Programs.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2022

Specifying Behavior Preference with Tiered Reward Functions.

Zhiyuan Zhou

Henry Sowerby

CoRR, 2022

Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex.

CoRR, 2022

Reward-Predictive Clustering.

Michael J. Frank

CoRR, 2022

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report.

CoRR, 2022

Meta-Learning Transferable Parameterized Skills.

Haotian Fu

Shangqun Yu

Saket Tiwari

Michael Littman

CoRR, 2022

Designing Rewards for Fast Learning.

Henry Sowerby

Zhiyuan Zhou

CoRR, 2022

Does DQN really learn? Exploring adversarial training schemes in Pong.

CoRR, 2022

Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model-based Lifelong Reinforcement Learning with Bayesian Exploration.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Faster Deep Reinforcement Learning with Slower Online Network.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data.

Jamar L. Sullivan Jr.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

On the (In)Tractability of Reinforcement Learning for LTL Objectives.

Cambridge Yang

Michael Carbin

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

On the Expressivity of Markov Reward (Extended Abstract).

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021

Deep Q-Network with Proximal Iteration.

CoRR, 2021

Learning Generalizable Behavior via Visual Rewrite Rules.

CoRR, 2021

Reinforcement Learning for General LTL Objectives Is Intractable.

Cambridge Yang

Michael Carbin

CoRR, 2021

Learning Finite Linear Temporal Logic Specifications with a Specialized Neural Operator.

CoRR, 2021

Coarse-Grained Smoothness for RL in Metric Spaces.

CoRR, 2021

Bad-Policy Density: A Measure of Reinforcement Learning Hardness.

CoRR, 2021

Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback.

CoRR, 2021

Brittle AI, Causal Confusion, and Bad Mental Models: Challenges and Successes in the XAI Program.

CoRR, 2021

Control of mental representations in human planning.

CoRR, 2021

Model Selection's Disparate Impact in Real-World Deep Learning Applications.

CoRR, 2021

Collusion rings threaten the integrity of computer science research.

Commun. ACM, 2021

On the Expressivity of Markov Reward.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Understanding Trigger-Action Programs Through Novel Visualizations of Program Differences.

Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Towards Sample Efficient Agents through Algorithmic Alignment (Student Abstract).

Mingxuan Li

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Lipschitz Lifelong Reinforcement Learning.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Deep Radial-Basis Value Functions for Continuous Control.

Neev Parikh

Ronald E. Parr

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Reward-predictive representations generalize across tasks in reinforcement learning.

Michael J. Frank

PLoS Comput. Biol., 2020

Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning.

J. Mach. Learn. Res., 2020

Trace2TAP: Synthesizing Trigger-Action Programs from Traces of Behavior.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2020

Task Scoping: Building Goal-Specific Abstractions for Planning in Complex Domains.

CoRR, 2020

Towards Sample Efficient Agents through Algorithmic Alignment.

Mingxuan Li

CoRR, 2020

The Efficiency of Human Cognition Reflects Planned Information Processing.

CoRR, 2020

Learning State Abstractions for Transfer in Continuous Control.

Michael Littman

CoRR, 2020

Deep RBF Value Functions for Continuous Control.

Ronald E. Parr

CoRR, 2020

Applying prerequisite structure inference to adaptive testing.

Sam Saarinen

Evan Cater

Proceedings of the LAK '20: 10th International Conference on Learning Analytics and Knowledge, 2020

Teaching a Robot Tasks of Arbitrary Complexity via Human Feedback.

Proceedings of the HRI '20: ACM/IEEE International Conference on Human-Robot Interaction, 2020

Value Preserving State-Action Abstractions.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Context-Driven Satirical News Generation.

Zachary Horvitz

Nam Do

Proceedings of the Second Workshop on Figurative Language Processing, 2020

Task Scoping for Efficient Planning in Open Worlds (Student Abstract).

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

People Do Not Just Plan, They Plan to Plan.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Decision trees.

John Barr

Inroads, 2019

Individual predictions matter: Assessing the effect of data ordering in training fine-tuned CNNs for medical imaging.

John R. Zech

Jessica Zosa Forde

CoRR, 2019

Interactive Learning of Environment Dynamics for Sequential Tasks.

CoRR, 2019

Combating the Compounding-Error Problem with a Multi-step Model.

CoRR, 2019

Teaching with IMPACT.

Carl Trimbach

CoRR, 2019

Deep Reinforcement Learning from Policy-Dependent Human Feedback.

CoRR, 2019

Successor Features Support Model-based and Model-free Reinforcement Learning.

CoRR, 2019

ReNeg and Backseat Driver: Learning from Demonstration with Continuous Human Feedback.

Jacob Beck

Zoe Papakipos

CoRR, 2019

Stackelberg Punishment and Bully-Proofing Autonomous Vehicles.

Proceedings of the Social Robotics - 11th International Conference, 2019

Evidence Humans Provide When Explaining Data-Labeling Decisions.

Proceedings of the Human-Computer Interaction - INTERACT 2019, 2019

DeepMellow: Removing the Need for a Target Network in Deep Q-Learning.

Seungchan Kim

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

The Expected-Length Model of Options.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Finding Options that Minimize Planning Time.

Yuu Jinnai

David Ellis Hershkowitz

Proceedings of the 36th International Conference on Machine Learning, 2019

How Users Interpret Bugs in Trigger-Action Programming.

Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Removing the Target Network from Deep Q-Networks with the Mellowmax Operator.

Seungchan Kim

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Theory of Minds: Understanding Behavior in Groups through Inverse Planning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

State Abstraction as Compression in Apprenticeship Learning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Curriculum Design for Machine Learners in Sequential Decision Tasks.

IEEE Trans. Emerg. Top. Comput. Intell., 2018

Evolutionary Huffman encoding.

Inroads, 2018

Measuring and Characterizing Generalization in Deep Reinforcement Learning.

CoRR, 2018

Mitigating Planner Overfitting in Model-Based Reinforcement Learning.

CoRR, 2018

Towards a Simple Approach to Multi-step Model-based Reinforcement Learning.

CoRR, 2018

Finding Options that Minimize Planning Time.

Yuu Jinnai

CoRR, 2018

Personalized Education at Scale.

Sam Saarinen

Evan Cater

CoRR, 2018

Transfer with Model Features in Reinforcement Learning.

CoRR, 2018

Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning.

CoRR, 2018

Lipschitz Continuity in Model-based Reinforcement Learning.

Dipendra Misra

Proceedings of the 35th International Conference on Machine Learning, 2018

Policy and Value Transfer in Lifelong Reinforcement Learning.

Yuu Jinnai

Sophie Yue Guo

Proceedings of the 35th International Conference on Machine Learning, 2018

State Abstractions for Lifelong Reinforcement Learning.

Proceedings of the 35th International Conference on Machine Learning, 2018

Effectively Learning from Pedagogical Demonstrations.

Proceedings of the 40th Annual Meeting of the Cognitive Science Society, 2018

Bandit-Based Solar Panel Control.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Learning Approximate Stochastic Transition Models.

CoRR, 2017

Summable Reparameterizations of Wasserstein Critics in the One-Dimensional Setting.

Christopher Grimm

Yuhang Song

CoRR, 2017

Mean Actor Critic.

CoRR, 2017

Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning.

Stefanie Tellex

CoRR, 2017

Interactive Learning from Policy-Dependent Human Feedback.

CoRR, 2017

Environment-Independent Task Specifications via GLTL.

Ufuk Topcu

Jie Fu

Min Wen

CoRR, 2017

Latent Attention Networks.

CoRR, 2017

Ask Me Anything about MOOCs.

Doug Fisher

AI Mag., 2017

Interactive Learning from Policy-Dependent Human Feedback.

Proceedings of the 34th International Conference on Machine Learning, 2017

An Alternative Softmax Operator for Reinforcement Learning.

Proceedings of the 34th International Conference on Machine Learning, 2017

Teaching by Intervention: Working Backwards, Undoing Mistakes, or Correcting Mistakes?

Mark K. Ho

Joseph L. Austerweil

Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

Planning with Abstract Markov Decision Processes.

Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, 2017

2016

A New Softmax Operator for Reinforcement Learning.

CoRR, 2016

Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning.

Auton. Agents Multi Agent Syst., 2016

Learning User's Preferred Household Organization via Collaborative Filtering Methods.

Stephen Brawner

Proceedings of the Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with ACM Conference on Recommender Systems (RecSys 2016), 2016

Showing versus doing: Teaching by demonstration.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Peer Reviewing Short Answers using Comparative Judgement.

Pushkar Kolhe

Charles L. Isbell Jr.

Proceedings of the Third ACM Conference on Learning @ Scale, 2016

Near Optimal Behavior via Approximate State Abstraction.

D. Ellis Hershkowitz

Proceedings of the 33rd International Conference on Machine Learning, 2016

Coordinate to cooperate or compete: Abstract goals and joint intentions in social interaction.

Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Feature-based Joint Planning and Norm Learning in Collaborative Games.

Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Trigger-Action Programming in the Wild: An Analysis of 200,000 IFTTT Recipes.

Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans.

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Towards Behavior-Aware Model Learning from Human-Generated Trajectories.

Proceedings of the 2016 AAAI Fall Symposia, Arlington, Virginia, USA, November 17-19, 2016

Reinforcement Learning as a Framework for Ethical Decision Making.

Proceedings of the AI, 2016

2015

Reinforcement learning improves behaviour from evaluative feedback.

Nat., 2015

Who speaks for AI?

Charles L. Isbell Jr.

Michael J. Wooldridge

AI Matters, 2015

Grounding English Commands to Reward Functions.

Proceedings of the Robotics: Science and Systems XI, Sapienza University of Rome, 2015

Between Imitation and Intention Learning.

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Teaching with Rewards and Punishments: Reinforcement or Communication?

Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

2014

Learning something from nothing: Leveraging implicit human feedback strategies.

Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication, 2014

Flexible theft and resolute punishment: Evolutionary dynamics of social behavior among reinforcement-learning agents.

Fiery Cushman

Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Practical trigger-action programming in the smart home.

Proceedings of the CHI Conference on Human Factors in Computing Systems, 2014

Quantifying Uncertainty in Batch Personalized Sequential Decision Making.

Proceedings of the Modern Artificial Intelligence for Health Analytics, 2014

A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

Coco-Q: Learning in Stochastic Games with Side Payments.

Proceedings of the 30th International Conference on Machine Learning, 2013

The Cross-Entropy Method Optimizes for Quantiles.

Proceedings of the 30th International Conference on Machine Learning, 2013

Open-Loop Planning in Large-Scale Stochastic Domains.

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

An Ensemble of Linearly Combined Reinforcement-Learning Agents.

Vukosi Marivate

Proceedings of the Late-Breaking Developments in the Field of Artificial Intelligence, 2013

AAAI-13 Preface.

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012

Learning web-service task descriptions from traces.

Alexander Borgida

Web Intell. Agent Syst., 2012

On the Computational Complexity of Stochastic Controller Optimization in POMDPs.

Nikos Vlassis

David Barber

ACM Trans. Comput. Theory, 2012

Inducing Partially Observable Markov Decision Processes.

Proceedings of the Eleventh International Conference on Grammatical Inference, 2012

A new way to search game trees: technical perspective.

Commun. ACM, 2012

Rollout-based Game-tree Search Outprunes Traditional Alpha-beta.

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Planning in Reward-Rich Domains via PAC Bandits.

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes.

Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, 2012

A framework for modeling population strategies by depth of reasoning.

Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Covering Number as a Complexity Measure for POMDP Planning and Learning.

Zongzhang Zhang

Xiaoping Chen

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

Puzzle: baffling raffling.

Daniel M. Reeves

SIGecom Exch., 2011

Introduction to the special issue on empirical evaluations in reinforcement learning.

Shimon Whiteson

Mach. Learn., 2011

Knows what it knows: a framework for self-aware learning.

Mach. Learn., 2011

Integrating machine learning in ad hoc routing: A wireless adaptive routing protocol.

Brian Russell

Wade Trappe

Int. J. Commun. Syst., 2011

Most Relevant Explanation: computational complexity and approximation methods.

Changhe Yuan

Heejin Lim

Ann. Math. Artif. Intell., 2011

Democratic approximation of lexicographic preference models.

Artif. Intell., 2011

Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search.

John Asmuth

Proceedings of the UAI 2011, 2011

Apprenticeship Learning About Multiple Intentions.

Proceedings of the 28th International Conference on Machine Learning, 2011

Scratchable Devices: User-Friendly Programming for Household Appliances.

Proceedings of the Human-Computer Interaction. Towards Mobile and Intelligent Interaction Environments, 2011

The effects of selection on noisy fitness optimization.

David H. Ackley

Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference, 2011

Using iterated reasoning to predict opponent strategies.

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Sample-Based Planning for Continuous Action Markov Decision Processes.

Christopher R. Mansley

Proceedings of the 21st International Conference on Automated Planning and Scheduling, 2011

2010

Dimension reduction and its application to model-based exploration in continuous spaces.

Ali Nouri

Mach. Learn., 2010

Reducing reinforcement learning to KWIK online regression.

Ann. Math. Artif. Intell., 2010

Broadening student enthusiasm for computer science with a great insights course.

Proceedings of the 41st ACM technical symposium on Computer science education, 2010

Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration.

Michael Wunder

Monica Babes

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Generalizing Apprenticeship Learning across Hypothesis Classes.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A Cognitive Hierarchy Model Applied to the Lemonade Game.

Proceedings of the Interactive Decision Theory and Game Theory, 2010

Integrating Sample-Based Planning and Model-Based Reinforcement Learning.

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

Efficient Apprenticeship Learning with Smart Humans.

Kaushik Subramanian

Proceedings of the Enabling Intelligence through Middleware, 2010

Learning Lexicographic Preference Models.

Proceedings of the Preference Learning, 2010

2009

Hierarchical Reinforcement Learning.

Proceedings of the Encyclopedia of Artificial Intelligence (3 Volumes), 2009

Reinforcement Learning in Finite MDPs: PAC Analysis.

J. Mach. Learn. Res., 2009

Provably Efficient Learning with Typed Parametric Models.

J. Mach. Learn. Res., 2009

Learning and planning in environments with delayed feedback.

Auton. Agents Multi Agent Syst., 2009

Exploring compact reinforcement-learning representations with linear regression.

Proceedings of the UAI 2009, 2009

A Bayesian Sampling Approach to Exploration in Reinforcement Learning.

Proceedings of the UAI 2009, 2009

Online exploration in least-squares policy iteration.

Christopher R. Mansley

Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

2008

An analysis of model-based Interval Estimation for Markov Decision Processes.

J. Comput. Syst. Sci., 2008

Optimization problems involving collections of dependent objects.

David L. Roberts

Charles L. Isbell Jr.

Ann. Oper. Res., 2008

A Polynomial-time Nash Equilibrium Algorithm for Repeated Stochastic Games.

Enrique Munoz de Cote

Proceedings of the UAI 2008, 2008

CORL: A Continuous-state Offset-dynamics Reinforcement Learner.

Proceedings of the UAI 2008, 2008

Autonomous Model Learning for Reinforcement Learning.

Proceedings of the Fifth International Conference on the Quantitative Evaluation of Systems (QEST 2008), 2008

Multi-resolution Exploration in Continuous Spaces.

Ali Nouri

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Efficient Value-Function Approximation via Online Linear Regression.

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning.

Ronald Parr

Christopher Painter-Wakefield

Gavin Taylor

Proceedings of the Machine Learning, 2008

Knows what it knows: a framework for self-aware learning.

Proceedings of the Machine Learning, 2008

An object-oriented representation for efficient reinforcement learning.

Andre Cohen

Proceedings of the Machine Learning, 2008

Social reward shaping in the prisoner's dilemma.

Monica Babes

Enrique Munoz de Cote

Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Efficient Learning of Action Schemas and Web-Service Descriptions.

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

Potential-based Shaping in Model-based Reinforcement Learning.

John Asmuth

Robert Zinkov

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007

Introduction to the special issue on learning and computational game theory.

Amy Greenwald

Mach. Learn., 2007

A hierarchy of prescriptive goals for multiagent learning.

Martin Zinkevich

Amy Greenwald

Artif. Intell., 2007

Online Linear Regression and Its Application to Model-Based Reinforcement Learning.

Christopher Painter-Wakefield

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Analyzing feature generation for value-function approximation.

Ronald Parr

Proceedings of the Machine Learning, 2007

Planning and Learning in Environments with Delayed Feedback.

Proceedings of the Machine Learning: ECML 2007, 2007

A Multiple Representation Approach to Learning Dynamical Systems.

Proceedings of the Computational Approaches to Representation Change during Learning and Development, 2007

Efficient Structure Learning in Factored-State MDPs.

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

Efficient Reinforcement Learning with Relocatable Action Models.

Bethany R. Leffler

Timothy Edmunds

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006

Incremental Model-based Learners With Formal Learning-Time Guarantees.

Proceedings of the UAI '06, 2006

An Efficient Optimal-Equilibrium Algorithm for Two-player Game Trees.

Proceedings of the UAI '06, 2006

Towards a Unified Theory of State Abstraction for MDPs.

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2006

Experience-efficient learning in associative bandit problems.

Proceedings of the Machine Learning, 2006

PAC model-free reinforcement learning.

Proceedings of the Machine Learning, 2006

A hierarchical approach to efficient reinforcement learning in deterministic domains.

Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), 2006

A Change Detection Model for Non-Stationary k-Armed Bandit Problems.

Proceedings of the Between a Rock and a Hard Place: Cognitive Science Principles Meet AI-Hard Problems, 2006

Targeting Specific Distributions of Trajectories in MDPs.

David L. Roberts

Mark J. Nelson

Michael Mateas

Proceedings of the Proceedings, 2006

2005

Corpus-based Learning of Analogies and Semantic Relations.

Mach. Learn., 2005

The First Probabilistic Track of the International Planning Competition.

J. Artif. Intell. Res., 2005

A polynomial-time Nash equilibrium algorithm for repeated games.

Peter Stone

Decis. Support Syst., 2005

Reports on the 2004 AAAI Fall Symposia.

Nicholas L. Cassimatis

AI Mag., 2005

Efficient Exploration With Latent Structure.

Proceedings of the Robotics: Science and Systems I, 2005

Cyclic Equilibria in Markov Games.

Martin Zinkevich

Amy Greenwald

Proceedings of the Advances in Neural Information Processing Systems 18 [Neural Information Processing Systems], 2005

A theoretical analysis of Model-Based Interval Estimation.

Proceedings of the Machine Learning, 2005

Activity Recognition from Accelerometer Data.

Proceedings of the Proceedings, 2005

Lazy Approximation for Solving Continuous Finite-Horizon MDPs.

Proceedings of the Proceedings, 2005

2004

An Empirical Evaluation of Interval Estimation for Markov Decision Processes.

Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), 2004

Planning with predictive state representations.

Michael R. James

Proceedings of the 2004 International Conference on Machine Learning and Applications, 2004

Reinforcement Learning for Autonomic Network Repair.

Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

An Instance-Based State Representation for Network Repair.

Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003

Measuring praise and criticism: Inference of semantic orientation from association.

ACM Trans. Inf. Syst., 2003

Decision-Theoretic Bidding Based on Learned Density Models in Simultaneous, Interacting Auctions.

J. Artif. Intell. Res., 2003

Learning Analogies and Semantic Relations.

CoRR, 2003

Combining Independent Modules to Solve Multiple-choice Synonym and Analogy Problems.

CoRR, 2003

AAAI-2002 Fall Symposium Series.

Yukio Ohsawa

Peter McBurney

Simon Parsons

Christopher A. Miller

AI Mag., 2003

Contingent planning under uncertainty via stochastic satisfiability.

Artif. Intell., 2003

Combining independent modules in lexical multiple-choice problems.

Proceedings of the Recent Advances in Natural Language Processing III, 2003

Learning Predictive State Representations.

Proceedings of the Machine Learning, 2003

Tutorial: Learning Topics in Game-Theoretic Decision Making.

Proceedings of the Computational Learning Theory and Kernel Machines, 2003

2002

Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus.

CoRR, 2002

A probabilistic approach to solving crossword puzzles.

Noam M. Shazeer

Artif. Intell., 2002

Least-Squares Methods in Reinforcement Learning for Control.

Ronald Parr

Proceedings of the Methods and Applications of Artificial Intelligence, 2002

Modeling Auction Price Uncertainty Using Boosting-based Conditional Density Estimation.

Proceedings of the Machine Learning, 2002

Randomized strategic demand reduction: getting more by asking for less.

Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

ATTac-2001: A Learning, Autonomous Bidding Agent.

Proceedings of the Agent-Mediated Electronic Commerce IV, 2002

Self-Enforcing Strategic Demand Reduction.

Proceedings of the Agent-Mediated Electronic Commerce IV, 2002

2001

Stochastic Boolean Satisfiability.

Toniann Pitassi

J. Autom. Reason., 2001

ATTac-2000: An Adaptive Autonomous Bidding Agent.

J. Artif. Intell. Res., 2001

Learning to Select Branching Rules in the DPLL Procedure for Satisfiability.

Electron. Notes Discret. Math., 2001

Value-function reinforcement learning in Markov games.

Cogn. Syst. Res., 2001

FAucS: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents.

[BibT_eX]

[DOI]

Proceedings of the Electronic Commerce, Second International Workshop, 2001

Graphical Models for Game Theory.

[BibT_eX]

[DOI]

Michael J. Kearns

UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, 2001

Approximate Dimension Reduction at NTCIR.

[BibT_eX]

[DOI]

Fan Jiang

Proceedings of the Second Workshop Meeting on Evaluation of Chinese & Japanese Text Retrieval and Text Summarization, 2001

Predictive Representations of State.

[BibT_eX]

[DOI]

Richard S. Sutton

Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games.

[BibT_eX]

[DOI]

Michael J. Kearns

Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

PAC Generalization Bounds for Co-training.

[BibT_eX]

[DOI]

Sanjoy Dasgupta

David A. McAllester

Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Friend-or-Foe Q-learning in General-Sum Games.

[BibT_eX]

Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

Implicit Negotiation in Repeated Games.

[BibT_eX]

[DOI]

Peter Stone

Proceedings of the Intelligent Agents VIII, 8th International Workshop, 2001

2000

Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms.

[BibT_eX]

[DOI]

Mach. Learn., 2000

A Review of Reinforcement Learning.

[BibT_eX]

[DOI]

Sebastian Thrun

AI Mag., 2000

Exact Solutions to Time-Dependent MDPs.

[BibT_eX]

[DOI]

Justin A. Boyan

Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Algorithm Selection using Reinforcement Learning.

[BibT_eX]

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Approximate Dimension Equalization in Vector-based Information Retrieval.

[BibT_eX]

Fan Jiang

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Abstraction Methods for Game Theoretic Poker.

[BibT_eX]

[DOI]

Jiefu Shi

Proceedings of the Computers and Games, Second International Conference, 2000

Review: Computer Language Games.

[BibT_eX]

[DOI]

Proceedings of the Computers and Games, Second International Conference, 2000

Towards Approximately Optimal Poker.

[BibT_eX]

[DOI]

Jiefu Shi

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

Reinforcement Learning for Algorithm Selection.

[BibT_eX]

[DOI]

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

1999

A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms.

[BibT_eX]

[DOI]

Csaba Szepesvári

Neural Comput., 1999

The AAAI Fall Symposia.

[BibT_eX]

[DOI]

AI Mag., 1999

Solving Crossword Puzzles as Probabilistic Constraint Satisfaction.

[BibT_eX]

[DOI]

Noam M. Shazeer

Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

Solving Crosswords with PROVERB.

[BibT_eX]

[DOI]

Noam M. Shazeer

Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

Initial Experiments in Stochastic Satisfiability.

[BibT_eX]

[DOI]

Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

PROVERB: The Probabilistic Cruciverbalist.

[BibT_eX]

[DOI]

Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

1998

The Computational Complexity of Probabilistic Planning.

[BibT_eX]

[DOI]

Judy Goldsmith

Martin Mundhenk

J. Artif. Intell. Res., 1998

Planning and Acting in Partially Observable Stochastic Domains.

[BibT_eX]

[DOI]

Leslie Pack Kaelbling

Anthony R. Cassandra

Artif. Intell., 1998

Learning a Language-Independent Representation for Terms from a Partially Aligned Corpus.

[BibT_eX]

Fan Jiang

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

MAXPLAN: A New Approach to Probabilistic Planning.

[BibT_eX]

[DOI]

Proceedings of the Fourth International Conference on Artificial Intelligence Planning Systems, 1998

Using Caching to Solve Larger Probabilistic Planning Problems.

[BibT_eX]

[DOI]

Proceedings of the Fifteenth National Conference on Artificial Intelligence and Tenth Innovative Applications of Artificial Intelligence Conference, 1998

1997

The Complexity of Plan Existence and Evaluation in Probabilistic Domains.

[BibT_eX]

[DOI]

Judy Goldsmith

Martin Mundhenk

UAI '97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997

Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes.

[BibT_eX]

[DOI]

Anthony R. Cassandra

Nevin Lianwen Zhang

UAI '97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997

Automatic 3-Language Cross-Language Information Retrieval with Latent Semantic Indexing.

[BibT_eX]

[DOI]

Proceedings of the Sixth Text REtrieval Conference, 1997

Probabilistic Propositional Planning: Representations and Complexity.

[BibT_eX]

[DOI]

Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

Speeding Safely: Multi-Criteria Optimization in Probabilistic Planning.

[BibT_eX]

[DOI]

Michael S. Fulkerson

Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

1996

Reinforcement Learning: A Survey.

[BibT_eX]

[DOI]

Leslie Pack Kaelbling