Gerald Tesauro

Affiliations:
  • IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA


According to our database, Gerald Tesauro authored at least 98 papers between 1987 and 2023.

Awards

ACM Fellow

ACM Fellow 2018, "For contributions to reinforcement learning, neural networks, and intelligent autonomous agents".

Bibliography

2023
Learning in Factored Domains with Information-Constrained Visual Representations.
CoRR, 2023

2022
Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria.
CoRR, 2022

AI Planning Annotation for Sample Efficient Reinforcement Learning.
CoRR, 2022

Influencing Long-Term Behavior in Multiagent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Context-Specific Representation Abstraction for Deep Option Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Efficient Black-Box Planning Using Macro-Actions with Focused Effects.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Capacity-Limited Decentralized Actor-Critic for Multi-Agent Games.
Proceedings of the 2021 IEEE Conference on Games (CoG), 2021

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

RL Generalization in a Theory of Mind Game Through a Sleep Metaphor (Student Abstract).
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games.
CoRR, 2020

Deep RL With Information Constrained Policies: Generalization in Continuous Control.
CoRR, 2020

Finding Macro-Actions with Disentangled Effects for Efficient Planning with the Goal-Count Heuristic.
CoRR, 2020

Decentralized TD Tracking with Linear Function Approximation and its Finite-Time Analysis.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Hierarchical Teaching Policies for Cooperative Agents.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

On the Role of Weight Sharing During Deep Option Learning.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning.
CoRR, 2019

Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning to Teach in Cooperative Multiagent Reinforcement Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Hybrid Reinforcement Learning with Expert State Sequences.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Introduction to the special issue on deep reinforcement learning: An editorial.
Neural Networks, 2018

Learning Abstract Options.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dialog-based Interactive Image Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Diverse Few-Shot Text Classification with Multiple Metrics.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering.
Proceedings of the 6th International Conference on Learning Representations, 2018

Eigenoption Discovery through the Deep Successor Representation.
Proceedings of the 6th International Conference on Learning Representations, 2018

R³: Reinforced Ranker-Reader for Open-Domain Question Answering.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Cognitive Computing.
IEEE Intell. Syst., 2017

The Eigenoption-Critic Framework.
CoRR, 2017

R³: Reinforced Reader-Ranker for Open-Domain Question Answering.
CoRR, 2017

Robust Task Clustering for Deep Many-Task Learning.
CoRR, 2017

Learning to Query, Reason, and Answer Questions On Ambiguous Texts.
Proceedings of the 5th International Conference on Learning Representations, 2017

Optimal Sequential Drilling for Hydrocarbon Field Development Planning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Hierarchical Memory Networks.
CoRR, 2016

Selecting Near-Optimal Learners via Incremental Data Allocation.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Reports of the AAAI 2014 Conference Workshops.
AI Mag., 2015

Towards Cognitive Automation of Data Science.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Budgeted Prediction with Expert Advice.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2013
Analysis of Watson's Strategies for Playing Jeopardy!
J. Artif. Intell. Res., 2013

2012
Simulation, learning, and optimization techniques in Watson's game strategies.
IBM J. Res. Dev., 2012

Applying a framework for healthcare incentives simulation.
Proceedings of the Winter Simulation Conference, 2012

Playing repeated Stackelberg games with unknown opponents.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

2010
Bayesian Inference in Monte-Carlo Tree Search.
Proceedings of the UAI 2010, 2010

2009
Monte-Carlo simulation balancing.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Active Collaborative Prediction with Maximum Margin Matrix Factorization.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Autonomic multi-agent management of power and performance in data centers.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

2007
Metric Learning for Kernel Regression.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies.
IEEE Internet Comput., 2007

On the use of hybrid reinforcement learning for autonomic resource allocation.
Clust. Comput., 2007

Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Estimating End-to-End Performance by Collaborative Prediction with Active Sampling.
Proceedings of the Integrated Network Management, 2007

Coordinating Multiple Autonomic Managers to Achieve Specified Power-Performance Tradeoffs.
Proceedings of the Fourth International Conference on Autonomic Computing (ICAC'07), 2007

2006
A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation.
Proceedings of the 3rd International Conference on Autonomic Computing, 2006

Improvement of Systems Management Policies Using Hybrid Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2006, 2006

2005
Utility-Function-Driven Resource Allocation in Autonomic Systems.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

Online Resource Allocation Using Decompositional Reinforcement Learning.
Proceedings of the Proceedings, 2005

New Approaches to Optimization and Utility Elicitation in Autonomic Computing.
Proceedings of the Proceedings, 2005

2004
Utility Functions in Autonomic Systems.
Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

A Multi-Agent Systems Approach to Autonomic Computing.
Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

2003
Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation.
Proceedings of the UAI '03, 2003

A strategic decision model for multi-attribute bilateral negotiation with alternating.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Multi-agent implementation of asymmetric protocol for bilateral negotiations.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Extending Q-Learning to General Adaptive Multi-Agent Systems.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

2002
Programming backgammon using self-teaching neural nets.
Artif. Intell., 2002

Pricing in Agent Economies Using Multi-Agent Q-Learning.
Auton. Agents Multi Agent Syst., 2002

Strategic sequential bidding in auctions using dynamic programming.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001
High-performance bidding agents for the continuous double auction.
Proceedings of the 3rd ACM Conference on Electronic Commerce (EC-2001), 2001

Pricing in Agent Economies Using Neural Networks and Multi-agent Q-Learning.
Proceedings of the Sequence Learning - Paradigms, Algorithms, and Applications, 2001

Agent-Human Interactions in the Continuous Double Auction.
Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001

2000
Foresight-based pricing algorithms in agent economies.
Decis. Support Syst., 2000

Pseudo-convergent Q-Learning by Competitive Pricebots.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Multi-Agent Q-Learning and Regression Trees for Automated Pricing Decisions.
Proceedings of the 4th International Conference on Multi-Agent Systems, 2000

1999
Strategic pricebot dynamics.
Proceedings of the First ACM Conference on Electronic Commerce (EC-99), 1999

1998
Comments on "Co-Evolution in the Successful Learning of Backgammon Strategy".
Mach. Learn., 1998

Foresight-based pricing algorithms in an economy of software agents.
Proceedings of the First International Conference on Information and Computation Economies, 1998

1996
On-line Policy Improvement using Monte-Carlo Search.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995
Temporal Difference Learning and TD-Gammon.
J. Int. Comput. Games Assoc., 1995

Biologically Inspired Defenses Against Computer Viruses.
Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

1994
TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play.
Neural Comput., 1994

1992
How Tight Are the Vapnik-Chervonenkis Bounds?
Neural Comput., 1992

Practical Issues in Temporal Difference Learning.
Mach. Learn., 1992

Temporal Difference Learning of Backgammon Strategy.
Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), 1992

1991
Visualizing processes in neural networks.
IBM J. Res. Dev., 1991

1990
Can Neural Networks Do Better Than the Vapnik-Chervonenkis Bounds?
Proceedings of the Advances in Neural Information Processing Systems 3, 1990

Neurogammon: a neural-network backgammon program.
Proceedings of the IJCNN 1990, 1990

1989
Asymptotic Convergence of Backpropagation.
Neural Comput., 1989

Neurogammon Wins Computer Olympiad.
Neural Comput., 1989

A Parallel Network that Learns to Play Backgammon.
Artif. Intell., 1989

Neural Network Visualization.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

Asymptotic Convergence of Backpropagation: Numerical Experiments.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

1988
A study of scaling and generalization in neural networks.
Neural Networks, 1988

Scaling Relationships in Back-propagation Learning.
Complex Syst., 1988

Connectionist Learning of Expert Preferences by Comparison Training.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Scaling and Generalization in Neural Networks: A Case Study.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Connectionist Learning of Expert Backgammon Evaluations.
Proceedings of the Machine Learning, 1988

1987
Scaling Relationships in Back-Propagation Learning: Dependence on Training Set Size.
Complex Syst., 1987

A 'Neural' Network that Learns to Play Backgammon.
Proceedings of the Neural Information Processing Systems, Denver, Colorado, USA, 1987, 1987

