Gerald Tesauro

According to our database, Gerald Tesauro authored at least 86 papers between 1987 and 2018.

Bibliography

2018
Introduction to the special issue on deep reinforcement learning: An editorial.
Neural Networks, 2018

Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference.
CoRR, 2018

Learning Abstract Options.
CoRR, 2018

Learning to Teach in Cooperative Multiagent Reinforcement Learning.
CoRR, 2018

Diverse Few-Shot Text Classification with Multiple Metrics.
CoRR, 2018

Diverse Few-Shot Text Classification with Multiple Metrics.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

R3: Reinforced Ranker-Reader for Open-Domain Question Answering.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Cognitive Computing.
IEEE Intelligent Systems, 2017

The Eigenoption-Critic Framework.
CoRR, 2017

Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering.
CoRR, 2017

Eigenoption Discovery through the Deep Successor Representation.
CoRR, 2017

R$^3$: Reinforced Reader-Ranker for Open-Domain Question Answering.
CoRR, 2017

Robust Task Clustering for Deep Many-Task Learning.
CoRR, 2017

Optimal Sequential Drilling for Hydrocarbon Field Development Planning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.
CoRR, 2016

Selecting Near-Optimal Learners via Incremental Data Allocation.
CoRR, 2016

Hierarchical Memory Networks.
CoRR, 2016

Selecting Near-Optimal Learners via Incremental Data Allocation.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Reports of the AAAI 2014 Conference Workshops.
AI Magazine, 2015

Towards Cognitive Automation of Data Science.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Budgeted Prediction with Expert Advice.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Analysis of Watson's Strategies for Playing Jeopardy!
CoRR, 2014

2013
Analysis of Watson's Strategies for Playing Jeopardy!
J. Artif. Intell. Res., 2013

2012
Simulation, learning, and optimization techniques in Watson's game strategies.
IBM Journal of Research and Development, 2012

Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation.
CoRR, 2012

Bayesian Inference in Monte-Carlo Tree Search.
CoRR, 2012

Applying a framework for healthcare incentives simulation.
Proceedings of the Winter Simulation Conference, 2012

Playing repeated Stackelberg games with unknown opponents.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

2010
Bayesian Inference in Monte-Carlo Tree Search.
Proceedings of the UAI 2010, 2010

2009
Monte-Carlo simulation balancing.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Active Collaborative Prediction with Maximum Margin Matrix Factorization.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Autonomic multi-agent management of power and performance in data centers.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

2007
Metric Learning for Kernel Regression.
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies.
IEEE Internet Computing, 2007

On the use of hybrid reinforcement learning for autonomic resource allocation.
Cluster Computing, 2007

Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Estimating End-to-End Performance by Collaborative Prediction with Active Sampling.
Proceedings of the Integrated Network Management, 2007

Coordinating Multiple Autonomic Managers to Achieve Specified Power-Performance Tradeoffs.
Proceedings of the Fourth International Conference on Autonomic Computing (ICAC'07), 2007

2006
A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation.
Proceedings of the 3rd International Conference on Autonomic Computing, 2006

Improvement of Systems Management Policies Using Hybrid Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2006, 2006

2005
Utility-Function-Driven Resource Allocation in Autonomic Systems.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

Online Resource Allocation Using Decompositional Reinforcement Learning.
Proceedings of the Proceedings, 2005

New Approaches to Optimization and Utility Elicitation in Autonomic Computing.
Proceedings of the Proceedings, 2005

2004
Utility Functions in Autonomic Systems.
Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

A Multi-Agent Systems Approach to Autonomic Computing.
Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

2003
Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation.
Proceedings of the UAI '03, 2003

A strategic decision model for multi-attribute bilateral negotiation with alternating.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Multi-agent implementation of asymmetric protocol for bilateral negotiations.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Extending Q-Learning to General Adaptive Multi-Agent Systems.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

2002
Programming backgammon using self-teaching neural nets.
Artif. Intell., 2002

Pricing in Agent Economies Using Multi-Agent Q-Learning.
Autonomous Agents and Multi-Agent Systems, 2002

Strategic sequential bidding in auctions using dynamic programming.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001
High-performance bidding agents for the continuous double auction.
Proceedings of the 3rd ACM Conference on Electronic Commerce (EC-2001), 2001

Pricing in Agent Economies Using Neural Networks and Multi-agent Q-Learning.
Proceedings of the Sequence Learning - Paradigms, Algorithms, and Applications, 2001

Agent-Human Interactions in the Continuous Double Auction.
Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001

2000
Foresight-based pricing algorithms in agent economies.
Decision Support Systems, 2000

Multi-agent Q-learning and Regression Trees for Automated Pricing Decisions.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Pseudo-convergent Q-Learning by Competitive Pricebots.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Multi-Agent Q-Learning and Regression Trees for Automated Pricing Decisions.
Proceedings of the 4th International Conference on Multi-Agent Systems, 2000

1999
Strategic pricebot dynamics.
EC, 1999

1998
Comments on "Co-Evolution in the Successful Learning of Backgammon Strategy".
Machine Learning, 1998

1996
On-line Policy Improvement using Monte-Carlo Search.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995
Temporal Difference Learning and TD-Gammon.
ICGA Journal, 1995

Temporal Difference Learning and TD-Gammon.
Commun. ACM, 1995

Biologically Inspired Defenses Against Computer Viruses.
Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

1994
TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play.
Neural Computation, 1994

1992
How Tight Are the Vapnik-Chervonenkis Bounds?
Neural Computation, 1992

Practical Issues in Temporal Difference Learning.
Machine Learning, 1992

Temporal Difference Learning of Backgammon Strategy.
Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), 1992

1991
Visualizing processes in neural networks.
IBM Journal of Research and Development, 1991

Practical Issues in Temporal Difference Learning.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

1990
Can Neural Networks Do Better Than the Vapnik-Chervonenkis Bounds?
Proceedings of the Advances in Neural Information Processing Systems 3, 1990

Neurogammon: a neural-network backgammon program.
Proceedings of the IJCNN 1990, 1990

1989
Asymptotic Convergence of Backpropagation.
Neural Computation, 1989

Neurogammon Wins Computer Olympiad.
Neural Computation, 1989

A Parallel Network that Learns to Play Backgammon.
Artif. Intell., 1989

Neural Network Visualization.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

Asymptotic Convergence of Backpropagation: Numerical Experiments.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

1988
A study of scaling and generalization in neural networks.
Neural Networks, 1988

Scaling Relationships in Back-propagation Learning.
Complex Systems, 1988

Connectionist Learning of Expert Preferences by Comparison Training.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Scaling and Generalization in Neural Networks: A Case Study.
Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Connectionist Learning of Expert Backgammon Evaluations.
Proceedings of the Machine Learning, 1988

1987
Scaling Relationships in Back-Propagation Learning: Dependence on Training Set Size.
Complex Systems, 1987

A 'Neural' Network that Learns to Play Backgammon.
Proceedings of the Neural Information Processing Systems, Denver, Colorado, USA, 1987, 1987

