Philip S. Thomas

According to our database, Philip S. Thomas authored at least 39 papers between 2009 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2019
Asynchronous Coagent Networks: Stochastic Networks for Reinforcement Learning without Backpropagation or a Clock.
CoRR, 2019

A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning.
CoRR, 2019

Learning Action Representations for Reinforcement Learning.
CoRR, 2019

Privacy Preserving Off-Policy Evaluation.
CoRR, 2019

2018
Natural Option Critic.
CoRR, 2018

Importance Sampling for Fair Policy Selection.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Decoupling Gradient-Like Learning Rules from Representations.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
On Ensuring that Intelligent Machines Are Well-Behaved.
CoRR, 2017

Decoupling Learning Rules from Representations.
CoRR, 2017

Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines.
CoRR, 2017

Data-Efficient Policy Evaluation Through Behavior Policy Search.
CoRR, 2017

Using Options for Long-Horizon Off-Policy Evaluation.
CoRR, 2017

Importance Sampling for Fair Policy Selection.
Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Data-Efficient Policy Evaluation Through Behavior Policy Search.
Proceedings of the 34th International Conference on Machine Learning, 2017

Predictive Off-Policy Policy Evaluation for Nonstationary Decision Problems, with Applications to Digital Marketing.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Importance Sampling with Unequal Support.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Human-Like Rewards to Train a Reinforcement Learning Controller for Planar Arm Movement.
IEEE Trans. Human-Machine Systems, 2016

Importance Sampling with Unequal Support.
CoRR, 2016

Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.
CoRR, 2016

Energetic Natural Gradient Descent.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Increasing the Action Gap: New Operators for Reinforcement Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
A Notation for Markov Decision Processes.
CoRR, 2015

Increasing the Action Gap: New Operators for Reinforcement Learning.
CoRR, 2015

Ad Recommendation Systems for Life-Time Value Optimization.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

Policy Evaluation Using the Ω-Return.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

High Confidence Policy Improvement.
Proceedings of the 32nd International Conference on Machine Learning, 2015

High-Confidence Off-Policy Evaluation.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces.
CoRR, 2014

Natural Temporal Difference Learning.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Projected Natural Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, 2013

2012
Motor primitive discovery.
Proceedings of the 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics, 2012

2011
Policy Gradient Coagent Networks.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, 2011

TDγ: Re-evaluating Complex Backups in Temporal Difference Learning.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, 2011

Conjugate Markov Decision Processes.
Proceedings of the 28th International Conference on Machine Learning, 2011

Value Function Approximation in Reinforcement Learning Using the Fourier Basis.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2009
Application of the Actor-Critic Architecture to Functional Electrical Stimulation Control of a Human Arm.
Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence, 2009
