Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills.

Kolby Nottingham

Bodhisattwa Prasad Majumder

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games.

Stephen Marcus McAleer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors.

CoRR, 2023

Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling.

Kolby Nottingham

Prithviraj Ammanabrolu

Proceedings of the International Conference on Machine Learning, 2023

Learning to Design Analog Circuits to Meet Threshold Specifications.

Proceedings of the International Conference on Machine Learning, 2023

2022

Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments.

CoRR, 2022

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games.

CoRR, 2022

Learning to Query Internet Text for Informing Reinforcement Learning Agents.

CoRR, 2022

Anytime PSRO for Two-Player Zero-Sum Games.

CoRR, 2022

Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks.

Proceedings of the International Conference on Machine Learning, 2022

Independent Natural Policy Gradient always converges in Markov Potential Games.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Target Entropy Annealing for Discrete Soft Actor-Critic.

CoRR, 2021

Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning.

Dailin Hu

Pieter Abbeel

Roy Fox

CoRR, 2021

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates.

CoRR, 2021

Modular Framework for Visuomotor Language Grounding.

CoRR, 2021

Improving Social Welfare While Preserving Autonomy via a Pareto Mediator.

CoRR, 2021

XDO: A Double Oracle Algorithm for Extensive-Form Games.

CoRR, 2021

A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks.

CoRR, 2021

XDO: A Double Oracle Algorithm for Extensive-Form Games.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

AutoPandas: neural-backed generators for program synthesis.

Proc. ACM Program. Lang., 2019

Hierarchical Variational Imitation Learning of Control Programs.

CoRR, 2019

Multi-Task Hierarchical Imitation Learning for Home Automation.

Proceedings of the 15th IEEE International Conference on Automation Science and Engineering, 2019

2018

Derivative-Free Failure Avoidance Control for Manipulation using Learned Support Constraints.

CoRR, 2018

Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models.

Proceedings of the Algorithmic Foundations of Robotics XIII, 2018

Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure.

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

Robustly Adjusting Indoor Drip Irrigation Emitters with the Toyota HSR Robot.

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

RLlib: Abstractions for Distributed Reinforcement Learning.

Proceedings of the 35th International Conference on Machine Learning, 2018

Parametrized Hierarchical Procedures for Neural Programming.

Proceedings of the 6th International Conference on Learning Representations, 2018

Constraint Estimation and Derivative-Free Recovery for Robot Learning from Demonstrations.

Proceedings of the 14th IEEE International Conference on Automation Science and Engineering, 2018

2017

Ray RLLib: A Composable and Scalable Reinforcement Learning Library.

CoRR, 2017

DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations.

CoRR, 2017

Iterative Noise Injection for Scalable Imitation Learning.

CoRR, 2017

Multi-Level Discovery of Deep Options.

CoRR, 2017

DART: Noise Injection for Robust Imitation Learning.

Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations.

Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

Statistical data cleaning for deep learning of automation tasks from demonstrations.

Proceedings of the 13th IEEE Conference on Automation Science and Engineering, 2017

An algorithm and user study for teaching bilateral manipulation via iterated best response demonstrations.

Proceedings of the 13th IEEE Conference on Automation Science and Engineering, 2017

2016

Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes (with an additional title page in Hebrew).

Roy Fox

PhD thesis, 2016

Principled Option Learning in Markov Decision Processes.

Roy Fox

Michal Moshkovitz

Naftali Tishby

CoRR, 2016

Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes.

Roy Fox

CoRR, 2016

Taming the Noise in Reinforcement Learning via Soft Updates.

Roy Fox

Ari Pakman

Naftali Tishby

Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Minimum-information LQG control part I: Memoryless controllers.

Roy Fox

Naftali Tishby

Proceedings of the 55th IEEE Conference on Decision and Control, 2016

Minimum-information LQG control Part II: Retentive controllers.

Roy Fox

Naftali Tishby

Proceedings of the 55th IEEE Conference on Decision and Control, 2016

2015

Optimal Selective Attention in Reactive Agents.

Roy Fox

Naftali Tishby

CoRR, 2015

G-Learning: Taming the Noise in Reinforcement Learning via Soft Updates.

Roy Fox

Ari Pakman

Naftali Tishby

CoRR, 2015

2013

A multi-agent control framework for co-adaptation in brain-computer interfaces.

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012

Bounded Planning in Passive POMDPs.

Roy Fox

Naftali Tishby

Proceedings of the 29th International Conference on Machine Learning, 2012

2007

A Reinforcement Learning Algorithm with Polynomial Interaction Complexity for Only-Costly-Observable MDPs.

Roy Fox