Joelle Pineau

Jean-Rémi King

Nat. Mach. Intell., 2020

The Bottleneck Simulator: A Model-Based Deep Reinforcement Learning Approach.

J. Artif. Intell. Res., 2020

Intervention Design for Effective Sim2Real Transfer.

CoRR, 2020

Exploring Zero-Shot Emergent Communication in Embodied Multi-Agent Populations.

CoRR, 2020

How To Evaluate Your Dialogue System: Probe Tasks as an Alternative for Token-level Evaluation Metrics.

Sarath Chandar

CoRR, 2020

Multi-Task Reinforcement Learning as a Hidden-Parameter Block MDP.

CoRR, 2020

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

CoRR, 2020

Deep interpretability for GWAS.

Deepak Sharma

Louis-Philippe Lemieux Perreault

Marc-André Legault

Audrey Lemaçon

Marie-Pierre Dubé

CoRR, 2020

Evaluating Logical Generalization in Graph Neural Networks.

CoRR, 2020

Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic.

CoRR, 2020

Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning.

CoRR, 2020

Provably efficient reconstruction of policy networks.

CoRR, 2020

Stable Policy Optimization via Off-Policy Divergence Regularization.

Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

Novelty Search in Representational Space for Sample Efficient Exploration.

Ruo Yu Tao

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Plan2Vec: Unsupervised Representation Learning by Latent Plans.

Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Handling Black Swan Events in Deep Learning with Diversely Extrapolated Neural Networks.

Maxime Wabartha

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract).

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Constrained Markov Decision Processes via Backward Value Functions.

Harsh Satija

Philip Amortila

Proceedings of the 37th International Conference on Machine Learning, 2020

Online Learned Continual Compression with Adaptive Quantization Modules.

Proceedings of the 37th International Conference on Machine Learning, 2020

Interference and Generalization in Temporal Difference Learning.

Emmanuel Bengio

Proceedings of the 37th International Conference on Machine Learning, 2020

Invariant Causal Prediction for Block MDPs.

Proceedings of the 37th International Conference on Machine Learning, 2020

On the interaction between supervision and self-play in emergent communication.

Proceedings of the 8th International Conference on Learning Representations, 2020

Language GANs Falling Short.

Proceedings of the 8th International Conference on Learning Representations, 2020

Building reproducible, reusable, and robust machine learning software.

Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems, 2020

A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM.

Proceedings of the Artificial Intelligence in Education - 21st International Conference, 2020

Automated Personalized Feedback Improves Learning Gains in An Intelligent Tutoring System.

Proceedings of the Artificial Intelligence in Education - 21st International Conference, 2020

Learning an Unreferenced Metric for Online Dialogue Evaluation.

Koustuv Sinha

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Literature Mining for Incorporating Inductive Bias in Biomedical Prediction Tasks (Student Abstract).

Qizhen Zhang

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Exploiting Spatial Invariance for Scalable Unsupervised Object Tracking.

Eric Crawford

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability.

J. Artif. Intell. Res., 2019

Online Learned Continual Compression with Stacked Quantization Module.

CoRR, 2019

MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions.

CoRR, 2019

Benchmarking Batch Deep Reinforcement Learning Algorithms.

CoRR, 2019

Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning.

CoRR, 2019

Learning Causal State Representations of Partially Observable Environments.

Zachary C. Lipton

Luis Pineda

Kamyar Azizzadenesheli

CoRR, 2019

Recurrent Value Functions.

CoRR, 2019

Separating value functions across time-scales.

CoRR, 2019

The Second Conversational Intelligence Challenge (ConvAI2).

Alexander I. Rudnicky

CoRR, 2019

Randomized Value Functions via Multiplicative Normalizing Flows.

Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

No-Press Diplomacy: Modeling Multi-Agent Gameplay.

Jonathan K. Kummerfeld

Satinder Singh

Aaron C. Courville

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Deep Generative Modeling of LiDAR Data.

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Separable value functions across time-scales.

Proceedings of the 36th International Conference on Machine Learning, 2019

TarMAC: Targeted Multi-Agent Communication.

Proceedings of the 36th International Conference on Machine Learning, 2019

CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Seeded self-play for language learning.

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge, 2019

Leveraging exploration in off-policy algorithms via normalizing flows.

Proceedings of the 3rd Annual Conference on Robot Learning, 2019

On the Pitfalls of Measuring Emergent Communication.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Multitask Metric Learning: Theory and Algorithm.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Combined Reinforcement Learning via Abstract Representations.

Yoshua Bengio

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

On-Line Adaptative Curriculum Learning for GANs.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Spatially Invariant Unsupervised Object Detection with Convolutional Neural Networks.

Eric Crawford

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Streaming kernel regression with provably adaptive mean, variance, and regularization.

Odalric-Ambrym Maillard

J. Mach. Learn. Res., 2018

A Decision-Theoretic Approach for the Collaborative Control of a Smart Wheelchair.

Siddhartha S. Srinivasa

Int. J. Soc. Robotics, 2018

An Introduction to Deep Reinforcement Learning.

Found. Trends Mach. Learn., 2018

A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version.

Dialogue Discourse, 2018

Natural Environment Benchmarks for Reinforcement Learning.

Yuxin Wu

CoRR, 2018

Compositional Language Understanding with Text-based Relational Reasoning.

CoRR, 2018

The RLLChatbot: a solution to the ConvAI challenge.

CoRR, 2018

Adversarial Gain.

CoRR, 2018

Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods.

Peter Henderson

Joshua Romoff

CoRR, 2018

Sequential Coordination of Deep Models for Learning Visual Arithmetic.

Eric Crawford

Guillaume Rabusseau

CoRR, 2018

Online Adaptative Curriculum Learning for GANs.

CoRR, 2018

A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning.

Nicolas Ballas

CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.

CoRR, 2018

A Deep Reinforcement Learning Chatbot (Short Version).

Alexandre de Brébisson

CoRR, 2018

Temporal Regularization for Markov Decision Process.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis.

Proceedings of the Machine Learning for Healthcare Conference, 2018

An Inference-Based Policy Gradient Method for Learning Options.

Matthew J. A. Smith

Herke van Hoof

Proceedings of the 35th International Conference on Machine Learning, 2018

Focused Hierarchical RNNs for Conditional Sequence Processing.

Proceedings of the 35th International Conference on Machine Learning, 2018

Decoupling Dynamics and Reward for Transfer Learning.

Harsh Satija

Proceedings of the 6th International Conference on Learning Representations, 2018

Extending Neural Generative Conversational Model using External Knowledge Sources.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Reward Estimation for Variance Reduction in Deep Reinforcement Learning.

Joshua Romoff

Peter Henderson

Alexandre Piché

Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Ethical Challenges in Data-Driven Dialogue Systems.

Peter Henderson

Koustuv Sinha

Nicolas Angelard-Gontier

Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 2018

Deep Reinforcement Learning That Matters.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Modeling Glucagon Action in Patients With Type 1 Diabetes.

IEEE J. Biomed. Health Informatics, 2017

Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus.

Dialogue Discourse, 2017

Tensor Regression Networks with various Low-Rank Tensor Approximations.

Xingwei Cao

Guillaume Rabusseau

CoRR, 2017

ACtuAL: Actor-Critic Under Adversarial Learning.

CoRR, 2017

A Deep Reinforcement Learning Chatbot.

Alexandre de Brébisson

CoRR, 2017

Independently Controllable Factors.

CoRR, 2017

Independently Controllable Features.

CoRR, 2017

MACA: A Modular Architecture for Conversational Agents.

Hoai Phuoc Truong

Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, 2017

Predicting Success in Goal-Driven Human-Human Dialogues.

Jackie Chi Kit Cheung

Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, 2017

Multitask Spectral Learning of Weighted Automata.

Guillaume Rabusseau

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards an automatic Turing test: Learning to evaluate dialogue responses.

Ryan Lowe

Iulian Vlad Serban

Nicolas Angelard-Gontier

Yoshua Bengio

Proceedings of the 5th International Conference on Learning Representations, 2017

An Actor-Critic Algorithm for Sequence Prediction.

Proceedings of the 5th International Conference on Learning Representations, 2017

Piecewise Latent Variables for Neural Variational Text Processing.

Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, 2017

A Sparse Probabilistic Model of User Preference Data.

Matthew Smith

Laurent Charlin

Proceedings of the Advances in Artificial Intelligence, 2017

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Online Bagging and Boosting for Imbalanced Data Streams.

IEEE Trans. Knowl. Data Eng., 2016

Practical Kernel-Based Reinforcement Learning.

J. Mach. Learn. Res., 2016

Socially Adaptive Path Planning in Human Environments Using Inverse Reinforcement Learning.

Beomjoon Kim

Int. J. Soc. Robotics, 2016

Multi-modal Variational Encoder-Decoders.

Iulian Vlad Serban

Alexander G. Ororbia II

Aaron C. Courville

CoRR, 2016

Generative Deep Neural Networks for Dialogue: A Short Review.

CoRR, 2016

On the Evaluation of Dialogue Systems with Next Utterance Classification.

Ryan Lowe

Iulian Vlad Serban

Laurent Charlin

Proceedings of the SIGDIAL 2016 Conference, 2016

Learning Robust Features using Deep Learning for Automatic Seizure Detection.

Pierre Thodoroff

Andrew Lim

Proceedings of the 1st Machine Learning in Health Care, 2016

Generalized Dictionary for Multitask Learning with Boosting.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Learning time series models for pedestrian motion prediction.

Chenghui Zhou

Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation.

Chia-Wei Liu

Ryan Lowe

Iulian Serban

Laurent Charlin

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

On the Use of Modular Software and Hardware for Designing Wheelchair Robots.

Proceedings of the 2016 AAAI Spring Symposia, 2016

Multitask Generalized Eigenvalue Program.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Incremental Stochastic Factorization for Online Reinforcement Learning.

Rafael L. Beirigo

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Bayesian Reinforcement Learning: A Survey.

Found. Trends Mach. Learn., 2015

Hierarchical Neural Network Generative Models for Movie Dialogues.

CoRR, 2015

A Survey of Available Corpora for Building Data-Driven Dialogue Systems.

CoRR, 2015

Conditional Computation in Neural Networks for faster models.

CoRR, 2015

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems.

Proceedings of the SIGDIAL 2015 Conference, 2015

Automatically characterizing driving activities onboard smart wheelchairs from accelerometer data.

HiuKim Yuen

Philippe S. Archambault

Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data.

Rafael L. Beirigo

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Person tracking and following with 2D laser scanners.

Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Analyzing Open Data from the City of Montreal.

Pierre-Luc Bacon

Proceedings of the 2nd International Workshop on Mining Urban Data co-located with 32nd International Conference on Machine Learning (ICML 2015), 2015

Improving the Design and Discovery of Dynamic Treatment Strategies Using Recent Results in Sequential Decision-Making.

Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, 2015

Missteps in Robot Social Navigation.

Andrew Sutcliffe

Neil Tenenholtz

Proceedings of the 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12-14, 2015

Adaptive Treatment Allocation Using Sub-Sampled Gaussian Processes.

Proceedings of the 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12-14, 2015

Online Boosting Algorithms for Anytime Transfer and Multitask Learning.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Information Gathering and Reward Exploitation of Subgoals for POMDPs.

Hang Ma

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Efficient learning and planning with compressed predictive states.

William L. Hamilton

J. Mach. Learn. Res., 2014

Policy Iteration Based on Stochastic Factorization.

J. Artif. Intell. Res., 2014

End-to-End Text Recognition with Hybrid HMM Maxout Models.

Ouais Alsharif

Proceedings of the 2nd International Conference on Learning Representations, 2014

Lifelong Learning of Discriminative Representations.

Ouais Alsharif

Philip Bachman

CoRR, 2014

Methods of Moments for Learning Stochastic Languages: Unified Presentation and Empirical Comparison.

William L. Hamilton

Proceedings of the 31st International Conference on Machine Learning, 2014

Estimating People's Subjective Experiences of Robot Behavior.

Andrew Sutcliffe

Daniel H. Grollman

Proceedings of the 2014 AAAI Fall Symposia, Arlington, Virginia, USA, November 13-15, 2014

2013

Time Series Analysis Using Geometric Template Matching.

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Online Ensemble Learning for Imbalanced Data Streams.

CoRR, 2013

A survey of point-based POMDP solvers.

Guy Shani

Robert Kaplow

Auton. Agents Multi Agent Syst., 2013

Maximum Mean Discrepancy Imitation Learning.

Beomjoon Kim

Proceedings of the Robotics: Science and Systems IX, Technische Universität Berlin, Berlin, Germany, June 24, 2013

Learning from Limited Demonstrations.

Beomjoon Kim

Amir-massoud Farahmand

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces.

Yuri Grinberg

Amir-massoud Farahmand

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Modelling Sparse Dynamical Systems with Compressed Predictive State Representations.

William L. Hamilton

Proceedings of the 30th International Conference on Machine Learning, 2013

Designing Intelligent Wheelchairs: Reintegrating AI.

Proceedings of the Designing Intelligent Robots: Reintegrating AI II, 2013

Mixed Observability Predictive State Representations.

Sylvie C. W. Ong

Yuri Grinberg

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012

Building Adaptive Dialogue Systems Via Bayes-Adaptive POMDPs.

ShaoWei Png

IEEE J. Sel. Top. Signal Process., 2012

Proceedings of the 29th International Conference on Machine Learning (ICML-12)

John Langford

CoRR, 2012

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs.

Finale Doshi-Velez

Nicholas Roy

Artif. Intell., 2012

On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization.

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

An Empirical Analysis of Off-policy Learning in Discrete MDPs.

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Design and Evaluation of a Flexible Interface for Spatial Navigation.

Emily Tsang

Sylvie C. W. Ong

Proceedings of the Ninth Conference on Computer and Robot Vision, 2012

Compressed Least-Squares Regression on Sparse Spaces.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

A bistable computational model of recurring epileptiform activity as observed in rodent slice preparations.

Robert D. Vincent

Aaron C. Courville

Neural Networks, 2011

Informing sequential clinical decision-making through reinforcement learning: an empirical study.

Mach. Learn., 2011

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes.

J. Mach. Learn. Res., 2011

Non-Deterministic Policies in Markovian Decision Processes.

J. Artif. Intell. Res., 2011

PAC-Bayesian Policy Evaluation for Reinforcement Learning.

Csaba Szepesvári

Proceedings of the UAI 2011, 2011

Active Learning for Developing Personalized Treatment.

Kun Deng

Susan A. Murphy

Proceedings of the UAI 2011, 2011

The Duality of State and Observation in Probabilistic Transition Systems.

Proceedings of the Logic, Language, and Computation, 2011

Reinforcement Learning using Kernel-Based Stochastic Factorization.

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011

Bayesian reinforcement learning for POMDP-based dialogue systems.

ShaoWei Png

Proceedings of the IEEE International Conference on Acoustics, 2011

A Framework for Computing Bounds for the Return of a Policy.

Cosmin Paduraru

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Goal-Directed Online Learning of Predictive Models.

Sylvie C. W. Ong

Yuri Grinberg

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Mobility profile and wheelchair driving skills of powered wheelchair users: Sensor-based event recognition using a support vector machine classifier.

Athena K. Moghaddam

Jordan Frank

Philippe S. Archambault

Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

Active learning for personalizing treatment.

Kun Deng

Susan A. Murphy

Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

Automatic Seizure Detection in an In-Vivo Model of Epilepsy.

Guillaume Saulnier

Proceedings of the Computational Physiology, 2011

2010

Towards a standardized test for intelligent wheelchairs.

Proceedings of the 10th Performance Metrics for Intelligent Systems Workshop, 2010

PAC-Bayesian Model Selection for Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010

Variable resolution decomposition for robotic navigation under a POMDP framework.

Robert Kaplow

Amin Atrash

Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Multi-tasking SLAM.

Arthur Guez

Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Automatically suggesting topics for augmenting text documents.

Robert West

Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Treating Epilepsy by Reinforcement Learning Via Manifold-Based Simulation.

Keith Bush

Proceedings of the Manifold Learning and Its Applications, 2010

2009

Development and Validation of a Robust Speech Interface for Improved Human-Robot Interaction.

Int. J. Soc. Robotics, 2009

Treating Epilepsy via Adaptive Neurostimulation: a Reinforcement Learning Approach.

Int. J. Neural Syst., 2009

AAAI 2008 Workshop Reports.

AI Mag., 2009

Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability.

Keith Bush

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009

A bayesian reinforcement learning approach for customizing human-robot interfaces.

Amin Atrash

Proceedings of the 14th International Conference on Intelligent User Interfaces, 2009

Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts.

Robert West

Proceedings of the IJCAI 2009, 2009

Completing wikipedia's hyperlink structure through dimensionality reduction.

Robert West

Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008

Online Planning Algorithms for POMDPs.

J. Artif. Intell. Res., 2008

Model-Based Bayesian Reinforcement Learning in Large Structured Domains.

Proceedings of the UAI 2008, 2008

MDPs with Non-Deterministic Policies.

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Bayes-Adaptive POMDPs: A New Perspective on the Explore-Exploit Tradeoff in Partially Observable Domains.

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Bayesian reinforcement learning in continuous POMDPs with application to robot navigation.

Proceedings of the 2008 IEEE International Conference on Robotics and Automation, 2008

Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning.

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

A Variance Analysis for POMDP Policy Evaluation.

Peng Sun

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007

Apprentissage actif dans les processus décisionnels de Markov partiellement observables L'algorithme MEDUSA.

Robin Jaulmes

Rev. d'Intelligence Artif., 2007

Theoretical Analysis of Heuristic Search Methods for Online POMDPs.

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Bayes-Adaptive POMDPs.

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

A formal framework for robot learning and control under model uncertainty.

Robin Jaulmes

Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

Recurrent Boosting for Classification of Natural and Synthetic Time-Series Data.

Proceedings of the Advances in Artificial Intelligence, 2007

SmartWheeler: A Robotic Wheelchair Test-Bed for Investigating New Models of Human-Robot Interaction.

Amin Atrash

Proceedings of the Multidisciplinary Collaboration for Socially Assistive Robotics, 2007

2006

Planning under uncertainty in robotics.

Nikos Vlassis

Robotics Auton. Syst., 2006

Anytime Point-Based Approximations for Large POMDPs.

J. Artif. Intell. Res., 2006

PAC-Learning of Markov Models with Hidden State.

Proceedings of the Machine Learning: ECML 2006, 2006

RRT-Plan: A Randomized Algorithm for STRIPS Planning.

Daniel Burfoot

Gregory Dudek

Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, 2006

Representing Systems with Hidden State.

Proceedings of the Proceedings, 2006

2005

POMDP Planning for Robust Robot Control.

Proceedings of the Robotics Research: Results of the 12th International Symposium, 2005

Active Learning in Partially Observable Markov Decision Processes.

Robin Jaulmes

Proceedings of the Machine Learning: ECML 2005, 2005

2003

Towards robotic assistants in nursing homes: Challenges and results.

Robotics Auton. Syst., 2003

Policy-contingent abstraction for robust robot control.

Proceedings of the UAI '03, 2003

Applying Metric-Trees to Belief-Point POMDPs.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Point-based value iteration: An anytime algorithm for POMDPs.

Proceedings of the IJCAI-03, 2003

2002

Robotic Assistance During Ambulation by Older Adults.

Proceedings of the AMIA 2002, 2002

Experiences with a Mobile Robotic Guide for the Elderly.

Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, July 28, 2002

2000

Fast reinforcement learning of dialog strategies.

David Goddeau

Proceedings of the IEEE International Conference on Acoustics, 2000

Spoken Dialogue Management Using Probabilistic Reasoning.

Nicholas Roy