Doina Precup

Alexander Wong

CoRR, 2021

A Survey of Exploration Methods in Reinforcement Learning.

CoRR, 2021

Correcting Momentum in Temporal Difference Learning.

Emmanuel Bengio

CoRR, 2021

Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL.

CoRR, 2021

AndroidEnv: A Reinforcement Learning Platform for Android.

CoRR, 2021

What is Going on Inside Recurrent Meta Reinforcement Learning Agents?

Safa Alver

CoRR, 2021

Training a First-Order Theorem Prover from Synthetic Data.

CoRR, 2021

Reward is enough.

Artif. Intell., 2021

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Gradient Starvation: A Learning Proclivity in Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Flexible Option Learning.

Martin Klissarov

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Temporally Abstract Partial Models.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Expressivity of Markov Reward.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Randomized Exploration in Reinforcement Learning with General Value Function Approximation.

Proceedings of the 38th International Conference on Machine Learning, 2021

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation.

Scott Fujimoto

Proceedings of the 38th International Conference on Machine Learning, 2021

Preferential Temporal Difference Learning.

Nishanth V. Anand

Proceedings of the 38th International Conference on Machine Learning, 2021

Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards.

Proceedings of the 38th International Conference on Machine Learning, 2021

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata.

Proceedings of the 48th International Colloquium on Automata, Languages, and Programming, 2021

Self-Supervised Attention-Aware Reinforcement Learning.

Haiping Wu

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Variance Penalized On-Policy and Off-Policy Actor-Critic.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting.

IEEE Trans. Smart Grid, 2020

Fast reinforcement learning with generalized policy updates.

Proc. Natl. Acad. Sci. USA, 2020

Diversity-Enriched Option-Critic.

Anand Kamat

CoRR, 2020

A Study of Policy Gradient on a Class of Exactly Solvable Models.

Gavin McCracken

Colin Daniels

Rosie Zhao

Anna M. Brandenberger

CoRR, 2020

A Fully Tensorized Recurrent Neural Network.

Charles C. Onu

Jacob E. Miller

CoRR, 2020

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks.

CoRR, 2020

Learning to Prove from Synthetic Theorems.

CoRR, 2020

A Brief Look at Generalization in Visual Meta-Reinforcement Learning.

Safa Alver

CoRR, 2020

Policy Evaluation Networks.

CoRR, 2020

oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions.

CoRR, 2020

Provably efficient reconstruction of policy networks.

CoRR, 2020

On Efficiency in Hierarchical Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reward Propagation Using Graph Convolutional Networks.

Martin Klissarov

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Value-driven Hindsight Modelling.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay.

Scott Fujimoto

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Forethought and Hindsight in Credit Assignment.

Veronica Chelu

Hado van Hasselt

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

SVRG for Policy Evaluation with Fewer Gradient Evaluations.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

What can I do here? A Theory of Affordances in Reinforcement Learning.

Proceedings of the 37th International Conference on Machine Learning, 2020

Interference and Generalization in Temporal Difference Learning.

Emmanuel Bengio

Proceedings of the 37th International Conference on Machine Learning, 2020

Invariant Causal Prediction for Block MDPs.

Proceedings of the 37th International Conference on Machine Learning, 2020

Keynote Lecture - Building Knowledge For AI Agents With Reinforcement Learning.

Proceedings of the 16th IEEE International Conference on Intelligent Computer Communication and Processing, 2020

Learning to cooperate: Emergent communication in multi-agent navigation.

Ivana Kajic

Eser Aygün

Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

META-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation.

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Gifting in Multi-Agent Reinforcement Learning.

Maxime Chevalier-Boisvert

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Option-Critic in Cooperative Multi-agent Systems.

Jhelum Chakravorty

Patrick Nadeem Ward

Julien Roy

Sumana Basu

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Value Preserving State-Action Abstractions.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Gifting in Multi-Agent Reinforcement Learning (Student Abstract).

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Options of Interest: Temporal Abstraction with Interest Functions.

Maxime Chevalier-Boisvert

Martin Klissarov

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Singular value automata and approximate minimization.

Math. Struct. Comput. Sci., 2019

Shaping representations through communication: community size effect in artificial learning systems.

CoRR, 2019

Marginalized State Distribution Entropy Regularization in Policy Optimization.

Riashat Islam

Zafarali Ahmed

CoRR, 2019

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning.

CoRR, 2019

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods.

CoRR, 2019

Actor Critic with Differentially Private Critic.

CoRR, 2019

Augmenting learning using symmetry in a biologically-inspired domain.

CoRR, 2019

Avoidance Learning Using Observational Reinforcement Learning.

CoRR, 2019

Revisit Policy Optimization in Matrix Form.

Sitao Luan

Xiao-Wen Chang

Srinivas Venkattaramanujam

CoRR, 2019

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation.

Vincent Michalski

Vikram Voleti

Samira Ebrahimi Kahou

CoRR, 2019

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning.

Eric Crawford

Thang Doan

Nazanin Mohammadi Sepahvand

CoRR, 2019

Recurrent Value Functions.

CoRR, 2019

Community size effect in artificial learning systems.

Proceedings of the Visually Grounded Interaction and Language (ViGIL), 2019

Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Hindsight Credit Assignment.

Anna Harutyunyan

Will Dabney

Thomas Mesnard

Mohammad Gheshlaghi Azar

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

The Option Keyboard: Combining Skills in Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data.

Proceedings of the International Conference on Medical Imaging with Deep Learning, 2019

Improving Pathological Structure Segmentation via Transfer Learning Across Diseases.

Barleen Kaur

Paul Lemaître

Raghav Mehta

Douglas L. Arnold

Proceedings of the Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data, 2019

Early Prediction of Alzheimer's Disease Progression Using Variational Autoencoders.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Neural Transfer Learning for Cry-Based Diagnosis of Perinatal Asphyxia.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learning Reliable Policies in the Bandit Setting with Application to Adaptive Clinical Trials.

Tibor Schuster

Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials.

Juan Camilo Gamboa Higuera

Tibor Schuster

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks.

Sanjay Thakur

Herke van Hoof

Proceedings of the International Conference on Robotics and Automation, 2019

Per-Decision Option Discounting.

Proceedings of the 36th International Conference on Machine Learning, 2019

Off-Policy Deep Reinforcement Learning without Exploration.

Scott Fujimoto

Proceedings of the 36th International Conference on Machine Learning, 2019

Learning proposals for sequential importance samplers using reinforced variational inference.

Proceedings of the Deep Reinforcement Learning Meets Structured Prediction, 2019

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments.

Samira Ebrahimi Kahou

Proceedings of the 3rd Annual Conference on Robot Learning, 2019

Building Knowledge for AI Agents with Reinforcement Learning.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning.

Guillaume Rabusseau

Tianyu Li

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

The Termination Critic.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Leveraging Observations in Bandits: Between Risks and Benefits.

Audrey Durand

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Learning Options with Interest Functions.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Combined Reinforcement Learning via Abstract Representations.

Vincent François-Lavet

Yoshua Bengio

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Temporally Extended Metrics for Markov Decision Processes.

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the Thirty-Third AAAI Conference on Artificial Intelligence 2019 (AAAI-19), 2019

2018

Clustering-Oriented Representation Learning with Attractive-Repulsive Loss.

CoRR, 2018

Environments for Lifelong Reinforcement Learning.

CoRR, 2018

The Barbados 2018 List of Open Issues in Continual Learning.

CoRR, 2018

Attend Before you Act: Leveraging human visual attention for continual learning.

CoRR, 2018

Dyna Planning using a Feature Based Generative Model.

Ryan Faulkner

CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.

CoRR, 2018

Constructing Temporal Abstractions Autonomously in Reinforcement Learning.

AI Mag., 2018

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.

Kian Kenyon-Dean

Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, 2018

Temporal Regularization for Markov Decision Process.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning Safe Policies with Expert Guidance.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

Convergent TREE BACKUP and RETRACE with Function Approximation.

Proceedings of the 35th International Conference on Machine Learning, 2018

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants.

Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Leveraging Observational Learning for Exploration in Bandits.

Audrey Durand

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Eligibility Traces for Options.

Ayush Jain

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Nonlinear Weighted Finite Automata.

Tianyu Li

Guillaume Rabusseau

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Learning Robust Options.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Imitation Upper Confidence Bound for Bandits on a Graph.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning With Options That Terminate Off-Policy.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

When Waiting Is Not an Option: Learning Options With a Deliberation Cost.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning Predictive State Representations From Non-Uniform Sampling.

Melanie Lyman-Abramovitch

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Reinforcement Learning That Matters.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Learnings Options End-to-End for Continuous Action Tasks.

CoRR, 2017

Ubenwa: Cry-based Diagnosis of Birth Asphyxia.

Edward Alikor

Peace Opara

CoRR, 2017

Neural Network Based Nonlinear Weighted Finite Automata.

Tianyu Li

Guillaume Rabusseau

CoRR, 2017

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control.

CoRR, 2017

Independently Controllable Factors.

CoRR, 2017

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options.

Peeyush Kumar

CoRR, 2017

Investigating Recurrence and Eligibility Traces in Deep Q-Networks.

Jean Harb

CoRR, 2017

Independently Controllable Features.

CoRR, 2017

Predicting extubation readiness in extreme preterm infants based on patterns of breathing.

Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Boosting Based Multiple Kernel Learning and Transfer Regression for Electricity Load Forecasting.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Learning-based interactive segmentation using the maximum mean cycle weight formalism.

Proceedings of the Medical Imaging 2017: Image Processing, 2017

Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Approximate Value Iteration with Temporally Extended Actions (Extended Abstract).

Timothy A. Mann

Jesús Alejandro Cárdenes Cabré

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation.

Ricardo Sanz

Proceedings of the 2017 International Conference on Cloud and Autonomic Computing, 2017

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions.

Teng Long

Emmanuel Bengio

Ryan Lowe

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A semi-Markov chain approach to modeling respiratory patterns prior to extubation in preterm infants.

Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies.

Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

Real-Time Indoor Localization in Smart Homes Using Semi-Supervised Learning.

Negar Ghourchian

Michel Allegue-Martínez

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

The Option-Critic Architecture.

Jean Harb

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Hierarchical Spatio-Temporal Probabilistic Graphical Model with Multiple Feature Fusion for Binary Facial Attribute Classification in Real-World Face Videos.

IEEE Trans. Pattern Anal. Mach. Intell., 2016

Practical Kernel-Based Reinforcement Learning.

J. Mach. Learn. Res., 2016

Editorial on Special Issue on Probabilistic Models for Biomedical Image Analysis.

Comput. Vis. Image Underst., 2016

A Matrix Splitting Perspective on Planning with Options.

CoRR, 2016

Learning Multi-Step Predictive State Representations.

Lucas Langer

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Differentially Private Policy Evaluation.

Maziar Gomrokchi

Proceedings of the 33rd International Conference on Machine Learning, 2016

Verb Phrase Ellipsis Resolution Using Discriminative and Margin-Infused Algorithms.

Kian Kenyon-Dean

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Automated ongoing data validation and quality control of multi-institutional studies.

Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2016

Prediction of Cell Type Specific Transcription Factor Binding Site Occupancy.

Faizy Ahsan

Mathieu Blanchette

Proceedings of the 7th ACM International Conference on Bioinformatics, 2016

Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data.

Teng Long

Ryan Lowe

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Incremental Stochastic Factorization for Online Reinforcement Learning.

Rafael L. Beirigo

Jayashree Kalpathy-Cramer

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).

Bjoern H. Menze

András Jakab

Stefan Bauer

Elizabeth R. Gerstner

Khan M. Iftekharuddin

IEEE Trans. Medical Imaging, 2015

Classification-Based Approximate Policy Iteration.

Mohammad Ghavamzadeh

IEEE Trans. Autom. Control., 2015

Quantifying the determinants of outbreak detection performance through simulation and machine learning.

J. Biomed. Informatics, 2015

Approximate Value Iteration with Temporally Extended Actions.

Timothy A. Mann

J. Artif. Intell. Res., 2015

Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos.

Comput. Vis. Image Underst., 2015

Policy Gradient Methods for Off-policy Control.

Lucas Lehnert

CoRR, 2015

Conditional Computation in Neural Networks for faster models.

CoRR, 2015

Testing Visual Attention in Dynamic Environments.

David Krueger

CoRR, 2015

Learning and Planning with Timing Information in Markov Decision Processes.

Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Basis refinement strategies for linear value function approximation in MDPs.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Data Generation as Sequential Decision Making.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

A Canonical Form for Weighted Automata and Applications to Approximate Minimization.

Proceedings of the 30th Annual ACM/IEEE Symposium on Logic in Computer Science, 2015

IMaGe: Iterative Multilevel Probabilistic Graphical Model for Detection and Segmentation of Multiple Sclerosis Lesions in Brain MRI.

Proceedings of the Information Processing in Medical Imaging, 2015

An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data.

Rafael L. Beirigo

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Variational Generative Stochastic Networks with Collaborative Shaping.

Proceedings of the 32nd International Conference on Machine Learning, 2015

Correlation of clinical parameters with cardiorespiratory behavior in successfully extubated extremely preterm infants.

Lara J. Kanbar

Wissam Shalish

Carlos A. Robles-Rubio

Karen A. Brown

Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Organizational principles of cloud storage to support collaborative biomedical research.

Lara J. Kanbar

Wissam Shalish

Carlos A. Robles-Rubio

Karen A. Brown

Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Feature selection and oversampling in analysis of clinical data for extubation readiness in extreme preterm infants.

Pascale Gourdeau

Lara J. Kanbar

Wissam Shalish

Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Representation Discovery for MDPs Using Bisimulation Metrics.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Policy Iteration Based on Stochastic Factorization.

J. Artif. Intell. Res., 2014

Algorithms for multi-armed bandit problems.

Volodymyr Kuleshov

CoRR, 2014

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions.

Mohammad Ghavamzadeh

CoRR, 2014

Theoretical results on the effect of 'shortcut' actions in MDPs.

Sara M. McCarthy

Connect. Sci., 2014

Bisimulation Metrics are Optimal Value Functions.

Norman Ferns

Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Optimizing Energy Production Using Policy Search and Predictive State Representations.

Michel Gendreau

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Learning with Pseudo-Ensembles.

Ouais Alsharif

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

A new Q(lambda) with interim forward view and Monte Carlo equivalence.

Ashique Rupam Mahmood

Hado van Hasselt

Proceedings of the 31st International Conference on Machine Learning, 2014

Sample-based approximate regularization.

Proceedings of the 31st International Conference on Machine Learning, 2014

Multi-layer temporal graphical model for head pose estimation in real-world videos.

Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Probabilistic Temporal Head Pose Estimation Using a Hierarchical Graphical Model.

Proceedings of the Computer Vision - ECCV 2014, 2014

Iterative Multilevel MRF Leveraging Context and Voxel Information for Brain Tumour Segmentation in MRI.

Nagesh K. Subbanna

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Bisimulation for Markov Decision Processes through Families of Functional Expressions.

Sophia Knight

Proceedings of the Horizons of the Mind. A Tribute to Prakash Panangaden, 2014

Analyzing User Trajectories from Mobile Device Data with Hierarchical Dirichlet Processes.

Negar Ghourchian

Proceedings of the Advances in Artificial Intelligence, 2014

2013

Generating storylines from sensor data.

Pervasive Mob. Comput., 2013

Time Series Analysis Using Geometric Template Matching.

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Greedy Confidence Pursuit: A Pragmatic Approach to Multi-bandit Optimization.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Learning from Limited Demonstrations.

Beomjoon Kim

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces.

Mahdi Milani Fard

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Hierarchical Probabilistic Gabor and MRF Segmentation of Brain Tumours in MRI Volumes.

Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, 2013

Smart Classifier Selection for Activity Recognition on Wearable Devices.

Negar Ghourchian

Proceedings of the ICPRAM 2013, 2013

Average Reward Optimization Objective In Partially Observable Domains.

Proceedings of the 30th International Conference on Machine Learning, 2013

Assessing the Predictability of Hospital Readmission Using Machine Learning.

Proceedings of the Twenty-Fifth Innovative Applications of Artificial Intelligence Conference, 2013

Smart exploration in reinforcement learning using absolute temporal difference errors.

Clement Gehring

Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Using Hierarchical Mixture of Experts Model for Fusion of Outbreak Detection Methods.

Proceedings of the AMIA 2013, 2013

2012

An information-theoretic approach to curiosity-driven reinforcement learning.

Susanne Still

Theory Biosci., 2012

On Average Reward Policy Evaluation in Infinite-State Partially Observable Systems.

[BibT_eX]

[DOI]

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Reports of the AAAI 2011 Conference Workshops.

[BibT_eX]

[DOI]

AI Mag., 2012

On-the-Fly Algorithms for Bisimulation Metrics.

[BibT_eX]

[DOI]

Proceedings of the Ninth International Conference on Quantitative Evaluation of Systems, 2012

Value Pursuit Iteration.

[BibT_eX]

[DOI]

Amir Massoud Farahmand

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Improved Estimation in Time Varying Models.

[BibT_eX]

[DOI]

Proceedings of the 29th International Conference on Machine Learning, 2012

An Empirical Analysis of Off-policy Learning in Discrete MDPs.

[BibT_eX]

[DOI]

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Prediction of extubation readiness in extreme preterm infants based on measures of cardiorespiratory variability.

[BibT_eX]

[DOI]

Carlos A. Robles-Rubio

Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Soft biometric trait classification from real-world face videos conditioned on head pose estimation.

[BibT_eX]

[DOI]

Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Mining Administrative Data to Predict Falls in the Elderly Population.

[BibT_eX]

[DOI]

Proceedings of the Advances in Artificial Intelligence, 2012

Compressed Least-Squares Regression on Sparse Spaces.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

Bisimulation Metrics for Continuous Markov Decision Processes.

[BibT_eX]

[DOI]

SIAM J. Comput., 2011

The Duality of State and Observation in Probabilistic Transition Systems.

[BibT_eX]

[DOI]

Proceedings of the Logic, Language, and Computation, 2011

Activity Recognition with Mobile Phones.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Reinforcement Learning using Kernel-Based Stochastic Factorization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Adapted MRF Segmentation of Multiple Sclerosis Lesions Using Local Contextual Information.

[BibT_eX]

Proceedings of the Medical Image Understanding and Analysis, 2011

A Framework for Computing Bounds for the Return of a Policy.

[BibT_eX]

[DOI]

Cosmin Paduraru

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics.

[BibT_eX]

[DOI]

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction.

[BibT_eX]

[DOI]

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Activity Recognition with Time-Delay Embeddings.

[BibT_eX]

[DOI]

Proceedings of the Computational Physiology, 2011

Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Learning Compact Representations of Time-Varying Processes.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010

Classification of Normal and Hypoxic Fetuses From Systems Modeling of Intrapartum Cardiotocography.

[BibT_eX]

[DOI]

IEEE Trans. Biomed. Eng., 2010

A Study of Approximate Inference in Probabilistic Relational Models.

[BibT_eX]

[DOI]

Fabian Kaelin

Proceedings of the 2nd Asian Conference on Machine Learning, 2010

Smarter Sampling in Model-Based Bayesian Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Approximate Predictive Representations of Partially Observable Systems.

[BibT_eX]

[DOI]

Monica Dinculescu

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A Machine Learning Approach to the Detection of Fetal Hypoxia during Labor and Delivery.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Second Conference on Innovative Applications of Artificial Intelligence, 2010

A novel similarity measure for time series data with applications to gait and activity recognition.

[BibT_eX]

[DOI]

Proceedings of the UbiComp 2010: Ubiquitous Computing, 12th International Conference, 2010

An Algebraic Approach to Dynamic Epistemic Logic.

[BibT_eX]

[DOI]

Proceedings of the 23rd International Workshop on Description Logics (DL 2010), 2010

Automatically suggesting topics for augmenting text documents.

[BibT_eX]

[DOI]

Robert West

Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Optimal policy switching algorithms for reinforcement learning.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Activity and Gait Recognition with Time-Delay Embeddings.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

Using Bisimulation for Policy Transfer in MDPs.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009

Identification of the Dynamic Relationship Between Intrapartum Uterine Pressure and Fetal Heart Rate for Normal and Hypoxic Fetuses.

[BibT_eX]

[DOI]

IEEE Trans. Biomed. Eng., 2009

Learning the Difference between Partially Observable Dynamical Systems.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts.

[BibT_eX]

[DOI]

Robert West

Proceedings of the IJCAI 2009, 2009

Equivalence Relations in Fully and Partially Observable Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the IJCAI 2009, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.

[BibT_eX]

[DOI]

Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Completing Wikipedia's hyperlink structure through dimensionality reduction.

[BibT_eX]

[DOI]

Robert West

Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008

Anytime similarity measures for faster alignment.

[BibT_eX]

[DOI]

Rupert Brooks

Comput. Vis. Image Underst., 2008

Bounding Performance Loss in Approximate MDP Homomorphisms.

[BibT_eX]

[DOI]

Jonathan Taylor

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Reinforcement learning in the presence of rare events.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2008

Point-Based Planning for Predictive State Representations.

[BibT_eX]

[DOI]

Proceedings of the Advances in Artificial Intelligence, 2008

2007

Apprentissage actif dans les processus décisionnels de Markov partiellement observables L'algorithme MEDUSA.

[BibT_eX]

[DOI]

Robin Jaulmes

Rev. d'Intelligence Artif., 2007

Using Linear Programming for Bayesian Exploration in Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the IJCAI 2007, 2007

Fast Image Alignment Using Anytime Algorithms.

[BibT_eX]

[DOI]

Rupert Brooks

Proceedings of the IJCAI 2007, 2007

Context-Driven Predictions.

[BibT_eX]

[DOI]

Marc G. Bellemare

Proceedings of the IJCAI 2007, 2007

A formal framework for robot learning and control under model uncertainty.

[BibT_eX]

[DOI]

Robin Jaulmes

Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

Representing Systems with Hidden State.

[BibT_eX]

[DOI]

Proceedings of the Computational Approaches to Representation Change during Learning and Development, 2007

2006

Methods for Computing State Similarity in Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the UAI '06, 2006

Data Mining Using Relational Database Management Systems.

[BibT_eX]

[DOI]

Proceedings of the Advances in Knowledge Discovery and Data Mining, 2006

Automatic basis function construction for approximate dynamic programming and reinforcement learning.

[BibT_eX]

[DOI]

Philipp W. Keller

Proceedings of the Machine Learning, 2006

Linear models of intrapartum uterine pressure-fetal heart rate interaction for the normal and hypoxic fetus.

[BibT_eX]

[DOI]

Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006

PAC-Learning of Markov Models with Hidden State.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning: ECML 2006, 2006

Belief Selection in Point-Based Planning Algorithms for POMDPs.

[BibT_eX]

[DOI]

Danielle Azar

Proceedings of the Advances in Artificial Intelligence, 2006

2005

The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.

[BibT_eX]

[DOI]

AI Mag., 2005

Metrics for Markov Decision Processes with Infinite State Spaces.

[BibT_eX]

[DOI]

Proceedings of the UAI '05, 2005

An approximation algorithm for labelled Markov processes: towards realistic approximation.

[BibT_eX]

[DOI]

Alexandre Bouchard-Côté

Proceedings of the Second International Conference on the Quantitative Evaluation of Systems (QEST 2005), 2005

Off-policy Learning with Options and Recognizers.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Using core beliefs for point-based value iteration.

[BibT_eX]

[DOI]

Ajit V. Rajwade

Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Model minimization by linear PSR.

[BibT_eX]

[DOI]

Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Active Learning in Partially Observable Markov Decision Processes.

[BibT_eX]

[DOI]

Robin Jaulmes

Proceedings of the Machine Learning: ECML 2005, 2005

Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning: ECML 2005, 2005

2004

RedAgent: winner of TAC SCM 2003.

[BibT_eX]

[DOI]

Philipp W. Keller

Felix-Olivier Duguay

SIGecom Exch., 2004

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning.

[BibT_eX]

[DOI]

Bohdana Ratitch

Proceedings of the Machine Learning: ECML 2004, 2004

RedAgent-2003: An Autonomous Market-Based Supply-Chain Management Agent.

[BibT_eX]

[DOI]

Philipp W. Keller

Felix-Olivier Duguay

Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

Metrics for Finite Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003

A Planning Algorithm for Predictive State Representations.

[BibT_eX]

[DOI]

Proceedings of the IJCAI-03, 2003

Combining TD-learning with Cascade-correlation Networks.

[BibT_eX]

[DOI]

François Rivest

Proceedings of the Machine Learning, 2003

Using MDP Characteristics to Guide Exploration in Reinforcement Learning.

[BibT_eX]

[DOI]

Bohdana Ratitch

Proceedings of the Machine Learning: ECML 2003, 2003

2002

Learning Options in Reinforcement Learning.

[BibT_eX]

[DOI]

Martin Stolle

Proceedings of the Abstraction, 2002

A Convergent Form of Approximate Policy Iteration.

[BibT_eX]

[DOI]

Theodore J. Perkins

Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Combining and Adapting Software Quality Predictive Models by Genetic Algorithms.

[BibT_eX]

[DOI]

Proceedings of the 17th IEEE International Conference on Automated Software Engineering (ASE 2002), 2002

Characterizing Markov Decision Processes.

[BibT_eX]

[DOI]

Bohdana Ratitch

Proceedings of the Machine Learning: ECML 2002, 2002

2001

Developing Collaborative Golog Agents by Reinforcement Learning.

[BibT_eX]

[DOI]

Ioan Alfred Letia

Proceedings of the 13th IEEE International Conference on Tools with Artificial Intelligence, 2001

Off-Policy Temporal Difference Learning with Function Approximation.

[BibT_eX]

Sanjoy Dasgupta

Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

2000

Eligibility Traces for Off-Policy Policy Evaluation.

[BibT_eX]

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Using Finite Experiments to Study Asymptotic Performance.

[BibT_eX]

[DOI]

Proceedings of the Experimental Algorithmics, 2000

1999

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.

[BibT_eX]

[DOI]

Artif. Intell., 1999

1998

Improved Switching among Temporally Abstract Actions.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Intra-Option Learning about Temporally Abstract Actions.

[BibT_eX]

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Classification Using Phi-Machines and Constructive Function Approximation.

[BibT_eX]

Paul E. Utgoff

Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Theoretical Results on Reinforcement Learning with Temporally Abstract Options.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning: ECML-98, 1998

1997

Multi-time Models for Temporally Abstract Planning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Learning to Schedule Straight-Line Code.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 10, 1997

How to Find Big-Oh in Your Data Set (and How Not to).

[BibT_eX]

[DOI]

Catherine C. McGeoch

Paul R. Cohen

Proceedings of the Advances in Intelligent Data Analysis, 1997

Exponentiated Gradient Methods for Reinforcement Learning.

[BibT_eX]