Martha White

Orcid: 0000-0002-5356-2950

Affiliations:

University of Alberta, Edmonton, Canada

According to our database¹, Martha White authored at least 141 papers between 2009 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

Measure-to-measure Regression with Transformers.

[DOI]

Matthew Vandergrift

,

,

Yury Polyanskiy

,

Philippe Rigollet

,

Lazar Atanackovic

CoRR, May, 2026

Addressing Terminal Constraints in Data-Driven Demand Response Scheduling.

[DOI]

Maximilian Bloor

,

,

Ehecatl Antonio del Rio-Chanona

,

CoRR, May, 2026

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic.

[DOI]

,

,

,

,

CoRR, May, 2026

Forager: a lightweight testbed for continual learning with partial observability in RL.

[DOI]

,

,

Anna Hakhverdyan

,

Andrew Patterson

,

,

,

,

Parham Mohammad Panahi

,

,

CoRR, May, 2026

Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models.

[DOI]

,

,

Siddhant Agarwal

,

,

,

CoRR, March, 2026

Gradient Iterated Temporal-Difference Learning.

[DOI]

,

,

Yogesh Tripathi

,

,

,

,

,

CoRR, March, 2026

Value Bonuses using Ensemble Errors for Exploration in Reinforcement Learning.

[DOI]

,

Raksha Kumaraswamy

,

CoRR, February, 2026

PC-Gym: Benchmark environments for process control problems.

[DOI]

Maximilian Bloor

,

,

Ilya Orson Sandoval

,

,

,

Mehmet Mercangöz

,

,

Ehecatl Antonio del Rio-Chanona

,

Comput. Chem. Eng., 2026

2025

An Analysis of Action-Value Temporal-Difference Methods That Learn State Values.

[DOI]

,

Prabhat Nagarajan

,

,

Marlos C. Machado

CoRR, July, 2025

Deep Reinforcement Learning with Gradient Eligibility Traces.

[DOI]

,

,

Andrew Patterson

,

Marlos C. Machado

,

,

CoRR, July, 2025

Double Q-learning for Value-based Deep Reinforcement Learning, Revisited.

[DOI]

Prabhat Nagarajan

,

,

Marlos C. Machado

CoRR, July, 2025

Distribution Parameter Actor-Critic: Shifting the Agent-Environment Boundary for Diverse Action Spaces.

[DOI]

,

A. Rupam Mahmood

,

CoRR, June, 2025

Fine-Tuning without Performance Degradation.

[DOI]

,

,

CoRR, May, 2025

Position: Lifetime tuning is incompatible with continual reinforcement learning.

[DOI]

,

Parham Mohammad Panahi

,

Olya Mastikhina

,

,

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

q-exponential family for policy optimization.

[DOI]

,

,

,

,

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

GVFs in the real world: making predictions online for water treatment.

[DOI]

Muhammad Kamran Janjua

,

,

,

,

Marlos C. Machado

,

Mach. Learn., July, 2024

Offline Reinforcement Learning via Tsallis Regularization.

[DOI]

,

Matthew Schlegel

,

,

Trans. Mach. Learn. Res., 2024

Empirical Design in Reinforcement Learning.

[DOI]

Andrew Patterson

,

,

,

J. Mach. Learn. Res., 2024

Goal-Space Planning with Subgoal Models.

[DOI]

,

,

Parham Mohammad Panahi

,

Scott M. Jordan

,

,

,

Farzane Aminmansour

,

J. Mach. Learn. Res., 2024

Data-Efficient Policy Evaluation Through Behavior Policy Search.

[DOI]

Josiah P. Hanna

,

,

Philip S. Thomas

,

,

,

J. Mach. Learn. Res., 2024

Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models.

[DOI]

Farzane Aminmansour

,

Taher Jafferjee

,

,

Erin J. Talvitie

,

Michael Bowling

,

J. Artif. Intell. Res., 2024

q-exponential family for policy optimization.

[DOI]

,

,

,

CoRR, 2024

The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning.

[DOI]

Andrew Patterson

,

,

Raksha Kumaraswamy

,

,

CoRR, 2024

A New View on Planning in Online Reinforcement Learning.

[DOI]

,

Parham Mohammad Panahi

,

Scott M. Jordan

,

,

CoRR, 2024

Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL.

[DOI]

,

Olya Mastikhina

,

Parham Mohammad Panahi

,

,

CoRR, 2024

Investigating the Histogram Loss in Regression.

[DOI]

,

,

Sam Scholnick-Hughes

,

,

CoRR, 2024

What to Do When Your Discrete Optimization Is the Size of a Neural Network?

[DOI]

,

CoRR, 2024

Compound Returns Reduce Variance in Reinforcement Learning.

[DOI]

,

,

Marlos C. Machado

CoRR, 2024

Investigating the properties of neural network representations in reinforcement learning.

[DOI]

,

,

,

Marlos C. Machado

,

,

Raksha Kumaraswamy

,

,

Artif. Intell., 2024

Cross-environment Hyperparameter Tuning for Reinforcement Learning.

[DOI]

Andrew Patterson

,

,

Raksha Kumaraswamy

,

,

RLJ, 2024

Investigating the Interplay of Prioritized Replay and Generalization.

[DOI]

Parham Mohammad Panahi

,

Andrew Patterson

,

,

RLJ, 2024

Demystifying the Recency Heuristic in Temporal-Difference Learning.

[DOI]

,

Marlos C. Machado

,

RLJ, 2024

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.

[DOI]

,

Mohamed Elsayed

,

Seyed Alireza Azimi

,

,

,

Colin Bellinger

,

,

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Real-Time Recurrent Learning using Trace Units in Reinforcement Learning.

[DOI]

,

,

Michael Bowling

,

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Position: Benchmarking is Limited in Reinforcement Learning Research.

[DOI]

Scott M. Jordan

,

,

Bruno Castro da Silva

,

,

Philip S. Thomas

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Averaging n-step Returns Reduces Variance in Reinforcement Learning.

[DOI]

,

,

Marlos C. Machado

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning (Abstract Reprint).

[DOI]

,

James R. Wright

,

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Robust Losses for Learning Value Functions.

[DOI]

Andrew Patterson

,

,

IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning.

[DOI]

Matthew Schlegel

,

Volodymyr Tkachuk

,

,

Trans. Mach. Learn. Res., 2023

Resmax: An Alternative Soft-Greedy Operator for Reinforcement Learning.

[DOI]

,

,

,

Abbas Masoumzadeh

,

Trans. Mach. Learn. Res., 2023

Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks.

[DOI]

,

,

Richard S. Sutton

,

J. Mach. Learn. Res., 2023

Off-Policy Actor-Critic with Emphatic Weightings.

[DOI]

,

,

Raksha Kumaraswamy

,

J. Mach. Learn. Res., 2023

Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning.

[DOI]

,

James R. Wright

,

J. Artif. Intell. Res., 2023

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

[DOI]

,

Prabhat Nagarajan

,

Andrew Patterson

,

CoRR, 2023

Coagent Networks: Generalized and Scaled.

[DOI]

James E. Kostas

,

Scott M. Jordan

,

,

Georgios Theocharous

,

,

,

Bruno Castro da Silva

,

Philip S. Thomas

CoRR, 2023

Online Real-Time Recurrent Learning Using Sparse Connections and Selective Learning.

[DOI]

,

,

Richard S. Sutton

,

CoRR, 2023

Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence.

[DOI]

,

,

Takamitsu Matsubara

,

CoRR, 2023

General Munchausen Reinforcement Learning with Tsallis Kullback-Leibler Divergence.

[DOI]

,

,

Matthew Schlegel

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning.

[DOI]

,

,

Christopher Amato

,

Marlos C. Machado

Proceedings of the International Conference on Machine Learning, 2023

The In-Sample Softmax for Offline Reinforcement Learning.

[DOI]

,

,

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement.

[DOI]

,

,

Ajin George Joseph

,

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Measuring and Mitigating Interference in Reinforcement Learning.

[DOI]

,

,

,

,

,

Proceedings of the Conference on Lifelong Learning Agents, 2023

Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments.

[DOI]

,

,

Philip S. Thomas

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL.

[DOI]

,

Archit Sakhadeo

,

,

,

,

,

,

,

,

Trans. Mach. Learn. Res., 2022

Representation Alignment in Neural Networks.

[DOI]

,

,

Trans. Mach. Learn. Res., 2022

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning.

[DOI]

Andrew Patterson

,

,

J. Mach. Learn. Res., 2022

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.

[DOI]

,

,

,

,

A. Rupam Mahmood

,

J. Mach. Learn. Res., 2022

Goal-Space Planning with Subgoal Models.

[DOI]

,

,

,

Farzane Aminmansour

,

CoRR, 2022

Understanding and mitigating the limitations of prioritized experience replay.

[DOI]

,

,

Amir-massoud Farahmand

,

,

,

,

Proceedings of the Uncertainty in Artificial Intelligence, 2022

A Temporal-Difference Approach to Policy Gradient Estimation.

[DOI]

Samuele Tosatto

,

Andrew Patterson

,

,

Proceedings of the International Conference on Machine Learning, 2022

Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum.

[DOI]

,

,

,

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

An Alternate Policy Gradient Estimator for Softmax Policies.

[DOI]

,

Samuele Tosatto

,

,

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Sim2Real in Robotics and Automation: Applications and Challenges.

[DOI]

Sebastian Höfer

,

Kostas E. Bekris

,

,

Juan Camilo Gamboa

,

Melissa Mozifian

,

,

Christopher G. Atkeson

,

,

,

John J. Leonard

,

,

,

,

,

IEEE Trans. Autom. Sci. Eng., 2021

General Value Function Networks.

[DOI]

Matthew Schlegel

,

Andrew Jacobsen

,

,

Andrew Patterson

,

,

J. Artif. Intell. Res., 2021

Understanding Feature Transfer Through Representation Alignment.

[DOI]

,

,

CoRR, 2021

Exploiting Action Impact Regularity and Partially Known Models for Offline Reinforcement Learning.

[DOI]

,

James R. Wright

,

CoRR, 2021

Predictive Representation Learning for Language Modeling.

[DOI]

,

,

,

CoRR, 2021

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning.

[DOI]

Andrew Patterson

,

,

,

CoRR, 2021

Scalable Online Recurrent Learning Using Columnar Neural Networks.

[DOI]

,

,

Richard S. Sutton

CoRR, 2021

Continual Auxiliary Task Learning.

[DOI]

,

,

Matthew Schlegel

,

Andrew Jacobsen

,

Raksha Kumaraswamy

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Structural Credit Assignment in Neural Networks using Reinforcement Learning.

[DOI]

,

,

Matthew Schlegel

,

James E. Kostas

,

Philip S. Thomas

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online.

[DOI]

,

,

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study.

[DOI]

,

,

,

,

J. Artif. Intell. Res., 2020

Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop.

[DOI]

Sebastian Höfer

,

Kostas E. Bekris

,

,

Juan Camilo Gamboa Higuera

,

,

Melissa Mozifian

,

Christopher G. Atkeson

,

,

,

John J. Leonard

,

,

,

,

,

CoRR, 2020

From Language to Language-ish: How Brain-Like is an LSTM's Representation of Nonsensical Language Stimuli?

[DOI]

Maryam Hashemzadeh

,

,

,

Andrea E. Martin

,

CoRR, 2020

Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities.

[DOI]

,

,

,

Amir-massoud Farahmand

,

CoRR, 2020

Towards a practical measure of interference for reinforcement learning.

[DOI]

,

,

,

CoRR, 2020

Learning Causal Models Online.

[DOI]

,

,

CoRR, 2020

Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models.

[DOI]

Taher Jafferjee

,

,

,

,

Michael Bowling

CoRR, 2020

Maximizing Information Gain in Partially Observable Environments via Prediction Reward.

[DOI]

,

,

Shimon Whiteson

,

Frans A. Oliehoek

,

CoRR, 2020

An implicit function learning approach for parametric modal regression.

[DOI]

,

,

Amir-massoud Farahmand

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Towards Safe Policy Improvement for Non-Stationary MDPs.

[DOI]

,

Scott M. Jordan

,

Georgios Theocharous

,

,

Philip S. Thomas

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Gradient Temporal-Difference Learning with Regularized Corrections.

[DOI]

,

Andrew Patterson

,

,

,

,

Proceedings of the 37th International Conference on Machine Learning, 2020

Optimizing for the Future in Non-Stationary MDPs.

[DOI]

,

Georgios Theocharous

,

,

,

Sridhar Mahadevan

,

Philip S. Thomas

Proceedings of the 37th International Conference on Machine Learning, 2020

Selective Dyna-Style Planning Under Limited Model Capacity.

[DOI]

,

,

,

Proceedings of the 37th International Conference on Machine Learning, 2020

Training Recurrent Neural Networks Online by Learning Explicit State Variables.

[DOI]

,

,

,

,

,

Proceedings of the 8th International Conference on Learning Representations, 2020

Maxmin Q-learning: Controlling the Estimation Bias of Q-learning.

[DOI]

,

,

,

Proceedings of the 8th International Conference on Learning Representations, 2020

From Language to Language-ish: How Brain-Like is an LSTM's Representation of Atypical Language Stimuli?

[DOI]

Maryam Hashemzadeh

,

,

,

Andrea E. Martin

,

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Maximizing Information Gain in Partially Observable Environments via Prediction Rewards.

[DOI]

,

,

Shimon Whiteson

,

Frans A. Oliehoek

,

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

2019

Is Fast Adaptation All You Need?

[DOI]

,

,

CoRR, 2019

Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study.

[DOI]

,

,

,

,

CoRR, 2019

Importance Resampling for Off-policy Prediction.

[DOI]

Matthew Schlegel

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Meta-Learning Representations for Continual Learning.

[DOI]

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning Macroscopic Brain Connectomes via Group-Sparse Factorization.

[DOI]

Farzane Aminmansour

,

Andrew Patterson

,

,

,

Daniel Mitchell

,

Franco Pestilli

,

Cesar F. Caiafa

,

Russell Greiner

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Planning with Expectation Models.

[DOI]

,

Muhammad Zaheer

,

,

,

Richard S. Sutton

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Hill Climbing on Value Estimates for Search-control in Dyna.

[DOI]

,

,

Amir-massoud Farahmand

,

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Two-Timescale Networks for Nonlinear Value Function Approximation.

[DOI]

,

,

,

Proceedings of the 7th International Conference on Learning Representations, 2019

The Utility of Sparse Representations for Control in Reinforcement Learning.

[DOI]

,

Raksha Kumaraswamy

,

,

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Meta-Descent for Online, Continual Prediction.

[DOI]

Andrew Jacobsen

,

Matthew Schlegel

,

,

,

,

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling.

[DOI]

,

,

,

,

CoRR, 2018

The Barbados 2018 List of Open Issues in Continual Learning.

[DOI]

,

Hado van Hasselt

,

,

,

,

Pierre-Luc Bacon

,

,

,

Marc G. Bellemare

,

CoRR, 2018

Online Off-policy Prediction.

[DOI]

,

Andrew Patterson

,

,

Richard S. Sutton

,

CoRR, 2018

Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces.

[DOI]

,

,

,

,

CoRR, 2018

General Value Function Networks.

[DOI]

Matthew Schlegel

,

,

Andrew Patterson

,

CoRR, 2018

Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods.

[DOI]

,

Brendan Bennett

,

,

Dylan R. Ashley

,

,

,

Richard S. Sutton

CoRR, 2018

Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return.

[DOI]

,

Dylan R. Ashley

,

Brendan Bennett

,

,

,

,

Richard S. Sutton

Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

High-confidence error estimates for learned value functions.

[DOI]

,

,

Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Supervised autoencoders: Improving generalization performance with unsupervised regularizers.

[DOI]

,

Andrew Patterson

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Context-dependent upper-confidence bounds for directed exploration.

[DOI]

Raksha Kumaraswamy

,

Matthew Schlegel

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

An Off-policy Policy Gradient Theorem Using Emphatic Weightings.

[DOI]

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains.

[DOI]

,

Muhammad Zaheer

,

,

Andrew Patterson

,

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control.

[DOI]

,

Amir-massoud Farahmand

,

,

,

,

Daniel Nikovski

Proceedings of the 35th International Conference on Machine Learning, 2018

Improving Regression Performance with Distributional Losses.

[DOI]

,

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Effective sketching methods for value function approximation.

[DOI]

,

Erfan Sadeqi Azer

,

Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Multi-view Matrix Factorization for Linear Dynamical System Estimation.

[DOI]

,

,

Dale Schuurmans

,

Csaba Szepesvári

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Sparse Representations in Reinforcement Learning with Sparse Coding.

[DOI]

,

Raksha Kumaraswamy

,

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Unifying Task Specification in Reinforcement Learning.

[DOI]

Proceedings of the 34th International Conference on Machine Learning, 2017

Adapting Kernel Representations Online Using Submodular Maximization.

[DOI]

Matthew Schlegel

,

,

,

Proceedings of the 34th International Conference on Machine Learning, 2017

Accelerated Gradient Temporal Difference Learning.

[DOI]

,

,

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Recovering True Classifier Performance in Positive-Unlabeled Learning.

[DOI]

,

,

Predrag Radivojac

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning.

[DOI]

Richard S. Sutton

,

Ashique Rupam Mahmood

,

J. Mach. Learn. Res., 2016

Global optimization of factor models using alternating minimization.

[DOI]

,

CoRR, 2016

Nonparametric semi-supervised learning of class proportions.

[DOI]

,

,

Michael W. Trosset

,

Predrag Radivojac

CoRR, 2016

Estimating the class prior and posterior from noisy positives and unlabeled data.

[DOI]

,

,

Predrag Radivojac

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Incremental Truncated LSTD.

[DOI]

Clement Gehring

,

,

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning.

[DOI]

,

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Investigating Practical Linear Temporal Difference Learning.

[DOI]

,

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

2015

Emphatic Temporal-Difference Learning.

[DOI]

Ashique Rupam Mahmood

,

,

,

Richard S. Sutton

CoRR, 2015

Incremental Truncated LSTD.

[DOI]

Clement Gehring

,

CoRR, 2015

Scalable Metric Learning for Co-Embedding.

[DOI]

Farzaneh Mirzazadeh

,

,

András György

,

Dale Schuurmans

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

Optimal Estimation of Multivariate ARMA Models.

[DOI]

,

,

Michael Bowling

,

Dale Schuurmans

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2013

Partition Tree Weighting.

[DOI]

,

,

Michael Bowling

,

András György

Proceedings of the 2013 Data Compression Conference, 2013

2012

Generalized Optimal Reverse Prediction.

[DOI]

,

Dale Schuurmans

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Off-Policy Actor-Critic.

[DOI]

,

,

Richard S. Sutton

CoRR, 2012

Convex Multi-view Subspace Learning.

[DOI]

,

,

,

Dale Schuurmans

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Linear Off-Policy Actor-Critic.

[DOI]

,

,

Richard S. Sutton

Proceedings of the 29th International Conference on Machine Learning, 2012

2011

Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions.

[DOI]

,

,

,

,

Dale Schuurmans

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010

Relaxed Clipping: A Global Training Method for Robust Regression and Classification.

[DOI]

,

,

,

,

Dale Schuurmans

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains.

[DOI]

,

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

2009

Learning a Value Analysis Tool for Agent Evaluation.

[DOI]

,

Michael H. Bowling

Proceedings of the IJCAI 2009, 2009

Optimal reverse prediction: a unified perspective on supervised, unsupervised and semi-supervised learning.

[DOI]

,

,

Dale Schuurmans

Proceedings of the 26th Annual International Conference on Machine Learning, 2009
