Doina Precup

According to our database, Doina Precup authored at least 215 papers between 1997 and 2018.

Bibliography

2018
Combined Reinforcement Learning via Abstract Representations.
CoRR, 2018

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants.
CoRR, 2018

Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing.
CoRR, 2018

Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation.
CoRR, 2018

Attend Before you Act: Leveraging human visual attention for continual learning.
CoRR, 2018

Safe Option-Critic: Learning Safety in the Option-Critic Architecture.
CoRR, 2018

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning.
CoRR, 2018

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.
CoRR, 2018

Dyna Planning using a Feature Based Generative Model.
CoRR, 2018

Learning Safe Policies with Expert Guidance.
CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.
CoRR, 2018

Learning Robust Options.
CoRR, 2018

Constructing Temporal Abstractions Autonomously in Reinforcement Learning.
AI Magazine, 2018

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, 2018

Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

Convergent TREE BACKUP and RETRACE with Function Approximation.
Proceedings of the 35th International Conference on Machine Learning, 2018

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Leveraging Observational Learning for Exploration in Bandits.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Eligibility Traces for Options.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Nonlinear Weighted Finite Automata.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Learning Robust Options.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Imitation Upper Confidence Bound for Bandits on a Graph.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning With Options That Terminate Off-Policy.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

When Waiting Is Not an Option: Learning Options With a Deliberation Cost.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning Predictive State Representations From Non-Uniform Sampling.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Reinforcement Learning That Matters.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Learnings Options End-to-End for Continuous Action Tasks.
CoRR, 2017

Ubenwa: Cry-based Diagnosis of Birth Asphyxia.
CoRR, 2017

Singular value automata and approximate minimization.
CoRR, 2017

Learning with Options that Terminate Off-Policy.
CoRR, 2017

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning.
CoRR, 2017

Deep Reinforcement Learning that Matters.
CoRR, 2017

When Waiting is not an Option: Learning Options with a Deliberation Cost.
CoRR, 2017

Neural Network Based Nonlinear Weighted Finite Automata.
CoRR, 2017

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control.
CoRR, 2017

Independently Controllable Factors.
CoRR, 2017

Variational Generative Stochastic Networks with Collaborative Shaping.
CoRR, 2017

Convergent Tree-Backup and Retrace with Function Approximation.
CoRR, 2017

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options.
CoRR, 2017

Investigating Recurrence and Eligibility Traces in Deep Q-Networks.
CoRR, 2017

Independently Controllable Features.
CoRR, 2017

Predicting extubation readiness in extreme preterm infants based on patterns of breathing.
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Boosting Based Multiple Kernel Learning and Transfer Regression for Electricity Load Forecasting.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Learning-based interactive segmentation using the maximum mean cycle weight formalism.
Proceedings of the Medical Imaging 2017: Image Processing, 2017

Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Approximate Value Iteration with Temporally Extended Actions (Extended Abstract).
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation.
Proceedings of the 2017 International Conference on Cloud and Autonomic Computing, 2017

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A semi-Markov chain approach to modeling respiratory patterns prior to extubation in preterm infants.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

Real-Time Indoor Localization in Smart Homes Using Semi-Supervised Learning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

The Option-Critic Architecture.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Hierarchical Spatio-Temporal Probabilistic Graphical Model with Multiple Feature Fusion for Binary Facial Attribute Classification in Real-World Face Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Practical Kernel-Based Reinforcement Learning.
Journal of Machine Learning Research, 2016

Editorial on Special Issue on Probabilistic Models for Biomedical Image Analysis.
Computer Vision and Image Understanding, 2016

Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data.
CoRR, 2016

Differentially Private Policy Evaluation.
CoRR, 2016

A Matrix Splitting Perspective on Planning with Options.
CoRR, 2016

The Option-Critic Architecture.
CoRR, 2016

Learning Multi-Step Predictive State Representations.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Differentially Private Policy Evaluation.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Verb Phrase Ellipsis Resolution Using Discriminative and Margin-Infused Algorithms.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Automated ongoing data validation and quality control of multi-institutional studies.
Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2016

Prediction of Cell Type Specific Transcription Factor Binding Site Occupancy.
Proceedings of the 7th ACM International Conference on Bioinformatics, 2016

Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Incremental Stochastic Factorization for Online Reinforcement Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).
IEEE Trans. Med. Imaging, 2015

Classification-Based Approximate Policy Iteration.
IEEE Trans. Automat. Contr., 2015

Quantifying the determinants of outbreak detection performance through simulation and machine learning.
Journal of Biomedical Informatics, 2015

Approximate Value Iteration with Temporally Extended Actions.
J. Artif. Intell. Res., 2015

Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos.
Computer Vision and Image Understanding, 2015

Policy Gradient Methods for Off-policy Control.
CoRR, 2015

Conditional Computation in Neural Networks for faster models.
CoRR, 2015

A Canonical Form for Weighted Automata and Applications to Approximate Minimization.
CoRR, 2015

Data Generation as Sequential Decision Making.
CoRR, 2015

Testing Visual Attention in Dynamic Environments.
CoRR, 2015

Learning and Planning with Timing Information in Markov Decision Processes.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Basis refinement strategies for linear value function approximation in MDPs.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Data Generation as Sequential Decision Making.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

A Canonical Form for Weighted Automata and Applications to Approximate Minimization.
Proceedings of the 30th Annual ACM/IEEE Symposium on Logic in Computer Science, 2015

IMaGe: Iterative Multilevel Probabilistic Graphical Model for Detection and Segmentation of Multiple Sclerosis Lesions in Brain MRI.
Proceedings of the Information Processing in Medical Imaging, 2015

An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Variational Generative Stochastic Networks with Collaborative Shaping.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Correlation of clinical parameters with cardiorespiratory behavior in successfully extubated extremely preterm infants.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Organizational principles of cloud storage to support collaborative biomedical research.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Feature selection and oversampling in analysis of clinical data for extubation readiness in extreme preterm infants.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Representation Discovery for MDPs Using Bisimulation Metrics.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Policy Iteration Based on Stochastic Factorization.
J. Artif. Intell. Res., 2014

Algorithms for multi-armed bandit problems.
CoRR, 2014

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions.
CoRR, 2014

Practical Kernel-Based Reinforcement Learning.
CoRR, 2014

Learning with Pseudo-Ensembles.
CoRR, 2014

Theoretical results on the effect of 'shortcut' actions in MDPs.
Connect. Sci., 2014

Bisimulation Metrics are Optimal Value Functions.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Optimizing Energy Production Using Policy Search and Predictive State Representations.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Learning with Pseudo-Ensembles.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

A new Q(lambda) with interim forward view and Monte Carlo equivalence.
Proceedings of the 31st International Conference on Machine Learning, 2014

Sample-based approximate regularization.
Proceedings of the 31st International Conference on Machine Learning, 2014

Multi-layer temporal graphical model for head pose estimation in real-world videos.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Probabilistic Temporal Head Pose Estimation Using a Hierarchical Graphical Model.
Proceedings of the Computer Vision - ECCV 2014, 2014

Iterative Multilevel MRF Leveraging Context and Voxel Information for Brain Tumour Segmentation in MRI.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Bisimulation for Markov Decision Processes through Families of Functional Expressions.
Proceedings of the Horizons of the Mind. A Tribute to Prakash Panangaden, 2014

Analyzing User Trajectories from Mobile Device Data with Hierarchical Dirichlet Processes.
Proceedings of the Advances in Artificial Intelligence, 2014

2013
Generating storylines from sensor data.
Pervasive and Mobile Computing, 2013

Time Series Analysis Using Geometric Template Matching.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Greedy Confidence Pursuit: A Pragmatic Approach to Multi-bandit Optimization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Learning from Limited Demonstrations.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Hierarchical Probabilistic Gabor and MRF Segmentation of Brain Tumours in MRI Volumes.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, 2013

Smart Classifier Selection for Activity Recognition on Wearable Devices.
Proceedings of the ICPRAM 2013, 2013

Average Reward Optimization Objective In Partially Observable Domains.
Proceedings of the 30th International Conference on Machine Learning, 2013

Assessing the Predictability of Hospital Readmission Using Machine Learning.
Proceedings of the Twenty-Fifth Innovative Applications of Artificial Intelligence Conference, 2013

Smart exploration in reinforcement learning using absolute temporal difference errors.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Using Hierarchical Mixture of Experts Model for Fusion of Outbreak Detection Methods.
Proceedings of the AMIA 2013, 2013

2012
An information-theoretic approach to curiosity-driven reinforcement learning.
Theory in Biosciences, 2012

On Average Reward Policy Evaluation in Infinite-State Partially Observable Systems.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces.
CoRR, 2012

Metrics for Finite Markov Decision Processes.
CoRR, 2012

Metrics for Markov Decision Processes with Infinite State Spaces.
CoRR, 2012

Methods for computing state similarity in Markov Decision Processes.
CoRR, 2012

A Machine Learning Approach to the Detection of Fetal Hypoxia during Labor and Delivery.
AI Magazine, 2012

Reports of the AAAI 2011 Conference Workshops.
AI Magazine, 2012

On-the-Fly Algorithms for Bisimulation Metrics.
Proceedings of the Ninth International Conference on Quantitative Evaluation of Systems, 2012

Value Pursuit Iteration.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Improved Estimation in Time Varying Models.
Proceedings of the 29th International Conference on Machine Learning, 2012

An Empirical Analysis of Off-policy Learning in Discrete MDPs.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Prediction of extubation readiness in extreme preterm infants based on measures of cardiorespiratory variability.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Soft biometric trait classification from real-world face videos conditioned on head pose estimation.
Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Mining Administrative Data to Predict Falls in the Elderly Population.
Proceedings of the Advances in Artificial Intelligence, 2012

Compressed Least-Squares Regression on Sparse Spaces.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Bisimulation Metrics for Continuous Markov Decision Processes.
SIAM J. Comput., 2011

The Duality of State and Observation in Probabilistic Transition Systems.
Proceedings of the Logic, Language, and Computation, 2011

Activity Recognition with Mobile Phones.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Reinforcement Learning using Kernel-Based Stochastic Factorization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Adapted MRF Segmentation of Multiple Sclerosis Lesions Using Local Contextual Information.
Proceedings of the Medical Image Understanding and Analysis, 2011

A Framework for Computing Bounds for the Return of a Policy.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction.
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics.
Proceedings of the Adaptive and Learning Agents - International Workshop, 2011

Basis function discovery using spectral clustering and bisimulation metrics.
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Activity Recognition with Time-Delay Embeddings.
Proceedings of the Computational Physiology, 2011

Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Learning Compact Representations of Time-Varying Processes.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Classification of Normal and Hypoxic Fetuses From Systems Modeling of Intrapartum Cardiotocography.
IEEE Trans. Biomed. Engineering, 2010

A Study of Approximate Inference in Probabilistic Relational Models.
Proceedings of the 2nd Asian Conference on Machine Learning, 2010

Smarter Sampling in Model-Based Bayesian Reinforcement Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Approximate Predictive Representations of Partially Observable Systems.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A Machine Learning Approach to the Detection of Fetal Hypoxia during Labor and Delivery.
Proceedings of the Twenty-Second Conference on Innovative Applications of Artificial Intelligence, 2010

A novel similarity measure for time series data with applications to gait and activity recognition.
Proceedings of the UbiComp 2010: Ubiquitous Computing, 12th International Conference, 2010

An Algebraic Approach to Dynamic Epistemic Logic.
Proceedings of the 23rd International Workshop on Description Logics (DL 2010), 2010

Automatically suggesting topics for augmenting text documents.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Optimal policy switching algorithms for reinforcement learning.
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Using bisimulation for policy transfer in MDPs.
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Activity and Gait Recognition with Time-Delay Embeddings.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

Using Bisimulation for Policy Transfer in MDPs.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
Identification of the Dynamic Relationship Between Intrapartum Uterine Pressure and Fetal Heart Rate for Normal and Hypoxic Fetuses.
IEEE Trans. Biomed. Engineering, 2009

Learning the Difference between Partially Observable Dynamical Systems.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts.
Proceedings of the IJCAI 2009, 2009

Equivalence Relations in Fully and Partially Observable Markov Decision Processes.
Proceedings of the IJCAI 2009, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Completing wikipedia's hyperlink structure through dimensionality reduction.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
Anytime similarity measures for faster alignment.
Computer Vision and Image Understanding, 2008

Bounding Performance Loss in Approximate MDP Homomorphisms.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Reinforcement learning in the presence of rare events.
Proceedings of the Machine Learning, 2008

Point-Based Planning for Predictive State Representations.
Proceedings of the Advances in Artificial Intelligence, 2008

2007
Apprentissage actif dans les processus décisionnels de Markov partiellement observables L'algorithme MEDUSA.
Revue d'Intelligence Artificielle, 2007

Using Linear Programming for Bayesian Exploration in Markov Decision Processes.
Proceedings of the IJCAI 2007, 2007

Fast Image Alignment Using Anytime Algorithms.
Proceedings of the IJCAI 2007, 2007

Context-Driven Predictions.
Proceedings of the IJCAI 2007, 2007

A formal framework for robot learning and control under model uncertainty.
Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

2006
Methods for Computing State Similarity in Markov Decision Processes.
Proceedings of the UAI '06, 2006

Data Mining Using Relational Database Management Systems.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2006

Automatic basis function construction for approximate dynamic programming and reinforcement learning.
Proceedings of the Machine Learning, 2006

PAC-Learning of Markov Models with Hidden State.
Proceedings of the Machine Learning: ECML 2006, 2006

Belief Selection in Point-Based Planning Algorithms for POMDPs.
Proceedings of the Advances in Artificial Intelligence, 2006

Representing Systems with Hidden State.
Proceedings of the Proceedings, 2006

2005
The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.
AI Magazine, 2005

Metrics for Markov Decision Processes with Infinite State Spaces.
Proceedings of the UAI '05, 2005

An approximation algorithm for labelled Markov processes: towards realistic approximation.
Proceedings of the Second International Conference on the Quantitative Evaluation of Systems (QEST 2005), 2005

Off-policy Learning with Options and Recognizers.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Using core beliefs for point-based value iteration.
Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Model minimization by linear PSR.
Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Active Learning in Partially Observable Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2005, 2005

Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2005, 2005

2004
Redagent: winner of TAC SCM 2003.
SIGecom Exchanges, 2004

Classification Using Phi-Machines and Constructive Function Approximation.
Machine Learning, 2004

Metrics for Finite Markov Decision Processes.
Proceedings of the UAI '04, 2004

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2004, 2004

RedAgent-2003: An Autonomous Market-Based Supply-Chain Management Agent.
Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

Metrics for Finite Markov Decision Processes.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
A Planning Algorithm for Predictive State Representations.
Proceedings of the IJCAI-03, 2003

Combining TD-learning with Cascade-correlation Networks.
Proceedings of the Machine Learning, 2003

Using MDP Characteristics to Guide Exploration in Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2003, 2003

2002
Developing Collaborative Golog Agents by Reinforcement Learning.
International Journal on Artificial Intelligence Tools, 2002

Learning Options in Reinforcement Learning.
Proceedings of the Abstraction, 2002

A Convergent Form of Approximate Policy Iteration.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Combining and Adapting Software Quality Predictive Models by Genetic Algorithms.
Proceedings of the 17th IEEE International Conference on Automated Software Engineering (ASE 2002), 2002

Characterizing Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2002, 2002

2001
Developing Collaborative Golog Agents by Reinforcement Learning.
Proceedings of the 13th IEEE International Conference on Tools with Artificial Intelligence, 2001

Off-Policy Temporal Difference Learning with Function Approximation.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

2000
Eligibility Traces for Off-Policy Policy Evaluation.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Using Finite Experiments to Study Asymptotic Performance.
Proceedings of the Experimental Algorithmics, 2000

1999
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
Artif. Intell., 1999

1998
Improved Switching among Temporally Abstract Actions.
Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Intra-Option Learning about Temporally Abstract Actions.
Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Classification Using Phi-Machines and Constructive Function Approximation.
Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Theoretical Results on Reinforcement Learning with Temporally Abstract Options.
Proceedings of the Machine Learning: ECML-98, 1998

1997
Multi-time Models for Temporally Abstract Planning.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Learning to Schedule Straight-Line Code.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

How to Find Big-Oh in Your Data Set (and How Not to).
Proceedings of the Advances in Intelligent Data Analysis, 1997

Exponentiated Gradient Methods for Reinforcement Learning.
Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), 1997

