Csaba Szepesvári

According to our database, Csaba Szepesvári authored at least 183 papers between 1993 and 2019.

Bibliography

2019
Perturbed-History Exploration in Stochastic Linear Bandits.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

Perturbed-History Exploration in Stochastic Multi-Armed Bandits.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration.
Proceedings of the 36th International Conference on Machine Learning, 2019

Online Learning to Rank with Features.
Proceedings of the 36th International Conference on Machine Learning, 2019

Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits.
Proceedings of the 36th International Conference on Machine Learning, 2019

Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures.
Proceedings of the 7th International Conference on Learning Representations, 2019

An Information-Theoretic Approach to Minimax Regret in Partial Monitoring.
Proceedings of the Conference on Learning Theory, 2019

Distribution-Dependent Analysis of Gibbs-ERM Principle.
Proceedings of the Conference on Learning Theory, 2019

Cleaning up the neighborhood: A full classification for adversarial partial monitoring.
Proceedings of the Algorithmic Learning Theory, 2019

An Exponential Tail Bound for Lq Stable Learning Rules.
Proceedings of the Algorithmic Learning Theory, 2019

Online Algorithm for Unsupervised Sensor Selection.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Model-Free Linear Quadratic Control via Reduction to Expert Prediction.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

An Exponential Tail Bound for the Deleted Estimate.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes.
IEEE Trans. Automat. Contr., 2018

Stochastic Optimization in a Cumulative Prospect Theory Framework.
IEEE Trans. Automat. Contr., 2018

PAC-Bayes bounds for stable algorithms with instance-dependent priors.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

TopRank: A practical algorithm for online stochastic ranking.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2018

LEAPSANDBOUNDS: A Method for Approximately Optimal Algorithm Configuration.
Proceedings of the 35th International Conference on Machine Learning, 2018

Bandits with Delayed, Aggregated Anonymous Feedback.
Proceedings of the 35th International Conference on Machine Learning, 2018

Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers.
Proceedings of the 35th International Conference on Machine Learning, 2018

Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities.
J. Mach. Learn. Res., 2017

Multi-view Matrix Factorization for Linear Dynamical System Estimation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Bernoulli Rank-1 Bandits for Click Feedback.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Online Learning to Rank in Stochastic Click Models.
Proceedings of the 34th International Conference on Machine Learning, 2017

A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds.
Proceedings of the International Conference on Algorithmic Learning Theory, 2017

Structured Best Arm Identification with Fixed Confidence.
Proceedings of the International Conference on Algorithmic Learning Theory, 2017

The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Stochastic Rank-1 Bandits.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Unsupervised Sequential Sensor Acquisition.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Regularized Policy Iteration with Nonparametric Function Spaces.
J. Mach. Learn. Res., 2016

SDP Relaxation with Randomized Rounding for Energy Disaggregation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Conservative Bandits.
Proceedings of the 33rd International Conference on Machine Learning, 2016

DCM Bandits: Learning to Rank with Multiple Clicks.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Shifting Regret, Mirror Descent, and Matrices.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control.
Proceedings of the 33rd International Conference on Machine Learning, 2016

(Bandit) Convex Optimization with Biased Noisy Gradient Oracles.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Bayesian Optimal Control of Smoothly Parameterized Systems.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Online Learning with Gaussian Payoffs and Side Observations.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Linear Multi-Resource Allocation with Semi-Bandit Feedback.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Combinatorial Cascading Bandits.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Fast Cross-Validation for Incremental Learning.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Cascading Bandits: Learning to Rank in the Cascade Model.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Deterministic Independent Component Analysis.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Decision-theoretic Clustering of Strategies.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

Exploiting Symmetries to Construct Efficient MCMC Algorithms With an Application to SLAM.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

Toward Minimax Off-policy Value Estimation.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

Near-optimal max-affine estimators for convex regression.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

Pathological Effects of Variance on Classification-Based Policy Iteration.
Proceedings of the Learning for General Competency in Video Games, 2015

Decision-Theoretic Clustering of Strategies.
Proceedings of the Computer Poker and Imperfect Information, 2015

2014
Sequential Learning for Multi-Channel Wireless Network Monitoring With Channel Switching Costs.
IEEE Trans. Signal Processing, 2014

Guest Editors' introduction.
Theor. Comput. Sci., 2014

Partial Monitoring - Classification, Regret Bounds, and Algorithms.
Math. Oper. Res., 2014

Optimal Resource Allocation with Semi-Bandit Feedback.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Universal Option Models.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Generalization Bounds for Partially Linear Models.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2014

Adaptive Monte Carlo via Bandit Allocation.
Proceedings of the 31st International Conference on Machine Learning, 2014

Online Learning in Markov Decision Processes with Changing Cost Sequences.
Proceedings of the 31st International Conference on Machine Learning, 2014

On Learning the Optimal Waiting Time.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models.
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, 2014

Pseudo-MDPs and factored linear action models.
Proceedings of the 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2014

2013
Toward a classification of finite partial-monitoring games.
Theor. Comput. Sci., 2013

Alignment based kernel learning with a continuous set of base kernels.
Machine Learning, 2013

Online Learning with Costly Features and Labels.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Characterizing the Representer Theorem.
Proceedings of the 30th International Conference on Machine Learning, 2013

Cost-sensitive Multiclass Classification Risk Bounds.
Proceedings of the 30th International Conference on Machine Learning, 2013

Online Learning under Delayed Feedback.
Proceedings of the 30th International Conference on Machine Learning, 2013

A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
The adversarial stochastic shortest path problem with unknown transition probabilities.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

The grand challenge of computer Go: Monte Carlo tree search and extensions.
Commun. ACM, 2012

Deep Representations and Codes for Image Auto-Annotation.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Analysis of Kernel Mean Matching under Covariate Shift.
Proceedings of the 29th International Conference on Machine Learning, 2012

Statistical linear estimation with penalized estimators: an application to reinforcement learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

An adaptive algorithm for finite stochastic partial monitoring.
Proceedings of the 29th International Conference on Machine Learning, 2012

Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Preface.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Partial Monitoring with Side Information.
Proceedings of the Algorithmic Learning Theory - 23rd International Conference, 2012

Approximate Policy Iteration with Linear Action Models.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Model selection in reinforcement learning.
Machine Learning, 2011

Agnostic KWIK learning and efficient approximate reinforcement learning.
Proceedings of the COLT 2011, 2011

X-Armed Bandits.
J. Mach. Learn. Res., 2011

Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments.
Proceedings of the COLT 2011, 2011

Regret Bounds for the Adaptive Control of Linear Quadratic Systems.
Proceedings of the COLT 2011, 2011

PAC-Bayesian Policy Evaluation for Reinforcement Learning.
Proceedings of the UAI 2011, 2011

Improved Algorithms for Linear Stochastic Bandits.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Sequential learning for optimal monitoring of multi-channel wireless networks.
Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

Invited Talk: Towards Robust Reinforcement Learning Algorithms.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Editors' Introduction.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

2010
Algorithms for Reinforcement Learning
Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers, 2010

Active learning in heteroscedastic noise.
Theor. Comput. Sci., 2010

A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Models of active learning in group-structured state spaces.
Inf. Comput., 2010

Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Online Markov Decision Processes under Bandit Feedback.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Parametric Bandits: The Generalized Linear Case.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Error Propagation for Approximate Policy and Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Model-based reinforcement learning with nearly tight exploration complexity bounds.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Toward Off-Policy Learning Control with Function Approximation.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Budgeted Distribution Learning of Belief Net Parameters.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

The Online Loop-free Stochastic Shortest-Path Problem.
Proceedings of the COLT 2010, 2010

Toward a Classification of Finite Partial-Monitoring Games.
Proceedings of the Algorithmic Learning Theory, 21st International Conference, 2010

2009
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits.
Theor. Comput. Sci., 2009

Training parsers by inverse reinforcement learning.
Machine Learning, 2009

Learning Exercise Policies for American Options.
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

A General Projection Property for Distribution Families.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Multi-Step Dyna Planning for Policy Evaluation and Control.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Model-based and model-free reinforcement learning for visual servoing.
Proceedings of the 2009 IEEE International Conference on Robotics and Automation, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Learning when to stop thinking and do something!
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Learning to segment from a few well-selected training images.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Workshop summary: On-line learning with limited feedback.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS.
Proceedings of the 48th IEEE Conference on Decision and Control, 2009

2008
Finite-Time Bounds for Fitted Value Iteration.
J. Mach. Learn. Res., 2008

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping.
Proceedings of the UAI 2008, 2008

Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction.
Proceedings of the UAI 2008, 2008

A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Regularized Policy Iteration.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Online Optimization in X-Armed Bandits.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Empirical Bernstein stopping.
Proceedings of the Machine Learning, 2008

Regularized Fitted Q-Iteration: Application to Planning.
Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Active Learning of Group-Structured Environments.
Proceedings of the Algorithmic Learning Theory, 19th International Conference, 2008

Active Learning in Multi-armed Bandits.
Proceedings of the Algorithmic Learning Theory, 19th International Conference, 2008

2007
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods.
Proceedings of the UAI 2007, 2007

Fitted Q-iteration in continuous action-space MDPs.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Continuous Time Associative Bandit Problems.
Proceedings of the IJCAI 2007, 2007

Sequence Prediction Exploiting Similarity Information.
Proceedings of the IJCAI 2007, 2007

Manifold-adaptive dimension estimation.
Proceedings of the Machine Learning, 2007

Improved Rates for the Stochastic Continuum-Armed Bandit Problem.
Proceedings of the Learning Theory, 20th Annual Conference on Learning Theory, 2007

Tuning Bandit Algorithms in Stochastic Environments.
Proceedings of the Algorithmic Learning Theory, 18th International Conference, 2007

2006
Universal parameter optimisation in games based on SPSA.
Machine Learning, 2006

Local Importance Sampling: A Novel Technique to Enhance Particle Filtering.
Journal of Multimedia, 2006

Bandit Based Monte-Carlo Planning.
Proceedings of the Machine Learning: ECML 2006, 2006

Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

RSPSA: Enhanced Parameter Optimization in Games.
Proceedings of the Advances in Computer Games, 11th International Conference, 2006

2005
Finite time bounds for sampling based fitted value iteration.
Proceedings of the Machine Learning, 2005

X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs When the Number of Mixtures Is Unknown.
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005

2004
Interpolation-based Q-learning.
Proceedings of the Machine Learning, 2004

Margin Maximizing Discriminant Analysis.
Proceedings of the Machine Learning: ECML 2004, 2004

Enhancing Particle Filters Using Local Likelihood Sampling.
Proceedings of the Computer Vision, 2004

Kernel Machine Based Feature Extraction Algorithms for Regression Problems.
Proceedings of the 16th European Conference on Artificial Intelligence, 2004

Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
Sequential Importance Sampling for Visual Tracking Reconsidered.
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, 2003

2002
An Asymptotic Scaling Analysis of LQ Performance for an Approximate Adaptive Control Design.
MCSS, 2002

LQ performance bounds for adaptive output feedback controllers for functionally uncertain nonlinear systems.
Automatica, 2002

2001
Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops.
Int. J. Neural Syst., 2001

Efficient approximate planning in continuous space Markovian Decision Problems.
AI Commun., 2001

2000
Uncertainty, performance, and model dependency in approximate adaptive nonlinear control.
IEEE Trans. Automat. Contr., 2000

Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms.
Machine Learning, 2000

Modular Reinforcement Learning: A Case Study in a Robot Domain.
Acta Cybern., 2000

FlexVoice: A Parametric Approach to High-Quality Speech Synthesis.
Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

1999
Parallel and robust skeletonization built on self-organizing elements.
Neural Networks, 1999

A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms.
Neural Computation, 1999

The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments.
Nucleic Acids Research, 1999

1998
Module-Based Reinforcement Learning: Experiments with a Real Robot.
Machine Learning, 1998

An integrated architecture for motion-control and path-planning.
J. Field Robotics, 1998

Module-Based Reinforcement Learning: Experiments with a Real Robot.
Auton. Robots, 1998

Non-Markovian Policies in Sequential Decision Problems.
Acta Cybern., 1998

Performance-Evaluation for Automated Detection of Microcalcifications in Mammograms Using Three Different Film-Digitizers.
Proceedings of the Digital Mammography, 1998

Automated Detection and Classification of Micro-Calcifications in Mammograms Using Artificial Neural Nets.
Proceedings of the Digital Mammography, 1998

Multi-criteria Reinforcement Learning.
Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

1997
Neurocontroller using dynamic state feedback for compensatory control.
Neural Networks, 1997

The Asymptotic Convergence-Rate of Q-learning.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Module Based Reinforcement Learning: An Application to a Real Robot.
Proceedings of the Learning Robots, 6th European Workshop, 1997

Learning and Exploitation Do Not Conflict Under Minimax Optimality.
Proceedings of the Machine Learning: ECML-97, 1997

1996
Approximate geometry representations and sensory fusion.
Neurocomputing, 1996

Self-Organizing Multi-Resolution Grid for Motion Planning and Control.
Int. J. Neural Syst., 1996

A Generalized Reinforcement-Learning Model: Convergence and Applications.
Proceedings of the Machine Learning, 1996

Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers.
Proceedings of the Artificial Neural Networks, 1996

1994
Topology Learning Solved by Extended Objects: A Neural Network Model.
Neural Computation, 1994

1993
Behavior of an Adaptive Self-organizing Autonomous Agent Working with Cues and Competing Concepts.
Adaptive Behaviour, 1993

