Shie Mannor

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Online PCA for Contaminated Data.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Detecting epidemics using highly noisy data.

[BibT_eX]

[DOI]

Chris Milling

Sanjay Shakkottai

Proceedings of the Fourteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2013

Model selection in markovian processes.

[BibT_eX]

[DOI]

Assaf Hallak

Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

Temporal Difference Methods for the Variance of the Reward To Go.

[BibT_eX]

[DOI]

Aviv Tamar

Proceedings of the 30th International Conference on Machine Learning, 2013

Robust Sparse Regression under Adversarial Corruption.

[BibT_eX]

[DOI]

Yudong Chen

Proceedings of the 30th International Conference on Machine Learning, 2013

Approachability, fast and slow.

[BibT_eX]

[DOI]

Vianney Perchet

Proceedings of the COLT 2013, 2013

Opportunistic Strategies for Generalized No-Regret Problems.

[BibT_eX]

[DOI]

Andrey Bernstein

Proceedings of the COLT 2013, 2013

Online Learning for Time Series Prediction.

[BibT_eX]

[DOI]

Proceedings of the COLT 2013, 2013

2012

Dithered Belief Propagation Decoding.

[BibT_eX]

[DOI]

François Leduc-Primeau

Saied Hemati

IEEE Trans. Commun., 2012

Statistical Optimization in High Dimensions.

[BibT_eX]

[DOI]

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Preface.

[BibT_eX]

[DOI]

Nathan Srebro

Proceedings of the COLT 2012, 2012

More Is Better: Large Scale Partially-supervised Sentiment Classification.

[BibT_eX]

[DOI]

Yoav Haimovitch

Koby Crammer

Proceedings of the 4th Asian Conference on Machine Learning, 2012

Optimization Under Probabilistic Envelope Constraints.

[BibT_eX]

[DOI]

Oper. Res., 2012

More Is Better: Large Scale Partially-supervised Sentiment Classification - Appendix

[BibT_eX]

[DOI]

Yoav Haimovitch

Koby Crammer

CoRR, 2012

How to sample if you must: on optimal functional sampling

[BibT_eX]

[DOI]

Assaf Hallak

CoRR, 2012

Clustered Bandits

[BibT_eX]

[DOI]

Loc Bui

CoRR, 2012

Approximately optimal bidding policies for repeated first-price auctions.

[BibT_eX]

[DOI]

Ann. Oper. Res., 2012

Joint Stochastic Decoding of LDPC Codes and Partial-Response Channels.

[BibT_eX]

[DOI]

Paul H. Siegel

Proceedings of the 2012 IEEE Workshop on Signal Processing Systems, 2012

Network forensics: random infection vs spreading epidemic.

[BibT_eX]

[DOI]

Chris Milling

Sanjay Shakkottai

Proceedings of the ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2012

The Perturbed Variation.

[BibT_eX]

[DOI]

Maayan Harel

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty.

[BibT_eX]

[DOI]

Ofir Mebel

Proceedings of the 29th International Conference on Machine Learning, 2012

Policy Gradients with Variance Related Risk Criteria.

[BibT_eX]

[DOI]

Aviv Tamar

Proceedings of the 29th International Conference on Machine Learning, 2012

Decoupling Exploration and Exploitation in Multi-Armed Bandits.

[BibT_eX]

[DOI]

Orly Avner

Ohad Shamir

Proceedings of the 29th International Conference on Machine Learning, 2012

Large scale real-time bidding in the smart grid: A mean field framework.

[BibT_eX]

[DOI]

Peter E. Caines

Proceedings of the 51st IEEE Conference on Decision and Control, 2012

Duality of ancillary services and intermittent suppliers.

[BibT_eX]

[DOI]

Proceedings of the 51st IEEE Conference on Decision and Control, 2012

On identifying the causative network of an epidemic.

[BibT_eX]

[DOI]

Chris Milling

Sanjay Shakkottai

Proceedings of the 50th Annual Allerton Conference on Communication, 2012

Bayesian Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Reinforcement Learning, 2012

2011

Tracking Forecast Memories for Stochastic Decoding.

[BibT_eX]

[DOI]

J. Signal Process. Syst., 2011

Delayed Stochastic Decoding of LDPC Codes.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2011

Efficient Bidding in Dynamic Grid Markets.

[BibT_eX]

[DOI]

IEEE Trans. Parallel Distributed Syst., 2011

A Robust Learning Approach to Repeated Auctions With Monitoring and Entry Fees.

[BibT_eX]

[DOI]

IEEE Trans. Comput. Intell. AI Games, 2011

The Sample Complexity of Dictionary Learning.

[BibT_eX]

[DOI]

Daniel Vainsencher

Alfred M. Bruckstein

Proceedings of the COLT 2011, 2011

Robust approachability and regret minimization in games with partial monitoring.

[BibT_eX]

[DOI]

Vianney Perchet

Gilles Stoltz

Proceedings of the COLT 2011, 2011

Does an Efficient Calibrated Forecasting Strategy Exist?

[BibT_eX]

[DOI]

Jacob D. Abernethy

Proceedings of the COLT 2011, 2011

Regulation, Volatility and Efficiency in Continuous-Time Markets

[BibT_eX]

[DOI]

CoRR, 2011

Bandits with an Edge

[BibT_eX]

[DOI]

Claudio Gentile

CoRR, 2011

Activity Recognition with Mobile Phones.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

From Bandits to Experts: On the Value of Side-Observations.

[BibT_eX]

[DOI]

Ohad Shamir

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Committing Bandits.

[BibT_eX]

[DOI]

Loc Bui

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

A state action frequency approach to throughput maximization over uncertain wireless channels.

[BibT_eX]

[DOI]

Krishna P. Jagannathan

Ishai Menache

Eytan H. Modiano

Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

Probabilistic Goal Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the IJCAI 2011, 2011

Unimodal Bandits.

[BibT_eX]

[DOI]

Proceedings of the 28th International Conference on Machine Learning, 2011

Bundle Selling by Online Estimation of Valuation Functions.

[BibT_eX]

[DOI]

Daniel Vainsencher

Ofer Dekel

Proceedings of the 28th International Conference on Machine Learning, 2011

Mean-Variance Optimization in Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the 28th International Conference on Machine Learning, 2011

Learning from Multiple Outlooks.

[BibT_eX]

[DOI]

Maayan Harel

Proceedings of the 28th International Conference on Machine Learning, 2011

Regulation and double price mechanisms in markets with friction.

[BibT_eX]

[DOI]

Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, 2011

Stochastic bandits with pathwise constraints.

[BibT_eX]

[DOI]

Orly Avner

Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, 2011

Activity Recognition with Time-Delay Embeddings.

[BibT_eX]

[DOI]

Proceedings of the Computational Physiology, 2011

2010

k-Armed Bandit.

[BibT_eX]

[DOI]

Proceedings of the Encyclopedia of Machine Learning, 2010

Relaxation dynamics in stochastic iterative decoders.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2010

Majority-based tracking forecast memories for stochastic LDPC decoding.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2010

A Min-Sum Iterative Decoder Based on Pulsewidth Message Encoding.

[BibT_eX]

[DOI]

IEEE Trans. Circuits Syst. II Express Briefs, 2010

A Geometric Proof of Calibration.

[BibT_eX]

[DOI]

Gilles Stoltz

Math. Oper. Res., 2010

Percentile Optimization for Markov Decision Processes with Parameter Uncertainty.

[BibT_eX]

[DOI]

Erick Delage

Oper. Res., 2010

Stochastic Chase Decoding of Reed-Solomon Codes.

[BibT_eX]

[DOI]

IEEE Commun. Lett., 2010

Adaptive Bases for Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Distributionally Robust Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Online Classification with Specificity Constraints.

[BibT_eX]

[DOI]

Andrey Bernstein

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Generative models for rapid information propagation.

[BibT_eX]

[DOI]

Kirill Dyagilev

Elad Yom-Tov

Proceedings of the First Workshop on Social Media Analytics, 2010

Resource Allocation with Supply Adjustment in Distributed Computing Systems.

[BibT_eX]

[DOI]

Proceedings of the 2010 International Conference on Distributed Computing Systems, 2010

A novel similarity measure for time series data with applications to gait and activity recognition.

[BibT_eX]

[DOI]

Proceedings of the UbiComp 2010: Ubiquitous Computing, 12th International Conference, 2010

Lowering Error Floors Using Dithered Belief Propagation.

[BibT_eX]

[DOI]

François Leduc-Primeau

Saied Hemati

Proceedings of the Global Communications Conference, 2010

Robustness and Generalization.

[BibT_eX]

[DOI]

Proceedings of the COLT 2010, 2010

Principal Component Analysis with Contaminated Data: The High Dimensional Case.

[BibT_eX]

[DOI]

Proceedings of the COLT 2010, 2010

Learning with Global Cost in Stochastic Environments.

[BibT_eX]

[DOI]

Proceedings of the COLT 2010, 2010

Regulation and efficiency in markets with friction.

[BibT_eX]

[DOI]

Proceedings of the 49th IEEE Conference on Decision and Control, 2010

Adaptive bases for Q-learning.

[BibT_eX]

[DOI]

Proceedings of the 49th IEEE Conference on Decision and Control, 2010

A distributional interpretation of robust optimization.

[BibT_eX]

[DOI]

Proceedings of the 48th Annual Allerton Conference on Communication, 2010

Relaxed half-stochastic decoding of LDPC codes over GF(q).

[BibT_eX]

[DOI]

Proceedings of the 48th Annual Allerton Conference on Communication, 2010

Volatility and efficiency in markets with friction.

[BibT_eX]

[DOI]

Proceedings of the 48th Annual Allerton Conference on Communication, 2010

Tutor learning using linear constraints in approximate dynamic programming.

[BibT_eX]

[DOI]

Proceedings of the 48th Annual Allerton Conference on Communication, 2010

Activity and Gait Recognition with Time-Delay Embeddings.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009

A Kalman Filter Design Based on the Performance/Robustness Tradeoff.

[BibT_eX]

[DOI]

IEEE Trans. Autom. Control., 2009

Robustness and Regularization of Support Vector Machines.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2009

Online Learning with Sample Path Constraints.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2009

Approachability in repeated games: Computational aspects and a Stackelberg variant.

[BibT_eX]

[DOI]

Games Econ. Behav., 2009

Bidirectional interleavers for LDPC decoders using transmission gates.

[BibT_eX]

[DOI]

Kevin Cushon

Proceedings of the IEEE Workshop on Signal Processing Systems, 2009

High dimensional Principal Component Analysis with contaminated data.

[BibT_eX]

[DOI]

Proceedings of the 2009 IEEE Information Theory Workshop, 2009

Piecewise-stationary bandit problems with side observations.

[BibT_eX]

[DOI]

Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Stochastic Decoding of LDPC Codes over GF(q).

[BibT_eX]

[DOI]

Gabi Sarkis

Proceedings of IEEE International Conference on Communications, 2009

Tracking Forecast Memories in stochastic decoders.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2009

A Relaxed Half-Stochastic Iterative Decoder for LDPC Codes.

[BibT_eX]

[DOI]

François Leduc-Primeau

Saied Hemati

Proceedings of the Global Communications Conference, 2009. GLOBECOM 2009, Honolulu, Hawaii, USA, 30 November, 2009

Online learning in Markov decision processes with arbitrarily changing rewards and transitions.

[BibT_eX]

[DOI]

Proceedings of the 1st International Conference on Game Theory for Networks, 2009

Bidding efficiently in repeated auctions with entry and observation costs.

[BibT_eX]

[DOI]

Proceedings of the 1st International Conference on Game Theory for Networks, 2009

Online Learning for Global Cost Functions.

[BibT_eX]

[DOI]

Proceedings of the COLT 2009, 2009

Arbitrarily modulated Markov decision processes.

[BibT_eX]

[DOI]

Proceedings of the 48th IEEE Conference on Decision and Control, 2009

Parametric regret in uncertain Markov decision processes.

[BibT_eX]

[DOI]

Proceedings of the 48th IEEE Conference on Decision and Control, 2009

Risk sensitive robust support vector machines.

[BibT_eX]

[DOI]

Sungho Yun

Proceedings of the 48th IEEE Conference on Decision and Control, 2009

Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems.

[BibT_eX]

[DOI]

Amir Massoud Farahmand

Mohammad Ghavamzadeh

Csaba Szepesvári

Proceedings of the American Control Conference, 2009

2008

Fully Parallel Stochastic LDPC Decoders.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2008

Regret minimization in repeated matrix games with variable stage duration.

[BibT_eX]

[DOI]

Games Econ. Behav., 2008

Robustness, Risk, and Regularization in Support Vector Machines

[BibT_eX]

[DOI]

CoRR, 2008

Local Two-Stage Myopic Dynamics for Network Formation Games.

[BibT_eX]

[DOI]

Esteban Arcaute

Proceedings of the Internet and Network Economics, 4th International Workshop, 2008

Efficient reinforcement learning in parameterized models: discrete parameters.

[BibT_eX]

[DOI]

Kirill Dyagilev

Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008

Robust Regression and Lasso.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Regularized Policy Iteration.

[BibT_eX]

[DOI]

Amir Massoud Farahmand

Mohammad Ghavamzadeh

Csaba Szepesvári

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

A Lazy Approach to Online Learning with Constraints.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Reinforcement learning in the presence of rare events.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2008

Markov Decision Processes with Arbitrary Reward Processes.

[BibT_eX]

[DOI]

Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Regularized Fitted Q-Iteration: Application to Planning.

[BibT_eX]

[DOI]

Amir Massoud Farahmand

Mohammad Ghavamzadeh

Csaba Szepesvári

Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case.

[BibT_eX]

[DOI]

Kirill Dyagilev

Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Learning in the Limit with Adversarial Disturbances.

[BibT_eX]

[DOI]

Proceedings of the 21st Annual Conference on Learning Theory, 2008

Sparse algorithms are not stable: A no-free-lunch theorem.

[BibT_eX]

[DOI]

Proceedings of the 46th Annual Allerton Conference on Communication, 2008

Robust dimensionality reduction for high-dimension data.

[BibT_eX]

[DOI]

Proceedings of the 46th Annual Allerton Conference on Communication, 2008

Local dynamics for network formation games.

[BibT_eX]

[DOI]

Esteban Arcaute

Proceedings of the 46th Annual Allerton Conference on Communication, 2008

Online Learning with Expert Advice and Finite-Horizon Constraints.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007

Online calibrated forecasts: Memory efficiency versus universality for learning in games.

[BibT_eX]

[DOI]

Jeff S. Shamma

Gürdal Arslan

Mach. Learn., 2007

Bias and Variance Approximation in Value Function Estimates.

[BibT_eX]

[DOI]

Manag. Sci., 2007

Efficiency of Market-Based Resource Allocation among Many Participants.

[BibT_eX]

[DOI]

IEEE J. Sel. Areas Commun., 2007

Multi-agent learning for engineers.

[BibT_eX]

[DOI]

Jeff S. Shamma

Artif. Intell., 2007

Network Formation: Bilateral Contracting and Myopic Dynamics.

[BibT_eX]

[DOI]

Esteban Arcaute

Proceedings of the Internet and Network Economics, Third International Workshop, 2007

An Area-Efficient FPGA-Based Architecture for Fully-Parallel Stochastic LDPC Decoding.

[BibT_eX]

[DOI]

Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Reinforcement Learning-Based Load Shared Sequential Routing.

[BibT_eX]

[DOI]

Fariba Heidari

Lorne Mason

Proceedings of the NETWORKING 2007. Ad Hoc and Sensor Networks, 2007

Survey of Stochastic Computation on Factor Graphs.

[BibT_eX]

[DOI]

Proceedings of the 37th International Symposium on Multiple-Valued Logic, 2007

Percentile optimization in uncertain Markov decision processes with application to efficient exploration.

[BibT_eX]

[DOI]

Erick Delage

Proceedings of the Machine Learning, 2007

Non-Cooperative Design of Translucent Networks.

[BibT_eX]

[DOI]

Proceedings of the Global Communications Conference, 2007

Strategies for Prediction Under Imperfect Monitoring.

[BibT_eX]

[DOI]

Gábor Lugosi

Gilles Stoltz

Proceedings of the Learning Theory, 20th Annual Conference on Learning Theory, 2007

Dynamics and stability in network formation games with bilateral contracts.

[BibT_eX]

[DOI]

Proceedings of the 46th IEEE Conference on Decision and Control, 2007

User Model and Utility Based Power Management.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

Adaptive Timeout Policies for Fast Fine-Grained Power Management.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006

Design of ℓ1-optimal controllers with flexible disturbance rejection level.

[BibT_eX]

[DOI]

IEEE Trans. Autom. Control., 2006

Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2006

Stochastic decoding of LDPC codes.

[BibT_eX]

[DOI]

IEEE Commun. Lett., 2006

A contract-based model for directed network formation.

[BibT_eX]

[DOI]

Games Econ. Behav., 2006

The Robustness-Performance Tradeoff in Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Asymptotics of Efficiency Loss in Competitive Market Mechanisms.

[BibT_eX]

[DOI]

Proceedings of the INFOCOM 2006. 25th IEEE International Conference on Computer Communications, 2006

Automatic basis function construction for approximate dynamic programming and reinforcement learning.

[BibT_eX]

[DOI]

Philipp W. Keller

Proceedings of the Machine Learning, 2006

Online Learning with Constraints.

[BibT_eX]

[DOI]

Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Online Learning with Variable Stage Duration.

[BibT_eX]

[DOI]

Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Design of l1-Optimal Controllers with Flexible Disturbance Rejection Level.

[BibT_eX]

[DOI]

Proceedings of the American Control Conference, 2006

2005

Efficiency loss in a network resource allocation game: the case of elastic supply.

[BibT_eX]

[DOI]

IEEE Trans. Autom. Control., 2005

On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies.

[BibT_eX]

[DOI]

Math. Oper. Res., 2005

Basis Function Adaptation in Temporal Difference Reinforcement Learning.

[BibT_eX]

[DOI]

Ishai Menache

Ann. Oper. Res., 2005

A Tutorial on the Cross-Entropy Method.

[BibT_eX]

[DOI]

Ann. Oper. Res., 2005

The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.

[BibT_eX]

[DOI]

AI Mag., 2005

The cross entropy method for classification.

[BibT_eX]

[DOI]

Dori Peleg

Reuven Y. Rubinstein

Proceedings of the Machine Learning, 2005

Reinforcement learning with Gaussian processes.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2005

2004

The kernel recursive least-squares algorithm.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2004

The Sample Complexity of Exploration in the Multi-Armed Bandit Problem.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2004

A Geometric Approach to Multi-Criterion Reinforcement Learning.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2004

Bias and variance in value function estimation.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2004

Dynamic abstraction in reinforcement learning via clustering.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2004

Reinforcement Learning for Average Reward Zero-Sum Games.

[BibT_eX]

[DOI]

Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004

An Inequality for Nearly Log-Concave Distributions with Applications to Learning.

[BibT_eX]

[DOI]

Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004

Efficiency loss in a resource allocation game: A single link in elastic supply.

[BibT_eX]

[DOI]

Proceedings of the 43rd IEEE Conference on Decision and Control, 2004

2003

The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes.

[BibT_eX]

[DOI]

Math. Oper. Res., 2003

Greedy Algorithms for Classification -- Consistency, Convergence Rates, and Adaptivity.

[BibT_eX]

[DOI]

Tong Zhang

J. Mach. Learn. Res., 2003

The Cross Entropy Method for Fast Policy Search.

[BibT_eX]

[DOI]

Reuven Y. Rubinstein

Yohai Gat

Proceedings of the Machine Learning, 2003

Action Elimination and Stopping Conditions for Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2003

Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning, 2003

Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem.

[BibT_eX]

[DOI]

Proceedings of the Computational Learning Theory and Kernel Machines, 2003

On-Line Learning with Imperfect Monitoring.

[BibT_eX]

[DOI]

Proceedings of the Computational Learning Theory and Kernel Machines, 2003

2002

On the Existence of Linear Weak Learners and Applications to Boosting.

[BibT_eX]

[DOI]

Mach. Learn., 2002

Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning.

[BibT_eX]

[DOI]

Ishai Menache

Proceedings of the Machine Learning: ECML 2002, 2002

Sparse Online Greedy Support Vector Regression.

[BibT_eX]

[DOI]

Proceedings of the Machine Learning: ECML 2002, 2002

The Consistency of Greedy Algorithms for Classification.

[BibT_eX]

[DOI]

Tong Zhang

Proceedings of the Computational Learning Theory, 2002

PAC Bounds for Multi-armed Bandit and Markov Decision Processes.

[BibT_eX]

[DOI]

Proceedings of the Computational Learning Theory, 2002

2001

The Steering Approach for Multi-Criteria Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Learning Embedded Maps of Markov Processes.

[BibT_eX]

Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments.

[BibT_eX]

[DOI]

Proceedings of the Computational Learning Theory, 2001

Geometric Bounds for Generalization in Boosting.

[BibT_eX]

[DOI]

Proceedings of the Computational Learning Theory, 2001

2000

Weak Learners and Improved Rates of Convergence in Boosting.

[BibT_eX]

[DOI]