Nahum Shimkin

ORCID: 0000-0001-7105-9956

According to our database, Nahum Shimkin authored at least 89 papers between 1993 and 2024.

Markov decision processes with burstiness constraints.
Eur. J. Oper. Res., February, 2024

Altitude-Loss Optimal Glides in Engine Failure Emergencies - Accounting for Ground Obstacles and Wind.
CoRR, 2023

Cooperative Multi-Agent Path Finding: Beyond Path Planning and Collision Avoidance.
Proceedings of the Fourteenth International Symposium on Combinatorial Search, 2021

Dynamic Scheduling of Multiclass Many-Server Queues with Abandonment: The Generalized cμ/h Rule.
Oper. Res., 2020

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

PAC Bandits with Risk Constraints.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2018

On the Computation of Dynamic User Equilibrium in the Multiclass Transient Fluid Queue.
SIGMETRICS Perform. Evaluation Rev., 2017

Learning Control for Air Hockey Striking using Deep Reinforcement Learning.
CoRR, 2017

Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

An Online Convex Optimization Approach to Blackwell's Approachability.
J. Mach. Learn. Res., 2016

The Ordered Timeline Game: Strategic Posting Times Over a Temporally Ordered Shared Medium.
Dyn. Games Appl., 2016

Deep Reinforcement Learning with Averaged Target DQN.
CoRR, 2016

Pure Exploration for Max-Quantile Bandits.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

PAC Lower Bounds and Efficient Algorithms for The Max K-Armed Bandit Problem.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Response-based approachability with applications to generalized no-regret problems.
J. Mach. Learn. Res., 2015

The Max K-Armed Bandit: PAC Lower Bounds and Efficient Algorithms.
CoRR, 2015

The Max K-Armed Bandit: A PAC Lower Bound and tighter Algorithms.
CoRR, 2015

Refined Algorithms for Infinitely Many-Armed Bandits with Deterministic Rewards.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

Opportunistic Approachability and Generalized No-Regret Problems.
Math. Oper. Res., 2014

Fluid Limits for Many-Server Systems with Reneging Under a Priority Policy.
Math. Oper. Res., 2014

Infinitely Many-Armed Bandits with Unknown Value Distribution.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Strategic posting times over a shared publication medium.
Proceedings of the 7th International Conference on NETwork Games, COntrol and OPtimization, 2014

The concert queueing game: strategic arrivals with waiting and tardiness costs.
Queueing Syst. Theory Appl., 2013

Response-Based Approachability and its Application to Generalized No-Regret Algorithms.
CoRR, 2013

Opportunistic Strategies for Generalized No-Regret Problems.
Proceedings of the COLT 2013, 2013

In memoriam Raphael Sivan, 1935-2011.
Autom., 2012

The concert queueing game with a random volume of arrivals.
Proceedings of the 6th International ICST Conference on Performance Evaluation Methodologies and Tools, 2012

Reservation-based distributed medium access in wireless collision channels.
Telecommun. Syst., 2011

Guest Editorial - Special Issue on Game Theory in Communication Networks.
Telecommun. Syst., 2011

Cross Entropy Algorithms for Data Association in Multi-Target Tracking.
IEEE Trans. Aerosp. Electron. Syst., 2011

On the asymptotic optimality of the cμ/θ rule under ergodic cost.
Queueing Syst. Theory Appl., 2011

The concert queueing game: to wait or to be late.
Discret. Event Dyn. Syst., 2011

Ann. Oper. Res., 2011

Socially optimal pricing of cloud computing resources.
Proceedings of the 5th International ICST Conference on Performance Evaluation Methodologies and Tools Communications, 2011

Unified Inter and Intra Options Learning Using Policy Gradient Methods.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Transient Behavior of Two-Machine Geometric Production Lines.
IEEE Trans. Autom. Control., 2010

Perform. Evaluation, 2010

Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains.
Mach. Learn., 2010

The cμ/θ Rule for Many-Server Queues with Abandonment.
Oper. Res., 2010

Online Classification with Specificity Constraints.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Markov Decision Processes with Arbitrary Reward Processes.
Math. Oper. Res., 2009

The Impact of Delay Announcements in Many-Server Queues with Abandonment.
Oper. Res., 2009

Capacity management and equilibrium for proportional QoS.
IEEE/ACM Trans. Netw., 2008

Rate-Based Equilibria in Collision Channels with Fading.
IEEE J. Sel. Areas Commun., 2008

Regret minimization in repeated matrix games with variable stage duration.
Games Econ. Behav., 2008

Cross-entropy based data association for multi target tracking.
Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008

Efficient reinforcement learning in parameterized models: discrete parameters.
Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008

The cμ/θ rule.
Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008

Noncooperative power control and transmission scheduling in wireless collision channels.
Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2008

Efficient Rate-Constrained Nash Equilibrium in Collision Channels with State Information.
Proceedings of the INFOCOM 2008. 27th IEEE International Conference on Computer Communications, 2008

Decentralized Rate Regulation in Random Access Channels.
Proceedings of the INFOCOM 2008. 27th IEEE International Conference on Computer Communications, 2008

Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case.
Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Adaptive Aggregation for Reinforcement Learning with Efficient Exploration: Deterministic Domains.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

Topological Uniqueness of the Nash Equilibrium for Selfish Routing with Atomic Users.
Math. Oper. Res., 2007

A Survey of Uniqueness Results for Selfish Routing.
Proceedings of the Network Control and Optimization, 2007

Fixed-Rate Equilibrium in Wireless Collision Channels.
Proceedings of the Network Control and Optimization, 2007

Online Learning with Variable Stage Duration.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Capacity Assignment for Proportional QoS in Diffserv-Like Networks.
Proceedings of the 45th IEEE Conference on Decision and Control, 2006

Basis Function Adaptation in Temporal Difference Reinforcement Learning.
Ann. Oper. Res., 2005

Proportional QoS in Differentiated Services Networks: Capacity Management, Equilibrium Analysis and Elastic Demands.
Proceedings of the Internet and Network Economics, First International Workshop, 2005

Optimal usage of color for disparity estimation in stereo vision.
Proceedings of the 13th European Signal Processing Conference, 2005

Uniqueness of the Nash Equilibrium in Convex Routing Games: Topological Conditions.
Proceedings of the 44th IEEE Conference on Decision and Control and 8th European Control Conference, 2005

Multigrid Methods for Policy Evaluation and Reinforcement Learning.
Proceedings of the Intelligent Control, 2005

Rational Abandonment from Tele-Queues: Nonlinear Waiting Costs with Heterogeneous Preferences.
Queueing Syst. Theory Appl., 2004

A Geometric Approach to Multi-Criterion Reinforcement Learning.
J. Mach. Learn. Res., 2004

The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes.
Math. Oper. Res., 2003

Markov Decision Processes with Slow Scale Periodic Decisions.
Math. Oper. Res., 2003

Velocity-Guided Tracking of Deformable Contours in Three Dimensional Space.
Int. J. Comput. Vis., 2003

Algorithms for stochastic approximations of curvature flows.
Proceedings of the 2003 International Conference on Image Processing, 2003

On-Line Learning with Imperfect Monitoring.
Proceedings of the Computational Learning Theory and Kernel Machines, 2003

Competitive routing in networks with polynomial costs.
IEEE Trans. Autom. Control., 2002

Adaptive Behavior of Impatient Customers in Tele-Queues: Theory and Empirical Support.
Manag. Sci., 2002

Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2002, 2002

Routing into Two Parallel Links: Game-Theoretic Distributed Algorithms.
J. Parallel Distributed Comput., 2001

The Steering Approach for Multi-Criteria Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments.
Proceedings of the Computational Learning Theory, 2001

Incentive pricing in multiclass systems.
Telecommun. Syst., 2000

A model for rational abandonments from invisible queues.
Queueing Syst. Theory Appl., 2000

Dynamic service sharing with heterogeneous preferences.
Queueing Syst. Theory Appl., 2000

Bandwidth allocation for guaranteed versus best effort service categories.
Queueing Syst. Theory Appl., 2000

Competitive Routing in Networks with Polynomial Cost.
Proceedings of the IEEE INFOCOM 2000, 2000

Best-Effort Resource Sharing by Users with QoS Requirements.
Proceedings of the IEEE INFOCOM '99, 1999

Individual Equilibrium and Learning in Processor Sharing Systems.
Oper. Res., 1998

Incentive Pricing in Multi-Class Communication Networks.
Proceedings of the IEEE INFOCOM '97, 1997

Asymptotically Efficient Adaptive Strategies in Repeated Games Part II. Asymptotic Optimality.
Math. Oper. Res., 1996

Asymptotically Efficient Adaptive Strategies in Repeated Games Part I: Certainty Equivalence Strategies.
Math. Oper. Res., 1995

Competitive routing in multiuser communication networks.
IEEE/ACM Trans. Netw., 1993

Guaranteed performance regions in Markovian systems with competing decision makers.
IEEE Trans. Autom. Control., 1993

Competitive Routing in Multi-User Communication Networks.
Proceedings of the IEEE INFOCOM '93, The Conference on Computer Communications, Twelfth Annual Joint Conference of the IEEE Computer and Communications Societies, Networking: Foundation for the Future, San Francisco, CA, USA, March 28, 1993
