Research article

Two-person zero-sum stochastic games with varying discount factors

  • Received: 07 May 2021; Accepted: 04 August 2021; Published: 09 August 2021
  • MSC: 91A15, 60J05

  • This paper studies two-person zero-sum Markov games with Borel state and action spaces, unbounded reward functions, and state-dependent discount factors. The optimality criterion is the expected discounted criterion. First, sufficient conditions are given for the existence of optimal policies in two-person zero-sum Markov games with varying discount factors. The existence of optimal policies is then proved using the Banach fixed-point theorem. Finally, an example from reservoir operations illustrates the existence results.

    Citation: Xiao Wu, Qi Wang, Yinying Kong. Two-person zero-sum stochastic games with varying discount factors[J]. AIMS Mathematics, 2021, 6(10): 11516-11529. doi: 10.3934/math.2021668
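
    To make the fixed-point argument mentioned in the abstract concrete, the following is a minimal sketch of the standard contraction step in a simplified setting. The notation is an assumption introduced here for illustration and is not taken from the paper: X is the state space, A(x) and B(x) are the players' action sets, r is a bounded reward, Q is the transition kernel, and \alpha(x) is the state-dependent discount factor with \bar\alpha = \sup_x \alpha(x) < 1. The paper itself allows unbounded rewards, which typically calls for a weighted norm instead of the sup norm; that setting is not reproduced here.

```latex
% Shapley-type operator with a state-dependent discount factor
% (illustrative notation; bounded-reward simplification)
\[
  (Tu)(x) \;=\; \sup_{\varphi \in \mathcal{P}(A(x))} \; \inf_{\psi \in \mathcal{P}(B(x))}
  \Big[\, r(x,\varphi,\psi) \;+\; \alpha(x) \int_X u(y)\, Q(dy \mid x,\varphi,\psi) \Big].
\]
% For bounded u, v the reward terms cancel, and since
% |sup inf f - sup inf g| <= sup sup |f - g|, we obtain
\[
  \big| (Tu)(x) - (Tv)(x) \big|
  \;\le\; \alpha(x) \sup_{\varphi,\psi} \int_X \big| u(y) - v(y) \big|\, Q(dy \mid x,\varphi,\psi)
  \;\le\; \bar\alpha \, \| u - v \|_\infty .
\]
% Hence T is a contraction with modulus \bar\alpha < 1, and the
% Banach fixed-point theorem gives a unique bounded u^* with T u^* = u^*.
```

    Under suitable measurability and compactness conditions, the fixed point u^* is the natural candidate for the value of the game, and policies attaining the sup and inf in T u^* are the candidates for optimal policies. The precise conditions used by the authors are those stated in the paper, not the simplified ones assumed in this sketch.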

  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)