With the ever-increasing amount of data becoming available, strategic data analysis and decision-making will become more pervasive as a necessary ingredient of societal infrastructures. In many network engineering games, the performance metrics depend on a few aggregates of the parameters/choices. A typical example is the congestion field in traffic engineering, where classical cars and smart autonomous driverless cars create traffic congestion levels on the roads. The congestion field can be learned, for example by means of crowdsensing, and can be used for efficient and accurate prediction of the end-to-end delays of commuters. Another example is the interference field, where it is the aggregate received signal of the other users that matters rather than their individual input signals. In such games, a transmitter-receiver pair does not need to be informed about the other users' strategies in order to determine its best replies: if a user is informed about the aggregate terms given her own strategy, she can efficiently exploit such information to perform better. In these situations the outcome is influenced not only by the state-action profile but also by its distribution. The interaction can be captured by a game with distribution-dependent payoffs called a mean-field-type game (MFTG). An MFTG is basically a game in which the instantaneous payoffs and/or the state dynamics functions involve not only the state and the action profile of the agents but also the joint distributions of state-action pairs.
The main contributions of this article can be summarized as follows. The first contribution is a review of some relevant engineering applications of MFTG, considering Liouville-type systems with drift, diffusion, and jumps that depend on time delays, state mean-field, and action mean-field terms. Proposition 7 establishes an equilibrium equation for non-convex action spaces. Proposition 9 provides a stochastic maximum principle that covers decentralized information and partial observation systems, which are crucial in engineering systems. Various engineering applications in discrete or continuous variables (state, action, or time) are provided. Explicit solutions are provided in Propositions 6 and 8 for mean-field-type game problems with non-quadratic costs.
The article is structured as follows. The next section overviews earlier works on static mean-field games, followed by discrete-time mean-field games with measure-dependent transition kernels. Then, a basic MFTG with a finite number of agents is presented. After that, the discussion is divided into two illustrations in each of the following areas of engineering (Figure 1): Civil Engineering (CE), Electrical Engineering (EE), Computer Engineering (CompE), Mechanical Engineering (ME), and General Engineering (GE):
• CE: road traffic networks with random incident states and multi-level building evacuation
• EE: Millimeter wave wireless communications and distributed power networks
• CompE: Virus spread over networks and virtual machine resource management in cloud networks
• ME: Synchronization of oscillators, consensus, alignment and energy-efficient buildings
• GE: Online meeting: strategic arrivals and starting time and mobile crowdsensing as a public good.
The article proceeds by presenting the effect of time delays of coupled mean-field dynamical systems and decentralized information structure. Then, a discussion on the drawbacks, limitations, and challenges of MFTGs is highlighted. Lastly, a summary of the article and concluding remarks are presented.
A static mean-field game is one in which all users make choices (or select a strategy) simultaneously, without knowledge of the strategies chosen by the other users, and the game is played once. Any mean-field game with sequential moves is a dynamic mean-field game. In this work, games which are played more than once are considered dynamic games. This subsection overviews static mean-field games and games in which the underlying processes are in a stationary regime (time-independent). Mean-field games have been around for quite some time in one form or another, especially in transportation networks and in competitive economies. In the context of a competitive market with a large number of agents, a 1936 article [1] captures the assumption made in mean-field games with a large number of agents, in which the author states:
"Each of the participants has the opinion that its own actions do not influence the prevailing price".
Another comment on the impact of the population mean-field term was given in [2], page 13:
" When the number of participants becomes large, some hope emerges that of the influence of every particular participant will become negligible..."
When the population interaction involves many agents of each type, class, and location, a common approach is to replace the individual agents' variables with continuous variables representing the aggregate average of type-location-actions. In the large population regime, the mean-field limit is then modeled by a state-action- and location-dependent time process (Figure 2). This type of aggregate model is also known as a non-atomic or population game. It is closely related to the mass-action interpretation in [3], Equation (4), page 287.
In the context of transportation networks, the mean-field game framework, underlying the key foundation, goes back to the pioneering works of [4] in the 1950s. Therein, the basic idea is to describe and understand interacting traffic flows among a large population of agents moving from multiple sources to destinations and interacting with each other. The congestion created on the roads and at the intersections is subject to capacity and flow constraints. This corresponds to a constrained mean-field game problem, as noted in [5]. A common behavioral assumption in the study of transportation and communication networks is that travelers or packets, respectively, choose routes that they perceive as being the shortest under the prevailing traffic conditions. As noted in [6], the collection of individual decisions may result in a situation in which drivers cannot reduce their journey times by unilaterally choosing another route. The work in [6] refers to such a resulting traffic pattern as an equilibrium. Nowadays, it is indeed known as the Wardrop equilibrium [4,7], and it is thought of as a steady state obtained after a transient phase in which travelers successively adjust their route choices until a situation with stable route travel costs and route flows has been reached [8,9]. In the seminal contribution [4], page 345, the author stated two principles that formalize this notion of equilibrium and the alternative postulate of the minimization of the total travel costs. His first principle reads:
"The journey times on all the routes actually used are equal, and less than those which would be experienced by a single vehicle on any unused route."
Wardrop's first principle of route choice, which is identical to the notion postulated in [6,10], became widely used as a sound and simple behavioral principle to describe the spreading of trips over alternative routes under congested conditions. Since its introduction in the context of transportation networks in 1952, and its mathematical formalization in [5,11], transportation planners have been using Wardrop equilibrium models to predict commuters' decisions in real-life networks.
The key congestion factor is the flow, or the fraction of travelers per edge, on the roads (Application 1). The above Wardrop problem is indeed a mean-field problem on a discrete space: the exact mean-field term here corresponds to a mean-field of actions (the choice of a route). Putting this in the context of an infinite number of commuters results in end-to-end travel times that are functions of one's own route choice and of the mean-field distribution of travelers across the graph (network).
In a population context, the equilibrium concept of [4] corresponds to a Nash equilibrium of the mean-field game with infinite number of agents. The works [7,12] provide a variational formulation of the (static) mean-field equilibrium.
The game theoretic models such as evolutionary games [13,14,15], global games [16,17], anonymous games, aggregative games [18], population games [19,20,21], and large games, share several common features. Static mean-field games with large number of agents were widely investigated ([22,23,24,25,26,27] and the references therein).
This section overviews mean-field games which are dynamic (time-varying and played more than once) and their applications in engineering.
Definition 1 (Mean-Field Game: Infinite Regime). A (homogeneous population) mean-field game (MFG) is a game in which the instantaneous payoff of a generic agent (say 1) and/or the state dynamics coefficient functions involve an individual state-action pair $x_{1t}, u_{1t}$ and the distribution of state-action pairs of the other decision-makers, $m_t, $ at time $t.$ The individual state and action spaces are identical across the homogeneous population: $\mathcal{X}_i=\mathcal{X}_1, \ U_i=U_1$ for all $i.$ The state transition to the next state follows $ \mathbb{P}(. | x_{1t}, u_{1t}, m_t).$ Thus, the instant payoff function of a generic agent has the following structure:
$r_i = r_1 = r:\ \mathcal{X}_1\times U_1\times \mathbb{P}( \mathcal{X}_1\times U_1) \rightarrow \mathbb{R}, $ |
with $r(x_{1t}, u_{1t}, m_t).$
The mean-field game model has been extended to include several other features such as incomplete information, common noise, heterogeneous population, finite population or a mixture between finite number of clusters and infinite population regimes.
The key ingredients of dynamic mean-field games appeared in [28,29] in the early 1980s. The work in [28] proposes a game-theoretic model that explains why smaller firms grow faster and are more likely to fail than larger firms in large economies. The game is played over a discrete time space. Therein, the mean-field is the aggregate demand/supply, which generates a price dynamics. The price moves forward, the agents react to the price and generate a demand, and each firm produces a supply with an associated cost, which regenerates the next price, and so on. The author introduced a backward-forward system to find equilibria (for example Section 4, equations D.1 and D.2 in [28]). The backward equation is obtained as an optimality condition for the individual response, i.e., the value function associated with the best response to the price, and the forward equation describes the evolution of the price. Therein, the consistency check is about the mean-field of equilibrium actions (population or mass of actions); that is, the equilibrium price solves a fixed-point system: the price regenerated after the reaction of the agents through their individual best responses should be consistent with the price they responded to.
Following that analogy, a more general framework was developed in [29], where the mean-field equilibrium is introduced in the context of dynamic games with a large number of decision-makers. A mean-field equilibrium is defined in [29], page 80, by two conditions: (1) each generic agent's action is a best response to the mean-field, and (2) the mean-field is consistent and is exactly reproduced from the reactions of the agents. This matching argument was widely used in the literature, as it can be interpreted as a generic agent reacting to an evolving mean-field object while, at the same time, the mean-field is formed from the contributions of all the agents. The authors of [30] show how common noise can be introduced into the mean-field game model (the mean-field distribution evolves stochastically) and extend the Jovanovic-Rosenthal existence theorem [29].
Continuous-time versions of the works [28,29] can be found in [31,32,33,34]. The reader is referred to [36,37,38,39,40,41,42] for recent developments of mean-field game theory. The authors of [33,43,44,45,46,47] have developed a powerful tool for modelling the strategic behavior of large populations of agents, each of them having a negligible impact on the population mean-field term. Weak solutions of mean-field games are analyzed in [48], Markov jump processes in [49,50], and leader-follower models in [51]. Finite-state mean-field game models were analyzed in [52,53,54,55,56,57,58,59]. Team and social optimum solutions can be found in [51,60,61,62,63]. The works in [64,65,66] provide mean-field convergence of a class of McKean-Vlasov dynamics. Numerical methods for mean-field games can be found in [67,68,69,70].
Table 1 summarizes some engineering applications of mean-field-type game models.
Area | Works |
planning | [72] |
state estimation and filtering | [73,74] |
synchronization | [75,76,77,78] |
opinion formation | [79] |
network security | [80,81,82,83,84] |
power control | [85,86,87] |
medium access control | [88,89] |
cognitive radio networks | [90,91] |
electrical vehicles | [92,93] |
scheduling | [94] |
cloud networks | [95,96,97] |
wireless networks | [98] |
auction | [99,100] |
cyber-physical systems | [101,102] |
airline networks | [103] |
sensor networks | [104] |
traffic networks | [105,106,107,108] |
big data | [109] |
D2D networks | [110,111,112] |
multilevel building evacuation | [140,141,142,143] |
power networks | [93,113,114,115,116,117,118,119,120,121,122,123,124,174,179] |
HVAC | [125,126,127,128,129,130] |
Most of the existing mean-field game models share the following assumptions:
Big size: A typical assumption is to consider an infinite number of decision-makers, sometimes a continuum of decision-makers. The idea of a continuum of decision-makers may seem outlandish to the reader. Actually, it is no stranger than a continuum of particles used in fluid mechanics, in water distribution, or in petroleum engineering. In terms of practice and experiment, however, decision-making problems with a continuum of decision-makers are rarely observed in engineering. There is a huge difference between a fluid with a continuum of particles and a decision-making problem with a continuum of agents. Agents may physically occupy a space (think of agents inside a building or a stadium) or a resource, and the size or number of agents that most engineering systems can handle can be relatively large or growing but currently remains finite [71]. This is in part due to the limited resource per shot or the limited number of servers at a time. In all the examples and applications provided below, we still have a finite number of interacting agents. Thus, this assumption appears to be very restrictive in terms of engineering applications.
Anonymity: The index of the decision-maker does not affect the utility. The agents are assumed to be indistinguishable within the same class or type. The drawback of this assumption is that most individual decision-makers in engineering are in fact not anonymous (think of Google, Microsoft, Twitter, Facebook, Tesla); the classical mean-field game model is then inappropriate and does not apply to such situations. In mean-field games with several types (or multi-population mean-field games), it is still assumed that there is a large number of agents per type/class/population, which is not realistic in most of the engineering applications considered in this work.
NonAtomicity: A single decision-maker has a negligible effect on the mean-field term and on the global utility. A typical example where this assumption is not satisfied is targeting a comfortable room temperature, in which the air-conditioning controller adjusts the heating/cooling depending on the temperature in the room, the temperatures of the other connected zones, and the ambient temperature. It is clear that the decision of the controller to heat or to cool affects the variance of the temperature inside the room. Thus, the effect of the individual action of that controller on the temperature distribution (mean-field) inside the room cannot be neglected.
To summarize, the above conditions appear to be very restrictive in terms of engineering applications, and to overcome this issue a more flexible MFTG framework has been proposed.
MFTGs not only relax the above assumptions but also incorporate the behavior of the agents as well as their effects in the mean-field terms and in the outcomes (Table 2).
Area | Anonymity | Infinity | Atom |
population games [4,5] | yes | yes | no |
evolutionary games [131] | yes | yes | no |
non-atomic games [29] | yes | yes | no |
aggregative games [18] | relaxed | ||
global games [16,17] | yes | yes | no |
large games [22] | yes | yes | no |
anonymous games [29] | yes | yes | no |
mean-field games | yes | yes | no |
nonasymptotic mean-field games | nearly | no | yes |
MFTG | relaxed | relaxed | relaxed |
(1) In MFTGs, the number of users can be finite or infinite.
(2) The indistinguishability property (invariance in law by permutation of index of the users) is not assumed in MFTGs.
(3) A single user may have a non-negligible impact on the mean-field terms, especially on the distribution of own states and own mixed strategies.
These properties (1)-(3) make strong differences between mean-field games and MFTGs ([132] and the references therein).
MFTG seems to be more appropriate in such engineering situations because it does not assume indistinguishability, it captures the effect of each agent in the distribution, and the number of agents is arbitrary, as we will see below. Table 3 summarizes the notation used in the manuscript.
$\mathcal{I}$ | $\triangleq$ | set of decision-makers |
$T$ | $\triangleq$ | Length of the horizon |
$[0, T]$ | $\triangleq$ | horizon of the mean-field-type game |
$t$ | $\triangleq$ | time index |
$\mathcal{X}$ | $\triangleq$ | state space |
$W$ | $\triangleq$ | Brownian motion |
$\sigma$ | $\triangleq$ | Diffusion coefficient
$N$ | $\triangleq$ | Poisson jump process
$\gamma$ | $\triangleq$ | Jump rate coefficient
${U}_i$ | $\triangleq$ | control action space of agent $i\in \mathcal{I}$
$\mathcal{U}_i$ | $\triangleq$ | admissible strategy space
$u_i$ | $\triangleq$ | control action of agent $i\in \mathcal{I}$
$r_i$ | $\triangleq$ | instantaneous payoff |
$D_{(x, u)}$ | $\triangleq$ | distribution of state-action |
$R_i$ | $\triangleq$ | Long-term payoff functional |
This section presents a background on MFTGs.
Definition 2 (Mean-Field-Type Game). A mean-field-type game (MFTG) is a game in which the instantaneous payoffs and/or the state dynamics coefficient functions involve not only the state and the action profile but also the joint distributions of state-action pairs (or its marginal distributions, i.e., the distributions of states or the distribution of actions). Let $\mathcal{I}$ be the set of agents, $\mathcal{X}_i$ the state space of agent $i$ and $ \mathcal{X}:=\prod_{i\in \mathcal{I}}\mathcal{X}_i=\mathcal{X}_1\times \mathcal{X}_2\times \ldots $ the state profile space of all agents. $U_i$ is the action space of agent $i$ and $U=\prod_j U_j $ is the action profile space of all agents. A typical example of a payoff function of agent $i$ has the following structure:
$r_i:\ \mathcal{X}\times U\times \mathbb{P}( \mathcal{X}\times U) \rightarrow \mathbb{R}, $ |
with $r_i(x, u, D_{(x, u)})$ where $(x, u)$ is the state-action profile of the agents and $D_{(x, u)}$ is the distribution of the state-action pair $(x, u).$ $\mathcal{X}$ is the state space, $U$ is the action profile space of all agents and $\mathbb{P}(\mathcal{X}\times U)$ is the set of probability measures over $\mathcal{X}\times U.$
From Definition 2, a mean-field-type game can be static or dynamic in time. One may think that MFTG is a small and particular class of games. However, this class includes the classical games in strategic form because any payoff function $r_i(x, u)$ can be written as $r_i(x, u, D).$
When randomized/mixed strategies are used in the von Neumann-type payoff, the resulting payoff can be written as $E[r_i(x, u)]=\int r_i(x, u) D_{(x, u)}(dx, du)=\hat{r}_i(D).$ Thus, the form $r_i(x, u, D)$ is more general and includes non-von Neumann payoff functions.
Example 1 (Mean-variance payoff). The payoff function of agent $i$ is $E[r_i(x, u)]-\lambda \sqrt{var[r_i(x, u)]}, \lambda\in \mathbb{R}$ which can be written as a function of $r_i(x, u, D_{(x, u)}).$ For any number of interacting agents, the term $D_{(x_i, u_i)}$ plays a non-negligible role in the standard deviation $\sqrt{var[r_i(x, u)]}.$ Therefore, the impact of agent $i$ in the individual mean-field term $D_{(x_i, u_i)}$ cannot be neglected.
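As a quick numerical illustration of Example 1, the following Python sketch estimates a mean-variance payoff by Monte Carlo; the specific payoff $r(x, u)=-(x-u)^2$ and the Gaussian state-action samples are illustrative assumptions, not part of the model above.

```python
import numpy as np

# Hedged sketch: Monte Carlo estimate of the mean-variance payoff of
# Example 1, E[r(x,u)] - lambda * sqrt(var[r(x,u)]). The payoff
# r(x,u) = -(x-u)^2 and the Gaussian samples are illustrative assumptions.
rng = np.random.default_rng(1)
lam = 0.5
x = rng.normal(0.0, 1.0, 100_000)   # sampled states
u = rng.normal(0.0, 1.0, 100_000)   # sampled actions

r = -(x - u) ** 2                   # assumed tracking-type payoff
payoff = r.mean() - lam * np.sqrt(r.var())
print(payoff)
```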
Example 2 (Aggregative games). The payoff function of each agent depends on its own action and an aggregative term of the other actions. Example of payoff functions include $r_i(u_i, \sum_{j\neq i} u^{\alpha}_j), \ \alpha>0$ and $r_i(x_iu_i, \sum_{j\neq i} x_ju_j).$
In the non-atomic setting, the influence of an individual state $x_i$ and individual action $u_i$ of any agent $i$ will have a negligible impact on mean-field term $\hat{D}_{(x, u)}=\lim_{n\rightarrow +\infty}\ \frac{1}{n-1}\sum_{j\neq i}\delta_{\{ x_j, u_j\}}.$ In that case, one gets to the so-called mean-field game.
Example 3 (Population games). Consider a large population of agents. Each agent has a certain state/type $x \in \mathcal{X}$ and can choose a control action $u\in \mathcal{U}(x).$ Denote the proportion of type-action pairs in the population by $m.$ The payoff of an agent with type/state $x$ and control action $u$ when the population profile is $m$ is $r(x, u, m).$ Global games with a continuum of agents were studied in [16] based on the Bayesian games of [17], which use the proportion of actions.
In the case where both non-atomic and atomic terms are involved in the payoff, one can write the payoff as $r_i(x, u, D, \hat{D})$ where $\hat{D}$ is the population state-action measure. Agent $i$ may influence $D_i$ (distribution of its own state-action pairs) but its influence on $ \hat{D}$ may be limited.
The main goals of static mean-field-type games are: (1) identification of solution concepts [35] such as Nash equilibrium, Bayesian equilibrium, correlated equilibrium, and Stackelberg solution; (2) computation of solution concepts; (3) development of algorithms and learning procedures to reach and select efficient equilibria; and (4) mechanism design for incentivizing agents. The next section presents a class of dynamic MFTGs which are played over several stages.
Consider a basic MFTG with $n\geq 2$ agents interacting over horizon $[0, T], \ \ T>0. $ The individual state dynamics of agent $i$ is given by
$ dx_i=b_i\left(x_i, u_i, D_{(x_i, u_i)}, \frac{\sum\limits_{k\neq i}\delta_{(x_k, u_k)}}{n-1}\right) dt+ \sigma_i\left(x_i, u_i, D_{(x_i, u_i)}, \frac{\sum\limits_{k\neq i}\delta_{(x_k, u_k)}}{n-1}\right) dW_i, \\ x_i(0) \sim D_{i, 0} $ | (1) |
and the payoff functional of agent $i$ is
$ R_i(u)=g_i\left(x_i(T), D_{x_i(T)}, \frac{\sum\limits_{k\neq i}\delta_{x_k(T)}}{n-1}\right) +\int_0^T r_i\left(x_i, u_i, D_{(x_i, u_i)}, \frac{\sum\limits_{k\neq i}\delta_{(x_k, u_k)}}{n-1}\right) dt, $ | (2) |
where the strategy profile is $u=(u_1, \ldots, u_n), $ also denoted $(u_i, u_{-i}).$ The functions $b_i, \sigma_i, g_i, r_i$ are measurable functions. $x_i(t): = x_i(t)[u]$ is the state of agent $i$ under the strategy profile $u, $ $ D_{x_i(t)}=\mathcal{L}(x_i(t))$ is the probability distribution (law) of $x_i(t), $ and $ D_{(x_i(t), u_i(t))}=\mathcal{L}(x_i(t), u_i(t))$ is the probability distribution of the state-control action pair $(x_i(t), u_i(t))$ of agent $i$ at time $t.$ $\delta_y$ is the $\delta-$Dirac measure concentrated at $y, $ and $W_i$ is a standard Brownian motion defined over the filtration $(\Omega, \mathbb{P}, (\mathcal{F}_t)_{t\leq T}).$
The novelty in the modelling of (1)-(2) is that each individual agent $i$ influences its own mean-field terms $D_{x_i(t)}$ and $ D_{(x_i(t), u_i(t))}$ independently of the total number of interacting agents. In particular, the influence of agent $i$ on those mean-field terms remains non-negligible even when there is a continuum of agents. The distributions $D_{x_i}$ and $D_{(x_i, u_i)}$ represent two important terms in the modeling of MFTGs; we refer to them as individual mean-field terms. In the finite regime, the other agents are captured by the empirical measures $\frac{\sum_{k\neq i}\delta_{x_k}}{n-1}$ and $ \frac{\sum_{k\neq i}\delta_{(x_k, u_k)}}{n-1}.$ We refer to these terms as population mean-field terms.
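To make the two kinds of mean-field terms concrete, here is a minimal Python sketch of an interacting system in the spirit of (1), where the individual law $D_{x_i}$ is approximated by an empirical average over Monte Carlo copies. The specific mean-reverting drift, the diffusion coefficient, and all parameter values are illustrative assumptions, not the authors' model.

```python
import numpy as np

# Hedged sketch of an interacting system in the spirit of (1): n agents with
# an assumed drift pulling toward the population mean-field term (the
# empirical mean of the others) and toward the mean of the individual law
# D_{x_i}, the latter approximated over Monte Carlo copies.
rng = np.random.default_rng(2)
n, copies, steps, dt, sigma = 5, 2000, 100, 0.01, 0.3

x = rng.normal(0.0, 1.0, (copies, n))           # x_i(0) ~ D_{i,0}
for _ in range(steps):
    own_mean = x.mean(axis=0)                   # proxy for the mean of D_{x_i}
    others_mean = (x.sum(axis=1, keepdims=True) - x) / (n - 1)  # population term
    drift = -(x - others_mean) - 0.1 * (x - own_mean)
    x += drift * dt + sigma * np.sqrt(dt) * rng.normal(size=x.shape)
print(x.mean(axis=0))                           # per-agent mean states at time T
```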
Similarly, a basic discrete time (discrete or continuous state) MFTG is given by
$ \begin{array}{l} x_{i, t+1}\sim q_i\left(\cdot\ \Big|\ x_{i, t}, u_{i, t}, D_{(x_{i, t}, u_{i, t})}, \frac{\sum\limits_{k\neq i}\delta_{(x_{k, t}, u_{k, t})}}{n-1}\right), \quad x_{i0}\sim D_{i, 0}, \\ R_i(u)=g_i\left(x_{i, T}, D_{x_{i, T}}, \frac{\sum\limits_{k\neq i}\delta_{x_{k, T}}}{n-1}\right)+\sum\limits_{t=0}^{T-1}r_i\left(x_{i, t}, u_{i, t}, D_{(x_{i, t}, u_{i, t})}, \frac{\sum\limits_{k\neq i}\delta_{(x_{k, t}, u_{k, t})}}{n-1}\right), \end{array} $ | (3)
where $q_i(.|.)$ is the transition kernel of agent $i$ to next states.
Mean-field-type control and global optimization can be found in [36,133,134,172,173,175]. The models (1) and (3) are easily adapted to bargaining solutions and to cooperative and coalitional MFTGs; see [135,176,177]. Psychological MFTG was recently introduced in [111,178], where spitefulness, altruism, selfishness, and reciprocity of the agents are examined by means of empathy, other-regarding behavior, and psychological factors.
Definition 3. An admissible control strategy of agent $i$ is an $\mathcal{F}_i-$adapted and square integrable process with values in a non-empty subset ${U}_i.$ Denote by $\mathcal{U}_i =L^2_{\mathcal{F}_i}([0, T], \ U_i)$ the class of admissible control strategies of agent $i$.
Definition 4 (Best response). Given a strategy profile of the other agents $(u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n), $ with $u_j, \ j\neq i$ that are admissible and the mean-field terms $D$, the best response problem of agent $i$ is:
$ \left\{\begin{array}{l} \sup\limits_{u_i\in \mathcal{U}_i} \mathbb{E}[R_i(u)], \\ \mbox{subject to the state dynamics (1), } \end{array}\right. $ | (4)
The first goal is to find and characterize the best-response strategies of each user. For user $i$ this consists in solving problem (4). In problem (4), the information structure available to the user plays an important role. We distinguish three types of strategies: (1) open-loop strategies, which are measurable functions of $t$ only; (2) state-feedback strategies, which are measurable functions of state and time; and (3) state-and-mean-field feedback strategies, which are measurable functions of state, mean-field, and time. To solve problem (4), four different methods have been developed:
• A direct approach, which consists in writing the payoff functional in a form in which the optimal value and the optimizers can be read off directly, followed by a verification and validation procedure.
• A stochastic maximum principle (Pontryagin's approach), which provides necessary conditions for optimality.
• A dynamic programming principle (Bellman's approach), which consists in writing the value of the problem (per agent) in (backward) recursive form, or as the solution to a dynamical system.
• An uncertainty quantification approach, by means of a Wiener chaos expansion of all the stochastic terms and the use of the Kosambi-Karhunen-Loève expansion, which represents a stochastic process as an infinite linear combination of orthogonal functions, analogous to a Fourier series representation of a function over a bounded domain.
If every user solves its best-response problem, the resulting system will be a Nash equilibrium system defined below.
Definition 5. A (Nash) equilibrium of the game is a strategy profile $(u^*_1, \ldots, u^*_n)$ such that for every agent $i$,
$\mathbb{E}[R_i(u^*)]\geq \mathbb{E}[R_i(u^*_1, \ldots, u_{i-1}^*, u_{i}, u_{i+1}^*, \ldots, u^*_n)], $ |
for all $ u_i\in \mathcal{U}_i.$
The second goal is to find and characterize Nash equilibria of the mean-field-type game. We provide below a basic example in which the Nash equilibrium problem can be solved semi-explicitly using a Riccati system.
Example 4 (Network Security Investment [80]). A graph is connected if there is a path that joins any point to any other point in the graph. Consider $n\geq 2$ decision-makers over a connected graph; thus, the security of a node is influenced by the others through possibly multiple hops. The effort of user $i$ in security investment is $u_i.$ The associated cost may include money (e.g., for purchasing antivirus software), time, and energy (e.g., for system scanning and patching). Let $x(t)$ be the security level of the network at time $t$ and
$ R_i(u)= \ \ - \frac{1}{2}[x(T)-Ex(T)]^2 +\int_0^T q_i(t)x(t)(1-\epsilon_i(t) x(t))-\rho_i(t) u_i(t)-\frac{r_i(t)}{2}u^2_i(t)dt. $ | (5) |
The best response of user $i$ to $(u_{-i}, E[x]): = (u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n, E[x])$ solves the following linear-quadratic mean-field-type control problem:
$ \left\{\begin{array}{l} \sup\limits_{u_i\in \mathcal{U}_i} \mathbb{E}[R_i(u_1, \ldots, u_n)], \\ \mbox{subject to } \\ dx=\left\{-ax-\bar{a}E[x]+\sum\limits_{i=1}^{n} b_iu_i\right\}dt+cx\, dW, \\ x(0)\in \mathbb{R}, \end{array}\right. $ | (6)
where $q_i(t)\geq 0, \ \epsilon_i(t)\geq 0, \ \rho_i(t)\geq 0, \ r_i(t)>0, $ and $a, \bar{a}, b_i, c$ are real numbers, and where $E[x(t)]$ is the expected value of the network security level created by all users under the control action profile $(u_1, \ldots, u_n).$ Note that the expected value of the terminal term in $R_i$ can be seen as a weighted variance of the state [130] since $E[(x(t)-E[x(t)])^2]= var(x(t)).$ The optimal control action is in state-and-mean-field feedback form:
$ \left\{\begin{array}{l} u^*_i(t)=-\frac{b_i}{r_i(t)}\left[\beta_i(t)x(t)+\eta_{1i}(t)E[x(t)]+\eta_{2i}(t)\right]-\frac{\rho_i(t)}{r_i(t)}, \\ 0=\dot{\beta}_i+(-2a+c^2)\beta_i-\beta_i\sum\limits_{j=1}^{n}\frac{b^2_j}{r_j}\beta_j+2q_i\epsilon_i, \quad \beta_i(T)=1, \\ \dot{\eta}_{1i}-2(a+\bar{a})\eta_{1i}-2\bar{a}\beta_i-\beta_i\sum\limits_{j=1}^{n}\frac{b^2_j}{r_j}\eta_{1j}-\eta_{1i}\sum\limits_{j=1}^{n}\frac{b^2_j}{r_j}(\beta_j+\eta_{1j})=0, \quad \eta_{1i}(T)=-1, \\ \dot{\eta}_{2i}-(a+\bar{a})\eta_{2i}-\beta_i\sum\limits_{j=1}^{n}\frac{b_j}{r_j}(b_j\eta_{2j}+\rho_j)-\eta_{1i}\sum\limits_{j=1}^{n}\frac{b_j}{r_j}(b_j\eta_{2j}+\rho_j)-q_i=0, \quad \eta_{2i}(T)=0. \end{array}\right. $
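For readers who wish to reproduce such trajectories, the following Python sketch integrates the Riccati system above backward in time for $n$ symmetric users (identical $b, r, q, \rho, \epsilon$, so that $\beta_i=\beta$, $\eta_{1i}=\eta_1$, $\eta_{2i}=\eta_2$). The values $b=5, r=1, q=1, \rho=10^{-4}, \epsilon=0.1$ and the step size $2^{-8}$ follow the text below; the values of $a$, $\bar a$, $c$, and $n$ are illustrative assumptions.

```python
# Hedged sketch: backward Euler integration of the symmetric Riccati system
# above. b, r, q, rho, eps and the step 2^-8 follow the text; a, abar, c, n
# are illustrative assumptions.
a, abar, c, n = 1.0, 0.5, 0.2, 10
b, r, q, rho, eps = 5.0, 1.0, 1.0, 1e-4, 0.1
dt = 2.0 ** -8
steps = int(1.0 / dt)                      # horizon [0, 1]

beta, eta1, eta2 = 1.0, -1.0, 0.0          # terminal conditions at t = T
for _ in range(steps):                     # integrate from t = T down to 0
    dbeta = -((-2 * a + c**2) * beta - n * (b**2 / r) * beta**2 + 2 * q * eps)
    deta1 = (2 * (a + abar) * eta1 + 2 * abar * beta
             + n * (b**2 / r) * (beta * eta1 + eta1 * (beta + eta1)))
    deta2 = (a + abar) * eta2 + n * (b / r) * (beta + eta1) * (b * eta2 + rho) + q
    beta, eta1, eta2 = beta - dt * dbeta, eta1 - dt * deta1, eta2 - dt * deta2

# optimal feedback at t = 0: u_i*(0) = -(b/r)*(beta*x + eta1*E[x] + eta2) - rho/r
print(beta, eta1, eta2)
```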
Figure 3 plots the optimal cost trajectory with step size $2^{-8}$; the horizon is $[0, 1], $ and the other parameters are $b=5, r=1, q=1, \rho=0.0001, \epsilon=0.1.$ Figure 4 plots the optimal state vs. the equilibrium state. As noted in [136], the security state is higher when the users cooperate and when the coalition formation cost is small enough. It is widely known in game theory that the Nash equilibrium can be inefficient compared to the global optimum of the system. The relative payoff difference between the worst Nash equilibrium payoff and the global optimum payoff has been proposed in the literature [180,183] as a measure of inefficiency. Another measure of inefficiency is the price of anarchy [181,182], which measures the ratio between the worst Nash equilibrium payoff and the global optimum payoff. Note, however, that one needs to be careful when taking a ratio here, because the denominator may vanish in our context. Note also that restricting the analysis to the set of symmetric strategies may lead to performance degradation [173], as symmetric Nash equilibria may not be efficient even in symmetric games. Thus, looking at $\epsilon-$Nash equilibria via the mean-field limiting behavior does not help in improving the efficiency of Nash equilibria.
Example 4 can be adapted to the discrete-time mean-field-type game problem (4) associated with (3). It corresponds to a variance reduction problem, which is widely used in risk quantification. The following example solves a distributed variance reduction problem in discrete time using MFTG.
Example 5 (Distributed Mean-Variance Paradigm, [137]). The best response problem of agent $i$ is
$ \left\{\begin{array}{l} \inf\limits_{u_i\in \mathcal{U}_i}\left\{ q_{iT}\, var(x_T)+(q_{iT}+\bar{q}_{iT})(E[x_T])^2+\sum\limits_{t=0}^{T-1}q_{it}\, var(x_t)+(q_{it}+\bar{q}_{it})(E[x_t])^2+\sum\limits_{t=0}^{T-1}r_{it}\, var(u_{it})+r_{it}(E u_{it})^2\right\}, \\ \mbox{subject to } \\ x_{t+1}=\left\{ax_t+\bar{a}Ex_t+\sum\limits_{i=1}^{n} b_iu_{it}\right\}+\sigma W(t), \\ x_0\sim \mathcal{L}(X_0), \quad E[X_0]=m_0, \end{array}\right. $ | (7)
given the strategies $(u_j)_{j\neq i}$ of the agents other than $i$.
Under the assumption that, for $t\in \{0, \ldots, T-1\}, $ $ q_{jt}\geq 0, \ (q_{jt}+\bar{q}_{jt})\geq 0, $ and $ r_{jt}>0, $ there exists a unique best response of agent $i$, and it is given by
$ \left\{\begin{array}{l} u_{i, t}=\eta_{it}(x_t-Ex_t)+\bar{\eta}_{it}Ex_t, \\ \eta_{it}=-\frac{ab_i\beta_{i, t+1}+b_i\beta_{i, t+1}\sum\limits_{j\neq i}b_j\eta_{jt}}{r_{it}+b^2_i\beta_{i, t+1}}, \\ \bar{\eta}_{it}=-\frac{b_i\gamma_{i, t+1}\left(a+\bar{a}+\sum\limits_{j\neq i}b_j\bar{\eta}_{j, t}\right)}{r_{it}+b^2_i\gamma_{i, t+1}}, \\ \beta_{it}=q_{it}+\beta_{i, t+1}\left\{a^2+2a\sum\limits_{j\neq i}b_j\eta_{jt}+\Big[\sum\limits_{j\neq i}b_j\eta_{jt}\Big]^2\right\}-\frac{\Big[ab_i\beta_{i, t+1}+b_i\beta_{i, t+1}\sum\limits_{j\neq i}b_j\eta_{jt}\Big]^2}{r_{it}+b^2_i\beta_{i, t+1}}, \quad \beta_{iT}=q_{iT}\geq 0, \\ \gamma_{it}=(q_{it}+\bar{q}_{it})+\gamma_{i, t+1}\Big(a+\bar{a}+\sum\limits_{j\neq i}b_j\bar{\eta}_{j, t}\Big)^2-\frac{\Big(b_i\gamma_{i, t+1}\big(a+\bar{a}+\sum\limits_{j\neq i}b_j\bar{\eta}_{j, t}\big)\Big)^2}{r_{it}+b^2_i\gamma_{i, t+1}}, \quad \gamma_{iT}=q_{iT}+\bar{q}_{iT}\geq 0, \end{array}\right. $ | (8)
and the best response cost of agent $i$ is
$ E[L_i({u})]= E{\beta}_{i0}(x_0-Ex_0)^2+{\gamma}_{i0}(Ex_0)^2+ \sum\limits_{t=0}^{T-1}{\beta}_{i, t+1}\sigma^2.$ |
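A possible implementation of the backward recursion (8) for symmetric agents is sketched below in Python. Under symmetry (identical $a, \bar a, b, q, \bar q, r$ across agents) the fixed point over $(\eta_{jt})_j$ admits the closed forms used in the code; all parameter values are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the backward recursion (8) for n symmetric agents, so
# that eta_{it} = eta_t, etc. Under symmetry, eta_t and etabar_t solve
# linear fixed-point equations with the closed forms below. All parameter
# values are illustrative assumptions.
a, abar, b, sigma, n, T = 0.8, 0.1, 1.0, 0.5, 5, 20
q, qbar, r = 1.0, 0.5, 1.0

beta = np.zeros(T + 1); gamma = np.zeros(T + 1)
eta = np.zeros(T); etabar = np.zeros(T)
beta[T] = q                      # beta_{iT} = q_{iT} >= 0
gamma[T] = q + qbar              # gamma_{iT} = q_{iT} + qbar_{iT} >= 0
for t in range(T - 1, -1, -1):
    eta[t] = -a * b * beta[t+1] / (r + n * b**2 * beta[t+1])
    etabar[t] = -b * gamma[t+1] * (a + abar) / (r + n * b**2 * gamma[t+1])
    K = a + (n - 1) * b * eta[t]                # closed-loop gain, deviation part
    Kbar = a + abar + (n - 1) * b * etabar[t]   # closed-loop gain, mean part
    beta[t] = q + beta[t+1] * K**2 - (b * beta[t+1] * K)**2 / (r + b**2 * beta[t+1])
    gamma[t] = (q + qbar) + gamma[t+1] * Kbar**2 \
               - (b * gamma[t+1] * Kbar)**2 / (r + b**2 * gamma[t+1])

# best-response cost: E[beta_0 (x0 - Ex0)^2] + gamma_0 (Ex0)^2 + sum_t beta_{t+1} sigma^2
x0_mean, x0_var = 1.0, 0.2
print(beta[0] * x0_var + gamma[0] * x0_mean**2 + beta[1:].sum() * sigma**2)
```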
In both Examples 4 and 5, the optimal strategy of agent $i$ is a feedback function of the state and the expected value of the state. This structure differs from the one obtained in classical stochastic optimal control, which is mean-field-free. The methodology used in standard stochastic game problems does not apply directly to mean-field-type game problems; these techniques need to be extended, which leads to new optimality systems [36,179].
This subsection discusses two applications of MFTG in civil engineering.
Application 1 (Road Traffic over Networks). The example below concerns transportation networks under dynamic flow and possible stochastic incidents on the lanes. Consider a network $(\mathcal{V}, \mathcal{L})$, where $\mathcal{V}$ is a finite set of nodes and $\mathcal{L} \subseteq \mathcal{V} \times \mathcal{V}$ is a set of directed links. $n$ users share the network $(\mathcal{V}, \mathcal{L})$. Let ${\cal R}$ be the set of possible routes in the network. A user with a given source-destination pair arrives in the system at source node $s$ and leaves it at the destination node $d$ after visiting a series of nodes and links, which we refer to as a route or path. Denote by $c^{w}_i(x_t, u_{it}, m_t)$ the average $w-$weighted cost of the path $u_{it}$ when a fraction $m_t$ of users chooses that path at time $t$ and $x_t$ is the incident state on the route. The weight $w$ simply indicates that the effective cost is a weighted sum of several costs depending on certain objectives; these metrics could be delay costs, queueing times, memory costs, etc., weighted by $w$ in the multi-objective case. We define two regimes for the traffic game: a finite regime game with $n$ drivers denoted by $\mathcal{G}_{n}$ and an infinite regime game denoted by $\mathcal{G}_{\infty}.$ The basic components of these games are $(\mathcal{N}, \mathcal{X}, \mathcal{R}, I=\{x\}, c_i(x, .)).$ A pure strategy of driver $i$ is a mapping from the information set $I$ to a choice of a route in $\mathcal{R}.$ The set of pure strategies of a user is $\mathcal{R}^{\mathcal{X}}.$
An action profile (route selection) $(u_1, \ldots, u_n)\in\mathcal{R}^{n}$ is an equilibrium of the finite mean-field-type game if for every user $i$ the following holds:
$ c_i(x, u_{i}, m(x, u_{i}))\leq c_i(x, u^{\prime }_i, m(x, u^{\prime}_i)+\frac{1}{n}), \forall u^{\prime}_i\in\mathcal{R}, $ |
for the realized state $x.$
The term $+\frac{1}{n} $ is the contribution of the deviating user to the new route. When $n$ is sufficiently large the state-dependent equilibrium notion becomes a population profile $m(x)=(m(x, u))_{u\in\mathcal{R}}$ such that for every user $i$
$ m(x, u)>0 \Longrightarrow c_i(x, u, m(x, {u}))\leq c_{i}(x, u^{\prime}, m(x, u^{\prime})), $
for the realized state $x$ and for all $ u^{\prime}\in\mathcal{R}.$ We refer to the equilibrium defined above as $0-$Nash equilibrium. Note that the equilibrium profile depends on the realized state $x$.
We now discuss the existence conditions.
The equilibrium conditions can be rewritten in the form of variational inequalities: for each state $x, $ $(*)\ \sum_{u\in\mathcal{R}}[m(x, u)-y(x, u)] c(x, u, m(x, u))\leq 0 \ $ for all $y.$ Hence, the existence of an equilibrium reduces to the existence of a solution to the variational inequality (*). By standard fixed-point arguments, we know from [138] that, for each single state, such a population game has an equilibrium if the cost functions are continuous in the second variable $m$. Moreover, the equilibrium is unique under strict monotonicity conditions on the cost function $c_i(x, u, .).$ Note that uniqueness in $m$ does not mean uniqueness of the action profile $u, $ since one can permute some of the commuters. We use imitative learning from an information-theoretic viewpoint and introduce the cost of learning from strategy $m_{i, t-1}$ to $m_{i, t}$ as the relative entropy $d_{KL}(m_{i, t-1}, m_{i, t}).$
Then, each user reacts by taking a myopic conjecture given by
$\min\limits_{m_{i, t}}\ \langle \hat{c}_{i, t}, m_{i, t}\rangle +\frac{1}{\beta_{i, t}}d_{KL}(m_{i, t-1}, m_{i, t})$ |
where $\hat{c}_{i, t}$ is the estimated cost vector, $\beta_{i, t}$ is a positive parameter, $d_{KL}$ is the relative entropy from $m_{i, t-1}$ to $m_{i, t}.$
$d_{KL}$ is not a distance (because it is not symmetric) but it is positive and can be seen as a cost to move from $m_{i, t-1}$ to $m_{i, t}.$ We use the convexity property of the relative entropy to compute the strategy that minimizes the perturbed expected cost.
Proposition 1. Let $\beta_{i, t}=\log(1+\nu_{i, t})$ for $\nu_{i, t}>0.$ Then, the imitative Boltzmann-Gibbs strategy is the minimizer of the above problem which becomes a multiplicative weighted imitative strategy:
$m_{i, t}(u):=\frac{ m_{i, t-1}(u) (1+\nu_{i, t})^{ -\hat{c}_{i, t-1}(u)}}{\sum\limits_{u^{\prime}\in\mathcal{R}} m_{i, t-1}(u^{\prime})(1+\nu_{i, t})^{- \hat{c}_{i, t-1}(u^{\prime})}}.$ |
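In code, one step of this multiplicative weighted imitative strategy is a reweighting by the estimated costs followed by a normalization; the following Python sketch iterates it with a fixed cost vector, where the route costs and the learning-rate parameter $\nu$ are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the multiplicative weighted imitative strategy of
# Proposition 1: m_t(u) is proportional to m_{t-1}(u) * (1+nu)^(-c_{t-1}(u)).
def imitative_update(m_prev: np.ndarray, c_prev: np.ndarray, nu: float) -> np.ndarray:
    w = m_prev * (1.0 + nu) ** (-c_prev)   # reweight by estimated costs
    return w / w.sum()                     # renormalize to a distribution

m = np.ones(3) / 3                         # start from the uniform mixed strategy
costs = np.array([1.0, 2.0, 0.5])          # estimated per-route delays (assumed)
for _ in range(50):
    m = imitative_update(m, costs, nu=0.5)
print(m)                                   # mass concentrates on the cheapest route
```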
The advantage of the imitative strategy is that it makes sense not only for small learning rates but also for high learning rates. When the learning rate is large, the trajectory gets closer to the best-reply dynamics, and for small learning rates it leads to the replicator dynamics [139]. One useful interpretation of the imitative strategy is the following. Consider a bounded rationality setup where the parameter $\nu_{i, t}$ is the rationality level of user $i.$ Then, a large value of $\nu_{i, t}$ means a very high rationality level for user $i$; hence user $i$ will use an almost "best reply'' strategy. A small value of $\nu_{i, t}$ means that user $i$ has a low rationality level and is described by the replicator equation. It is interesting to see that both behaviors can be captured by the same imitative mean-field learning. Note that the logit (or Boltzmann-Gibbs) learning does not cover the low rationality level case.
Proposition 2. As $\nu_{i, t}$ goes to zero, the trajectory of the multiplicative weighted imitative strategy is approximated by the replicator equation of the estimated delays
$ \dot{m}_{i, t}(u)=m_{i, t}(u)\left[-\hat{c}_{i, t}(u)+\sum\limits_{u^{\prime}} m_{i, t}(u^{\prime})\hat{c}_{i, t}(u^{\prime}) \right]. $ |
For the one-commuter case, the solution of the replicator equation yields
$ m_{i, t}(u)= \frac{m_{i, 0}(u)e^{-t .\frac{1}{t}\int_0^t \hat{c}_{i, t'}(u)\ dt'}}{\sum\limits_{u^{\prime}} m_{i, 0}(u^{\prime})e^{-t .\frac{1}{t}\int_0^t \hat{c}_{i, t'}(u^{\prime})\ dt'}} $ |
That is, with $\bar{c}_{i}(u)=\frac{1}{t}\int_0^t \hat{c}_{i, t'}(u)\, dt'$ denoting the time-averaged cost, the solution reads
$ m_{i, t}(u)= \frac{m_{i, 0}(u)e^{-t \bar{c}_{i}(u)}}{\sum\limits_{u^{\prime}} m_{i, 0}(u^{\prime})e^{-t \bar{c}_{i}(u^{\prime})}}. $ |
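A quick numerical sanity check of this closed form is immediate; the constant time-averaged cost vector below is an illustrative assumption.

```python
import numpy as np

# Hedged numerical check of the closed-form replicator solution above for a
# constant time-averaged cost vector cbar (an illustrative assumption).
cbar = np.array([1.0, 2.0, 0.5])
m0 = np.ones(3) / 3
for t in [1.0, 5.0, 20.0]:
    w = m0 * np.exp(-t * cbar)
    print(t, w / w.sum())   # as t grows, mass concentrates on argmin cbar
```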
Clearly, the time-average trajectory based on the average payoff and the smooth best-reply dynamics are closely related, with parameter $\beta_{i, t}=t.$ Each driver knows the current state and employs the learning pattern: each driver tries to exploit the information on the current state and builds a strategy based on the observation of the vector of realized delays over all the routes at the previous steps. The Folk theorem for evolutionary game dynamics then states:
• When starting from an interior mixed strategy, the replicator equation converges to one of the equilibria.
• All the faces of the multi-simplex are forward invariant. In particular, the pure strategies are steady states of the imitative dynamics.
• The set of global optima belongs to the set of steady states of the imitative dynamics.
The strategy-learning of user $i$ is given by
$ \mathcal{L}_{i}^1(x_{t}):\ \ \ m_{i, t}(x_{t}, u):=\frac{ m_{i, t-1}(x_{t}, u)(1+\nu_{i, t})^{- c_{i, t-1}(x_{t}, u)}}{\sum\limits_{u^{\prime}\in\mathcal{R}} m_{i, t-1}(x_{t}, u^{\prime}) (1+\nu_{i, t})^{- c_{i, t-1}(x_{t}, u^{\prime})}} $ | (9)
$ \mathcal{L}_{i}^{2}(x_{t}):\ \ \ m_{i, t}(x_{t}, u):=\frac{ m_{i, t-1}(x_{t}, u) (1+\nu_{i, t})^{ -\bar{c}_{i, t-1}(x_{t}, u)}}{\sum\limits_{u^{\prime}} m_{i, t-1}(x_{t}, u^{\prime}) (1+\nu_{i, t})^{- \bar{c}_{i, t-1}(x_{t}, u^{\prime})}}, $ | (10) |
where $\bar{c}_{i, t}(x, u)$ is the time-average delay (up to $t$) in route $u$ and state $x.$
The imitative mean-field learning above can be used to solve a long-term mean-field game problem. We observe in Figure 6 that the imitative learning converges to one of the global optima. However, the exploration space grows in complexity. We explain how to overcome this issue using a mean-field learning scheme based on particle swarm optimization (PSO), in which each user has a population of particles (multi-swarm). The particles within the same population (coalition) may pool their efforts to learn faster and better exploit the available information.
The next example concerns multi-level building evacuation [140,141,142,143] using constrained mean-field games.
Application 2 (Multi-level building evacuation). A typical mean-field game model assumes that agents have unconstrained state dynamics. This has been, for example, the case for most of the existing mean-field models developed in the last three decades. Such models may not, however, be useful in practice, for example in the context of building evacuation. Evacuation strategies and values are therefore designed using constrained mean-field-type game theory.
Particle-based pedestrian models have been studied in [144,145]. Continuum approximations of theoretical models have been proposed in [144,145,146,147,148,149]. Recent mean-field studies on crowd and pedestrian flows include [150,151,152,153,154]. Below, a mean-field game for multi-level building evacuation is presented. Consider a building with multiple floors and resolutions represented by a compact domain $D$ in the $m-$dimensional Euclidean space $\mathbb{R}^m.$ The number of floors is $K, $ and the domain at floor $k$ is denoted $D_k.$ For $1 < k < K, $ floor $k$ is connected to the higher floor $k+1$ through the intermediary domain $I^{+}_{k}$ and to the lower floor $k-1$ through $I^{-}_{k}.$ The sets $I_k$ can be elevator zones or stairs. $n\geq 2$ agents are distributed in a multi-level, multi-resolution building with stairs, exit doors, and sky-bridges. Each agent knows her current location in the building. The state/location $x_i$ of an agent $i$ changes depending on her control action $u_i$. The agent is interested in a safe evacuation from the building, that is, in the minimal exit time that avoids a huge crowd around her. The problem of agent $i$ is equivalent to
$ \left\{\begin{array}{l} \inf\limits_{u_i}\ c_3(x_i(T), G_n(x_i(T)))+\int_0^T c_1(G_n(x_i(t)))\|u_i(t)\|^2+c_2(G_n(x_i(t)))\ dt, \\ \dot{x}_i=u_i\in \mathbb{R}^3, \quad 0<t<T, \\ x_i(t)\in D\subset \mathbb{R}^3, \\ \mbox{Boundary constraints: } u_i|_{\partial D}=0, \quad u_i|_{ExitDoor}=k\neq 0\in \mathbb{R}^3, \end{array}\right. $
where the $c_i$ are positive increasing functions with $c_2(0)=0, $ and $T>0$ is the exit time at one of the exits. The final exit cost is represented by $c_3, $ which can be written as $\tilde{c}_3+\tilde{h}(x)$ where $\tilde{c}_3>0$ captures the initial response time of an agent (without congestion around), and
$G_n(x_i(t))= \frac{1}{vol(B(x_i(t), \epsilon))}\ \frac{\sum\limits_{j\neq i} \mathbb{1}_{\{d(x_j(t), x_i(t))\leq \epsilon\}}}{n-1}, $ |
represents the density of agents other than $i$ around the position $x_i$, within a distance less than $\epsilon>0$; $vol(B)$ is the $m$-dimensional volume of the ball $B(x_i(t), \epsilon), $ which does not depend on $x_i(t)$ due to translation invariance of the volume measure. When the number of agents grows, one obtains a mean-field game with several interacting agents. The state dynamics must satisfy the constraint $x_i(t)\in D$ at any time $t$ before the exit.
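A direct Python sketch of this congestion measure is given below; the floor dimension, the agent positions, and the value of $\epsilon$ are illustrative assumptions.

```python
import math
import numpy as np

# Hedged sketch of the congestion measure G_n above: the fraction of the
# other n-1 agents within distance eps of agent i, normalized by the volume
# of the eps-ball in R^m (translation invariant).
def local_congestion(positions: np.ndarray, i: int, eps: float) -> float:
    n, m = positions.shape
    vol = math.pi ** (m / 2) / math.gamma(m / 2 + 1) * eps ** m
    d = np.linalg.norm(positions - positions[i], axis=1)
    neighbors = int(np.sum(d <= eps)) - 1   # exclude agent i itself
    return neighbors / ((n - 1) * vol)

# 500 agents spread over an assumed 10 x 10 floor layout
agents = np.random.default_rng(3).uniform(0.0, 10.0, size=(500, 2))
print(local_congestion(agents, i=0, eps=1.0))
```

In the macroscopic setting, the non-optimized Hamiltonian is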
$H^0(x, u, G, p)=-c_1( G(x) )\|u\|^2 -c_2(G(x))+p.u, $ |
where $p$ is the adjoint variable. The Pontryagin maximum principle yields
$ \left\{\begin{array}{l} \dot{p}=-H^0_x, \quad p(T)=-g_x(x(T)), \\ \dot{x}=\frac{p}{2c_1(G(x))}, \quad 0<t\leq T, \\ x(0)\in D. \end{array}\right. $
The Hamiltonian $H^0 (., ., G, p(t))$ is concave in $(x, u)$ for almost every (a.e.) $t \in [0, T].$ Then, for a convex function $c_3, $ $u^*$ is an optimal response if $H^0(x^*(t), u^*, G^*, p^*(t))= \max_{u}H^0(x^*(t), u, G^*, p^*(t)).$ The (optimized) Hamiltonian is
$ H(x, p, G)=\sup\limits_{u} \{ - c_1(G(x))\| u\|^2- c_2(G(x)) +p u\}. $ |
The Hamiltonian can be computed as $ H(x, p, G)= \frac{\|p\|^2}{ 4 c_1(G(x))}-c_2(G(x)), $ and the optimal strategy is in (own)state-and-mean-field feedback form: $u^* =\frac{p}{2 c_1(G(x)) }=H_p(x, p, G(x)), $ to be projected to the tangent space. The dynamic programming principle leads to the following optimality system:
$ \left\{\begin{array}{l} v_t+H(x, v_x, G(x))=0, \ \mbox{on } (0, T)\times D, \\ v(T, x)=-g(x), \ \mbox{on } D, \\ \rho_t+div_x(\rho H_p)=0, \quad \rho_0(.) \ \mbox{given on } D\subset \mathbb{R}^3, \\ u=0, \ y=0 \ \mbox{on } \partial D, \\ u=k \ \mbox{at the exits.} \end{array}\right. $
The development of numerical results, simulations, and a validation framework can be found in [140,141,142,143]. Figures 7 and 8 show the application to a two-floor building where 500 agents are spatially distributed.
Next, two applications of MFTGs in electrical engineering are presented.
Application 3 (Millimeter Wave Wireless Communication). Millimeter wave (mmWave) frequencies, roughly between 30 and 300 GHz, offer a new frontier for wireless networks. The vast available bandwidths in these frequencies, combined with large numbers of spatial degrees of freedom, offer the potential for orders-of-magnitude increases in capacity relative to current networks and have thus attracted considerable attention for next-generation 5G communication systems. However, sharing of the spectrum and the available infrastructure will be essential for fully achieving the potential of these bands. Unfortunately, rapidly changing network dynamics make it difficult to optimize resource sharing mechanisms for mmWave networks. MIMO mmWave wireless networks will rely extensively on highly directional transmissions, where users, relays, and base stations transmit in narrow, high-gain beams through electronically steerable antennas. While directional transmissions can improve signal range and provide greater degrees of freedom through spatial multiplexing, they also significantly complicate spectrum sharing. Nodes that share the spectrum must not only detect one another but also search over a potentially large angular space to properly steer the beams and reduce interference. Power allocation, angle optimization, and channel selection algorithms should consider the possible interference field and reduce it by adjusting the angles. This can facilitate rapid directional discovery in a dynamic and mobile environment, as in Figure 9. Sometimes jammers and malicious nodes are involved in the interactions. Beam adjustment and interference coordination are central problems for users within the same network, or between users in different networks sharing the same spectrum. When multiple operators own separate core network and radio access network (RAN) nodes such as base stations and relays, but only loosely coordinate via wireless signaling, it is essential to use incentive mechanisms for better coordination to exploit the available resources. Cost sharing and pricing mechanisms capture some of the fundamental properties that arise when sharing resources among multiple operators. They can also be used in the uplink case, where users can select their preferred services and network providers and have to find tradeoffs between quality-of-experience (QoE) and cost (price).
As an illustrative example, a particle swarm learning mechanism, which follows a mean-field dynamics and in which the particles adapt parameters such as angle and power, is used to improve users' quality-of-experience. Here the key mean-field terms are the distribution of remaining energy, the distribution of transmitter-receiver pairs, and the sectorized interference field (per angle). Since users carry smartphones with limited power, it is crucial to examine the remaining energy level. As in [91], the energy dynamics can be written as
$de= -u dt +v dt+\sigma dW, $ |
subject to $e(t)\geq 0, $ $e(0)=e_0, $ where $u(.)\geq 0$ is the transmission power and $v(.)$ is the energy harvesting rate (for example, from distributed renewable energy sources).
Proposition 3. The marginal distribution $m^e(t, e)$ of remaining energy solves the Fokker-Planck-Kolmogorov equation:
$ \partial_tm^e+\partial_e[(-u+v)m^e]-\frac{\sigma^2}{2}\partial_{ee}m^e =0, $ |
in a distribution sense. The first-moment dynamics yields $ \frac{d}{dt}\bar{e}=-\bar{u}+\bar{v}, $ where $\bar{e}(t)=\mathbb{E}[e(t)], \bar{u}(t)=\mathbb{E}[u(t)], \bar{v}(t)=\mathbb{E}[v(t)]$ denote the expected values of $e(t), u(t), v(t), $ respectively.
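A minimal Euler-Maruyama simulation of the energy dynamics, checking the first-moment identity above empirically, can be sketched as follows; the constant rates $u, v$, the noise level, and the clipping at $e=0$ are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: Euler-Maruyama simulation of de = (-u + v) dt + sigma dW,
# clipping at e = 0 as a crude way to respect e(t) >= 0 (an assumption of
# this sketch). Constant u, v, sigma, e0 are illustrative values.
rng = np.random.default_rng(0)
n_paths, steps, dt = 10_000, 200, 0.01
u, v, sigma, e0 = 1.0, 0.6, 0.3, 5.0

e = np.full(n_paths, e0)
for _ in range(steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    e = np.maximum(e + (-u + v) * dt + sigma * dW, 0.0)

# first-moment check (approximate while paths stay away from 0):
# d/dt E[e] = -E[u] + E[v] predicts e0 + (v - u) * t
print(e.mean(), e0 + (v - u) * steps * dt)
```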
Users move according to a mobility dynamics (which may not be stationary). The channel state can be modeled, for example, using a matrix-valued Ornstein-Uhlenbeck process $dH_j= \Gamma_j[\hat{H}_j-H_j] dt+dW_j, $ where $\Gamma_j, \hat{H}_j$ are matrices with dimensions compatible with the antenna arrays at the source and destination.
Proposition 4. The marginal distribution $m^{H_j}(t, H_j)$ of channel state of user $j$ solves the Fokker-Planck-Kolmogorov equation:
$ \partial_t m^{H_j}+\mbox{div}_{H_j}[(\Gamma_j(\hat{H}_j-H_j))m^{H_j}]-\frac{1}{2}trace[\partial_{H_jH_j}m^{H_j}] =0, $ |
in a weak sense. The first moment dynamics of user $j$ yields
$ \frac{d}{dt}\bar{H}_j=\Gamma_j (\hat{H}_j-\bar{H}_j), $ |
which decays exponentially to $\hat{H}_j$ as $t$ increases.
The (unnormalized) distribution of the triplet (position, energy, channel) of the population at time (or period) $t$ is $\nu(t, e, x, H)=\sum_{j=1}^n \delta_{\{e_j(t), x_j(t), H_j(t)\}}, $ and the one within a beam $A(s, d)$ with direction $s-d$ is
$\tilde{\nu}(t, e, x, H, s, d)=\sum\limits_{j=1}^n \delta_{\{e_j(t), x_j(t), H_j(t)\}}\mathbb{1}_{x_j(t)\in A(s, d)}.$ |
The sectorized interference field is $I(t, x(t), d)=\int_{(\bar{x}, \bar{H}, \bar{u})}\phi(\bar{x}-x(t), \bar{H}, \bar{u})\tilde{\nu}(t, E, \bar{x}, \bar{H}, x(t), d).$ Compared to other wireless technologies, mmWave may generate less interference because of reduced and optimized angles. However, interference may still occur when several users and blocking objects fall within the same angle, as depicted in Figure 9. The success probability $\mathbb{P}(\mbox{SINR}_i \geq \beta_i)$ from position $x_i(t)$ to destination $d_i$, for both LoS and non-LoS, can then be derived. The quality-of-experience of users can be expressed as a function of the sectorized interference field, the satisfaction level, and user-centric subjective measures such as MOS (mean opinion score) values.
Application 4 (Distributed Power Networks (DIPONET)). Distributed power is power generated at or near the point of use. This includes technologies that supply both electric power and mechanical power. The rise of distributed power is also being driven by the ability of distributed power systems to overcome the constraints of energy needs and of transmission and distribution lines. Mean-field game theoretic applications to the power grid can be found in [93,113,114,115,116,117,118,119,120,121,122,123,124]. A prosumer (producer-consumer) is a user that not only consumes electricity but can also produce and store electricity. Based on the forecasted demand, each operator determines its production quantity and its mismatch cost, and engages in an auction mechanism on the prosumer market. The performance index is $ L_j(s_j, e_j)= l_{jT}(e(T)) +\int_0^T l_{j}(D_j(t)-S_j(t)) + \frac{\rho}{2}\sum_{k}s^2_{jk}(t)\ dt.$ Each producer aims to find the optimal production strategies:
$ \left\{\begin{array}{l} \inf\limits_{s_j, e_j} L_j(s_j, e_j, T), \\ \frac{d}{dt}e_{jk}(t)=c_{jk}(t)-s_{jk}(t), \\ c_{jk}(t)\geq 0, \quad s_{jk}(t)\in [0, \bar{s}_{jk}], \quad \forall\, j, k, t, \\ s_{jk}(w)=0 \ \mbox{if } w \ \mbox{is a starting time of a maintenance period, } \\ e_{j, k}(0) \ \mbox{given, } \end{array}\right. $
where $D_j(t)$ is the demand at time $t$, and $l_j(D_j(t)-S(t))$ denotes the instant loss, with $S(t)=S_{producer}(t)+S_{prosumer}(t)$ and $S_{producer}(t)=\sum_{j=1}^n s_j(t)=\sum_{j=1}^n \sum_{k=1}^{K_j}s_{j, k}(t), $ where $s_{j, k}(t)$ is the production rate of plant/generator $k$ of $j$ at time $t$ and $K_j$ is the total number of power plants of $j.$ The loss $l_j$ is assumed to be strictly convex. The stock of energy at time $t$ is given by the classical motion $\frac{d}{dt}e_{jk}, $ where $c_{jk}(t)$ is the maintenance cost of plant/generator $k$ of $j$ when it is in the maintenance phase. The optimality equation of the problem is given by the Hamilton-Jacobi-Bellman equation:
$ \left\{\begin{array}{l} \partial_t v_j(t, e_j)+H_j(D_j(t), \partial_{e_j}v_j(t, e_j))=0, \quad t<T, \\ v_j(T, e_j)=l_{jT}(e_j), \end{array}\right. $ | (11)
where the Hamiltonian function $H_j$ is
$ H_j(D_j, y_j)=\inf\limits_{s_j}\left[l_j(D_j-S_j)+\frac{\rho}{2}\sum\limits_{k}s^2_{jk}+\sum\limits_{k}(c_{jk}-s_{jk})y_{jk}\right] $ | (12)
The first-order interior optimality condition yields $-l'_{j}(D_j-S_j)-y_{jk}+\rho s_{jk}=0.$ Summing over $k$ shows that the total production quantity $S_j^*$ solves $-K_j l'_{j}(D_j-S_j) -\sum_{k=1}^{K_j}y_{jk}+\rho S_j=0.$ The optimal supply of power plant $k$ is $ s_{jk}^*= \min(\bar{s}_{jk}, \frac{l'_{j}(D_j-S^*_j)+y_{jk}}{\rho}). $ The solution of the partial differential equation (11) can be obtained explicitly and is given by the Hopf-Lax formula:
$ v_{j}(t, e_j)=\inf\limits_{y\in \mathbb{R}^{K_j}} \big\{ l_{jT}(y)+ (T-t)H^*_j\big (D_j, \frac {e_j-y}{T-t}\big)\big \}, $ | (13)
where $H^*_j$ is the Legendre transform of $H_j, $ given by
$ H^*_j(D_j, a)=l_{j}\left( D_j-\frac{1}{\rho}\sum\limits_{k}a_{jk}-\frac{l'_{j}( D_j-S^*_j)}{\rho}\right)+ \frac{\rho}{2}\sum\limits_{k}a^2_{jk}+\sum\limits_{k}c_{jk}a_{jk}. $
Note that (13) provides an explicit solution to the Demand-Supply matching problem between power plants of prosumer $j$ and this holds for arbitrary number of prosumers and power stations.
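As a numerical sketch of the interior optimality condition (under the assumed quadratic loss $l_j(z)=z^2/2$, so that $l_j'(z)=z$), the total quantity $S_j^*$ can be solved in closed form and the per-plant supplies capped at their capacities; all parameter values below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch for one producer j with quadratic loss l(z) = z^2/2 (assumed),
# solving -K l'(D - S) - sum_k y_k + rho * S = 0 for S*, then capping each
# plant at its capacity sbar_k (the interior solution, clipped).
rho, D = 0.5, 10.0
y = np.array([0.1, 0.3, 0.2])       # adjoint values y_jk (assumed)
sbar = np.array([4.0, 3.0, 5.0])    # plant capacities sbar_jk (assumed)
K = len(y)

S_star = (K * D + y.sum()) / (rho + K)               # total production quantity
s_star = np.minimum(sbar, ((D - S_star) + y) / rho)  # per-plant optimal supply
print(S_star, s_star)
```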
The mean-field equilibrium is obtained as a fixed-point equation involving $S^*$ and $D^*.$ When $ l'_{j}$ is continuous and preserves the production domain $[0, \bar{s}], $ the existence of such a solution is guaranteed by the Brouwer fixed-point theorem. One can use higher-order fast mean-field learning to learn and compute such a mean-field equilibrium. Figure 10 illustrates the optimal supply based on an estimated demand curve. Figure 11 represents an allocation of the producer with two power stations.
This section provides applications of MFTG in computer engineering. It starts with an application of MFTG with a finite number of states and actions and then moves to continuous state-action spaces.
Application 5 (Virus Spread over Networks). We study malware propagation over computer networks, where the nodes interact through network-based opportunistic meetings (Figure 12 and Table 4). The security level of the network is measured as a function of some key control parameters: acceptance/rejection of a meeting, opening/not opening a suspicious e-mail, file, or packet. We model the propagation of the virus in the network as a sort of epidemic process on a random graph of opportunistic connections [155]. A computer/node can randomly receive, online, infected or non-infected data from other computers.
**Table 4.** Transitions of the virus spread model: rates, occupancy changes, action sets and propagation contributions.

| Case | Transition proba. $(\theta, \theta' \in \{1, 2\})$ | $M^n_{\theta}(t+1)-M^n_{\theta}(t)$ | Actions | Propagation |
| --- | --- | --- | --- | --- |
| ${D} \xrightarrow{\delta_{D}} {H}$ | $D^n_\theta(t) \delta_{D}$ | $(-1, 0, 1)/n$ | singleton set | $-1/n$ |
| $2 D \xrightarrow{\lambda} 2 C$ | $D^n_{\theta}(t) \delta_m ^2 \lambda (D^n_{\theta}(t)-\frac{1}{n})$ | $(-2, 2, 0)/n$ | $\{m, \bar{m}\}$ | $0$ |
| $D_{\theta}+D_{\theta'} \xrightarrow{\lambda} 2 C$ | $D^n_{\theta}(t) \delta_m ^2 \lambda (D^n_{\theta'}(t)-\frac{1}{n}\mathbb{1}_{\{\theta=\theta'\}})$ | $(-2, 2, 0)/n$ | $\{m, \bar{m}\}$ | $0$ |
| ${C} \xrightarrow{\delta_{C}} {H}$ | $C^n_\theta(t) \delta_C$ | $(0, -1, 1)/n$ | singleton set | $-1/n$ |
| ${C} \xrightarrow{\frac{\beta}{q_{\theta} + D^n_{\theta}(t)}} D$ | $C^n_\theta(t) \beta \frac{D^n_{\theta}(t)}{q_{\theta} + D^n_{\theta}(t)}$ | $(-1, 1, 0)/n$ | singleton set | $0$ |
| ${H} \xrightarrow{\delta_{H}+(1-\delta_{H})C^n} C$ | $H^n(t) [\delta_{H}+(1-\delta_{H})C^n(t)]$ | $(0, 1, -1)/n$ | singleton set | $1/n$ |
| ${H} \xrightarrow{\eta}D$ | $H^n(t)(\delta_e \delta_{Sm} + \delta_m \eta D^n(t))$ | $(1, 0, -1)/n$ | $\{o, \bar{o}, m, \bar{m}\}$ | $1/n$ |
An infected computer can be in two states: dormant or fully infected. The non-infected computers are susceptible to be approached by viruses coming from infected ones. The possible states are therefore denoted as Dormant (D), Infected/Corrupt (C) and Susceptible/Honest (H). The set of types is $\{1, 2\},$ denoted generically as $\theta, \theta'.$ For each type the state may be different, except for the honest state, which is common to both regimes of the network. The network size is $n\geq 1.$ The partition of the nodes at time step $t$ satisfies $n=D_{\theta}(t)+D_{\theta'}(t)+C_{\theta}(t)+C_{\theta'}(t)+H(t).$
The vector of state frequencies is called the occupancy measure of the population and is denoted by $M^n(t) = (D_{\theta}(t)/n, D_{\theta'}(t)/n, C_{\theta}(t)/n, C_{\theta'}(t)/n, H(t)/n) =: (D^n_{\theta}(t), D^n_{\theta'}(t), C^n_{\theta}(t), C^n_{\theta'}(t), H^n(t)).$ $M^n(.)$ is a random process and its limiting measure corresponds to the mean-field term. The goal is to understand the impact of the control actions on combatting virus spread, i.e., the minimization of the proportion $O^n(t):= 1-H^n(t)$. The interaction is simulated using the following rules:
Changes from Dormant states: A node in the (transient) dormant state with type $\theta$ may become honest with probability $\delta_{D}\in (0, 1).$ A dormant node with type $\theta$ may opportunistically meet another dormant node of type $\theta'$, and both become active (corrupt). This occurs with probability proportional to the frequency of the other dormant agents at time $t$: for type $\theta, $ the probability is $\lambda (D^n_{\theta'}(t)-\frac{1}{n}\mathbb{1}_{\{ \theta=\theta'\}})$. Note that a dormant node can decide to contact the other dormant node or not, so there are two possible actions: $\{m, \bar{m}\}$ (to meet or not to meet). These events are modeled with a Bernoulli random variable with success (meeting) probability $\delta_m$, which represents $u(m| D, {\theta}).$
Changes from Corrupt states: A corrupt node may become honest with probability $\delta_{C}$. A corrupt node of type $\theta$ may become dormant with probability $\beta \frac{D^n_{\theta}(t)}{q_{\theta} + D^n_{\theta}(t)}$ at time $t$. Here it is assumed that, at high concentrations of dormant nodes, each corrupt node infects at most a certain maximum number of dormant nodes per time step. This reflects the fact that a corrupt node has limited power, domination and capabilities. The parameter $0 \leq \beta \leq 1$ can be interpreted as a maximum contamination rate. The parameter $0 \leq q_{\theta} \leq 1$ is the dormant node density at which the infection rate reaches half of its maximum: at $D^n_{\theta}=q_{\theta}$ the probability above equals $\beta/2$.
Changes from Susceptible/Honest states: An honest node may become infected with probability $\delta_{H}+(1-\delta_{H})C^n (t).$ An honest node may become dormant in two ways. First, $\delta_{Sm}$ is the probability of getting corrupted by the network representative node. In this case, the honest node can decide to share or not, so there are two possible actions: $\{o, \bar{o}\}$. This case is modeled using a coin toss with probability $\delta_e\in (0, 1).$ Second, $\eta (D^n_{\theta}(t)+D^n_{\theta'}(t))$ models the probability of meeting a dormant node, where $\eta \in (0, 1).$ In this case, the dormant node can decide to contact the honest node or not, and this is modeled analogously to the other two cases.
The payoff function is the opposite of the infection level. Each transition described above has a certain contribution to the infection level of the society: $0$ if no corrupt or dormant node becomes honest, $-1/n$ if a node becomes honest, and $+1/n$ if a node becomes infected (D or C). Table 4 lists the transition probabilities, the contributions to $M^n(t+1)-M^n(t)$, the action sets, and the contributions to the propagation in the network.
The drift, that is, the expected change of $M^n$ in one time step given the current state of the system, is $f^n(m) = n \mathbb{E} (M^n(t+1)-M^n(t)|M^n(t)=m),$ which can be expressed as:
$ f^n(m) =\left(\begin{array}{c} -d_{\theta}\delta_D-2d_{\theta}\delta_m^2\lambda\frac{nd_{\theta}-1}{n}-c_{\theta}\beta\frac{d_{\theta}}{q_{\theta}+d_{\theta}}+h(\delta_e\delta_{Sm}+\delta_m\eta d)\\ -d_{\theta'}\delta_D-2d_{\theta'}\delta_m^2\lambda\frac{nd_{\theta'}-1}{n}-c_{\theta'}\beta\frac{d_{\theta'}}{q_{\theta'}+d_{\theta'}}+h(\delta_e\delta_{Sm}+\delta_m\eta d)\\ 2d_{\theta}\delta_m^2\lambda\frac{nd_{\theta}-1}{n}-c_{\theta}\delta_C+c_{\theta}\beta\frac{d_{\theta}}{q_{\theta}+d_{\theta}}+h(\delta_H+(1-\delta_H)c)\\ 2d_{\theta'}\delta_m^2\lambda\frac{nd_{\theta'}-1}{n}-c_{\theta'}\delta_C+c_{\theta'}\beta\frac{d_{\theta'}}{q_{\theta'}+d_{\theta'}}+h(\delta_H+(1-\delta_H)c)\\ d\delta_D+c\delta_C-2h(\delta_H+(1-\delta_H)c)-2h(\delta_e\delta_{Sm}+\delta_m\eta d) \end{array}\right) $ |
where $m = (d_{\theta}, d_{\theta'}, c_{\theta}, c_{\theta'}, h)$, $d =d_{\theta}+d_{\theta'}$ and $c =c_{\theta}+c_{\theta'}$. Then the limit of $f^n(m)$ is
$ f(m)=\left(\begin{array}{c} -d_{\theta}\delta_D-2\lambda d_{\theta}^2\delta_m^2-c_{\theta}\beta\frac{d_{\theta}}{q_{\theta}+d_{\theta}}+h(\delta_e\delta_{Sm}+\delta_m\eta d)\\ -d_{\theta'}\delta_D-2\lambda d_{\theta'}^2\delta_m^2-c_{\theta'}\beta\frac{d_{\theta'}}{q_{\theta'}+d_{\theta'}}+h(\delta_e\delta_{Sm}+\delta_m\eta d)\\ 2\lambda d_{\theta}^2\delta_m^2-c_{\theta}\delta_C+c_{\theta}\beta\frac{d_{\theta}}{q_{\theta}+d_{\theta}}+h(\delta_H+(1-\delta_H)c)\\ 2\lambda d_{\theta'}^2\delta_m^2-c_{\theta'}\delta_C+c_{\theta'}\beta\frac{d_{\theta'}}{q_{\theta'}+d_{\theta'}}+h(\delta_H+(1-\delta_H)c)\\ d\delta_D+c\delta_C-2h(\delta_H+(1-\delta_H)c)-2h(\delta_e\delta_{Sm}+\delta_m\eta d) \end{array}\right) $ |
Notice that the sum of all the components of $f(m)$ is zero. Furthermore, if one of the components $m_j$ of $m= (d_{\theta}, d_{\theta'}, c_{\theta}, c_{\theta'}, h)$ is zero, then the corresponding drift component satisfies $f_j(m)\geq 0.$ As a consequence, in the absence of a birth-and-death process, the $4$-dimensional simplex is forward invariant: if $m(0)$ is in the simplex, then the trajectory $m(t)$ stays in the simplex for all later times.
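The forward-invariance and convergence behavior of the limiting dynamics $\dot{m}=f(m)$ can be checked numerically. The following sketch (Python; all rate values are hypothetical) integrates the five-component drift above for fixed control parameters $(\delta_m, \delta_e)$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameter values; the drift f follows the limit expression above,
# with a common saturation parameter q for both types.
dD, dC, dH = 0.1, 0.2, 0.05            # delta_D, delta_C, delta_H
lam, beta, q = 0.3, 0.4, 0.5
dm, de, dSm, eta = 0.9, 0.5, 0.2, 0.3  # delta_m, delta_e, delta_Sm, eta

def f(t, m):
    d1, d2, c1, c2, h = m
    d, c = d1 + d2, c1 + c2
    inD = de * dSm + dm * eta * d      # H -> D rate factor
    inC = dH + (1 - dH) * c            # H -> C rate factor
    return [
        -d1*dD - 2*lam*d1**2*dm**2 - c1*beta*d1/(q + d1) + h*inD,
        -d2*dD - 2*lam*d2**2*dm**2 - c2*beta*d2/(q + d2) + h*inD,
         2*lam*d1**2*dm**2 - c1*dC + c1*beta*d1/(q + d1) + h*inC,
         2*lam*d2**2*dm**2 - c2*dC + c2*beta*d2/(q + d2) + h*inC,
         d*dD + c*dC - 2*h*inC - 2*h*inD,
    ]

m0 = [0.1, 0.1, 0.3, 0.3, 0.2]          # initial occupancy measure (sums to 1)
sol = solve_ivp(f, (0.0, 50.0), m0, max_step=0.1)
print(sol.y[:, -1])                     # approximate steady state
```

The components stay non-negative and sum to one up to integration error, in line with the forward-invariance property stated above.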
We minimize the proportion of nodes with states $C$ or $D$ by means of the control $u(\cdot|\cdot), $ i.e., by adjusting $(\delta_m, \delta_e)\in [0, 1]^2.$ Since $o(t)=c_1+c_2+d_1+d_2=1-h(t), $ minimizing $o(t)$ is equivalent to maximizing the proportion of susceptible nodes in the population. Therefore the optimization problem becomes
$ \left\{ \begin{array}{l} \sup\limits_{\delta_e, \delta_m}\ h(T)+\int_0^T h(t)\ dt\\ \dot{m}=f(m), \ m(0)=m_0,\\ \text{where } m=(c_1, c_2, d_1, d_2, h). \end{array} \right. $ |
The Hamiltonian is $\hat{H}= h+ f_1p_1+f_2p_2+f_3 p_3+f_4p_4+f_5p_5.$ This is a twice continuously differentiable function in $m, $ and $\partial_{m_j} \hat{H}=\sum_{i=1}^5 [\partial_{m_j}f_i] p_i $ for $j\leq 4. $ The optimal control strategies at time $t$ are the ones that maximize $\hat{H}:$
$ \left\{ \begin{array}{l} \arg\max\limits_{\delta_e, \delta_m}\ \hat{H}\\ \dot{m}=f(m), \ m(0)=m_0\\ \dot{p}_j=-\sum\limits_{i=1}^5 [\partial_{m_j}f_i]p_i, \ j\leq 4, \ t<T\\ \dot{p}_5=-1-\sum\limits_{i=1}^5 [\partial_{h}f_i]p_i, \ t<T\\ p(T)=[0, 0, 0, 0, 1]. \end{array} \right. $ |
Let $S(t)$ be the random variable describing the state at time $t$ of a generic individual, and assume that this individual is in state $s$ at time $t.$ Then, as $n$ goes to infinity, $S(t + \frac{1}{n})$ depends on the previous values $(S(t') : t'\leq t)$ only through the current state $s$ and the mean-field. The payoff of a generic individual is defined as follows: $p_{\theta}(s, u, m)=0$ if the individual state $s$ is different from $H, $ and $1$ if $s=H.$ By doing so, each individual tries to adjust its own trajectory. People in the honest state will accept fewer meetings and will set their meeting rate $\delta_m$ to be minimal, while individuals with state different from $H$ will try to enter $H$ as soon as possible. As in a classical communicating Markov chain, this is the entry time to state $H.$
Figure 13 reports the results of the simulation with the following three starting points: $(d, c)= (0.2, 0.6)$, $(d, c)= (1/3, 1/3)$ and $(d, c)= (0.2, 0).$ In all three cases, the system converges to the same steady state, which is around $(d, c)= (0.38, 0.6).$ Figure 14 plots the reward (proportion of honest nodes) as a function of time for two different control parameters, $\delta_m =0.9$ and $\delta_m =0.1.$ We observe that the reward is greater for $\delta_m =0.1$ than for $\delta_m =0.9.$
The primary advantage of network models is their ability to capture complex individual-level structure in a simple framework. To specify all the connections within a network, we can form a matrix from all the interaction strengths, which we expect to be sparse, with the majority of values being zero. Usually, for simplicity, two individuals (or populations) are either assumed to be connected with a fixed interaction strength or unconnected. In such cases, the network of contacts is specified by a graph matrix $G$, where $G_{ij}$ is 1 if individuals $i$ and $j$ are connected, and 0 otherwise. A connection is a relationship between two nodes: it may represent an Internet, social-network or physical connection, and the two nodes need not be close in terms of location. The status of a node is influenced by the status of its connections following the rules specified above. The resulting graph-based mean-field dynamics is illustrated in Figure 15.
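The graph-based rules can be sketched in a few lines. The snippet below (Python) is a simplified two-state (honest/infected) variant of the model above; the connection density and the infection/recovery numbers are hypothetical stand-ins, and each node updates its status from the fraction of its infected neighbors encoded in $G$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Sparse symmetric 0/1 graph matrix G: G[i, j] = 1 if i and j are connected.
G = (rng.random((n, n)) < 0.05).astype(float)
G = np.triu(G, 1)
G = G + G.T

# Each node's infection pressure depends only on its neighbors' states,
# a graph-based analogue of the mean-field coupling (illustrative rates only).
state = rng.choice([0, 1], size=n, p=[0.8, 0.2])   # 0 = honest, 1 = infected
for _ in range(100):
    neighbor_frac = G @ state / np.maximum(G.sum(axis=1), 1)
    p_infect = 0.05 + 0.5 * neighbor_frac          # hypothetical rates
    p_recover = 0.1
    u = rng.random(n)
    state = np.where(state == 0, (u < p_infect).astype(int),
                     (u >= p_recover).astype(int))
print(state.mean())   # fraction of infected nodes at the end
```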
Application 6 (Cloud Networks). Resource sharing solutions are very important for data centers, as they are required and implemented at different layers of cloud networks [95,96,97]. The resource sharing problem can be formulated as a strategic decision-making problem. Many resources may be wasted if the cloud users consider purely economic renting. Therefore a careful system design is required when several clients interact. Price design can significantly improve the resource usage efficiency of large cloud networks. We denote such a game by $\mathcal{G}_{n}, $ where $n$ is the number of clients. The action space of every user is $\mathcal{U}=\mathbb{R}_+$, which is a convex set, i.e., each user $i$ chooses an action $u_i$ that belongs to the set $\mathcal{U}.$ An action may represent a certain demand. All the actions together determine an outcome. Let $p_n$ be the unit price of cloud resource usage by the clients. Then, the payoff of user $i$ is given by
$ r_i(x, u_1, \ldots, u_n)=c_n(x)\frac{h(u_i)}{\sum\limits_{j=1}^n h(u_j)}-p_n(x) u_i, $ | (14) |
if $\sum_{j=1}^n h(u_j)>0$ and zero otherwise. The structure of the payoff function $r_i(x, u_1, \ldots, u_n)$ for user $i$ shows that it is a percentage of allocated capacity minus the cost for using that capacity. Here, $c_n(x)$ represents the value of the available resources, and $h$ is a positive and nondecreasing function with $h(0)=0.$ We fix the function $h$ to be $h(u)=u^{\alpha},$ where $\alpha>0$ denotes a certain return index. $x$ is the state of the cloud network, a random variable describing the availability of the servers. The cloud game $\mathcal{G}_n$ is given by the collection $(\mathcal{X}, \mathcal{N}, \mathcal{U}, (r_i)_{i\in\mathcal{I}}),$ where $\mathcal{I}=\{1, \ldots, n\}, \ n\geq 2, $ is the set of potential users. The next Proposition provides a closed-form expression of the Nash equilibrium of the one-shot game $\mathcal{G}_n$ for a fixed state $x$ such that $c_n(x)>0, p_n(x)>0, $ and for some range of the parameter $\alpha.$ It also provides the optimal price $p_n^*$ such that no resource is wasted in equilibrium.
Proposition 5. By direct computation, the following results hold:
(1) The resource sharing game $\mathcal{G}_n$ is a symmetric game. All the clients have symmetric strategies in equilibrium whenever an equilibrium exists.
(2) For $0\leq \alpha \leq 1, $ and $x\in \mathcal{X}, $ the payoff $r_i$ is concave (outside the origin) with respect to own-action $u_i.$ The best response $BR_i(u_{-i})$ is strictly positive and is given by the root of
$ z^{(\alpha-1)/2} (\frac{\alpha c_n(x)}{np_n(x)}G)^{1/2}-\frac{z^{\alpha}}{n}-G=0, \ \ G\triangleq\frac{1}{n}\sum\limits_{j\neq i} u_j^{\alpha} $ |
where $z\triangleq u_i$ and there is a unique equilibrium (hence a symmetric one) given by $ \left(z^{\alpha-1}\frac{\alpha c_n(x)}{n p_n(x)}\frac{n-1}{n} z^{\alpha}\right)^{\frac{1}{2}} -\frac{z^{\alpha}}{n}-\frac{n-1}{n}z^{\alpha}=0, $ i.e.,
$u_{NE}^*(x)=\alpha \frac{(n-1)c_n(x)}{n^2p_n(x)}.$ |
It follows that the total demand $n u_{NE}^*(x)$ at equilibrium is less than $\frac{c_n(x)}{p_n(x)},$ which means that some resources are wasted.
The equilibrium payoff is ${r}_i(x, a_{NE}^*)=u_i p_n(x)\left[\frac{G+\frac{u^{\alpha}_i}{n}}{\alpha G}-1\right]$ which is positive for $\alpha\leq 1.$
(3) For $\alpha>1, $ the activity (participation) of user $i$ depends mainly on the aggregate of the others: $u_i^*>0$ only if $G\leq G_{*},$ and the number of active clients should be less than $\frac{\alpha}{\alpha-1}.$ If $n>\frac{\alpha}{\alpha-1}$ then $BR_i=0.$
(4) With a participation constraint, the payoff at equilibrium (whenever it exists) is at least $0.$
(5) By choosing the price $p_n^*=\alpha\frac{(n-1)}{n} < \alpha$ one gets that the total demand at equilibrium is exactly the available capacity of the cloud. Thus, pricing design can improve resource sharing efficiency in the cloud. Interestingly, as $n$ grows, the optimal pricing converges to $\alpha.$
We say that the cloud renting game is efficient if no resource is wasted, i.e., the equilibrium demand is exactly $c_n(x).$ Hence, the efficiency ratio is $\frac{n u^*_{NE}}{c_n(x)}.$ As we can see from item (5) of Proposition 5, the efficiency ratio goes to $1$ by setting the price to $p_n^*.$ This type of efficiency loss is due to selfishness and has been widely studied in the literature on mechanism design and auction theory. Note that the equilibrium demand increases with $\alpha$, decreases with the charged price and increases with the capacity per user. The equilibrium payoff is positive, and if $\alpha\leq 1$ each user will participate in an equilibrium. In the Nash equilibrium the optimal pricing $p_n^*$ depends on the number of active clients in the cloud and on the value of $\alpha.$ When the number of active clients varies (for example, due to new entries or exits in the cloud), a new price needs to be set up, which is not convenient.
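The closed-form expressions of Proposition 5 are easy to check numerically. The following sketch (Python; the values of $\alpha$, $n$ and $c_n(x)$ are arbitrary) computes the equilibrium demand for a posted price and verifies that the optimal price $p_n^*=\alpha(n-1)/n$ drives the efficiency ratio to one:

```python
def equilibrium_demand(alpha, n, c, p):
    """Symmetric Nash demand u* = alpha * (n-1) * c / (n^2 * p)."""
    return alpha * (n - 1) * c / (n**2 * p)

alpha, n, c = 0.8, 10, 100.0
p = 1.0                                   # arbitrary posted price
u_star = equilibrium_demand(alpha, n, c, p)
print("total demand:", n * u_star, "vs. c/p:", c / p)  # demand < c/p: waste

p_star = alpha * (n - 1) / n              # optimal price from Proposition 5
u_star = equilibrium_demand(alpha, n, c, p_star)
print("efficiency ratio:", n * u_star / c)             # equals 1: no waste
```

The first run shows the wasted gap $n u^*_{NE} < c_n(x)/p_n(x)$; the second shows the efficiency ratio reaching one under $p_n^*$.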
Application 7 (Synchronization and Consensus). Consider coupled oscillator dynamics with one control parameter per agent:
$d\theta_i=[\omega_i+\sum\limits_{j=1}^n K_{ij}(\theta) \sin(\theta_j-\theta_i) + u_i]dt +\sigma dW_i(t), $ |
where $\theta_i$ is the phase of oscillator $i$, $\omega_i$ is the natural frequency of oscillator $i$, $n$ is the total number of oscillators in the system and $K$ is a coupling interaction term. The objective here is to explore phase transitions and self-organization in large-population dynamic systems. We explore the mean-field regime of the dynamical mean-field systems and explain how consensus and collective motion emerge from local interactions. These dynamics have interesting applications in multi-robot coordination. Figure 16 presents a Kuramoto-based synchronization scheme [156]. The uncontrolled Kuramoto model can lead to multiple clusters of alignment. Using a mean-field control law, one can drive the trajectories (phases) towards a consensus, as illustrated in Figure 17, which represents the behaviors for $u_i=-\omega_i+ \eta_i \sin\left(\frac{1}{n}\sum_{j=1}^n \theta_j-\theta_i\right).$ This type of behavior is useful in mobile robot rendezvous problems in which each agent needs to move towards a common point (where the rendezvous will take place).
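A short simulation sketch of this consensus mechanism is given below (Python; the gains $K$, $\eta$ and the noise level are hypothetical, and the general coupling $K_{ij}(\theta)$ is specialized to uniform all-to-all coupling). It applies the control law $u_i=-\omega_i+\eta\sin(\frac{1}{n}\sum_j\theta_j-\theta_i)$ with an Euler-Maruyama discretization:

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, dt = 20, 10.0, 0.01
sigma, K, eta = 0.05, 0.5, 2.0            # hypothetical coupling/control gains
theta = rng.uniform(0.0, 2.0 * np.pi, n)  # initial phases
omega = rng.normal(0.0, 1.0, n)           # natural frequencies

for _ in range(int(T / dt)):
    mean_phase = theta.mean()
    # control law from the text: u_i = -omega_i + eta*sin(mean phase - theta_i)
    u = -omega + eta * np.sin(mean_phase - theta)
    # uniform all-to-all Kuramoto coupling (K/n) * sum_j sin(theta_j - theta_i)
    coupling = K * np.mean(np.sin(theta[None, :] - theta[:, None]), axis=1)
    theta += (omega + coupling + u) * dt + sigma * np.sqrt(dt) * rng.normal(size=n)

print(np.std(theta))   # a small spread indicates phase consensus
```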
We now provide another relevant application of the Kuramoto model in a convoy protection scenario with mobile car-like robots. The goal of the robots is to keep protecting the convoy by occupying the space as the convoy moves. Mean-field-type control helps to balance between energy, placement error and risk. The authors in [157] have shown that the Kuramoto model modified with a phase shift of $\frac{\pi}{2}$ radians can be used in the convoy protection scenario given in Figure 18. In this scenario, we want the agents to follow the movement of the convoy while spreading out along a circular perimeter. The mean-field-type control law allows the agents to be positioned equally on a circle and to self-organize the distribution pattern once new agents are added into the network for protecting the convoy and occupying the space. Note that the re-configuration of the multi-robot team is done in a distributed way over the circle with center $c$ and radius $r.$ Each protecting robot is a rear-wheel drive, front-wheel steerable car-like mobile robot; the car-like robot to be controlled is given in Figure 19. The kinematic parameters of mobile robot $i$ are $(p_i(t), v_i(t), \theta_i(t), \beta_i(t), l_i),$ where $p_i(t)=(x_{i, 1}(t), x_{i, 2}(t))$ is the cartesian coordinate (position) of robot $i$ located at the mid-point of the rear-wheel axle, $v_i(t)$ is the translational driving speed, $\theta_i(t)$ is the orientation, $\beta_i(t)$ is the steering angle of the front wheels and $l_i$ is the distance between the front and rear wheel axles. The goal is to control the robot to a desired orbit while spreading out. One can control the velocity $v_i$ through the acceleration and the steering angle $\beta_i.$ The evolution of the center point $c$ and the radius $r$ are given by the drift functions $b_c(t), b_r(t).$ The connectivity in the circular graph for agent $i$ is limited to two other agents: $i-1$ and $i+1$ modulo $n.$ Each agent $i$ is influenced only by its neighboring agents. The instantaneous cost is
$ L_{i}(t)= \epsilon_1[\cos(\theta_{i+1}-\theta_{i})+\cos(\theta_i-\theta_{i-1})]+\epsilon_2\left(\theta_i-\frac{\pi}{2}- \tan^{-1}\left(\frac{x_{i, 1}-c_1}{x_{i, 2}-c_2}\right)\right).$ |
The first term, in brackets, says that agent $i$ should spread out from agents $i-1$ and $i+1.$ The second term represents the orientation synchronization. The terminal cost is of mean-field type and is given by
$ L_{i}(T)=\epsilon_3 \frac{|v_i|^2}{d(x_i, c)^r}+\epsilon_4 |d(x_i, c)-r|^2+\epsilon_5 var(v_i), $ |
representing a balance between the kinetic energy spent, the error adjustment for being on the new circle and the variance respectively.
The finite horizon cost functional of agent $i$ is $J_i(u, \beta)= L_{i}(T)+\int_0^T L_{i}(t) dt.$ Let $\mathcal{C}(c(0), r(0))$ be the circle with center $c(0)$ and radius $r(0).$ The best-response problem of agent $i$ is
$ \left\{ \begin{array}{l} \sup\limits_{u_i, \beta_i}\ -\mathbb{E}J_i(u, \beta)\\ \dot{x}_{i, 1}=v_i\cos\theta_i, \ \dot{x}_{i, 2}=v_i\sin\theta_i,\\ d\theta_i=\frac{v_i\tan\beta_i}{l_i}dt+\sigma\frac{\tan\beta_i}{l_i}dW_i(t),\\ v_i=\frac{d(x_i, c)}{r}\left[\omega_i+\sum\limits_{j=1}^n K_{ij}(\theta)\cos(\theta_j-\theta_i)+u_i\right]\\ x_i(0)\in\mathcal{C}(c(0), r(0))\subset\mathbb{R}^2 \end{array} \right. $ |
This is a mean-field-type optimization and the optimality system is easily derived from the stochastic maximum principle.
Application 8 (Energy-Efficient Buildings). Nowadays a large amount of the electricity consumed in buildings is wasted. A major reason for this wastage is inefficiencies in the building technologies, particularly in operating the HVAC (heating, ventilation and air conditioning) systems. These inefficiencies are in turn caused by the manner in which HVAC systems are currently operated: the temperature in each zone is controlled by a local controller, without regard to the effect that other zones may have on it or the effect it may have on others. Substantial improvement may be possible if inter-zone interactions are taken into account in designing control laws for individual zones [125,126,127,128,129]. The room/zone temperature evolution is a controlled stochastic process
$dT_i= [\epsilon_1(T_{ext}-T_i)+ \sum\limits_{j\in N_i} \epsilon_{2ij}(T_{j}-T_i) + \epsilon_3 u_i (T_{ref}-T_i)]dt+\sigma dW_i, $ |
where $\epsilon_1, \epsilon_{2ij}, \epsilon_3$ are positive real numbers. The control action $u_i$ in room $i$ depends on the price of electricity $p(demand, supply, location)$. The cost for driving to the comfort temperature zone (Figure 20) is $(T_i-T_{i, comfort})^2+{var(T_i-T_{i, comfort})}.$ The payoff of a consumer is a tradeoff between the comfort temperature and the electricity cost $u_i p.$ The instantaneous total cost of consumer $i$ is
$L_i(t)= \underbrace{u_ip(.)}_\text{energy price}+ \overbrace{(T_i-T_{i, comfort})^2}^\text{deviation to the comfort zone} +\underbrace{var(T_i-T_{i, comfort})}_\text{risk}.$ |
Within the time horizon $[0, \tau], \ \tau>0, $ consumer $i$ minimizes in $u_i:$
$var(T_i(\tau)-T_{i, comfort})+\mathbb{E}\int_0^{\tau} L_i(t) dt.$ |
However, the electricity price $p(.)$ depends on the demand $D=\int_I consumption(i)\, m_1(t, di)$ and the supply $S=\int_J supply(j)\, m_2(t, dj),$ where $m_1(t, .)$ is the population mean-field of consumers, i.e., the consumer distribution at time $t$ (note that $m_1$ is an unnormalized measure), and $m_2$ is the distribution of suppliers. The building is served by a producer whose remaining energy dynamics is
$de_{jk}(t)=[c_{jk}(t)\mathbb{1}_{\{ k \in A^c_j(t)\}}-s_{jk}(t)]dt+\sigma dW_{jk}, $ |
The instant payoff of producer $j$ is its revenue minus its cost. The cost is decomposed into the cost due to the mismatch between supply and demand and the production cost. The payoff is
$r_j=\underbrace{q_j p(D, S)}_\text{revenue}- \overbrace{var(D_j-S_j)}^\text{mismatch cost}- \underbrace{c(q_j)}_\text{production cost}.$ |
Producer $j$ solves $\max_{q_j} \mathbb{E} \int_0^{\tau} r_j dt$ subject to the production constraint above. Explicit solutions to both problems can be obtained using the framework developed in [132,134].
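The consumer-side dynamics is straightforward to simulate. The sketch below (Python; all coefficients, prices and temperatures are hypothetical, and the feedback rule is an illustrative stand-in for the optimal control, not the solution from [132,134]) integrates the coupled zone-temperature SDE with an Euler-Maruyama scheme and accumulates the instantaneous cost without the variance term:

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, dt = 5, 24.0, 0.01                     # 5 zones, 24 h horizon
e1, e3, sigma = 0.1, 0.5, 0.2                # epsilon_1, epsilon_3 (hypothetical)
e2 = 0.05 * (np.ones((n, n)) - np.eye(n))    # inter-zone couplings epsilon_2ij
T_ext, T_ref, T_comf = 5.0, 30.0, 21.0
Tz = np.full(n, 15.0)                        # initial zone temperatures
price = 1.0                                  # flat electricity price (hypothetical)
cost = 0.0

for _ in range(int(T / dt)):
    # illustrative feedback: heat harder the further the zone is below comfort
    u = np.clip((T_comf - Tz) / (T_ref - Tz + 1e-9), 0.0, 1.0)
    # drift: ambient leakage + inter-zone exchange + controlled heating
    drift = e1 * (T_ext - Tz) + e2 @ Tz - e2.sum(axis=1) * Tz + e3 * u * (T_ref - Tz)
    Tz += drift * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
    cost += (u * price + (Tz - T_comf) ** 2) * dt   # energy price + discomfort

print(Tz, cost.mean())
```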
Application 9 (Online Meeting). Group meetings online, even over video, are much different than sitting in a boardroom communicating face-to-face. But they have something in common: deciding to join the group meeting early or on time. In the context of online video group meetings, since the communication is over video, the opportunity for miscommunication is much higher, and thus one should pay close attention to how the group meeting is conducted. Each group member aims to heighten the quality of her online meetings by acting professionally and by signing in early or on time: nothing throws off a meeting worse than scheduling woes. This is widely observed for online group meetings in particular.
Scheduling and synchronization are probably the hardest jobs in these meetings. The fact that groups from different sites can log in to the meeting space at their convenience makes it easier to get meetings started on time. However, it does not mean that the meeting will start exactly at the scheduled time. The group members can decide to be at a convenient place early and prepare for the meeting to start, giving them time to settle down and get acquainted with the interface. We examine how agents decide when to join the group meeting in a basic setup. We consider several industrial and academic partners aiming to collaborate on a research development. The companies are located at different sites, and each company at each site has an appointed work package leader. In order to save on long business trips and hotel accommodation and to reduce jet-lag effects, the companies decided to organize an online meeting. After coordinating all the members' availabilities, a date and time is found and the meeting is initially scheduled to start at time $\bar{t}.$ Each member has the starting time in his schedule and calendar reminders, but in practice the online meeting only begins when a certain number $\bar{n}$ of representative group leaders and group members have connected online and are seated in their respective rooms. Thus, the effective starting time $T$ of the online meeting is unknown and people organize their behavior as a function of $(\bar{t}, \bar{n}, T).$
Each group member can move from her office to the meeting room (Figure 21). The dynamics of agent $i$ is simply given by $\dot{x}_i=u_i, $ where $x_i(0)\in D.$ Let $n(t)$ be the number of people arrived (and seated) in the room before $t.$ If the criterion is met (by all groups) before the initially scheduled time $\bar{t}$ of the meeting, the latter starts exactly at $\bar{t}$. If, on the other hand, the criterion is met at a later time, $T$ is determined by the self-consistency relation $ T=\inf\{ t\ | \ \ t\geq \bar{t}, \ n(t)\geq \bar{n} \}.$ The instantaneous cost function is $h(G_n(x_i)) \| u_i\|^2$ and the terminal cost is $c(t_h)=c_1[t_h-\bar{t}]_{+}+c_2[t_h-T]_{+}+c_3[T-{t}_h]_{+},$ where the $c_i$ are non-negative real numbers and $ t_h=\inf\{ t \ | \ x_i(t)\in MeetingRoom \}.$ Let $J(u)=c(t_h)+\int_0^{t_h} h(G_n(x_i)) \| u_i\|^2\ dt,$ where $h(G_n(x_i)) \| u_i\|^2$ quantifies a congestion-dependent kinetic energy spent to reach the meeting room of her group, $[T-{t}_h]_{+}$ quantifies the useless waiting time, $[t_h-T]_{+}$ quantifies the time missed from the beginning of the online meeting, and $[t_h-\bar{t}]_{+}$ quantifies the sensitivity to her reputation for being late at the meeting. Given the strategies $(u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n) $ of the other agents, the best response problem of $i$ is:
$ \left\{ \begin{array}{l} \sup\limits_{u_i}\ -J(u)\\ \dot{x}_i=u_i, \ x_i(0)\in D\subset\mathbb{R}^2\\ u_i=0 \ \text{over } \partial D\subset\mathbb{R}^2, \ u_i=k \ \text{at } Exits\subset\mathbb{R}^2 \end{array} \right. $ |
Even if $h(.)$ is constant, the agents interact because of a common term: the starting time of the online meeting $T, $ with $n(T)\geq \bar{n}.$ For this reason, the choices of the other agents matter. The best response of agent $i$ satisfies the Pontryagin maximum principle
$ \dot{p}_i=0, \ t<t^i_h, \qquad \dot{x}_i=u^*_i=\frac{p_i}{2}, \quad x_i(0)\in Building\subset\mathbb{R}^3. $ |
Hence, ${x}_i(t)=x_i(0)+ t \frac{p_i(t_h)}{2}$ arrives at position $x_{room}$ at time $t_h= 2 \frac{x_{room}-x_i(0)}{p_i(t_h)}.$ Thus, the optimal payoff of agent $i$ starting from $x$ at time $0$ is $ -c(t_h)-\int_0^{t_h} \frac{\|p(t_h)\|^2}{4} dt= -c(t_h)-t_h \frac{\|p(t_h)\|^2}{4}.$ The optimal payoff of agent $i$ starting from $x$ at time $t$ is $ -c(t_h)-(t_h-t) \frac{\|p(t_h)\|^2}{4},$ which is maximized for $-c'(t_h)+\frac{\|p(t_h)\|^2}{4}=0, $ i.e., $\|p(t_h)\|^2= 4 c'(t_h),$ hence $\|p(t_h)\|=2\sqrt{c'(t_h)}=\| v_x(t_h, x(t_h))\|.$ Knowing that the following two functions, $ \tilde{v}_1(x)= \langle x, p^*\rangle $ with $\| p^*\|_*=1$ and $ \tilde{v}_2(x)= c_2\pm \| x-y\| $ with $x\neq y, $ solve the Eikonal equation $\|\tilde{v}_x\|=1, $ one deduces an explicit solution of the Bellman equation: $v_t-\frac{\|v_x\|^2}{2}=0, \ \ v(t_h, x)=-c(t_h).$
Proposition 6. The tradeoff value to the meeting room starting from point $x$ at time $t$ is ${v}(t, x)= -2\sqrt{c'(t_h)} d(x(t), x_{room})-2(t_h-t)c'(t_h)-c(t_h).$
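Numerically, the arrival-time trade-off can be evaluated directly: with constant $h$, traveling a distance $d$ in time $t_h-t$ at constant speed costs $d^2/(t_h-t)$ in kinetic energy, so the agent minimizes $c(t_h)+d^2/(t_h-t)$. The following sketch (Python; the weights $c_1, c_2, c_3$, the times and the distance are all hypothetical) locates the optimal arrival time by grid search:

```python
import numpy as np

# Hedged numeric illustration of the agent's arrival-time choice.
c1, c2, c3 = 1.0, 2.0, 0.5     # reputation, lateness, useless-waiting weights
t_bar, T_start = 1.0, 1.2      # scheduled time and effective starting time
d, t = 1.0, 0.0                # distance to the room, current time

def c(th):
    """Terminal cost c(t_h) = c1[t_h - t_bar]+ + c2[t_h - T]+ + c3[T - t_h]+."""
    return (c1 * max(th - t_bar, 0.0) + c2 * max(th - T_start, 0.0)
            + c3 * max(T_start - th, 0.0))

grid = np.linspace(t + 1e-3, 3.0, 10000)
J = np.array([c(th) + d**2 / (th - t) for th in grid])  # total cost to minimize
th_opt = grid[J.argmin()]
print("optimal arrival time:", th_opt, "cost:", J.min())
```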
The next application uses MFTG theoretic modelling for smart cities.
Application 10 (Mobile CrowdSensing). The origins of crowdsourcing go back at least to the nineteenth century and before [164,165]. Joseph Henry, the Smithsonian's first secretary, used the new networked technology of his day, the telegraph, to crowdsource weather reports from across the country, creating the first national weather map of the U.S. in 1856. Henry's successor, Spencer Baird, recruited citizen scientists to collect and ship natural history specimens to Washington, D.C. by the other revolutionary new technology of the day, the railroad, thus forming the bulk of the Institution's early scientific collections.
Today's mobile devices and vehicles not only serve as the key computing and communication devices of choice, but they also come with a rich set of embedded sensors, such as an accelerometer, digital compass, gyroscope, GPS, ambient light sensor, dual microphone, proximity sensor, dual camera and many others. Collectively, these sensors are enabling new applications across a wide variety of domains, creating huge amounts of data and giving rise to a new area of research called mobile crowdsensing or mobile crowdsourcing [164,165,166]. Crowdsensing pertains to the monitoring of large-scale phenomena that cannot be easily measured by a single individual. For example, intelligent transportation systems may require traffic congestion monitoring and air pollution level monitoring. These phenomena can be measured accurately only when many individuals provide speed and air quality information from their daily commutes, which are then aggregated spatio-temporally to determine congestion and pollution levels in smart cities. Such data collected from the crowd can be seen (up to a certain level) as knowledge, which in turn can be seen as a public good [167].
A great opportunity exists to fuse information from populations of privately-held sensors to create useful sensing applications that will be a public good. On the other hand, it is important to model, design, analyze and understand the behavior of the users and their concerns, such as privacy issues and resource considerations, which limit access to such data streams. Two MFTGs in which each user decides its level of participation in the crowdsensing are presented below: (i) a public good game, and (ii) an information sharing game.
Smartphones are battery-operated mobile devices, and their sensors suffer from a limited battery lifetime. Hence, there is a need for solutions that limit the energy consumption of such mobile Internet-connected objects. Such an involvement is translated into an energy consumption cost.
All the data collected from these devices combine both voluntary participatory sensing and opportunistic sensing from operators. The data are received by a network of cloud servers. For security and privacy concerns, several pieces of information are filtered, anonymized and aggregated, and distributions (or mean-fields) are computed. The model is a public good game with an extra reward for contributors. When decision-makers are optimizing their payoffs, a dilemma arises because individual and social benefits may not coincide. Since nobody can be excluded from the use of a public good, a user may not have an incentive to contribute to it. One way of solving the dilemma is to change the game by adding a second stage in which a (fair) reward can be given to the contributors (non-free-riders).
The strategic form game with incomplete information, denoted by $G_0, $ is described as follows. A stochastic state of the environment is represented by $x.$ There are $n_0$ potential participants in the mobile crowdsensing. The number $n_0$ is arbitrary and represents the number of users of the game $G_0.$ As we will see, the important number is not $n_0$ but the number of active users (the ones with non-zero effort), who are contributing to the crowdsensing.
Each mobile user $i$, equipped with sensing capabilities, can decide to invest a certain level of involvement and effort $u_i\geq 0.$ The action space of user $i$ is $\mathcal{U}_i=\mathbb{R}_+.$ As we will see, the degree of participation will be limited, so that the action space can be included in a compact interval. The payoff of user $i$ is additive and has three components: a public good component $\bar G_i(m-\bar R(x)), $ a resource sharing component $\bar R(x) \frac{h_i(u_i)}{\sum_{j=1}^{n_0} h_j(u_j)}$ and a cost component $p(x, u_i).$ Putting these together, the payoff function is
$ r_{0i}(x, u)=[\bar G_i(m-\bar{R}(x))-p(x, u_i)]\mathbb{1}_{m\geq \bar R(x)}+\bar R(x) \frac{h_i(u_i)}{\sum\limits_{j=1}^{n_0} h_j(u_j)}\mathbb{1}_{\sum\limits_{j=1}^{n_0} h_j(u_j)\neq 0}.$ |
where $m=\sum_{j=1}^{n_0} u_j$ is the total contribution of all the users and $\mathbb{1}_{B}(x)$ is the indicator function, equal to $1$ if $x$ belongs to the set $B$ and $0$ otherwise. This creates a discontinuous payoff function. The function $\bar G_i$ is smooth and nondecreasing, and $\bar R(x)$ is a random non-negative number driven by $x.$ The discontinuity of the payoffs due to the two branches $\{u:\ m\geq \bar R(x)\}$ and $\{ u: \ m < \bar R(x)\}$ can be handled easily by noting that the actions in $\{u: \ m\leq \bar R(x)\}$ cannot be equilibrium candidates.
Using a standard concavity assumption with respect to own effort, one can guarantee that the game has an equilibrium in pure strategies. We analyze the equilibrium for $\bar G_i(z)=a_i z^{\alpha}$ and $h_i(z)=z,$ where $a_i\geq 0 $ and $\alpha\in (0, 1].$ For any reward
$ \bar{R}(x)\geq \frac{4m^*\sigma}{(1-\sigma)^2}, \ \sigma=\frac{\bar G'_i(m)-1}{\bar G'_j(m)-1}>0$ |
where $m^*\in\arg\max [\bar G(m)-m], $ there exists a design parameter $(a_i)_i$ such that the "new" lottery-based scheme provides the globally optimal level of contribution to the public good. We collect mobile crowdsensing users to form a network in which secondary users are willing to share their throughput for the benefit of society or of their friends and friends of friends. This can be seen as a virtual Multiple-Input-Multiple-Output (MIMO) system with several cells, multiple users per cell, multiple antennas at the transmitters and multiple antennas at the receivers. The virtual MIMO system is a sharing network represented by a graph $(V, E), $ where $V$ is the set of users representing the vertices of the social graph and $E$ is the set of edges. To an active connection $(i, j)\in E$ is associated a certain value $\epsilon_{ij}\geq 0.$ The term $\epsilon_{ij}$ is strictly positive if $j$ belongs to the altruistic outgoing network of $i$ and $i$ is concerned about the throughput of user $j.$ The first-order outgoing neighborhood of $i$ (excluding $i$) is $\mathcal{N}_{i, -}.$ Similarly, if $i$ is receiving a certain portion from $j$ then $i\in \mathcal{N}_{j, -}$ and $\epsilon_{ji}>0.$ In the virtual MIMO system, each user $i$ gets a potential initial throughput $Thp_{i, t}$ during the slot/frame $t$ and can decide to share/rent some portion of it to its altruism subnetwork members in $\mathcal{N}_{i, -}$. User $i$ makes a sharing decision vector $u_{i, t}=(u_{ij, t})_{j\in \mathcal{N}_i}, $ where $u_{ij, t}\geq 0.$ The ex-post throughput is therefore
$ Thp_{i, t+}=Thp_{i, t}+\sum\limits_{j \ |\ i \in \mathcal{N}_{j, -}}u_{ji, t}-\sum\limits_{j \in \mathcal{N}_{i, -}}u_{ij, t}. $ |
Denote $\{ j \ |\ i \in \mathcal{N}_{j, -}\}=: \mathcal{N}_{i, +}.$ Then,
$ Thp_{i, t+}=Thp_{i, t}+\sum\limits_{j \in \mathcal{N}_{i, +}}u_{ji, t}-\sum\limits_{j \in \mathcal{N}_{i, -}}u_{ij, t}. $ | (15) |
Since we are dealing with sharing decisions, the mathematical expressions are not necessarily needed if the output can be observed or measured: given a measured throughput, a user can decide to share or not based on its own needs/demands. The term $\sum_{j \in \mathcal{N}_{i, +}}u_{ji, t}$ represents the total extra throughput coming to user $i$ from the other users in $\mathcal{N}_{i, +}$ (excluding $i$). The term $\sum_{j \in \mathcal{N}_{i, -}}u_{ij, t}$ represents the total outgoing throughput from user $i$ to the other users in $\mathcal{N}_{i, -}$ (excluding $i$). In other words, user $i$ has shared $\sum_{j \in \mathcal{N}_{i, -}}u_{ij, t}$ with the others. If $j\notin \mathcal{N}_{i, -}$ then $u_{ij, t}=0,$ and for all $i, $ $u_{ii, t}=0.$ The balance equation is
$ \sum\limits_i Thp_{i, t+}=\sum\limits_i Thp_{i, t}+\sum\limits_{i, j}u_{ji, t}-\sum\limits_{i, j}u_{ij, t}=\sum\limits_i Thp_{i, t}, $ | (16) |
i.e., the system total throughput ex-post sharing is equal to the system total throughput ex-ante sharing. This means that the virtual MIMO throughput is redistributed and shared among the users through the individual sharing decisions $u.$ Some users may care about the others because they may be in a similar situation in another slot/day. For these (altruistic) users, the preferences are better captured by an altruism term in the payoff. We model it through a simple and parameterized altruism payoff.
The payoff function of $i$ at time $t$ is represented by
$ {r}_{1i}(x, u_{i, t}, u_{-i, t})=\hat{r}_i(Thp_{i, t+})+\sum\limits_{j\in \mathcal{N}_i}\epsilon_{ij}\hat{r}_j(Thp_{j, t+}). $ | (17) |
Here, $\epsilon_{ij}\geq 0$ represents a certain weight on how much $i$ is helping $j.$ The matrix $(\epsilon_{ij})$ plays an important role in the sharing game under consideration since it determines the social network and the altruistic relationships between the users over the network. The throughput $Thp$ depends implicitly on the random variable $x.$ The static simultaneous-act one-shot game problem over the network $(V, E)$ is given by the collection $G_{1, \epsilon}=(V, (\mathbb{R}_+^{n_1-1}, {r}_{1i})_i).$ The vector $u_{i}$ is in $ \mathbb{R}_+^{n_1}, $ but its $i$-th component is $u_{ii}=0;$ therefore the choice vector reduces to an element of $ \mathbb{R}_+^{n_1-1}$ and is denoted by $(u_{i, 1}, \ldots, u_{i, i-1}, 0, u_{i, i+1}, \ldots, u_{i, n_1}).$ An equilibrium of $G_{1, \epsilon}$ in state $x$ is a matrix $u\in \mathbb{R}_+^{n_1^2}$ such that
$ u_{i}\in \mathbb{R}_+^{n_1}, \ u_{ii}=0, \ $ |
$ {r}_{1i}(x, u_{i}, u_{-i})=\max\limits_{u'_{i}} {r}_{1i}(x, u'_{i}, u_{-i}). $ | (18) |
We analyze the equilibria of $G_{1, \epsilon}.$ Note that in practice the shared throughput cannot be arbitrary; it has to be feasible.
The set of actions can be restricted to
$ \mathcal{U}_i=\left\{ u_i\ | \ u_{ii}=0, \ u_{ij}\geq 0, \ \sum\limits_{j}u_{ij}\leq C\right\}, $ |
where $u_i=(u_{i, 1}, \ldots, u_{i, i-1}, 0, u_{i, i+1}, \ldots, u_{i, n}), $ and $C>0$ is large enough. For example, $C$ can be taken as the maximum system throughput $\sum_{j}Thp_{j, 0}.$ This way, the set of sharing actions $ \mathcal{U}_i $ of user $i$ is non-empty, convex and compact. Assuming that the functions $\hat{r}_i$ are strictly concave, non-decreasing and continuous, one obtains that the game $G_{1, \epsilon}$ has at least one equilibrium (in pure strategies).
As highlighted above, the set of actions can be made convex and compact. Since the $\hat{r}_i$ are continuous and strictly concave, it turns out that each payoff function ${r}_{1i}$ is jointly continuous and is concave in the individual variable $u_i$ (which is a vector) when the other variables are fixed. We can apply well-known fixed-point results, which give the existence of constrained Nash equilibria. Knowing that $G_{1, \epsilon}$ has at least one equilibrium, the next step is to characterize them.
If the matrix $u$ is an equilibrium of $G_{1, \epsilon}$ then the following implications hold:
$ u_{ij}>0 \implies \hat{r}'_i(Thp_{i, 0+})=\epsilon_{ij}\hat{r}'_j(Thp_{j, 0+}). $ | (19) |
The equilibria may not be unique, depending on the network topology. This is easily proved: one may have multiple ways to redistribute depending on the network structure, and several redistributions can lead to the same sum $Thp_{i, 0}+\sum_{j}u_{ji}-\sum_{j}u_{ij}.$ Even if the game has a set of equilibria, the equilibrium throughput and the equilibrium payoff turn out to be uniquely determined. The set of equilibria has a special structure as it is non-empty, convex and compact. The ex-post equilibrium throughput increases with the ex-ante throughput and stochastically dominates the initial distribution of throughput of the entire network. For $\hat{r}_i=-\frac{1}{\theta} e^{-\theta Thp_i}, \ \theta>0,$ let $\epsilon_{ij}=\epsilon$ with $\epsilon>0.$ Then, the fairness in the network improves as $\epsilon$ increases. The topology of the network matters: the difference between the highest throughput and the lowest throughput in the network is given by the geodesic distance (strength) of the multi-hop connection.
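For the exponential utility above, the equilibrium condition (19) pins down the post-sharing throughputs in closed form: $u_{ij}>0$ implies $Thp_{j, 0+}=Thp_{i, 0+}+\log(\epsilon)/\theta.$ The following two-user sketch (Python; the throughput values, $\theta$ and $\epsilon$ are hypothetical) solves this together with throughput conservation:

```python
import numpy as np

theta, eps = 1.0, 0.8           # risk sensitivity and altruism weight eps_ij
Thp = np.array([10.0, 2.0])     # ex-ante throughputs: user 0 rich, user 1 poor

# Condition (19) with r_i'(T) = exp(-theta*T): exp(-theta*T0+) = eps*exp(-theta*T1+)
# gives T1+ = T0+ + log(eps)/theta; conservation gives T0+ + T1+ = sum(Thp).
total = Thp.sum()
T0_post = (total - np.log(eps) / theta) / 2.0
T1_post = total - T0_post
u01 = Thp[0] - T0_post          # amount user 0 passes to user 1
if u01 > 0:
    print("shared:", u01, "post-sharing throughputs:", T0_post, T1_post)
else:
    print("no sharing at equilibrium")
```

Since $\epsilon<1$, the recipient ends up slightly below the donor, and the gap $\log(1/\epsilon)/\theta$ shrinks as $\epsilon$ increases, which is the fairness improvement mentioned above.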
This section presents MFTGs with time-delayed state dynamics. Delayed dynamical systems and delayed payoffs appear in many applications. They are characterized by past-dependence, i.e., their behavior at time $t$ depends not only on the situation at $t$, but also on their past history and/or time-delayed states. Some of these situations can be described with controlled stochastic differential delay equations. Networked systems suffer from intermittent, delayed, and asynchronous communications and sensing. To accommodate such systems, time delays need to be introduced.
Applications include
• Consensus and collective motion of Cucker-Smale [163] type with delayed information states
$ \begin{array}{l} dx_i=v_i\, dt\\ dv_i=\int_{(\bar{x}, \bar{v})} a(\|\bar{x}-x_i\|^2)(\bar{v}-v_i)\, \rho(t-\tau_i, d\bar{x}\, d\bar{v})\ dt+c\left(\int_{\bar{v}} \bar{v}\, \rho(t-\tau_i, \mathcal{X}, d\bar{v})\right) dt+u_i\, dt+\sigma\, dW_i, \end{array} $ |
where $\rho(t, dxdv)$ is the distribution of states at time $t.$
• Delayed information processing, where the difference of the states $\bar{x} -x_i$ influences the dynamics after some time delay $\tau_i$. Examples include Kuramoto-based oscillators [156]
$ dx_i=\left[w_i+ \int \rho(t-\tau_i, d\bar{x}) \sin(\bar{x}-x_i(t-\tau_i))+ u_i \right]dt+\sigma dW_i, $ |
used to describe synchronization.
• Delayed information transmission, where agent $i$ compares its state to the information coming from its neighbor $j$ after some time delay $\tau_i.$ Information transmission delays arise naturally in many dynamical processes on networks (a simulation sketch of the dynamics below is given after this list).
$ dx_i=\left[w_i+ \int \rho(t-\tau_i, d\bar{x}) \sin(\bar{x}-{x}_i(t)) + u_i \right]dt+\sigma dW_i. $ |
Delayed information transmission has direct applications in opinion dynamics and opinion formation on social graphs:
$ dx_i= \left[\int_{B({x}_i, \epsilon_{i})} \rho(t-\tau_i, d\bar{x})-x_i + u_i \right]dt+\sigma dW_i. $ |
• The air conditioning control towards a comfort temperature is influenced by an integrated state, which represents the trend.
• Transmission and propagation delays affect the performance of both wireline and wireless networks, where both delayed information processing and delayed information transmission occur.
• In computer network security, the proportion of infected nodes at time $t$ is a function of the delayed state, the topological delay, the proportion of susceptible individuals, and some time delay for the contamination period.
• In energy markets, there is an observed delayed effect in the dynamics of the price.
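As an illustration of the delayed-transmission dynamics above, the following sketch (Python; the population size, delay, coupling and noise level are hypothetical) integrates the delayed Kuramoto model with an Euler-Maruyama scheme, replacing the distribution $\rho(t-\tau_i, \cdot)$ by the empirical population at time $t-\tau$ held in a history buffer:

```python
import numpy as np

rng = np.random.default_rng(3)
n, dt, T, tau, sigma = 30, 0.01, 20.0, 0.5, 0.05
lag, steps = int(tau / dt), int(T / dt)
omega = rng.normal(0.0, 0.3, n)            # natural frequencies
x = np.zeros((steps + 1, n))
x[0] = rng.uniform(0.0, 2.0 * np.pi, n)    # initial phases

for k in range(steps):
    past = x[max(k - lag, 0)]              # population phases at time t - tau
    # delayed transmission: delayed neighbor information vs. current own state
    drift = omega + np.mean(np.sin(past[None, :] - x[k][:, None]), axis=1)
    x[k + 1] = x[k] + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=n)

# Kuramoto order parameter: values close to 1 indicate synchronization
print(abs(np.exp(1j * x[-1]).mean()))
```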
We consider a mean-field game where agents interact within the time frame $\mathcal{T}.$ The best-response of a generic agent is
$ \left\{ \begin{array}{l} \sup\limits_{u\in \mathcal{U}}\ \mathbb{E}[G(u, m_1, m_2)], \ \text{subject to}\\ dx=b(t, x, y, z, u, m_1, m_2, \omega)dt+\sigma(t, x, y, z, u, m_1, m_2, \omega)dW+\int_{\Theta}\ \gamma(t, x, y, z, u, m_1, m_2, \theta, \omega)\tilde{N}(dt, d\theta),\\ x(t)=x_0(t), \ t\in[-\tau, 0], \end{array} \right. $ | (20) |
where $\tau_k>0$ represents a time delay, $x=x(t)$ is the state at time $t$ of a generic agent, $ y=(x(t-\tau_k))_{1\leq k\leq D} $ is a $D$-dimensional delayed state vector, and $ z(t)=(\int_{t-\tau}^t \lambda(ds) \phi_l(t, s)x(s))_{l\leq I}$ is the integral state vector of the recent past state over $[t-\tau, t],$ which represents the trend of the state trajectory. The process $\phi_l(t, s)$ is an $\mathcal{F}_s$-adapted locally bounded process, and $\lambda$ is a positive and $\sigma$-finite measure. $m_1$ is the average state of all the agents, $m_2$ is the average control action of all the agents, and $x_0$ is an initial deterministic state function. $W(t)=W(t, \omega)$ is a standard Brownian motion on $\mathcal{T}=[0, T]$ defined on a given filtered probability space $ (\Omega, \mathcal{F}, \mathbb{P}, \{\mathcal{F}_t\}_{t\in \mathcal{T}}).$
Payoffs: $G(u, m_1, m_2)= g_1(T, x(T), m_1(T), \omega){+}\int_{t\in \mathcal{T}} g_0(t, x, y, z, u, m_1, m_2, \omega)\ dt, $ where the instantaneous payoff function is $g_0:\ \mathcal{T}\times \mathcal{X}^3\times {U}\times \mathcal{X}\times U\times \Omega\rightarrow \mathbb{R}, $ the terminal payoff function is $g_1:\ \mathcal{X}^2 \times \Omega\rightarrow \mathbb{R}.$
State dynamics: The drift coefficient function is $b: \ \mathcal{T}\times \mathcal{X}^3\times {U}\times \mathcal{X}\times U\times \Omega \rightarrow \mathbb{R}, $ the diffusion coefficient function is $ \sigma:\ \ \mathcal{T}\times \mathcal{X}^3\times {U}\times \mathcal{X}\times U\times \Omega \rightarrow \mathbb{R}.$
Jump process: Let ${N}$ be a Poisson random measure with Lévy measure $\mu (d\theta), $ independent of $W,$ where the measure $\mu$ is $\sigma$-finite over $\Theta.$ Set $\tilde{N}(dt, d\theta)=N(dt, d\theta)-\mu(d\theta)dt.$ The jump coefficient is ${\gamma}: \ \mathcal{T}\times \mathcal{X}^3\times {U}\times \mathcal{X}\times U\times \Theta\times \Omega \rightarrow \ \mathbb{R}. $ The filtration $\mathcal{F}_t$ is the one generated by the union of events from ${W}$ or ${N}$ up to time $t.$
The goal is to find or to characterize a best response strategy to mean-field $(m_1, m_2): $ $u^*\in \arg\max_{u\in\mathcal{U}} G(u, m_1, m_2).$
Hypothesis H1: The functions $b, \sigma, g$ are continuously differentiable with respect to $(x, m).$ Moreover, $b, \sigma, g$ and all their first derivatives with respect to $(x, y, z, m)$ are continuous in $(x, m, u)$ and bounded.
We explain below why the existing solution approaches cannot be used to solve (20). First, the presence of $y, z$ leads to a delayed integro-McKean-Vlasov dynamics, and the stochastic maximum principle developed in [33,34,36,37,171,174] does not apply. The dynamic programming principle for Markovian mean-field control cannot be directly used here because the state dynamics is non-Markovian due to the past and time-delayed states. Hence, a novel solution approach or an extension is needed in order to solve (20). A chaos expansion methodology can be developed as in [160] using generalized polynomials of Wick and Poisson jump processes. The idea is to develop a finite-dimensional optimality equation for (20). In this respect, a stochastic maximum principle could be a good candidate solution approach. Under H1, for each control $u\in \mathcal{U} $ and each pair $(m_1, m_2),$ the state dynamics admits a unique solution $x(t):=x^{u}(t).$ The non-optimized Hamiltonian is $H(t, x, y, z, u, m_1, m_2, p, q, \bar{r}, \omega): \mathcal{T}\times \mathcal{X}^3\times {U}\times \mathcal{X}\times U\times \mathbb{R}^2\times J \times \Omega \rightarrow \mathbb{R},$ where $\bar{r}(.)\in J$ and $J$ is the set of functions on $\Theta$ such that $\int_{\Theta} \gamma \bar{r}(t, \theta)\mu(t, d\theta)$ is finite. The Hamiltonian is $H=g_0+b p+\sigma q+\int_{\Theta} \gamma \bar{r}(t, \theta)\mu(d\theta).$ The first-order adjoint process $(p, q, \bar{r})$ is time-advanced and determined by
$ dp=\mathbb{E}\left[-H_x\mathbb{1}_{t\leq T}-\sum\limits_{k=1}^D H_{y_k}(t+\tau_k)\mathbb{1}_{t\leq T-\tau_k}\ \Big|\ \mathcal{F}_t\right]dt-\sum\limits_{l=1}^I \mathbb{E}\left[\lambda(dt)\int_t^{t+\tau}\phi_l(t, s)H_z\mathbb{1}_{s\in[0, T]}ds\ \Big|\ \mathcal{F}_t\right]+q\, dW(t)+\int \bar{r}(t, d\theta)\tilde{N}(dt, d\theta), $ | (21) |
$ p(T)=g_{1, x}(x(T), m_1(T)). $ | (22) |
We now discuss the existence and uniqueness of the first-order adjoint equation.
Proposition 7. Assume that the coefficients are $L^2.$ Then the first-order adjoint system (21)-(22) has a unique solution such that
$ \mathbb{E}\left[\int_0^T p^2+q^2+\int_{\Theta} \bar{r}^2(t, \theta)\mu(d\theta) \ dt\right] <+\infty$ |
Moreover, the solution $(p, q, \bar{r})$ can be found backwardly as follows:
• Within the time frame $(T-\tau, T), $ $dp= \mathbb{E}[-H_x \ | \ \mathcal{F}_t] dt+q\, dW(t) +\int_{\Theta} \bar{r}(t, d\theta) \tilde{N}(dt, d\theta),$ with terminal condition $p(T)$ given by (22).
• We fix $p(T-\tau)$ from the previous step and solve (21) on interval $(T-2\tau, T-\tau).$
• We inductively construct a procedure to compute $p(t)$ on $t\in [T-k\tau, T-(k-1)\tau], \ k\leq \frac{T}{\tau}$ ending with $p(T-(k-1)\tau).$
Note that, if $t\in [T-k\tau, T-(k-1)\tau]$ then $t+\tau\in [T-(k-1)\tau, T-(k-2)\tau]$ and hence, $(p(t+\tau), q(t+\tau), \bar{r}(t+\tau, \theta))$ is known from the previous step. However, $p(t+\tau)$ may not be $\mathcal{F}_t-$adapted. Therefore a conditional expectation with the respect to the filtration $\mathcal{F}_t$ is used.
If $U$ is a convex domain, we know that second-order adjoint processes of Peng's type are not required, and if $(x^*, u^*)$ is a best response to $(m_1, m_2)$ then there is a triplet of processes $(p, q, \bar{r})$ satisfying the first-order adjoint equation such that
$ H(t, x^*, y^*, z^*, u^*, m_1, m_2, p, q, \bar{r})-H(t, x^*, y^*, z^*, u, m_1, m_2, p, q, \bar{r}) \geq 0, $ | (23) |
for all $u\in \mathcal{U}, $ almost every $t$ and $\mathbb{P}-$almost surely (a.s.). A necessary condition for (interior) best response strategy is therefore $E[H_u\ \ | \ \mathcal{F}_t]=0$ whenever $H_u$ makes sense. A sufficient condition for optimality can be obtained, for example, in the concave case: $g_1, H$ are concave in $(x, y, z, u)$ for each $t$ almost surely.
Let $c_1(t), c_2(t)$ and $c_3(t, z) $ be given bounded adapted processes, with $c_1$ assumed to be deterministic and $\int c_3^2 \nu(dz) < +\infty.$ Consider the energy dynamics generated by a prosumer:
$ de_{i}= (c_1(t) e_i(t-\tau) -u_i)dt +c_2(t)e_i(t-\tau) dW(t) +e_i(t-\tau)\int c_3(t, \theta)\tilde{N}(dt, d\theta), $ |
$e_i(t)=e_{i0}(t)\mathbb{1}_{[-\tau, 0]}(t),$ where $e_{i0}$ is a given deterministic and bounded function. The energy $u_i$ is consumed by $i.$ Prosumer $i$ has a (random) satisfaction function $s(t, u_i, \omega),$ which is $\sigma(W_{t'}, N(t'), \ t'\leq t)$-adapted for each consumption strategy $u_i\geq 0.$ The random function $s$ is assumed to be continuously differentiable and increasing with respect to $u_i,$ and its derivative $s_{u_i}(t, u_i, \omega)$ is decreasing in $u_i$ and vanishes as the consumption $u_i$ grows without bound. Therefore, the maximum value of $s_{u_i}(t, u_i, \omega)$ is achieved at $u_i=0,$ and this maximum value is $\bar{m}(t, \omega):=s_{u_i}(t, 0, \omega);$ the infimum value of $s_{u_i}(t, u_i, \omega)$ is $0.$ It follows that $u_i \mapsto s_{u_i}(t, u_i, \omega)$ is a one-to-one mapping from $\mathbb{R}_{+}$ to $(0, \bar{m}(t, \omega)].$ In particular, the function $br: \ \lambda \mapsto (s_{u_i}(t, ., \omega))^{-1} [\lambda]\mathbb{1}_{(0, \bar{m}(t, \omega)]}(\lambda)$ is well-defined and measurable. Prosumer $i$ aims to maximize her satisfaction functional together with her profit: $\mathbb{E}\left[g(e_i(T))+ \int_0^T s(t, u_i, \omega)+price(m) q_i\ dt \right].$
The Hamiltonian is
$H(t, x, y, z, u_i, m_1, m_2, p, q, \bar{r})=s+(c_1 y -u_i) p+c_2y q+ y\int_{\Theta} c_3 \bar{r}(t, \theta)\mu(d\theta).$
$ \left\{ \begin{array}{l} dp=\mathbb{E}[-H_y(t+\tau)\mathbb{1}_{t\leq T-\tau}\ |\ \mathcal{F}_t]dt+q\, dW(t)+\int \bar{r}(t, d\theta)\tilde{N}(dt, d\theta),\\ p(T)=g_x(x(T)), \end{array} \right. $ | (24) |
where $H_{y}(t+\tau)= c_1(t+\tau)p(t+\tau)+c_2(t+\tau)q(t+\tau)+ \int_{\Theta} c_3(t+\tau) \bar{r}(t+\tau, \theta)\mu(d\theta). $
We solve this explicitly for $g(x)= c_4 x, \ c_4\geq 0,$ so that $p(T)=c_4\geq 0.$ Between times $T-\tau$ and $T, $ the stochastic process $p(t)$ must solve $dp=q\, dW(t) +\int \bar{r}(t, d\theta) \tilde{N}(dt, d\theta)$ and must be $ \mathcal{F}_t$-measurable; therefore $p(t)=c_4$ on $t\in [T-\tau, T].$ For $t < T-\tau, $ the processes $q $ and $\bar{r}$ are zero and $p$ is entirely deterministic and solves
$ \dot{p}=-c_1(t+\tau)p(t+\tau). $ |
Thus, for $t\in [T-2\tau, T-\tau], $
$ p(t)=p(T-\tau)+\int^{T-\tau}_t c_1(t'+\tau)p(t'+\tau) \ dt'.$ |
This means that $p(t)= c_4[1+\int^{T}_{t+\tau} c_1(t'') \ dt'']. $ For $t\in [T-(k+1)\tau, T-k\tau], $ and $(k+1)\tau\leq T, $ one has $ p(t)=p(T-k\tau)+\int^{T-(k-1)\tau}_{t+\tau} c_1(t'')p(t'') \ dt''. $
By assumption, $s_{u_i}(t, u_i, \omega)$ is decreasing in $u_i,$ and from the above relationship it is clear that $p$ is decreasing in $\tau.$ It follows that if $\tau_1 < \tau_2$ then $p[\tau_1](t) > p[\tau_2](t).$ We would like to solve $s_{u_i}(t, u_i, \omega)=p[\tau](t).$ By inverting this equation one gets $u_i^*[\tau_1] < u_i^*[\tau_2].$ Thus, the optimal strategy $u_i^*$ increases as the time delay $\tau$ increases.
This proves the following result:
Proposition 8. The time delay decreases the prosumer market price. The optimal strategy $u_i^*$ increases as the time delay $\tau$ increases.
Numerical methods for delayed stochastic differential equations of mean-field type are not without challenges. Here we implement the Milstein scheme using MATLAB. We choose the parameters $\gamma=0, c_1=c_2=c_3=1$ and set the satisfaction function as
$s(u)=1-(1+\mu\bar{m}_2)e^{-u}$ |
where $\mu>0$ and $\bar{m}_2$ is the average of all other agents' control actions. A typical shape of the satisfaction function is given in Figure 22. The optimal control is
$u^*(t)=-\frac{\log p(t)}{1+\mu \bar{m}_2(t)}\mathbb{1}_{(0, 1]}(p(t)).$ |
$ u^*(t)=\left\{ \begin{array}{ll} \frac{-\log c_4}{1+\mu \bar{m}_2(t)} & \text{on } t\in(T-\tau, T],\\ \frac{-\log [c_4(1+T-t-\tau)]}{1+\mu \bar{m}_2(t)} & \text{on } t\in(T-2\tau, T-\tau],\\ \frac{-1}{1+\mu \bar{m}_2(t)}\log\left[c_4(1+\tau)+c_4(1+T-\tau)(T-t-2\tau)-\frac{c_4}{2}(T-t-2\tau)(T+t)\right] & \text{on } t\in(T-3\tau, T-2\tau]. \end{array} \right. $ |
The mean-field equilibrium solves the fixed-point equation $\mathbb{E}[u^*(t)]=\bar{m}_2(t).$ Putting these together, one obtains
$\bar{m}_2(t)=- \frac{\log p(t)}{1+\mu \bar{m}_2(t)}, $ |
i.e., the root (in $\bar{m}_2$) of $\bar{m}_2 \mapsto \bar{m}_2(1+\mu \bar{m}_2)+\log p(t).$ The quadratic polynomial has two roots: one positive and the other negative value. Since the consumption is nonnegative, the mean of the mean-field control action is hence given by
$ \bar{m}_2(t)=\frac{-1+\sqrt{1+4\mu\log [\frac{1}{p(t)}]}}{2\mu}. $ |
Notice that the effect of the time delay $\tau$ in this specific example was through the adjoint process $p$ which also enters into the control action $u$.
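The explicit construction above can be reproduced numerically. The sketch below (Python; a finite-difference stand-in for the Milstein-based computation in the text, with $c_1=1$ and hypothetical values of $c_4$ and $\mu$) computes the adjoint $p$ backward in blocks of length $\tau$ and then evaluates $\bar{m}_2(t)$ and $u^*(t)$:

```python
import numpy as np

T, tau, c4, mu, dt = 1.0, 1/3, 0.9, 1.0, 1e-3
N = int(T / dt)
lag = int(tau / dt)
p = np.full(N + 1, c4)                  # p(t) = c4 on (T - tau, T]
for k in range(N - lag, -1, -1):        # backward Euler: dp/dt = -c1(t+tau)*p(t+tau)
    p[k] = p[k + 1] + dt * p[min(k + lag, N)]

# Mean-field fixed point: m2 solves m2*(1 + mu*m2) + log p = 0 (nonnegative root),
# which only applies where p <= 1; elsewhere the control is zero.
logterm = np.log(1.0 / np.minimum(p, 1.0))
m2 = (-1.0 + np.sqrt(1.0 + 4.0 * mu * logterm)) / (2.0 * mu)
u = np.where((p > 0) & (p <= 1), -np.log(np.minimum(p, 1.0)) / (1.0 + mu * m2), 0.0)
print(u[0], u[-1])   # u* vanishes where p > 1 and grows toward the horizon
```

Rerunning with $\tau=2/3$ reproduces the qualitative comparison of Figure 23: a larger delay lowers $p$ and hence raises the optimal consumption.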
We plot the structure of the optimal strategy for $T=1$ and $\tau=1/3, \tau=2/3.$ The theoretical result of Proposition 8 is numerically observed in Figure 23. Figure 24 plots sample optimal state trajectories for $T=1, \tau=1/3$ using the Milstein scheme.
Let $\mathcal{F}_t^{W}$ be the $\mathbb{P}$-completed natural filtration generated by $W$ up to $t$. Set $\mathcal{F}^W:=\{\mathcal{F}_t^{W}, \ 0\leq t \leq T\}$ and $\mathbb{F}:=\{{\mathcal{F}}_t, \ 0\leq t \leq T\}$, where $\mathcal{F}_t=\mathcal{F}_t^{W} \vee \sigma(x_0)$. An admissible control $u_i$ of agent $i$ is an $\mathcal{F}^{W_i}$-adapted process with values in a non-empty, closed and bounded subset (not necessarily convex) $U_i$ of $\mathbb{R}^{d}$ satisfying $\mathbb{E}[\int_0^T|u_i(t)|^2dt] < \infty$; these are nonanticipative measurable functionals of the Brownian motions. Since each agent has a different information structure (decentralized information), let $\mathcal{U}_i$ be the set of admissible strategies of $i$ (with decentralized partial information) such that $\mathcal{G}_{i, t} \subset \mathcal{F}_{i, t}, $ i.e., $\mathcal{U}_i:= \{ u_i\in L^2_{\mathcal{G}_{i, T}}([0, T], \mathbb{R}^{d}), \ u_i(t, .)\in U_i \ \mathbb{P}-a.s\}.$ Given a strategy $u_i\in \mathcal{U}_i$ and a (population) mean-field term $m$ generated by the other agents, we consider the signal-observation $x_i^{u_i, m}$ which satisfies the following stochastic differential equation of mean-field type, to which we associate a best-response problem to the mean-field [132,158,159]:
$ \left\{ \begin{array}{l} \sup\limits_{u_i\in \mathcal{U}_i}\ R(u_i, m) \ \text{subject to}\\ dx_i(t)=b(t, x_i(t), \mathbb{E}x_i(t), u_i(t), m(t))dt+\sigma(t, x_i(t), \mathbb{E}x_i(t), u_i(t), m(t))dW_{i, t},\\ x_i(0)\sim \mathcal{L}(X_{i, 0}),\\ m(t)=\text{population mean-field}, \end{array} \right. $ | (25) |
$ b(t,x,y,u,m): \,\,[0,T] \times \mathbb{R}^d\times\mathbb{R}^d\times U_i \times \Lambda \longrightarrow \mathbb{R}, $ | (26) |
$ \sigma(t, x_i, y_i, u_i, m): \, \, [0, T] \times \mathbb{R}\times \mathbb{R}\times U_i \times \Lambda\longrightarrow \mathbb{R}. $ | (27) |
$ R(u_i, m)=g(x_i(T), \mathbb{E}x_i(T), m(T))+\int_0^T r(t, x_i(t), \mathbb{E}x_i(t), u_i(t), m(t))\, dt, $ |
where $g$ is the terminal payoff and $r$ is the running payoff. Given $m, $ any $ u^*_i\in {\cal U}_i$ which satisfies $R(u_i^*(\cdot), m)=\sup_{u_i(\cdot)\in {\cal U}_i}R(u_i, m)$ is called a pure best-response strategy to $m$ by agent $i.$ In addition to the other coefficients, we assume that $\gamma$ satisfies H1. Under H1, the state dynamics admits a unique strong solution ([161], Proposition 1.2). Given $m, $ we apply the stochastic maximum principle (SMP) for risk-neutral mean-field type control from ([162], Theorem 2.1) to the state dynamics $x$ to derive the first-order adjoint equation. Under assumption H1, there exists a unique $\mathbb{F}$-adapted pair of processes $(p, q)$ which solves the backward SDE:
$ p(t)=g_{x}(T)+\mathbb{E}[g_{y}(T)]+\int_t^T \{H_x(s)+\mathbb{E}[H_y(s)]\}\, ds-\int_t^T q(s)\, dW(s), $ | (28) |
such that $ \mathbb{E}\left[\sup_{t\in [0, T]}\ |p(t)|^2+\int_0^T |q(t)|^2dt \right] < +\infty.$ However, these processes $(p, q)$ may not be adapted to the decentralized information $\mathcal{G}_{i, t};$ this is why their conditioning appears in the maximum principle below. Again by ([162], Theorem 2.1), there exists a unique $\mathbb{F}$-adapted pair of processes $(P, Q)$ which solves the second-order adjoint equation
$ P(t)= g_{xx}(T)+\int_t^T \{2 b_x(s) P(s)+\sigma_x^2(s) P(s) +2 \sigma_x(s) Q(s)+H_{xx}(s)\}\, ds-\int_t^T Q(s)\, dW(s), $ | (29) |
such that $ \mathbb{E}\left[\sup_{t\in [0, T]}\ |P(t)|^2+\int_0^T |Q(t)|^2dt \right] < +\infty.$ Note that in the multi-dimensional setting, the term $2 b_x(s) P(s)+\sigma_x^2 P(s) +2 \sigma_x(s) Q(s)$ becomes $b'_x P+ Pb_x+\sigma'_x P \sigma_x + \sigma'_x Q+Q\sigma_x.$
Proposition 9. Assume H1 holds and let $m$ be a given population mean-field profile. If $(x^*_i, u^*_i)$ is a best response, then there are two pairs of $\mathbb{F}$-adapted processes $(p, q)$ and $(P, Q)$ satisfying (28) and (29) respectively, such that
$ \forall i\in \mathcal{N}: \ \mathbb{E}\left[\delta H(t)+\frac{1}{2}\delta\sigma(t)' P(t)\delta\sigma(t)\ \Big|\ \mathcal{G}_{i, t}\right]\leq 0, $ | (31) |
for all $u_i\in \mathcal{U}_i, $ almost every $t$ and $\mathbb{P}-$almost surely, where,
$ \delta H(t):= H(t, x^*(t), u_i, m(t), {p}(t), {q}(t)) -H(t, x^*(t), u^*_i(t), m(t), {p}(t), {q}(t)), $ | (32) |
and $H_k(t):=b_k(t)p+\sigma_k(t)q+r_k(t), $ for $k\in \{x, y, xx\}.$
The examples above show that the continuum-of-agents assumption is rarely observed in engineering practice. The agents are not necessarily symmetric, and a single agent may have a non-negligible effect on the mean-field terms, as illustrated in the HVAC application. Without having a broad set of facts on which to theorize, there is a certain danger of mean-field game models that are mathematically elegant, yet have little connection to actual behavior observed in engineering practice. At present, our empirical knowledge is inadequate to support the main assumptions of classical mean-field game theory. This is why a relaxed version is needed in order to better capture the wide range of behaviors and constraints observed in engineering systems. MFTG relaxations include symmetry breaking, mixtures of atomic and nonatomic agents, non-negligible effects on individual localized mean-field terms, and an arbitrary number of decision-makers. In addition, behavioral and psychological factors should be incorporated into the learning and information processes used by people-centric engineering systems. MFTG is still under development and is far from being a well-established tool for engineered systems. Until now, MFTG has not focused on behavioral and cognitively-plausible models of choices in humans, robots, machines, mobile devices and software-defined strategic interactions. Psychological and behavioral mean-field type game theories seem to explain behaviors that are better captured in experiments or in practice than classical game-theoretic equilibrium analysis. They allow one to consider psychological aspects of the agent in addition to the traditional "material" payoff modelling. The value depends upon choice consequences, mean-field states, mean-field actions and beliefs about what will happen. The psychological MFTG framework can link cognition and emotion. It can express emotions, guilt, empathy, altruism and spitefulness (maliciousness) of the agents, and it includes belief-dependent and other-regarding preferences in the motivations. It remains to be investigated how much the psychology of the people matters in their behaviors in engineering MFTGs. The answer to this question is particularly crucial when analyzing the quality-of-experience of users in terms of MOS (mean opinion score) values. A preliminary result from a recent experiment conducted in [111,168] with 47 people carrying mobile devices with WiFi Direct and D2D technology shows that the participation in forwarding the data of the users is correlated with their level of empathy towards their neighbors. This suggests the use of not only material payoffs but also non-material payoffs in order to better capture user behavior. Another aspect of MFTGs is the complexity of the analysis (both equilibrium and non-equilibrium) when multiple agents (and multiple mean-field terms) are involved in the interaction [71,132,160,169,170].
The article presented basic applications of mean-field-type game theory in engineering, covering de-congestion in intelligent transportation networks, control of virus spread over networks, multi-level building evacuation, next-generation wireless networks, incentive-based demand satisfaction in smart energy systems, synchronization and coordination of nodes, mobile crowdsourcing, and cloud resource management. The wide range of applications suggests that mean-field-type game theory is a promising tool for engineering problems. However, the framework is still under development and needs to be improved to capture realistic behavior observed in practice. Possible extensions of the work described in this article include mean-field-type games for risk engineering, and an integrated mean-field-type game framework for smarter cities, ranging from transportation to water distribution with ICT (Information and Communication Technology), big data, and human-in-the-loop, among several other interesting directions.
This research work is supported by U.S. Air Force Office of Scientific Research under grant number FA9550-17-1-0259. The authors would like to thank the Editor and the anonymous reviewers for interesting and constructive comments on the manuscript. The authors would like to thank the seminar participants at KTH Sweden for many extremely fruitful discussions and for their inputs on the first draft of the manuscript.
The authors declare no conflict of interest in this paper.
Area | Works |
planning | [72] |
state estimation and filtering | [73,74] |
synchronization | [75,76,77,78] |
opinion formation | [79] |
network security | [80,81,82,83,84] |
power control | [85,86,87] |
medium access control | [88,89] |
cognitive radio networks | [90,91] |
electrical vehicles | [92,93] |
scheduling | [94] |
cloud networks | [95,96,97] |
wireless networks | [98] |
auction | [99,100] |
cyber-physical systems | [101,102] |
airline networks | [103] |
sensor networks | [104] |
traffic networks | [105,106,107,108] |
big data | [109] |
D2D networks | [110,111,112] |
multilevel building evacuation | [140,141,142,143] |
power networks | [113,114,115,116,117,174,179] |
[93,118,119,120,121] | |
[122,123,124] | |
HVAC | [125,126,127,128,129,130] |
Area | Anonymity | Infinity | Atom |
population games [4,5] | yes | yes | no |
evolutionary games [131] | yes | yes | no |
non-atomic games [29] | yes | yes | no |
aggregative games [18] | relaxed | ||
global games [16,17] | yes | yes | no |
large games [22] | yes | yes | no |
anonymous games [29] | yes | yes | no |
mean-field games | yes | yes | no |
nonasymptotic mean-field games | nearly | no | yes |
MFTG | relaxed | relaxed | relaxed |
$\mathcal{I}$ | $\triangleq$ | set of decision-makers |
$T$ | $\triangleq$ | Length of the horizon |
$[0, T]$ | $\triangleq$ | horizon of the mean-field-type game |
$t$ | $\triangleq$ | time index |
$\mathcal{X}$ | $\triangleq$ | state space |
$W$ | $\triangleq$ | Brownian motion |
$\sigma$ | $\triangleq$ | Diffusion coefficient
$N$ | $\triangleq$ | Poisson jump process |
$\gamma$ | $\triangleq$ | Jump rate coefficient
${U}_i$ | $\triangleq$ | control action space of agent $i\in \mathcal{I}$ |
$\mathcal{U}_i$ | $\triangleq$ | admissible strategy space |
$u_i$ | $\triangleq$ | control action of agent $i\in \mathcal{I}$
$r_i$ | $\triangleq$ | instantaneous payoff |
$D_{(x, u)}$ | $\triangleq$ | distribution of state-action |
$R_i$ | $\triangleq$ | Long-term payoff functional |
Case | Transition probability $(\theta, \theta' \in \{1, 2\})$ | $M^n_{\theta}(t+1)-M^n_{\theta}(t)$ | Actions | Propagation
${D} \xrightarrow{\delta_{D}} {H}$ | $D^n_\theta(t) \delta_{D}$ | $(-1, 0, 1)/n$ | singleton set | $-1/n$ |
$2 D \xrightarrow{\lambda} 2 C$ | $D^n_{\theta}(t) \delta_m ^2 \lambda (D^n_{\theta}(t)-\frac{1}{n})$ | $(-2, 2, 0)/n$ | $\{m, \bar{m}\}$ | $0$ |
2' | $P^n_{\theta}(t) \delta_m ^2 \lambda (P^n_{\theta'}(t)-\frac{1}{n}\mathbb{1}_{\{\theta=\theta'\}})\mathbb{1}_{\{\theta=\theta'\}} $ | $(-2, 2, 0)/n$ | $\{m, \bar{m}\}$ | $0$
${C} \xrightarrow{\delta_{C}} {H}$ | $C^n_\theta(t) \delta_C$ | $(0, -1, 1)/n$ | singleton set | $-1/n$ |
${C} \xrightarrow{\frac{\beta}{q_{\theta} + D^n_{\theta}(t)}} D$ | $C^n_\theta(t) \beta \frac{D^n_{\theta}(t)}{q_{\theta} + D^n_{\theta}(t)}$ | $(-1, 1, 0)/n$ | singleton set | $0$ |
${H} \xrightarrow{\delta_{H}+(1-\delta_{H})C^n} C$ | $H^n(t) [\delta_{H}+(1-\delta_{H})C^n(t)]$ | $(0, 1, -1)/n$ | singleton set | $1/n$ |
${H} \xrightarrow{\eta}D$ | $H^n(t)(\delta_e \delta_{Sm} + \delta_m \eta D^n(t))$ | $(1, 0, -1)/n$ | $\{o, \bar{o}, m, \bar{m}\}$ | $1/n$ |
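To see how the table translates into a simulable population process, the following minimal sketch propagates the occupancy fractions of the three compartments $(D, C, H)$ for a single class $\theta$; every rate constant below is hypothetical, and the action-dependent factors ($\delta_m$, $\delta_e\delta_{Sm}$) are folded into the constants $\lambda$ and $\eta$ for simplicity.

```python
import numpy as np

# Minimal simulation sketch for the transition table above, single class theta.
# All rate constants are hypothetical illustrative values.
rng = np.random.default_rng(1)
n = 1000                                   # population size
delta_D, lam, delta_C = 0.05, 0.4, 0.05
beta, q_theta, delta_H, eta = 0.3, 0.2, 0.02, 0.5
D, C, H = 0.3, 0.3, 0.4                    # initial occupancy fractions
dt = 0.01

jumps = np.array([                         # per-event change of (D, C, H), times 1/n
    [-1, 0, 1],                            # D -> H
    [-2, 2, 0],                            # 2D -> 2C
    [0, -1, 1],                            # C -> H
    [1, -1, 0],                            # C -> D (following the arrow label; the
                                           #         printed vector in the table reads (-1,1,0))
    [0, 1, -1],                            # H -> C
    [1, 0, -1],                            # H -> D
]) / n

for step in range(2000):
    rates = np.array([                     # intensities from the transition-probability column
        D * delta_D,
        lam * D * (D - 1.0 / n),
        C * delta_C,
        C * beta * D / (q_theta + D),
        H * (delta_H + (1 - delta_H) * C),
        H * eta * D,
    ]).clip(min=0.0)
    counts = rng.poisson(rates * n * dt)   # number of events of each type in [t, t+dt)
    D, C, H = np.clip(np.array([D, C, H]) + counts @ jumps, 0.0, 1.0)

print(f"final fractions: D={D:.3f}, C={C:.3f}, H={H:.3f}")
```

As $n\to\infty$, such occupancy processes concentrate around the solution of the mean-field ODE obtained by replacing the Poisson counts with their intensities.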