




    Plant life history theory is generally concerned with the strategies that plants employ for survival and reproduction, as well as how these processes influence population dynamics. The question of how plants allocate resources, e.g., C, N, P, and in some models, biomass, and what drives these allocation rules, is central to this pursuit. There are several schools of thought that seek to provide a framework for understanding these allocation patterns. One school of thought suggests that allocation rules should ultimately be the result of natural selection, and so resource allocation should optimize fitness in some sense, for example by maximizing fruits [1] or overall biomass [2]. Another framework views biomass allocation as following certain allometric scaling relationships [3], which has been shown to be compatible with the optimization framework [4]. Yet another views allocation not through the lens of an individual organism or specific genome but rather from a game-theoretical perspective in which allocation rules are driven by competition and follow an evolutionary stable strategy [5]. It has also been suggested that allocation may be an emergent property of the many processes involved in plant growth rather than the result of a central allocation process [6]. For a more complete review of allocation theory, we refer the reader to [7] or [8].

    In this paper, we take the viewpoint that resource allocation, in annual plants specifically, should serve to optimize overall fitness. It is important to note that, while patterns of growth consistent with optimal allocation have been observed to some extent [4], even if this is not universally true, it is still important to have a theory of optimal growth for comparison with the observed behavior. Whereas previous work has focused primarily on optimal allocation of a single resource, be it carbon or biomass, and ignored the role of mineral nutrients, our work seeks to develop a theory that acknowledges the importance of both carbon and mineral nutrients. This is in line with the functional equilibrium hypothesis, which states that optimal growth occurs when resources are allocated in such a way that no single resource is any more limiting than any other. This idea has its origin in economics [9] and has been studied across a wide range of experiments [10], although allometry may also play a role [11].

    This work is primarily an extension of the classical work by Iwasa and Roughgarden [1], who considered a model for photosynthate (C) allocation in annual plants. They assumed that photosynthate production depends on both shoots and roots, and that allocation is split between shoots, roots, and fruits so as to maximize fruit yield. They used optimal control theory to determine that fruit yield is maximized by a three-phase growth pattern, characterized by an initial phase of shoot-only or root-only growth, a period of balanced growth during which shoots and roots grow simultaneously, and ultimately a period of fruit-only growth at the end of the growing season.

    We present a model that extends the work of Iwasa and Roughgarden to a case that directly incorporates the role of mineral nutrients. In particular, we model the allocation of carbohydrates and mineral nutrients in an annual plant with the objective of optimizing fruit yield and use optimal control theory to determine the allocation strategies. In this paper, we will consider a case where shoots and roots require both resources, and, for simplicity, fruits only require carbohydrates, as in [1]. This simplification allows us to obtain mathematical results that provide an important proof of concept for the viability of this type of model, which we intend to use in the future to guide more general two-resource models.

    We find, among other things, that this addition results in four phases of growth, rather than three as seen in [1]. This additional phase consists of a period of mixed vegetative/reproductive growth, during which the fruits and shoots grow simultaneously. There has been some criticism of the use of optimal control models for carbon allocation because they typically predict a bang-bang control with a single switch between vegetative and reproductive growth, whereas this separation of function is not biologically realistic [12]. We find the emergence of this additional phase in our model to be evidence for its validity, even with the simplifying assumptions we are making. Furthermore, our results indicate that what is optimal for one plant may not be optimal for another, and optimal growth is largely dependent on initial conditions. We also conclude that this range of optimal strategies may have the effect of reducing population-level variation, thus driving a population toward common sizes and optimal yields. The question remains, however, as to whether plants actually have this degree of strategic plasticity.

We will use the notation \mathbf{x}(t) = (x_{1}(t), x_{2}(t), \ldots, x_{n}(t)) for the function \mathbf{x}:\mathbb{R}\to\mathbb{R}^{n} and \mathbf{g}(t,\mathbf{x},\mathbf{u}) = (g_{1}(t,\mathbf{x},\mathbf{u}), g_{2}(t,\mathbf{x},\mathbf{u}), \ldots, g_{n}(t,\mathbf{x},\mathbf{u})) for the function \mathbf{g}:\mathbb{R}\times\mathbb{R}^{n}\times\mathbb{R}^{m}\to\mathbb{R}^{n}. Of particular importance to us is a well-known class of optimal control problems of the form

\begin{gather} \max\limits_{\mathbf{u}}\int_{t_{0}}^{t_{1}} f(t,\mathbf{x},\mathbf{u})\, dt \\ \begin{aligned} &\text{subject to:}\quad \mathbf{x}'(t) = \mathbf{g}(t,\mathbf{x},\mathbf{u}),\quad \mathbf{x}(t_{0}) = \mathbf{x}_{0}\\ &\phantom{\text{subject to:}\quad} a_{k}\leq u_{k}\leq b_{k} \quad\text{for } k = 1,\ldots,m. \end{aligned} \end{gather} (2.1)

The solution process is formalized in Pontryagin's maximum principle [13,14]. Here, we form a Hamiltonian with n adjoints \lambda_{i}, one for each of the states x_{i}:

\begin{equation} H(t,\mathbf{x},\mathbf{u},\boldsymbol{\lambda}) = f(t,\mathbf{x},\mathbf{u}) + \boldsymbol{\lambda}(t)\cdot\mathbf{g}(t,\mathbf{x},\mathbf{u}), \end{equation} (2.2)

    where each adjoint satisfies the following:

\begin{equation} \lambda_{i}'(t) = -\frac{\partial H}{\partial x_{i}}(t,\mathbf{x},\mathbf{u},\boldsymbol{\lambda}),\quad \lambda_{i}(t_{1}) = 0 \quad\text{for } i = 1,\ldots,n. \end{equation} (2.3)

    An optimal pair (x,u) is characterized by a set of necessary conditions for optimality (2.4). We refer the reader to [13] for a thorough treatment of this class of optimal control problems.

\begin{equation} \begin{cases} u_{k} = a_{k}, &\text{if}\quad \frac{\partial H}{\partial u_{k}} < 0\\ a_{k}\leq u_{k}\leq b_{k}, &\text{if}\quad \frac{\partial H}{\partial u_{k}} = 0\\ u_{k} = b_{k}, &\text{if}\quad \frac{\partial H}{\partial u_{k}} > 0 \end{cases} \end{equation} (2.4)

    The control problem we will study is an extension of this class of problems, with additional constraints on the controls. We will discuss how the necessary conditions change when they are introduced, as well as include the derivations in Appendix A.

We now turn our attention to our model for resource allocation in annual plants, which extends the work of Iwasa and Roughgarden [1] to a scenario where the growth of the roots and shoots relies on two resources rather than one. For the sake of convenience, we will use the terms carbon and nitrogen to refer to more complicated classes of carbohydrates and mineral nutrients. As a simplifying assumption, we assume that fruits only require carbon for growth.

    This assumption makes the mathematical analysis more feasible, and aligns biologically with plants that encase small seeds in large carbon-rich fruits. While this may match our colloquial definition of fruits, e.g., watermelons, it is not an accurate representation of many types of annual plants for which the term fruits refers to seeds, nuts, etc., and for which nitrogen content is not negligible.

We consider an annual plant with three organs: shoots, roots, and fruits. Shoots consist of all above-ground vegetative biomass, and roots consist of all below-ground vegetative biomass. Fruits refer to any reproductive biomass, be it in the form of colloquial fruits, nuts, seeds, etc. The biomass of each organ is measured in units of carbon, which is used as a catch-all term for carbohydrates produced by the shoots. The functions S(t), R(t), and F(t) give the biomass of shoots, roots, and fruits, respectively, at time t throughout a fixed growing season [0, T]. We assume that the plant relies on two resources, carbon and nitrogen, though we use these terms loosely to refer to more complicated classes of carbohydrates and soil nutrients. Carbon is fixed by the shoots at a rate of C(S), and nitrogen is absorbed by the roots at a rate of N(R). Note that here, we made a simplifying assumption, as in reality the rate of carbon fixation depends on both shoots and roots via transpiration. Throughout the growing season, fractions of carbon u_{SC}(t), u_{RC}(t), and u_{FC}(t) and fractions of nitrogen u_{SN}(t), u_{RN}(t), and u_{FN}(t) are allocated to the shoots, roots, and fruits, respectively, at time t, and we assume that the plant can freely allocate resources. The resources pass through a synthesizing unit (SU) in each organ, where they are converted into biomass. We use Kooijman's parallel complementary synthesizing unit (PCSU) function from [15], employing the same simplification seen in [7, Appendix A], given by

\begin{equation} g(A,B) = \frac{A^{2}B + AB^{2}}{A^{2} + AB + B^{2}}, \end{equation} (3.1)

to provide the rate of tissue production when resources are provided at rates A and B. We can view these rates as representing the maximum rate of tissue production given the full utilization of that particular resource. Therefore, when resources are available in the stoichiometric ratio, we have that A = B. As the tissue is measured in units of carbon, we need conversion factors \nu_{S}, \nu_{R}, and \nu_{F}, which give the fixed stoichiometric C:N ratios in the shoots, roots, and fruits, respectively. We specify initial conditions S(0) = S_{0}, R(0) = R_{0}, and F(0) = 0, which leads to the following model:

\frac{dS}{dt} = g(u_{SC}C, \nu_{S}u_{SN}N),\quad S(0) = S_{0} (3.2)
\frac{dR}{dt} = g(u_{RC}C, \nu_{R}u_{RN}N),\quad R(0) = R_{0} (3.3)
\frac{dF}{dt} = g(u_{FC}C, \nu_{F}u_{FN}N),\quad F(0) = 0 (3.4)

As we are considering the case where fruits only require carbon, we implement this assumption by taking \nu_{F}\to\infty, and note that for any nonzero u_{FN}N, we have the limit

\begin{equation} \lim\limits_{\nu_{F}\to\infty}g(u_{FC}C, \nu_{F}u_{FN}N) = u_{FC}C. \end{equation} (3.5)

    This leads to the following simplified model, which will form the basis for our analysis.

\frac{dS}{dt} = g(u_{SC}C, \nu_{S}u_{SN}N),\quad S(0) = S_{0} (3.6)
\frac{dR}{dt} = g(u_{RC}C, \nu_{R}u_{RN}N),\quad R(0) = R_{0} (3.7)
\frac{dF}{dt} = u_{FC}C,\quad F(0) = 0 (3.8)

    Note that we suppress the arguments in most functions for convenience. This model is shown schematically in Figure 1.

    Figure 1.  Schematic for the model described by equations 3.6, 3.7, and 3.8.
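To make the model concrete, the short Python sketch below (our illustration, not part of the original model description) evaluates the right-hand side of (3.6), (3.7), and (3.8) for given controls; the resource-uptake functions C(S) and N(R) and all parameter values are purely illustrative assumptions.

import numpy as np

def g(A, B):
    # PCSU tissue-production rate (3.1), extended continuously by g(0, 0) = 0.
    if A == 0.0 and B == 0.0:
        return 0.0
    return (A**2 * B + A * B**2) / (A**2 + A * B + B**2)

def rhs(S, R, u, nu_S, nu_R, C_of_S, N_of_R):
    # Right-hand side of the simplified model (3.6)-(3.8).
    # u = (u_FC, u_SC, u_RC, u_SN, u_RN); C_of_S and N_of_R are the uptake rates.
    u_FC, u_SC, u_RC, u_SN, u_RN = u
    C = C_of_S(S)   # carbon fixed by the shoots
    N = N_of_R(R)   # nitrogen absorbed by the roots
    dS = g(u_SC * C, nu_S * u_SN * N)
    dR = g(u_RC * C, nu_R * u_RN * N)
    dF = u_FC * C
    return dS, dR, dF

# Illustrative (assumed) uptake functions and parameter values:
C_of_S = lambda S: 2.0 * np.sqrt(S)
N_of_R = lambda R: 1.0 * np.sqrt(R)
print(rhs(S=10.0, R=5.0, u=(0.0, 0.5, 0.5, 0.5, 0.5),
          nu_S=1.5, nu_R=1.0, C_of_S=C_of_S, N_of_R=N_of_R))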

Because the functions u_{FC}, u_{SC}, u_{RC}, u_{SN}, and u_{RN} represent fractions of carbon and nitrogen, we impose several restrictions. First, we require that each function be piecewise continuous and bounded between 0 and 1. Furthermore, as we assume full utilization of each resource, we assume that for all times t\in[0,T]

\begin{equation} u_{FC} + u_{SC} + u_{RC} = 1 = u_{SN} + u_{RN}. \end{equation} (3.9)

    Additionally, to be biologically realistic, we assume that the plant's capacity to collect resources increases continuously with biomass, meaning that we require both C and N to be continuously differentiable, and

\begin{equation} \frac{dC}{dS} > 0 \quad\text{and}\quad \frac{dN}{dR} > 0. \end{equation} (3.10)

    We also assume that the plant experiences possibly diminishing returns with increased biomass, meaning that

\begin{equation} \frac{d^{2}C}{dS^{2}} \leq 0 \quad\text{and}\quad \frac{d^{2}N}{dR^{2}} \leq 0. \end{equation} (3.11)

    Before we introduce the optimal control framework we will use for determining the optimal resource allocation strategy, there are several important features of the PCSU that merit discussion. As introduced in (3.1), for resource fluxes A and B, the PCSU function

\begin{equation} g(A,B) = \frac{A^{2}B + AB^{2}}{A^{2} + AB + B^{2}}, \end{equation} (3.1 revisited)

    gives the rate of tissue production. Note that when only one resource is present, the tissue production rate is zero as g(A,0)=0=g(0,B) for nonzero A and B. Additionally, this is still true when neither resource is present. In particular, writing (3.1) in polar coordinates allows us to show that g(A,B)0 as (A,B)(0,0). Because of this, we will take g to be the continuous extension of this function to the origin such that g(0,0)=0.

Now, by changing variables to either z_{B} = \frac{B}{A} or z_{A} = \frac{A}{B}, we can rewrite g as

\begin{equation} g(A,B) = AG(z_{B}) = BG(z_{A}), \end{equation} (3.12)

    where

\begin{equation} G(z) = \frac{z(1+z)}{1+z+z^{2}}. \end{equation} (3.13)

The fact that we can represent the PCSU function in this manner will aid in our analysis by restricting the nonlinearities present to a function of a single variable. By the above discussion, we observe that although z_{B} and z_{A} may be undefined when either A or B or both are zero, both AG(z_{B}) and BG(z_{A}) are zero when at least one of A or B is zero. It is important to note, however, that while g is continuous at the origin, G(z_{B}) and G(z_{A}) are not. Additionally, it will facilitate later analysis of the model to note that

\begin{equation} \frac{\partial g}{\partial A}(A,B) = \frac{\partial}{\partial A}\left[BG(z_{A})\right] = G'(z_{A}), \end{equation} (3.14)

so long as (A,B)\neq(0,0), and, recalling that g(0,0) = 0,

\begin{equation} \frac{\partial g}{\partial A}(0,0) = \lim\limits_{h\to 0}\frac{g(0+h,0) - g(0,0)}{h} = \lim\limits_{h\to 0}\frac{0}{h} = 0, \end{equation} (3.15)

although neither G'(z_{B}) nor G'(z_{A}) is continuous at the origin. We can obtain \frac{\partial g}{\partial B}(A,B) in the same manner, and so we have

\begin{equation} \frac{\partial g}{\partial X}(A,B) = \begin{cases} G'(z_{X}), & (A,B)\neq(0,0)\\ 0, & (A,B) = (0,0) \end{cases} \quad\text{for } X\in\{A,B\} \end{equation} (3.16)

    where

\begin{equation} G'(z) = \frac{1+2z}{(1+z+z^{2})^{2}}. \end{equation} (3.17)

Finally, it is worth noting that both G and G' are bounded between 0 and 1 and that

\begin{equation} G(0) = 0,\quad G'(0) = 1,\quad \lim\limits_{z\to\infty}G(z) = 1,\quad \lim\limits_{z\to\infty}G'(z) = 0. \end{equation} (3.18)

    There are two identities related to the PCSU function and its derivatives that will be repeatedly cited in later analysis. We will start with the following notation:

\begin{equation} G_{2}(z) = G'(1/z) = \frac{z^{3}(2+z)}{(1+z+z^{2})^{2}}. \end{equation} (3.19)

    Employing (3.19), we have the following. We omit the proofs as each can be verified directly.

    PCSU Identity 1.

\begin{equation} G(z) - zG'(z) = G_{2}(z). \end{equation} (3.20)

    PCSU Identity 2.

\begin{equation} G_{2}'(z) = -zG''(z). \end{equation} (3.21)
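Both identities, along with the change of variables (3.12), are straightforward to check symbolically; the following sympy snippet (ours, included only as an illustrative verification) does so.

import sympy as sp

A, B, z = sp.symbols('A B z', positive=True)
g = (A**2 * B + A * B**2) / (A**2 + A * B + B**2)   # PCSU rate (3.1)
G = z * (1 + z) / (1 + z + z**2)                    # (3.13)
G2 = sp.diff(G, z).subs(z, 1 / z)                   # G_2(z) = G'(1/z), as in (3.19)

# Change of variables (3.12): g(A, B) = A*G(B/A) = B*G(A/B)
print(sp.simplify(g - A * G.subs(z, B / A)))        # prints 0
print(sp.simplify(g - B * G.subs(z, A / B)))        # prints 0

# PCSU Identity 1 (3.20): G(z) - z*G'(z) = G_2(z)
print(sp.simplify(G - z * sp.diff(G, z) - G2))      # prints 0

# PCSU Identity 2 (3.21): G_2'(z) = -z*G''(z)
print(sp.simplify(sp.diff(G2, z) + z * sp.diff(G, z, 2)))   # prints 0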

    In this section, we will translate the original problem of finding the growth trajectory to maximize fruit yield to an optimal control problem. Noting that

    \begin{equation} F(T) = \int_{0}^{T}\frac{dF}{dt}\,dt + F(0) = \int_{0}^{T}u_{FC}(t)C(S)\,dt, \end{equation} (3.22)

    we can reframe our goal as maximizing the right-hand side of (3.22). By imposing the previously discussed constraints that the fractions of each resource, henceforth referred to as the controls, be bounded in [0, 1], and fractions of the same resource sum to unity, along with the differential equations for the biomass of each vegetative organ, we can associate the optimal growth trajectory with the solution to the following optimal control problem.

    \begin{gather} \max\limits_{\mathbf{u}}\int_{0}^{T} u_{FC}C\, dt \\ \begin{aligned} &\text{subject to:}\quad u_{iC} \geq 0,\ u_{jN} \geq 0 \quad \text{for } i = 0,1,2,\ j = 1,2\\ &\phantom{\text{subject to:}\quad} u_{FC} + u_{SC} + u_{RC} = 1 = u_{SN} + u_{RN}\\ &\phantom{\text{subject to:}\quad} \frac{dS}{dt} = g(u_{SC}C, \nu_{S}u_{SN}N),\quad S(0) = S_{0} \\ &\phantom{\text{subject to:}\quad}\frac{dR}{dt} = g(u_{RC}C, \nu_{R}u_{RN}N),\quad R(0) = R_{0} \\ \end{aligned} \end{gather} (3.23)

Note that here, we only require that the controls be non-negative, because the combination of those constraints with the equality constraints implicitly requires that each control also be no greater than one.

    We will solve the optimal control problem (3.23) using a set of necessary conditions that must be satisfied by the solution. The presence of the two equality constraints makes this problem non-standard, so we take the time to derive the necessary conditions for this type of problem. This derivation is found in Appendix A. Since we can essentially think of the carbon controls and the nitrogen controls separately, the necessary conditions for (3.23) are a combination of the conditions for the two types of problems discussed in Appendix A. We begin by forming a Hamiltonian with two piecewise differentiable adjoints, \lambda_{S}(t) and \lambda_{R}(t):

    \begin{equation} H = u_{FC}C + \lambda_{S}g(u_{SC}C, \nu_{S}u_{SN}N) + \lambda_{R}g(u_{RC}C, \nu_{R}u_{RN}N). \end{equation} (3.24)

    The necessary conditions for optimality are as follows:

    \begin{equation} \begin{cases} u_{iC} = 0 &\text{if}\quad \frac{\partial H}{\partial u_{iC}} < \frac{\partial H}{\partial u_{jC}}, \text{ for all }j\neq i\\ u_{iC} = 1 &\text{if}\quad \frac{\partial H}{\partial u_{iC}} > \frac{\partial H}{\partial u_{jC}}, \text{ for all }j\neq i\\ 0\leq u_{iC}\leq 1 &\text{if}\quad \frac{\partial H}{\partial u_{iC}} = \frac{\partial H}{\partial u_{jC}}, \text{ for any }j\neq i\\ u_{iN} = 0, u_{jN} = 1 &\text{if}\quad \frac{\partial H}{\partial u_{iN}} < \frac{\partial H}{\partial u_{jN}}\\ 0\leq u_{SN},u_{RN}\leq 1 &\text{if}\quad \frac{\partial H}{\partial u_{SN}} = \frac{\partial H}{\partial u_{RN}} \end{cases} \end{equation} (3.25)

What this essentially says is that all of the carbon is allocated to the organ whose carbon control has the largest partial derivative of the Hamiltonian, and likewise for the nitrogen controls; only when two of these partial derivatives are equal can the corresponding resource be shared. Note also that the Hamiltonian must be constant along the optimal trajectory because the optimal control problem (3.23) is autonomous [13].
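As a purely illustrative sketch of how (3.25) is applied (not part of the paper's numerical scheme), the helper below takes numeric values of the five partial derivatives and reports which organs may receive carbon and nitrogen; ties indicate that the resource may be shared.

def allocation_pattern(dH_duFC, dH_duSC, dH_duRC, dH_duSN, dH_duRN, tol=1e-9):
    # Apply the necessary conditions (3.25): each resource flows only to the
    # organ(s) whose control has the largest partial derivative of H.
    carbon = {'F': dH_duFC, 'S': dH_duSC, 'R': dH_duRC}
    nitrogen = {'S': dH_duSN, 'R': dH_duRN}
    c_max = max(carbon.values())
    n_max = max(nitrogen.values())
    carbon_organs = [k for k, v in carbon.items() if v >= c_max - tol]
    nitrogen_organs = [k for k, v in nitrogen.items() if v >= n_max - tol]
    return carbon_organs, nitrogen_organs

# Example with made-up values: carbon tied between fruits and shoots, nitrogen to shoots.
print(allocation_pattern(1.0, 1.0, 0.4, 0.7, 0.2))   # (['F', 'S'], ['S'])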

    As mentioned in Section 3.2, we can rewrite the PCSU function g by a change of variables. In particular, we can rewrite (3.6), (3.7), and (3.8) via (3.12) and the substitutions

    \begin{align} z_{SC} & = \frac{u_{SC}C}{\nu_{S}u_{SN}N} \end{align} (3.26)
    \begin{align} z_{RC} & = \frac{u_{RC}C}{\nu_{R}u_{RN}N} . \end{align} (3.27)

    This results in the following system

    \frac{dS}{dt} = \nu_{S}u_{SN}NG(z_{SC}),\quad S(0) = S_{0} (3.28)
    \frac{dR}{dt} = \nu_{R}u_{RN}NG(z_{RC}),\quad R(0) = R_{0} (3.29)
    \frac{dF}{dt} = u_{FC}C,\quad F(0) = 0 (3.30)

    where G is given by (3.13). Therefore, the Hamiltonian can be rewritten as

    \begin{equation} H = u_{FC}C + \lambda_{S}\nu_{S}u_{SN}NG(z_{SC}) + \lambda_{R}\nu_{R}u_{RN}NG(z_{RC}). \end{equation} (3.31)

    Now, as we previously mentioned, we can essentially think of the carbon controls and nitrogen controls separately. The above formulation provides an avenue for restricting the appearance of the carbon controls to the argument of G by letting us write

    \begin{equation*} g(\text{Carbon Flux},\text{Nitrogen Flux}) = \left(\text{Nitrogen Flux}\right) \cdot G\left(\frac{\text{Carbon Flux}}{\text{Nitrogen Flux}}\right). \end{equation*}

    When working directly with the nitrogen controls, it will be useful to write

    \begin{equation*} g(\text{Carbon Flux},\text{Nitrogen Flux}) = \left(\text{Carbon Flux}\right)\cdot G\left(\frac{\text{Nitrogen Flux}}{\text{Carbon Flux}}\right). \end{equation*}

    With this in mind, we can make the substitutions

    \begin{align} z_{SN} & = \frac{\nu_{S}u_{SN}N}{u_{SC}C} \end{align} (3.32)
    \begin{align} z_{RN} & = \frac{\nu_{R}u_{RN}N}{u_{RC}C}, \end{align} (3.33)

    and rewrite (3.6), (3.7), and (3.8) as

    \frac{dS}{dt} = u_{SC}CG(z_{SN}),\quad S(0) = S_{0} (3.34)
    \frac{dR}{dt} = u_{RC}CG(z_{RN}),\quad R(0) = R_{0} (3.35)
    \frac{dF}{dt} = u_{FC}C,\quad F(0) = 0. (3.36)

    As before, the Hamiltonian can be rewritten as

    \begin{equation} H = u_{FC}C + \lambda_{S}u_{SC}CG(z_{SN}) + \lambda_{R}u_{RC}CG(z_{RN}). \end{equation} (3.37)

    Using (3.31) and (3.37), we can compute the following partial derivatives:

    \begin{align} \frac{\partial H}{\partial u_{FC}} & = C \end{align} (3.38)
    \begin{align} \frac{\partial H}{\partial u_{SC}} & = \lambda_{S}\nu_{S}u_{SN}NG'(z_{SC})\frac{\partial}{\partial u_{SC}}z_{SC} = \lambda_{S}CG'(z_{SC}) \end{align} (3.39)
    \begin{align} \frac{\partial H}{\partial u_{RC}} & = \lambda_{R}\nu_{R}u_{RN}NG'(z_{RC})\frac{\partial}{\partial u_{RC}}z_{RC} = \lambda_{R}CG'(z_{RC}) \end{align} (3.40)
    \begin{align} \frac{\partial H}{\partial u_{SN}} & = \lambda_{S}u_{SC}CG'(z_{SN})\frac{\partial}{\partial u_{SN}}z_{SN} = \lambda_{S}\nu_{S}NG'(z_{SN}) \end{align} (3.41)
    \begin{align} \frac{\partial H}{\partial u_{RN}} & = \lambda_{R}u_{RC}CG'(z_{RN})\frac{\partial}{\partial u_{RN}}z_{RN} = \lambda_{R}\nu_{R}NG'(z_{RN}). \end{align} (3.42)

It is important to recall that by (3.16), these formulations of \frac{\partial g}{\partial A}(A, B) and \frac{\partial g}{\partial B}(A, B) are only valid so long as (A, B)\neq (0, 0); when (A, B) = (0, 0), these partial derivatives vanish.

    The last piece of the optimal control framework concerns the adjoints \lambda_{S} and \lambda_{R}. By the derivation of the necessary conditions in Appendix A, we have that

    \begin{align} \lambda_{S}' & = -\frac{\partial H}{\partial S},\quad \lambda_{S}(T) = 0 \end{align} (3.43)
    \begin{align} \lambda_{R}' & = -\frac{\partial H}{\partial R},\quad \lambda_{R}(T) = 0 . \end{align} (3.44)

    Making use of (3.31) and (3.37), we can express (3.43) and (3.44) as follows:

    \begin{align} \lambda_{S}' & = -C_{S}\left[u_{FC} + \lambda_{S}u_{SC}G'(z_{SC}) + \lambda_{R}u_{RC}G'(z_{RC})\right] \end{align} (3.45)
    \begin{align} \lambda_{R}' & = -N_{R}\left[\lambda_{S}\nu_{S}u_{SN}G'(z_{SN}) + \lambda_{R}\nu_{R}u_{RN}G'(z_{RN})\right] \end{align} (3.46)

    Note that here, we use the notation C_{S} = C'(S) and N_{R} = N'(R) for the sake of brevity and to reserve the prime notation for cases where either the derivative is taken with respect to time or when the argument of the function is made explicit.
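For concreteness, the sketch below (our illustration, with assumed inputs) evaluates the partial derivatives (3.38)-(3.42) at a point where all controls are strictly positive, so that the ratios z_{SC}, z_{RC}, z_{SN}, and z_{RN} are well defined.

def Gp(z):
    # G'(z) from (3.17)
    return (1 + 2 * z) / (1 + z + z**2)**2

def hamiltonian_partials(C, N, lam_S, lam_R, u, nu_S, nu_R):
    # Partial derivatives (3.38)-(3.42); u = (u_FC, u_SC, u_RC, u_SN, u_RN),
    # assumed strictly positive so that all four ratios exist.
    u_FC, u_SC, u_RC, u_SN, u_RN = u
    z_SC = u_SC * C / (nu_S * u_SN * N)   # (3.26)
    z_RC = u_RC * C / (nu_R * u_RN * N)   # (3.27)
    z_SN = 1.0 / z_SC                     # (3.32)
    z_RN = 1.0 / z_RC                     # (3.33)
    return {'H_uFC': C,                           # (3.38)
            'H_uSC': lam_S * C * Gp(z_SC),        # (3.39)
            'H_uRC': lam_R * C * Gp(z_RC),        # (3.40)
            'H_uSN': lam_S * nu_S * N * Gp(z_SN), # (3.41)
            'H_uRN': lam_R * nu_R * N * Gp(z_RN)} # (3.42)

print(hamiltonian_partials(C=5.0, N=2.0, lam_S=1.3, lam_R=1.1,
                           u=(0.1, 0.5, 0.4, 0.6, 0.4), nu_S=1.5, nu_R=1.0))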

    In this section, we will discuss the structure of the solution to (3.23). Generally speaking, the solution exhibits a four-phase structure. First, there is an initial phase of vegetative growth in which the plant addresses deficiencies in either shoots or roots. Second, the plant undergoes a phase of balanced growth when both the shoots and roots are growing. Third, there is a phase of mixed vegetative/reproductive growth during which the shoots continue to grow and the fruits begin growing simultaneously. The plant completes the growing season with a period of reproductive growth, during which only the fruits are growing.

    We will begin with the analysis of the four-phase structure of the optimal solution. Because we know that the adjoints, \lambda_{S} and \lambda_{R}, vanish at the end of the growing season, we use that as the starting point for our analysis and proceed in reverse.

We begin by showing that there exists a switching time after which all carbon is allocated to fruit production. We will refer to the interval between the switching time and T as the final interval. Note that by (3.43) and (3.44), we have that \lambda_{S}(T) = 0 = \lambda_{R}(T). Furthermore, recall that G' is bounded. Therefore, writing C(S(T)) = C^{*}, we have by (3.38), (3.39), and (3.40) that

    \begin{align} \frac{\partial H}{\partial u_{FC}}(T) & = C^{*} > 0 \end{align} (4.1)
    \begin{align} \frac{\partial H}{\partial u_{SC}}(T) & = 0 \end{align} (4.2)
    \begin{align} \frac{\partial H}{\partial u_{RC}}(T) & = 0. \end{align} (4.3)

    This implies that at t = T

\begin{equation} \frac{\partial H}{\partial u_{FC}}(t) > \frac{\partial H}{\partial u_{SC}}(t) \quad\text{and}\quad\frac{\partial H}{\partial u_{FC}}(t) > \frac{\partial H}{\partial u_{RC}}(t). \end{equation} (4.4)

    Therefore, by (3.25), we have that u_{FC}(T) = 1 and u_{SC}(T) = 0 = u_{RC}(T). That is, at the end of the growing season, the plant is allocating all of the available carbon to the fruits, which in turn means that only the fruits are growing at this time.

    Now, because both \lambda_{S} and \lambda_{R} are continuous, \lambda_{S}(T) = 0 = \lambda_{R}(T), and G' is bounded, there must be some \varepsilon > 0 such that for all t in [T- \varepsilon, T], we have that (4.4) still holds. So, we have the existence of a (potentially small) interval of fruit-only growth. Now, writing C_{S}(S(T)) = C_{S}^{*}, during this interval we have by (3.45) that

    \begin{equation} \lambda_{S}' = -C_{S}^{*} \end{equation} (4.5)

    because here, we have u_{FC} = 1, u_{SC} = 0 = u_{RC} and G' is bounded. Because \lambda_{S}(T) = 0, this implies that during this interval

    \begin{equation} \lambda_{S}(t) = C_{S}^{*}\cdot (T - t). \end{equation} (4.6)

    Additionally, because during this interval we have a scenario where for i = 1, 2 either only u_{iC} is zero or both u_{iC} and u_{iN} are zero, we have by (3.46) that

    \begin{equation} \lambda_{R}' = 0 . \end{equation} (4.7)

    This is because either u_{iC} alone is zero, in which case z_{iN} \to \infty and by (3.18) we have that G'(z_{iN})\to 0, or in the case that both u_{iC} and u_{iN} are zero, then u_{iN}G'(z_{iN}) = 0 because G' is bounded. In either case, we get (4.7). So, because \lambda_{R}(T) = 0, we have that the following holds throughout the interval:

    \begin{equation} \lambda_{R}(t) = 0. \end{equation} (4.8)

    Therefore, because G' is bounded, we have by (3.42) that

    \begin{equation*} \frac{\partial H}{\partial u_{RN}} = 0. \end{equation*}

    Since (3.41) is non-negative in this interval, it must be the case that

    \begin{equation} \frac{\partial H}{\partial u_{RN}} \leq \frac{\partial H}{\partial u_{SN}}. \end{equation} (4.9)

    If this inequality were strict, then by (3.25), we would have u_{SN} = 1. In this case, because u_{SC} = 0, we have that z_{SN} \to \infty, and so by (3.18), we have that G'(z_{SN}) \to 0. This, however, implies that

    \begin{equation*} \frac{\partial H}{\partial u_{SN}} = 0 = \frac{\partial H}{\partial u_{RN}}, \end{equation*}

    and so the inequality (4.9) cannot be strict, and both partial derivatives are zero in this interval. In particular, G'(z_{SN}) = 0, which means that z_{SN} \to \infty and so z_{SC} = 0, which by (3.18) implies that G'(z_{SC}) = 1. In summary, during this interval, we have

    \begin{equation} \frac{\partial H}{\partial u_{FC}} = C^{*},\quad \frac{\partial H}{\partial u_{SC}} = \lambda_{S}C^{*},\quad \frac{\partial H}{\partial u_{RC}} = \frac{\partial H}{\partial u_{SN}} = \frac{\partial H}{\partial u_{RN}} = 0. \end{equation} (4.10)

    Now, since \lambda_{S} is decreasing by (4.5), we see from (4.10) that the partial derivatives will maintain the same ordering as long as \lambda_{S} < 1. We define the switching point t^{*} to be the time when the plant switches to fruit-only growth. We can identify the switching point as the time when \lambda_{S}(t) = 1, and we can use (4.6) to define t^{*} implicitly by the equation

    \begin{equation} C_{S}^{*}\cdot(T - t^{*}) = 1. \end{equation} (4.11)

    By solving for t^{*} in (4.11), we obtain

    \begin{equation} t^{*} = T - \frac{1}{C_{S}^{*}}. \end{equation} (4.12)

    In doing so, note that we have extended this period of fruit-only growth from the interval [T - \varepsilon, T] to [t^{*}, T]. Note also that during the final interval, we have by (3.24) that

    \begin{equation*} H = C, \end{equation*}

    and because the Hamiltonian must be constant along the optimal trajectory, we have that

    \begin{equation} H = C^{*} \end{equation} (4.13)

    throughout the growing season.
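As a purely illustrative numerical example of the switching-time formula (4.12) (the values are made up), if the growing season ends at T = 100 days and the terminal shoot biomass is such that C_{S}^{*} = 0.2 units of carbon per unit of shoot biomass per day, then the switch to fruit-only growth occurs at t^{*} = 100 - 1/0.2 = 95 days.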

    We will now continue backwards to show that, prior to the final interval of fruit-only growth, we have a period of mixed vegetative/reproductive growth, during which both the shoots and fruits grow simultaneously. This interval will be referred to as the penultimate interval. Recall that at the beginning of the final interval, we have that \lambda_{S} = 1, and so (4.10) becomes

    \begin{equation} \frac{\partial H}{\partial u_{FC}} = \frac{\partial H}{\partial u_{SC}} = C,\quad \frac{\partial H}{\partial u_{RC}} = \frac{\partial H}{\partial u_{SN}} = \frac{\partial H}{\partial u_{RN}} = 0. \end{equation} (4.14)

    In order to apply (3.25) we need to determine the ordering of the partial derivatives of H in an interval immediately prior to t^{*}. We will accomplish this by a sequence of three lemmas, the proofs of which are included in Appendix B.1. We begin with the following inequality.

    Lemma 1. \frac{\partial H}{\partial u_{RC}} < \frac{\partial H}{\partial u_{FC}} in an open interval immediately prior to t^{*}.

Note that, regardless of where \frac{\partial H}{\partial u_{SC}} falls in the order of partial derivatives, the strict inequality in Lemma 1 means that there is no root growth during this period of time immediately prior to t^{*}. Next, we have the following, which indicates that, at this point in the growing season, allocation to shoots and allocation to fruits are equally important to overall fitness.

    Lemma 2. \frac{\partial H}{\partial u_{SC}} = \frac{\partial H}{\partial u_{FC}} in an open interval immediately prior to t^{*}.

    Putting together Lemmas 1 and 2, we get

    \begin{equation} \frac{\partial H}{\partial u_{RC}} < \frac{\partial H}{\partial u_{SC}} = \frac{\partial H}{\partial u_{FC}}. \end{equation} (4.15)

    By (3.25), this means that u_{RC} = 0, and by (3.38) and (3.39), we obtain

    \begin{equation} \lambda_{S}G'(z_{SC}) = 1 \end{equation} (4.16)

    during this interval. Next, we state the following result, which means that the shoots are growing during this stage. This will then be used to show that this interval is indeed marked by simultaneous fruit and shoot growth.

    Lemma 3. u_{SN} = 1 and u_{SC} > 0 in an open interval immediately prior to t^{*}.

At this point, note that we have only shown that

    \begin{equation*} u_{SC} > 0,\quad u_{RC} = 0, \quad u_{SN} = 1, \quad u_{RN} = 0. \end{equation*}

    Lastly, as \lambda_{S} > 1 for this open interval before t^{*}, \lambda_{S}(t^{*}) = 1, and \lambda_{S} is continuous, we have that

    \begin{equation} 1 = \lim\limits_{t\to t^{*-}}\lambda_{S} = \lim\limits_{t\to t^{*-}}\frac{1}{G'(z_{SC})} \implies \lim\limits_{t\to t^{*-}} z_{SC} = 0. \end{equation} (4.17)

    As we have already verified that u_{SN} = 1, this means that

    \begin{equation} \lim\limits_{t\to t^{*-}}u_{SC} = 0, \end{equation} (4.18)

    and so u_{SC} is continuous at t^{*}. This also means that 0 < u_{SC} < 1 during an open interval immediately prior to t^{*}, so here u_{FC} > 0. Therefore, we have established the existence of a penultimate interval, during which

    \begin{equation*} u_{FC},u_{SC} > 0,\quad u_{RC} = 0,\quad u_{SN} = 1,\quad u_{RN} = 0, \end{equation*}

    that is, fruits and shoots alone grow simultaneously.

    Before discussing the stage that precedes the penultimate interval, we will briefly digress to discuss a key interpretation that will be relevant to the remaining portions of the growth path. One of the most elegant results from the work of Iwasa and Roughgarden in [1] was the relationship between the optimal growth trajectory and the marginal values of the sizes of the organs. In this context, the marginal value of the size of an organ at time t essentially measures the additional fruits present at time T given a unit increase in the size of that organ at time t. In their model, optimal growth was characterized as directing photosynthate to whichever organ had the highest marginal value, and only in the case that two organs had the same marginal value could resources be shared.

    Iwasa and Roughgarden identified the adjoints in their model as the marginal values of the shoots and roots, and we can make a similar identification here. Let J(S_{0}, R_{0}, t_{0}) be the maximum for the following optimal control problem.

    \begin{gather} \max\limits_{\mathbf{u}}\int_{t_{0}}^{T} u_{FC}C\, dt \\ \begin{aligned} &\text{subject to:}\quad u_{iC} \geq 0,\ u_{jN} \geq 0 \quad\text{for } i = 0,1,2,\ j = 1,2\\ &\phantom{\text{subject to:}\quad} u_{FC} + u_{SC} + u_{RC} = 1 = u_{SN} + u_{RN}\\ &\phantom{\text{subject to:}\quad} \frac{dS}{dt} = g(u_{SC}C, \nu_{S}u_{SN}N),\quad S(t_{0}) = S_{0} \\ &\phantom{\text{subject to:}\quad}\frac{dR}{dt} = g(u_{RC}C, \nu_{R}u_{RN}N),\quad R(t_{0}) = R_{0} \\ \end{aligned} \end{gather} (4.19)

Note that this only differs from (3.23) in that the initial time is allowed to vary, and so if we initialize (4.19) with a point along the optimal trajectory for (3.23), the optimal trajectories for the two problems will coincide [13]. In this context, given t\in [0, T] and S(t), R(t) optimal for (3.23), we have that

    \begin{align} \lambda_{S}(t) & = \frac{\partial J}{\partial S}(S(t),R(t),t) \end{align} (4.20)
    \begin{align} \lambda_{R}(t) & = \frac{\partial J}{\partial R}(S(t),R(t),t) \end{align} (4.21)

which means that \lambda_{S}(t) and \lambda_{R}(t) measure the increase in the objective function given a small instantaneous increase in either shoots or roots, respectively. See [16] for a more general treatment. Therefore, as J(S(t), R(t), t) = F(T), we can identify \lambda_{S}(t) and \lambda_{R}(t) as the marginal values of the sizes of shoots and roots at time t, respectively. These values change over time, and by (3.43) and (3.44), we know they are both zero at the end of the growing season. This reflects both the compounding effects of investing biomass in shoots/roots over time and the idea that these investments have less of an effect on fruit yield later in the season. Furthermore, as resources invested in fruits are directly converted to tissue, and fruits do not have the capacity to increase their own biomass, we can identify 1 as the marginal value of the size of fruits. That is, a unit of fruits added early on is equivalent to a unit of fruit added near the end of the growing season. It is also worth mentioning that if we changed how we modeled fruits, for example by making fruits photosynthetic, then the marginal value of fruits would change over time to reflect the compounding effects of investing biomass in fruits.

In Iwasa and Roughgarden's model [1], the organ with the highest marginal value was the organ that received the full supply of photosynthate. However, when we consider the optimal trajectory laid out so far, we see that this is not the case in our model, which raises the question of why. In particular, in Section 4.2, we see by (4.16) that \lambda_{S}G'(z_{SC}) = 1 throughout the interval. Because z_{SC}\neq 0 on the interior of this interval, we know that G'(z_{SC}) < 1 and so \lambda_{S}(t) > 1. Furthermore, as \lambda_{R} = 0 at the end of the penultimate interval, we have a region where \lambda_{S}(t) > 1 > \lambda_{R}(t), meaning that the marginal value of the size of shoots is greater than either of the other marginal values, and yet both shoots and fruits are growing.

    The reason for this apparent contradiction stems from the addition of the synthesizing units in our model. Consider again the end of the penultimate interval. Here, we see that because the marginal value of shoots is highest, an instantaneous increase in shoots would be better for the plant than either an instantaneous increase in roots or fruits of the same amount. Note, however, that only the control functions can increase instantaneously, and it can be shown that during this phase, the shoot SU alone cannot outperform a mixed shoot/fruit strategy. We will make this more precise below.

    Keeping our discussion limited to the penultimate interval, there are two things in particular that we will examine to help us understand why resources are not necessarily directed solely to the organ with the highest marginal value; in this case, shoots. First, we will give a biological interpretation of the necessary conditions during this interval; second, we will use a linear approximation to directly compare the trade-off between a shoot-only strategy and a mixed fruit/shoot strategy during the penultimate interval.

    To this end, recall that during the penultimate interval, we have

    \begin{equation} \lambda_{S}G'(z_{SC}) = 1 \end{equation} (4.16 revisited)

    which relates the marginal values for shoots and fruits. To understand the biological relevance of this statement, we need an interpretation of G'(z_{SC}). By (3.16), we have that

    \begin{equation*} G'(z_{SC}) = \frac{\partial g}{\partial A}(u_{SC}C,\nu_{S}N) \approx g(u_{SC}C + 1, \nu_{S}N) - g(u_{SC}C, \nu_{S}N) \end{equation*}

    during the penultimate interval. That is, G'(z_{SC}) is approximately the additional units of shoot present per unit of time given a unit increase in the rate at which carbon is sent to the shoots. Multiplying this by the marginal value of shoots, \lambda_{S}(t), approximates the additional fruits at time T given the additional shoots produced by a unit increase in carbon allocation to shoots. We can think of the marginal value of fruits being 1 in the same way, in that if we increase the fruit carbon flux, u_{FC}C, by 1, we have an additional

    \begin{equation*} (u_{FC}C + 1) - u_{FC}C = 1 \end{equation*}

    unit of fruit per unit time, which yields an additional unit of fruit at time T. So, (4.16) essentially has the biological meaning that neither the shoot SU nor the fruit SU can outperform the other, and so a mixed strategy is best. In Iwasa and Roughgarden's model [1], raising the rate of carbon flux to an organ by one unit was approximately equivalent in the long run to increasing the biomass of that organ by one unit, which is why comparing marginal values was equivalent to using the necessary conditions.

    It is interesting to consider what happens when we attempt to construct a solution based on the marginal values alone during the penultimate interval. To this end, consider a point t^{\#} in the penultimate interval such that \lambda_{S}(t^{\#}) > 1 > \lambda_{R}(t^{\#}). A growth strategy based solely on the marginal values would suggest shoot-only growth, whereas we know that a mixed fruit/shoot strategy is optimal here based on the necessary conditions. We will examine what happens if at time t^{\#} there is a length- \varepsilon deviation from the optimal solution during which the mixed fruit/shoot strategy is replaced with a shoot-only strategy. To facilitate this analysis, we have the following theorem.

    Theorem 1. Let t^{\#} be a point in the penultimate interval such that \lambda_{S}(t^{\#}) > 1 > \lambda_{R}(t^{\#}) and such that this ordering of the marginal values holds on [t^{\#}, t^{\#} + \varepsilon] for some \varepsilon > 0. Let F(T) be the value obtained by following a mixed fruit/shoot strategy on [t^{\#}, t^{\#} + \varepsilon] and then following the optimal trajectory on (t^{\#} + \varepsilon, T]. Let F_{\text{alt}}(T) be the value obtained by following a shoot-only strategy S_{alt}(t) on [t^{\#}, t^{\#} + \varepsilon] and then following the optimal trajectory on (t^{\#} + \varepsilon, T]. Then, to leading order,

    \begin{equation} F(T) - F_{\mathit{\text{alt}}}(T) \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\mathit{\text{alt}}}'(t^{\#}))\right) > 0. \end{equation} (4.22)

    Note that this not only gives us an approximation for the trade-off between the two strategies but also means that to leading order, the solution we have derived is better than the alternative solution suggested solely based on the marginal values. A full proof of Theorem 1 can be found in Appendix B.2, but here we will give an intuitive explanation for the approximation, as well as some numerical evidence.

    We can use the marginal values for shoots and fruits at time t^{\#} to approximate the impact that each strategy on the interval [t^{\#}, t^{\#} + \varepsilon] will ultimately have on the value of F(T). First, we will consider a mixed fruit/shoot strategy on [t^{\#}, t^{\#} + \varepsilon]; then we have

    \begin{align*} F(t^{\#} + \varepsilon) - F(t^{\#}) &\approx \varepsilon F'(t^{\#})\\ S(t^{\#} + \varepsilon) - S(t^{\#}) &\approx \varepsilon S'(t^{\#}), \end{align*}

    and so the total contribution to F(T) during the interval [t^{\#}, t^{\#} + \varepsilon] is approximately

    \begin{equation*} \varepsilon F'(t^{\#}) + \lambda_{S}(t^{\#}) \varepsilon S'(t^{\#}). \end{equation*}

    Likewise, if we consider a shoot-only strategy on [t^{\#}, t^{\#} + \varepsilon], then we have

    \begin{equation*} S_{\text{alt}}(t^{\#} + \varepsilon) - S_{\text{alt}}(t^{\#}) \approx \varepsilon S_{\text{alt}}'(t^{\#}), \end{equation*}

    and so the total contribution to F_{\text{alt}}(T) here is approximately \lambda_{S}(t^{\#}) \varepsilon S'_{\text{alt}}(t^{\#}). Therefore, assuming that \varepsilon is small enough that any difference between the optimal growth paths that the two solutions follow after t = t^{\#} + \varepsilon is negligible, we conclude that

    \begin{equation*} F(T) - F_{\text{alt}}(T) \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right). \end{equation*}

    We use a numerical experiment to illustrate (4.22). Using methods we will discuss in Section 6, we directly compared the difference F(T) - F_{\text{alt}}(T) with the approximation \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right). Here, we used the terminal conditions S(T) = 222.11 and R(T) = 111.55 (see Section 7.2). Using ten linearly spaced points in the penultimate interval where \lambda_{S} > 1 > \lambda_{R}, and starting at each point, we simulated shoot-only growth on a length- \varepsilon interval, with \varepsilon = 10^{-3}, and then continued along the trajectory that is optimal for the new states at the end of each interval. This allowed us to compute F(T) - F_{\text{alt}}(T) directly. We also computed the estimate \varepsilon\left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right) for each of the ten sampled points, and then compared the computed value to the estimate. The results of this experiment are shown in Figure 2. Observe that the estimate follows the computed values, which illustrates the accuracy of our estimate. Note also that F(T) - F_{\text{alt}}(T) > 0, meaning that the mixed fruit/shoot strategy outperforms the shoot-only strategy, as claimed.

Figure 2.  Numerical verification of (4.22) at ten points during the penultimate interval.
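The sketch below (our illustration; it is not the scheme of Section 6, and all inputs are assumed values) shows how the leading-order estimate in (4.22) can be evaluated at a single point t^{\#} in the penultimate interval, using (4.16) to recover the mixed-strategy controls from \lambda_{S}.

from scipy.optimize import brentq

def G(z):  return z * (1 + z) / (1 + z + z**2)        # (3.13)
def Gp(z): return (1 + 2 * z) / (1 + z + z**2)**2     # (3.17)

def estimate_gap(lam_S, C, N_star, nu_S, eps):
    # Leading-order estimate (4.22) of F(T) - F_alt(T) at a point t# in the
    # penultimate interval with lam_S > 1.  All arguments are assumed values;
    # N_star is the (constant) nitrogen uptake rate during this phase.
    # Mixed strategy: (4.16) gives G'(z_SC) = 1/lam_S; invert G' (it decreases from 1 to 0).
    z_SC = brentq(lambda z: Gp(z) - 1.0 / lam_S, 0.0, 1e6)
    u_SC = z_SC * nu_S * N_star / C     # from (3.26) with u_SN = 1; assumed u_SC <= 1
    u_FC = 1.0 - u_SC
    S_dot = nu_S * N_star * G(z_SC)     # shoot growth rate, (3.28) with u_SN = 1
    F_dot = u_FC * C                    # fruit growth rate, (3.30)
    # Alternative shoot-only strategy: u_SC = 1, u_SN = 1
    S_dot_alt = nu_S * N_star * G(C / (nu_S * N_star))
    return eps * (F_dot + lam_S * (S_dot - S_dot_alt))

print(estimate_gap(lam_S=1.4, C=5.0, N_star=2.0, nu_S=1.5, eps=1e-3))   # positive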

    At this point, we have established the existence of a final interval during which only the fruits are growing, and a period before this when both the shoots and fruits are growing together. Continuing backwards we have arrived at a juncture where there are seemingly several options. This is made complicated by the fact that z_{RC} is not defined during the penultimate interval, and so it is not clear directly from the necessary conditions (3.25) which growth pattern is optimal here.

    Prior to this period of mixed shoot/fruit growth, there are several possibilities: fruit growth, root/fruit growth, root/shoot/fruit growth, root/shoot growth, shoot-only growth, and root-only growth. Because we assume that fruits are only carbon-dependent, it is never optimal for the plant to invest fully in fruits unless the shoots are done growing. We can also rule out either shoot-only growth or root-only growth here. As we will subsequently see, the first stage of growth consists of either shoot-only growth or root-only growth, and while it may be the case that the length of this second phase is infinitesimally small, we still consider it as distinct from the first phase. That said, it is not optimal to alternate between growing only shoots and only roots, because after addressing any initial imbalance between shoots and roots, to do so would only create further resource limitation for the growing organ.

    This leaves us with three remaining options: root/fruit growth, root/shoot/fruit growth, or root/shoot growth. We will use an argument based on our discussion of marginal values in Section 4.3 to eliminate both root/fruit and root/shoot/fruit growth. The necessary conditions (3.25) provide conditions that must be met on an open interval in each case:

    \begin{align*} &\text{root/fruit growth:} &\lambda_{R}G'(z_{RC}) = 1\\ &\text{root/shoot/fruit growth:} &\lambda_{S}G'(z_{SC}) = \lambda_{R}G'(z_{RC}) = 1\\ &\text{root/shoot growth:} &\lambda_{S}G'(z_{SC}) = \lambda_{R}G'(z_{RC}) > 1. \end{align*}

As with \lambda_{S}G'(z_{SC}) in Section 4.3, we can think of \lambda_{R}G'(z_{RC}) as approximating the additional units of fruit at time T given the additional roots produced by a unit increase in carbon allocation to roots. When the growth path follows \lambda_{R}G'(z_{RC}) = 1, this means that, with respect to carbon, the root SU can do no better for the plant in the long run than the fruit SU can do at present. In the same vein, when a growth path follows \lambda_{R}G'(z_{RC}) > 1, the long-term returns of investing carbon in roots exceed the immediate returns of investing carbon in fruits. This means that root/shoot growth is optimal here, which is consistent with the work of [1].

    It is not optimal to have an interval without fruit growth between two intervals that include fruit growth. Such a growth path could be improved by swapping the first two intervals so that the fruits grow continuously. This means that there is no fruit growth prior to the balanced growth phase. As we have already discussed the idea that alternating phases of shoot-only and root-only growth is not optimal, we are left with one option for the initial phase, namely that it consists of either shoot-only growth or root-only growth. Here, the plant addresses deficiencies present in either vegetative organ depending on the initial conditions. Note that this is also the first stage that we see in the single-resource case described in [1].

    In this section, we will present the basic equations governing the dynamics in each phase of the solution to (3.23) as well as show that z_{SC} and z_{RC} are continuous between any two consecutive phases in which they are defined. To keep this section organized, we will go through the dynamics for each phase in chronological order and then go on to discuss the transitions between phases. We will also limit this section to include only phase-specific versions of the differential equations for states and adjoints, relevant algebraic constraints due to the necessary conditions, and any additional equations used for understanding the transitions between phases. When we discuss the numerical scheme, we will make use of several additional equations that govern dynamics in various stages but are not relevant here. In what follows, we will also make frequent use of the notation G_{2}(x) = G'(1/x).

    During shoot-only growth, we have

    \begin{equation} u_{FC} = 0,\quad u_{SC} = 1,\quad u_{RC} = 0,\quad u_{SN} = 1,\quad u_{RN} = 0 \end{equation} (5.1)

    and only S, \lambda_{S}, and \lambda_{R} are changing, with differential equations in time given by

    \begin{align} S' & = \nu_{S}NG\left(z_{SC}\right) \end{align} (5.2)
    \begin{align} \lambda_{S}' & = -\lambda_{S}C_{S}G'\left(z_{SC}\right) \end{align} (5.3)
    \begin{align} \lambda_{R}' & = -\lambda_{S}N_{R}\nu_{S}G_{2}\left(z_{SC}\right). \end{align} (5.4)

    Here, we have that

    \begin{equation} z_{SC} = \frac{C}{\nu_{S}N}. \end{equation} (5.5)

    Note also that in this stage, both N and N_{R} are constant because R is constant. Furthermore, because we know by (4.13) that H = C^{*}, the Hamiltonian (3.31) during this phase becomes

    \begin{equation} C^{*} = \lambda_{S}\nu_{S}NG(z_{SC}). \end{equation} (5.6)

    By solving for \lambda_{S}, and employing (3.20), we obtain two expressions for \lambda_{S} during this phase:

    \begin{align} \lambda_{S} & = \frac{C^{*}}{\nu_{S}NG(z_{SC})} \end{align} (5.7)
    \begin{align} \lambda_{S} & = \frac{C^{*}}{\nu_{S}NG_{2}(z_{SC}) + CG'(z_{SC})}. \end{align} (5.8)

    During root-only growth, we have

    \begin{equation} u_{FC} = 0,\quad u_{SC} = 0,\quad u_{RC} = 1,\quad u_{SN} = 0,\quad u_{RN} = 1 \end{equation} (5.9)

    and only R, \lambda_{S}, and \lambda_{R} are changing, with differential equations in time given by

    \begin{align} R' & = \nu_{R}NG(z_{RC}) \end{align} (5.10)
    \begin{align} \lambda_{S}' & = -\lambda_{R}C_{S}G'(z_{RC}) \end{align} (5.11)
    \begin{align} \lambda_{R}' & = -\lambda_{R}N_{R}\nu_{R}G_{2}\left(z_{RC}\right). \end{align} (5.12)

    Here, we have that

    \begin{equation} z_{RC} = \frac{C}{\nu_{R}N}. \end{equation} (5.13)

    Note also that in this stage, both C and C_{S} are constant because S is constant. Furthermore, because we know by (4.13) that H = C^{*}, the Hamiltonian (3.31) during this phase becomes

    \begin{equation} C^{*} = \lambda_{R}\nu_{R}NG(z_{RC}). \end{equation} (5.14)

    By solving for \lambda_{R}, and employing (3.20), we obtain two expressions for \lambda_{R} during this phase:

    \begin{align} \lambda_{R} & = \frac{C^{*}}{\nu_{R}NG(z_{RC})} \end{align} (5.15)
    \begin{align} \lambda_{R} & = \frac{C^{*}}{\nu_{R}NG_{2}(z_{RC}) + CG'(z_{RC})}. \end{align} (5.16)

    During balanced growth, we have

    \begin{equation} u_{FC} = 0,\quad 0 \le u_{SC} \le 1,\quad u_{RC} = 1 - u_{SC},\quad 0 \le u_{SN} \le 1,\quad u_{RN} = 1-u_{SN} \end{equation} (5.17)

    and so by (3.25), we have that the following two conditions must hold during this phase

    \begin{align} \lambda_{S}G'(z_{SC}) & = \lambda_{R}G'(z_{RC}) > 1 \end{align} (5.18)
    \begin{align} \lambda_{S}\nu_{S}G_{2}(z_{SC}) & = \lambda_{R}\nu_{R}G_{2}(z_{RC}). \end{align} (5.19)

    During this stage, S, R, \lambda_{S}, and \lambda_{R} are changing. The differential equations in time for these four during this phase are given by

    \begin{align} S' & = \nu_{S}u_{SN}NG(z_{SC}) \end{align} (5.20)
    \begin{align} R' & = \nu_{R}(1-u_{SN})NG(z_{RC}) \end{align} (5.21)
    \begin{align} \lambda_{S}' & = -C_{S}\lambda_{S}G'(z_{SC}) = -C_{S}\lambda_{R}G'(z_{RC}) \end{align} (5.22)
    \begin{align} \lambda_{R}' & = - N_{R}\lambda_{S}\nu_{S}G_{2}(z_{SC}) = -N_{R}\lambda_{R}\nu_{R}G_{2}(z_{RC}). \end{align} (5.23)

    Furthermore, again taking advantage of the fact that (4.13) gives us H = C^{*}, we can rewrite the Hamiltonian (3.31) during this phase as

    \begin{equation} C^{*} = \lambda_{S}\nu_{S}u_{SN}NG(z_{SC}) + \lambda_{R}\nu_{R}u_{RN}NG(z_{RC}). \end{equation} (5.24)

    Rewriting this using (3.20), we get

    \begin{equation*} C^{*} = \lambda_{S}\nu_{S}u_{SN}N\left[G_{2}(z_{SC}) + z_{SC}G'(z_{SC})\right] + \lambda_{R}\nu_{R}u_{RN}N\left[G_{2}(z_{RC}) + z_{RC}G'(z_{RC})\right]. \end{equation*}

    Simplifying, and using (5.18) and (5.19) to write everything in terms of z_{SC}, gives us

    \begin{align} C^{*}& = \lambda_{S}\nu_{S}u_{SN}N\left[G_{2}(z_{SC}) + z_{SC}G'(z_{SC})\right] + \lambda_{S}u_{RN}N\left[\nu_{S}G_{2}(z_{SC}) + \nu_{R}z_{RC}G'(z_{SC})\right] \\ & = \lambda_{S}\left[\nu_{S}NG_{2}(z_{SC})(u_{SN} + u_{RN}) + CG'(z_{SC})(u_{SC} + u_{RC})\right] \\ & = \lambda_{S}\left[\nu_{S}NG_{2}(z_{SC}) + CG'(z_{SC})\right]. \end{align} (5.25)

    Solving for \lambda_{S} gives us

    \begin{equation} \lambda_{S} = \frac{C^{*}}{\nu_{S}NG_{2}(z_{SC}) + CG'(z_{SC})}. \end{equation} (5.26)

    Using (5.18) and (5.19) to rewrite (5.25) in terms of z_{RC} instead leads to

    \begin{equation} C^{*} = \lambda_{R}\left[\nu_{R}NG_{2}(z_{RC}) + CG'(z_{RC})\right], \end{equation} (5.27)

    which upon solving for \lambda_{R} gives us

    \begin{equation} \lambda_{R} = \frac{C^{*}}{\nu_{R}NG_{2}(z_{RC}) + CG'(z_{RC})}. \end{equation} (5.28)

    Note that (5.26) and (5.28) are consistent with (5.8) during shoot-only growth, and (5.16) during root-only growth. While we do not assume that either z_{SC} or z_{RC} is continuous between phases (because the controls need not be), we will use this consistency later to show continuity between the initial and balanced growth phases.
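To illustrate how the algebraic conditions (5.18) and (5.19) pin down the two flux ratios once the adjoints are known, the following sketch (ours; the adjoint and stoichiometric values are assumptions, and this is not the numerical scheme of Section 6) solves them with a generic root finder.

from scipy.optimize import fsolve

def Gp(z): return (1 + 2 * z) / (1 + z + z**2)**2          # (3.17)
def G2(z): return z**3 * (2 + z) / (1 + z + z**2)**2       # (3.19)

def balanced_growth_ratios(lam_S, lam_R, nu_S, nu_R, guess=(0.5, 0.5)):
    # Solve (5.18)-(5.19) for (z_SC, z_RC) given the adjoints and C:N ratios.
    def eqs(z):
        z_SC, z_RC = z
        return [lam_S * Gp(z_SC) - lam_R * Gp(z_RC),                  # (5.18)
                lam_S * nu_S * G2(z_SC) - lam_R * nu_R * G2(z_RC)]    # (5.19)
    return fsolve(eqs, guess)

z_SC, z_RC = balanced_growth_ratios(lam_S=2.2, lam_R=2.4, nu_S=1.5, nu_R=1.0)
print(z_SC, z_RC)   # for these assumed adjoints, lam_S*Gp(z_SC) also exceeds 1, per (5.18)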

    During the penultimate interval, we have

    \begin{equation} 0\le u_{FC}\le 1,\quad u_{SC} = 1 - u_{FC},\quad u_{RC} = 0,\quad u_{SN} = 1,\quad u_{RN} = 0 \end{equation} (5.29)

    and so by (3.25), we have that

    \begin{equation} \lambda_{S}G'(z_{SC}) = 1. \end{equation} (4.16 revisited)

Note that here, because the roots are no longer growing, both N and N_{R} are constant; we write these values as N^{*} and N_{R}^{*}, and we have

    \begin{equation} z_{SC} = \frac{u_{SC}C}{\nu_{S}N^{*}}. \end{equation} (5.30)

    During this interval, S, F, \lambda_{S}, and \lambda_{R} are all changing. The differential equations in time for these four during this stage are

    \begin{align} S' & = \nu_{S}N^{*}G(z_{SC}) \end{align} (5.31)
    \begin{align} F' & = u_{FC}C \end{align} (5.32)
    \begin{align} \lambda_{S}' & = -C_{S} \end{align} (5.33)
    \begin{align} \lambda_{R}' & = -N_{R}^{*}\lambda_{S}\nu_{S}G_{2}(z_{SC}) \end{align} (5.34)

    Furthermore, again taking advantage of the fact that (4.13) gives us H = C^{*}, we can rewrite the Hamiltonian (3.31) during this phase as

    \begin{equation} C^{*} = u_{FC}C + \lambda_{S}\nu_{S}N^{*}G(z_{SC}). \end{equation} (5.35)

    Using (3.20) and (5.30), we get

    \begin{align*} C^{*} & = u_{FC}C + \lambda_{S}\nu_{S}N^{*}\bigl(G_{2}(z_{SC}) + z_{SC}G'(z_{SC})\bigr)\\ & = u_{FC}C + \lambda_{S}\nu_{S}N^{*}G_{2}(z_{SC}) + \lambda_{S}G'(z_{SC})u_{SC}C \end{align*}

    which by (4.16) becomes

    \begin{equation} C^{*} = u_{FC}C + u_{SC}C + \lambda_{S}\nu_{S}N^{*}G_{2}(z_{SC}) = C + \lambda_{S}\nu_{S}N^{*}G_{2}(z_{SC}). \end{equation} (5.36)

    Now, at this point, we can either solve for \lambda_{S} and obtain

    \begin{equation} \lambda_{S} = \frac{C^{*} - C}{\nu_{S}N^{*}G_{2}(z_{SC})}, \end{equation} (5.37)

    or we can use (4.16) to rewrite (5.36) as

    \begin{equation*} C^{*} = C\lambda_{S}G'(z_{SC}) + \lambda_{S}\nu_{S}N^{*}G_{2}(z_{SC}) \end{equation*}

    and solve to obtain the familiar expression

    \begin{equation} \lambda_{S} = \frac{C^{*}}{\nu_{S}N^{*}G_{2}(z_{SC}) + CG'(z_{SC})}. \end{equation} (5.38)

    As we commented earlier, the reappearance of this expression for \lambda_{S} suggests that z_{SC} may be continuous between the balanced growth phase and penultimate interval. We will prove this later with the help of (5.37).

    Another important result in the penultimate interval, which we prove here, is that z_{SC} is strictly decreasing throughout this phase. This will be especially important because, as we will see in Section 6.3, it will be advantageous for us to think of z_{SC}, rather than t, as the integration variable for the numerical scheme during the penultimate interval. In particular, we will see that numerically solving the differential equations for this phase in t requires integrating a singularity, whereas solving the differential equations in z_{SC} does not.

    Lemma 4. z_{SC} is strictly decreasing during the penultimate interval.

    Proof. Recall that during the penultimate interval, we have the relationship

    \begin{equation} \lambda_{S}G'(z_{SC}) = 1. \end{equation} (4.16 revisited)

    Differentiating both sides with respect to t yields

    \begin{equation*} \lambda_{S}'G'(z_{SC}) + \lambda_{S}G''(z_{SC})\frac{dz_{SC}}{dt} = 0. \end{equation*}

    Using the fact that \lambda_{S}' = -C_{S}, we have

    \begin{equation} \frac{dz_{SC}}{dt} = \frac{-\lambda_{S}'G'(z_{SC})}{\lambda_{S}G''(z_{SC})} = \frac{C_{S}\left[G'(z_{SC})\right]^{2}}{G''(z_{SC})} = -\frac{C_{S}\left(1+2z_{SC}\right)^{2}}{6z_{SC}\left(1+z_{SC}\right)\left(1+z_{SC}+z_{SC}^{2}\right)} < 0. \end{equation} (5.39)

    Therefore, z_{SC} is strictly decreasing throughout the penultimate interval.
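
    For concreteness, the explicit form in (5.39) is what one obtains from the SU rate function G(z) = \frac{z(1+z)}{1+z+z^{2}} (stated here as an assumption, reconstructed to be consistent with (5.39) rather than restated from Section 3), for which

    \begin{equation*} G'(z) = \frac{1+2z}{\left(1+z+z^{2}\right)^{2}},\qquad G''(z) = -\frac{6z(1+z)}{\left(1+z+z^{2}\right)^{3}}, \end{equation*}

    so that C_{S}\left[G'(z_{SC})\right]^{2}/G''(z_{SC}) reduces to the final expression in (5.39).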

    During the final interval of fruit-only growth, we have

    \begin{equation} u_{FC} = 1,\quad u_{SC} = 0,\quad u_{RC} = 0,\quad u_{SN} = 1,\quad u_{RN} = 0 \end{equation} (5.40)

    and only F and \lambda_{S} are changing, with differential equations in time given by

    \begin{align} F' & = C^{*} \end{align} (5.41)
    \begin{align} \lambda_{S}' & = -C_{S}^{*}. \end{align} (5.42)

    Because \lambda_{S}(T) = 0, we also have

    \begin{equation} \lambda_{S}(t) = C_{S}^{*}\cdot(T-t). \end{equation} (4.6 revisited)

    Lastly, recall that \lambda_{R}(T) = 0 and \lambda_{R}' = 0 during this phase, and so \lambda_{R} = 0 here as well.

    Now that we have established the main dynamics in each phase, we will turn our attention to the transitions between phases. Again, we will proceed in chronological order, beginning with the transition from the initial phase to balanced growth. We can use the fact that the states and adjoints are continuous at this boundary to characterize this transition. In particular, we will show that z_{SC} is continuous at this transition when the first stage consists of shoot-only growth and z_{RC} is continuous at this transition when the first stage consists of root-only growth. In the first case, this will mean that the ratio of the shoot carbon flux to shoot nitrogen flux is continuous across this boundary, and in the second case, the ratio of root carbon flux to root nitrogen flux is continuous.

    Due to the symmetry between equations (5.26) and (5.8) and equations (5.28) and (5.16), we will streamline the argument by considering a generalized version of these equations:

    \begin{equation} \lambda = \frac{C^{*}}{\nu NG_{2}(z) + CG'(z)}, \end{equation} (5.43)

    where (\lambda, \nu, z) is either (\lambda_{S}, \nu_{S}, z_{SC}) or (\lambda_{R}, \nu_{R}, z_{RC}) depending on whether the initial stage is shoot-only growth or root-only growth, respectively. We call this transition point t = \bar{t}, let x = \lim_{t\to \bar{t}^{+}}z, and note that \lim_{t\to \bar{t}^{-}}z = \frac{C}{\nu N}. Now, taking limits of (5.43) from both sides:

    \begin{align} \lim\limits_{t\to \bar{t}^{+}}\lambda(t) = \bar{\lambda}^{+} & = \frac{C^{*}}{\nu NG_{2}(x) + CG'(x)} \end{align} (5.44)
    \begin{align} \lim\limits_{t\to \bar{t}^{-}}\lambda(t) = \bar{\lambda}^{-} & = \frac{C^{*}}{\nu N G_{2}\left(\frac{C}{\nu N}\right) + CG'\left(\frac{C}{\nu N}\right)}. \end{align} (5.45)

    Note that while we do not know the values of C and N at t = \bar{t}, in what follows it will only matter that they are the same in both (5.44) and (5.45). At t = \bar{t}, we have \bar{\lambda}^{-} = \bar{\lambda}^{+} and so

    \begin{equation} G_{2}\left(\frac{C}{\nu N}\right) + \frac{C}{\nu N}G'\left(\frac{C}{\nu N}\right) = G_{2}(x) + \frac{C}{\nu N}G'(x). \end{equation} (5.46)

    Let

    \begin{equation} f(x) := G_{2}(x) + \frac{C}{\nu N}G'(x), \end{equation} (5.47)

    and note that it is clear from (5.46) that x = \frac{C}{\nu N} is a solution to

    \begin{equation} f(x) = G_{2}\left(\frac{C}{\nu N}\right) + \frac{C}{\nu N}G'\left(\frac{C}{\nu N}\right). \end{equation} (5.48)

    Here, we will show that this solution is unique. Consider f'(x), recalling (3.21):

    \begin{equation*} f'(x) = G_{2}'(x) + \frac{C}{\nu N}G''(x) = G''(x)\left(\frac{C}{\nu N} - x\right). \end{equation*}

    Note that because G''(x) \leq 0 for x \geq 0, we have that

    \begin{equation} \text{sgn}\left(f'(x)\right) = \text{sgn}\left(x - \frac{C}{\nu N}\right) \quad\text{for } x\geq 0. \end{equation} (5.49)

    Now, since f is continuous and, by (5.49), decreasing on \left(0, \frac{C}{\nu N}\right) and increasing on \left(\frac{C}{\nu N}, \infty\right), it must be the case that x = \frac{C}{\nu N} is the only solution to (5.48). Therefore, we have that z is continuous at \bar{t}, which verifies our claim that z_{SC} is continuous at the transition from shoot-only growth to balanced growth and z_{RC} is continuous at the transition from root-only growth to balanced growth.

    As in Section 5.6, we can use the fact that the states and adjoints are continuous to show that z_{SC} is continuous at the boundary between the balanced growth and penultimate intervals. Let t = \hat{t} denote this transition point, and let z_{SC}^{-} and z_{SC}^{+} be the left and right limits of z_{SC} at t = \hat{t}, respectively. We have, then, the following theorem.

    Theorem 2. z_{SC} is continuous at t = \hat{t}.

    We include the details of the proof in Appendix C and only give a sketch of the proof here. To that end, note that by combining (5.26), (5.37), and (4.16), and writing \widehat{C} = C(\hat{t}), we have that (x, y) = (z_{SC}^{-}, z_{SC}^{+}) must solve

    \begin{align} C^{*}\nu_{S}N^{*}G_{2}(y) & = \bigl(C^{*} - \widehat{C}\bigr)\bigl(\nu_{S}N^{*}G_{2}(x) + \widehat{C}G'(x)\bigr) \end{align} (5.50)
    \begin{align} C^{*}G'(y) & = \nu_{S}N^{*}G_{2}(x) + \widehat{C}G'(x) . \end{align} (5.51)

    Because G_{2} and G' are both one-to-one, we have that y is a function of x in (5.50) and (5.51).

    Figure 3 illustrates the three types of solutions to (5.50) and (5.51), depending on the values of the parameters. Note that the intersection points (x, y) of the two curves correspond to possible pairs \left(z_{SC}^{-}, z_{SC}^{+}\right). In particular, there are two common features of each plot:

    Figure 3.  Three cases for graphs of (5.50) and (5.51). Case 1 (left): p < \frac{\widehat{C}}{\nu_{S}N^{*}} < z_{SC}^{+}. Case 2 (center): z_{SC}^{+} = \frac{\widehat{C}}{\nu_{S}N^{*}}. Case 3 (right): z_{SC}^{+} < \frac{\widehat{C}}{\nu_{S}N^{*}} < p.

    1. In each plot there is one y value that solves (5.50) and (5.51); this determines z_{SC}^{+}.

    2. There are at most two intersection points; one is always at (z_{SC}^{+}, z_{SC}^{+}).

    For each case in Figure 3, the only admissible solution is z_{SC}^{-} = z_{SC}^{+}. In Case 2 (Figure 3, center), this is the only possibility. In Case 1 (Figure 3, left), we observe that

    \begin{equation*} \frac{\widehat{C}}{\nu_{S}N^{*}} < z_{SC}^{+}. \end{equation*}

    However, as we know that

    \begin{equation*} z_{SC}^{+} = \lim\limits_{t\to\hat{t}^{+}}\frac{u_{SC}C}{\nu_{S}N^{*}} \leq \frac{\widehat{C}}{\nu_{S}N^{*}}, \end{equation*}

    this would mean that

    \begin{equation*} \frac{\widehat{C}}{\nu_{S}N^{*}} < \frac{\widehat{C}}{\nu_{S}N^{*}} \end{equation*}

    which is impossible. Therefore, Case 1 is not an admissible configuration of solutions to (5.50) and (5.51). In Case 3 (Figure 3, right), there is a possible intersection point at x = p > z_{SC}^{+}. However, if z_{SC}^{-} = p, then as G' is strictly decreasing, we have that G'(z_{SC}^{-}) < G'(z_{SC}^{+}), and

    \begin{equation*} \lambda_{S}(\hat{t})G'(z_{SC}^{-}) < \lambda_{S}(\hat{t})G'(z_{SC}^{+} ) = 1, \end{equation*}

    which is impossible by (5.18). Therefore, the only possible solution to (5.50) and (5.51) is z_{SC}^{-} = z_{SC}^{+}, which means that z_{SC} is continuous at t = \hat{t}. It remains to show that these three cases actually characterize the solution, the details of which are included in Appendix C.

    Note that in the final interval, neither z_{SC} nor z_{RC} is defined, so we will not be addressing continuity of either one here. We will, however, analyze the controls at this transition and conclude that at this particular junction, we actually have continuity of the controls.

    First, recall that in Section 4.2, we showed that \lim_{t\to t^{*-}}u_{SC}(t) = 0. Because during the penultimate interval u_{FC} = 1 - u_{SC}, we have that \lim_{t\to t^{*-}} u_{FC}(t) = 1 as well. Note also that during the penultimate interval, we have from (5.29) that u_{RC} = 0, u_{SN} = 1, and u_{RN} = 0, and by (5.40) we have that all the controls maintain their values in the final interval as well. Therefore, all the controls are continuous at t = t^{*}.

    Next, we will prove the following lemma, which shows that the transition to fruit-only growth occurs rapidly at the end of the penultimate interval.

    Lemma 5.

    \begin{equation*} \lim\limits_{t\to t^{*^{-}}} \frac{d(u_{SC}C)}{dt} = \lim\limits_{t\to t^{*^{-}}} \frac{dz_{SC}}{dt}(t) = -\infty. \end{equation*}

    Proof. The first equality follows from the penultimate interval definition of z_{SC} (5.30), since u_{SC}C = \nu_{S}N^{*}z_{SC} and \nu_{S}N^{*} is a positive constant during this interval. For the second equality, recall that by (4.17), z_{SC}\to 0^{+} as t \to t^{*^{-}}. Lastly, by (5.39), we have

    \begin{equation*} \frac{dz_{SC}}{dt} \sim -\frac{C_{S}^{*}}{6z_{SC}}\quad\text{as } z_{SC}\to 0^{+}, \end{equation*}

    which because C_{S}^{*} > 0 means that

    \begin{equation*} \lim\limits_{t\to t^{*^{-}}}\frac{dz_{SC}}{dt} = \lim\limits_{z_{SC}\to 0^{+}}-\frac{C_{S}^{*}}{6z_{SC}} = -\infty. \end{equation*}

    In this section, we will outline the numerical scheme we developed for solving the optimal control problem (3.23). Because the solution exhibits a multi-phase structure, standard numerical methods such as forward-backward sweep or shooting [13] are not well-suited. Our approach is to construct a numerical scheme for solving the problem backward in time. There are two primary components to the numerical scheme. First, we obtain the map

    \begin{equation} (S^{*},R^{*})\mapsto (S,R,F,\mathbf{u},\lambda_{S},\lambda_{R}). \end{equation} (6.1)

    Second, we use MATLAB's built-in nonlinear equation solver \mathsf{fsolve} to find the map

    \begin{equation*} (S_{0},R_{0}) \mapsto (S^{*},R^{*}). \end{equation*}

    Ultimately, we obtain the map

    \begin{equation*} (S_{0},R_{0}) \mapsto (S,R,F,\mathbf{u},\lambda_{S},\lambda_{R}). \end{equation*}

    We will now direct our attention to the numerical scheme for finding (6.1), proceeding phase-by-phase in reverse order. During each phase, we use the fourth-order Runge-Kutta method (RK4) to solve differential equations that govern the phase dynamics and use algebraic equations to update the controls, thus allowing us to avoid having to update the controls iteratively. We will use the fact that the Hamiltonian is constant along the optimal trajectory, as well as the information we have about transitions from Section 5, to find the boundaries between phases and ultimately stitch the four phases together to form one complete solution. We will discuss how the numerical scheme works without getting into the fine numerical details. The MATLAB code is publicly available on David McMorris' GitHub repository, https://github.com/DavidMcMorris/plant-code/blob/main/carbon-only-model.m.
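
    As a point of reference, the basic building block can be sketched in a few lines of MATLAB (a generic RK4 stepper, not the published implementation; the vector field f differs phase by phase):

        function y = rk4_step(f, t, y, h)
        % One classical fourth-order Runge-Kutta step: f(t, y) is the vector field
        % and h the step size; returns the state advanced from t to t + h.
        k1 = f(t,       y);
        k2 = f(t + h/2, y + (h/2)*k1);
        k3 = f(t + h/2, y + (h/2)*k2);
        k4 = f(t + h,   y + h*k3);
        y  = y + (h/6)*(k1 + 2*k2 + 2*k3 + k4);
        end

    Solving backward in time then simply corresponds to taking a negative step h.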

    We begin by using equation (4.12) to find the end of the penultimate interval (t^{*}). Now, as the dynamics during the penultimate interval depend on z_{SC}, we would ideally use (5.39) and (5.31) to solve for z_{SC} and S simultaneously, and use (5.30) to update the controls. However, recall that by Lemma 5, z_{SC} approaches a vertical tangent as t\to t^{*^{-}}, which makes it difficult to solve for z_{SC} numerically. Note, however, that by the same reasoning, we have that \frac{dt}{dz_{SC}} approaches 0 as z_{SC} \to 0^{+}. So, to remedy this numerical difficulty, we take advantage of this fact, along with the fact that, by Lemma 4, z_{SC} is strictly decreasing throughout the penultimate interval, to think of z_{SC} as our ''time'' and derive differential equations for t, S, and \lambda_{R} in terms of z_{SC}. This, combined with the fact that during the penultimate interval R is constant, \lambda_{S} is algebraically related to z_{SC} via (4.16), and F depends on the value of \hat{t}, gives us everything we need to numerically find the solution during this phase. Note that because z_{SC} is decreasing, solving forward in z_{SC} is equivalent to solving backward in time.

    To this end, note that by (5.39), we have that

    \begin{equation} \frac{dt}{dz_{SC}} = \frac{G''(z_{SC})}{C_{S}\left[G'(z_{SC})\right]^{2}},\quad t(0) = t^{*}. \end{equation} (6.2)

    Furthermore, using (5.31), (5.34), and (6.2), we obtain

    \begin{align} \frac{d\lambda_{R}}{dz_{SC}} & = \frac{d\lambda_{R}}{dt}\frac{dt}{dz_{SC}} = -\frac{N_{R}^{*}\nu_{S}G_{2}(z_{SC})G''(z_{SC})}{C_{S}\left[G'(z_{SC})\right]^{3}},\quad \lambda_{R}(0) = 0 \end{align} (6.3)
    \begin{align} \frac{dS}{dz_{SC}} & = \frac{dS}{dt}\frac{dt}{dz_{SC}} = \frac{\nu_{S}N^{*}G(z_{SC})G''(z_{SC})}{C_{S}\left[G'(z_{SC})\right]^{2}}, \quad S(0) = S^{*}. \end{align} (6.4)

    We use RK4 to solve (6.2), (6.3), and (6.4) forward in z_{SC}, and use (5.30) to update u_{SC}. Also, recall that \lambda_{S} = 1/G'(z_{SC}), u_{FC} = 1 - u_{SC}, and the other three controls are constant by (5.29). Upon reordering by t, this gives us all of the states, adjoints, and controls during the penultimate interval, with the exception of fruits, which we will determine after identifying the correct transition point from balanced growth to the penultimate interval (\hat{t}).

    We stop solving forward in z_{SC} when u_{SC}, computed from (5.30), exceeds 1 and thus leaves the admissible range, as this gives us an upper bound on the value of z_{SC} (lower bound for the value of t) at which the transition from balanced growth to the penultimate interval can occur. Furthermore, because during the balanced growth phase we have by (5.18) that \lambda_{R}G'(z_{RC}) > 1, and G'(z) \le 1, it must be the case that \lambda_{R}(\hat{t}) \geq 1. Finding where \lambda_{R} = 1 gives us a lower bound on the value of z_{SC} (upper bound for the value of t) at which the transition can occur. This gives us an interval [z_{SC}^{\text{min}}, z_{SC}^{\text{max}}] in which to search for \hat{t}.
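
    A minimal sketch of this portion of the scheme is given below. It is not the published implementation: it assumes the SU kinetics G(z) = z(1+z)/(1+z+z^{2}) (consistent with (5.39)), the Section 7 choices C(S) = S and N(R) = R (so C_{S} = N_{R}^{*} = 1 and t^{*} = 9), \nu_{S} = 1/3, and the terminal sizes from the first example in Section 7.

        nuS = 1/3;  Sstar = 223.20;  Nstar = 112.86;       % N^* = R^* since N(R) = R
        G   = @(z) z.*(1+z)./(1+z+z.^2);
        Gp  = @(z) (1+2*z)./(1+z+z.^2).^2;                 % G'
        Gpp = @(z) -6*z.*(1+z)./(1+z+z.^2).^3;             % G''
        G2  = @(z) G(z) - z.*Gp(z);                        % via (3.20)

        % State y = [t; S; lambda_R], independent variable z = z_SC; see (6.2)-(6.4).
        rhs = @(z,y) [  Gpp(z)./Gp(z).^2;                  % dt/dz_SC,        (6.2)
                        nuS*Nstar*G(z).*Gpp(z)./Gp(z).^2;  % dS/dz_SC,        (6.4)
                       -nuS*G2(z).*Gpp(z)./Gp(z).^3 ];     % dlambda_R/dz_SC, (6.3)

        dz = 1e-4;  z = 0;  y = [9; Sstar; 0];             % t(0) = t^* = 9
        zs = z;  tt = y(1);  S = y(2);  lamR = y(3);  uSC = 0;
        while uSC(end) <= 1                                % stop once u_SC leaves [0,1]
            k1 = rhs(z,        y);
            k2 = rhs(z + dz/2, y + (dz/2)*k1);
            k3 = rhs(z + dz/2, y + (dz/2)*k2);
            k4 = rhs(z + dz,   y + dz*k3);
            y  = y + (dz/6)*(k1 + 2*k2 + 2*k3 + k4);
            z  = z + dz;
            zs(end+1)  = z;     tt(end+1)   = y(1);
            S(end+1)   = y(2);  lamR(end+1) = y(3);
            uSC(end+1) = nuS*Nstar*z/y(2);                 % control update via (5.30)
        end
        lamS = 1./Gp(zs);                                  % from (4.16)
        zmax = zs(end);                                    % u_SC reaches 1 here
        zmin = zs(find(lamR >= 1, 1));                     % lambda_R reaches 1 here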

    This portion of the numerical scheme begins with the interval [z_{SC}^{\text{min}}, z_{SC}^{\text{max}}] identified in Section 6.1. Since we know that the Hamiltonian must be constant along the optimal trajectory, we use the balanced-growth specific formulation of H in terms of \lambda_{R} and z_{RC}, given by (5.27):

    \begin{equation} H = C^{*} = \lambda_{R}\left[\nu_{R}NG_{2}(z_{RC}) + CG'(z_{RC})\right]. \end{equation} (5.27 revisited)

    Starting with [z_{SC}^{\text{min}}, z_{SC}^{\text{max}}], we use a binary search to find the smallest interval, relative to the current RK4 step size, that contains the point t = \hat{t} at which (5.27) is satisfied. Next, we use a smaller step size and follow the procedure outlined in Section 6.1 to increase the resolution of the solution, repeating the binary search until we find \hat{t} to within some specified tolerance. This also gives us the values of the states, adjoints, and controls at t = \hat{t}.
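
    A simplified stand-in for this refinement step (not the mesh-refinement loop described above) is a plain bisection, where phi would be the residual \lambda_{R}\left[\nu_{R}NG_{2}(z_{RC}) + CG'(z_{RC})\right] - C^{*} evaluated along the stored penultimate-interval solution:

        function tHat = bisect(phi, a, b, tol)
        % Bisection sketch: phi is assumed to change sign on [a, b]; returns a
        % root of phi to within tol.
        fa = phi(a);
        while (b - a) > tol
            m = (a + b)/2;
            if sign(phi(m)) == sign(fa)
                a = m;  fa = phi(m);
            else
                b = m;
            end
        end
        tHat = (a + b)/2;
        end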

    One crucial feature of the process that we have not mentioned so far is that, because z_{RC} is not defined in the penultimate interval, we need to compute its limit from balanced growth at every step in the binary search. This is made possible by the fact that z_{SC} is continuous at \hat{t}, as established in Section 5.7. Therefore, we can use (5.18) and (5.19) to obtain

    \begin{equation} \frac{G'(z_{RC})}{G_{2}(z_{RC})} = \frac{\nu_{R}}{\nu_{S}}\frac{G'(z_{SC})}{G_{2}(z_{SC})}. \end{equation} (6.5)

    Using MATLAB's built-in nonlinear equation solver \mathsf{fsolve}, we take the value of z_{SC} at a particular point and calculate what z_{RC} would be if balanced growth ended at that point. This value of z_{RC} is then used to evaluate the right-hand side of (5.27) in the binary search.
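
    A minimal sketch of this step, again under the SU kinetics assumed above and the parameter choices \nu_{S} = 1/3 and \nu_{R} = 1 (the value of z_{SC} shown is hypothetical):

        nuS = 1/3;  nuR = 1;
        G   = @(z) z.*(1+z)./(1+z+z.^2);
        Gp  = @(z) (1+2*z)./(1+z+z.^2).^2;
        G2  = @(z) G(z) - z.*Gp(z);
        zSC    = 0.4;                                      % hypothetical candidate value
        target = (nuR/nuS)*Gp(zSC)/G2(zSC);                % right-hand side of (6.5)
        zRC    = fsolve(@(z) Gp(z)/G2(z) - target, zSC);   % solve G'(z_RC)/G_2(z_RC) = target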

    In Section 6.1, we were unable to solve for F because we did not know the value of \hat{t}. This was resolved in Section 6.2, so at this point, we use RK4 to solve (5.32) forward in time during the penultimate interval. This completes the numerical solution for the penultimate interval.

    Before continuing backward in time, we take advantage of the fact that, by the process described in Section 6.3, we now know the value of F at the beginning of the final interval. Here, the controls are constant by (5.40), S, R, and \lambda_{R} are constant, and \lambda_{S} is given by (4.6). Lastly, using the value of F(t^{*}) obtained as described in Section 6.3, and the fact that F' is constant during the final interval by (5.41), we have that

    \begin{equation} F(t) = F(t^{*}) + C^{*}\cdot (t - t^{*}). \end{equation} (6.6)

    Upon locating \hat{t} as discussed in Section 6.2, we have the value of the states, adjoints, z_{SC}, and z_{RC} at the end of the balanced growth stage. Recall that the controls need not be continuous, so we do not immediately know their values at the end of balanced growth. We can, however, start with the equations for z_{SC} and z_{RC} during balanced growth

    \begin{align} z_{SC} & = \frac{u_{SC}C}{\nu_{S}u_{SN}N} \end{align} (6.7)
    \begin{align} z_{RC} & = \frac{(1-u_{SC})C}{\nu_{R}(1-u_{SN})N} \end{align} (6.8)

    and solve for u_{SN} and u_{SC}:

    \begin{align} u_{SN} & = \frac{\frac{C}{N} - \nu_{R}z_{RC}}{\nu_{S}z_{SC} - \nu_{R}z_{RC}} \end{align} (6.9)
    \begin{align} u_{SC} & = \frac{\nu_{S}z_{SC}Nu_{SN}}{C}. \end{align} (6.10)
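
    For completeness, (6.9) follows by substituting u_{SC}C = \nu_{S}z_{SC}u_{SN}N (from (6.7)) into (6.8):

    \begin{equation*} \nu_{R}z_{RC}(1-u_{SN})N = (1-u_{SC})C = C - \nu_{S}z_{SC}u_{SN}N \implies u_{SN}\left(\nu_{S}z_{SC} - \nu_{R}z_{RC}\right)N = C - \nu_{R}z_{RC}N, \end{equation*}

    and dividing through by N\left(\nu_{S}z_{SC} - \nu_{R}z_{RC}\right) gives (6.9); (6.10) is then simply (6.7) solved for u_{SC}.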

    We derive differential equations for z_{SC} and z_{RC} during this phase, and ultimately use (6.9) and (6.10) to obtain the controls. The differential equations in time for z_{SC} and z_{RC} are given below, and the derivation is included in Appendix D.

    \begin{align} \frac{dz_{SC}}{dt} & = \frac{N_{R}\nu_{S}\nu_{R}G(z_{RC})G_{2}(z_{SC}) - C_{S}G'(z_{SC})\left[\nu_{S}G_{2}(z_{SC}) + z_{RC}\nu_{R}G'(z_{SC})\right]}{G''(z_{SC})\left[\nu_{S}z_{SC} - \nu_{R}z_{RC}\right]} \end{align} (6.11)
    \begin{align} \frac{dz_{RC}}{dt} & = \frac{N_{R}\nu_{S}G_{2}(z_{SC})\left[G'(z_{RC})\right]^{2} - C_{S}\left[G'(z_{SC})\right]^{2}G'(z_{RC}) + G''(z_{SC})G'(z_{RC})\frac{dz_{SC}}{dt}}{G'(z_{SC})G''(z_{RC})} \end{align} (6.12)

    Beginning at \hat{t}, we use RK4 to simultaneously solve (5.20), (5.21), (5.22), (5.23), (6.11), and (6.12) backward in time to obtain S, R, \lambda_{S}, \lambda_{R}, z_{SC}, and z_{RC}, respectively. We use (6.10) and (6.9) to eliminate u_{SC} and u_{SN} in (5.20) and (5.21) so that these six differential equations are expressed exclusively in terms of these six variables. We then use (6.10) and (6.9) to obtain u_{SC} and u_{SN}, as well as the rest of the controls via (5.17). We continue until one of the controls exits [0, 1], which gives us the earliest time for the beginning of the balanced growth phase.
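
    A sketch of this right-hand side, written as a MATLAB function of the six variables (again under the assumed SU kinetics and the Section 7 choices C(S) = S, N(R) = R, \nu_{S} = 1/3, \nu_{R} = 1; not the published code):

        function dy = bgRHS(~, y)
        % Balanced-growth system (5.20)-(5.23), (6.11)-(6.12), expressed purely
        % in terms of y = [S; R; lambda_S; lambda_R; z_SC; z_RC].
        nuS = 1/3;  nuR = 1;
        G   = @(z) z.*(1+z)./(1+z+z.^2);
        Gp  = @(z) (1+2*z)./(1+z+z.^2).^2;
        Gpp = @(z) -6*z.*(1+z)./(1+z+z.^2).^3;
        G2  = @(z) G(z) - z.*Gp(z);                        % via (3.20)

        S = y(1); R = y(2); lamS = y(3); lamR = y(4); zS = y(5); zR = y(6);
        C = S;  N = R;                                     % C(S) = S, N(R) = R
        uSN = (C/N - nuR*zR)/(nuS*zS - nuR*zR);            % (6.9)
        dzS = (nuS*nuR*G(zR)*G2(zS) - Gp(zS)*(nuS*G2(zS) + zR*nuR*Gp(zS))) ...
              / (Gpp(zS)*(nuS*zS - nuR*zR));               % (6.11), N_R = C_S = 1
        dzR = (nuS*G2(zS)*Gp(zR)^2 - Gp(zS)^2*Gp(zR) + Gpp(zS)*Gp(zR)*dzS) ...
              / (Gp(zS)*Gpp(zR));                          % (6.12)
        dy = [ nuS*uSN*N*G(zS);                            % (5.20)
               nuR*(1-uSN)*N*G(zR);                        % (5.21)
              -lamS*Gp(zS);                                % (5.22), C_S = 1
              -lamR*nuR*G2(zR);                            % (5.23), N_R = 1
               dzS; dzR ];
        end

    This vector field can then be stepped backward in time from \hat{t} (a negative step in the RK4 sketch above), with u_{SN}, u_{SC}, and the remaining controls recovered from (6.9), (6.10), and (5.17) at each step.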

    The next piece of the numerical scheme finds the start of balanced growth, t = \bar{t}. Recall that in Section 5.6, we showed that either z_{SC} or z_{RC} is continuous at \bar{t}, depending on which is defined in the initial phase. In either case, the transition must occur at a point where

    \begin{equation} u_{SC} - u_{SN} = 0. \end{equation} (6.13)

    This is because in the case where the initial phase consists of shoot growth, we have

    \begin{equation*} \lim\limits_{t\to \bar{t}^{+}}z_{SC} = \lim\limits_{t\to \bar{t}^{+}}\frac{u_{SC}C}{\nu_{S}u_{SN}N} = \lim\limits_{t\to \bar{t}^{-}}\frac{C}{\nu_{S}N}\implies \lim\limits_{t\to \bar{t}^{+}}\frac{u_{SC}}{u_{SN}} = 1, \end{equation*}

    and in the case where the initial phase consists of root growth, we have

    \begin{equation*} \lim\limits_{t\to \bar{t}^{+}}z_{RC} = \lim\limits_{t\to \bar{t}^{+}}\frac{(1-u_{SC})C}{\nu_{R}(1-u_{SN})N} = \lim\limits_{t\to \bar{t}^{-}}\frac{C}{\nu_{R}N}\implies \lim\limits_{t\to \bar{t}^{+}}\frac{(1-u_{SC})}{(1-u_{SN})} = 1. \end{equation*}

    We use essentially the same procedure as in Section 6.2 for refining the transition point between the balanced growth and the penultimate intervals, with the exception that finding \bar{t} does not require computing the limits of any quantities from the earlier phase, as we had to do with z_{RC} at \hat{t}. We use a binary search to locate the smallest interval containing the point at which (6.13) is satisfied, and then use RK4 to solve equations (5.20), (5.21), (5.22), (5.23), (6.11), and (6.12) backward in time on a smaller integration mesh. We repeat this process until we have found a point at which (6.13) is met to within some specified tolerance. We call this point \bar{t}.

    Note that if no such point exists, then the plant begins in balanced growth, in which case there are only three phases. If \bar{t} exists, the next step in the numerical scheme is to determine the makeup of the initial phase. Using (5.7) and (5.15), we compute the following at \bar{t}:

    \begin{equation*} \left\vert\lambda_{S}\nu_{S}NG\left(\frac{C}{\nu_{S}N}\right) - C^{*}\right\vert \quad \text{and} \quad \left\vert\lambda_{R}\nu_{R}NG\left(\frac{C}{\nu_{R}N}\right) - C^{*}\right\vert. \end{equation*}

    If the first is smaller, then the initial phase consists of shoot growth; if the second is smaller, the initial phase consists of root growth.

    If the first phase consists of shoot-only growth, we use RK4 to solve (5.2), (5.3), and (5.4) simultaneously backward in time until we reach t = 0. This gives us S, \lambda_{S}, and \lambda_{R}, respectively. The controls are constant here and given by (5.1), and R = R(\bar{t}).

    If the first phase consists of root-only growth, we use RK4 to solve (5.10), (5.11), and (5.12) simultaneously backward in time until we reach t = 0. This gives us R, \lambda_{S}, and \lambda_{R}, respectively. The controls are constant here and given by (5.9), and S = S(\bar{t}).

    In this section, we will present some of the primary results from the numerical simulations of the model. In particular, we will look at several ''representative'' simulation results that showcase the different strategies a plant can employ to maximize fruit production, as well as help us understand the relationship between initial and terminal conditions. In each simulation, we made the simplifying assumptions that C(S) = S and N(R) = R, which ignore any possibility of self-shading as the plant grows. We chose \nu_{R} = 1 for convenience, and chose \nu_{S} = 1/3, which had the effect of exaggerating the lengths of the various phases, resulting in easier-to-interpret plots. Note that this results in \frac{\nu_{R}}{\nu_{S}} = 3, which is likely a bit higher than data suggests [17,18]. Lastly, we chose T = 10 because in the case that C(S) = S, we have by (4.12) that t^{*} = 9, which again facilitated our interpretation of the numerical results. We will begin with four examples of different strategies that all reach the same optimal value of fruits at time T. The terminal conditions in each example were chosen somewhat arbitrarily, with the goal of producing solutions that each illustrate a particular pattern of growth.

    Here, we chose terminal conditions S^{*} = 223.20 and R^{*} = 112.86. This results in F(T) = 900, S_{0} = 17, and R_{0} = 84. In this case, we see the full four-stage structure of the solution. There is an initial phase of shoot growth, followed by a period of balanced growth between shoots and roots, a penultimate interval of shoot and fruit growth, and finally a period of fruit-only growth at the end of the growing season. The states and controls are shown in Figure 4.

    Figure 4.  Initial shoot growth. From top to bottom: Shoots and roots, fruits, carbon controls, nitrogen controls. Shoots, roots, and fruits in units of carbon; controls are dimensionless.

    The lower two plots in Figure 4 show the carbon and nitrogen allocation strategies for this plant. Following the initial period of shoot growth, there is a short period during which u_{RC} and u_{RN} are increasing, signifying a period of increasing root production. Shortly after t = 2, we see u_{RC} and u_{RN} begin to decrease, signifying that although it is still advantageous to invest resources in roots, the plant ultimately needs to prioritize shoot production again to prepare for mixed shoot/fruit growth. During the penultimate interval, we see the plant gradually stop investing in shoots, before switching to fruit-only growth at time t = 9.

    Here, we chose terminal conditions S^{*} = 222.11 and R^{*} = 111.55. This results in F(T) = 900, S_{0} = 48.9, and R_{0} = 28.9. In this simulation, we again see all four stages; however, here we see an initial phase of root growth instead of the initial phase of shoot growth we saw previously. The initial conditions of the two simulations are quite different, but the terminal conditions are nearly the same. This, as we will discuss later, suggests that optimal growth tends to smooth out initial transients. The states and controls are shown in Figure 5. The two graphs at the bottom of Figure 5 show a different allocation strategy than what appeared in the case of initial shoot growth in Figure 4. In particular, following the initial period of root growth, we see a decline in both carbon and nitrogen allocation to the roots throughout the entire balanced growth phase, and a steady increase in allocation to shoots.

    Figure 5.  Initial root growth. From top to bottom: Shoots and roots, fruits, carbon controls, nitrogen controls. Shoots, roots, and fruits in units of carbon; controls are dimensionless.

    Here, we chose terminal conditions S^{*} = 222.71 and R^{*} = 112.27. This results in F(T) = 900, S_{0} = 30.3, and R_{0} = 59.74. The states and controls are shown in Figure 6. In this simulation, we see a plant that starts in balance and therefore forgoes the initial phase of shoot-only or root-only growth. That said, comparing Figures 6 and 4, we see a similar allocation strategy during balanced growth. In particular, both show that allocation to shoots initially decreases before increasing again. This behavior corresponds to initial conditions that are more biased toward shoot deficiency than root deficiency. For this reason, we refer to this type of initially balanced growth as Type S.

    Figure 6.  Type S initially balanced growth. From top to bottom: Shoots and roots, fruits, carbon controls, nitrogen controls. Shoots, roots, and fruits in units of carbon; controls are dimensionless.

    Here, we chose terminal conditions S^{*} = 222.60 and R^{*} = 112.14. This results in F(T) = 900, S_{0} = 36.16, and R_{0} = 49.58. The states and controls are shown in Figure 7. Here, we see another example of a plant that begins in balanced growth. Unlike Figure 6, the balanced growth phase seen here has a similar structure to that in Figure 5, where the plant begins with root-only growth. In both examples, the phase consists of steadily increasing allocation to shoots while simultaneously decreasing allocation to roots. In this case, the plant is initially more biased toward root deficiency than shoot deficiency and so the early growth sees a greater investment in roots. We refer to this type of initially balanced growth as Type R.

    Figure 7.  Type R initially balanced growth. From top to bottom: Shoots and roots, fruits, carbon controls, nitrogen controls. Shoots, roots, and fruits in units of carbon; controls are dimensionless.

    In order to better understand the relationship between initial conditions and optimal fruit yield, we looked for points in the (S_{0}, R_{0})-plane for which F(T) is 700, 800, or 900. For a given value of S_{0}, we used MATLAB to find the corresponding value of R_{0} for which the numerical scheme outlined in Section 6 would yield either 700, 800, or 900 for F(T). We plotted the resulting contours in the (S_{0}, R_{0})-plane as seen in Figure 8. We also note that the four representative simulations we have examined occur on the F(T) = 900 contour.

    Figure 8.  Optimal fruit yield contours for F(T) = 700, 800, 900.

    There are several key features of this plot to notice. First is the wide range of initial conditions for which a particular fruit yield is optimal. Depending on the allocation strategy, plants with very different initial conditions may share an optimal fruit yield. Additionally, for any given point in Figure 8, we can see how much more initial shoots or roots would be necessary to move the plant to a higher-yield contour. Looking at the ends of the contours, it takes relatively little additional initial structure to move from one contour to the next, whereas in the middle of the contours we see that a larger increase in initial biomass is required to move to the next contour. Looking at Figures 6 and 7, we see that these middle regions contain initial conditions from which the plant begins in balance. Nearer to the ends of the contours, where the concavity becomes more pronounced, we see that the plant must be more biased toward the vegetative organ with the most biomass. It is also worth noting that due to the level of generality with which we built the model, it is difficult to say where the boundaries lie in the (S_{0}, R_{0}) plane for realistic initial conditions, and it may be biologically unlikely for a plant to begin in balance.

    By incorporating nitrogen into the model, but keeping the fruits solely reliant on carbon, we have considered a case in which \nu_{F} \to \infty. While not realistic for most annual plants, this provides a mathematically approachable framework in which to begin to analyze optimal allocation of two resources. It is worth noting, however, that if we exchange the assumption that fruits are C-only with the assumption that the fruits are N-only, we expect the model to predict a reversal in the roles of shoots and roots. The assumption that fruits are C-only, however, is a more natural extension of Iwasa and Roughgarden's work [1].

    An obvious outcome of our model, not present in [1], is the penultimate interval of mixed shoot/fruit growth. As \nu_{F} > \nu_{S}, \nu_{R}, we see a phase during which shoot production is more important overall to eventual fruit yield, but an increased capacity to assimilate nitrogen is not useful, so any excess carbon is invested in fruits. As we discussed in Section 4.3, this corresponds to when the marginal value of the size of shoots exceeds the marginal values of the sizes of the other organs, but because the plant allocates resources to synthesizing units in each organ rather than biomass directly, we see a situation where allocating carbon to both shoots and fruits is more beneficial than allocating only to shoots. We also note that this behavior is consistent with recent predictions in the literature [19].

    The theoretical results in Section 5 confer a degree of biological relevance that adds credence to the model despite the narrow scope imposed by the assumption that fruits are built solely from carbon. In particular, we showed that z_{SC}, z_{RC}, and u_{FC} are continuous between any two phases in which they are defined and nonzero. Note that while z_{SC} and z_{RC} are dimensionless, they are multiples of the ratio of C flux to N flux in shoots and roots. So, the fact that z_{SC} and z_{RC} are continuous here means that these ratios are continuous in the shoots and roots. This is striking, given that, while these individual fluxes are continuous at the beginning of balanced growth, they are markedly discontinuous at the end in Figures 4, 5, 6, and 7. So, as long as either shoots or roots are growing, the amount of allocated C per unit of allocated N varies continuously, which is reasonable to expect from biochemical processes. Furthermore, by (3.30), the fact that u_{FC} is continuous between the penultimate and final intervals means that the rate of fruit growth is continuous here.

    The results presented in Section 7 provide several avenues for drawing more general conclusions about the nature of plant growth that optimizes fruit yield. In Figures 4, 5, 6, and 7, we see that allocation is a balancing act between avoiding limitation and investing in the organ that most directly contributes to increased fruit yield. The plant invests in the most deficient organ first until it can efficiently invest in both shoots and roots, doing so in such a way that, by the end of balanced growth, we see increased C flux to shoots. During the penultimate interval, fruits are a better investment than roots, but fruits still benefit from increased C flux. Therefore, the plant continues to invest in shoots while gradually transitioning to fruit-only growth. These ideas are also reminiscent of our discussion of the marginal values and necessary conditions for optimality in Section 4.3, in that they suggest that two organs can grow simultaneously only if neither SU can outperform the other.

    Recall that Figures 4, 5, 6, and 7 represent snapshots of optimal growth along the F(T) = 900 contour represented in Figure 8. In particular, we carefully chose initial conditions and parameter values to present these snapshots with clearly-defined phases. However, it is worth noting that in general, the length of each phase can vary. It is also plausible that given initial conditions that are sufficiently biased toward either shoots or roots, along with a short enough growing season, the plant may jump from the initial stage directly to the final phase of reproductive growth, in which case we would have bang-bang controls. This, however, is not something we have analyzed from a mathematical standpoint, as in practice, a phase being omitted is biologically equivalent to it being infinitesimally short due to the timescales involved with resource allocation. It is also unclear how successful a plant could be in such an environment where the length of the growing season precludes the possibility of balanced growth. That is to say, while the general trends discussed above hold across the contour, there is a broad spectrum of optimal strategies. What is striking is the high level of variation in solutions with initial conditions along the contour, and the fact that these initial transients are absent by time T. We see this in the values of S(T), R(T), and F(T) in Figures 4, 5, 6, and 7, but the trend generally holds along the contour. In some sense, then, we can think of optimal growth under this model as being an equalizing agent that reduces initial variation in a population.

    A natural extension to this model is to drop the assumption that fruits are carbon-only and consider the more general case where fruits require both carbon and nitrogen. Here, we expect the optimal growth path to depend on the stoichiometric ratios of carbon to nitrogen, with a penultimate interval consisting of either shoot/fruit or root/fruit growth depending on the relative importance of each resource to the fruits.

    The inclusion of nitrogen also opens other avenues for inquiry that have been previously explored in carbon-only optimal allocation models. For example, we have not included the possibility of resource storage and remobilization in our model. Chiariello and Roughgarden studied a carbon-only allocation model for annual plants in [20] and found that storing carbon for future allocation to reproduction can be optimal in environments where the rate of seed loss exceeds certain thresholds related to the net growth rate of the structural tissue, or when the efficiency with which carbon can be stored/remobilized is sufficiently high. It seems likely that a similar conclusion would hold with the inclusion of nitrogen, although the conditions under which storage of carbon, nitrogen, or both is optimal would likely depend on the resource requirements for building fruits as well as how efficiently the plant can store/remobilize each resource. Investment in storage may also be optimal in an environment in which the plant may experience sudden tissue loss due to disturbances, in which case the storage compartment acts as a safety net to allow the plant to recover [21]. Here, again, we would generally expect the inclusion of a second resource in the model to allow for richer storage dynamics, where the size of the reserves may depend on the resource requirements for rebuilding lost tissue and the frequency/severity of disturbances.

    In this appendix, we will derive the necessary conditions presented in (3.25) in Section 3.4. Note that because the carbon controls do not directly depend on the nitrogen controls, we can obtain the necessary conditions by considering an n-state two-control problem and an n-state three-control problem separately. In our particular situation, n = 3, but we derive the conditions with higher generality because there is minimal complexity added to the derivation. In the following, we consider f, \mathbf{g} to be continuously differentiable in all arguments and \mathbf{u} to be piecewise continuous. We begin with an n-state two-control problem in Appendix A.1 followed by an n-state three-control problem in Appendix A.2.

    We consider the optimal control problem given by (A.1), with the goal of deriving the associated necessary conditions.

    \begin{gather} \max\limits_{\mathbf{u} = (u_{1}, u_{2})} \int_{0}^{T}f(t,\mathbf{x}(t),\mathbf{u}(t))\,dt\\ \begin{aligned}[t] &\text{subject to:}\quad 0\leq u_{1}(t)\leq 1,\quad 0\leq u_{2}(t)\leq 1 \quad \text{for }t\in[0,T],\\ &\phantom{\text{subject to:}}\quad u_{1}(t) + u_{2}(t) = 1 \quad\text{for }t\in[0,T]\\ &\phantom{\text{subject to:}}\quad \mathbf{x}'(t) = \mathbf{g}(t,\mathbf{x}(t),\mathbf{u}(t)),\quad \mathbf{x}(0) = \mathbf{x}_{0}\\ \end{aligned} \end{gather} (A.1)

    To this end, let U be the space of admissible controls, and define the functional J\colon U\to\mathbb{R} by

    \begin{align} J(\mathbf{u}) & = \int_{0}^{T} f(t,\mathbf{x}(t),\mathbf{u}(t))\,dt \\ & = \int_{0}^{T}\left\lbrace f(t,\mathbf{x}(t),\mathbf{u}(t)) + \mathit{\boldsymbol{\lambda}}(t)\cdot(\mathbf{g}(t,\mathbf{x}(t),\mathbf{u}(t)) - \mathbf{x}'(t))\right\rbrace\,dt \\ & = \int_{0}^{T}\left\lbrace f(t,\mathbf{x}(t),\mathbf{u}(t)) + \mathit{\boldsymbol{\lambda}}(t)\cdot \mathbf{g}(t,\mathbf{x}(t),\mathbf{u}(t)) + \mathit{\boldsymbol{\lambda}}'(t)\cdot \mathbf{x}(t)\right\rbrace\,dt\\ &\phantom{ = \int_{0}^{T}\lbrace f(t,\mathbf{x}(t),\mathbf{u}(t))} + \mathit{\boldsymbol{\lambda}}(0)\cdot\mathbf{x}_{0} - \mathit{\boldsymbol{\lambda}}(T)\cdot\mathbf{x}(T), \end{align} (A.2)

    where \mathit{\boldsymbol{\lambda}} (t) is a piecewise differentiable function to be specified later, and the final equality was obtained by integrating by parts. Note that by \mathit{\boldsymbol{\lambda}}' we mean the function whose n^{\text{th}} component is the time derivative of the n^{\text{th}} component of \mathit{\boldsymbol{\lambda}}. We use the same notation for \mathbf{x}'(t). Let \mathbf{u}^{*}, \mathbf{x}^{*} be optimal, and for piecewise continuous variations h_{1}, h_{2} and \varepsilon \in \mathbb{R} define (u_{1}^{ \varepsilon}, u_{2}^{ \varepsilon})\in U by u_{1}^{ \varepsilon}(t) = u_{1}^{*}(t) + \varepsilon h_{1}(t) and u_{2}^{ \varepsilon}(t) = u_{2}^{*}(t) + \varepsilon h_{2}(t), and let \mathbf{x}^{ \varepsilon} be the corresponding state. Note that admissibility requires u_{1}^{ \varepsilon} + u_{2}^{ \varepsilon} = 1, so we have h_{1} = -h_{2}. Because \mathbf{u}^{*}, \mathbf{x}^{*} are optimal, we have

    \begin{equation} 0 \geq \frac{dJ(\mathbf{u}^{ \varepsilon})}{d \varepsilon}\Big\vert_{ \varepsilon = 0} = \lim\limits_{ \varepsilon \to 0}\frac{J(\mathbf{u}^{ \varepsilon}) - J(\mathbf{u}^{*})}{ \varepsilon}. \end{equation} (A.3)

    We have an inequality rather than equality above because the boundedness of the controls makes it possible that optimality is only global rather than local, and the numerator in the difference quotient is nonpositive because J(\mathbf{u}^{*}) is maximal over admissible controls. We differentiate (A.2) with respect to \varepsilon and note that conditions on the functions involved in the integrand allow us to interchange the derivative and integral via the dominated convergence theorem. Suppressing the arguments from here on,

    \begin{align*} 0&\geq\int_{0}^{T} \frac{\partial}{\partial \varepsilon}\left(f + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g} + \mathit{\boldsymbol{\lambda}}'\cdot\mathbf{x}^{ \varepsilon}\right)\Big\vert_{ \varepsilon = 0}\,dt - \mathit{\boldsymbol{\lambda}}(T)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0}\\ & = \int_{0}^{T}\left\lbrace\left(\nabla f + \sum\limits_{i = 1}^{n}\lambda_{i}\nabla g_{i} + \mathit{\boldsymbol{\lambda}}'\right)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0}\right.\\ &\mathrel{\phantom{ = \int_{0}^{T}\left\lbrace\left(\nabla f + \right.\right.}}+ \left.\left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}}\right)h_{1} + \left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{2}\vphantom{\sum\limits_{i = 1}^{n}}\right\rbrace\,dt \\ &\mathrel{\phantom{ = \int_{0}^{T}\left\lbrace\left(\nabla f + (f_{u_{1}} + \right.\right.}} {}-\mathit{\boldsymbol{\lambda}}(T)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0}. \end{align*}

    Note that here we are using \nabla f to mean the vector of derivatives of f with respect to each component of \mathbf{x}, and likewise for \nabla g_{i}. Choosing \mathit{\boldsymbol{\lambda}} to satisfy

    \begin{equation*} \mathit{\boldsymbol{\lambda}}'(t) = -\left(\nabla f + \sum\limits_{i = 1}^{n}\lambda_{i}\nabla g_{i}\right),\quad \mathit{\boldsymbol{\lambda}}(T) = \mathbf{0}, \end{equation*}

    the inequality reduces to

    \begin{equation*} 0 \geq \int_{0}^{T}\left\lbrace\left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}}\right)h_{1} + \left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{2}\right\rbrace \,dt. \end{equation*}

    Using the fact that h_{1} = -h_{2}, we obtain

    \begin{equation*} 0\geq \int_{0}^{T}\left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{1}\,dt. \end{equation*}

    Next, we use this inequality to obtain necessary conditions for optimality. Let s be a point of continuity for u_{1}^{*} and u_{2}^{*} such that 0\leq u_{1}^{*}(s) < 1, and suppose for the sake of contradiction that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} > 0 at s. As \mathit{\boldsymbol{\lambda}} is continuous, and f and \mathbf{g} are continuously differentiable, by the continuity of u_{1}^{*} and u_{2}^{*} at s, we have that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} is continuous at s as well. Therefore, we can find a closed interval I about s such that u_{1}^{*} < 1 and f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} >0 on I. Let

    \begin{equation*} M = \max\lbrace u_{1}^{*}(t)\mid t\in I\rbrace < 1, \end{equation*}

    and choose h_{1} and h_{2} to be

    \begin{equation*} h_{1}(t) = (1-M)\chi_{I}(t),\quad h_{2}(t) = (M-1)\chi_{I}(t), \end{equation*}

    where \chi_{I} is the characteristic function on I. Note that this gives us

    \begin{equation*} u_{1}^{ \varepsilon} = u_{1}^{*} + \varepsilon (1-M)\chi_{I},\quad u_{2}^{ \varepsilon} = u_{2}^{*} + \varepsilon(M-1)\chi_{I}. \end{equation*}

    As u_{1}^{*}\leq M on I, it must be the case that u_{2}^{*} \geq 1-M on I since u_{2}^{*} = 1-u_{1}^{*}. This means that for all \varepsilon \in [0, 1] we have

    \begin{equation*} 0 \leq u_{1}^{ \varepsilon} \leq 1, \quad 0 \leq u_{2}^{ \varepsilon}\leq 1, \end{equation*}

    and as the variations were chosen such that h_{1} + h_{2} = 0, we have that these variations lead to admissible controls. We have then

    \begin{align*} 0 &\geq \int_{0}^{T} \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{1}\,dt\\ & = \int_{I} \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)(1-M)\,dt\\ & > 0, \end{align*}

    a contradiction. So, it must be the case that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} \leq 0. Note that because the controls sum to one, we have that 0 < u_{2}^{*} \leq 1 implies 0\leq u_{1}^{*} < 1, and so we actually have the following condition

    \begin{equation*} 0\leq u_{1}^{*} < 1 \quad \text{or}\quad 0 < u_{2}^{*} \leq 1 \implies f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} \leq 0. \end{equation*}

    Interchanging the roles of u_{1} and u_{2} in the argument above also gives us

    \begin{equation*} 0 < u_{1}^{*} \leq 1 \quad \text{or}\quad 0\leq u_{2}^{*} < 1 \implies f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} \geq 0. \end{equation*}

    So, together we have that

    \begin{equation} \begin{cases} 0\leq u_{1}^{*} < 1 \quad \text{or}\quad 0 < u_{2}^{*} \leq 1 \implies f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} \leq 0\\ 0 < u_{1}^{*} \leq 1 \quad \text{or}\quad 0\leq u_{2}^{*} < 1 \implies f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} \geq 0 \end{cases} \end{equation} (A.4)

    Now, suppose that we are in the case where f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{2}} < 0. By (A.4) we can rule out the possibility that either u_{1}^{*} > 0 or u_{2}^{*} < 1 because either choice would lead to a contradiction. Therefore, in this case, we can conclude that u_{1}^{*} = 0 and u_{2}^{*} = 1. Similar arguments lead to the following conditions:

    \begin{equation} \begin{cases} f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{2}} < 0 \implies u_{1}^{*} = 0, u_{2}^{*} = 1\\ f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{2}} > 0\implies u_{1}^{*} = 1,u_{2}^{*} = 0\\ f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}}\cdot\mathbf{g}_{u_{2}} = 0\implies 0 < u_{1}^{*},u_{2}^{*} < 1. \end{cases} \end{equation} (A.5)

    We can convert this into conditions involving a Hamiltonian as follows. Define H by

    \begin{equation*} H(t,\mathbf{x},\mathbf{u},\mathit{\boldsymbol{\lambda}}) : = f(t,\mathbf{x},\mathbf{u}) + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}(t,\mathbf{x},\mathbf{u}). \end{equation*}

    For distinct indices i, j \in \lbrace 1, 2\rbrace, rewriting (A.5) in terms of H yields the necessary conditions

    \begin{equation} \begin{cases} \frac{\partial H}{\partial u_{i}} < \frac{\partial H}{\partial u_{j}} &\implies u_{i}^{*} = 0, u_{j}^{*} = 1\\ \frac{\partial H}{\partial u_{i}} = \frac{\partial H}{\partial u_{j}}&\implies 0 \leq u_{i}^{*},u_{j}^{*} \leq 1. \end{cases} \end{equation} (A.6)

    We can also express the differential equations for \mathbf{x} and \mathit{\boldsymbol{\lambda}} in terms of the Hamiltonian as

    \begin{equation} \begin{cases} x_{i}'(t) = \frac{\partial H}{\partial \lambda_{i}}, &x_{i}(0) = x_{i0} \text{ for } i = 1,\dotsc, n\\ \lambda_{i}'(t) = -\frac{\partial H}{\partial x_{i}},\ &\lambda_{i}(T) = 0 \text{ for }i = 1,\dotsc,n. \end{cases} \end{equation} (A.7)

    Note that if we were to reduce the problem to one control by writing u_{2} = 1-u_{1}, these conditions would become exactly the standard necessary conditions for a problem with one bounded control and n states [13]. Going through this derivation with two controls, however, gives us a starting point to approach the analogous problem with three controls.

    We now consider the following optimal control problem with n states and three controls. We will again derive the necessary conditions, following a similar procedure to that used in the simpler case with only two controls. The problem is stated as follows.

    \begin{gather} \max\limits_{\mathbf{u} = (u_{1}, u_{2}, u_{3})} \int_{0}^{T}f(t,\mathbf{x}(t),\mathbf{u}(t))\,dt\\ \begin{aligned}[t] &\text{subject to:}\quad 0\leq u_{1}(t)\leq 1,\quad 0\leq u_{2}(t)\leq 1,\quad 0 \leq u_{3}(t)\leq 1 \quad\text{for }t\in[0,T],\\ &\phantom{\text{subject to:}}\quad u_{1}(t) + u_{2}(t) + u_{3}(t) = 1 \quad\text{for }t\in[0,T]\\ &\phantom{\text{subject to:}}\quad \mathbf{x}'(t) = \mathbf{g}(t,\mathbf{x}(t),\mathbf{u}(t)),\quad \mathbf{x}(0) = \mathbf{x}_{0} \end{aligned} \end{gather} (A.8)

    The derivation of the necessary conditions is similar to the two-control case. We let U be the space of admissible controls for (A.8) and define J\colon U\to \mathbb{R} by (A.2), restated below.

    \begin{align*} J(\mathbf{u}) & = \int_{0}^{T}\left\lbrace f(t,\mathbf{x}(t),\mathbf{u}(t)) + \mathit{\boldsymbol{\lambda}}(t)\cdot \mathbf{g}(t,\mathbf{x}(t),\mathbf{u}(t)) + \mathit{\boldsymbol{\lambda}}'(t)\cdot \mathbf{x}(t)\right\rbrace\,dt\nonumber\\ &\phantom{ = \int_{0}^{T}\lbrace f(t,\mathbf{x}(t),\mathbf{u}(t))} + \mathit{\boldsymbol{\lambda}}(0)\cdot\mathbf{x}_{0} - \mathit{\boldsymbol{\lambda}}(T)\cdot\mathbf{x}(T). \end{align*} (A.2 revisited)

    Suppose that \mathbf{u}^{*}, \mathbf{x}^{*} are optimal, and let h_{1}, h_{2}, and h_{3} be piecewise continuous variations. Then, for \varepsilon \in \mathbb{R}, we define \mathbf{u}^{ \varepsilon} \in U by u_{i}^{ \varepsilon}(t) = u_{i}^{*}(t) + \varepsilon h_{i}(t) for i = 1, 2, 3, and let \mathbf{x}^{ \varepsilon} be the corresponding state. Admissibility here requires that \sum_{i = 1}^{3}u_{i}^{ \varepsilon} = 1, so we have that h_{3} = -(h_{1} + h_{2}). By the optimality of \mathbf{u}^{*}, \mathbf{x}^{*} we again get (A.3), restated below:

    \begin{equation} 0 \geq \frac{dJ(\mathbf{u}^{ \varepsilon})}{d \varepsilon}\Big\vert_{ \varepsilon = 0} = \lim_{ \varepsilon \to 0}\frac{J(\mathbf{u}^{ \varepsilon}) - J(\mathbf{u}^{*})}{ \varepsilon}. \end{equation} (A.3 revisited)

    We again differentiate (A.2), using the DCT to interchange the order of differentiation and integration, this time arriving at

    \begin{align*} 0&\geq\int_{0}^{T} \frac{\partial}{\partial \varepsilon}\left(f + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g} + \mathit{\boldsymbol{\lambda}}'\cdot\mathbf{x}^{ \varepsilon}\right)\Big\vert_{ \varepsilon = 0}\,dt - \mathit{\boldsymbol{\lambda}}(T)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0}\\ & = \int_{0}^{T}\left\lbrace\left(\nabla f + \sum\limits_{i = 1}^{n}\lambda_{i}\nabla g_{i} + \mathit{\boldsymbol{\lambda}}'\right)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0} + \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}}\right)h_{1}\right.\\ &\mathrel{\phantom{ = \int_{0}^{T}\left\lbrace\left(\nabla f + \right.\right.}}+ \left.\left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{2} + \left(f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}}\right)h_{3}\vphantom{\sum\limits_{i = 1}^{n}}\right\rbrace\,dt -\mathit{\boldsymbol{\lambda}}(T)\cdot\frac{\partial \mathbf{x}^{ \varepsilon}(t)}{\partial \varepsilon}\Big\vert_{ \varepsilon = 0}. \end{align*}

    As before, we choose \mathit{\boldsymbol{\lambda}} so that

    \begin{equation*} \mathit{\boldsymbol{\lambda}}'(t) = -\left(\nabla f + \sum\limits_{i = 1}^{n}\lambda_{i}\nabla g_{i}\right),\quad \mathit{\boldsymbol{\lambda}}(T) = \mathbf{0}, \end{equation*}

    and so we arrive at the inequality

    \begin{equation} 0 \geq \int_{0}^{T}\left\lbrace \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}}\right)h_{1} + \left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}}\right)h_{2} + \left(f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}}\right)h_{3}\right\rbrace\,dt. \end{equation} (A.9)

    We will use this inequality to derive the necessary conditions for optimality. Due to the similarity between the various cases, we will only show one in detail. We begin by using the substitution h_{3} = -(h_{1} + h_{2}) to rewrite (A.9) as

    \begin{equation} 0 \geq \int_{0}^{T}\left\lbrace \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{3}}\right)h_{1} + \left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{3}}\right)h_{2}\right\rbrace\,dt. \end{equation} (A.10)

    Next, let s be a point of continuity for all controls so that 0\leq u_{1}^{*}(s) < 1 and 0 < u_{3}^{*}(s)\leq 1. Note that because the controls sum to one, this also means that 0\leq u_{2}^{*}(s)< 1. Additionally, assume for the sake of contradiction that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} >0 at s. As \mathit{\boldsymbol{\lambda}} is continuous, and f and \mathbf{g} are continuously differentiable, by the continuity of the controls, we have that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} is continuous at s as well. Therefore, we can find a closed interval I about s such that u_{1}^{*} < 1, u_{3}^{*} > 0, and f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} > 0 on I. Next, let

    \begin{equation*} m = \min\biggl\lbrace 1 - \max\lbrace u_{1}^{*}(t)\mid t\in I\rbrace, \min\lbrace u_{3}^{*}(t)\mid t\in I\rbrace\biggr\rbrace. \end{equation*}

    Note that m > 0. Next, we choose variations to be

    \begin{equation*} h_{1}(t) = m\chi_{I}(t),\quad h_{2}(t) \equiv 0, \quad h_{3}(t) = -m\chi_{I}(t). \end{equation*}

    This gives us the controls

    \begin{equation*} u_{1}^{ \varepsilon}(t) = u_{1}^{*}(t) + \varepsilon m\chi_{I}(t),\quad u_{2}^{ \varepsilon}(t) = u_{2}^{*}(t),\quad u_{3}^{ \varepsilon}(t) = u_{3}^{*}(t) - \varepsilon m\chi_{I}(t). \end{equation*}

    Note that because \sum_{i = 1}^{3}u_{i}^{*}(t) = 1 and \sum_{i = 1}^{3}h_{i}(t) = 0, we have that \sum_{i = 1}^{3}u_{i}^{ \varepsilon}(t) = 1 as well. Furthermore, restricting our attention to t\in I and \varepsilon \in [0, 1], we have

    \begin{align*} 0&\leq u_{1}^{ \varepsilon}(t)\\ & = u_{1}^{*}(t) + \varepsilon m\\ &\leq u_{1}^{*}(t) + \varepsilon(1 - \max\lbrace u_{1}^{*}(t)\rbrace)\\ &\leq u_{1}^{*}(t) + 1 - \max\lbrace u_{1}^{*}(t)\rbrace\\ & \leq 1. \end{align*}

    As this bound also holds outside of I by assumption, we have that 0\leq u_{1}^{ \varepsilon} \leq 1. Now, for u_{3}^{ \varepsilon}, again restricting our attention to t\in I and \varepsilon \in [0, 1], we have

    \begin{align*} 1 &\geq u_{3}^{ \varepsilon}(t)\\ & = u_{3}^{*}(t) - \varepsilon m\chi_{I}(t)\\ &\geq u_{3}^{*}(t) - \varepsilon\min\lbrace u_{3}^{*}(t)\rbrace\\ &\geq u_{3}^{*}(t) - \min\lbrace u_{3}^{*}(t)\rbrace\\ & \geq 0. \end{align*}

    Again, as this bound holds outside of I by assumption, we have that 0 \leq u_{3}^{ \varepsilon} \leq 1. Therefore, the controls u_{1}^{ \varepsilon}, u_{2}^{ \varepsilon}, and u_{3}^{ \varepsilon} are admissible for all \varepsilon \in [0, 1]. We have then

    \begin{align*} 0 &\geq \int_{0}^{T}\left\lbrace \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{3}}\right)h_{1} + \left(f_{u_{2}} + \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{3}}\right)h_{2}\right\rbrace\,dt\\ & = \int_{I} \left(f_{u_{1}} + \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot \mathbf{g}_{u_{3}}\right)m \,dt\\ & > 0, \end{align*}

    a contradiction. Therefore, it must be the case that f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0.

    Now, by swapping the indices 1 and 2 in the argument above, we can reach the additional conclusion that f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0 for the same set of assumptions. Note that this is because if we assume that 0 < u_{3}^{*}\leq 1, then we can conclude both 0 \leq u_{1}^{*} < 1 and 0 \leq u_{2}^{*} < 1 regardless of which assumption is used for a particular argument. Therefore, if we are in the case that 0 \leq u_{1}^{*} < 1, 0 \leq u_{2}^{*} < 1, and 0 < u_{3}^{*}\leq 1, then we have that both f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0 and f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0. We can permute the roles of u_{1}, u_{2}, and u_{3}, or, equivalently, repeat the argument above with the substitutions of h_{1} = -(h_{3} + h_{2}) or h_{2} = -(h_{3} + h_{1}) into (A.9) to obtain the following.

    \begin{equation} \begin{cases} 0 \leq u_{1}^{*},u_{2}^{*} < 1, &0 < u_{3}^{*} \leq 1 \implies \begin{cases} f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0\\ f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} \leq 0 \end{cases} \\ 0 \leq u_{1}^{*},u_{3}^{*} < 1, &0 < u_{2}^{*} \leq 1 \implies \begin{cases} f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{2}} \leq 0\\ f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{2}} \leq 0 \end{cases}\\ 0 \leq u_{2}^{*},u_{3}^{*} < 1, &0 < u_{1}^{*} \leq 1 \implies \begin{cases} f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} \leq 0\\ f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} \leq 0 \end{cases} \end{cases} \end{equation} (A.11)

    We will now work to develop implications going in the other direction. We will illustrate this for a few cases, and the remaining cases follow from permuting the controls. First, consider the case that at a point t\in [0, T], we have f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} < 0 and f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} < 0, and suppose for the sake of contradiction that u_{3}^{*}(t) < 1. Then it must be the case that at t either 0 < u_{1}^{*}\leq 1 and 0\leq u_{2}^{*} < 1 or 0 \leq u_{1}^{*} < 1 and 0 < u_{2}^{*} \leq 1. Using (A.11), we see that the first case implies that f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} \leq 0 at t, and the second case implies that f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{2}} \leq 0 at t, both of which contradict our assumptions. So, it must be the case that u_{3}^{*}(t) = 1.

    Next, consider the case that at a point t\in [0, T], we have f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} < 0 and f_{u_{1}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{1}} - f_{u_{2}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{2}} < 0, and suppose for the sake of contradiction that u_{1}^{*}(t) > 0. Then it must be the case that at t, we have 0\leq u_{2}^{*} < 1 and 0\leq u_{3}^{*} < 1. Using (A.11), we see that this implies that f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} \leq 0 and f_{u_{3}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{3}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} \leq 0 at t, both of which contradict our assumptions here. Therefore, it must be the case that u_{1}^{*}(t) = 0. A similar argument can be used to show that if f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{1}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{1}} < 0 and f_{u_{2}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{2}} - f_{u_{3}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{3}} < 0 at a point t\in [0, T], then it must be the case that u_{2}^{*}(t) = 0.

    By permuting the roles of the controls in the arguments above, we can reach the following conclusions. For distinct indices i, j, k\in \lbrace 1, 2, 3\rbrace, we have that

    \begin{equation} \begin{cases} \begin{cases} f_{u_{j}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{j}} - f_{u_{i}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{i}} < 0\\ f_{u_{k}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{k}} - f_{u_{i}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{i}} < 0 \end{cases} \implies u_{i}^{*} = 1\\ \begin{cases} f_{u_{i}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{i}} - f_{u_{j}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{j}} < 0\\ f_{u_{i}} + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}_{u_{i}} - f_{u_{k}} - \mathit{\boldsymbol{\lambda}} \cdot\mathbf{g}_{u_{k}} < 0 \end{cases} \implies u_{i}^{*} = 0\\ \end{cases} \end{equation} (A.12)

    Note that this also tells us that the only way for a control to take on a value other than zero or one is for at least one of the inequalities in (A.12) to be an equality instead. We can convert this into conditions involving a Hamiltonian as follows. Define H by

    \begin{equation*} H(t,\mathbf{x},\mathbf{u},\mathit{\boldsymbol{\lambda}}) : = f(t,\mathbf{x},\mathbf{u}) + \mathit{\boldsymbol{\lambda}}\cdot \mathbf{g}(t,\mathbf{x},\mathbf{u}). \end{equation*}

    Rewriting (A.12) in terms of H yields the necessary conditions

    \begin{equation} \begin{cases} \frac{\partial H}{\partial u_{j}} < \frac{\partial H}{\partial u_{i}} \quad \text{and}\quad\frac{\partial H}{\partial u_{k}} < \frac{\partial H}{\partial u_{i}}&\implies u_{i}^{*} = 1\\ \frac{\partial H}{\partial u_{i}} < \frac{\partial H}{\partial u_{j}}\quad \text{and}\quad \frac{\partial H}{\partial u_{i}} < \frac{\partial H}{\partial u_{k}}&\implies u_{i}^{*} = 0\\ \frac{\partial H}{\partial u_{i}} = \frac{\partial H}{\partial u_{j}}\quad \text{or}\quad \frac{\partial H}{\partial u_{i}} = \frac{\partial H}{\partial u_{k}}\quad\text{or}\quad \frac{\partial H}{\partial u_{j}} = \frac{\partial H}{\partial u_{k}}&\implies 0 \leq u_{i}^{*} \leq 1. \end{cases} \end{equation} (A.13)

    We can also express the differential equations for \mathbf{x} and \mathit{\boldsymbol{\lambda}} in terms of the Hamiltonian:

    \begin{equation} \begin{cases} x_{i}'(t) = \frac{\partial H}{\partial \lambda_{i}}, &x_{i}(0) = x_{i0} \text{ for } i = 1,\dotsc, n\\ \lambda_{i}'(t) = -\frac{\partial H}{\partial x_{i}},\ &\lambda_{i}(T) = 0 \text{ for }i = 1,\dotsc,n. \end{cases} \end{equation} (A.14)
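    Although not part of the analysis above, the following Python sketch illustrates how the pointwise comparisons in (A.13) would typically be used in a numerical scheme: given the three partial derivatives of H at a single time, it returns the control values forced by (A.13), leaving ties (the singular cases) undetermined. The function name and the tolerance are illustrative assumptions only.

```python
# A minimal sketch (not from the paper) of the pointwise allocation rule in (A.13).
# dH_du holds dH/du_1, dH/du_2, dH/du_3 at one time point; tol is an
# illustrative numerical safeguard for comparing the partial derivatives.

def allocate_controls(dH_du, tol=1e-12):
    """Return [u1, u2, u3] with entries 1.0, 0.0, or None, where None means
    the value is not determined by (A.13) alone (a tie/singular case)."""
    u = [None, None, None]
    for i in range(3):
        others = [dH_du[j] for j in range(3) if j != i]
        if all(dH_du[i] > o + tol for o in others):
            u[i] = 1.0  # dH/du_i is strictly the largest partial derivative
        elif all(dH_du[i] < o - tol for o in others):
            u[i] = 0.0  # dH/du_i is strictly the smallest partial derivative
    if 1.0 in u:
        # The controls sum to one, so a control forced to 1 forces the others to 0.
        u = [1.0 if x == 1.0 else 0.0 for x in u]
    return u


if __name__ == "__main__":
    print(allocate_controls([0.3, 0.1, 0.2]))  # [1.0, 0.0, 0.0]
    print(allocate_controls([0.1, 0.2, 0.2]))  # [0.0, None, None]: u_2 and u_3 tie
```

    In a forward-backward style computation, such a rule would be applied at each time step between a forward pass of the state equations and a backward pass of the adjoint equations in (A.14), with the tie cases handled by the problem-specific singular analysis carried out in the lemmas below.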

    In this appendix, we include the proofs of Lemma 1, Lemma 2, Lemma 3, and Theorem 1.

    Lemma 1. \frac{\partial H}{\partial u_{RC}} < \frac{\partial H}{\partial u_{FC}} in an open interval immediately prior to t^{*}.

    Proof. Because \lambda_{R}(t^{*}) = 0, \lambda_{R} is continuous, and G' is bounded, there exists \varepsilon > 0, such that for all t in (t^{*} - \varepsilon, t^{*}), we have

    \begin{equation*} \lambda_{R}G'(z_{RC})C < C, \end{equation*}

    which in terms of the Hamiltonian (3.31) gives the desired result:

    \begin{equation*} \frac{\partial H}{\partial u_{RC}} < \frac{\partial H}{\partial u_{FC}}. \end{equation*}

    Lemma 2. \frac{\partial H}{\partial u_{SC}} = \frac{\partial H}{\partial u_{FC}} in an open interval immediately prior to t^{*}.

    Proof. We will begin by showing that

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} \geq \frac{\partial H}{\partial u_{FC}} \end{equation*}

    in an open interval immediately prior to t^{*}. Suppose to the contrary that for some \varepsilon > 0 we have for all t in (t^{*} - \varepsilon, t^{*}) that

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} < \frac{\partial H}{\partial u_{FC}}. \end{equation*}

    By Lemma 1, we have that

    \begin{equation*} \frac{\partial H}{\partial u_{RC}} < \frac{\partial H}{\partial u_{FC}} \end{equation*}

    as well, and so by (3.25), this means that u_{FC} = 1 and u_{SC} = 0 = u_{RC}. As in the final interval, this yields

    \begin{equation*} \frac{\partial H}{\partial u_{FC}} = C,\quad \frac{\partial H}{\partial u_{SC}} = \lambda_{S}C. \end{equation*}

    However, because \lambda_{S}(t^{*}) = 1 and, by (3.45), \lambda_{S}' < 0, it must be the case that \lambda_{S} > 1 on this interval. This then means that

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} > \frac{\partial H}{\partial u_{FC}}, \end{equation*}

    a contradiction. Therefore, we have shown that during this interval, we have

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} \geq \frac{\partial H}{\partial u_{FC}}. \end{equation*}

    Suppose now that the inequality is strict, and that in fact we have

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} > \frac{\partial H}{\partial u_{FC}}. \end{equation*}

    Now, combining this supposition with Lemma 1 gives us

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} > \frac{\partial H}{\partial u_{FC}} > \frac{\partial H}{\partial u_{RC}}, \end{equation*}

    which by (3.25) means that

    \begin{equation*} u_{SC} = 1, \quad u_{FC} = 0 = u_{RC}. \end{equation*}

    This then gives us

    \begin{equation*} z_{SN} = \frac{\nu_{S}u_{SN}N}{C}, \end{equation*}

    which, for fixed C and N, is bounded between 0 and \frac{\nu_{S}N}{C}. Therefore, we have that G'(z_{SN}) \neq 0. Furthermore, recall that we know that \lambda_{R}(t^{*}) = 0 and G' is bounded. This means that there is a (potentially smaller) open interval immediately prior to t^{*} in which we have

    \begin{equation*} \lambda_{S}\nu_{S}NG'(z_{SN}) > \lambda_{R}\nu_{R}NG'(z_{RN}), \end{equation*}

    or in terms of the Hamiltonian,

    \begin{equation*} \frac{\partial H}{\partial u_{SN}} > \frac{\partial H}{\partial u_{RN}}. \end{equation*}

    By (3.25), this means that u_{SN} = 1 and u_{RN} = 0 during this interval. With this configuration of the controls, R' = 0, and because this interval immediately precedes the final interval, N = N^{*} is constant here. So, we have that

    \begin{equation*} z_{SC} = \frac{C}{\nu_{S}N^{*}} > 0. \end{equation*}

    This then means that G'(z_{SC}) < 1, and so as \lambda_{S}(t^{*}) = 1 and both z_{SC} and \lambda_{S} are continuous here, there exists \varepsilon > 0 such that for all t in (t^{*} - \varepsilon, t^{*}), we have

    \begin{equation*} \lambda_{S}G'(z_{SC}) < 1 \end{equation*}

    that is,

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} < \frac{\partial H}{\partial u_{FC}}, \end{equation*}

    a contradiction. Therefore, we have

    \begin{equation*} \frac{\partial H}{\partial u_{SC}} = \frac{\partial H}{\partial u_{FC}} \end{equation*}

    in an open interval immediately prior to t^{*}, as desired.

    Lemma 3. u_{SN} = 1 and u_{SC} > 0 in an open interval immediately prior to t^{*}.

    Proof. We will first show that u_{SN} is nonzero. Recall that by (4.15) and (3.25), we have that u_{RC} = 0 in an open interval immediately prior to t^{*}. Suppose furthermore that u_{SN} = 0 during this interval as well. By (3.6) and (3.7), this means that both S' = 0 and R' = 0. As this interval is followed by a period of fruit-only growth, in order to maximize the integral in (3.23), it must be the case that u_{FC} = 1. We therefore have

    \begin{equation*} u_{FC} = 1,\quad u_{SC} = 0 = u_{RC},\quad u_{SN} = 0,\quad u_{RN} = 1. \end{equation*}

    In particular, because here we are again in a case where u_{FC} = 1, u_{SC} = 0 = u_{RC}, and \lambda_{R}(t^{*}) = 0, we again arrive at a situation where

    \begin{equation*} \frac{\partial H}{\partial u_{FC}} = C,\quad \frac{\partial H}{\partial u_{SC}} = \lambda_{S}C \end{equation*}

    following the same argument as in the justification for the existence of a final interval of fruit-only growth. However, as we know that here \lambda_{S} > 1, we have that

    \begin{equation*} \frac{\partial H}{\partial u_{FC}} < \frac{\partial H}{\partial u_{SC}}, \end{equation*}

    which, regardless of how \frac{\partial H}{\partial u_{RC}} compares with the other two partial derivatives, contradicts the fact that u_{FC} = 1. Hence, u_{SN} > 0.

    Now, because \lambda_{S} > 1, we know by (4.16) that z_{SC} > 0 so that G'(z_{SC}) < 1. Because u_{SN} > 0, this allows us to conclude that u_{SC} > 0 as well without any concern regarding ambiguous limiting behavior with G'(z_{SC}) when one or both controls are zero. This in turn implies that

    \begin{equation*} \frac{\partial H}{\partial u_{SN}} = \lambda_{S}\nu_{S}NG'(z_{SN}) > 0. \end{equation*}

    Now, if u_{SN} < 1, we would also have u_{RN} > 0. Because u_{RC} = 0, this would imply that z_{RN} \to \infty, and so by (3.18), we would have G'(z_{RN}) = 0. This would then imply that

    \begin{equation*} \frac{\partial H}{\partial u_{SN}} > \frac{\partial H}{\partial u_{RN}}, \end{equation*}

    which by (3.25) means that u_{SN} = 1, a contradiction. Therefore, it must be the case that u_{SN} = 1, in addition to u_{SC} > 0, in an open interval immediately prior to t^{*}, as desired.

    Theorem 1. Let t^{\#} be a point in the penultimate interval such that \lambda_{S}(t^{\#}) > 1 > \lambda_{R}(t^{\#}) and such that this ordering of the marginal values holds on [t^{\#}, t^{\#} + \varepsilon] for some \varepsilon > 0. Let F(T) be the value obtained by following a mixed fruit/shoot strategy on [t^{\#}, t^{\#} + \varepsilon] and then following the optimal trajectory on (t^{\#} + \varepsilon, T]. Let F_{\text{alt}}(T) be the value obtained by following a shoot-only strategy S_{\text{alt}}(t) on [t^{\#}, t^{\#} + \varepsilon] and then following the optimal trajectory on (t^{\#} + \varepsilon, T]. Then, to leading order,

    \begin{equation} F(T) - F_{\text{alt}}(T) \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right) > 0. \end{equation} (4.22 revisited)

    Proof. To derive the approximation in Theorem 1, we will first recall a definition we introduced in Section 4.3 when discussing the marginal values. Let J(S_{0}, R_{0}, t_{0}) be the maximum for the following optimal control problem.

    \begin{gather} \max_{\mathbf{u}}\int_{t_{0}}^{T} u_{FC}C\, dt \\ {\begin{aligned}\nonumber &\text{subject to:}\quad u_{iC} \geq 0,\ u_{jN} \geq 0 \text{ for } i = 0,1,2,\ j = 1,2\\ &\phantom{\text{subject to:}\quad} u_{FC} + u_{SC} + u_{RC} = 1 = u_{SN} + u_{RN}\\ &\phantom{\text{subject to:}\quad} \frac{dS}{dt} = g(u_{SC}C, \nu_{S}u_{SN}N),\quad S(t_{0}) = S_{0} \\ &\phantom{\text{subject to:}\quad}\frac{dR}{dt} = g(u_{RC}C, \nu_{R}u_{RN}N),\quad R(t_{0}) = R_{0} \\ \end{aligned}} \end{gather} (4.19 revisited)

    We wish to compute F(T) - F_{\text{alt}}(T), which in terms of J is given by

    \begin{equation} \int_{t^{\#}}^{t^{\#} + \varepsilon} F'(t)\,dt + J(S(t^{\#} + \varepsilon), R^{*}, t^{\#} + \varepsilon) - J(S_{\text{alt}}(t^{\#} + \varepsilon), R^{*}, t^{\#} + \varepsilon), \end{equation} (B.1)

    where we use the 'alt' subscript wherever we need to distinguish variables that differ from the optimal solution based on the shoot-only growth strategy during [t^{\#}, t^{\#} + \varepsilon]. Note also that here R = R^{*}, because we are in the penultimate interval. We can interpret (B.1) as the sum of the amount of fruit grown during the interval [t^{\#}, t^{\#}+ \varepsilon] using the mixed fruit/shoot strategy and the difference between the final value of fruits depending on the different values of S(t^{\#} + \varepsilon) produced by each strategy on [t^{\#}, t^{\#}+ \varepsilon]. Now, for small enough \varepsilon, we have that

    \begin{align*} S(t^{\#} + \varepsilon) &\approx S(t^{\#}) + \varepsilon S'(t^{\#}) = S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{u_{SC}(t^{\#})C(S(t^{\#}))}{\nu_{S}N^{*}}\right)\\ S_{\text{alt}}(t^{\#} + \varepsilon) &\approx S(t^{\#}) + \varepsilon S_{\text{alt}}'(t^{\#}) = S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{C(S(t^{\#}))}{\nu_{S}N^{*}}\right) \end{align*}

    which makes (B.1) approximately

    \begin{align} \int_{t^{\#}}^{t^{\#} + \varepsilon} F'(t)\,dt &+ J\left(S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{u_{SC}(t^{\#})C(S(t^{\#}))}{\nu_{S}N^{*}}\right), R^{*}, t^{\#} + \varepsilon\right)\\ &- J\left(S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{C(S(t^{\#}))}{\nu_{S}N^{*}}\right), R^{*}, t^{\#} + \varepsilon\right). \end{align} (B.2)

    We will simplify the last two terms in (B.2), using the following labels for convenience:

    \begin{align*} I_{1} & = \int_{t^{\#}}^{t^{\#} + \varepsilon} F'(t)\,dt\\ I_{2} & = J\left(S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{u_{SC}(t^{\#})C(S(t^{\#}))}{\nu_{S}N^{*}}\right), R^{*}, t^{\#} + \varepsilon\right)\\ I_{3} & = J\left(S(t^{\#}) + \varepsilon\nu_{S}N^{*}G\left(\frac{C(S(t^{\#}))}{\nu_{S}N^{*}}\right), R^{*}, t^{\#} + \varepsilon\right). \end{align*}

    Beginning with I_{2}, and integrating by parts:

    \begin{align*} I_{2} & = \int_{t^{\#}+ \varepsilon}^{T}u_{FC}(t)C(S(t))\,dt\\ & = \int_{t^{\#}+ \varepsilon}^{T}\left\lbrace u_{FC}(t)C(S(t)) + \lambda_{S}(t)S'(t) - \lambda_{S}(t)S'(t) + \lambda_{R}(t)R'(t) - \lambda_{R}(t) R'(t)\right\rbrace\,dt\\ & = \int_{t^{\#}+ \varepsilon}^{T}\left\lbrace u_{FC}(t)C(S(t)) + \lambda_{S}(t)S'(t) + \lambda_{S}'(t)S(t) + \lambda_{R}(t)R'(t) + \lambda_{R}'(t)R(t)\right\rbrace\,dt \\ &\qquad - \lambda_{S}(T)S(T) + \lambda_{S}(t^{\#}+ \varepsilon)S(t^{\#}+ \varepsilon) - \lambda_{R}(T)R(T) + \lambda_{R}(t^{\#}+ \varepsilon)R(t^{\#}+ \varepsilon). \end{align*}

    Likewise for I_{3}, using the same \lambda_{S}, \lambda_{R}, we have:

    \begin{align*} I_{3} & = \int_{t^{\#}+ \varepsilon}^{T}\left\lbrace u_{FC,\text{alt}}(t)C(S_{\text{alt}}(t)) + \lambda_{S}(t)S_{\text{alt}}'(t) + \lambda_{S}'(t)S_{\text{alt}}(t) + \lambda_{R}(t)R_{\text{alt}}'(t) + \lambda_{R}'(t)R_{\text{alt}}(t)\right\rbrace\,dt\\ &\qquad - \lambda_{S}(T)S_{\text{alt}}(T) + \lambda_{S}(t^{\#}+ \varepsilon)S_{\text{alt}}(t^{\#}+ \varepsilon) - \lambda_{R}(T)R_{\text{alt}}(T) + \lambda_{R}(t^{\#}+ \varepsilon)R_{\text{alt}}(t^{\#}+ \varepsilon). \end{align*}

    Therefore, I_{2} - I_{3} becomes

    \begin{align} \int_{t^{\#}+ \varepsilon}^{T}&\left\lbrace u_{FC}(t)C(S(t)) - u_{FC,\text{alt}}(t)C(S_{\text{alt}}(t)) + \lambda_{S}(t)(S'(t) - S_{\text{alt}}'(t)) + \lambda_{S}'(t)(S(t) - S_{\text{alt}}(t))\right.\\ &\left. + \lambda_{R}(t)(R'(t) - R_{\text{alt}}'(t)) + \lambda_{R}'(t)(R(t) - R_{\text{alt}}(t))\right\rbrace\,dt\\ & - \lambda_{S}(T)(S(T) - S_{\text{alt}}(T)) + \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon))\\ & - \lambda_{R}(T)(R(T) - R_{\text{alt}}(T)) + \lambda_{R}(t^{\#}+ \varepsilon)(R(t^{\#}+ \varepsilon) - R_{\text{alt}}(t^{\#}+ \varepsilon)). \end{align} (B.3)

    Next, we will approximate the integrand of (B.3) with its Taylor series around the point (t, S(t), R(t), \mathbf{u}(t)). We keep only the first-order terms of the expansion: the differences between the perturbed and optimal trajectories are already O(\varepsilon), so the higher-order terms contribute only o(\varepsilon) to the integral. We have then that I_{2} - I_{3} is approximately

    \begin{align} &\int_{t^{\#} + \varepsilon}^{T}\left\lbrace \left(u_{FC}(t)\frac{dC}{dS}(S(t)) + \lambda_{S}(t)\frac{dS'}{dS}(t) + \lambda_{R}(t)\frac{dR'}{dS}(t) +\lambda_{S}'(t)\right)(S(t) - S_{\text{alt}}(t))\right.\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}+\left(\lambda_{S}(t)\frac{dS'}{dR}(t) + \lambda_{R}(t)\frac{dR'}{dR}(t) + \lambda_{R}'(t)\right)(R(t) - R_{\text{alt}}(t))\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}+C(S(t))(u_{FC}(t) - u_{FC,\text{alt}}(t))+\lambda_{S}(t)\frac{dS'}{du_{SC}}(t)(u_{SC}(t) - u_{SC,\text{alt}}(t))\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}+ \lambda_{R}(t)\frac{dR'}{du_{RC}}(t)(u_{RC}(t) - u_{RC,\text{alt}}(t))+\lambda_{S}(t)\frac{dS'}{du_{SN}}(t)(u_{SN}(t) - u_{SN,\text{alt}}(t))\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}\left.+ \lambda_{R}(t)\frac{dR'}{du_{RN}}(t)(u_{RN}(t) - u_{RN,\text{alt}}(t))\right\rbrace\,dt\\ & - \lambda_{S}(T)(S(T) - S_{\text{alt}}(T)) + \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon))\\ & - \lambda_{R}(T)(R(T) - R_{\text{alt}}(T)) + \lambda_{R}(t^{\#}+ \varepsilon)(R(t^{\#}+ \varepsilon) - R_{\text{alt}}(t^{\#}+ \varepsilon)). \end{align} (B.4)

    Note that by the differential equations for \lambda_{S} and \lambda_{R}, their terminal conditions \lambda_{S}(T) = 0 = \lambda_{R}(T), and the fact that R(t^{\#} + \varepsilon) = R_{\text{alt}}(t^{\#} + \varepsilon), this simplifies to

    \begin{align*} &\int_{t^{\#} + \varepsilon}^{T}\left\lbrace C(S(t))(u_{FC}(t) - u_{FC,\text{alt}}(t))+\lambda_{S}(t)\frac{dS'}{du_{SC}}(t)(u_{SC}(t) - u_{SC,\text{alt}}(t))\right.\nonumber\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}+ \lambda_{R}(t)\frac{dR'}{du_{RC}}(t)(u_{RC}(t) - u_{RC,\text{alt}}(t))+\lambda_{S}(t)\frac{dS'}{du_{SN}}(t)(u_{SN}(t) - u_{SN,\text{alt}}(t))\nonumber\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}\left.+ \lambda_{R}(t)\frac{dR'}{du_{RN}}(t)(u_{RN}(t) - u_{RN,\text{alt}}(t))\right\rbrace\,dt\nonumber\\ & + \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)). \end{align*}

    We can rewrite this in terms of the Hamiltonian:

    \begin{align*} I_{2} - I_{3}\approx &\int_{t^{\#} + \varepsilon}^{T}\left\lbrace \frac{\partial H}{\partial u_{FC}}(t)(u_{FC}(t) - u_{FC,\text{alt}}(t))+\frac{\partial H}{\partial u_{SC}}(t)(u_{SC}(t) - u_{SC,\text{alt}}(t))\right.\nonumber\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}+ \frac{\partial H}{\partial u_{RC}}(t)(u_{RC}(t) - u_{RC,\text{alt}}(t))+\frac{\partial H}{\partial u_{SN}}(t)(u_{SN}(t) - u_{SN,\text{alt}}(t))\nonumber\\ &\phantom{\int_{t^{\#} + \varepsilon}^{T}\left\lbrace\right.}\left.+ \frac{\partial H}{\partial u_{RN}}(t)(u_{RN}(t) - u_{RN,\text{alt}}(t))\right\rbrace\,dt\nonumber\\ & + \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)). \end{align*}

    Now, because we know the structure of the optimal solution, namely that during the penultimate interval \frac{\partial H}{\partial u_{FC}}(t) = \frac{\partial H}{\partial u_{SC}}(t), u_{SN} = 1, \frac{\partial H}{\partial u_{RC}}(t) = 0 = \frac{\partial H}{\partial u_{RN}}(t), and during the final interval u_{FC} = 1, \frac{\partial H}{\partial u_{FC}}(t) = C^{*}, we have that I_{2} - I_{3} is approximately

    \begin{align*} \int_{t^{\#} + \varepsilon}^{t^{*}}&\left\lbrace \frac{\partial H}{\partial u_{FC}}(u_{FC}(t) + u_{SC}(t) - u_{FC,\text{alt}}(t) - u_{SC,\text{alt}}(t)) + \frac{\partial H}{\partial u_{SN}}(1-u_{SN,\text{alt}}(t))\right\rbrace\,dt\\ &+\int_{t^{*}}^{T}C^{*}(1-u_{FC,\text{alt}}(t))\,dt + \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)). \end{align*}

    Assume that \varepsilon is chosen small enough that the optimal solution to the perturbed problem also begins in the penultimate interval at t = t^{\#} + \varepsilon. This gives us u_{FC, \text{alt}}(t) + u_{SC, \text{alt}}(t) = 1 = u_{SN, \text{alt}}(t), and so

    \begin{equation*} I_{2}-I_{3} \approx C^{*}\int_{t^{*}}^{T}u_{SC,\text{alt}}(t)\,dt+ \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)). \end{equation*}

    Next, we use

    \begin{equation*} \int_{t^{\#}}^{t^{\#}+ \varepsilon}F'(t)\,dt \approx \varepsilon F'(t^{\#}), \end{equation*}

    which gives us

    \begin{equation*} I_{1} + I_{2} - I_{3} \approx \varepsilon F'(t^{\#})+ \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)) + C^{*}\int_{t^{*}}^{T}u_{SC,\text{alt}}(t)\,dt. \end{equation*}

    Using a linear approximation, and the fact that S(t^{\#}) = S_{\text{alt}}(t^{\#}), we have that to O(\varepsilon)

    \begin{align*} \lambda_{S}(t^{\#}+ \varepsilon)(S(t^{\#}+ \varepsilon) - S_{\text{alt}}(t^{\#}+ \varepsilon)) &\approx (\lambda_{S}(t^{\#}) + \varepsilon \lambda_{S}'(t^{\#}))( \varepsilon S'(t^{\#}) - \varepsilon S_{\text{alt}}'(t^{\#}))\\ &\approx \varepsilon \lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#})). \end{align*}

    Therefore, we have that

    \begin{equation} I_{1} + I_{2} - I_{3} \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right) + C^{*}\int_{t^{*}}^{T}u_{SC,\text{alt}}(t)\,dt. \end{equation} (B.5)

    The last step in obtaining the approximation in Theorem 1 is to show that the last term in (B.5) vanishes. To this end, we will show that t_{\text{alt}}^{*} \leq t^{*}, in which case u_{SC, \text{alt}}(t) = 0 on [t^{*}, T]. To see why this is the case, first note that by the way we constructed the alternate growth path on [t^{\#}, t^{\#} + \varepsilon], we know that S_{\text{alt}}(t^{\#} + \varepsilon) \geq S(t^{\#} + \varepsilon). Because R(t) = R^{*} = R_{\text{alt}}(t) for all t in [t^{\#}, T], it must be the case that S_{\text{alt}}(T) \geq S(T): either S_{\text{alt}} > S on all of (t^{\#} + \varepsilon, T], or there exists a time t' in (t^{\#} + \varepsilon, T) where S(t') = S_{\text{alt}}(t') and R(t') = R_{\text{alt}}(t'), in which case the solutions to both the original problem and the perturbed problem follow the same optimal trajectory for the remainder of the growing season and S_{\text{alt}}(T) = S(T). Now, because S_{\text{alt}}(T) \geq S(T) and, by (3.11), C_{S} is a decreasing function, we have that C_{S}(S_{\text{alt}}(T)) \leq C_{S}(S(T)), which by (4.12) means that t_{\text{alt}}^{*} \leq t^{*}. Therefore, the last term in (B.5) vanishes and we are left with

    \begin{equation} I_{1} + I_{2} - I_{3} \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right). \end{equation} (B.6)

    Now that we have obtained (B.6), it remains to show that I_{1} + I_{2} - I_{3} > 0, i.e., the optimal strategy given by the necessary conditions (shoot/fruit growth) is better than the strategy suggested by the marginal values (shoot-only growth). To this end, dropping the arguments after the first line,

    \begin{align*} I_{1} + I_{2} - I_{3} &\approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right)\\ & = \varepsilon \left(u_{FC}C+\lambda_{S}(S' - S_{\text{alt}}')\right)\\ & = \varepsilon \left((1 - u_{SC})C + \lambda_{S}\left(\nu_{S}N^{*}G\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - \nu_{S}N^{*}G\left(\frac{C}{\nu_{S}N^{*}}\right)\right)\right). \end{align*}

    Because here \lambda_{S}G'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) = 1, we have

    \begin{align*} I_{1} + I_{2} - I_{3} &\approx \varepsilon\lambda_{S}\left(CG'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - u_{SC}CG'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right)\right.\\ &\qquad\qquad\left.+\nu_{S}N^{*}G\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - \nu_{S}N^{*}G\left(\frac{C}{\nu_{S}N^{*}}\right)\right) \\ & = \varepsilon\lambda_{S}\nu_{S}N^{*}\left(\frac{C}{\nu_{S}N^{*}}G'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - \frac{u_{SC}C}{\nu_{S}N^{*}}G'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right)\right.\\&\qquad\qquad+\left.G\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - G\left(\frac{C}{\nu_{S}N^{*}}\right)\right) \\ & = \varepsilon\lambda_{S}\nu_{S}N^{*}\left(\frac{C}{\nu_{S}N^{*}}G'\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) +G_{2}\left(\frac{u_{SC}C}{\nu_{S}N^{*}}\right) - G\left(\frac{C}{\nu_{S}N^{*}}\right)\right), \end{align*}

    where the last line follows from (3.20). At this point, showing that for a\in (0, 1), x>0,

    \begin{equation*} xG'(ax) + G_{2}(ax) - G(x) > 0 \end{equation*}

    will be sufficient to show that I_{1} + I_{2} - I_{3} > 0. To that end, for a\in (0, 1), x>0,

    \begin{align*} xG'(ax) + G_{2}(ax) - G(x) & = \frac{x(1+2ax)}{(1+ax+(ax)^2)^2} + \frac{(ax)^{3}(2+ax)}{(1+ax+(ax)^2)^2} - \frac{x(1+x)}{1+x+x^2}\\ & = \frac{(a-1)^{2}x^{3}(1 + 2a + 2ax + a^2x)}{(1+ax+(ax)^2)^{2}(1+x+x^2)}\\ & > 0. \end{align*}

    Therefore, we have the desired result, namely that

    \begin{equation} F(T) - F_{\text{alt}}(T) \approx \varepsilon \left(F'(t^{\#})+\lambda_{S}(t^{\#})(S'(t^{\#}) - S_{\text{alt}}'(t^{\#}))\right) > 0. \end{equation} (4.22 revisited)
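    As a sanity check on the algebra above, the following sympy snippet (an independent verification sketch, not part of the original derivation) confirms that the explicit formulas used for G' and G_{2} are consistent with G and that xG'(ax) + G_{2}(ax) - G(x) equals the stated rational expression.

```python
# Symbolic check of the identity used to show I_1 + I_2 - I_3 > 0, with G,
# G', and G_2 in the explicit forms appearing in the computation above.
import sympy as sp

x, a = sp.symbols("x a", positive=True)

G  = lambda z: z * (1 + z) / (1 + z + z**2)
Gp = lambda z: (1 + 2 * z) / (1 + z + z**2) ** 2        # G'
G2 = lambda z: z**3 * (2 + z) / (1 + z + z**2) ** 2     # G_2 = G - z G'

# Consistency of the explicit formulas with G.
assert sp.cancel(sp.diff(G(x), x) - Gp(x)) == 0
assert sp.cancel(G(x) - x * Gp(x) - G2(x)) == 0

# The identity itself.
lhs = x * Gp(a * x) + G2(a * x) - G(x)
rhs = (a - 1) ** 2 * x**3 * (1 + 2 * a + 2 * a * x + a**2 * x) / (
    (1 + a * x + (a * x) ** 2) ** 2 * (1 + x + x**2)
)
assert sp.cancel(lhs - rhs) == 0
print("identity verified")
```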

    In this appendix, we include the full proof of Theorem 2. Recall that t = \hat{t} denotes the transition point between the balanced growth phase and the penultimate interval, and z_{SC}^{-} and z_{SC}^{+} are the left and right limits of z_{SC} at t = \hat{t}, respectively.

    Theorem 2. z_{SC} is continuous at t = \hat{t}.

    Proof. First, note that by combining (5.26), (5.37), and (4.16), and writing \widehat{C} = C(\hat{t}), we have that (x, y) = (z_{SC}^{-}, z_{SC}^{+}) must solve

    \begin{align*} C^{*}\nu_{S}N^{*}G_{2}(y) & = \bigl(C^{*} - \widehat{C}\bigr)\bigl(\nu_{S}N^{*}G_{2}(x) + \widehat{C}G'(x)\bigr) \end{align*} (5.50 revisited)
    \begin{align*} C^{*}G'(y) & = \nu_{S}N^{*}G_{2}(x) + \widehat{C}G'(x). \end{align*} (5.51 revisited)

    Because G_{2} and G' are both one-to-one, we have that y is a function of x in (5.50) and (5.51). It will also be useful to substitute the right-hand side of (5.51) into (5.50) and cancel the common factor of C^{*} to obtain:

    \begin{equation} \frac{\nu_{S}N^{*}G_{2}(y)}{C^{*} - \widehat{C}} = G'(y). \end{equation} (C.1)

    We will break the argument up into three components.

    First, we will show that for any given values of C^{*}, \widehat{C}, \nu_{S}, and N^{*}, equation (C.1) has only one positive real solution. To this end, setting k = \frac{\nu_{S}N^{*}}{C^{*} -\widehat{C}}, equation (C.1) gives us

    \begin{align*} kG_{2}(y) & = G'(y)\\ k\frac{y^{3}(2+y)}{(1+y+y^{2})^{2}} & = \frac{1+2y}{(1+y+y^{2})^{2}}\\ ky^{3} & = \frac{1+2y}{2+y}. \end{align*}

    Note that since C^{*} - \widehat{C} > 0, we have that k > 0. For any positive choice of k, the graphs of

    \begin{equation*} x = ky^{3}\text{ and }x = \frac{1+2y}{2+y} \end{equation*}

    have only one positive intersection point: the equation is equivalent to ky^{4} + 2ky^{3} - 2y - 1 = 0, whose coefficient sequence has exactly one sign change, so by Descartes' rule of signs there is exactly one positive root. Note that this tells us that G'(y)/G_{2}(y) is invertible, and in particular, that there is only one admissible value of y (depending on k) that solves (C.1). Therefore, z_{SC}^{+} is completely determined by the values of C^{*}, \widehat{C}, \nu_{S}, and N^{*}.
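    As a quick numerical spot check of this first component (with arbitrary sample values of k, which are illustrative assumptions only), one can confirm that ky^{4} + 2ky^{3} - 2y - 1 has a single positive real root:

```python
# Spot check: k*y^3 = (1+2y)/(2+y), i.e. k*y^4 + 2k*y^3 - 2y - 1 = 0,
# has exactly one positive real root for each sample k > 0.
import sympy as sp

y = sp.symbols("y")

for k in [0.1, 1.0, 10.0]:   # arbitrary illustrative values of k
    roots = sp.Poly(k * y**4 + 2 * k * y**3 - 2 * y - 1, y).nroots()
    positive_real = [r for r in roots if abs(sp.im(r)) < 1e-9 and sp.re(r) > 0]
    print(f"k = {k}: positive real root(s) = {positive_real}")
    assert len(positive_real) == 1
```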

    Next, we note that x = y is a solution to equations (5.50) and (5.51). We will omit the details of this computation, as for both (5.50) and (5.51), it can be verified directly that x = y is a solution if and only if (C.1) holds, which we have already established.

    Lastly, we will show that x = y is the only solution to (5.50) and (5.51). This, then, combined with the facts that (x, y) = (z_{SC}^{-}, z_{SC}^{+}) necessarily solves (5.50) and (5.51) and z_{SC}^{+} is completely determined by the values of C^{*}, \widehat{C}, \nu_{S}, and N^{*} will establish that z_{SC}^{-} = z_{SC}^{+}, that is z_{SC} is continuous at \hat{t}.

    To show that x = y is the only solution to (5.50) and (5.51), we will use \frac{dy}{dx} for both these equations as well as the PCSU identity (3.21). We proceed with finding \frac{dy}{dx} for (5.50), and so as to avoid confusion, we will refer to y in this equation as y_{A}. Differentiating (5.50) gives us

    \begin{equation*} C^{*}\nu_{S}N^{*}G_{2}'(y_{A})\frac{dy_{A}}{dx} = \bigl(C^{*} - \widehat{C}\bigr)\bigl(\nu_{S}N^{*}G_{2}'(x) + \widehat{C}G''(x)\bigr) \end{equation*}

    and so

    \begin{align} \frac{dy_{A}}{dx} & = \frac{\bigl(C^{*} - \widehat{C}\bigr)\bigl(\nu_{S}N^{*}G_{2}'(x) + \widehat{C}G''(x)\bigr)}{C^{*}\nu_{S}N^{*}G_{2}'(y_{A})}\\ & = -\frac{\bigl(C^{*} - \widehat{C}\bigr)\bigl(\widehat{C}G''(x) - \nu_{S}N^{*}xG''(x)\bigr)}{C^{*}\nu_{S}N^{*}y_{A}G''(y_{A})}\\ & = \frac{\bigl(C^{*} - \widehat{C}\bigr)G''(x)}{C^{*}y_{A}G''(y_{A})}\biggl(x - \frac{\widehat{C}}{\nu_{S}N^{*}}\biggr). \end{align} (C.2)

    Now,

    \begin{equation*} G''(x) = \frac{d}{dx}\frac{1+2x}{(1+x+x^{2})^{2}} = -\frac{6x(1 + x)}{(1 + x + x^2)^{3}} \leq 0 \text{ for }x \geq 0 \end{equation*}

    and so

    \begin{equation*} \frac{\bigl(C^{*} - \widehat{C}\bigr)G''(x)}{C^{*}y_{A}G''(y_{A})} \geq 0 \text{ for }x,y_{A} \geq 0. \end{equation*}

    In particular, this means that

    \begin{equation} \text{sgn}\left(\frac{dy_{A}}{dx}\right) = \text{sgn}\biggl(x - \frac{\widehat{C}}{\nu_{S}N^{*}}\biggr). \end{equation} (C.3)

    Next, we do the same thing for (5.51), here using y_{B} to refer to y:

    \begin{equation*} C^{*}G''(y_{B})\frac{dy_{B}}{dx} = \nu_{S}N^{*}G_{2}'(x) + \widehat{C}G''(x) \end{equation*}

    and so

    \begin{align} \frac{dy_{B}}{dx} & = \frac{\nu_{S}N^{*}G_{2}'(x) + \widehat{C}G''(x)}{C^{*}G''(y_{B})} \\ & = \frac{\widehat{C}G''(x) - \nu_{S}N^{*}xG''(x)}{C^{*}G''(y_{B})} \\ & = -\frac{\nu_{S}N^{*}G''(x)}{C^{*}G''(y_{B})}\biggl(x - \frac{\widehat{C}}{\nu_{S}N^{*}}\biggr). \end{align} (C.4)

    As before,

    \begin{equation*} \frac{\nu_{S}N^{*}G''(x)}{C^{*}G''(y_{B})} \geq 0 \text{ for }x,y_{B}\geq 0, \end{equation*}

    so, in particular, we have that

    \begin{equation} \text{sgn}\left(\frac{dy_{B}}{dx}\right) = -\text{sgn}\biggl(x - \frac{\widehat{C}}{\nu_{S}N^{*}}\biggr) \end{equation} (C.5)

    which means that

    \begin{equation} \text{sgn}\left(\frac{dy_{A}}{dx}\right) = -\text{sgn}\left(\frac{dy_{B}}{dx}\right). \end{equation} (C.6)

    Additionally, note that both derivatives vanish at x = \frac{\widehat{C}}{\nu_{S}N^{*}}. We will use these properties of \frac{dy_{A}}{dx} and \frac{dy_{B}}{dx} to show that x = y is the only admissible solution to (5.50) and (5.51).

    Recall that (x, y) = (z_{SC}^{-}, z_{SC}^{+}) is necessarily a solution to (5.50) and (5.51), and we have established that the values of C^{*}, \widehat{C}, \nu_{S}, and N^{*} determine a single value of y, and therefore z_{SC}^{+}, that solves both equations. Now, as z_{SC}^{+} = \lim_{t\to \hat{t}^{+}} z_{SC}, we have that

    \begin{equation*} y = \lim\limits_{t\to \hat{t}^{+}}\frac{u_{SC}C}{\nu_{S}N^{*}} \leq \frac{\widehat{C}}{\nu_{S}N^{*}}, \end{equation*}

    which means that the y = x intersection point must be at or before x = \frac{\widehat{C}}{\nu_{S}N^{*}}, the point where both \frac{dy_{A}}{dx} and \frac{dy_{B}}{dx} vanish. There are two possible cases to consider: either u_{SC}^{+} = 1 or u_{SC}^{+} < 1, where here u_{SC}^{+} = \lim_{t\to \hat{t}^{+}}u_{SC}(t).

    1. Case 1: u_{SC}^{+} = 1

    In this case, the y = x intersection point occurs at \bigl(\frac{\widehat{C}}{\nu_{S}N^{*}}, \frac{\widehat{C}}{\nu_{S}N^{*}}\bigr). We have from (C.3) and (C.5) that x = \frac{\widehat{C}}{\nu_{S}N^{*}} is the unique global minimizer for y_{A} and the unique global maximizer for y_{B}, meaning that there are no additional intersection points.

    2. Case 2: u_{SC}^{+} < 1

    In this case, the y = x intersection point occurs at \left(\frac{u_{SC}^{+}\widehat{C}}{\nu_{S}N^{*}}, \frac{u_{SC}^{+}\widehat{C}}{\nu_{S}N^{*}}\right), before x = \frac{\widehat{C}}{\nu_{S}N^{*}} where \frac{dy_{A}}{dx} = 0 = \frac{dy_{B}}{dx}. Now, we have from (C.3) and (C.5) that y_{A} strictly decreases until x = \frac{\widehat{C}}{\nu_{S}N^{*}} and then strictly increases after that point, whereas y_{B} strictly increases until x = \frac{\widehat{C}}{\nu_{S}N^{*}} and then strictly decreases after that point, implying the existence of exactly one more intersection point at some x = p > \frac{\widehat{C}}{\nu_{S}N^{*}}. Now, as we established, there is only one solution for y, namely y = z_{SC}^{+}. Therefore, in this case, there is another intersection point (p, z_{SC}^{+}) where p > z_{SC}^{+}. Note that this is potentially admissible because we would have

    \begin{equation*} p = z_{SC}^{-} = \lim\limits_{t\to \hat{t}^{-}}\frac{u_{SC}C}{u_{SN}\nu_{S}N}, \end{equation*}

    which for admissible choices of the controls could be greater than \frac{u_{SC}^{+}\widehat{C}}{\nu_{S}N^{*}}. However, if z_{SC}^{-} = p, then as G' is strictly decreasing, we have that G'(z_{SC}^{-}) < G'(z_{SC}^{+}), and so

    \begin{equation*} \lambda_{S}(\hat{t})G'(z_{SC}^{-}) < \lambda_{S}(\hat{t})G'(z_{SC}^{+} ) = 1, \end{equation*}

    which is impossible by (5.18). Therefore, this second intersection point is not admissible.

    In both cases, we have shown that the only admissible intersection point of the curves defined by equations (5.50) and (5.51) is the single intersection point where x = y. Because (x, y) = (z_{SC}^{-}, z_{SC}^{+}) is a solution, this means that z_{SC}^{-} = z_{SC}^{+}, that is, z_{SC} is continuous at \hat{t}.
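    To illustrate the geometry used in the two cases above, the following Python sketch computes y_{A}(x) and y_{B}(x) numerically for one arbitrary choice of the parameters C^{*}, \widehat{C}, \nu_{S}, and N^{*} (illustrative assumptions only, not values from the paper) and shows that, over a grid of x values, y_{A} attains its minimum and y_{B} its maximum at x = \widehat{C}/(\nu_{S}N^{*}).

```python
# Numerical illustration of the curves y_A(x) from (5.50) and y_B(x) from (5.51)
# for one arbitrary parameter choice; the parameter values are assumptions made
# purely for illustration.
import numpy as np
from scipy.optimize import brentq

def G2(z):   # G_2(z) = z^3 (2+z) / (1+z+z^2)^2, increasing on (0, inf)
    return z**3 * (2 + z) / (1 + z + z**2) ** 2

def Gp(z):   # G'(z) = (1+2z) / (1+z+z^2)^2, decreasing on (0, inf)
    return (1 + 2 * z) / (1 + z + z**2) ** 2

C_star, C_hat, nu_S, N_star = 2.0, 1.0, 1.0, 1.0   # illustrative values

def rhs(x):  # common quantity nu_S N* G_2(x) + C_hat G'(x)
    return nu_S * N_star * G2(x) + C_hat * Gp(x)

def y_A(x):  # solve (5.50): C* nu_S N* G_2(y) = (C* - C_hat) rhs(x)
    target = (C_star - C_hat) * rhs(x) / (C_star * nu_S * N_star)
    return brentq(lambda yy: G2(yy) - target, 1e-9, 1e6)

def y_B(x):  # solve (5.51): C* G'(y) = rhs(x)
    target = rhs(x) / C_star
    return brentq(lambda yy: Gp(yy) - target, 1e-9, 1e6)

xs = np.linspace(0.05, 3.0, 60)
x_turn = C_hat / (nu_S * N_star)   # where both dy_A/dx and dy_B/dx vanish
print("turning point x =", x_turn)
print("grid argmin of y_A:", min(xs, key=y_A))   # near x_turn
print("grid argmax of y_B:", max(xs, key=y_B))   # near x_turn
```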

    This appendix includes a derivation, referenced in Section 6.5, of the differential equations for z_{SC} and z_{RC} during the balanced growth phase. In particular, we will derive equations (6.11) and (6.12), revisited below.

    \begin{align*} \frac{dz_{SC}}{dt} & = \frac{N_{R}\nu_{S}\nu_{R}G(z_{RC})G_{2}(z_{SC}) - C_{S}G'(z_{SC})\left[\nu_{S}G_{2}(z_{SC}) + z_{RC}\nu_{R}G'(z_{SC})\right]}{G''(z_{SC})\left[\nu_{S}z_{SC} - \nu_{R}z_{RC}\right]} \end{align*} (6.11 revisited)
    \begin{align*} \frac{dz_{RC}}{dt} & = \frac{N_{R}\nu_{S}G_{2}(z_{SC})\left[G'(z_{RC})\right]^{2} - C_{S}\left[G'(z_{SC})\right]^{2}G'(z_{RC}) + G''(z_{SC})G'(z_{RC})\frac{dz_{SC}}{dt}}{G'(z_{SC})G''(z_{RC})} \end{align*} (6.12 revisited)

    In deriving these differential equations, we will use the notation z_{S} = z_{SC} and z_{R} = z_{RC} for simplicity. We begin by differentiating (5.18) and (5.19) with respect to t:

    \begin{align} \lambda_{S}'G'(z_{S}) + \lambda_{S}G''(z_{S})\frac{dz_{S}}{dt} & = \lambda_{R}'G'(z_{R}) + \lambda_{R}G''(z_{R})\frac{dz_{R}}{dt} \end{align} (D.1)
    \begin{align} \lambda_{S}'\nu_{S}G_{2}(z_{S}) + \lambda_{S}\nu_{S}G_{2}'(z_{S})\frac{dz_{S}}{dt} & = \lambda_{R}'\nu_{R}G_{2}(z_{R}) + \lambda_{R}\nu_{R}G_{2}'(z_{R})\frac{dz_{R}}{dt}. \end{align} (D.2)

    Solving (D.1) for \lambda_{R}G''(z_{R})\frac{dz_{R}}{dt} gives us

    \begin{equation} \lambda_{R}G''(z_{R})\frac{dz_{R}}{dt} = \lambda_{S}'G'(z_{S}) + \lambda_{S}G''(z_{S})\frac{dz_{S}}{dt} - \lambda_{R}'G'(z_{R}). \end{equation} (D.3)

    Likewise, making the substitution G_{2}'(z) = -zG''(z) by means of (3.21), we can solve for z_{R}\nu_{R}\lambda_{R}G''(z_{R})\frac{dz_{R}}{dt} in (D.2) to get

    \begin{equation} z_{R}\nu_{R}\lambda_{R}G''(z_{R})\frac{dz_{R}}{dt} = \lambda_{R}'\nu_{R}G_{2}(z_{R}) - \lambda_{S}'\nu_{S}G_{2}(z_{S}) + z_{S}\lambda_{S}\nu_{S}G''(z_{S})\frac{dz_{S}}{dt}. \end{equation} (D.4)

    Substituting (D.3) into (D.4) gives us

    \begin{align} z_{R}\nu_{R}\left[\lambda_{S}'G'(z_{S}) + \lambda_{S}G''(z_{S})\frac{dz_{S}}{dt} - \lambda_{R}'G'(z_{R})\right]&\\ &= \lambda_{R}'\nu_{R}G_{2}(z_{R}) - \lambda_{S}'\nu_{S}G_{2}(z_{S}) + z_{S}\lambda_{S}\nu_{S}G''(z_{S})\frac{dz_{S}}{dt}. \end{align} (D.5)

    Solving for \frac{dz_{S}}{dt}, making use of (3.20), gives us

    \begin{equation} \frac{dz_{S}}{dt} = \frac{\lambda_{S}'\left[\nu_{S}G_{2}(z_{S}) + z_{R}\nu_{R}G'(z_{S})\right] - \lambda_{R}'\nu_{R}G(z_{R})}{\lambda_{S}G''(z_{S})\left[\nu_{S}z_{S} - \nu_{R}z_{R}\right]}. \end{equation} (D.6)

    Lastly, we use the differential equations for \lambda_{S} and \lambda_{R} during balanced growth, (5.22) and (5.23), to write both \lambda_{S}' and \lambda_{R}' in terms of \lambda_{S}. Canceling the resulting common factor of \lambda_{S} leaves us with

    \begin{equation} \frac{dz_{S}}{dt} = \frac{N_{R}\nu_{S}\nu_{R}G(z_{R})G_{2}(z_{S}) - C_{S}G'(z_{S})\left[\nu_{S}G_{2}(z_{S}) + z_{R}\nu_{R}G'(z_{S})\right]}{G''(z_{S})\left[\nu_{S}z_{S} - \nu_{R}z_{R}\right]} \end{equation} (D.7)

    Similarly, we can use (5.22) and (5.23), along with (5.18), to simplify (D.3), which, upon canceling the common factors of \lambda_{S}, gives us

    \begin{equation} \frac{dz_{R}}{dt} = \frac{N_{R}\nu_{S}G_{2}(z_{S})\left[G'(z_{R})\right]^{2} - C_{S}\left[G'(z_{S})\right]^{2}G'(z_{R}) + G''(z_{S})G'(z_{R})\frac{dz_{S}}{dt}}{G'(z_{S})G''(z_{R})}. \end{equation} (D.8)
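    The identities used repeatedly in this derivation, G_{2}(z) = G(z) - zG'(z) from (3.20) and G_{2}'(z) = -zG''(z) from (3.21), along with the expression for G'' used in the proof of Theorem 2, can be checked symbolically from the explicit form of G appearing in the computations above. The following sympy snippet is an independent verification sketch, not part of the original derivation.

```python
# Symbolic verification of (3.20), (3.21), and the expression for G''(z),
# using the explicit G appearing in the earlier computations.
import sympy as sp

z = sp.symbols("z", positive=True)

G = z * (1 + z) / (1 + z + z**2)
Gp = sp.diff(G, z)
Gpp = sp.diff(G, z, 2)
G2 = G - z * Gp                                      # (3.20)

assert sp.cancel(Gp - (1 + 2 * z) / (1 + z + z**2) ** 2) == 0
assert sp.cancel(G2 - z**3 * (2 + z) / (1 + z + z**2) ** 2) == 0
assert sp.cancel(sp.diff(G2, z) + z * Gpp) == 0      # (3.21): G_2' = -z G''
assert sp.cancel(Gpp + 6 * z * (1 + z) / (1 + z + z**2) ** 3) == 0
print("identities verified")
```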

    The authors declare that they have not used Generative-AI tools in the creation of this paper.

    All authors declare no conflicts of interest in this paper.

    This work was primarily conducted as part of David McMorris' PhD dissertation under the advisement of Glenn Ledder.


