Multifunctional farm advisory services in promoting change in agricultural systems: The case of Campania region of Italy

Marcello De Rosa; Giuseppina Olivieri; Concetta Menna; Ferdinando Gandolfi; Teresa Del Giudice

doi:10.3934/agrfood.2023051

AIMS Agriculture and Food

2023, Volume 8, Issue 4: 962-977. doi: 10.3934/agrfood.2023051


Research article


1. Department of Economics and Law, University of Cassino and Southern Lazio, Via Folcara, 03043 Cassino (FR), Italy
2. Agricultural Department, University of Naples Federico Ⅱ, Via Università, 80055 Portici (NA), Italy
3. CREA-Center of Political and Bioeconomy, Centro Direzionale-Isola E5-Viale della Costituzione snc 80143 Napoli (NA), Italy
4. Campania Region-Department of Agriculture, Via Taddeo da Sessa, 80142 Napoli (NA), Italy

Received: 01 August 2023 Revised: 04 September 2023 Accepted: 21 September 2023 Published: 10 October 2023

Entrepreneurial contexts may be marked by the presence of a 'cultural environment' that stimulates knowledge and innovation adoption, while other contexts may act as barriers to change and innovation. Moreover, the multiple paths of multifunctional agriculture call for "multifunctional farm advisory services" (MFAS), which consider both the private and the public goods provided by the farming sector. Against the background of the multiple roles of agriculture, identifying sound and pertinent knowledge becomes of paramount importance in order to specify the roles of agricultural extensionists and the mechanisms of governance of MFAS within the setting up of the Agricultural Knowledge and Innovation System (AKIS). The aim of the study is both to analyze attitudes toward the privatization of extension services within a predominantly public system of regional governance and to identify advisors' profiles and their suitability for the modern vision of multifunctional agriculture through the emergence of MFAS. The empirical analysis reveals a diversified set of advisory services with different degrees of coherence with the multifunctional agricultural model. Moreover, the more advisory services are oriented towards empowering multifunctional agriculture, the lower the propensity towards their privatization. The cluster analysis indicates a relatively good capability of advisors to deal with the new demands of multifunctional agriculture. The idea of MFAS has important theoretical implications, which the paper explores through the analysis of the mechanisms of governance (public/private) and the identification of the profiles of advisors facing the growing complexity of a farming sector grounded on multifunctional agriculture. The study tries to fill a gap in the literature by providing an original contribution to modeling the profile of advisors in charge of supporting the transition towards multifunctionality.

Keywords: multifunctionality, farm advisory services, privatization of extension services, probit model, cluster analysis, advisors' profiles

Citation: Marcello De Rosa, Giuseppina Olivieri, Concetta Menna, Ferdinando Gandolfi, Teresa Del Giudice. Multifunctional farm advisory services in promoting change in agricultural systems: The case of Campania region of Italy[J]. AIMS Agriculture and Food, 2023, 8(4): 962-977. doi: 10.3934/agrfood.2023051



1. Introduction

In ^[13,14,15,16,19], it was proposed that insight into a probability distribution, $\mu$, posed on a Hilbert space, $\mathcal{H}$, could be obtained by finding a best fit Gaussian approximation, $\nu$. This notion of best, or optimal, was with respect to the relative entropy, or Kullback-Leibler divergence:

$\begin{equation} \mathcal{R}(\nu||\mu) = \begin{cases} \mathbb{E}^{\nu}\left[{\log \frac{d\nu}{d\mu}}\right], & \nu \ll \mu,\\ +\infty, &\text{otherwise}. \end{cases} \end{equation}$

(1.1)

Having a Gaussian approximation provides qualitative insight into $\mu$ , as it provides a concrete notion of the mean and variance of the distribution. Additionally, this optimized distribution can be used in algorithms, such as random walk Metropolis, as a preconditioned proposal distribution to improve performance. Such a strategy can benefit a number of applications, including path space sampling for molecular dynamics and parameter estimation in statistical inverse problems.

Observe that in the definition of $\mathcal{R}$ , (1.1), there is an asymmetry in the arguments. Were we to work with $\mathcal{R}(\mu||\nu)$ , our optimal Gaussian would capture the first and second moments of $\mu$ , and in some applications this is desirable. However, for a multimodal problem (consider a distribution with two well separated modes), this would be inadequate; our form attempts to match individual modes of the distribution by a Gaussian. For a recent review of the $\mathcal{R}(\nu||\mu)$ problem, see ^[4], where it is remarked that this choice of arguments is likely to underestimate the dispersion of the distribution of interest, $\mu$ . The other ordering of arguments has been explored, in the finite dimensional case, in ^[2,3,10,18].
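The asymmetry in the arguments of $\mathcal{R}$ can be seen numerically in a toy setting. The following is a minimal sketch, assuming both measures are univariate Gaussians so that the standard closed form for the Kullback-Leibler divergence applies; the function name `kl_gauss` and the parameter values are illustrative and do not come from the source, where $\mu$ is in general non-Gaussian.

```python
import math

def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL( N(m1, s1^2) || N(m2, s2^2) ) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2.0 * s2**2) - 0.5

# Illustrative (mean, standard deviation) pairs; these are assumptions.
nu = (0.0, 1.0)   # plays the role of the approximating Gaussian nu
mu = (1.0, 2.0)   # plays the role of the target distribution mu

r_nu_mu = kl_gauss(*nu, *mu)   # analogue of R(nu || mu)
r_mu_nu = kl_gauss(*mu, *nu)   # analogue of R(mu || nu)
print(r_nu_mu, r_mu_nu)
```

For these values the two orderings give roughly 0.443 and 1.307, so exchanging the arguments changes the objective being minimized.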

To be of computational use, it is necessary to have an algorithm that will converge to this optimal distribution. In ^[15], this was accomplished by first expressing $\nu = N(m, C(p))$ , where $m$ is the mean and $p$ is a parameter inducing a well defined covariance operator, and then solving the problem,

$\begin{equation} (m,p) \in {\rm{argmin}} \mathcal{R}(N(m, C(p))||\mu), \end{equation}$

(1.2)

over an admissible set. The optimization step itself was done using the Robbins-Monro algorithm (RM), ^[17], by seeking a root of the first variation of the relative entropy. While the numerical results of ^[15] were satisfactory, being consistent with theoretical expectations, no rigorous justification for the application of RM to the examples was given.

In this work, we emphasize the study and application of RM to potentially infinite dimensional problems. Indeed, following the framework of ^[15,16], we assume that $\mu$ is posed on the Borel $\sigma$ -algebra of a separable Hilbert space $(\mathcal{H}, \left\langle {\bullet}, {\bullet}\right\rangle, \left \|{\bullet}\right\|)$ . For simplicity, we will leave the covariance operator $C$ fixed, and only optimize over the mean, $m$ . Even in this case, we are seeking $m\in \mathcal{H}$ , a potentially infinite-dimensional space.

1.1. Robbins-Monro

Given the objective function $f: \mathcal{H} \to \mathcal{H}$ , assume that it has a root, $x_\star$ . In our application to relative entropy, $f$ will be its first variation. Further, we assume that we can only observe a noisy version of $f$ , $F: \mathcal{H} \times \chi\to \mathcal{H}$ , such that for all $x \in \mathcal{H}$ ,

$\begin{equation} f(x) = \mathbb{E}[F(x,Z)] = \int_{\chi} F(x, z) \mu_Z(dz), \end{equation}$

(1.3)

where $\mu_Z$ is the distribution associated with the random variable (r.v.) $Z$ , taking values in the auxiliary space $\chi$ . The naive Robbins-Monro algorithm is given by

$\begin{equation} X_{n+1} = X_n - a_{n+1} F(X_n, Z_{n+1}), \end{equation}$

(1.4)

where the $Z_n \sim \mu_Z$ are independent and identically distributed (i.i.d.), and $a_n > 0$ is a carefully chosen sequence. Subject to assumptions on $f$, $F$, and the distribution $\mu_Z$, it is known that $X_n$ will converge to $x_\star$ almost surely (a.s.), in finite dimensions, ^[5,6,17]. Often, one needs to assume that $f$ grows at most linearly,

$\begin{equation} \left \|{f(x)}\right\|\leq c_0 + c_1 \left \|{x}\right\|, \end{equation}$

(1.5)

in order to apply the results in the aforementioned papers. The analysis in the finite dimensional case has been refined tremendously over the years, including an analysis based on continuous dynamical systems. We refer the reader to the books ^[1,8,11] and references therein.
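The naive iteration (1.4) can be sketched in a few lines. This is an illustrative finite dimensional example, not the setting of the paper: the Hilbert space is replaced by $\mathbb{R}^3$, the noise is standard Gaussian, the step sizes are $a_n = a_0 n^{-\gamma}$, and the names `naive_rm`, `F_linear`, and `X_STAR` are hypothetical. The regression function $f(x) = x - x_\star$ is chosen because it satisfies the linear growth bound (1.5).

```python
import random

def naive_rm(F, x0, n_steps, a0=1.0, gamma=1.0, seed=0):
    """Naive Robbins-Monro: X_{n+1} = X_n - a_{n+1} F(X_n, Z_{n+1})."""
    rng = random.Random(seed)
    x = list(x0)
    for n in range(1, n_steps + 1):
        a_n = a0 / n**gamma                     # assumed step size a_n = a0 * n^(-gamma)
        z = [rng.gauss(0.0, 1.0) for _ in x]    # i.i.d. draw of the noise variable Z_n
        g = F(x, z)                             # noisy observation of f at the current iterate
        x = [xi - a_n * gi for xi, gi in zip(x, g)]
    return x

X_STAR = [1.0, -2.0, 0.5]   # hypothetical root, used only to build the test problem

def F_linear(x, z):
    """Noisy observation of f(x) = x - x_star, so that E[F(x, Z)] = f(x)."""
    return [xi - si + zi for xi, si, zi in zip(x, X_STAR, z)]

x_hat = naive_rm(F_linear, x0=[0.0, 0.0, 0.0], n_steps=20000)
err = sum((a - b) ** 2 for a, b in zip(x_hat, X_STAR)) ** 0.5
print(err)
```

With $a_n = 1/n$ and this particular $F$, the iterate equals $x_\star$ minus the running average of the noise, so the error decays like $n^{-1/2}$.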

1.2. Trust regions and truncations

As noted, much of the analysis requires the regression function $f$ to have, at most, linear growth. Alternatively, an a priori assumption is sometimes made that the entire sequence generated by (1.4) stays in a bounded set. Both assumptions are limiting, though, in practice, one may find that the algorithms converge.

One way of overcoming these assumptions, while still ensuring convergence, is to introduce trust regions that the sequence $\{X_n\}$ is permitted to explore, along with a "truncation" which enforces the constraint. Such truncations distort (1.4) into

$\begin{equation} X_{n+1} = X_n - a_{n+1} F(X_n, Z_{n+1}) + a_{n+1} P_{n+1}, \end{equation}$

(1.6)

where $P_{n+1}$ is the projection keeping the sequence $\{X_n\}$ within the trust region. Projection algorithms are also discussed in ^[1,8,11].

We consider RM on a possibly infinite dimensional separable Hilbert space. This is of particular interest as, in the context of relative entropy optimization, we may be seeking a distribution in a Sobolev space associated with a PDE model. A general analysis of RM with truncations in Hilbert spaces can be found in ^[20]. The main purpose of this work is to adapt the analysis of ^[12] to the Hilbert space setting for two versions of the truncated problem. The motivation for this is that the analysis of ^[12] is quite straightforward, and it is instructive to see how it can be easily adapted to the infinite dimensional setting. The key modification in the proof is that results for Banach space valued martingales must be invoked. We also adapt the results to a version of the algorithm where there is prior knowledge on the location of the root. With these results in hand, we can then verify that the relative entropy minimization problem can be solved using RM.

1.2.1. Fixed trust regions

In some problems, one may have a priori information on the root. For instance, we may know that $x_\star \in U_1$ , some open bounded set. In this version of the truncated algorithm, we have two open bounded sets, $U_0\subsetneq U_1$ , and $x_\star \in U_1$ . Let $\sigma_0 = 0$ and $X_0 \in U_0$ be given, then (1.6) can be formulated as

$\begin{gather} \tilde{X}_{n+1} = X_n - a_{n+1} F(X_n, Z_{n+1}) \end{gather}$

(1.7a)

$\begin{gather} X_{n+1} = \begin{cases} \tilde{X}_{n+1} & \tilde{X}_{n+1}\in U_{1}\\ X_{0}^{(\sigma_n)} & \tilde{X}_{n+1} \notin U_{1} \end{cases} \end{gather}$

(1.7b)

$\begin{gather} \sigma_{n+1} = \begin{cases} \sigma_n & \tilde{X}_{n+1} \in U_{1}\\ \sigma_n+1 &\tilde{X}_{n+1} \notin U_{1} \end{cases} \end{gather}$

(1.7c)

We interpret $\tilde{X}_{n+1}$ as the proposed move, which is either accepted or rejected depending on whether or not it will remain in the trust region. If it is rejected, the algorithm restarts at $X_{0}^{(\sigma_n)}\in U_0$ . The restart points, $\{X_{0}^{(\sigma_n)}\}$ , may be random, or it may be that $X_{0}^{(\sigma_n)} = X_0$ is fixed. The essential property is that the algorithm will restart in the interior of the trust region, away from its boundary. The r.v. $\sigma_n$ counts the number of times a truncation has occurred. Algorithm (1.7) can now be expressed as

$\begin{equation} \begin{split} X_{n+1} & = X_n - a_{n+1} F(X_n, Z_{n+1}) + P_{n+1}\\ P_{n+1} & = \{{X_{0}^{(\sigma_n)} -\tilde{X}_{n+1}}\}1_{\tilde{X}_{n+1} \notin U_1}. \end{split} \end{equation}$

(1.8)
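The accept or restart logic of (1.7) can be sketched as follows. This is a minimal illustration under stated assumptions: the space is $\mathbb{R}^d$ with the Euclidean norm, $U_1$ is taken to be the open ball of radius `r1` about the origin, the restart point is the fixed $X_0$, and the test function `F_cubic` (with cubic growth, so that (1.5) fails) and all names are hypothetical.

```python
import random

def norm(v):
    """Euclidean norm, standing in for the Hilbert space norm."""
    return sum(vi * vi for vi in v) ** 0.5

def rm_fixed_trust_region(F, x0, r1, n_steps, a0=1.0, gamma=1.0, seed=0):
    """Truncated Robbins-Monro (1.7) with U_1 the open ball of radius r1 about 0.

    The restart point is the fixed X_0, assumed to lie in U_0 inside U_1.
    Returns the final iterate and sigma, the number of truncations.
    """
    rng = random.Random(seed)
    x, sigma = list(x0), 0
    for n in range(1, n_steps + 1):
        a_n = a0 / n**gamma
        z = [rng.gauss(0.0, 1.0) for _ in x]
        proposal = [xi - a_n * gi for xi, gi in zip(x, F(x, z))]   # (1.7a)
        if norm(proposal) < r1:
            x = proposal          # proposal stays in U_1: accept it
        else:
            x = list(x0)          # proposal leaves U_1: restart at X_0
            sigma += 1            # and count the truncation, as in (1.7c)
    return x, sigma

X_STAR = [0.5]   # hypothetical root inside the trust region

def F_cubic(x, z):
    """Noisy observation of f(x) = (x - x_star)^3 + (x - x_star), componentwise."""
    return [(xi - si) ** 3 + (xi - si) + zi for xi, si, zi in zip(x, X_STAR, z)]

x_hat, sigma = rm_fixed_trust_region(F_cubic, x0=[0.0], r1=3.0, n_steps=20000)
print(x_hat, sigma)
```

The returned `sigma` plays the role of $\sigma_n$; in this sketch the restart does not depend on $\sigma_n$, which corresponds to the fixed choice $X_{0}^{(\sigma_n)} = X_0$.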

1.2.2. Expanding trust regions

In the second version of truncated Robbins-Monro, define the sequence of open bounded sets, $U_{n}$ such that:

$\begin{gather} U_0\subsetneq U_1\subsetneq U_2\subsetneq\ldots, \quad \cup_{n = 0}^\infty U_n = \mathcal{H}. \end{gather}$

(1.9)

Again, letting $X_0 \in U_0$ , $\sigma_0 = 0$ , the algorithm is

$\begin{gather} \tilde{X}_{n+1} = X_n - a_{n+1} F(X_n, Z_{n+1}) \end{gather}$

(1.10a)

$\begin{gather} X_{n+1} = \begin{cases} \tilde{X}_{n+1} & \tilde{X}_{n+1} \in U_{\sigma_n}\\ X_{0}^{(\sigma_n)} & \tilde{X}_{n+1} \notin U_{\sigma_n} \end{cases} \end{gather}$

(1.10b)

$\begin{gather} \sigma_{n+1} = \begin{cases} \sigma_n & \tilde{X}_{n+1} \in U_{\sigma_n}\\ \sigma_n+1 &\tilde{X}_{n+1}\notin U_{\sigma_n} \end{cases} \end{gather}$

(1.10c)

A consequence of this formulation is that $X_n \in U_{\sigma_n}$ for all $n$. As before, the restart points may be random or fixed, and they are in $U_0$. This would appear superior to the fixed trust region algorithm, as it does not require knowledge of the sets. However, to guarantee convergence, global (in $\mathcal{H}$) assumptions on the regression function are required; see Assumption 2 below. (1.10) can be written with $P_{n+1}$ as

$\begin{equation} \begin{split} X_{n+1} & = X_n - a_{n+1} F(X_n, Z_{n+1}) + P_{n+1}\\ P_{n+1} & = \{{X_{0}^{(\sigma_n)} -\tilde{X}_{n+1}}\}1_{\tilde{X}_{n+1} \notin U_{\sigma_n}}. \end{split} \end{equation}$

(1.11)
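A sketch of the expanding trust region iteration (1.10) differs from the fixed case only in that the acceptance test uses $U_{\sigma_n}$. The example below assumes $\mathbb{R}^2$ with the Euclidean norm, balls $U_k = B_{r_k}(0)$ with the illustrative radii $r_k = 1 + k$, a fixed restart point, and a hypothetical test function; none of these choices come from the source.

```python
import random

def norm(v):
    """Euclidean norm, standing in for the Hilbert space norm."""
    return sum(vi * vi for vi in v) ** 0.5

def rm_expanding_trust_region(F, x0, radius, n_steps, a0=1.0, gamma=1.0, seed=0):
    """Truncated Robbins-Monro (1.10) with U_k the open ball of radius radius(k) about 0.

    radius must be increasing and unbounded in k, so the balls exhaust the space.
    Returns the final iterate and sigma, the number of truncations.
    """
    rng = random.Random(seed)
    x, sigma = list(x0), 0
    for n in range(1, n_steps + 1):
        a_n = a0 / n**gamma
        z = [rng.gauss(0.0, 1.0) for _ in x]
        proposal = [xi - a_n * gi for xi, gi in zip(x, F(x, z))]   # (1.10a)
        if norm(proposal) < radius(sigma):
            x = proposal          # proposal stays in U_{sigma_n}: accept it
        else:
            x = list(x0)          # proposal leaves U_{sigma_n}: restart in U_0
            sigma += 1            # and move to the next, larger trust region
    return x, sigma

X_STAR = [2.0, -1.0]   # hypothetical root, deliberately outside U_0 and U_1

def F_cubic(x, z):
    """Noisy observation of f(x) = (x - x_star)^3 + (x - x_star), componentwise."""
    return [(xi - si) ** 3 + (xi - si) + zi for xi, si, zi in zip(x, X_STAR, z)]

x_hat, sigma = rm_expanding_trust_region(
    F_cubic, x0=[0.0, 0.0], radius=lambda k: 1.0 + k, n_steps=20000
)
err = norm([a - b for a, b in zip(x_hat, X_STAR)])
print(err, sigma)
```

Because the root lies outside the first two balls in this example, at least two truncations must occur before the iterates can settle near it, which illustrates how the regions adapt without prior knowledge of the root.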

1.3. Outline

In Section 2, we state sufficient assumptions for which we are able to prove convergence in both the fixed and expanding trust region problems, and we also establish some preliminary results. In Section 3, we focus on the relative entropy minimization problem, and identify what assumptions must hold for convergence to be guaranteed. Examples are then presented in Section 4, and we conclude with remarks in Section 5.

2. Convergence of Robbins-Monro

We first reformulate (1.8) and (1.11) in the more general form

$\begin{equation} X_{n+1} = \underbrace{X_n - a_{n+1} f(X_n) - a_{n+1}\delta M_{n+1}}_{ = \tilde{X}_{n+1}} + a_{n+1} P_{n+1}, \end{equation}$

(2.1)

where $\delta M_{n+1}$ , the noise term, is

$\begin{equation} \begin{split} \delta M_{n+1} & = F(X_n, Z_{n+1}) - f(X_n)\\ & = F(X_n, Z_{n+1})- \mathbb{E}[ F(X_n, Z_{n+1})\mid X_n]. \end{split} \end{equation}$

(2.2)

A natural filtration for this problem is $\mathcal{F}_n = \sigma(X_0, Z_1, \ldots, Z_n)$ . $X_n$ is $\mathcal{F}_n$ measurable and the noise term can be expressed in terms of the filtration as $\delta M_{n+1} = F(X_n, Z_{n+1}) - \mathbb{E}[F(X_n, Z_{n+1})\mid \mathcal{F}_n]$ .

We now state our main assumptions:

Assumption 1. $f$ has a zero, $x_\star$ . In the case of the fixed trust region problem, there exist $R_0 < R_1$ such that

$\begin{equation*} U_0\subseteq B_{R_0}(x_\star)\subset B_{R_1}(x_\star)\subseteq U_1. \end{equation*}$

In the case of the expanding trust region problem, the open sets are defined as $U_n = B_{r_n}(0)$ with

$\begin{equation} 0 \lt r_0 \lt r_1 \lt r_2 \lt \ldots \lt r_n \to \infty. \end{equation}$

(2.3)

These sets clearly satisfy (1.9).

Assumption 2. For any $0 < a < A$, there exists $\delta > 0$ such that

$\begin{align*} \inf\limits_{a\leq\left \|{x-x_\star}\right\|\leq A} \left\langle {x-x_\star},{f(x)}\right\rangle\geq \delta. \end{align*}$

In the case of the fixed truncation, this inequality is restricted to $x\in U_1$ . This is akin to a convexity condition on a functional $\mathcal{F}$ with $f = D\mathcal{F}$ .

Assumption 3. $x\mapsto \mathbb{E}[\left \|{F(x, Z)}\right\|^2]$ is bounded on bounded sets, with the restriction to $U_1$ in the case of fixed trust regions.

Assumption 4. $a_n > 0$ , $\sum a_n = \infty$ , and $\sum a_n^2 < \infty$
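A standard family satisfying Assumption 4, given here only as an illustration and not taken from the source, is that of polynomially decaying step sizes. With $a_0 > 0$ a free scale parameter, comparison with the $p$-series shows

$\begin{equation*} a_n = a_0 n^{-\gamma}, \quad \tfrac{1}{2} \lt \gamma \leq 1, \quad\Rightarrow\quad \sum\limits_{n = 1}^\infty a_n = a_0\sum\limits_{n = 1}^\infty n^{-\gamma} = \infty, \quad \sum\limits_{n = 1}^\infty a_n^2 = a_0^2\sum\limits_{n = 1}^\infty n^{-2\gamma} \lt \infty. \end{equation*}$

For $\gamma \leq 1/2$ the second sum diverges, and for $\gamma > 1$ the first sum converges, so neither range satisfies the assumption.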

Theorem 2.1. Under the above assumptions, for the fixed trust region problem, $X_n \to x_\star$ a.s. and $\sigma_n$ is a.s. finite.

Theorem 2.2. Under the above assumptions, for the expanding trust region problem, $X_n \to x_\star$ a.s. and $\sigma_n$ is a.s. finite.

Note the distinction between the assumptions in the two algorithms. In the fixed truncation algorithm, Assumptions 2 and 3 need only hold in the set $U_1$, while in the expanding truncation algorithm, they must hold in all of $\mathcal{H}$. While the local requirement of the fixed truncation algorithm would seem to be the weaker condition, it requires identification of the sets $U_0$ and $U_1$ for which the assumptions hold. Such sets may not be readily identifiable, as we will see in our examples.

We first need some additional information about $f$ and the noise sequence $\delta M_n$ .

Lemma 2.1. Under Assumption 3, $f$ is bounded on $U_1$ , for the fixed trust region problem, and on arbitrary bounded sets, for the expanding trust region problem.

Proof. Trivially,

$\begin{equation*} \left \|{f(x)}\right\| = \left \|{ \mathbb{E}[F(x,Z)]}\right\|\leq \mathbb{E}[\left \|{F(x,Z)}\right\| ]\leq \sqrt{ \mathbb{E}[\left \|{F(x,Z)}\right\|^2]}, \end{equation*}$

and the result follows from the assumption.

Proposition 2.1. For the fixed trust region problem, let

$M_n = \sum\limits_{i = 1}^n a_i \delta M_i.$

Alternatively, in the expanding trust region problem, for $r > 0$ , let

$M_n = \sum\limits_{i = 1}^n a_i \delta M_i 1_{\left \|{X_{i-1}-x_\star}\right\|\leq r}.$

Under Assumptions 3 and 4, $M_n$ is a martingale, converging in $\mathcal{H}$ , a.s.

Proof. The following argument holds in both the fixed and expanding trust region problems, with appropriate modifications. We present the expanding trust region case. The proof is broken up into 3 steps:

$1.$ Relying on Theorem 6 of ^[7] for Banach space valued martingales, it will be sufficient to show that $M_n$ is a martingale, uniformly bounded in $L^1(\mathbb{P})$ .

$2.$ In the case of the expanding truncations,

$\begin{equation*} \begin{split} \mathbb{E}[\left \|{\delta M_i 1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le r}}\right\|^2] &\leq 2 \mathbb{E}[\left \|{F(X_{i-1},Z_i)1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le r}}\right\|^2] + 2 \mathbb{E}[\left \|{f(X_{i-1})1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le r}}\right\|^2] \\ &\leq 2\sup\limits_{\left \|{x- x_\star}\right\|\leq r} \mathbb{E}[\left \|{F(x,Z)}\right\|^2] + 2\sup\limits_{\left \|{x- x_\star}\right\|\leq r}\left \|{f(x)}\right\|^2 \end{split} \end{equation*}$

Since both of these terms are bounded, independently of $i$, by Assumption 3 and Lemma 2.1, this is finite.

$3.$ Next, since $\{\delta M_i 1_{\|X_{i-1}-x_\star\|\leq r}\}$ is a martingale difference sequence, we can use the above estimate to obtain the uniform $L^2(\mathbb{P})$ bound,

$\begin{equation*} \begin{split} \mathbb{E}[\left \|{M_n}\right\|^2]& = \sum\limits_{i = 1}^na_i^2 \mathbb{E}[\left \|{{\delta M_i 1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le r}}}\right\|^2] \leq\sup\limits_{i} \mathbb{E}[\left \|{{\delta M_i 1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le r}}}\right\|^2]\sum\limits_{i = 1}^\infty a_i^2 \lt \infty \end{split} \end{equation*}$

Uniform boundedness in $L^2$ gives boundedness in $L^1$, and this implies a.s. convergence in $\mathcal{H}$.
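The first equality in step 3 rests on the orthogonality of martingale differences, which can be spelled out as follows. Write $D_i = \delta M_i 1_{\|X_{i-1}-x_\star\|\leq r}$, a shorthand introduced here for convenience. Each $D_i$ is $\mathcal{F}_i$ measurable and square integrable by step 2, and the indicator in $D_j$ is $\mathcal{F}_{j-1}$ measurable, so $\mathbb{E}[D_j\mid \mathcal{F}_{j-1}] = 0$. For $i < j$, conditioning on $\mathcal{F}_{j-1}$ then gives

$\begin{equation*} \mathbb{E}[\left\langle {D_i},{D_j}\right\rangle] = \mathbb{E}[\left\langle {D_i},{ \mathbb{E}[D_j\mid \mathcal{F}_{j-1}]}\right\rangle] = 0, \quad\Rightarrow\quad \mathbb{E}[\left \|{M_n}\right\|^2] = \sum\limits_{i,j = 1}^n a_i a_j \mathbb{E}[\left\langle {D_i},{D_j}\right\rangle] = \sum\limits_{i = 1}^n a_i^2 \mathbb{E}[\left \|{D_i}\right\|^2]. \end{equation*}$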

2.1. Finite truncations

In this section we prove results showing that only finitely many truncations will occur, in either the fixed or expanding trust region case. Recall that when a truncation occurs, the equivalent conditions hold: $P_{n+1}\neq 0$ ; $\sigma_{n+1} = \sigma_n+1$ ; and $\tilde{X}_{n+1}\notin U_1$ in the fixed trust region algorithm, while $\tilde{X}_{n+1}\notin U_{\sigma_n}$ in the expanding trust region case.

Lemma 2.2. In the fixed trust region algorithm, if Assumptions 1, 2, 3, and 4 hold, then the number of truncations is a.s. finite; a.s., there exists $N$ , such that for all $n\geq N$ , $\sigma_n = \sigma_N$ .

Proof. We break the proof up into 7 steps:

$1.$ Pick $\rho$ and $\rho'$ such that

$\begin{equation} R_0 \lt \rho' \lt \rho \lt R_1 \end{equation}$

(2.4)

Let $\bar f = \sup\|f(x)\|$, with the supremum over $U_1$; this bound exists by Lemma 2.1. Under Assumption 2, there exists $\delta > 0$ such that

$\begin{equation} \inf\limits_{R_0/2 \leq \|x-x_\star\|\leq R_1}\left\langle {x-x_\star},{f(x)}\right\rangle = \delta. \end{equation}$

(2.5)

Having fixed $\rho$ , $\rho'$ , $\bar f$ , and $\delta$ , take $\epsilon > 0$ such that:

$\begin{equation} \epsilon \lt \min\left\{{\rho'-R_0, \frac{R_1-\rho'}{2 + \bar f}, \frac{\rho' - R_0}{\bar f}, \frac{R_0}{2}, \frac{\delta}{2\bar{f}}, \frac{\delta}{\bar{f}^2}, {\rho - \rho'}}\right\}. \end{equation}$

(2.6)

Having fixed such an $\epsilon$, by the assumptions of this lemma and Proposition 2.1, a.s., there exists $n_{ \epsilon}$ such that for any $n, m\geq n_{ \epsilon}$, both

$\begin{equation} \left \|{\sum\limits_{k = n}^m a_k \delta M_k}\right\|\leq \epsilon, \quad a_n \leq \epsilon. \end{equation}$

(2.7)

$2.$ Define the auxiliary sequence

$\begin{equation} X_n' = X_n - \sum\limits_{k = n+1}^\infty a_k \delta M_k. \end{equation}$

(2.8)

Using (2.1), we can then write

$\begin{equation} X_{n+1}' = X_{n}' - a_{n+1}f(X_n) +a_{n+1}P_{n+1}. \end{equation}$

(2.9)

By (2.7), for any $n \geq n_ \epsilon$ ,

$\begin{equation} \|X_n' - X_n\|\leq \epsilon \end{equation}$

(2.10)

$3.$ We will show $X_n' \in B_{\rho'}(x_\star)$ for all $n$ large enough. The significance of this is that if $n\geq n_ \epsilon$ , and $X_n'\in B_{\rho'}(x_\star)$ , then no truncation occurs. Indeed, using (2.6)

$\begin{equation} \begin{split} \|\tilde{X}_{n+1} -x_\star\| &\leq \|X_{n}' -x_\star\| +\|X_n - X_n'\| + a_{n+1}\bar f + \|a_{n+1}\delta M_{n+1}\|\\ & \lt \rho' + \epsilon + \epsilon \bar f + \epsilon \lt R_1, \Rightarrow \tilde{X}_{n+1}\in U_1. \end{split} \end{equation}$

(2.11)

Consequently, $P_{n+1} = 0$ , $X_{n+1} = \tilde{X}_{n+1}$ , and $\sigma_{n+1} = \sigma_n$ . Thus, establishing $X_n'\in B_{\rho'}(x_\star)$ will yield the result.

$4.$ Let

$\begin{equation} N = \inf\{n\geq n_ \epsilon\mid \tilde{X}_{n+1}\not\in U_1\}+1 \end{equation}$

(2.12)

This corresponds to the first truncation after $n_ \epsilon$. If the above set is empty, for that realization, no truncations occur after $n_ \epsilon$, and we are done. In such a case, we may take $N = n_ \epsilon$ in the statement of the lemma.

$5.$ We now prove by induction that in the case that (2.12) is finite, $X_n'\in B_{\rho'}(x_\star)$ for all $n\geq N$ . First, note that $X_N \in B_{R_0}(x_\star)\subset B_{\rho}(x_\star)$ . By (2.6) and (2.10),

$\|X_{N}'-x_\star\| \leq \|X_{N}-x_\star\| + \|X_{N}'- X_{N}\| \lt R_0 + \epsilon \lt \rho', \Rightarrow X_N' \in B_{\rho'}(x_\star).$

Next, assume $X_{N}', X_{N+1}', \ldots, X_n'$ are all in $B_{\rho'}(x_\star)$ . Using (2.11), we have that $P_{N+1} = \ldots = P_{n+1} = 0$ and $\sigma_{N} = \ldots = \sigma_{n} = \sigma_{n+1}$ . Therefore,

$\begin{equation} \begin{split} \left \|{X_{n+1}' - x_\star}\right\|^2 & = \left \|{X'_n - x_\star}\right\|^2 - 2a_{n+1} \left\langle {X'_{n}-x_\star},{f(X_n)}\right\rangle + a_{n+1}^2 \left \|{f(X_n)}\right\|^2\\ &\leq \left \|{X'_n - x_\star}\right\|^2 - 2a_{n+1} \left\langle {X'_{n}-x_\star},{f(X_n)}\right\rangle + a_{n+1} \epsilon \bar f^2 \end{split} \end{equation}$

(2.13)

We now consider two cases of (2.13) to conclude $\|X_{n+1}' - x_\star\| < \rho'$ .

$6.$ In the first case, $\|X'_n - x_\star\|\leq R_0$ . By Cauchy-Schwarz and (2.6)

$\|X_{n+1}' - x_\star\|^2 \lt R_0^2 + 2 \epsilon R_0 \bar f + \epsilon^2\bar{f}^2 = (R_0 + \epsilon \bar f)^2 \lt (\rho')^2.$

In the second case, $R_0 < \|X'_n - x_\star\| < \rho'$ . Dissecting the inner product term in (2.13) and using Assumption 2 and (2.10),

$\begin{equation} \begin{split} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle & = \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle + \left\langle {X'_{n}-X_n},{f(X_n) }\right\rangle\\ &\geq\left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle-\bar f \epsilon \end{split} \end{equation}$

(2.14)

Conditions (2.6) and (2.10) yield the following upper and lower bounds:

$\begin{align*} \|X_n-x_\star\|&\geq\|X_n'-x_\star\| -\|X_n'-X_n\| \geq R_0 - \epsilon \gt \tfrac{1}{2}R_0, \\ \|X_n-x_\star\|&\leq \|X_n'-x_\star\| +\|X_n'-X_n\| \leq \rho' + \epsilon \lt \rho \lt R_1. \end{align*}$

Therefore, (2.5) applies and $\left\langle {X_{n}-x_\star}, {f(X_n)}\right\rangle\geq \delta$ . Using this in (2.14), and condition (2.6),

$\begin{equation*} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle\geq \delta - \bar f \epsilon \gt \tfrac{1}{2}\delta. \end{equation*}$

Substituting this last estimate back into (2.13), and using (2.6),

$\|X_{n+1}' - x_\star\|^2 \lt (\rho')^2 - a_{n+1}(\delta - \epsilon\bar f^2) \lt (\rho')^2.$

This completes the inductive step.

$7.$ Since the auxiliary sequence remains in $B_{\rho'}(x_\star)$ for all $n\geq N > n_ \epsilon$ , (2.11) ensures $\tilde X_{n+1}\in B_{R_1}(x_\star)$ , $P_{n+1} = 0$ , and $\sigma_{n+1} = \sigma_N$ , a.s.

To obtain a similar result for the expanding trust region problem, we first relate the finiteness of the number of truncations with the sequence persisting in a bounded set.

Lemma 2.3. In the expanding trust region algorithm, if Assumptions 1, 3, and 4 hold, then the sequence remains in a set of the form $B_{R}(0)$ for some $R > 0$ if and only if the number of truncations is finite, a.s.

Proof. We break this proof into 4 steps:

$1.$ If the number of truncations is finite, then there exists $N$ such that for all $n\geq N$, $\sigma_n = \sigma_{N}$. Consequently, the proposed moves are always accepted, and $X_{n} \in U_{\sigma_{n}} = U_{\sigma_N}$ for all $n\geq N$. Since $X_n \in U_{\sigma_n}\subset U_{\sigma_N}$ for $n < N$, $X_n \in U_{\sigma_N}$ for all $n$. By Assumption 1, $B_R(0) = B_{r_{\sigma_N}}(0) = U_{\sigma_N}$ is the desired set.

$2.$ For the other direction, assume that there exists $R > 0$ such that $X_n \in B_R(0)$ for all $n$ . Since the $r_n$ in (2.3) tend to infinity, there exists $N_1$ , such that $R < R +1 < r_{N_1}$ . Hence, for all $n\geq N_1$ ,

$\begin{equation} B_{R}(0)\subset B_{R+1}(0) \subset U_{n} \end{equation}$

(2.15)

Let $\bar f = \sup \|f(x)\|$, with the supremum over $B_{R}(0)$. Let $\tilde{R}$ be sufficiently large such that $B_{R+1}(0)\subset B_{\tilde R}(x_\star)$. Lastly, using Proposition 2.1 and Assumption 4, a.s., there exists $N_2$, such that for all $n\geq N_2$

$\begin{equation} \| a_{n} \delta M_{n}1_{\|X_n - x_\star\|\leq \tilde R}\| \lt \frac{1}{2}, \quad a_{n} \lt \frac{1}{2(1+ \bar f)} \end{equation}$

(2.16)

Since $X_n \in B_{R}(0) \subset B_{\tilde R}(x_\star)$ , the indicator function in (2.16) is always one, and $\| a_{n} \delta M_{n}\| < 1/2$ .

$3.$ Next, let

$\begin{equation} N = \inf \{ n\geq 0\mid \sigma_n\geq \max\{N_1, N_2\}\} \end{equation}$

(2.17)

If the above set is empty, then $\sigma_n < \max\{N_1, N_2\}$ for all $n$ , and the number of truncations is a.s. finite. In this case, the proof is complete.

$4.$ If the set in (2.17) is not empty, then $N < \infty$. Take $n\geq N$. As $X_n \in B_R(0)$, and since $n\geq \sigma_n\geq \max\{N_1, N_2\}$, (2.16) applies. Therefore,

$\begin{equation} \begin{split} \|\tilde{X}_{n+1}\| &\leq \|X_n\| + \| \tilde{X}_{n+1} - X_n\|\\ &\leq \|X_n\| + a_{n+1}\|f(X_n)\| + \| a_{n+1} \delta M_{n+1}\|\\ & \lt R + \tfrac{1}{2}+\tfrac{1}{2} = R +1. \end{split} \end{equation}$

(2.18)

Thus, $\tilde{X}_{n+1}\in B_{R+1}(0)\subset U_{N_1}$ , $\sigma_n \geq N_1$ , and $U_{N_1}\subset U_{\sigma_n}$ . Therefore, $\tilde{X}_{n+1}\in U_{\sigma_n}$ . No truncation occurs, and $\sigma_n = \sigma_{n+1}$ . Since this holds for all $n\geq N$ , $\sigma_n = \sigma_N$ , and the number of truncations is a.s. finite.

Next, we establish that, subject to an additional assumption, the sequence remains in a bounded set; the finiteness of the truncations is then a corollary.

Lemma 2.4. In the expanding trust region algorithm, if Assumptions 1, 2, 3, and 4 hold, and for any $r > 0$ , there a.s. exists $N < \infty$ , such that for all $n \geq N$ ,

$\begin{equation*} P_{n+1} 1_{\|X_n- x_\star\|\leq r} = 0, \end{equation*}$

then $\{X_n\}$ remains in a bounded open set, a.s.

Proof. We break this proof into 7 steps:

$1.$ We begin by setting some constants for the rest of the proof. Fix $R > 0$ sufficiently large such that $B_{R}(x_\star)\supset U_0$ . Next, let $\bar f = \sup {\|f(x)\|}$ with the supremum taken over $B_{R+2}(x_\star)$ . Assumption 2 ensures there exists $\delta > 0$ such that

$\begin{equation} \inf\limits_{R/2 \leq \|x-x_\star\|\leq R+2}\left\langle {x-x_\star},{f(x)}\right\rangle = \delta. \end{equation}$

(2.19)

Having fixed $R$ , $\bar f$ , and $\delta$ , take $\epsilon > 0$ such that:

$\begin{equation} \epsilon \lt \min\left\{{1,\frac{1}{\bar{f}}, \frac{\delta}{2\bar{f}}, \frac{\delta}{\bar{f}^2}, \frac{R}{2}}\right\}. \end{equation}$

(2.20)

By the assumptions of this lemma and Proposition 2.1 there exists, a.s., $n_ \epsilon\geq N$ such that for all $n\geq n_ \epsilon$,

$\begin{gather} \left \|{\sum\limits_{i = n+1}^\infty a_i \delta M_i 1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le R + 2}}\right\|\leq \epsilon, \end{gather}$

(2.21a)

$\begin{gather} P_{n+1} 1_{\left \|{X_{n}- x_\star}\right\|\leq R+2} = 0, \end{gather}$

(2.21b)

$\begin{gather} a_{n+1}\leq \epsilon \end{gather}$

(2.21c)

$2.$ Define the modified sequence for $n \geq n_ \epsilon$ as

$\begin{equation} X_n' = X_n - \sum\limits_{k = n+1}^\infty a_k \delta M_k 1_{\left \|{X_{k-1} - x_\star}\right\|\leq R+2}, \Rightarrow \|X_n' - X_n\|\leq \epsilon. \end{equation}$

(2.22)

Using (2.1), we have the iteration

$\begin{equation} X_{n+1}' = X_n' - a_{n+1} \delta M_{n+1} 1_{\left \|{X_n - x_\star}\right\| \gt R+2} - a_{n+1}f(X_n) +a_{n+1} P_{n+1}. \end{equation}$

(2.23)

$3.$ Let

$\begin{equation} N = \inf\{n\geq n_ \epsilon\mid \sigma_{n+1}\neq \sigma_n \}+1, \end{equation}$

(2.24)

the first time after $n_ \epsilon$ that a truncation occurs.

If the above set is empty, no truncations occur after $n_ \epsilon$ . In this case, $\sigma_n = \sigma_{n_ \epsilon}\leq n_ \epsilon < \infty$ for all $n \geq n_ \epsilon$ . Therefore, for all $n\geq n_ \epsilon$ , $X_n \in U_{\sigma_n}\subset U_{\sigma_{n_ \epsilon}}$ . Since $U_{\sigma_n}\subset U_{\sigma_{n_ \epsilon}}$ for all $n < n_ \epsilon$ too, the proof is complete in this case.

$4.$ Now assume that $N < \infty$ . We will show that $\{X_n'\}$ remains in $B_{R+1}(x_\star)$ for all $n\geq N$ . Were this to hold, then for $n\geq N$ ,

$\begin{equation} \begin{split} \|X_n - x_\star\| &\leq \|X_n' - x_\star\| + \left \|{\sum\limits_{i = n+1}^\infty a_i \delta M_i 1_{\left\| {{X_{i - 1}} - {x_\star}} \right\| \le R + 2}}\right\|\\ & \lt R+1 + \epsilon \lt R+2, \end{split} \end{equation}$

(2.25)

having used (2.21) and (2.22). For $n < N$ , $X_n \in U_{\sigma_n}\subset U_{\sigma_N} = B_{r_N}(0)$ . Therefore, for all $n$ , $X_n \in B_{\tilde{R}}(0)$ where $\tilde{R} = \max\{ r_N, \|x_\star\| + R+2\}$ .

$5.$ We prove $X_n' \in B_{R+1}(x_\star)$ by induction. First, since $\epsilon < 1$ and $X_N \in U_{0}\subset B_R(x_\star)$ ,

$\begin{equation*} \|X_N'-x_\star\|\leq \|X_N'-X_N\|+ \|X_N-x_\star\| \lt \epsilon + R \lt R+1. \end{equation*}$

Next, assume that $X_N', X_{N+1}', \ldots, X_n'$ are all in $B_{R+1}(x_\star)$ . By (2.25), $X_n\in B_{R+2}(x_\star)$ . Since $P_{n+1} 1_{\left \|{X_{n}- x_\star}\right\|\leq R+2} = 0$ , we conclude $P_{n+1} = 0$ , and the indicator $1_{\left \|{X_n - x_\star}\right\| > R+2}$ in (2.23) vanishes as well. The modified iteration (2.23) therefore simplifies to

$X_{n+1}' = X_n' -a_{n+1}f(X_n),$

and

$\begin{equation} \begin{split} \|X_{n+1}' - x_\star\|^2 & = \|X'_n - x_\star\|^2 - 2a_{n+1} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle\\ &\quad + a_{n+1}^2 \|f(X_n)\|^2\\ & \lt \|X'_n - x_\star\|^2 - 2a_{n+1} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle\\ &\quad + a_{n+1} \epsilon \bar{f}^2. \end{split} \end{equation}$

(2.26)

$6.$ We now consider two cases of (2.26). First, assume $\|X'_{n} - x_\star\|\leq R$ . Then (2.26) can immediately be bounded as

$\begin{equation*} \begin{split} \|X_{n+1}' - x_\star\|^2& \lt R^2 + 2 \epsilon R \bar{f} + \epsilon^2\bar{f}^2 = (R+ \epsilon \bar{f})^2 \lt (R+1)^2, \end{split} \end{equation*}$

where we have used condition (2.20) in the last inequality.

$7.$ Now consider the case $R < \|X'_{n} - x_\star\| < R+1$ . Using (2.20), the inner product in (2.26) can first be bounded from below:

$\begin{equation*} \begin{split} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle & = \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle + \left\langle {X'_{n}-X_n},{f(X_n) }\right\rangle\\ &\geq \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle - \epsilon \bar{f} \gt \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle - \tfrac{1}{2}\delta. \end{split} \end{equation*}$

Next, using (2.20)

$\begin{equation*} \|X_{n} - x_\star\|\geq \|X_{n}'-x_\star\| - \|X_{n} -X'_{n}\| \gt R- \epsilon \gt R - \tfrac{1}{2}R = \tfrac{1}{2}R \end{equation*}$

Therefore, $\tfrac{1}{2}R < \|X_n -x_\star\| < R+2$ , so (2.19) ensures $\left\langle {X_{n}-x_\star}, {f(X_n)}\right\rangle\geq\delta$ and

$\begin{equation*} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle \gt \delta-\tfrac{1}{2}\delta = \tfrac{1}{2}\delta. \end{equation*}$

Returning to (2.26), by (2.20),

$\begin{equation*} \|X_{n+1}' - x_\star\|^2\leq (R+1)^2 - a_{n+1}(\delta - \epsilon \bar{f}^2) \lt (R+1)^2. \end{equation*}$

This completes the inductive step in the second case, and with it the proof.

Corollary 2.1. For the expanding trust region algorithm, if Assumptions 1, 2, 3, and 4 hold, then the number of truncations is a.s. finite.

Proof. The proof is by contradiction. We break the proof into 4 steps:

$1.$ Assuming that there are infinitely many truncations, Lemma 3 implies that the sequence cannot remain in a bounded set. Then, continuing to assume that Assumptions 1, 2, 3, and 4 hold, the only way for the conclusion of Lemma 4 to fail is if the assumption on $P_{n+1} 1_{\|X_n - x_\star\|\leq r }$ is false. Therefore, there exists $r > 0$ and a set of positive measure on which a subsequence, $P_{n_k+1} 1_{\|X_{n_k} - x_\star\|\leq r }\neq 0$ . Hence $X_{n_k}\in B_{r}(x_\star)$ , and $P_{n_k+1}\neq 0$ . So truncations occur at these indices, and $\tilde{X}_{n_k+1} \not\in U_{\sigma_{n_k}}$ .

$2.$ Let $\bar{f} = \sup \|f(x)\|$ with the supremum over the set $B_r(x_\star)$ and let $\epsilon > 0$ satisfy

$\begin{equation} \epsilon \lt (\bar{f}+1)^{-1}. \end{equation}$

(2.27)

By the assumptions of the corollary and Proposition 1, there exists $n_ \epsilon$ such that for all $n\geq n_ \epsilon$

$\begin{equation} \|a_{n+1}\delta M_{n+1}1_{\|X_n - x_\star\|\leq r}\|\leq \epsilon, \quad a_{n+1}\leq \epsilon \end{equation}$

(2.28)

Along the subsequence, for all $n_k\geq n_ \epsilon$ ,

$\begin{equation} \|a_{n_k+1}\delta M_{n_k+1}1_{\|X_{n_k} - x_\star\|\leq r}\| = \|a_{n_k+1}\delta M_{n_k+1}\|\leq \epsilon. \end{equation}$

(2.29)

$3.$ Furthermore, for $n_k \geq n_ \epsilon$ :

$\begin{equation} \begin{split} \|\tilde{X}_{n_k+1} - x_\star\| &\leq \| X_{n_k} - x_\star\| + a_{n_k+1}\|f(X_{n_k})\| + \|a_{n_k+1}\delta M_{n_k+1}\|\\ & \lt r + \epsilon \bar{f} + \epsilon \lt r+1, \Rightarrow \tilde{X}_{n_k+1} \in B_{r+1}(x_\star), \end{split} \end{equation}$

(2.30)

where (2.27) has been used in the last inequality.

$4.$ By the definition of the $U_n$ , there exists an index $M$ such that $U_{M}\supset B_{r+1}(x_\star)$ . Let

$\begin{equation} N = \inf\{n\geq n_ \epsilon\mid \sigma_n \geq M\}. \end{equation}$

(2.31)

This set is nonempty and $N < \infty$ since we have assumed there are infinitely many truncations. Let $n_k \geq N$ . Then $\sigma_{n_k}\geq M$ and $U_{\sigma_{n_k}}\supset B_{r+1}(x_\star)$ . But (2.30) then implies that $\tilde{X}_{n_k+1} \in U_{\sigma_{n_k}}$ , so no truncation will occur and $P_{n_k+1} = 0$ , providing the desired contradiction.

2.2. Proof of convergence

Using the above results, we are able to prove Theorems 2.1 and 2.2. Since the proofs are quite similar, we present the more complicated expanding trust region case.

Proof. We split this proof into 6 steps:

$1.$ First, by Corollary 2.1, only finitely many truncations occur. By Lemma 3, there exists $R > 0$ such that $X_n\in B_R(0)$ for all $n$ . Consequently, there is an $r$ such that $X_n\in B_r(x_\star)$ for all $n$ .

$2.$ Next, we fix constants. Let $\bar f = \sup \|f(x)\|$ with the supremum taken over $B_r(x_\star)$ . Fix $\eta \in (0, 2R)$ , and use Assumption 2 to determine $\delta > 0$ such that

$\begin{equation} \inf\limits_{\eta/2 \leq \|x-x_\star\|\leq r}\left\langle {x-x_\star},{f(x)}\right\rangle = \delta \end{equation}$

(2.32)

Take $\epsilon > 0$ such that:

$\begin{equation} \epsilon \lt \min\left\{{1,\frac{\eta}{2}, \frac{\delta}{2\bar{f}}, \frac{\delta}{2\bar{f}^2}}\right\} \end{equation}$

(2.33)

Having set $\epsilon$ , we again appeal to Assumption 4 and Proposition 1 to find $n_ \epsilon$ such that for all $n\geq n_ \epsilon$ :

$\begin{equation} \left \|{\sum\limits_{i = n+1}^\infty a_i \delta M_i 1_{\|X_{i-1} -x_\star\|\leq r}}\right\| = \left \|{\sum\limits_{i = n+1}^\infty a_i \delta M_i }\right\| \leq \epsilon, \quad a_{n+1}\leq \epsilon \end{equation}$

(2.34)

$3.$ Define the auxiliary sequence,

$\begin{equation} X_n' = X_n - \sum\limits_{i = n+1}^\infty a_i \delta M_i 1_{\|X_{i-1} - x_\star \|\leq r} = X_n - \sum\limits_{i = n+1}^\infty a_i \delta M_i. \end{equation}$

(2.35)

Since there are only finitely many truncations, there exists $N\geq n_ \epsilon$ , such that for all $n\geq N$ , $P_{n+1} = 0$ , as the truncations have ceased. Consequently, for $n\geq N$ ,

$\begin{equation} X'_{n+1} = X'_n - a_{n+1}f(X_n) \end{equation}$

(2.36)

By (2.34) and (2.35), for $n\geq N$ , $\|X_n - X_n'\|\leq \epsilon$ . Since $\epsilon > 0$ may be arbitrarily small, it will be sufficient to prove $X_n'\to x_\star$ .

$4.$ To obtain convergence of $X_n'$ , we first examine $\|X_{n+1}'-x_\star\|$ . For $n\geq N$ ,

$\begin{equation} \begin{split} \|X_{n+1}' - x_\star\|^2 &\leq \|X_{n}' - x_\star\|^2-2a_{n+1}\left\langle {X_n' - x_\star},{f(X_n)}\right\rangle + a_{n+1} \epsilon \bar f^2, \end{split} \end{equation}$

(2.37)

Now consider two cases of this expression. First, assume $\|X_n' - x_\star\|\leq\eta$ . In this case, using (2.33),

$\begin{equation} \begin{split} -2a_{n+1}\left\langle {X_n' - x_\star},{f(X_n)}\right\rangle + a_{n+1} \epsilon \bar f^2&\leq a_{n+1}(2\eta \bar f + \epsilon \bar f^2) \\ & \lt a_{n+1}(4R\bar f + \bar f^2) = a_{n+1} B. \end{split} \end{equation}$

(2.38)

where $B > 0$ is a constant depending only on $R$ and $\bar f$ . For $\|X_n' - x_\star\| > \eta$ , using (2.33)

$\begin{equation} \begin{split} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle & = \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle + \left\langle {X'_{n}-X_n},{f(X_n) }\right\rangle\\ &\geq \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle - \epsilon \bar{f}\\ & \gt \left\langle {X_{n}-x_\star},{f(X_n) }\right\rangle -\tfrac{1}{2}\delta. \end{split} \end{equation}$

(2.39)

By (2.33),

$\begin{equation*} \|X_{n}-x_\star\|\geq \|X'_{n}-x_\star\| -\| X_{n}-X_n'\| \gt \eta - \epsilon \gt \tfrac{1}{2}\eta \end{equation*}$

Since $\|X_n - x_\star\| < r$ too, (2.32) and (2.39) yield the estimate

$\begin{equation*} \left\langle {X'_{n}-x_\star},{f(X_n) }\right\rangle \gt \delta - \epsilon \bar{f} \gt \tfrac{1}{2}\delta \end{equation*}$

Thus, in this regime, using (2.33),

$\begin{equation} \begin{split} -2a_{n+1}\left\langle {X_n' - x_\star},{f(X_n)}\right\rangle + a_{n+1} \epsilon \bar f^2&\leq -a_{n+1}(\delta - \epsilon \bar f^2)\\ & \lt -\tfrac{1}{2}\delta a_{n+1} = - A a_{n+1} \end{split} \end{equation}$

(2.40)

where $A > 0$ is a constant depending only on $\delta$ .

Combining estimates (2.38) and (2.40), we can write for $n\geq N$

$\begin{equation} \|X_{n+1}' - x_\star\|^2 \lt \|X_{n}' - x_\star\|^2 - a_{n+1} A 1_{\|X_n' - x_\star\| \gt \eta} + a_{n+1} B 1_{\|X_n' - x_\star\|\leq \eta}. \end{equation}$

(2.41)

$5.$ We now show that $\|X_n' - x_\star\|\leq \eta$ i.o. The argument is by contradiction. Let $M\geq N$ be such that for all $n\geq M$ , $\|X_n' - x_\star\| > \eta$ . For such $n$ ,

$\begin{equation} \begin{split} \eta^2 \lt \|X_{n+1}'-x_\star\|^2 & \lt \|X_{n}'-x_\star\|^2 - a_{n+1}A \\ & \lt \|X_{n-1}'-x_\star\|^2 - a_{n+1}A - a_n A\\ & \lt \ldots \lt \|X_M' - x_\star\|^2 - A \sum\limits_{i = M}^n a_{i+1}. \end{split} \end{equation}$

(2.42)

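By Assumption 4, the sum $\sum_{i = M}^n a_{i+1}$ diverges as $n\to \infty$ , so the right-hand side of (2.42) eventually falls below $\eta^2$ , and we obtain a contradiction.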

$6.$ Finally, we prove convergence of $X_n'\to x_\star$ . Since $X_n'\in B_{\eta}(x_\star)$ i.o., let

$\begin{equation} N' = \inf \{n\geq N\mid \|X_n' - x_\star\| \lt \eta \}. \end{equation}$

(2.43)

For $n\geq N'$ , we can then define

$\begin{equation} \varphi(n) = \max\left\{{p\leq n\mid \left \|{X_p' - x_\star}\right\| \lt \eta }\right\}. \end{equation}$

(2.44)

For all such $n$ , $\varphi(n)\leq n$ , and $X_{\varphi(n)}'\in B_\eta(x_\star)$ .

We claim that for $n\geq N'$ ,

$\|X_{n+1}' - x_\star\|^2 \lt \|X_{\varphi(n)}' -x_\star\|^2 + B a_{\varphi(n)+1} \lt \eta^2 + Ba_{\varphi(n)+1}.$

First, if $n = \varphi(n)$ , this follows directly from (2.41). Suppose now that $n > \varphi(n)$ . Then for $i = \varphi(n)+1, \varphi(n)+2, \ldots, n$ , $\|X_{i}' - x_\star\| > \eta$ . Consequently,

$\begin{equation*} \begin{split} \|X_{n+1}' - x_\star\|^2& \lt \|X_{n}' - x_\star\|^2 \lt \|X_{n-1}' - x_\star\|^2 \lt \ldots\\ & \lt \|X_{\varphi(n)+1}' - x_\star\|^2 \lt \|X_{\varphi(n)}' - x_\star\|^2 + B a_{\varphi(n)+1}\\ & \lt \eta^2 + B a_{\varphi(n)+1} \end{split} \end{equation*}$

As $\varphi(n)\to \infty$ ,

$\begin{equation*} \limsup\limits_{n\to \infty }\|X_{n+1}' - x_\star\|^2\leq \eta^2 \end{equation*}$

Since $\eta$ may be arbitrarily small, we conclude that

$\limsup\limits_{n\to \infty }\|X_{n+1}' - x_\star\| = \lim\limits_{n\to \infty }\|X_{n+1}' - x_\star\| = 0,$

completing the proof.

3. Minimization of relative entropy

Recall from the introduction that our distribution of interest, $\mu$ , is posed on the Borel subsets of Hilbert space $\mathcal{H}$ . We assume that $\mu \ll \mu_0$ , where $\mu_0 = N(m_0, C_0)$ is some reference Gaussian. Thus, we write

$\begin{equation} \frac{d\mu}{d\mu_0} = \frac{1}{Z_\mu}\exp\left\{{-\Phi_\mu(u)}\right\}, \end{equation}$

(3.1)

where $\Phi_\mu: X\to \mathbb{R}$ is assumed to be continuous, and $X$ is a Banach space, a subspace of $\mathcal{H}$ of full measure with respect to the Gaussian $\mu_0$ on $\mathcal{H}$ . The partition function $Z_\mu = \mathbb{E}^{\mu_0}[\exp\left\{{-\Phi_\mu(u)}\right\}]\in (0, \infty)$ ensures that we have a probability measure.

Let $\nu = N(m, C)$ be another Gaussian, equivalent to $\mu_0$ , such that we can write

$\begin{equation} \frac{d\nu}{d\mu_0} = \frac{1}{Z_\nu}\exp\left\{{-\Phi_\nu(u)}\right\}. \end{equation}$

(3.2)

Assuming that $\nu \ll \mu$ , we can write

$\begin{equation} \mathcal{R}(\nu||\mu) = \mathbb{E}^{\nu}[\Phi_\mu(u) - \Phi_\nu(u)] + \log(Z_\mu) - \log(Z_\nu) \end{equation}$

(3.3)

The assumption that $\nu \ll \mu$ implies that $\nu$ and $\mu$ are equivalent measures. As was proven in ^[16], if $\mathcal{A}$ is a set of Gaussian measures, closed under weak convergence, such that at least one element of $\mathcal{A}$ is absolutely continuous with respect to $\mu$ , then any minimizing sequence over $\mathcal{A}$ will have a weak subsequential limit.

If we assume, for this work, that $C = C_0$ , then, by the Cameron-Martin formula (see ^[9]),

$\begin{equation} \Phi_\nu(u) = -\left\langle {u-m},{m -m_0}\right\rangle_{ \mathcal{H}^1} - \frac{1}{2}\left \|{m - m_0}\right\|_{ \mathcal{H}^1}^2, \quad Z_{\nu} = 1. \end{equation}$

(3.4)

Here, $\left\langle {\bullet}, {\bullet}\right\rangle_{ \mathcal{H}^1}$ and $\|\bullet\|_{ \mathcal{H}^1}$ are the inner product and norms of the Cameron-Martin Hilbert space, denoted $\mathcal{H}^1$ ,

$\begin{equation} \left\langle {f},{g}\right\rangle_{ \mathcal{H}^1} = \left\langle {C_0^{-1/2} f},{C_0^{-1/2} g}\right\rangle, \quad \left \|{f}\right\|_{ \mathcal{H}^1}^2 = \left\langle {f},{f}\right\rangle_{ \mathcal{H}^1}. \end{equation}$

(3.5)

Convergence to the minimizer will be established in $\mathcal{H}^1$ , and $\mathcal{H}^1$ will be the relevant Hilbert space in our application of Theorems 2.1 and 2.2 to this problem.

Letting $\nu_0 = N(0, C_0)$ and $v\sim \nu_0$ , we can then rewrite (3.3) as

$\begin{equation} J(m) \equiv \mathcal{R}(\nu||\mu) = \mathbb{E}^{\nu_0}[\Phi_{\mu}(v + m)] + \frac{1}{2}\left \|{m - m_0}\right\|_{ \mathcal{H}^1}^2 + \log(Z_\mu) \end{equation}$

(3.6)

The Euler-Lagrange equation associated with (3.6), and the second variation, are:

$\begin{align} J'(m) & = \mathbb{E}^{\nu_0}[\Phi'_\mu(v+m)] + C_0^{-1}(m-m_0), \end{align}$

(3.7)

$\begin{align} J''(m) & = \mathbb{E}^{\nu_0}[\Phi''_\mu(v+m)] + C_0^{-1}. \end{align}$

(3.8)
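As a brief sketch of where (3.7) comes from, assuming $\Phi_\mu$ is differentiable and that differentiation may be exchanged with the expectation, perturb $m$ in a direction $h \in \mathcal{H}^1$ and use $\left\langle {f},{g}\right\rangle_{ \mathcal{H}^1} = \left\langle {C_0^{-1} f},{g}\right\rangle$ :

$\begin{equation*} \begin{split} \frac{d}{ds} J(m + s h)\Big|_{s = 0} & = \mathbb{E}^{\nu_0}\left[{\left\langle {\Phi'_\mu(v+m)},{h}\right\rangle}\right] + \left\langle {m - m_0},{h}\right\rangle_{ \mathcal{H}^1}\\ & = \left\langle { \mathbb{E}^{\nu_0}[\Phi'_\mu(v+m)] + C_0^{-1}(m-m_0)},{h}\right\rangle. \end{split} \end{equation*}$

Differentiating once more in $m$ , under the same assumptions, gives (3.8).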

3.1. Application of Robbins-Monro

In ^[15], it was suggested that rather than try to find a root of (3.7), the equation first be preconditioned by multiplying by $C_0$ ,

$\begin{equation} C_0 \mathbb{E}^{\nu_0}[\Phi'_\mu(v+m)] + (m-m_0), \end{equation}$

(3.9)

and a root of this mapping be sought instead. Define

$\begin{align} f(m) & = C_0 \mathbb{E}^{\nu_0}[\Phi'_\mu(v+m)] + (m-m_0), \end{align}$

(3.10)

$\begin{align} F(m,v) & = C_0\Phi'_\mu(v+m) + (m-m_0). \end{align}$

(3.11)

The Robbins-Monro formulation is then

$\begin{equation} m_{n+1} = m_n - a_{n+1} F(m_n, v_{n+1}) + P_{n+1}, \end{equation}$

(3.12)

with $v_n \sim \nu_0$ , i.i.d.

We thus have

Theorem 3.1. Assume:

● There exists $\nu = N(m, C_0)\sim \mu_0$ such that $\nu\ll\mu$ .

● $\Phi_\mu'$ and $\Phi_\mu''$ exist for all $u \in \mathcal{H}^1$ .

● There exists $m_\star$ , a local minimizer of $J$ , such that $J'(m_\star) = 0$ .

● The mapping

$\begin{equation} m\mapsto \mathbb{E}^{\nu_0}\left[{\left \|{\sqrt{C_0}\Phi_\mu'(m+v)}\right\|^2}\right] \end{equation}$

(3.13)

is bounded on bounded subsets of $\mathcal{H}^1$ .

● There exists a convex neighborhood $U_\star$ of $m_\star$ and a constant $\alpha > 0$ , such that for all $m\in U_\star$ , for all $u \in \mathcal{H}^1$ ,

$\begin{equation} \left\langle {J''(m)u},{u}\right\rangle\geq \alpha \left \|{u}\right\|_{ \mathcal{H}^1}^2 \end{equation}$

(3.14)

Then, choosing $a_n$ according to Assumption 4,

● If the subset $U_\star$ can be taken to be all of $\mathcal{H}^1$ , for the expanding truncation algorithm, $m_n \to m_\star$ a.s. in $\mathcal{H}^1$ .

● If the subset $U_\star$ is not all of $\mathcal{H}^1$ , then, taking $U_1$ to be a bounded (in $\mathcal{H}^1$ ) convex subset of $U_\star$ , with $m_\star \in U_1$ , and $U_0$ any subset of $U_1$ such that there exist $R_0 < R_1$ with

$U_0 \subset B_{R_0}(m_\star) \subset B_{R_1}(m_\star)\subset U_1,$

for the fixed truncation algorithm, $m_n\to m_\star$ a.s. in $\mathcal{H}^1$ .

Proof. We split the proof into 2 steps:

$1.$ By the assumptions of the theorem, we clearly satisfy Assumptions 1 and 4. To satisfy Assumption 3, we observe that

$\begin{equation*} \mathbb{E}^{\nu_0}[\left \|{F(m,v)}\right\|^2_{ \mathcal{H}^1}]\leq 2 \mathbb{E}^{\nu_0}\left[{\left \|{\sqrt{C_0}\Phi_\mu'(m+v)}\right\|^2}\right] + 2\left \|{m-m_0}\right\|_{ \mathcal{H}^1}^2. \end{equation*}$

This is bounded on bounded subsets of $\mathcal{H}^1$ .

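$2.$ The convexity assumption (3.14) implies Assumption 2, since, by the mean value theorem in function spaces,

$\begin{equation*} \begin{split} \left\langle {m-m_\star},{f(m)}\right\rangle_{ \mathcal{H}^1} & = \left\langle {m-m_\star},{C_0\left[{J'(m_\star) +J''(\tilde m)(m-m_\star) }\right]}\right\rangle_{ \mathcal{H}^1}\\ & = \left\langle {m-m_\star},{J''(\tilde m)(m-m_\star)}\right\rangle \geq \alpha \left \|{m-m_\star}\right\|_{ \mathcal{H}^1}^2, \end{split} \end{equation*}$

where $\tilde m$ is some intermediate point between $m$ and $m_\star$ . This completes the proof.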

While condition (3.14) is sufficient to obtain convexity, other conditions are possible. For instance, suppose there is a convex open set $U_\star$ containing $m_\star$ and constant $\theta\in [0, 1)$ , such that for all $m \in U_\star$ ,

$\begin{equation} \inf\limits_{\substack{u\in \mathcal{H}\\ u\neq 0}} \frac{\left\langle { \mathbb{E}^{\nu_0}[\Phi''_\mu(v+m)]u},{u}\right\rangle}{\left \|{u}\right\|^2}\geq -\theta\lambda_1^{-1}, \end{equation}$

(3.15)

where $\lambda_1$ is the principal eigenvalue of $C_0$ . Then this would also imply Assumption 2, since

$\begin{equation*} \begin{split} \left\langle {m-m_\star},{f(m)}\right\rangle_{ \mathcal{H}^1} & = \left\langle {m-m_\star},{C_0\left[{J'(m_\star) +J''(\tilde m)(m-m_\star) }\right]}\right\rangle_{ \mathcal{H}^1}\\ & = \left\langle {m-m_\star},{J''(\tilde m)(m-m_\star)}\right\rangle\\ &\geq \left \|{m-m_\star}\right\|_{ \mathcal{H}^1}^2 + \left\langle {m-m_\star},{ \mathbb{E}^{\nu_0}[\Phi''_\mu(v+\tilde m)] (m-m_\star)}\right\rangle\\ &\geq \left \|{m-m_\star}\right\|_{ \mathcal{H}^1}^2 -\theta \lambda_1^{-1} \left \|{m-m_\star}\right\|^2\\ &\geq (1-\theta)\left \|{m-m_\star}\right\|_{ \mathcal{H}^1}^2. \end{split} \end{equation*}$
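The passage from the $\mathcal{H}$ norm to the $\mathcal{H}^1$ norm in the last step can be sketched as follows, assuming (an assumption made here for illustration) that $C_0$ has an orthonormal eigenbasis $\{e_k\}$ of $\mathcal{H}$ with eigenvalues $\lambda_1\geq \lambda_2\geq \ldots > 0$ :

$\begin{equation*} \left \|{u}\right\|_{ \mathcal{H}^1}^2 = \sum\limits_{k} \lambda_k^{-1} \left |{\left\langle {u},{e_k}\right\rangle}\right |^2 \geq \lambda_1^{-1}\sum\limits_{k} \left |{\left\langle {u},{e_k}\right\rangle}\right |^2 = \lambda_1^{-1}\left \|{u}\right\|^2. \end{equation*}$

Hence $-\theta\lambda_1^{-1}\left \|{u}\right\|^2\geq -\theta \left \|{u}\right\|_{ \mathcal{H}^1}^2$ for $\theta \geq 0$ , which is what the final inequality uses with $u = m - m_\star$ .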

We mention (3.15) as there may be cases, shown below, for which the operator $\mathbb{E}^{\nu_0}[\Phi''_\mu(v+ m)]$ is obviously nonnegative.

4. Examples

To apply the Robbins-Monro algorithm to the relative entropy minimization problem, the $\Phi_\mu$ functional of interest must be examined. In this section we present a few examples, based on those presented in ^[15], and examine when the assumptions hold. The one outstanding assumption that we must make is that, a priori, $\mu_0$ is an equivalent measure to $\mu$ .

4.1. Scalar problem

Taking $\mu_0 = N(0, 1)$ , the standard unit Gaussian, let $V: \mathbb{R} \to \mathbb{R}$ be a smooth function such that

$\begin{equation} \frac{d\mu}{d\mu_0} = \frac{1}{Z_\mu} \exp\left\{{-{ \epsilon^{-1}}V(x)}\right\} \end{equation}$

(4.1)

is a probability measure on $\mathbb{R}$ . For these scalar cases, we use $x$ in place of $v$ . In the above framework,

$\begin{align*} F(x,\xi) & = { \epsilon^{-1}}V'(x+\xi) + x,\\ f(x) & = { \epsilon^{-1}} \mathbb{E}[V'(x+\xi)] + x, \\ \Phi_\mu'(x) & = { \epsilon^{-1}}V'(x), \\ \Phi_\mu''(x)& = { \epsilon^{-1}}V''(x) \end{align*}$

and $\xi\sim N(0, 1) = \nu_0 = \mu_0$ .

4.1.1. Globally convex case

Consider the case that

$\begin{equation} V(x) = \tfrac{1}{2}x^2 + \tfrac{1}{4}x^4. \end{equation}$

(4.2)

In this case

$\begin{align*} F(x,\xi)& = { \epsilon^{-1}}\left({x+\xi + (x+\xi)^3}\right) + x,\\ f(x) & = { \epsilon^{-1}}\left({4x + x^3}\right) +x, \\ \mathbb{E}[\Phi''_\mu(x+\xi)] & = { \epsilon^{-1}}(4 + 3x^2),\\ \mathbb{E}[\left |{\Phi'_\mu(x+\xi)}\right |^2] & = { \epsilon^{-2}} \left({22 + 58 x^2 + 17 x^4 + x^6}\right). \end{align*}$

Since $\mathbb{E}[\Phi''_\mu(x+\xi)] \geq 4 { \epsilon^{-1}}$ , all of our assumptions are satisfied and the expanding truncation algorithm will converge to the unique root at $x_\star = 0$ a.s. See Figure 1 for an example of the convergence at $\epsilon = 0.1$ , $U_{n} = (-n -1, n+1)$ , and always restarting at $0.5$ .

Figure 1. Robbins-Monro applied to a globally convex scalar problem associated with (4.2) with $\epsilon = 0.1$ and expanding trust regions $U_{n} = (-1-n, 1+n)$ .

We refer to this as a "globally convex" problem since $\mathcal{R}$ is globally convex about the minimizer.
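The scalar experiment above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for Figure 1: the step sizes $a_n = a_0/n$ , the seed, the number of steps, and the function names are all assumptions chosen here, and a proposal that leaves the current trust region is handled by restarting at $0.5$ and enlarging the region.

```python
import random


def F(x, xi, eps=0.1):
    """Noisy observation F(x, xi) for V(x) = x^2/2 + x^4/4, see (4.2)."""
    y = x + xi
    return (y + y**3) / eps + x


def robbins_monro_expanding(x0=0.5, n_steps=200_000, a0=1.0, seed=0):
    """Robbins-Monro with expanding trust regions U_k = (-(k+1), k+1).

    The step sizes a_n = a0 / n are an illustrative assumption.
    Returns the final iterate and the number of truncations.
    """
    rng = random.Random(seed)
    x, sigma = x0, 0  # sigma counts the truncations so far
    for n in range(1, n_steps + 1):
        a = a0 / n
        xi = rng.gauss(0.0, 1.0)
        proposal = x - a * F(x, xi)
        if abs(proposal) < sigma + 1:
            x = proposal  # proposal lies in U_sigma: accept it
        else:
            x, sigma = x0, sigma + 1  # truncate: restart and expand
    return x, sigma


x_final, n_trunc = robbins_monro_expanding()
print(x_final, n_trunc)
```

With these choices the early iterations, where $a_n$ is large relative to ${ \epsilon^{-1}}$ , typically trigger several truncations before the iterates settle near the root at $0$ .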

4.1.2. Locally convex case

In contrast to the above problem, some minimizers are only "locally" convex. Consider the double well potential

$\begin{equation} V(x) = \tfrac{1}{4}(4-x^2)^2 \end{equation}$

(4.3)

Now, the expressions for RM are

$\begin{align*} F(x,\xi) & = { \epsilon^{-1}}\left({(x+\xi)^3-4(x+\xi)}\right) + x,\\ f(x) & = { \epsilon^{-1}}\left({x^3-x}\right) +x, \\ \mathbb{E}[\Phi''_\mu(x+\xi)] & = { \epsilon^{-1}}\left({3x^2-1}\right),\\ \mathbb{E}[\left |{\Phi'_\mu(x+\xi)}\right |^2] & = { \epsilon^{-2}} (1 + x^2) (7 + 6 x^2 +x^4). \end{align*}$

In this case, $f(x)$ vanishes at $0$ and $\pm \sqrt{1- \epsilon}$ , and $J''$ changes sign from positive to negative when $x$ enters $({-\sqrt{(1- \epsilon)/3}, \sqrt{({1- \epsilon})/{3}}})$ . We must therefore restrict to a fixed trust region if we want to ensure convergence to either of $\pm\sqrt{1- \epsilon}$ .
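The roots and the sign change can be checked directly from the closed-form expressions above. The short Python sketch below does so at $\epsilon = 0.1$ ; the function names are hypothetical, and the second variation is taken to be $J''(x) = \mathbb{E}[\Phi''_\mu(x+\xi)] + 1$ , which is (3.8) with $C_0 = 1$ .

```python
import math


def f(x, eps):
    """Mean field f(x) = (x^3 - x)/eps + x for the double well (4.3)."""
    return (x**3 - x) / eps + x


def second_variation(x, eps):
    """J''(x) = E[Phi''(x + xi)] + 1 = (3x^2 - 1)/eps + 1."""
    return (3 * x**2 - 1) / eps + 1


eps = 0.1
roots = [0.0, math.sqrt(1 - eps), -math.sqrt(1 - eps)]
threshold = math.sqrt((1 - eps) / 3)  # where J'' changes sign

for r in roots:
    # f vanishes at each root; J'' is negative at 0 and positive at the others
    print(r, f(r, eps), second_variation(r, eps))
print(threshold, second_variation(threshold, eps))
```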

We ran the problem at $\epsilon = 0.1$ in two cases. In the first case, $U_1 = (0.6, 3.0)$ and the process always restarts at $2$ . This guarantees convergence since the second variation will be strictly positive. In the second case, $U_1 = (-0.5, 1.5)$ , and the process always restarts at $-0.1$ . Now, the second variation can change sign. The results of these two experiments appear in Figure 2. For some random number sequences the algorithm still converged to $\sqrt{1- \epsilon}$ , even with the poor choice of trust region.

Figure 2. Robbins-Monro applied to the nonconvex scalar problem associated with (4.3). Figure (a) shows the result with a well chosen trust region, while (b) shows the outcome of a poorly chosen trust region.
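The first of the two experiments (the well chosen trust region) can be sketched in Python as follows. As before, this is only an illustration under stated assumptions: the step sizes $a_n = a_0/n$ , the seed, the number of steps, and the names are chosen here and are not taken from the original experiments.

```python
import math
import random


def F(x, xi, eps=0.1):
    """Noisy observation F(x, xi) for the double well V(x) = (4 - x^2)^2 / 4."""
    y = x + xi
    return (y**3 - 4 * y) / eps + x


def robbins_monro_fixed(lo=0.6, hi=3.0, x0=2.0, n_steps=200_000, a0=1.0, seed=0):
    """Robbins-Monro with the fixed trust region U_1 = (lo, hi).

    Proposals that leave U_1 are discarded and the iterate restarts at x0.
    The step sizes a_n = a0 / n are an illustrative assumption.
    """
    rng = random.Random(seed)
    x, n_trunc = x0, 0
    for n in range(1, n_steps + 1):
        a = a0 / n
        proposal = x - a * F(x, rng.gauss(0.0, 1.0))
        if lo < proposal < hi:
            x = proposal
        else:
            x, n_trunc = x0, n_trunc + 1
    return x, n_trunc


x_final, n_trunc = robbins_monro_fixed()
print(x_final, math.sqrt(0.9), n_trunc)
```

Changing the defaults to `lo=-0.5, hi=1.5, x0=-0.1` gives a sketch of the second, poorly chosen, configuration, for which no convergence guarantee is available.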


4.2. Path space problem

Take $\mu_0 = N(m_0(t), C_0)$ , with

$\begin{equation} C_0 = \left({-\frac{d^2}{dt^2}}\right)^{-1}, \end{equation}$

(4.4)

equipped with Dirichlet boundary conditions on $\mathcal{H} = L^2(0, 1)$ .^* In this case the Cameron-Martin space is $\mathcal{H}^1 = H^1_0(0, 1)$ , the standard Sobolev space equipped with the Dirichlet norm. Let us assume $m_0 \in H^1(0, 1)$ , taking values in $\mathbb{R}^d$ .

^* This is the covariance of the standard unit Brownian bridge, $Y_t = B_t - t B_1$ .

Consider the path space distribution on $L^2(0, 1)$ , induced by

$\begin{equation} \frac{d\mu}{d\mu_0} = \frac{1}{Z_\mu}\exp\left\{{-\Phi_\mu(v)}\right\}, \quad \Phi_\mu(v) = { \epsilon^{-1}}\int_0^1 V(v(t))dt, \end{equation}$

(4.5)

where $V: \mathbb{R}^d\to \mathbb{R}$ is a smooth function. We assume that $V$ is such that this probability distribution exists and that $\mu \sim \mu_0$ , our reference measure.

We thus seek an $\mathbb{R}^d$ valued function $m(t) \in H^1(0, 1)$ for our Gaussian approximation of $\mu$ , satisfying the boundary conditions

$\begin{equation} m(0) = m_-,\quad m(1) = m_+. \end{equation}$

(4.6)

For simplicity, take $m_0 = (1-t)m_- + t m_+$ , the linear interpolant between $m_\pm$ . As above, we work in the shifted coordinate $x(t) = m(t) - m_0(t)\in H^1_0(0, 1)$ .

Given a path $v(t)\in H^1_0$ , by the Sobolev embedding, $v$ is continuous with its $L^\infty$ norm controlled by its $H^1$ norm. Also recall that for $\xi \sim N(0, C_0)$ , in the case of $\xi(t) \in \mathbb{R}$ ,

$\begin{equation} \mathbb{E}\left[{\xi(t)^p}\right] = \begin{cases} 0, & \text{$p$ odd},\\ (p-1)!!\left[{t(1-t)}\right]^{\frac p 2}, & \text{$p$ even}. \end{cases} \end{equation}$

(4.7)
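Formula (4.7) is what turns expectations of polynomial potentials into explicit functions of $t$ . The following Python sketch, with hypothetical function names, evaluates (4.7) and uses it for the cubic term $\mathbb{E}[(a+\xi(t))^3] = a^3 + 3 a\, t(1-t)$ that appears in the path space examples.

```python
def double_factorial(n):
    """n!! for integers n >= -1, with (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def bridge_moment(p, t):
    """E[xi(t)^p] for the scalar Brownian bridge, following (4.7)."""
    if p % 2 == 1:
        return 0.0
    return double_factorial(p - 1) * (t * (1 - t)) ** (p // 2)


def expected_cube(a, t):
    """E[(a + xi(t))^3] = a^3 + 3 a E[xi(t)^2], by the binomial expansion."""
    return a**3 + 3 * a * bridge_moment(2, t)


print(bridge_moment(2, 0.5), bridge_moment(4, 0.5), expected_cube(2.0, 0.5))
```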

Letting $\lambda_1 = 1/\pi^2$ be the ground state eigenvalue of $C_0$ ,

$\begin{equation*} \begin{split} \mathbb{E}[\|\sqrt{C_0}\Phi'_\mu(v +m_0+\xi)\|^2]&\leq {\lambda_1} \mathbb{E}[\|\Phi'_\mu(v +m_0+\xi)\|^2]\\ &\quad = {\lambda_1}{ \epsilon^{-2}}\int_0^1 \mathbb{E}[{\left |{V'(v(t)+m_0(t)+\xi(t))}\right |^2}]dt. \end{split} \end{equation*}$

The terms involving $v+m_0$ in the integrand can be controlled by the $L^\infty$ norm, which in turn is controlled by the $H^1$ norm, while the terms involving $\xi$ can be integrated according to (4.7). As a mapping applied to $x$ , this expression is bounded on bounded subsets of $H^1$ .
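The value $\lambda_1 = 1/\pi^2$ used in this bound can be sanity checked numerically. The Python sketch below is an illustration under assumptions made here: it uses the standard three-point finite difference stencil for $-d^2/dt^2$ with Dirichlet conditions, for which the discrete sine mode $\sin(\pi t_j)$ is the eigenvector with the smallest eigenvalue, and evaluates its Rayleigh quotient. The grid size and names are hypothetical.

```python
import math


def smallest_dirichlet_eigenvalue(n_interior=200):
    """Rayleigh quotient of u(t) = sin(pi t) for the discrete -d^2/dt^2."""
    h = 1.0 / (n_interior + 1)
    # grid values including the two boundary points, where u vanishes
    u = [math.sin(math.pi * j * h) for j in range(n_interior + 2)]
    num = 0.0
    den = 0.0
    for j in range(1, n_interior + 1):
        lap = (2 * u[j] - u[j - 1] - u[j + 1]) / h**2  # (-u'') at t_j
        num += u[j] * lap
        den += u[j] * u[j]
    return num / den


mu_min = smallest_dirichlet_eigenvalue()
lambda_1 = 1.0 / mu_min  # largest eigenvalue of the discretized C_0
print(lambda_1, 1.0 / math.pi**2)
```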

Minimizers will satisfy the ODE

$\begin{equation} { \epsilon}^{-1} \mathbb{E}\left[{V'(x+m_0 +\xi)}\right] -x'' = 0,\quad x(0) = x(1) = 0. \end{equation}$

(4.8)

4.3. Globally convex example

With regard to convexity about a minimizer, $m_\star$ , if, for instance, $V''$ were pointwise positive definite, then the problem would satisfy (3.15), ensuring convergence. Consider the quartic potential $V$ given by (4.2). In this case,

$\begin{equation} \Phi(v) = { \epsilon}^{-1}\int_0^1 \frac{1}{2}v(t)^2 +\frac{1}{4}v(t)^4 dt, \end{equation}$

(4.9)

and

$\begin{align*} \Phi'(v+m_0+ \xi) & = { \epsilon}^{-1}\left[{(v+m_0 +\xi) +(v+m_0 +\xi)^3 }\right],\\ \Phi''(v+m_0 + \xi) & = { \epsilon}^{-1}\left[{1 +3 (v+m_0+\xi)^2}\right],\\ \mathbb{E}[\Phi'(v+m_0 + \xi)]& = { \epsilon}^{-1}\left[{v+m_0 +(v+m_0)^3+ 3 t(1-t) (v+m_0)}\right],\\ \mathbb{E}[\Phi''(v+m_0 + \xi)]& = { \epsilon}^{-1}\left[{1 + 3 (v+m_0)^2 + 3 t(1-t) }\right]. \end{align*}$

Since $\Phi''(v+m_0+\xi)\geq \epsilon^{-1}$ , we are guaranteed convergence using expanding trust regions. Taking $\epsilon = 0.01$ , $m_- = 0$ and $m_+ = 2$ , this is illustrated in Figure 3, where we have also solved (4.8) by ODE methods for comparison. As trust regions, we take

$\begin{equation} U_n = \left\{{x \in H^1_0(0,1)\mid \left \|{x}\right\|_{H^1}\leq 10+n}\right\}, \end{equation}$

(4.10)

and we always restart at the zero solution. Figure 3 also shows robustness to discretization; the number of truncations is relatively insensitive to $\Delta t$ .

Figure 3. The mean paths computed for (4.9) at different resolutions, along with the truncation sequence.

4.4. Locally convex example

For many problems of interest, we do not have global convexity. Consider the double well potential (4.3), but in the case of paths,

$\begin{equation} \Phi(v) = { \epsilon^{-1}}\int_0^1\frac{1}{4} (4-v(t)^2)^2dt. \end{equation}$

(4.11)

Then,

$\begin{align*} \Phi'(v + m_0 + \xi)& = { \epsilon}^{-1}\left[{(v + m_0 + \xi)^3 - 4 (v + m_0 + \xi)}\right]\\ \Phi''(v + m_0 + \xi) & = { \epsilon}^{-1}\left[{3 (v + m_0 + \xi)^2 - 4}\right],\\ \mathbb{E}[\Phi'(v+m_0 + \xi)]& = { \epsilon}^{-1}\left[{(v+m_0)^3 + 3 t(1-t) (v+m_0)-4(v+m_0)}\right]\\ \mathbb{E}[\Phi''(v+m_0 + \xi)]& = { \epsilon}^{-1}\left[{3(v+m_0)^2 + 3 t(1-t) -4}\right] \end{align*}$

Here, we take $m_- = 0$ , $m_+ = 2$ , and $\epsilon = 0.01$ . We have plotted the numerically solved ODE in Figure 4. Also plotted is $\mathbb{E}[\Phi''(v_\star +m_0+ \xi)]$ . Note that $\mathbb{E}[\Phi''(v_\star +m_0+ \xi)]$ is not sign definite, becoming as small as $-400$ . Since $C_0$ has $\lambda_1 = 1/\pi^2 \approx 0.101$ , (3.15) cannot apply.

Figure 4. The numerically computed solution to (4.8) in the case of the double well, (4.11), $m_\star$ , and the associated $\mathbb{E}^{\nu_0}[\Phi''(m_\star + \xi)]$ .

Discretizing the Schrödinger operator

$\begin{equation} J''(v_\star) = -\frac{d^2}{dt^2} + { \epsilon}^{-1}\left({3(v_\star(t)+m_0(t))^2 + 3 t(1-t) -4}\right), \end{equation}$

(4.12)

we numerically compute the eigenvalues. From the spectrum plotted in Figure 5, we see that the minimal eigenvalue of $J''(m_\star)$ is approximately $\mu_1\approx 550$ . Therefore,

$\begin{equation} \left\langle {J''(v_\star)u},{u}\right\rangle\geq \mu_1 \left \|{u}\right\|^2_{L^2}\Rightarrow \left\langle {J''(v)u},{u}\right\rangle\geq \alpha\left \|{u}\right\|_{H^1}^2, \end{equation}$

(4.13)

for some $\alpha > 0$ and all $v$ in some neighborhood of $v_\star$ . For an appropriately selected fixed trust region, the algorithm will converge.

Figure 5. The numerically computed spectrum for (4.12), associated with the $m_\star$ shown in Figure 4. Also shown is the numerically computed spectrum for the path $m(t) = 2t^2$ , which introduces negative eigenvalues.

However, we can show that the convexity condition is not global. Consider the path $m(t) = 2t^2$ , which satisfies the boundary conditions. As shown in Figure 5, this path induces negative eigenvalues.

Despite this, we still observe convergence. Using the fixed trust region

$\begin{equation} U_1 = \left\{{x\in H^1_0(0,1)\mid \left \|{x}\right\|_{H^1}\leq 100}\right\}, \end{equation}$

(4.14)

we obtain the results in Figure 6. Again, the convergence is robust to discretization.

Figure 6. The mean paths computed for (4.11) at different resolutions, along with the truncation sequence.


5. Discussion

We have shown that the Robbins-Monro algorithm, with both fixed and expanding trust regions, can be applied to Hilbert space valued problems, adapting the finite dimensional proof of ^[12]. We have also constructed sufficient conditions for which the relative entropy minimization problem fits within this framework.

One problem we did not address here was how to identify fixed trust regions. Indeed, that requires a tremendous amount of a priori information that is almost certainly not available. We interpret that result as a local convergence result that gives a theoretical basis for applying the algorithm. In practice, since the root is likely unknown, one might run some numerical experiments to identify a reasonable trust region, or just use expanding trust regions. The practitioner will find that the algorithm converges to a solution, though perhaps not the one originally envisioned. A more sophisticated analysis may address the convergence to a set of roots, while being agnostic as to which zero is found.

Another problem we did not address was how to optimize not just the mean, but also the covariance in the Gaussian. As discussed in ^[15], it is necessary to parameterize the covariance in some way, which will be application specific. Thus, while the form of the first variation of relative entropy with respect to the mean, (3.7), is quite generic, the corresponding expression for the covariance will be specific to the covariance parameterization. Additional constraints are also necessary to guarantee that the parameters always induce a covariance operator. We leave such specialization as future work.

Acknowledgments

This work was supported by US Department of Energy Award DE-SC0012733. This work was completed under US National Science Foundation Grant DMS-1818716. The authors would like to thank J. Lelong for helpful comments, along with anonymous reviewers whose reports significantly impacted our work.

Conflict of interest

The authors declare that there are no conflicts of interest in this paper.

References

[1]	Van Huylenbroeck G, Vandermeulen V, Mettepenningen E, et al. (2007) Multifunctionality of agriculture: A review of definitions, evidence and instruments. Living Rev Landsc Res 1: 1–43. https://doi.org/10.12942/lrlr-2007-3
[2]	Rogers EM (1983) Diffusion of innovations. 3rd Ed., New York: Free Press, 15.
[3]	Hermans F, Klerkx L, Roep D (2015) Structural conditions for collaboration and learning in innovation networks: Using an innovation system performance lens to analyze agricultural knowledge systems. J Agric Educ Ext 21: 35–54. https://doi.org/10.1080/1389224X.2014.991113
[4]	Lioutas ED, Charatsari C, Černič Istenič M, et al. (2019) The challenges of setting up the evaluation of extension systems by using a systems approach: the case of Greece, Italy and Slovenia. J Agric Educ Ext 25: 139–160. https://doi.org/10.1080/1389224X.2019.1583818
[5]	Klerkx L, Van Mierlo B, Leeuwis C (2012) Evolution of systems approaches to agricultural innovation: Concepts, analysis, and interventions, In: Darnhofer I, Gibbon D, Dedieu B (Eds.), Farming Systems Research into the 21st Century: The New Dynamic, Springer Dordrecht, 457–483. https://doi.org/10.1007/978-94-007-4503-2_20
[6]	Klerkx L, Leeuwis C (2008) Balancing multiple interests: Embedding innovation intermediation in the agricultural knowledge infrastructure. Technovation 28: 364–378. https://doi.org/10.1016/j.technovation.2007.05.005
[7]	EU SCAR (2012) Agricultural knowledge, and innovation systems in transition—a reflection paper, Brussels, p.13. Available from: https://ec.europa.eu/eip/agriculture/en/publications/agricultural-knowledge-and-innovation-systems.html.
[8]	EU SCAR AKIS (2019) Preparing for Future AKIS in Europe.
[9]	Klerkx L (2020) Advisory services and transformation, plurality and disruption of agriculture and food systems: Towards a new research agenda for agricultural education and extension studies. J Agric Educ Ext 26: 131–140. https://doi.org/10.1080/1389224X.2020.1738046
[10]	Welter F (2011) Contextualizing entrepreneurship—conceptual challenges and ways forward. Entrep Theory Pract 35: 165–184. https://doi.org/10.1111/j.1540-6520.2010.00427.x
[11]	McElwee G, Smith R (2014) Chapter 14 Researching rural enterprise. In: Fayolle A (Ed.), Handbook of Research On Entrepreneurship: What We Know and What We Need to Know, Edward Elgar Publishing, 307. https://doi.org/10.4337/9780857936929.00022
[12]	Labarthe P, Laurent C (2013) Privatization of agricultural extension services in the EU: Towards a lack of adequate knowledge for small-scale farms? Food Policy 38: 240–252. https://doi.org/10.1016/j.foodpol.2012.10.005
[13]	Davis K (2019) Agricultural Extension and Education for the Future, Closing keynote in 24th European Seminar on Extension and Education. Available from: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwiRjPnbrJuBAxW3xwIHHVGeA1gQFnoECBAQAQ&url=https%3A%2F%2Fwww.reterurale.it%2Fflex%2Fcm%2Fpages%2FServeAttachment.php%2FL%2FIT%2FD%2F2%25252Fb%25252F2%25252FD.b2f20be86583c28041cc%2FP%2FBLOB%253AID%253D19744%2FE%2Fpdf&usg=AOvVaw0tuNdggviVs-Zu1iIVxlLW&opi=89978449.
[14]	EU Commission (2012) The future of food and farming.
[15]	Ashby J (2009) Fostering farmer first methodological innovation: Organizational learning and change in international agricultural research. Farmer First Revisited: Innovation Agric Res Dev 2009: 39–45.
[16]	Sutherland LA, Madureira L, Dirimanova V, et al. (2017) New knowledge networks of small-scale farmers in Europe's periphery. Land Use Policy 63: 428–439. https://doi.org/10.1016/j.landusepol.2017.01.028
[17]	EU Commission (2009) Report from the Commission to the European Parliament and the Council on the Application of the Farm Advisory System as Defined in Article 12 and 13 of Council Regulation (EC) No 73/2009. Available from: https://eurlex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2009:030:0016:0099:en:PDF.
[18]	Knierim A, Labarthe P, Laurent C, et al. (2017) Pluralism of agricultural advisory service providers—Facts and insights from Europe. J Rural Stud 55: 45–58. https://doi.org/10.1016/j.jrurstud.2017.07.018
[19]	Vecchio Y, De Castro P, Masi M, et al. (2021) Do rural development policies really help small farms? A reflection from Italy. EuroChoices 20: 75–80. https://doi.org/10.1111/1746-692X.12338
[20]	van der Ploeg JD, Barjolle D, Bruil J, et al. (2019) The economic potential of agroecology: Empirical evidence from Europe. J Rural Stud 71: 46–61. https://doi.org/10.1016/j.jrurstud.2019.09.003
[21]	Birner R, Davis K, Pender J, et al. (2009) From best practice to best fit: A framework for designing and analyzing pluralistic agricultural advisory services worldwide. J Agric Educ Ext 15: 341–355. https://doi.org/10.1080/13892240903309595
[22]	Klerkx L, Petter Stræte E, Kvam GT, et al. (2017) Achieving best-fit configurations through advisory subsystems in AKIS: case studies of advisory service provisioning for diverse types of farmers in Norway. J Agric Educ Ext 23: 213–229. https://doi.org/10.1080/1389224X.2017.1320640
[23]	Marsden T, Sonnino R (2012) Human health and wellbeing and the sustainability of urban-regional food systems. Curr Opin Environ Sustain 4: 427–430. https://doi.org/10.1016/j.cosust.2012.09.004
[24]	Landini F, Brites W, Mathot y Rebolé MI (2017) Towards a new paradigm for rural extensionists' in-service training. J Rural Stud 51: 158–167. https://doi.org/10.1016/j.jrurstud.2017.02.010
[25]	OECD (2000) Multifunctionality. Towards an Analytical Framework.
[26]	Wilson G (2008) From 'weak' to 'strong' multifunctionality: Conceptualizing farm-level multifunctional transitional pathways. J Rural Stud 24: 367–383. https://doi.org/10.1016/j.jrurstud.2007.12.010
[27]	Poppe K (2014) The role of the European innovation partnership in linking innovation and research in agricultural knowledge and innovation systems. Agriregionieuropa 10: 37.
[28]	Foti R, Nyakudya I, Moyo M, et al. (2007) Determinants of farmer demand for 'fee-for-service' extension in Zimbabwe: The case of Mashonaland Central province. J Agric Educ Ext 14: 95–104. https://doi.org/10.5191/jiaee.2007.14108
[29]	Vanni F (2014) Agriculture and Public Goods: The Role of Collective Action. https://doi.org/10.1007/978-94-007-7457-5
[30]	Acheson JM (2006) Institutional failure in resource management. Annu Rev Anthropol 35: 117–134. https://doi.org/10.1146/annurev.anthro.35.081705.123238
[31]	Andrew B (2008) Market failure, government failure and externalities in climate change mitigation: The case for a carbon tax. Public Adm Dev 28: 393–401. https://doi.org/10.1002/pad.517
[32]	Stiglitz J (2009) Government failure vs. market failure: Principles of regulation. In: Balleisen E, Moss D (Eds.), Government and Markets: Toward a New Theory of Regulation, Cambridge: Cambridge University Press, 13–51. https://doi.org/10.1017/CBO9780511657504.002
[33]	Randall A (2002) Valuing the outputs of multifunctional agriculture. Eur Rev Agric Econ 29: 289–307. https://doi.org/10.1093/eurrag/29.3.289
[34]	Eastwood C, Klerkx L, Nettle R (2017) Dynamics and distribution of public and private research and extension roles for technological innovation and diffusion: Case studies of the implementation and adaptation of precision farming technologies. J Rural Stud 49: 1–12. https://doi.org/10.1016/j.jrurstud.2016.11.008
[35]	Faure G, Desjeux Y, Gasselin P (2012) New challenges in agricultural advisory services from a research perspective: A literature review, synthesis and research agenda. J Agric Educ Ext 18: 461–492. https://doi.org/10.1080/1389224X.2012.707063
[36]	Vargo SL, Lusch RF (2017) Service-dominant logic 2025. Int J Res Mark 34: 46–67. https://doi.org/10.1016/j.ijresmar.2016.11.001
[37]	Prager K, Labarthe P, Caggiano M, et al. (2016) How does commercialization impact on the provision of farm advisory services? Evidence from Belgium, Italy, Ireland and the UK. Land Use Policy 52: 329–344. https://doi.org/10.1016/j.landusepol.2015.12.024
[38]	Ward JH (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc 58: 236–244. https://doi.org/10.1080/01621459.1963.10500845
[39]	Pigford AAE, Hickey GM, Klerkx L (2018) Beyond agricultural innovation systems? Exploring an agricultural innovation ecosystems approach for niche design and development in sustainability transitions. Agric Syst 164: 116–121. https://doi.org/10.1016/j.agsy.2018.04.007
[40]	Van der Ploeg JD, Marsden T (2008) Unfolding webs: The dynamics of regional rural development van Gorcum, Assen.
[41]	Renting H, Rossing WAH, Groot JCJ, et al. (2009) Exploring multifunctional agriculture. A review of conceptual approaches and prospects for an integrative transitional framework. J Environ Manage 90: S112–S123. https://doi.org/10.1016/j.jenvman.2008.11.014

© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
