Convergence and quasi-optimality based on an adaptive finite element method for the bilinear optimal control problem

Zuliang Lu; Xiankui Wu; Fei Huang; Fei Cai; Chunjuan Hou; Yin Yang

doi:10.3934/math.2021553

AIMS Mathematics

2021, Volume 6, Issue 9: 9510-9535. doi: 10.3934/math.2021553


Research article


1. Key Laboratory for Nonlinear Science and System Structure, Chongqing Three Gorges University, Chongqing, 404000, China
2. Center for Mathematics and Economics, Tianjin University of Finance and Economics, Tianjin, 300222, China
3. Guangzhou Huashang College, Guangzhou 511300, China
4. School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan, China

Received: 05 March 2021 Accepted: 24 May 2021 Published: 24 June 2021
MSC : 49J20, 65M08

This paper investigates the adaptive finite element method for an optimal control problem governed by a bilinear elliptic equation. We establish the finite element discrete scheme for the bilinear optimal control problem and use a dual argument, linearization method, bubble function, and new bubble function to obtain a posteriori error estimates. To prove the convergence and the quasi-optimality for adaptive finite element methods, we introduce the adaptive finite element algorithm, local perturbation, error reduction, discrete local upper bound, Dörfler property, dual argument method, and quasi orthogonality. A few numerical examples are given at the end of the paper to demonstrate our theoretical analysis.


Citation: Zuliang Lu, Xiankui Wu, Fei Huang, Fei Cai, Chunjuan Hou, Yin Yang. Convergence and quasi-optimality based on an adaptive finite element method for the bilinear optimal control problem[J]. AIMS Mathematics, 2021, 6(9): 9510-9535. doi: 10.3934/math.2021553



1. Introduction

Bilinear optimal control problems are a quintessential class of optimal control problems governed by partial differential equations, and they have been investigated in material mechanics, engineering mechanics, engineering design, and related fields ^[9,11,22]. Effective numerical methods, such as the finite element methods in ^[1,12,21], the spectral Galerkin methods in ^[4,5], and the fast algorithms in ^[6], are the key to applying optimal control problems successfully in practice. Accordingly, the study of efficient numerical algorithms for the bilinear optimal control problem has both theoretical value and practical application prospects.

The finite element method is frequently applied to solve optimal control problems; it models the actual physical system by mathematical approximation. It replaces a complex problem with a simpler one by assuming that the solution on each element has a suitable, simple approximate form, and then derives conditions that the approximate solution must satisfy over the whole solution domain ^[10,23]. Since the actual problem is replaced by a simpler one, the resulting solution is approximate rather than exact. Because the finite element method offers high computational accuracy and can handle a variety of complex shapes, it has become an effective tool for engineering analysis.

Babuška first proposed the adaptive finite element method ^[2]. The adaptive finite element method is an accurate and efficient discretization technique that saves considerable computing time while guaranteeing a prescribed accuracy. Appropriately chosen grids can greatly reduce the errors caused by the finite element discretization when coping with an optimal control problem. In general, exact solutions of optimal control problems for nonlinear systems are not available. Given the complexity and diversity of nonlinear equations, it is practical to solve them by means of the adaptive finite element method.

Adaptive finite element methods have been popular for many years, but a theoretical analysis of the whole algorithm has only recently been established. The early work on the subject is due to Dörfler ^[13], who investigated an adaptive finite element method for Poisson's equation and proved a reduction of the energy error under a mild assumption on the initial grid $\mathcal{T}_{h_0}$ . Later, Demlow, Morin, et al. ^[14,24,25] proved the convergence of adaptive finite element methods without this mild assumption. Another decisive issue of adaptive finite element methods was then investigated by Binev et al. ^[3]. In the last decades, much work has been done on the concept of the total error, that is, the sum of the energy error and the oscillation ^[8,12,26]. Recently, Liu and Yan ^[22] found that adaptive finite element methods can be applied successfully to constrained optimal control problems. Later investigations can be found in ^[15,19]. Inspired by these works, Gong and Yan ^[18] investigated the convergence and quasi-optimality of an adaptive finite element method for control-constrained optimal control problems by applying the variational control discretization. Later, Leng and Chen ^[20] studied the convergence of an adaptive finite element method for optimal control problems with integral control constraints.

The rest of the paper is organized as follows. Section 2 focuses on the bilinear optimal control problem with integral control constraints; the control is approximated by piecewise constant functions, and the state and the co-state are approximated by continuous piecewise linear functions. Section 3 derives a posteriori error estimates; the convergence and the quasi-optimality are proved by relying on quasi-orthogonality and a discrete local upper bound. Section 4, based on a mild assumption on the initial grid, proves the convergence and the quasi-optimality by employing the solution operator of nonlinear elliptic equations. Section 5 presents some numerical illustrations.

We now collect the notation used in this paper. Let $\Omega$ be a bounded Lipschitz domain in $\mathbb{R}^2$ and let $\partial\Omega$ denote the boundary of $\Omega$ . We adopt the standard notation $W^{m, q}(\omega)$ for Sobolev spaces with norm $\|\cdot\|_{m, q, \omega}$ and seminorm $|\cdot|_{m, q, \omega}$ , where $\omega\subset\Omega$ ; we omit the subscript $\omega$ if $\omega = \Omega$ . We denote $W^{m, 2}(\Omega)$ by $H^m(\Omega)$ and set $H^1_0(\Omega) = \{v\in H^1(\Omega):v = 0\ \mathrm{on}\ \partial\Omega\}$ . For $m = 0 \ \mathrm{and}\ q = 2$ , we denote $W^{0, 2}(\omega) = L^2(\omega)$ and set $\|\cdot\|_{0, 2, \omega} = \|\cdot\|_{0, \omega}$ . Let $\mathcal{T}_{h_0}$ denote the initial partition of $\bar{\Omega}$ into disjoint triangles. By newest-vertex bisection of $\mathcal{T}_{h_0}$ , we obtain a class $\mathbb{T}$ of conforming partitions. For $\mathcal{T}_h, \tilde{\mathcal{T}}_h \in \mathbb{T}$ , we write $\mathcal{T}_h\subset\tilde{\mathcal{T}}_h$ to indicate that $\tilde{\mathcal{T}}_h$ is a refinement of $\mathcal{T}_h$ . For each element $T$ we set $h_{T} = |T|^{1/2}$ . We denote the $L^2$ inner product by $(\cdot, \cdot)$ , and let $C$ be a generic constant independent of the grid size.

2. Residual-based a posteriori error estimates

In this paper, we focus our attention on the following bilinear convex optimal control problem:

$\begin{align} &\min\limits_{u\in U_{ad}}\bigg\{\frac{1}{2}\|y-y_{d}\|^2_{0}+\frac{\alpha}{2}\|u\|^2_{0}\bigg\}, \end{align}$

(2.1)

$\begin{align} &-\Delta y+uy = f,\quad \mathrm{in} \ \Omega,\quad y|_{\partial\Omega} = 0, \end{align}$

(2.2)

where $\Omega$ and $\Omega_U$ are bounded open sets in $\mathbb{R}^2$ with Lipschitz boundaries $\partial\Omega$ and $\partial\Omega_U$ , respectively. Let $f\in L^2(\Omega)$ , and let $U_{ad}$ be the closed convex set defined by

$U_{ad} = \{u\in L^2(\Omega_U):\int_{\Omega_U}u\geq0\}.$

We take the state space $V = H_0^1(\Omega)$ , the control space $U = L^2(\Omega_U)$ , and $H = L^2(\Omega)$ to formulate the finite element approximation of the bilinear optimal control problem (2.1)-(2.2). To give a weak formulation of the state equation, we first introduce the forms

$\begin{align*} &a(y,v) = \int_\Omega\nabla y\cdot\nabla vdx,\quad\forall\ y,v\in V,\\ &(f_1,f_2) = \int_\Omega f_1f_2dx,\quad\forall\ (f_1,f_2)\in H\times H,\\ &(u,v)_U = \int_{\Omega_U}uvdx,\quad\forall\ (u,v)\in U\times U. \end{align*}$

We set the norm $||v||_a = \sqrt{a(v, v)}$ , which is equivalent to $||v||_1$ on $V$ . There exist constants $c$ and $C$ such that

$\begin{equation} a(v,v)\geq c||v||^2_V,\quad a(y,v)\leq C||y||_V||v||_V,\quad\forall\ y,v\in V. \end{equation}$

(2.3)

Then the bilinear optimal control problem (2.1)-(2.2) can be restated as

$\begin{align} &\min\limits_{u\in U_{ad}}\bigg\{\frac{1}{2}\|y-y_{d}\|^2_{0}+\frac{\alpha}{2}\|u\|^2_{0}\bigg\}, \end{align}$

(2.4)

$\begin{align} &a(y,v)+(uy,v) = (f,v),\quad \forall \ v\in V. \end{align}$

(2.5)

It is well known that the optimal control problem (2.4)-(2.5) has a solution $(y, u)$ , as proved in Theorem 2.2.4 of ^[22], and that if a pair $(y, u)$ is a solution of (2.4)-(2.5), then there exists a co-state $p\in V$ such that the triplet $(y, u, p)$ satisfies the following optimality conditions:

$\begin{align} &a(y,v)+(uy,v) = (f,v),\quad\forall v\in V, \end{align}$

(2.6)

$\begin{align} &a(q,p)+(up,q) = (y-y_d,q),\quad\forall q\in V, \end{align}$

(2.7)

$\begin{align} &(\alpha u-yp,v-u)_U\geq0,\quad\forall v\in U_{ad}\subset U. \end{align}$

(2.8)
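For $\alpha > 0$ , the variational inequality (2.8) is equivalent to saying that $u$ is the $L^2$ projection of $yp/\alpha$ onto $U_{ad}$ , and since $U_{ad}$ is a half-space this projection only shifts the function by a constant when the integral constraint is violated. The following Python sketch illustrates this for a piecewise constant function; the function name `project_onto_uad` and the representation of a function by one value per element with element areas as weights are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def project_onto_uad(w, areas):
    """L2 projection of a piecewise constant w onto {v : sum(areas * v) >= 0}.

    w[i] is the value on element i and areas[i] is the area of element i
    (hypothetical data layout chosen for this sketch).
    """
    w = np.asarray(w, dtype=float)
    areas = np.asarray(areas, dtype=float)
    integral = float(np.dot(areas, w))
    if integral >= 0.0:
        # The constraint is inactive, so w is its own projection.
        return w.copy()
    # Constraint active: subtract the mean value so the integral becomes zero.
    return w - integral / areas.sum()

areas = np.array([0.5, 0.25, 0.25])
w = np.array([-2.0, 1.0, 1.0])       # integral is -0.5, so w is infeasible
u = project_onto_uad(w, areas)
```

In this example the projected control has zero integral, which corresponds to the active-constraint case of (2.8).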

For (2.6), owing to the coercivity of $a(\cdot, \cdot)$ , we can define a solution operator $S:L^2(\Omega)\rightarrow H_{0}^1(\Omega)$ such that $Su = y$ . For (2.7), let $S^*$ be the adjoint operator of $S$ ; for given $y-y_d$ we have $S^*(y-y_{d}) = p$ . Let $V_h$ be the continuous piecewise linear finite element space for the partition $\mathcal{T}_h\in \mathbb{T}$ . We define $U^h$ as the piecewise constant finite element space on $\mathcal{T}_h$ . Set $U_{ad}^h = \{v_h\in U^h:\int_{\Omega}v_h\geq 0\}$ ; then the standard finite element discretization of the bilinear optimal control problem reads

$\begin{align} &\min\limits_{u_h\in U_{ad}^h}\bigg\{\frac{1}{2}\|y_h-y_{d}\|^2_{0}+\frac{\alpha}{2}\|u_h\|^2_{0}\bigg\}, \end{align}$

(2.9)

$\begin{align} &a(y_h,v)+(u_hy_h,v) = (f,v),\quad \forall \ v\in V_h. \end{align}$

(2.10)
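To make the structure of the discrete state equation (2.10) concrete, the following Python sketch assembles and solves a one-dimensional analogue, $-y''+uy = f$ on $(0,1)$ with homogeneous Dirichlet conditions, using continuous piecewise linear functions for the state and a piecewise constant control. The reduction to one dimension, the uniform mesh, the constant right-hand side, and the name `solve_state_1d` are all simplifying assumptions for illustration; the paper itself works on triangulations of a two-dimensional domain.

```python
import numpy as np

def solve_state_1d(u_cells, f_value=1.0):
    """P1 finite elements for -y'' + u y = f on (0, 1), y(0) = y(1) = 0.

    u_cells[e] is the (piecewise constant) control on cell e of a uniform mesh;
    f is taken to be the constant f_value (assumption of this sketch).
    """
    u_cells = np.asarray(u_cells, dtype=float)
    n = u_cells.size                      # number of cells
    h = 1.0 / n
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    k_loc = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h      # local stiffness: a(., .)
    m_loc = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0  # local mass: (., .)
    for e in range(n):
        idx = [e, e + 1]
        # The bilinear term (u_h y_h, v) contributes u_cells[e] times the mass matrix.
        A[np.ix_(idx, idx)] += k_loc + u_cells[e] * m_loc
        b[idx] += f_value * h / 2.0
    y = np.zeros(n + 1)                   # boundary values stay zero
    y[1:n] = np.linalg.solve(A[1:n, 1:n], b[1:n])
    return y

y_zero_control = solve_state_1d(np.zeros(8))
y_positive_control = solve_state_1d(np.full(8, 10.0))
```

With $u_h = 0$ the nodal values coincide with those of $x(1-x)/2$ , and a positive control damps the state, which is the effect of the bilinear term $u_hy_h$ .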

It is well known that the optimal control problem (2.9)-(2.10) has a solution $(y_h, u_h)$ and that if a pair $(y_h, u_h)\in V_h\times U_{ad}^h$ is the solution of (2.9)-(2.10), then there is a co-state $p_h\in V_h$ such that the triplet $(y_h, u_h, p_h)$ satisfies the following optimality conditions

$\begin{align} &a(y_h,v)+(u_hy_h,v) = (f,v),\quad\forall v\in V_h\subset V, \end{align}$

(2.11)

$\begin{align} &a(q,p_h)+(u_hp_h,q) = (y_h-y_d,q),\quad\forall q\in V_h\subset V, \end{align}$

(2.12)

$\begin{align} &(\alpha u_h-y_hp_h,v_h-u_h)_U\geq0,\quad\forall v_h\in U_{ad}^h\subset U. \end{align}$

(2.13)

Similarly, we define an operator $S_{h}$ : $U_{ad}^{h} \; \rightarrow \; V_{h}$ such that $S_hu_{h} = y_{h}$ . Let $S_{h}^*$ be the adjoint operator of $S_{h}$ such that $S_{h}^*(y_{h}-y_{d}) = p_{h}$ . We now introduce the error indicators that are used frequently in this paper, namely the error estimator $\eta(\cdot)$ and the oscillation $osc(\cdot)$ . For $\mathcal{T}_h\in \mathbb{T}, \ T\in \mathcal{T}_h$ , they are defined by:

$\begin{align*} &\eta_{1,\mathcal{T}_h}^2(p_{h},T) = h_{T}^2\|\nabla p_{h}\|_{0,T}^2,\\ &\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},T) = h_{T}^2\|f-u_hy_h\|_{0,T}^2+h_{T}\|[\nabla{y_{h}}]\cdot {\bf{n}}\|^2_{0,\partial T \backslash\partial\Omega},\\ &\eta^2_{3,\mathcal{T}_h}(y_{h},p_{h},T) = h_{T}^2\|y_{h}-y_{d}-u_hp_{h}\|_{0,T}^2+h_{T}\|[\nabla{p_{h}}]\cdot{\bf{n}}\|^2_{0,\partial T\backslash\partial\Omega},\\ &osc^2_{\mathcal{T}_h}(f,T) = h_{T}^2\|f-f_{T}\|_{0,T}^2,\\ &osc_{\mathcal{T}_h}^2(y_{h}-y_{d},T) = h_{T}^2\|(y_{h}-y_{d})-(y_{h}-y_{d})_T\|_{0,T}^2, \end{align*}$

where $u_{h}\in U_{ad}^{h}$ and $y_{h}, p_{h}\in V_{h}$ , and $f_{T}$ is the $L^2$ -projection of $f$ onto the piecewise constant space on $T$ , that is, $f_T = \frac{\int_Tf}{|T|}$ . For $\omega\subset \mathcal{T}_h$ , we set

$\begin{align*} &\eta^2_{1,\mathcal{T}_h}(p_{h},\omega) = \sum\limits_{T\in \omega}\eta^2_{1,\mathcal{T}_h}(p_{h},T),\\ &osc_{\mathcal{T}_h}^2(f,\omega) = \sum\limits_{T\in\omega}osc^2_{\mathcal{T}_h}(f,T). \end{align*}$

The quantities $\eta_{2, \mathcal{T}_h}^2(u_{h}, y_{h}, \omega), \ \eta_{3, \mathcal{T}_h}^2(y_{h}, p_{h}, \omega)$ and $osc^2_{\mathcal{T}_h}(y_{h}-y_{d}, \omega)$ are defined analogously.
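As an illustration of how such an indicator is evaluated on a single element, the Python sketch below computes $\eta_{1,\mathcal{T}_h}^2(p_h,T) = h_T^2\|\nabla p_h\|_{0,T}^2$ for a piecewise linear $p_h$ on one triangle. Since $\nabla p_h$ is constant on $T$ and $h_T^2 = |T|$ , the indicator reduces to $|T|^2|\nabla p_h|^2$ . The helper names and the representation of a triangle by its three vertices and three nodal values are assumptions made for this sketch.

```python
import numpy as np

def triangle_area(vertices):
    """Area of the triangle with the given three vertices."""
    v = np.asarray(vertices, dtype=float)
    d1, d2 = v[1] - v[0], v[2] - v[0]
    return 0.5 * abs(d1[0] * d2[1] - d1[1] * d2[0])

def p1_gradient(vertices, nodal_values):
    """Constant gradient of the linear function with the given nodal values."""
    v = np.asarray(vertices, dtype=float)
    p = np.asarray(nodal_values, dtype=float)
    # Rows of B are the edge vectors; B @ g equals the nodal differences.
    B = np.array([v[1] - v[0], v[2] - v[0]])
    return np.linalg.solve(B, p[1:] - p[0])

def eta1_squared(vertices, nodal_values):
    """eta_1^2 = h_T^2 * ||grad p_h||_{0,T}^2 with h_T = |T|^(1/2)."""
    area = triangle_area(vertices)
    g = p1_gradient(vertices, nodal_values)
    # h_T^2 = area, and ||grad p_h||_{0,T}^2 = area * |g|^2 for constant g.
    return area * (area * float(g @ g))

reference_triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [0.0, 1.0, 2.0]          # nodal values of p_h(x, y) = x + 2 y
indicator = eta1_squared(reference_triangle, values)
```

The jump terms in $\eta_2$ and $\eta_3$ additionally require neighbour information across element sides and are not covered by this sketch.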

Before we discuss residual-based a posteriori error estimates for the bilinear optimal control problem, we recall a well-known result for the integral averaging operator ^[12].

Lemma 2.1. Let $\pi_h^a:L^1(\Omega_U)\rightarrow U^h\subset U$ be the integral averaging operator such that

$\begin{equation*} (\pi_h^av)|_T: = \frac{1}{|T|}\int_{T}v,\quad\forall\ T\in\mathcal{T}_h. \end{equation*}$

Then for $m = 0, 1$ and $1\leq q\leq\infty$ , there holds

$\begin{equation} ||v-\pi_h^av||_{0,q,T}\leq Ch_T^m|v|_{m,q,T},\quad\forall\ v\in W^{m,q}(\Omega_U). \end{equation}$

(2.14)
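The first-order behaviour in (2.14) for $m = 1$ , $q = 2$ can be observed numerically. The Python sketch below does so in a one-dimensional analogue: it computes $\|v-\pi_h^av\|_0$ on uniform meshes of $(0,1)$ for the sample function $v(x) = x^2$ with Gauss-Legendre quadrature. The one-dimensional setting, the choice of $v$ , and the function name `averaging_error` are assumptions made for illustration only.

```python
import numpy as np

def averaging_error(v, n_cells, n_quad=4):
    """L2 norm of v - pi_h^a v on a uniform mesh of (0, 1) with n_cells cells."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    h = 1.0 / n_cells
    err_sq = 0.0
    for k in range(n_cells):
        x = (k + 0.5) * h + 0.5 * h * nodes   # quadrature points in cell k
        w = 0.5 * h * weights                 # quadrature weights scaled to cell k
        mean = np.dot(w, v(x)) / h            # cell average (pi_h^a v) on cell k
        err_sq += np.dot(w, (v(x) - mean) ** 2)
    return np.sqrt(err_sq)

v = lambda x: x ** 2
err_coarse = averaging_error(v, 8)
err_fine = averaging_error(v, 16)
ratio = err_coarse / err_fine                 # close to 2 for first-order decay
```

Halving the mesh size roughly halves the error, which is the behaviour predicted by (2.14) with $m = 1$ .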

We next discuss the residual-based a posteriori error estimates for the bilinear optimal control problem. To this end, we introduce the following auxiliary problems

$\begin{align} &a(y^h,v)+(u_hy^h,v) = (f,v),\quad\forall\ v\in V, \end{align}$

(2.15)

$\begin{align} &a(q,p^h)+(u_hp^h,q) = (y^h-y_d,q),\quad\forall\ q\in V. \end{align}$

(2.16)

We now state the theorem on a posteriori error estimates.

Theorem 2.1. Let $(y, u, p)\in H_0^1(\Omega) \times U_{ad}\times H_0^1(\Omega)$ be the solution of problem (2.6)-(2.8), and let $(y_h, u_h, p_h)\in V_h\times U_{ad}^h\times V_h$ be the solution of problem (2.11)-(2.13). Then there exist constants $c$ and $C$ such that

$\begin{align} &||u-u_h||_0^2+||y-y_h||_a^2+||p-p_h||_a^2\\ \leq& c(\eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{T}_h)+\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},\mathcal{T}_h)+\eta_{3,\mathcal{T}_h}^2(y_{h},p_{h},\mathcal{T}_h)), \end{align}$

(2.17)

and

$\begin{align} &C(\eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{T}_h)+\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},\mathcal{T}_h)+\eta_{3,\mathcal{T}_h}^2(y_{h}, p_{h},\mathcal{T}_h))\\ \leq&||u-u_h||_0^2+||y-y_h||_a^2+||p-p_h||_a^2+osc_{\mathcal{T}_h}^2(f,\mathcal{T}_h)+osc_{\mathcal{T}_h}^2(y_{h}-y_{d},\mathcal{T}_h). \end{align}$

(2.18)

Proof. By applying Lemma 2.1, we infer that

$\sum\limits_{T\in \mathcal{T}_h}||p_h-\pi_hp_h||_{0,T}^2\leq C\eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{T}_h),$

where $\pi_h$ is the $L^2$ -projection operator onto the piecewise constant space on $\mathcal{T}_h$ . Therefore, it follows from Theorem 3.1 of ^[16] that

$\begin{align} ||u-u_h||_{0}^2\leq&C\sum\limits_{T\in \mathcal{T}_h}||p_h-\pi_hp_h||_{0,T}^2+C\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},\mathcal{T}_h)+C\eta^2_{3,\mathcal{T}_h}(y_{h},p_{h},\mathcal{T}_h)\\ \leq&C(\eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{T}_h)+\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},\mathcal{T}_h)+\eta^2_{3,\mathcal{T}_h}(y_{h},p_{h},\mathcal{T}_h)). \end{align}$

(2.19)

Let $e^p = p^h-p_h$ and $e_I^p = \hat{\pi}_he^p$ , where $\hat{\pi}_h$ is the average interpolation operator defined in Lemma 3.2 of ^[17]. Then we obtain

$\begin{align*} c||p^h-p_h||_a^2\leq&(\nabla e^p,\nabla(p^h-p_h))+(u_h(p^h-p_h),e^p)\\ = &(\nabla(e^p-e_I^p),\nabla(p^h-p_h))+(u_h(p^h-p_h),e^p-e_I^p)\\&+(\nabla e_I^p,\nabla(p^h-p_h))+(u_h(p^h-p_h),e_I^p)\\ = &\sum\limits_{T\in \mathcal{T}_h}\int_T(y_h-y_d-u_hp_h)(e^p-e_I^p)-\sum\limits_{\partial T\backslash\partial\Omega}\int_{\partial T}([\nabla p_h]\cdot{\bf{n}})(e^p-e_I^p)+(y^h-y_h,e^p)\\ \leq&C(\sigma)\sum\limits_{T\in \mathcal{T}_h}h_T^2\int_T(y_h-y_d-u_hp_h)^2+C(\sigma)\sum\limits_{\partial T\backslash\partial\Omega}h_T\int_{\partial T}([\nabla p_h]\cdot{\bf{n}})^2\\&+C\sigma\sum\limits_{T\in \mathcal{T}_h}h_T^{-2}\int_T|e^p-e_I^p|^2+C\sigma\sum\limits_{\partial T\backslash\partial\Omega}h_T^{-1}\int_{\partial T}|e^p-e_I^p|^2+C||y^h-y_h||_0||e^p||_a\\ \leq&C(\sigma)\sum\limits_{T\in \mathcal{T}_h}h_T^2\int_T(y_h-y_d-u_hp_h)^2+C(\sigma)\sum\limits_{\partial T\backslash\partial\Omega}h_T\int_{\partial T}([\nabla p_h]\cdot{\bf{n}})^2\\&+C(\sigma)||y_h-y^h||_0^2+C\sigma||e^p||_a^2, \end{align*}$

where we have used the embedding $||v||_{0, 4}\leq C||v||_1$ from ^[7] together with the bound $||v||_1\leq C$ . Then, choosing $\sigma = \frac{c}{2C}$ , we obtain

$\begin{align} ||p_h-p^h||_a^2\leq&C\sum\limits_{T\in \mathcal{T}_h}h_T^2\int_T(y_h-y_d-u_hp_h)^2\\ &+C\sum\limits_{\partial T\backslash\partial\Omega}h_T\int_{\partial T}([\nabla p_h]\cdot{\bf{n}})^2+C||y_h-y^h||_0^2. \end{align}$

(2.20)

Similarly, let $e^y = y^h-y_h$ , and $e_I^y$ be the average interpolation of $e^y$ . It follows from (2.11) and (2.15) that

$\begin{align*} c||y^h-y_h||_a^2\leq&(\nabla(y^h-y_h),\nabla e^y)+(u_h(y^h-y_h),e^y)\\ = &(\nabla(y^h-y_h),\nabla(e^y-e^y_I))+(u_h(y^h-y_h),e^y-e^y_I)\\ = &\sum\limits_{T\in \mathcal{T}_h}\int_T(f-u_hy_h)(e^y-e^y_I)-\sum\limits_{\partial T\backslash\partial\Omega}\int_{\partial T}([\nabla y_h]\cdot{\bf{n}})(e^y-e^y_I)\\ \leq&C(\sigma)\sum\limits_{T\in \mathcal{T}_h}h_T^2\int_T(f-u_hy_h)^2+C(\sigma)\sum\limits_{\partial T\backslash\partial\Omega}h_T\int_{\partial T}([\nabla y_h]\cdot{\bf{n}})^2+C\sigma||e^y||_a^2. \end{align*}$

Thus, it holds that

$\begin{align} ||y_h-y^h||_a^2\leq C\sum\limits_{T\in \mathcal{T}_h}h_T^2\int_T(f-u_hy_h)^2+C\sum\limits_{\partial T\backslash\partial\Omega}h_T\int_{\partial T}([\nabla y_h]\cdot{\bf{n}})^2. \end{align}$

(2.21)

Note that

$\begin{align} & ||y-y_h||_a\leq||y^h-y_h||_a+||y-y^h||_a, \end{align}$

(2.22)

$\begin{align} &||p-p_h||_a\leq||p^h-p_h||_a+||p-p^h||_a, \end{align}$

(2.23)

$\begin{align} &||p-p^h||_a^2+||y-y^h||_a^2\leq C||u-u_h||_0^2. \end{align}$

(2.24)

Hence, (2.17) follows from (2.19)-(2.21) and (2.22)-(2.24).

To derive the a posteriori lower error bounds for the optimal control problem governed by the bilinear elliptic equation, we use the standard bubble function technique (see ^[1,17,27]). Let $\{\lambda_1, \lambda_2, \lambda_3\}$ denote the barycentric coordinates of $T$ , and let $b_T = \lambda_1\lambda_2\lambda_3$ be the standard third-order polynomial bubble on $T$ . Since $y_h$ is piecewise linear, we set

$w_T = c_1((y_h-y_d)_T-u_hp_h)b_T^2,$

where $c_1 = \frac{\int_Tb_T^2}{|T|}$ is a positive constant. Then

$||w_T||_{0,T}^2 = \int_Tc_1^2((y_h-y_d)_T-u_hp_h)^2b_T^4\leq c_1\int_T((y_h-y_d)_T-u_hp_h)^2.$

Since $\lambda_1\lambda_2\lambda_3 = 0$ on $\partial T$ , we have

$w_T|_{\partial T\backslash\partial\Omega} = c_1((y_h-y_d)_T-u_hp_h)(\lambda_1\lambda_2\lambda_3)^2|_{\partial T} = 0.$

Moreover, we obtain

$\nabla w_T|_{\partial T\backslash\partial\Omega} = 2c_1((y_h-y_d)_T-u_hp_h)(\lambda_1\lambda_2\lambda_3)\nabla(\lambda_1\lambda_2\lambda_3)|_{\partial T} = 0,$

thus we can conclude that $w_T\in H^2_0(T)$ . Using the results from ^[1,27], we derive that

$|w_T|_{0,2,T}^2\leq Ch_T^{-4}\int_T|w_T(T)|^2.$

Let $\hat{T}$ be a reference element and let $\hat{X} = F_T(X) = X+b_T$ be an affine map from $T$ onto $\hat{T}$ ; we also set $\hat{w} = w\circ F_T^{-1}(\hat{X})$ . Thus we conclude that

$|w_T|_{0,T}^2 = \int_{\hat{T}}|\hat{w}_T(\hat{X})|^2.$

Since $w_T\in H^2_0(T)$ , we obtain that $\hat{w}_T\in H^2_0(\hat{T})$ . Using the Poincaré inequality, we deduce that

$\int_{\hat{T}}|\hat{w}_T(\hat{X})|^2\leq C\sum\limits_{|a| = 2}\int_{\hat{T}}|D^a\hat{w}_T(\hat{X})|^2,$

such that

$|w_T|_{0,T}^2\leq Ch_T^4|w_T|_{0,2,T}^2.$

Thus we can obtain that

$\begin{align} ch_T^{-2}||w_T||_{0,T}^2\leq|w_T|_{0,2,T}^2\leq Ch_T^{-2}||w_T||_{0,T}^2,\quad\forall\ T\in\mathcal{T}_h. \end{align}$

(2.25)

Then, using the bubble function $w_T$ together with (2.25), we obtain

$\begin{align*} &\int_Th_T^2((y_h-y_d)_T-u_hp_h)^2 = \int_Th_T^2((y_h-y_d)_T-u_hp_h)w_T\\ \leq&\int_T(y_h-y_d-u_hp_h-(y-y_d-up))w_T\\&+C(\sigma)\int_T(y_h-y_d-(y_h-y_d)_T)^2+C\sigma h_T^{-2}||w_T||_{0,T}^2\\ = &-\int_T\nabla(p_h-p)\nabla w_T+\int_T(y_h-y)w_T+\int_T(u_hp_h-up)w_T\\ &+C(\sigma)\int_T(y_h-y_d-(y_h-y_d)_T)^2+C\sigma h_T^{-2}||w_T||_{0,T}^2\\ \leq&C(\sigma)||p_h-p||_{a,T}^2+C(\sigma)||y_h-y||_{0,T}^2+C(\sigma)||u_h-u||_{0,T}^2\\&+C(\sigma)\int_Th_T^2(y_h-y_d-(y_h-y_d)_T)^2+C\sigma h_T^{-2}||w_T||_{0,T}^2. \end{align*}$

Writing $u_hp_h-up = u_h(p_h-p)+(u_h-u)p$ , we obtain

$\begin{align} &\int_Th_T^2(y_h-y_d-u_hp_h)^2\\ \leq&C\int_Th_T^2((y_h-y_d)_T-u_hp_h)^2+C\int_Th_T^2(y_h-y_d-(y_h-y_d)_T)^2\\ \leq&C||p_h-p||_{a,T}^2+C||y_h-y||_{0,T}^2+C||u_h-u||_{0,T}^2+C\int_Th_T^2(y_h-y_d-(y_h-y_d)_T)^2. \end{align}$

(2.26)

Then, by using the Schwarz inequality, it follows from (2.25) that

$\begin{align*} \int_{\partial T}h_{T}([\nabla{p_{h}}]\cdot{\bf{n}})^2 = &\int_{\partial T}([\nabla{p_{h}}]\cdot{\bf{n}})w_{\partial T} = \int_{\partial T}([\nabla{p_{h}}]\cdot{\bf{n}}-[\nabla{p}]\cdot{\bf{n}})w_{\partial T}\\ = &\int_{\partial T}\nabla(p_h-p)\nabla w_{\partial T}+\int_{\partial T}(y-y_d-up)w_{\partial T}\\\leq&C(\sigma)||p_h-p||_{a,\partial T\backslash\partial\Omega}^2+\int_{\partial T}(y-y_h)w_{\partial T}\\&+\int_{\partial T}(y_h-y_d-u_hp_h)w_{\partial T}+\int_{\partial T}(up-u_hp_h)w_{\partial T}\\\leq&C(\sigma)||p_h-p||_{a,\partial T\backslash\partial\Omega}^2+C(\sigma)||y-y_h||_{0,\partial T\backslash\partial\Omega}^2+C(\sigma)||u-u_h||_{0,\partial T\backslash\partial\Omega}^2\\&+C(\sigma)\int_{\partial T}(y_h-y_d-u_hp_h)^2+C\sigma(||w_{\partial T}||_{a,\partial T\backslash\partial\Omega}^2+h_T^{-2}||w_{\partial T}||_{0,\partial T\backslash\partial\Omega}^2), \end{align*}$

where we have used the splitting $up-u_hp_h = (u-u_h)p+u_h(p-p_h)$ . Then there holds

$\begin{align} \int_{\partial T}h_{T}([\nabla{p_{h}}]\cdot{\bf{n}})^2\leq&C||p_h-p||_{a,\partial T\backslash\partial\Omega}^2+C||y-y_h||_{0,\partial T\backslash\partial\Omega}^2\\ &+C||u-u_h||_{0,\partial T\backslash\partial\Omega}^2+C\int_{\partial T}(y_h-y_d-u_hp_h)^2. \end{align}$

(2.27)

Applying (2.26)-(2.27) and the Poincaré inequality, we obtain

$\begin{align} \eta^2_{3,\mathcal{T}_h}(y_{h},p_{h},\mathcal{T}_h) = &\sum\limits_{T\in\mathcal{T}_h}\int_Th_T^2(y_h-y_d-u_hp_h)^2+\sum\limits_{\partial T\backslash\partial\Omega}\int_{\partial T}h_T([\nabla{p_{h}}]\cdot{\bf{n}})^2\\ \leq&C||p-p_h||_a^2+C||y-y_h||_a^2+C||u-u_h||_0^2+Cosc_{\mathcal{T}_h}^2(y_{h}-y_{d},\mathcal{T}_h). \end{align}$

(2.28)

Similarly, we also derive that

$\begin{align} \eta^2_{2,\mathcal{T}_h}(u_{h},y_{h},\mathcal{T}_h) = &\sum\limits_{T\in\mathcal{T}_h}\int_Th_T^2(f-u_hy_h)^2+\sum\limits_{\partial T\backslash\partial\Omega}\int_{\partial T}h_T([\nabla{y_{h}}]\cdot{\bf{n}})^2\\ \leq&C||p-p_h||_a^2+C||y-y_h||_a^2+C||u-u_h||_0^2+Cosc_{\mathcal{T}_h}^2(f,\mathcal{T}_h). \end{align}$

(2.29)

Arguing as in Lemma 3.6 of ^[17], we can derive the same result for bilinear elliptic equations:

$||p_h-\pi_hp_h||_{0,T}^2\leq C||u-u_h||_{0,T}^2+C||p-p_h||_{0,T}^2.$

From Lemma 2.1, we infer that

$||p_h-\pi_hp_h||_{0,T}^2\leq Ch_T^2||\nabla p_h||_{0,T}^2.$

Then, using the inverse estimate, we infer that

$\begin{equation} \eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{T}_h) = \sum\limits_{T\in\mathcal{T}_h}h_T^2||\nabla p_h||_{0,T}^2\leq C\sum\limits_{T\in\mathcal{T}_h}||p_h-\pi_hp_h||_{0,T}^2\leq C||u-u_h||_0^2+C||p-p_h||_a^2, \end{equation}$

(2.30)

and hence, (2.18) follows from (2.28)-(2.30).

Theorem 2.1 gives reliable and efficient a posteriori error estimates. We now introduce the adaptive finite element algorithm that is the main object of study in this paper.

Algorithm 2.1. Adaptive finite element algorithm for bilinear optimal control problems:

(0) Given an initial mesh $\mathcal{T}_{h_0}$ , construct the finite element spaces $U_{ad}^{h_0}$ and $V_{h_0}$ . Select a marking parameter $0 < \theta\leq 1$ and set $k: = 0$ .

(1) Solve the discrete bilinear optimal control problem (2.11)-(2.13), then obtain an approximate solution $(y_{h_k}, u_{h_k}, p_{h_k})$ with respect to $\mathcal{T}_{h_k}$ .

(2) Compute the local error estimator $\eta_{\mathcal{T}_{h_k}}(T)$ for all $T\in\mathcal{T}_{h_k}$ .

(3) Select a minimal subset $\mathcal{M}_{h_k}$ of $\mathcal{T}_{h_k}$ such that

$\eta_{\mathcal{T}_{h_k}}^2(\mathcal{M}_{h_k})\geq\theta\eta_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}),$

where $\eta_{\mathcal{T}_{h_k}}^2(\omega) = \eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \omega)+ \eta_{2, \mathcal{T}_{h_k}}^2(u_{h_k}, y_{h_k}, \omega)+\eta_{3, \mathcal{T}_{h_k}}^2(y_{h_k}, p_{h_k}, \omega)$ for all $\omega\subset\mathcal{T}_{h_k}$ .

(4) Refine each element of $\mathcal{M}_{h_k}$ by bisecting it $b\geq 1$ times in passing from $\mathcal{T}_{h_k}$ to $\mathcal{T}_{h_{k+1}}$ ; in general, additional elements are refined in the process to ensure that $\mathcal{T}_{h_{k+1}}$ is conforming.

(5) Solve the discrete bilinear optimal control problem (2.11)-(2.13), then obtain approximate solution $(y_{h_{k+1}}, u_{h_{k+1}}, p_{h_{k+1}})$ with respect to $\mathcal{T}_{h_{k+1}}$ .

(6) Set $k = k+1$ and go to step (2).
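The marking step (3) of Algorithm 2.1 can be sketched in Python as follows. The sketch sorts the squared local indicators in decreasing order and takes the shortest prefix whose sum reaches $\theta$ times the total, which yields a subset of minimal cardinality. The function name `doerfler_mark` and the representation of the indicators as a flat array with one entry per element are assumptions made for illustration; the paper does not prescribe an implementation.

```python
import numpy as np

def doerfler_mark(eta_sq, theta):
    """Indices of a minimal set M with sum(eta_sq[M]) >= theta * sum(eta_sq).

    eta_sq[i] is the squared local estimator of element i (hypothetical layout).
    """
    eta_sq = np.asarray(eta_sq, dtype=float)
    order = np.argsort(-eta_sq, kind="stable")     # largest indicators first
    cumulative = np.cumsum(eta_sq[order])
    target = theta * cumulative[-1]
    # Position of the first partial sum that reaches the target.
    count = int(np.searchsorted(cumulative, target, side="left")) + 1
    return order[:min(count, eta_sq.size)]

eta_sq = np.array([0.1, 0.4, 0.05, 0.45])
marked = doerfler_mark(eta_sq, theta=0.5)
```

Here the two elements with the largest indicators are marked, since the largest one alone carries less than half of the total estimator. Sorting costs $O(N\log N)$ operations; approximate bucket-based variants are sometimes used when linear complexity is required.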

3. Convergence analysis

In this section, we first recall the relevant theorem of ^[24], where the authors construct a simple and efficient adaptive finite element method that ensures a reduction rate of the data oscillation, together with an error reduction based on a posteriori error estimators. In addition, we introduce the interior node property and the marking of the data oscillation. To derive the residual-type a posteriori error estimates ^[21], we employ the following lemmas.

Lemma 3.1. For all $v\in H^1(\Omega)$ , $\mathcal{T}_h\in\mathbb{T}, T\in\mathcal{T}_h$ , there holds

$\begin{equation*} ||v||_{0,\partial T\backslash\partial\Omega}\leq Ch_T^{-1/2}||v||_{0,T}+Ch_T^{1/2}||v||_{1,T}. \end{equation*}$

Here we state the following local perturbation property.

Lemma 3.2. For $\mathcal{T}_h\in \mathbb{T}, \ T\in \mathcal{T}_h$ , let $u_{h_1}, u_{h_2}\in U_{ad}^h, \ y_{h_1}, y_{h_2}, p_{h_1}, p_{h_2}\in V_{h}$ , we have

$\begin{align} \eta_{1,\mathcal{T}_h}(p_{h_1},T)-\eta_{1,\mathcal{T}_h}(p_{h_2},T)\leq& C h_{T}\|p_{h_1}-p_{h_2}\|_{a,T}, \end{align}$

(3.1)

$\begin{align} \eta_{2,\mathcal{T}_h}(u_{h_1},y_{h_1},T)-\eta_{2,\mathcal{T}_h}(u_{h_2},y_{h_2},T)\leq& C(h_T\|u_{h_1}-u_{h_2}\|_{0,T}+h_T\|p_{h_1}-p_{h_2}\|_{0,T}\\ &+\|y_{h_1}-y_{h_2}\|_{a,\omega_T}), \end{align}$

(3.2)

$\begin{align} \eta_{3,\mathcal{T}_h}(y_{h_1},p_{h_1},T)-\eta_{3,\mathcal{T}_h}(y_{h_2},p_{h_2},T)\leq& C(h_T\|u_{h_1}-u_{h_2}\|_{0,T}+h_T\|y_{h_1}-y_{h_2}\|_{0,T}\\ &+\|p_{h_1}-p_{h_2}\|_{a,\omega_T}), \end{align}$

(3.3)

$\begin{align} osc_{\mathcal{T}_h}(y_{h_1}-y_{d},T)-osc_{\mathcal{T}_h}(y_{h_2}-y_{d},T)\leq& C h_{T}^2\|y_{h_1}-y_{h_2}\|_{a,T}. \end{align}$

(3.4)

Proof. We only give the proof of (3.2) because the proofs of (3.1), (3.3) and (3.4) are similar. By the definition of $\eta_{2, \mathcal{T}_h}(u_h, y_h, T)$ , we deduce that

$\begin{align} \eta_{2,\mathcal{T}_h}(u_{h_1},y_{h_1},T)\leq&\eta_{2,\mathcal{T}_h}(u_{h_2},y_{h_2},T)+h_T^{1/2}||[\nabla(y_{h_1}-y_{h_2})]\cdot{\bf{n}}||_{0,\partial T\backslash\partial\Omega}\\ &+h_T||u_{h_1}p_{h_1}-u_{h_2}p_{h_2}||_{0,T}\\ \leq&\eta_{2,\mathcal{T}_h}(u_{h_2},y_{h_2},T)+h_T^{1/2}||[\nabla(y_{h_1}-y_{h_2})]\cdot{\bf{n}}||_{0,\partial T\backslash\partial\Omega}\\ &+Ch_T||u_{h_1}-u_{h_2}||_{0,T}+Ch_T||p_{h_1}-p_{h_2}||_{0,T}. \end{align}$

(3.5)

With the help of inverse estimates and Lemma 3.1, we infer that

$\begin{equation} ||[\nabla(y_{h_1}-y_{h_2})]\cdot{\bf{n}}||_{0,\partial T\backslash\partial\Omega}\leq Ch_T^{-1/2}||y_{h_1}-y_{h_2}||_{a,\omega_T}. \end{equation}$

(3.6)

Therefore (3.2) follows from (3.5)-(3.6).

Using a method similar to that of ^[24], we prove the following result.

Lemma 3.3. Let $\mathcal{T}_h\subset\tilde{\mathcal{T}}_h$ for $\mathcal{T}_h, \tilde{\mathcal{T}}_h\in\mathbb{T}$ , and let $\mathcal{M}_h\subset\mathcal{T}_h$ denote the set of elements which are marked in passing from $\mathcal{T}_h$ to $\tilde{\mathcal{T}}_h$ . Then for $u_h\in U_{ad}^h, \ \tilde{u}_h \in U_{ad}^{\tilde{h}}, \ y_h, p_h\in V_{h}, \ \tilde{y}_h, \tilde{p}_h\in V_{\tilde{h}}$ and any $\delta, \delta_{1}\in(0, 1]$ , we have

$\begin{align} &\quad\ \eta_{1,\tilde{\mathcal{T}}_h}^2(\tilde{p}_h,\tilde{\mathcal{T}}_h)-(1+\delta_{1})\bigg\{\eta_{1,\mathcal{T}_h}^2(p_h,\mathcal{T}_h)-(1-2^{-1/2})\eta_{1,\mathcal{T}_h}^2(p_h,\mathcal{R}_h)\bigg\}\\ &\leq C\left(1+\delta_{1}^{-1}\right)h_{0}^2\|p_h-\tilde{p}_h\|_{a}^2, \end{align}$

(3.7)

and

$\begin{align} &\quad\ \eta_{2,\tilde{\mathcal{T}}_h}^2(\tilde{u}_h,\tilde{y}_h,\tilde{\mathcal{T}}_h)-(1+\delta)\bigg\{\eta_{2,\mathcal{T}_h}^2(u_h,y_h,\mathcal{T}_h) -\lambda\eta_{2,\mathcal{T}_h}^2(u_h,y_h,\mathcal{M}_h)\bigg\}\\ &\leq C(1+\delta^{-1})\left(h_{0}^2\|u_h-\tilde{u}_h\|_{0}^2+h_{0}^2\|p_h-\tilde{p}_h\|_{0}^2+\|y_h-\tilde{y}_h\|_{a}^2\right), \end{align}$

(3.8)

and

$\begin{align} &\quad\ \eta_{3,\tilde{\mathcal{T}}_h}^2(\tilde{y}_h,\tilde{p}_h,\tilde{\mathcal{T}}_h)-(1+\delta)\bigg\{\eta_{3,\mathcal{T}_h}^2(y_h,p_h,\mathcal{T}_h) -\lambda\eta_{3,\mathcal{T}_h}^2(y_h,p_h,\mathcal{M}_h)\bigg\}\\ &\leq C(1+\delta^{-1})\bigg(h_{0}^2\|u_h-\tilde{u}_h\|_{0}^2+h_{0}^2\|y_h-\tilde{y}_h\|_{0}^2+\|p_h-\tilde{p}_h\|_{a}^2\bigg), \end{align}$

(3.9)

and

$\begin{align} osc_{\mathcal{T}_h}^2(y_h-y_{d},\mathcal{T}_h\cap\tilde{\mathcal{T}}_h)-2osc_{\tilde{\mathcal{T}}_h}^2(\tilde{y}_h-y_{d},\mathcal{T}_h\cap\tilde{\mathcal{T}}_h)\leq 2Ch_{0}^4\|y_h-\tilde{y}_h\|_{a}^2, \end{align}$

(3.10)

where $\lambda = 1-2^{-\frac{b}{2}}, \ h_{0} = \max\limits_{{T\in \mathcal{T}_{h_{0}}}}h_{T}$ and $\mathcal{R}_h$ denotes the set of elements which are refined from $\mathcal{T}_h$ to $\tilde{\mathcal{T}}_h$ .

Proof. We only give the proofs of (3.7)-(3.8) and (3.10), since (3.9) can be proved in the same way as (3.8). Combining (3.1) of Lemma 3.2 with Young's inequality yields

$\begin{align} \eta_{1,\tilde{\mathcal{T}}_h}^2(\tilde{p}_h,\tilde{\mathcal{T}}_h)\leq&\eta_{1,\tilde{\mathcal{T}}_h}^2(p_h,\tilde{\mathcal{T}}_h) +Ch_T^2||p_h-\tilde{p}_h||_{a,T}^2\\ &+C\delta_1\eta_{1,\tilde{\mathcal{T}}_h}^2(p_h,\tilde{\mathcal{T}}_h)+C\delta_1^{-1}h_T^2||p_h-\tilde{p}_h||_{a,T}^2. \end{align}$

(3.11)

Note that each $T\in\mathcal{R}_h\subset\mathcal{T}_h$ is bisected at least once, so that

$\sum\limits_{T'\in\tilde{\mathcal{T}}_h,\,T'\subset T}\eta_{1,\tilde{\mathcal{T}}_h}(p_h,T')^2\leq2^{-1/2}\eta_{1,\mathcal{T}_h}(p_h,T)^2.$

For $T\in\mathcal{T}_h\backslash\mathcal{R}_h$ , we obtain

$\eta_{1,\tilde{\mathcal{T}}_h}(p_h,T)^2 = \eta_{1,\mathcal{T}_h}(p_h,T)^2.$

Thus we can conclude that

$\begin{align} \eta_{1,\tilde{\mathcal{T}}_h}(p_h,\tilde{\mathcal{T}}_h)^2 = &\eta_{1,\tilde{\mathcal{T}}_h}(p_h,\mathcal{R}_h)^2 +\eta_{1,\tilde{\mathcal{T}}_h}(p_h,\mathcal{T}_h\backslash\mathcal{R}_h)^2\\ \leq&\eta_{1,\mathcal{T}_h}(p_h,\mathcal{T}_h)^2-(1-2^{-1/2}) \eta_{1,\mathcal{T}_h}(p_h,\mathcal{R}_h)^2. \end{align}$

(3.12)

Hence, (3.7) follows from (3.11)-(3.12).

From Lemma 3.2 and the Young's inequality with parameter $\delta$ , we deduce that

$\begin{align} \eta_{2,\tilde{\mathcal{T}_h}}^2(\tilde{u}_h,\tilde{y}_h,\tilde{\mathcal{T}_h})\leq&\eta_{2,\tilde{\mathcal{T}_h}}^2(u_h,y_h,\tilde{\mathcal{T}_h}) +Ch_T^2(||\tilde{u}_h-u_h||_0^2+||\tilde{p}_h-p_h||_0^2)\\ &+C\delta\eta_{2,\tilde{\mathcal{T}_h}}^2(u_h,y_h,\tilde{\mathcal{T}_h})+C\delta^{-1}h_T^2(||\tilde{u}_h-u_h||_0^2+||\tilde{p}_h-p_h||_0^2)\\ &+||\tilde{y}_h-y_h||_a^2+C\delta^{-1}||\tilde{y}_h-y_h||_a^2. \end{align}$

(3.13)

Given a marked element $T'\in \mathcal{M}_h$ , let $\tilde{\mathcal{T}}_{T'} = \{T\in\tilde{\mathcal{T}}_h:T\subset T'\}$ . For $y_h\in V_h\subset V_{\tilde{h}}$ , the jump satisfies $[\nabla y_h] = 0$ on the interior sides of $\cup\tilde{\mathcal{T}}_{T'}$ . Let $b$ be the number of bisections; then $h_T = |T|^{1/2}\leq(2^{-b}|T'|)^{1/2}\leq2^{-b/2}h_{T'}$ for $T\in\tilde{\mathcal{T}}_{T'}$ , and it holds that

$\sum\limits_{T\in\tilde{\mathcal{T}}_{T'}}\eta_{2,\tilde{\mathcal{T}}_h}^2(u_h,y_h,T)\leq2^{-b/2}\eta_{2,\mathcal{T}_h}^2(u_h,y_h,T').$

When $T\in\mathcal{T}_h\backslash\mathcal{M}_h$ , we check that

$\eta_{2,\tilde{\mathcal{T}}_h}^2(u_h,y_h,T)\leq\eta_{2,\mathcal{T}_h}^2(u_h,y_h,T).$

Thus we conclude that

$\begin{align} \eta_{2,\tilde{\mathcal{T}}_h}^2(u_h,y_h,\tilde{\mathcal{T}}_h) = &\eta_{2,\tilde{\mathcal{T}}_h}^2(u_h,y_h,\mathcal{M}_h) +\eta_{2,\tilde{\mathcal{T}}_h}^2(u_h,y_h,\mathcal{T}_h\backslash\mathcal{M}_h)\\ \leq&\eta_{2,\mathcal{T}_h}^2(u_h,y_h,\mathcal{T}_h) -(1-2^{-b/2})\eta_{2,\mathcal{T}_h}^2(u_h,y_h,\mathcal{M}_h). \end{align}$

(3.14)

And hence, (3.8) follows from (3.13)-(3.14).

Clearly, for $T\in\mathcal{T}_h\cap\tilde{\mathcal{T}}_h$ , we have

$\begin{equation} osc_{\mathcal{T}_h}(y_h-y_d,T) = osc_{\tilde{\mathcal{T}}_h}(y_h-y_d,T). \end{equation}$

(3.15)

From (3.4) of Lemma 3.2 and Young's inequality, we obtain

$\begin{equation} osc_{\tilde{\mathcal{T}}_h}^2(\tilde{y}_h-y_d,T)\leq2osc_{\tilde{\mathcal{T}}_h}^2(y_h-y_d,T)+2h_T^4||\tilde{y}_h-y_h||_{a,T}^2. \end{equation}$

(3.16)

Substituting (3.15) into (3.16), with the roles of $y_h$ and $\tilde{y}_h$ interchanged, and summing over $T\in\mathcal{T}_h\cap\tilde{\mathcal{T}}_h$ , we conclude that (3.10) holds.

One difficulty in the convergence proof is the lack of orthogonality, so we have to prove a quasi-orthogonality property instead. For $\mathcal{T}_{h_i}, \mathcal{T}_{h_{i+1}}\in\mathbb{T}$ with $\mathcal{T}_{h_i}\subset\mathcal{T}_{h_{i+1}}$ , we start from the following basic identities:

$\begin{align} &\|u-u_{h_{i+1}}\|_{0}^2 = \|u-u_{h_i}\|_{0}^2-\|u_{h_i}-u_{h_{i+1}}\|_{0}^2-2(u-u_{h_{i+1}},u_{h_{i+1}}-u_{h_i}), \end{align}$

(3.17)

$\begin{align} &\|y-y_{h_{i+1}}\|_{a}^2 = \|y-y_{h_i}\|_{a}^2-\|y_{h_i}-y_{h_{i+1}}\|_{a}^2-2a(y-y_{h_{i+1}},y_{h_{i+1}}-y_{h_i}), \end{align}$

(3.18)

$\begin{align} &\|p-p_{h_{i+1}}\|_{a}^2 = \|p-p_{h_i}\|_{a}^2-\|p_{h_i}-p_{h_{i+1}}\|_{a}^2-2a(p-p_{h_{i+1}},p_{h_{i+1}}-p_{h_i}), \end{align}$

(3.19)

where $(y, u, p)$ is the solution of (2.6)-(2.8), and $(y_{h_i}, u_{h_i}, p_{h_i})$ and $(y_{h_{i+1}}, u_{h_{i+1}}, p_{h_{i+1}})$ are the solutions of (2.11)-(2.13) with respect to $\mathcal{T}_{h_i}$ and $\mathcal{T}_{h_{i+1}}$ , respectively.
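The identities (3.17)-(3.19) are purely algebraic: they hold in any inner product space, independently of the optimal control problem. As a small sanity check, the sketch below verifies (3.17) for random vectors with the Euclidean inner product standing in for the $L^2$ inner product. The helper names `dot`, `sub` and `identity_gap` are illustrative only and do not come from the paper.

```python
import random

def dot(a, b):
    """Euclidean inner product, used here as a stand-in for (., .)."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def identity_gap(u, u_i, u_next):
    """Return left-hand side minus right-hand side of identity (3.17)."""
    lhs = dot(sub(u, u_next), sub(u, u_next))
    rhs = (dot(sub(u, u_i), sub(u, u_i))
           - dot(sub(u_i, u_next), sub(u_i, u_next))
           - 2.0 * dot(sub(u, u_next), sub(u_next, u_i)))
    return lhs - rhs

random.seed(0)
n = 50
# u plays the role of the exact control, u_i and u_next of two discrete controls.
u, u_i, u_next = ([random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(3))
gap = identity_gap(u, u_i, u_next)
print(abs(gap) < 1e-10)
```

The quasi-orthogonality results that follow control the cross term $2(u-u_{h_{i+1}},u_{h_{i+1}}-u_{h_i})$ , which does not vanish here because Galerkin orthogonality is not available.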

Lemma 3.4. For $\mathcal{T}_{h_i}, \mathcal{T}_{h_{i+1}}\in\mathbb{T}$ with $\mathcal{T}_{h_i}\subset\mathcal{T}_{h_{i+1}}$ , there hold

$\begin{align} &\quad\ (1-\delta)\|u-u_{h_{i+1}}\|_{0}^2-\|u-u_{h_i}\|_{0}^2+\|u_{h_i}-u_{h_{i+1}}\|_{0}^2\\ &\leq C\delta^{-1}\left(\eta_{1,\mathcal{T}_{h_i}}^2(p_{h_i},\mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2,\mathcal{T}_{h_i}}^2(u_{h_i},y_{h_i},\mathcal{R}_h) +\eta_{3,\mathcal{T}_{h_i}}^2(y_{h_i},p_{h_i},\mathcal{R}_h)\big)\right), \end{align}$

(3.20)

and

$\begin{align} &\quad\ (1-\delta)\|y-y_{h_{i+1}}\|_{a}^2-\|y-y_{h_i}\|_a^2+\|y_{h_i}-y_{h_{i+1}}\|_a^2-\delta\|u-u_{h_{i+1}}\|_{0}^2-\delta\|p-p_{h_{i+1}}\|_{0}^2\\ &\leq C\delta^{-1}\left(\eta_{1,\mathcal{T}_{h_i}}^2(p_{h_i},\mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2,\mathcal{T}_{h_i}}^2(u_{h_i},y_{h_i},\mathcal{R}_h) +\eta_{3,\mathcal{T}_{h_i}}^2(y_{h_i},p_{h_i},\mathcal{R}_h)\big)\right), \end{align}$

(3.21)

and

$\begin{align} &\quad\ (1-\delta)\|p-p_{h_{i+1}}\|_{a}^2-\|p-p_{h_i}\|_{a}^2+\|p_{h_i}-p_{h_{i+1}}\|_{a}^2-\delta\|u-u_{h_{i+1}}\|_{0}^2-\delta\|y-y_{h_{i+1}}\|_{0}^2\\ &\leq C\delta^{-1}\left(\eta_{1,\mathcal{T}_{h_i}}^2(p_{h_i},\mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2,\mathcal{T}_{h_i}}^2(u_{h_i},y_{h_i},\mathcal{R}_h) +\eta_{3,\mathcal{T}_{h_i}}^2(y_{h_i},p_{h_i},\mathcal{R}_h)\big)\right), \end{align}$

(3.22)

where, following the idea of ^[28], we introduce the quantity $\mathcal{J}^2(h)$ defined by

$\mathcal{J}^2(h) = \sup\limits_{f\in L^2(\Omega),||f||_0 = 1}\inf\limits_{v_h\in V_h}||Sf-v_h||_a.$

Proof. For $\mathcal{T}_{h_{i+1}}$ and from (2.8), we obtain

$\begin{equation} (\alpha u_{h_{i+1}}-y_{h_{i+1}}p_{h_{i+1}},v_h-u_{h_{i+1}})\geq0,\quad\forall\ v_h\in U_{ad}^{h_{i+1}}. \end{equation}$

(3.23)

Since $U_{ad}^{h_{i}}\subset U_{ad}^{h_{i+1}}$ , we may take $v_h = u_{h_i}$ in (3.23), which gives

$\begin{align} \alpha||u_{h_{i+1}}-u_{h_{i}}||_0^2 = &(\alpha u_{h_{i+1}},u_{h_{i+1}}-u_{h_{i}})-(\alpha u_{h_{i}},u_{h_{i+1}}-u_{h_{i}})\\ \leq&(y_{h_{i+1}}p_{h_{i+1}},u_{h_{i+1}}-u_{h_{i}})-(\alpha u_{h_{i}},u_{h_{i+1}}-u_{h_{i}})\\ = &(y_{h_{i+1}}p_{h_{i+1}}-y_{h_{i}}p_{h_{i}},u_{h_{i+1}}-u_{h_{i}})+(\alpha u_{h_{i}}-y_{h_{i}}p_{h_{i}},u_{h_{i}}-u_{h_{i+1}})\\ = &(y_{h_{i+1}}(p_{h_{i+1}}-p_{h_{i}}),u_{h_{i+1}}-u_{h_{i}})+(p_{h_{i}}(y_{h_{i+1}}-y_{h_{i}}),u_{h_{i+1}}-u_{h_{i}})\\ &+(\alpha u_{h_{i}}-y_{h_{i}}p_{h_{i}},u_{h_{i}}-u_{h_{i+1}})\\ \leq&C(p_{h_{i+1}}-p_{h_{i}},u_{h_{i+1}}-u_{h_{i}})+C(y_{h_{i+1}}-y_{h_{i}},u_{h_{i+1}}-u_{h_{i}})\\ &+(\alpha u_{h_{i}}-y_{h_{i}}p_{h_{i}},u_{h_{i}}-u_{h_{i+1}}). \end{align}$

(3.24)

Next, we estimate the three terms on the right-hand side of (3.24) separately. First, let $\pi_{h_i}$ be the $L^2$-projection onto $U_{ad}^{h_i}$ ; then for $T\in\mathcal{T}_{h_i}\cap\mathcal{T}_{h_{i+1}}$ , we obtain

$(\pi_{h_i}u_{h_{i+1}}-u_{h_{i+1}})|_T = 0.$

Thus we infer that

$\begin{align} (\alpha u_{h_{i}}-y_{h_{i}}p_{h_{i}},u_{h_{i}}-u_{h_{i+1}})\leq&(\alpha u_{h_{i}}-y_{h_{i}}p_{h_{i}},\pi_{h_i}u_{h_{i+1}}-u_{h_{i+1}})\\ = &(p_{h_{i}}-\pi_{h_i}p_{h_{i}},\pi_{h_i}(u_{h_{i+1}}-u_{h_{i}})-(u_{h_{i+1}}-u_{h_{i}}))\\ \leq&C\eta_{1,\mathcal{T}_{h_i}}(p_{h_i},\mathcal{R}_h)||u_{h_{i}}-u_{h_{i+1}}||_0, \end{align}$

(3.25)

where $\mathcal{R}_h$ denotes the set of elements that are refined from $\mathcal{T}_{h_i}$ to $\mathcal{T}_{h_{i+1}}$ . From the definition of $S_{h}$ , we have

$\begin{align} (p_{h_{i+1}}-p_{h_{i}},u_{h_{i+1}}-u_{h_{i}}) = &(S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i+1}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d),u_{h_{i+1}}-u_{h_{i}})\\ = &(S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i+1}}-y_d)-S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d),u_{h_{i+1}}-u_{h_{i}})\\ &+(S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d),u_{h_{i+1}}-u_{h_{i}})\\ = &(S_{h_{i+1}}(u_{h_{i+1}}-u_{h_{i}}),S_{h_{i+1}}(u_{h_{i+1}}-u_{h_{i}}))\\ &+(S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d),u_{h_{i+1}}-u_{h_{i}})\\ \leq&(S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d),u_{h_{i+1}}-u_{h_{i}})\\ &+||y_{h_{i+1}}-y_{h_{i}}||_0^2. \end{align}$

(3.26)

Let $\phi_h\in H_0^1(\Omega)$ be the solution of the following problem

$a(\phi_h,q) = (S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d),q),\quad\forall\ q\in V.$

Thus we obtain that

$\begin{align} &||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0^2\\ = &a(\phi_h-\phi_{h_i},S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d))\\ &+(S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},\phi_{h_i}-\phi_h)+(S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},\phi_h), \end{align}$

(3.27)

where $\phi_{h_i}$ denotes the standard finite element approximation of $\phi_h$ in $V_{h_i}$ . It follows from Proposition 2.1 of ^[18] that, for $V_{h_i}\subset V_{h_{i+1}}$ ,

$\begin{align} &a(\phi_h-\phi_{h_i},S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d))\\ \leq&a(\phi_h-\phi_{h_i},S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i+1}}^*(S_{h_{i}}u_{h_{i}}-y_d))\\ &+a(\phi_h-\phi_{h_i},S_{h_{i+1}}^*(S_{h_{i}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d))\\ \leq&C\mathcal{J}(h_0)||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0\Big(||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_a\\ &+||S_{h_{i+1}}^*(S_{h_{i}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_a\Big), \end{align}$

(3.28)

and

$\begin{align} &(S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},\phi_{h_i}-\phi_h)\\ \leq&C\mathcal{J}(h_0)||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_0||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0, \end{align}$

(3.29)

and

$\begin{align} &(S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},\phi_h)\\ \leq&C||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_0||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0. \end{align}$

(3.30)

From Lemma 3.6 of ^[12], we obtain

$\begin{align} &||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_a\leq C\eta_{2,\mathcal{T}_{h_i}}(u_{h_i},y_{h_i},\mathcal{R}_h), \end{align}$

(3.31)

$\begin{align} &||S_{h_{i+1}}^*(S_{h_{i}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_a\leq C\eta_{3,\mathcal{T}_{h_i}}(y_{h_i},p_{h_i},\mathcal{R}_h). \end{align}$

(3.32)

Then let $\varphi_h\in H_0^1(\Omega)$ be the solution of the following problem

$a(q,\varphi_h) = (S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},q),\quad\forall\ q\in V.$

With the help of the standard duality argument, we infer that

$\begin{align} &||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_0^2\\ = &a(S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}},\varphi_h-\varphi_{h_i})\\ \leq&C\mathcal{J}(h_0)||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_a||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_0, \end{align}$

(3.33)

where $\varphi_{h_i}$ is the standard finite element approximation of $\varphi_h$ in $V_{h_i}$ . Combining (3.26)-(3.33), we obtain

$\begin{align} (p_{h_{i+1}}-p_{h_{i}},u_{h_{i+1}}-u_{h_{i}})\leq&||y_{h_{i+1}}-y_{h_i}||_0^2+C\mathcal{J}(h_0)(\eta_{2,\mathcal{T}_{h_i}}(u_{h_i},y_{h_i},\mathcal{R}_h)\\ &+\eta_{3,\mathcal{T}_{h_i}}(y_{h_i},p_{h_i},\mathcal{R}_h))||u_{h_{i}}-u_{h_{i+1}}||_0. \end{align}$

(3.34)

One easily sees that

$\begin{align} ||y_{h_{i+1}}-y_{h_i}||_0 = &||S_{h_{i+1}}u_{h_{i+1}}-S_{h_{i}}u_{h_{i}}||_0\\ \leq&||S_{h_{i+1}}u_{h_{i+1}}-S_{h_{i+1}}u_{h_{i}}||_0+||S_{h_{i+1}}u_{h_{i}}-S_{h_{i}}u_{h_{i}}||_0\\ \leq&C||u_{h_{i}}-u_{h_{i+1}}||_0+C\mathcal{J}(h_0)(\eta_{2,\mathcal{T}_{h_i}}(u_{h_i},y_{h_i},\mathcal{R}_h) +\eta_{3,\mathcal{T}_{h_i}}(y_{h_i},p_{h_i},\mathcal{R}_h)). \end{align}$

(3.35)

Hence, (3.20) follows from (3.17), (3.24), and (3.34)-(3.35).

In order to prove (3.21), we need the following estimate:

$\begin{align} 2a(y-y_{h_{i+1}},y_{h_{i+1}}-y_{h_{i}}) = &-2(up-u_{h_{i+1}}p_{h_{i+1}},y_{h_{i+1}}-y_{h_{i}})\\ = &2(u(p-p_{h_{i+1}}),y_{h_{i}}-y_{h_{i+1}})+2(p_{h_{i+1}}(u-u_{h_{i+1}}),y_{h_{i}}-y_{h_{i+1}})\\ \leq&\delta||u-u_{h_{i+1}}||_0^2+\delta||p-p_{h_{i+1}}||_0^2+\delta^{-1}||y_{h_{i+1}}-y_{h_{i}}||_0^2. \end{align}$

(3.36)

Thus we conclude that (3.21) follows from (3.18), (3.25), and (3.35)-(3.36).

Now we will give the proof of (3.22).

$\begin{align} ||p_{h_{i+1}}-p_{h_{i}}||_0 = &||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i+1}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0\\ \leq&||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i+1}}-y_d)-S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)||_0\\ &+||S_{h_{i+1}}^*(S_{h_{i+1}}u_{h_{i}}-y_d)-S_{h_{i}}^*(S_{h_{i}}u_{h_{i}}-y_d)||_0\\ \leq&C||u_{h_{i}}-u_{h_{i+1}}||_0+C\mathcal{J}(h_0)(\eta_{2,\mathcal{T}_{h_i}}(u_{h_i},y_{h_i},\mathcal{R}_h) +\eta_{3,\mathcal{T}_{h_i}}(y_{h_i},p_{h_i},\mathcal{R}_h)). \end{align}$

(3.37)

Similarly to (3.36), we obtain

$\begin{align} 2a(p-p_{h_{i+1}},p_{h_{i+1}}-p_{h_{i}}) = &-2((y-up)-(y_{h_{i+1}}-u_{h_{i+1}}p_{h_{i+1}}),p_{h_{i+1}}-p_{h_{i}})\\ = &2(u(p-p_{h_{i+1}}),p_{h_{i+1}}-p_{h_{i}})+2(p_{h_{i+1}}(u-u_{h_{i+1}}),p_{h_{i+1}}-p_{h_{i}})\\ &+(y-y_{h_{i+1}},p_{h_{i+1}}-p_{h_{i}})\\ \leq&\delta||u-u_{h_{i+1}}||_0^2+\delta||p-p_{h_{i+1}}||_0^2+\delta^{-1}||p_{h_{i+1}}-p_{h_{i}}||_0^2\\ &+\delta||y-y_{h_{i+1}}||_0^2. \end{align}$

(3.38)

Hence, (3.22) follows from (3.19), (3.25), and (3.37)-(3.38).

For $\mathcal{T}_{h_i}\in \mathbb{T}$ , let $U_{ad}^{h_{i}}$ and $V_{h_{i}}$ be the associated discrete spaces and let $(y_{h_{i}}, u_{h_{i}}, p_{h_{i}})$ be the solution of (2.11)-(2.13) with respect to $\mathcal{T}_{h_{i}}$ . We then introduce the notation

$\begin{align*} &e_{h_i}^2 = \|u-u_{h_i}\|_{0}^2+\|y-y_{h_i}\|_{a}^2+\|p-p_{h_i}\|_a^2,\\ &E_{h_i}^2 = \|u_{h_i}-u_{h_{i+1}}\|_{0}^2+\|y_{h_i}-y_{h_{i+1}}\|_{a}^2+\|p_{h_i}-p_{h_{i+1}}\|_a^2,\\ &\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\omega) = \eta_{2,\mathcal{T}_{h_i}}^2(u_{h_i},y_{h_i},\omega)+\eta_{3,\mathcal{T}_{h_i}}^2(y_{h_i},p_{h_i},\omega). \end{align*}$

We now proceed to prove the contraction property of Algorithm 2.1.

Theorem 3.1. Let $(y, u, p)\in H_0^1(\Omega)\times U_{ad}\times H_0^1(\Omega)$ be the solution of (2.6)-(2.8) and $(y_h, u_h, p_h)\in V_h\times U_{ad}^h\times V_h$ be the solution of (2.11)-(2.13) generated by the adaptive finite element Algorithm 2.1. There exist $\gamma_1 > 0, \ \gamma_2 > 0$ and $\alpha\in(0, 1)$ depending only on the shape regularity of the initial mesh $\mathcal{T}_{h_{0}}$ , $b$ , $\Omega$ and the marking parameter $\theta\in(0, 1]$ such that

$\begin{align} &e_{h_{i+1}}^2+\gamma_1\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})+\gamma_2\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})\\ \leq&\alpha(e_{h_{i}}^2+\gamma_1\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}})+\gamma_2\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_{i}})), \end{align}$

(3.39)

provided that $h_0$ is sufficiently small, with $h_0 < 1$ .

Proof. Theorem 2.1, Lemma 3.3 and Lemma 3.4 yield

$\begin{align} e_{h_i}^2\leq&C\eta_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i}), \end{align}$

(3.40)

$\begin{align} \tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})\leq&(1+\delta)\Big\{\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i}) -\lambda\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{M}_{h_i})\Big\}+C(2+\delta^{-1})E_{h_i}^2, \end{align}$

(3.41)

$\begin{align} \eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})\leq&(1+\delta_1)\Big\{\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}}) -(1-2^{-1/2})\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{R}_{h_{i}})\Big\}\\ &+C(1+\delta_1^{-1})h_0^2||p_{h_{i}}-p_{h_{i+1}}||_0^2, \end{align}$

(3.42)

$\begin{align} (1-2\delta)e_{h_{i+1}}^2\leq&e_{h_i}^2-E_{h_i}^2+C\delta^{-1}(\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{R}_{h_{i}})+\mathcal{J}^2(h_0) \tilde{\eta}_{\mathcal{T}_{h_{i}}}^2(\mathcal{T}_{h_{i}})), \end{align}$

(3.43)

where $\mathcal{R}_{h_{i}}$ denotes the set of elements which are refined from $\mathcal{T}_{h_{i}}$ to $\mathcal{T}_{h_{i+1}}$ . Combining these estimates with the upper bound of Theorem 2.1 and choosing parameters $\tilde{\gamma}_1, \tilde{\gamma}_2$ , we obtain

$\begin{align*} &(1-2\delta)e_{h_{i+1}}^2+\tilde{\gamma}_1\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})+\tilde{\gamma}_2\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})\\ \leq&e_{h_{i}}^2+\tilde{\gamma}_2(1+\delta)\Big\{\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})-\lambda\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{M}_{h_i})\Big\} +\tilde{\gamma}_2(2+\delta^{-1})E_{h_{i}}^2-E_{h_{i}}^2\\ &-(\tilde{\gamma}_1(1+\delta_1)(1-2^{-1/2})-C\delta^{-1})\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{R}_{h_i}) +\tilde{\gamma}_1(1+\delta_1)\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}})\\ &+C\delta^{-1}\mathcal{J}^2(h_0)\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})+\tilde{\gamma}_1(1+\delta^{-1})h_0^2(\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}}, \mathcal{T}_{h_{i+1}})\\ &+\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})+\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}})+\tilde{\eta}_{\mathcal{T}_{h_{i}}}^2(\mathcal{T}_{h_{i}})). \end{align*}$

We choose

$\begin{align} &\tilde{\gamma}_1(1+\delta_1)(1-2^{-1/2})-C\delta^{-1} > 0,\\ &\tilde{\gamma}_2C(2+\delta^{-1}) = 1. \end{align}$

(3.44)

Then we find that

$\begin{align*} &(1-2\delta)e_{h_{i+1}}^2+\tilde{\gamma}_1(1-C(1+\delta^{-1})h_0^2)\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})+(\tilde{\gamma}_2 -C(1+\delta^{-1})h_0^2)\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})\\ \leq&e_{h_{i}}^2+\tilde{\gamma}_1((1+\delta_1)+C(1+\delta^{-1})h_0^2)\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}}) -c(\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{M}_{h_{i}})+\eta_{\mathcal{T}_{h_{i}}}^2(\mathcal{M}_{h_{i}}))\\ &+(\tilde{\gamma}_2(1+\delta)+\tilde{\gamma}_1C(1+\delta_1^{-1})h_0^2+C\delta^{-1}\mathcal{J}^2(h_0))\tilde{\eta}_{\mathcal{T}_{h_{i}}}^2(\mathcal{T}_{h_{i}}), \end{align*}$

where $c = \min\{\tilde{\gamma}_1(1+\delta_1)(1-2^{-1/2})-C\delta^{-1}, \tilde{\gamma}_2C(2+\delta^{-1})\}$ . Then choosing $\beta\in(0, 1)$ and using the marking strategy of Algorithm 2.1 and Theorem 2.1, we derive that

$\begin{align*} &(1-2\delta)e_{h_{i+1}}^2+\tilde{\gamma}_1(1-C(1+\delta^{-1})h_0^2)\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})+(\tilde{\gamma}_2 -C(1+\delta^{-1})h_0^2)\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}})\\ \leq&(1-C\theta\beta)e_{h_{i}}^2+(\tilde{\gamma}_1((1+\delta_1)+C(1+\delta^{-1})h_0^2)-c\theta(1-\beta))\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}})\\ &+(\tilde{\gamma}_2(1+\delta)+\tilde{\gamma}_1C(1+\delta_1^{-1})h_0^2+C\delta^{-1}\mathcal{J}^2(h_0)-c\theta(1-\beta))\tilde{\eta}_{\mathcal{T}_{h_{i}}}^2(\mathcal{T}_{h_{i}}). \end{align*}$

One easily sees that

$\begin{align*} &e_{h_{i+1}}^2+\gamma_1\eta_{1,\mathcal{T}_{h_{i+1}}}^2(p_{h_{i+1}},\mathcal{T}_{h_{i+1}})+\gamma_2\tilde{\eta}_{\mathcal{T}_{h_{i+1}}}^2(\mathcal{T}_{h_{i+1}}) \leq\alpha_1e_{h_{i}}^2+\alpha_2\gamma_1\eta_{1,\mathcal{T}_{h_{i}}}^2(p_{h_{i}},\mathcal{T}_{h_{i}})+\alpha_3\gamma_2\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_{i}}), \end{align*}$

where

$\begin{align} \gamma_1 = &\frac{\tilde{\gamma}_1(1-C(1+\delta^{-1})h_0^2)}{1-2\delta}, \end{align}$

(3.45)

$\begin{align} \gamma_2 = &\frac{\tilde{\gamma}_2 -C(1+\delta^{-1})h_0^2}{1-2\delta}, \end{align}$

(3.46)

$\begin{align} \alpha_1 = &\frac{1-C\theta\beta}{1-2\delta}, \end{align}$

(3.47)

$\begin{align} \alpha_2 = &\frac{\tilde{\gamma}_1((1+\delta_1)+C(1+\delta^{-1})h_0^2)-c\theta(1-\beta)}{\tilde{\gamma}_1(1-C(1+\delta^{-1})h_0^2)}, \end{align}$

(3.48)

$\begin{align} \alpha_3 = &\frac{\tilde{\gamma}_2(1+\delta)+\tilde{\gamma}_1C(1+\delta_1^{-1})h_0^2+C\delta^{-1}\mathcal{J}^2(h_0)-c\theta(1-\beta)}{\tilde{\gamma}_2 -C(1+\delta^{-1})h_0^2}. \end{align}$

(3.49)

We choose $\delta < 1$ sufficiently small and $\beta$ such that $\alpha_1\in(0, 1)$ . After slightly rearranging (3.48)-(3.49) and using (3.44), when $h_0\ll1$ and $\delta_1, \delta$ are small enough, we can conclude that

$\alpha_2\in(0,1)\quad\mathrm{and}\quad\alpha_3\in(0,1).$

Choosing $\alpha = \max\{\alpha_1, \alpha_2, \alpha_3\}$ , we obtain (3.39), which completes the proof.
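To make the practical meaning of the contraction (3.39) concrete: if the combined quantity $q_i = e_{h_i}^2+\gamma_1\eta_{1}^2+\gamma_2\tilde{\eta}^2$ satisfies $q_{i+1}\leq\alpha q_i$ with $\alpha\in(0,1)$ , then $q_k\leq\alpha^kq_0$ , so a prescribed tolerance is reached after finitely many adaptive loops. The following sketch computes the smallest such $k$ . The function name and the sample values of $\alpha$ , $q_0$ and the tolerance are hypothetical; the theorem only guarantees the existence of some $\alpha < 1$ and gives no numerical value.

```python
import math

def iterations_to_tolerance(q0, alpha, tol):
    """Smallest integer k >= 0 with alpha**k * q0 <= tol, for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if q0 <= 0.0 or tol <= 0.0:
        raise ValueError("q0 and tol must be positive")
    if q0 <= tol:
        return 0
    k = math.ceil(math.log(tol / q0) / math.log(alpha))
    # Correct possible off-by-one effects of floating-point rounding.
    while alpha ** k * q0 > tol:
        k += 1
    while k > 0 and alpha ** (k - 1) * q0 <= tol:
        k -= 1
    return k

# Hypothetical values: contraction factor 0.5, initial quantity 1, tolerance 0.1.
loops = iterations_to_tolerance(1.0, 0.5, 0.1)
print(loops)
```

A contraction factor close to 1 still yields convergence, but the number of loops grows like $\log(\mathrm{tol}/q_0)/\log\alpha$ .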

4. Quasi-optimality analysis

In this section, we consider the quasi-optimality of the adaptive finite element method. We first introduce some notation. For $\mathcal{T}_h, \mathcal{T}_{h_1}, \mathcal{T}_{h_2}\in \mathbb{T}$ , let $\#\mathcal{T}_h$ denote the number of elements in $\mathcal{T}_h$ , and let $\mathcal{T}_{h_1}\oplus\mathcal{T}_{h_2}$ be the smallest common conforming refinement of $\mathcal{T}_{h_1}$ and $\mathcal{T}_{h_2}$ ^[12,26], which satisfies

$\begin{equation} \#(\mathcal{T}_{h_1}\oplus\mathcal{T}_{h_2})\leq \#\mathcal{T}_{h_1}+\#\mathcal{T}_{h_2}-\#\mathcal{T}_{h_0}. \end{equation}$

(4.1)

According to ^[20], a function approximation class is defined by

$\begin{align*} \mathcal{A}^s: = &\{(y,u,p,y_{d},f)\in H_{0}^1(\Omega)\times L^2(\Omega)\times H_{0}^1(\Omega)\times L^2(\Omega)\\ &\times L^2(\Omega):|(y,u,p,y_{d},f)|_{s} < +\infty\}, \end{align*}$

where

$\begin{align*} |(y,u,p,y_{d},f)|_{s}: = &\sup\limits_{N > 0}N^s\inf\limits_{\mathcal{T}_h\in\mathbb{T}_{N}}\inf\limits_{(y_{h},u_{h},p_{h})\in V_{h}\times U_{ad}^{h}\times V_{h}}\{\|u-u_{h}\|_{0}^2\\ &+\|y-y_{h}\|_{a}^2+\|p-p_{h}\|_{a}^2+osc_{\mathcal{T}_h}^2(f,\mathcal{T}_h)+osc_{\mathcal{T}_h}^2(y_{h}-y_{d},\mathcal{T}_h)\}^{\frac{1}{2}}, \end{align*}$

and

$\mathbb{T}_{N}: = \{\mathcal{T}_h\in\mathbb{T}:\#\mathcal{T}_h-\#\mathcal{T}_{h_0}\leq N\}.$

A localized upper bound is given in the following lemma, which plays an important role in proving the quasi-optimality of the adaptive finite element method.

Lemma 4.1. For $\mathcal{T}_h, \tilde{\mathcal{T}}_h\in \mathbb{T}$ with $\mathcal{T}_h\subset\tilde{\mathcal{T}}_h$ , let $\mathcal{R}_h$ be the set of refined elements from $\mathcal{T}_h$ to $\tilde{\mathcal{T}}_h$ . Let $(y_{h}, u_{h}, p_{h})$ and $(\tilde{y}_h, \tilde{u}_h, \tilde{p}_h)$ be the solutions of (2.11)-(2.13) with respect to $\mathcal{T}_h$ and $\tilde{\mathcal{T}}_h$ , respectively. Then there exists a constant $C$ , depending on the shape regularity of the initial grid $\mathcal{T}_{h_{0}}$ and $b$ , such that

$\begin{equation} \|u_{h}-\tilde{u}_h\|_{0}^2+\|y_{h}-\tilde{y}_h\|_{a}^2+\|p_{h}-\tilde{p}_h\|_{a}^2\leq C\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \end{equation}$

(4.2)

where

$\eta_{\mathcal{T}_h}^2(\mathcal{R}_h) = \eta_{1,\mathcal{T}_h}^2(p_{h},\mathcal{R}_h)+\eta_{2,\mathcal{T}_h}^2(u_{h},y_{h},\mathcal{R}_h)+\eta_{3,\mathcal{T}_h}^2(y_{h},p_{h},\mathcal{R}_h).$

Proof. From (3.26)-(3.27) and (3.34)-(3.35), we obtain

$\begin{equation} ||u_h-\tilde{u}_h||_0^2\leq C\eta_{\mathcal{T}_{h}}^2(\mathcal{R}_h). \end{equation}$

(4.3)

By using the results of Lemma 3.6 (see ^[12]), we derive that

$\begin{align} ||y_h-\tilde{y}_h||_a = &||S_hu_h-S_{\tilde{h}}\tilde{u}_h||_a\\ \leq&||S_hu_h-S_{\tilde{h}}u_h||_a+||S_{\tilde{h}}u_h-S_{\tilde{h}}\tilde{u}_h||_a\\ \leq&C\eta_{2,\mathcal{T}_h}(u_h,y_h,\mathcal{R}_h)+||u_h-\tilde{u}_h||_0. \end{align}$

(4.4)

Then, it holds that

$\begin{align} ||p_h-\tilde{p}_h||_a = &||S_h^*(S_hu_h-y_d)-S_{\tilde{h}}^*(S_{\tilde{h}}\tilde{u}_h-y_d)||_a\\ \leq&||S_h^*(S_hu_h-y_d)-S_{\tilde{h}}^*(S_{h}u_h-y_d)||_a+||S_{\tilde{h}}^*(S_{h}u_h-y_d)-S_{\tilde{h}}^*(S_{\tilde{h}}\tilde{u}_h-y_d)||_a\\ \leq&C\eta_{3,\mathcal{T}_h}(y_h,p_h,\mathcal{R}_h)+||y_h-\tilde{y}_h||_a. \end{align}$

(4.5)

And thus we conclude that (4.2) follows from (4.3)-(4.5).

The following lemma shows that, under a suitable error reduction, the error indicators on the set of refined elements satisfy a Dörfler property with respect to the coarse grid.

Lemma 4.2. Assume that the marking parameter $\theta\in(0, \theta^*)$ , where

$\theta^* = \frac{C}{2C(1+h_{0}^4)+1}.$

For $\mathcal{T}_h, \tilde{\mathcal{T}}_h\in\mathbb{T}$ and $\mathcal{T}_h\subset\tilde{\mathcal{T}}_h$ , let $(y_{h}, u_{h}, p_{h})$ and $(\tilde{y}_{h}, \tilde{u}_{h}, \tilde{p}_{h})$ be the solutions of (2.11)-(2.13) with respect to $\mathcal{T}_h$ and $\tilde{\mathcal{T}}_h$ , respectively. If

$\begin{equation} e_{\tilde{\mathcal{T}}_h}^2+osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h)\leq \mu[e_{\mathcal{T}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h)], \end{equation}$

(4.6)

is satisfied for $\mu: = \frac{1}{2}\Big(1-\frac{\theta}{\theta^*}\Big)$ , then the set $\mathcal{R}_h$ of elements which are refined from $\mathcal{T}_h$ to $\tilde{\mathcal{T}}_h$ satisfies the Dörfler property

$\eta_{\mathcal{T}_h}^2(\mathcal{R}_h)\geq\theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h),$

where

$\begin{align*} &e_{\mathcal{T}_h}^2 = \|u-u_{h}\|_{0}^2+\|y-y_{h}\|_a^2+\|p-p_{h}\|_a^2,\\ &osc_{\mathcal{T}_h}^2(\omega) = osc_{\mathcal{T}_h}^2(f,\omega)+osc_{\mathcal{T}_h}^2(y_{h}-y_{d},\omega), \end{align*}$

for $\omega\subset\mathcal{T}_h$ ; the quantities $e_{\tilde{\mathcal{T}}_h}^2$ and $osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h)$ are defined similarly.

Proof. From (4.6) and Theorem 2.1, we derive

$\begin{align*} (1-2\mu)C\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)\leq&(1-2\mu)(e_{\mathcal{T}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h))\\ \leq&e_{\mathcal{T}_h}^2-2e_{\tilde{\mathcal{T}}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h)-2osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h). \end{align*}$

Applying the triangle inequality yields

$\begin{align*} &||u-u_h||_0^2\leq2||u-\tilde{u}_h||_0^2+2||u_h-\tilde{u}_h||_0^2,\\ &||y-y_h||_a^2\leq2||y-\tilde{y}_h||_a^2+2||y_h-\tilde{y}_h||_a^2,\\ &||p-p_h||_a^2\leq2||p-\tilde{p}_h||_a^2+2||p_h-\tilde{p}_h||_a^2. \end{align*}$

Thus, from (4.2), we obtain

$\begin{equation} e_{\mathcal{T}_h}^2-2e_{\tilde{\mathcal{T}}_h}^2\leq2C\eta_{\mathcal{T}_h}^2(\mathcal{R}_h). \end{equation}$

(4.7)

From (3.10), for $T\in\mathcal{T}_h\cap\tilde{\mathcal{T}}_h$ , it is easy to see that

$\begin{equation} osc_{\mathcal{T}_h}^2(y_h-y_d,\mathcal{T}_h\cap\tilde{\mathcal{T}}_h)-2osc_{\tilde{\mathcal{T}}_h}^2 (\tilde{y}_h-y_d,\mathcal{T}_h\cap\tilde{\mathcal{T}}_h)\leq2Ch_0^4\eta_{\mathcal{T}_h}^2(\mathcal{R}_h). \end{equation}$

(4.8)

According to Remark 2.1 of ^[12], there holds for $T\in\mathcal{R}_h$

$\begin{equation} osc_{\mathcal{T}_h}(T)\leq\eta_{\mathcal{T}_h}(T). \end{equation}$

(4.9)

By adopting (4.7)-(4.9), there holds

$\begin{align*} (1-2\mu)C\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)\leq&(2C(1+h_0^4)+1)\eta_{\mathcal{T}_h}^2(\mathcal{R}_h),\\ (1-2\mu)\theta^*\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)\leq&\eta_{\mathcal{T}_h}^2(\mathcal{R}_h),\\ \theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)\leq&\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \end{align*}$

where

$\theta^* = \frac{C}{(2C(1+h_0^4)+1)}\quad\mathrm{and}\quad\theta = (1-2\mu)\theta^*.$
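The Dörfler property $\eta_{\mathcal{T}_h}^2(\mathcal{R}_h)\geq\theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)$ is usually enforced in practice by a greedy selection: sort the elements by decreasing indicator and mark until the prescribed fraction of the total is reached. The sketch below shows one common realization on a plain list of squared element indicators. It is an assumption about how such a marking step can be coded, not a transcription of Algorithm 2.1, and the function name `doerfler_mark` is hypothetical.

```python
def doerfler_mark(indicators_sq, theta):
    """Greedy Doerfler marking.

    indicators_sq: squared indicators eta_T^2, one entry per element T.
    theta: marking parameter in (0, 1].
    Returns the indices of a set M of minimal cardinality such that
    sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2.
    """
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    total = sum(indicators_sq)
    # Visit elements from largest to smallest indicator.
    order = sorted(range(len(indicators_sq)),
                   key=lambda i: indicators_sq[i], reverse=True)
    marked, accumulated = [], 0.0
    for i in order:
        if accumulated >= theta * total:
            break
        marked.append(i)
        accumulated += indicators_sq[i]
    return marked

# Four elements with hypothetical squared indicators.
marked = doerfler_mark([4.0, 1.0, 3.0, 2.0], 0.5)
print(marked)
```

Taking the largest indicators first is what makes the marked set minimal in cardinality, which is the property used when $\#\mathcal{M}_{h_i}$ is compared with the number of refined elements in the quasi-optimality argument.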

The following Lemma is essential to prove the quasi-optimality.

Lemma 4.3. Let $(y, u, p)$ be the solution of (2.6)-(2.8), and let $(\mathcal{T}_{h_i}, U_{ad}^{h_i}, V_{h_i}, y_{h_i}, u_{h_i}, p_{h_i})$ be the sequence of grids, finite element spaces and discrete solutions produced by Algorithm 2.1. Assume that the marking parameter $\theta$ satisfies the condition in Lemma 4.2. Then the following estimate is valid

$\begin{equation} \#\mathcal{M}_{h_i}\leq C\Big( C_1^{\frac{1}{2s}}|(u,y,p,y_{d},f)|_{s}^{\frac{1}{s}}\mu^{-\frac{1}{2s}}(e_{h_i}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i}))^{-\frac{1}{2s}}\Big), \end{equation}$

(4.10)

if $(u, y, p, y_{d}, f)\in \mathcal{A}^s$ .

Proof. Let $\epsilon^2: = \mu C_1^{-1}(e_{h_i}^2+osc_{\mathcal{T}_{h_{i}}}^2(\mathcal{T}_{h_{i}}))$ , where $C_1$ is the constant appearing in (4.13). Since $(u, y, p, y_{d}, f)\in\mathcal{A}^s$ , there exist $\mathcal{T}_{h_\epsilon}\in\mathbb{T}$ and $(y_\epsilon, u_\epsilon, p_\epsilon)\in V_{h_\epsilon}\times U_{ad}^{h_\epsilon}\times V_{h_\epsilon}$ such that

$\begin{align} &\#\mathcal{T}_{h_\epsilon}-\#\mathcal{T}_{h_0}\leq C|(u,y,p,y_{d},f)|_s^{1/s}\epsilon^{-1/s}, \end{align}$

(4.11)

$\begin{align} &||u-u_\epsilon||_0^2+||y-y_\epsilon||_a^2+||p-p_\epsilon||_a^2 +osc^2_{\mathcal{T}_{h_\epsilon}}(f,\mathcal{T}_{h_\epsilon})+osc^2_{\mathcal{T}_{h_\epsilon}}(y_\epsilon-y_d,\mathcal{T}_{h_\epsilon})\leq\epsilon^2. \end{align}$

(4.12)

Let $(y_*, u_*, p_*)$ be the solution of (2.11)-(2.13) with respect to $\mathcal{T}_{h_*} = \mathcal{T}_{h_\epsilon}\oplus\mathcal{T}_{h_i}$ , which is the smallest common refinement of $\mathcal{T}_{h_\epsilon}$ and $\mathcal{T}_{h_i}$ . We first prove the following estimate

$\begin{equation} e_{\mathcal{T}_{h_*}}^2+osc_{\mathcal{T}_{h_*}}^2\leq C_1(e_{\mathcal{T}_{h_\epsilon}}^2+osc_{\mathcal{T}_{h_\epsilon}}^2), \end{equation}$

(4.13)

where

$\begin{align*} &e_{\mathcal{T}_{h_\epsilon}}^2 = ||u-u_\epsilon||_0^2+||y-y_\epsilon||_a^2+||p-p_\epsilon||_a^2,\\ &osc_{\mathcal{T}_{h_\epsilon}}^2(\mathcal{T}_{h_\epsilon}) = osc_{\mathcal{T}_{h_\epsilon}}^2(f,\mathcal{T}_{h_\epsilon}) +osc_{\mathcal{T}_{h_\epsilon}}^2(y_\epsilon-y_d,\mathcal{T}_{h_\epsilon}). \end{align*}$

Young's inequality yields

$\begin{align} (u-u_*,u_*-u_\epsilon) = &(u-u_\epsilon,u_*-u_\epsilon)-(u_*-u_\epsilon,u_*-u_\epsilon)\\ \leq&(u-u_\epsilon,u_*-u_\epsilon)\\ \leq&||u-u_\epsilon||_0^2+||u_*-u_\epsilon||_0^2. \end{align}$

(4.14)

Note that there holds the similar result to $a(y-y_*, y_*-y_\epsilon)$ and $a(p-p_*, p_*-p_\epsilon)$ . Thus combining with (3.17)-(3.19) and (4.14), we find that

$\begin{align} 6(||u-u_\epsilon||_0^2+||y-y_\epsilon||_a^2+||p-p_\epsilon||_a^2)\geq&||u-u_*||_0^2+||u_*-u_\epsilon||_0^2+||y-y_*||_a^2\\ &+||y_*-y_\epsilon||_a^2+||p-p_*||_a^2+||p_*-p_\epsilon||_a^2. \end{align}$

(4.15)

Applying (3.10) with $\mathcal{T}_{h} = \tilde{\mathcal{T}}_{h} = \mathcal{T}_{h_*}$ and $y_h = y_*$ , $\tilde{y}_h = y_\epsilon$ , we obtain

$\begin{equation} osc_{\mathcal{T}_{h_*}}^2(y_*-y_d,\mathcal{T}_{h_*})-2osc_{\mathcal{T}_{h_\epsilon}}^2(y_\epsilon-y_d,\mathcal{T}_{h_\epsilon}) \leq Ch_0^4||y_*-y_\epsilon||_a^2. \end{equation}$

(4.16)

For $T'\in\mathcal{T}_{h_\epsilon}$ , let $\mathcal{T}_{T'}: = \{T\in\mathcal{T}_{h_*}:T\subset T'\}$ . There holds

$\begin{align*} \sum\limits_{T\in\mathcal{T}_{T'}}||f-f_T||_{0,T}^2 = &\sum\limits_{T\in\mathcal{T}_{T'}}\Big(\int_Tf^2-\frac{(\int_Tf)^2}{|T|}\Big)\\ = &\int_{T'}f^2-\sum\limits_{T\in\mathcal{T}_{T'}}\frac{(\int_Tf)^2}{|T|}\\ \leq&\int_{T'}f^2-\sum\limits_{T\in\mathcal{T}_{T'}}\frac{(\int_Tf)^2}{|T'|}\\ \leq&C\int_{T'}f^2-\frac{(\int_{T'}f)^2}{|T'|}\\ = &C||f-f_{T'}||_{0,T'}^2. \end{align*}$

Hence,

$\begin{equation} osc_{\mathcal{T}_{h_*}}^2(f,\mathcal{T}_{h_*})\leq Cosc_{\mathcal{T}_{h_\epsilon}}^2(f,\mathcal{T}_{h_\epsilon}). \end{equation}$

(4.17)
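The computation leading to (4.17) rests on the fact that replacing the mean over $T'$ by the means over its children cannot increase the squared deviation $\sum_T||f-f_T||_{0,T}^2$ . A discrete analogue is easy to check: split a list of samples of $f$ into equally sized groups and compare the sum of squared deviations from the group means with the squared deviation from the overall mean. The sketch below does this for $f(x) = x^2$ sampled at cell midpoints of $(0,1)$ ; the common cell width and the mesh-size weights of the oscillation are omitted, and all names are illustrative.

```python
def sq_dev_from_mean(values):
    """Sum of squared deviations of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def refined_vs_coarse(samples, n_children):
    """Compare the deviation on the children with the deviation on the parent.

    The samples are split into n_children consecutive groups of equal size
    (len(samples) must be divisible by n_children).
    """
    if len(samples) % n_children != 0:
        raise ValueError("len(samples) must be divisible by n_children")
    size = len(samples) // n_children
    children = [samples[k * size:(k + 1) * size] for k in range(n_children)]
    fine = sum(sq_dev_from_mean(child) for child in children)
    coarse = sq_dev_from_mean(samples)
    return fine, coarse

# Midpoint samples of f(x) = x^2 on 64 uniform cells of the parent element (0, 1).
samples = [((j + 0.5) / 64.0) ** 2 for j in range(64)]
fine, coarse = refined_vs_coarse(samples, 4)
print(fine <= coarse)
```

In this unweighted discrete setting the inequality holds with constant 1, since each group mean minimizes the squared deviation within its group.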

And thus we conclude that (4.13) follows from (4.15)-(4.17). Then combining with (4.12), we find that

$\begin{align} e_{\mathcal{T}_{h_*}}^2+osc_{\mathcal{T}_{h_*}}^2(\mathcal{T}_{h_*})\leq& C_1(e_{\mathcal{T}_{h_\epsilon}}^2+osc_{\mathcal{T}_{h_\epsilon}}^2(\mathcal{T}_{h_\epsilon}))\\ \leq&C_1\epsilon^2 = \mu(e_{\mathcal{T}_{h_i}}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})) \end{align}$

(4.18)

Note that the set $\mathcal{R}_{h}$ of elements refined from $\mathcal{T}_{h_i}$ to $\mathcal{T}_{h_*}$ satisfies the Dörfler property of Lemma 4.2, so that

$\begin{equation} \#\mathcal{M}_{h_i}\leq\#\mathcal{R}_{h}\leq\#\mathcal{T}_{h_*}-\#\mathcal{T}_{h_i}\leq\#\mathcal{T}_{h_\epsilon}-\#\mathcal{T}_{h_0}. \end{equation}$

(4.19)

Thus we find that (4.10) follows from (4.11) and (4.19).

The following theorem describes the quasi-optimality of an adaptive finite element method.

Theorem 4.1. Let $(y, u, p)$ be the solution of (2.6)-(2.8), and let $(\mathcal{T}_{h_i}, U_{ad}^{h_i}, V_{h_i}, y_{h_i}, u_{h_i}, p_{h_i})$ be the sequence of grids, finite element spaces, and discrete solutions produced by Algorithm 2.1. Assume that $\mathcal{T}_{h_0}$ satisfies the condition (b) of Section 4 (see ^[3]). Let $(u, y, p, y_{d}, f)\in \mathcal{A}^{s}$ ; then there holds

$\begin{equation} \#\mathcal{T}_{h_i}-\#\mathcal{T}_{h_0}\leq C|(u,y,p,y_{d},f)|_{s}^{\frac{1}{s}}\Big(e_{h_i}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})\Big)^{-\frac{1}{2s}}, \end{equation}$

(4.20)

provided $h_{0}\ll 1$ .

Proof. According to Lemma 2.3 of ^[12], we derive that

$\begin{equation} \#\mathcal{T}_{h_i}-\#\mathcal{T}_{h_0}\leq\sum\limits_{j = 0}^{i-1}\#\mathcal{M}_{h_j}. \end{equation}$

(4.21)

Combining (4.21) with Lemma 4.3 leads to

$\begin{equation} \#\mathcal{T}_{h_i}-\#\mathcal{T}_{h_0}\leq \sum\limits_{j = 0}^{i-1}\#\mathcal{M}_{h_j}\leq CC_0\sum\limits_{j = 0}^{i-1} (e_{h_j}^2+osc_{\mathcal{T}_{h_j}}^2(\mathcal{T}_{h_j}))^{-\frac{1}{2s}}, \end{equation}$

(4.22)

where

$C_0 = C_1^{\frac{1}{2s}}|(u,y,p,y_{d},f)|_s^{1/s}\mu^{-\frac{1}{2s}}.$

An application of Theorem 2.1 yields

$\begin{equation} e_{h_i}^2+\gamma_1\eta_{1,\mathcal{T}_{h_i}}^2(p_{h_i},\mathcal{T}_{h_i})+\gamma_2\tilde{\eta}_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i}) \approx e_{h_i}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i}). \end{equation}$

(4.23)

Thus we conclude that (4.20) follows from (4.22)-(4.23).
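The step from (4.22) and (4.23) to (4.20) is a geometric-series argument. The following sketch assumes that the contraction property established in Section 3 can be written as $Q_i^2\leq\beta^{2(i-j)}Q_j^2$ for $0\leq j\leq i$ , where $Q_j^2$ denotes the left-hand side of (4.23) at level $j$ and $\beta\in(0, 1)$ ; both $Q_j$ and $\beta$ are notation assumed here for illustration and may differ from the constants used in Section 3. Under this assumption,

$\begin{align*} \sum\limits_{j = 0}^{i-1}\big(e_{h_j}^2+osc_{\mathcal{T}_{h_j}}^2(\mathcal{T}_{h_j})\big)^{-\frac{1}{2s}} \leq& C\sum\limits_{j = 0}^{i-1}Q_j^{-\frac{1}{s}}\\ \leq& CQ_i^{-\frac{1}{s}}\sum\limits_{j = 0}^{i-1}\beta^{\frac{i-j}{s}}\\ \leq& C\frac{\beta^{\frac{1}{s}}}{1-\beta^{\frac{1}{s}}}Q_i^{-\frac{1}{s}}\\ \leq& C\big(e_{h_i}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})\big)^{-\frac{1}{2s}}, \end{align*}$

where the first and last inequalities use the equivalence (4.23) and the second uses the assumed contraction. Inserting this bound into (4.22) gives (4.20).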

5. Numerical examples

In this section, we use the following iteration to solve the bilinear optimal control problem numerically.

Algorithm 5.1. Following the similar algorithms for linear optimal control problems in ^[20], we define the following iteration for the bilinear optimal control problem:

$\begin{align} &a\big(y_h^k,w_h\big)+\big(u_h^{k-1}y_h^k,w_h\big) = \big(f,w_h\big),\quad\forall\ w_h\in V_h, \end{align}$

(5.1)

$\begin{align} &a\big(q_h,p_h^{k}\big)+\big(u_h^{k-1}p_h^{k},q_h\big) = \big(y_h^k-y_d,q_h\big),\quad\forall\ q_h\in V_h, \end{align}$

(5.2)

$\begin{align} &\big(\alpha u_h^k+y_h^kp_h^{k},v_h-u_h^k\big)\geq0,\quad\forall\ v_h\in U_{ad}^h. \end{align}$

(5.3)

Given an initial control $u_h^0\in U_{ad}^{h}$ , substituting it into (5.1) yields $y_{h}^1$ . Substituting $y_{h}^1$ into (5.2) then yields $p_{h}^1$ . Finally, substituting $p_{h}^1$ into (5.3) yields $u_{h}^1$ . Repeating these steps, we obtain $(y_h^k, p_h^k, u_h^k)$ for $k = 1, 2, \cdots$ , and it holds that

$\begin{align*} u_h^k = \frac{1}{\alpha}\big(-\mathcal{P}_hp_h^k+\max\big(0,\bar{p}_h^k\big)\big), \end{align*}$

where $\mathcal{P}_h$ is the $L^2$ -projection from $L^2(\Omega)$ to $U^h$ and $\bar{p}_h^k = \frac{\int_\Omega p_h^k}{|\Omega|}$ .
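The loop of Algorithm 5.1 can be sketched in a few lines. The code below is a minimal one-dimensional analogue, not the two-dimensional implementation used for the examples: it assumes $\Omega = (0, 1)$ , homogeneous Dirichlet boundary conditions, piecewise linear elements for the state and co-state, piecewise constant controls, and the hypothetical data $f = 1$ , $y_d = 0$ . The control update applies the displayed projection formula as written; the function names, the tolerance, and the iteration cap are assumptions made for illustration.

```python
import numpy as np

def assemble(n, u_cells):
    """P1 stiffness K, mass M and u-weighted mass M_u on a uniform mesh of (0, 1)."""
    h = 1.0 / n
    K = np.zeros((n + 1, n + 1))
    M = np.zeros((n + 1, n + 1))
    M_u = np.zeros((n + 1, n + 1))
    k_loc = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    m_loc = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0
    for c in range(n):
        idx = np.ix_([c, c + 1], [c, c + 1])
        K[idx] += k_loc
        M[idx] += m_loc
        M_u[idx] += u_cells[c] * m_loc    # the control is constant on each cell
    return K, M, M_u

def solve_dirichlet(A, b):
    """Solve A x = b on the interior nodes with zero boundary values."""
    x = np.zeros(b.shape[0])
    x[1:-1] = np.linalg.solve(A[1:-1, 1:-1], b[1:-1])
    return x

def update_control(p, alpha, n):
    """u = (1/alpha) * (-P_h p + max(0, p_bar)) with P_h the cellwise mean."""
    h = 1.0 / n
    p_cells = 0.5 * (p[:-1] + p[1:])      # L2 projection of a P1 function onto constants
    p_bar = h * p_cells.sum()             # mean value of p, since |Omega| = 1
    return (-p_cells + max(0.0, p_bar)) / alpha

def fixed_point(n=64, alpha=0.5, tol=1e-10, max_iter=50):
    f = np.ones(n + 1)                    # hypothetical right-hand side
    y_d = np.zeros(n + 1)                 # hypothetical desired state
    u = np.zeros(n)                       # initial control u_h^0
    for k in range(1, max_iter + 1):
        K, M, M_u = assemble(n, u)
        A = K + M_u                       # a(., .) + (u ., .), symmetric
        y = solve_dirichlet(A, M @ f)             # state step, cf. (5.1)
        p = solve_dirichlet(A, M @ (y - y_d))     # co-state step, cf. (5.2)
        u_new = update_control(p, alpha, n)       # control step
        diff = np.max(np.abs(u_new - u))
        u = u_new
        if diff < tol:
            break
    return y, p, u, k, diff

y, p, u, k, diff = fixed_point()
print(k, diff)
```

Because the bilinear form is symmetric, the state and co-state steps share one system matrix per sweep, so each sweep costs one assembly and two linear solves.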

To facilitate the analysis of the solution of the bilinear optimal control problem, we construct the following examples.

Example 1. The first example is a bilinear optimal control problem with the state equation posed on $\Omega = (0, 1)^2$ and $\alpha = 0.5$ . We take the exact solution as

$\begin{align*} &p = \sin\pi x_1\sin\pi x_2,\\ &u_0 = \frac{1}{\alpha}\max\{\bar{p},0\}-p,\\ &y_d = -\Delta p+up-y,\\ &f = -\Delta y+uy,\\ &y = 2\pi^2p. \end{align*}$

In Example 1, the state and co-state are approximated by the piecewise linear elements, while piecewise constant elements are used to approximate the control. We compute Example 1 on an adaptive grid and a uniform grid respectively.
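Several quantities in Example 1 can be evaluated in closed form, which is convenient when assembling the data. The identities below follow directly from $p = \sin\pi x_1\sin\pi x_2$ and $y = 2\pi^2p$ on $\Omega = (0, 1)^2$ ; they are a worked check rather than part of the original presentation.

$\begin{align*} &\bar{p} = \frac{1}{|\Omega|}\int_\Omega p = \Big(\int_0^1\sin\pi t\,dt\Big)^2 = \frac{4}{\pi^2} > 0,\quad\text{so}\quad \max\{\bar{p},0\} = \frac{4}{\pi^2},\\ &-\Delta p = 2\pi^2p = y,\\ &-\Delta y = 4\pi^4p,\quad\text{so}\quad f = -\Delta y+uy = 4\pi^4p+uy. \end{align*}$

Both $p$ and $y$ vanish on $\partial\Omega$ .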

In Figure 1, we plot the profiles of the exact state and the numerical state on the adaptively refined grid with $\theta = 0.3$ and 21 adaptive loops. In Figure 2, we plot the profiles of the exact co-state and the numerical co-state on the adaptively refined grid with $\theta = 0.3$ and 21 adaptive loops. From Figure 1 and Figure 2, we find that although the solution of Example 1 is smooth, larger gradients can be observed in certain areas. Accordingly, the adaptive finite element method may obtain a much smaller error than uniform refinement.
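The parameter $\theta$ controls how many elements are marked in each adaptive loop. The following sketch assumes that the marking step of Algorithm 2.1 is of Dörfler (bulk) type applied to the squared local indicators, i.e. it selects a smallest set $\mathcal{M}$ with $\sum_{T\in\mathcal{M}}\eta_T^2\geq\theta\sum_{T}\eta_T^2$ ; the function name `dorfler_mark` and the sample indicator values are hypothetical.

```python
import numpy as np

def dorfler_mark(eta_sq, theta):
    """Indices of a smallest set M with sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2."""
    eta_sq = np.asarray(eta_sq, dtype=float)
    order = np.argsort(eta_sq)[::-1]              # largest indicators first
    cumulative = np.cumsum(eta_sq[order])
    target = theta * cumulative[-1]
    # first position at which the running sum reaches the target
    count = int(np.searchsorted(cumulative, target, side="left")) + 1
    return order[:min(count, eta_sq.size)]

eta_sq = [4.0, 1.0, 3.0, 2.0]                     # hypothetical squared indicators
print(dorfler_mark(eta_sq, 0.3))                  # a few elements with large indicators
print(dorfler_mark(eta_sq, 1.0))                  # every element: uniform refinement
```

With $\theta = 1$ every element is marked, which is consistent with the use of $\theta = 1$ for the uniformly refined grids in this section.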

Figure 1. The profiles of the exact state (left) and the numerical state (right) on the adaptively refined grid with $\theta = 0.3$ and 21 adaptive loops for Example 1, generated by Algorithm 2.1.


Figure 2. The profiles of the exact co-state variables (left) and the numerical co-state variables (right) on the adaptively refined grid with $\theta = 0.3$ and 21 adaptive loops for Example 1, generated by Algorithm 2.1.


Figure 3 compares the convergence history of the errors on adaptively refined grids with $\theta = 0.3$ and on uniformly refined grids with $\theta = 1$ . We observe that the curve of $\eta^2$ is approximately parallel to the line of slope $-1$ , which is the optimal convergence rate expected for linear finite elements. This confirms the theoretical results in Section 4. In this example, however, the optimal control is quite smooth, so the convergence history of the errors differs little from that of uniform refinement.
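The observed slope in such a convergence plot can be estimated by a least-squares fit in log-log coordinates. The sketch below assumes that the horizontal axis is the number of degrees of freedom; the sample values are synthetic, chosen so that $\eta^2$ decays exactly like $N^{-1}$ , and are not taken from the computations reported here.

```python
import numpy as np

def loglog_slope(dofs, eta_sq):
    """Least-squares slope of log(eta_sq) against log(dofs)."""
    slope, _intercept = np.polyfit(np.log(dofs), np.log(eta_sq), 1)
    return slope

# Synthetic data for illustration only: eta^2 = 5 / N
dofs = np.array([100.0, 400.0, 1600.0, 6400.0, 25600.0])
eta_sq = 5.0 / dofs
print(loglog_slope(dofs, eta_sq))
```

A fitted value close to $-1$ corresponds to a curve parallel to the reference line of slope $-1$ .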

Figure 3. The comparisons of convergence history of the estimates on uniformly refined grids with $\theta = 1$ (left) and adaptively refined grids with $\theta = 0.3$ (right) for Example 1.


Example 2. In the second example, we choose $\alpha = 0.3$ and $\Omega = (0, 1)^2$ , with the exact solutions

$\begin{align*} &u = \frac{1}{\alpha}(-p+\max(0,\bar{p})),\\ &p = \begin{cases}\sin\pi x_1\sin\pi x_2,\quad &\mathrm{if}\ s(x_1,x_2) < 0,\\ 0,&\mathrm{if}\ s(x_1,x_2)\geq0,\end{cases}\\ &y_d = \begin{cases}100\sqrt{(x_1-1)^2+(x_2-1)^2}, \quad &\mathrm{if}\ s(x_1,x_2) < 0,\\ 0,&\mathrm{if}\ s(x_1,x_2)\geq0,\end{cases} \end{align*}$

where $s(x_1, x_2) = (x_1-0.2)^2+(x_2-0.6)^2-0.04$ .
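The piecewise definitions of Example 2 translate directly into vectorised code. The following sketch evaluates $p$ , $y_d$ and $u$ at given points; the mean value $\bar{p}$ is approximated by a midpoint rule on an $m\times m$ grid, which is an assumption made here for illustration, as are the function names.

```python
import numpy as np

def s(x1, x2):
    """Level-set function: negative inside the disc of radius 0.2 around (0.2, 0.6)."""
    return (x1 - 0.2)**2 + (x2 - 0.6)**2 - 0.04

def p_exact(x1, x2):
    return np.where(s(x1, x2) < 0, np.sin(np.pi * x1) * np.sin(np.pi * x2), 0.0)

def y_desired(x1, x2):
    return np.where(s(x1, x2) < 0, 100.0 * np.sqrt((x1 - 1.0)**2 + (x2 - 1.0)**2), 0.0)

def u_exact(x1, x2, alpha=0.3, m=400):
    # p_bar = (1/|Omega|) * int_Omega p, approximated by the midpoint rule (|Omega| = 1)
    t = (np.arange(m) + 0.5) / m
    g1, g2 = np.meshgrid(t, t, indexing="ij")
    p_bar = p_exact(g1, g2).mean()
    return (-p_exact(x1, x2) + max(0.0, p_bar)) / alpha

print(float(p_exact(0.2, 0.6)), float(p_exact(0.9, 0.9)))
```

Since $p$ and $y_d$ jump across the circle $s = 0$ , a quadrature or interpolation rule that ignores this interface will be less accurate near it.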

Similarly, the state and co-state are approximated by the piecewise linear elements, while piecewise constant elements are used to approximate the control. We compute Example 2 on an adaptive grid and a uniform grid respectively.

In Figure 4, we plot the profiles of the numerical state and the numerical co-state after 21 adaptive loops with $\theta = 0.3$ for Example 2. Both the numerical state and the numerical co-state exhibit singularities around the peak, and the grids are concentrated at the peak. In Figure 5, we show the convergence history of the estimators. We can see that the curve of $\eta^2$ is approximately parallel to the line of slope $-1$ . The figure shows second-order convergence in the reduction of the error estimators on the adaptively refined grids. In Example 2, the optimal control is not smooth, so uniform and adaptive grids differ markedly in how well they approximate the control.

Figure 4. The profiles of the numerical state (left) and the numerical co-state (right) on adaptively refined grids with $\theta = 0.3$ for Example 2.


Figure 5. The comparisons of convergence history of the estimates on uniformly refined grids with $\theta = 1$ (left) and adaptively refined grids with $\theta = 0.3$ (right) for Example 2.


Acknowledgments

This work is supported by National Science Foundation of China (11201510), National Social Science Fund of China (19BGL190), China Postdoctoral Science Foundation (2017T100155, 2015M580197), Innovation Team Building at Institutions of Higher Education in Chongqing (CXTDX201601035), Chongqing Research Program of Basic Research and Frontier Technology (cstc2019jcyj-msxmX0280) and Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJZD-K202001201), Youth Innovative Talents Project (Natural Science) of research on humanities and social sciences in Guangdong normal university (2017KQNCX265), Hunan Provincial Education Department of China (18C0196).

Conflict of interest

The authors declare no conflict of interest.

References

[1]	M. Ainsworth, J. T. Oden, A posteriori error estimators in finite element analysis, Comput. Methods Appl. Mech. Engrg., 142 (1997), 1–88. doi: 10.1016/S0045-7825(96)01107-3
[2]	I. Babuška, W. C. Rheinboldt, Error estimates for adaptive finite computations, SIAM J. Numer. Anal., 15 (1978), 736–754. doi: 10.1137/0715049
[3]	P. Binev, W. Dahmen, R. DeVore, Adaptive finite element approximation for distributed elliptic optimal control problems, SIAM J. Control Optim., 97 (2004), 219–268.
[4]	L. Zhang, Z. Zhou, Spectral Galerkin approximation of optimal control problem governed by Riesz fractional differential equation, Appl. Numer. Math., 143 (2019), 247–262. doi: 10.1016/j.apnum.2019.04.003
[5]	F. Wang, Z. Zhang, Z. Zhou, A spectral Galerkin approximation of optimal control problem governed by fractional advection diffusion reaction equations, J. Comput. Appl. Math., 386 (2021), 113–129.
[6]	N. Du, H. Wang, W. B. Liu, A fast gradient projection method for a constrained fractional optimal control, J. Sci. Comput., 68 (2016), 1–20. doi: 10.1007/s10915-015-0125-1
[7]	P. G. Ciarlet, The Finite Element Method for Elliptic Problems, Amsterdam: North-Holland, 1978.
[8]	Z. Chen, J. Feng, An adaptive finite element algorithm with reliable and efficient error for linear parabolic problems, Math. Comput., 73 (2004), 1167–1193. doi: 10.1090/S0025-5718-04-01634-5
[9]	Y. Chen, Z. Lu, High Efficient and Accuracy Numerical Methods for Optimal Control Problems, Science Press, Beijing, 2015.
[10]	Y. Chen, Z. Lu, Y. Huang, Superconvergence of triangular Raviart-Thomas mixed finite element methods for a bilinear constrained optimal control problem, Comput. Math. Appl., 66 (2013), 1498–1513. doi: 10.1016/j.camwa.2013.08.019
[11]	Y. Chen, Z. Lu, L. Liu, Numerical Methods for Partial Differential Equations, Science Press, Beijing, 2015.
[12]	J. M. Cascon, C. Kreuzer, R. H. Nochetto, K. G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM J. Numer. Anal., 46 (2008), 2524–2550. doi: 10.1137/07069047X
[13]	W. Dörfler, A convergent adaptive algorithm for Poisson equation, SIAM J. Numer. Anal., 33 (1996), 1106–1124. doi: 10.1137/0733054
[14]	A. Demlow, R. Stevenson, Convergence and quasi-optimality of an adaptive finite element method for controlling $L^2$ errors, Numer. Math., 117 (2011), 185–218. doi: 10.1007/s00211-010-0349-9
[15]	A. Gaevskaya, R. H. W. Hoppe, Y. Iliash, M. Kieweg, Convergence analysis of an adaptive finite element for distributed control problems with control constraints, Int. Series Numer. Math., 155 (2007), 47–68. doi: 10.1007/978-3-7643-7721-2_3
[16]	L. Ge, W. Liu, D. Yang, Adaptive finite element approximation for a constrained optimal control problem via multi-meshes, J. Sci. Comput., 41 (2009), 238–255. doi: 10.1007/s10915-009-9296-y
[17]	L. Ge, W. Liu, D. Yang, $L^2$ norm equivalent a posteriori error estimate for a constrained optimal control problem, Inter. J. Numer. Anal. Model., 6 (2009), 335–353.
[18]	W. Gong, N. Yan, Adaptive finite element method for elliptic optimal control problems: convergence and optimality, Numer. Math., 135 (2017), 1121–1170. doi: 10.1007/s00211-016-0827-9
[19]	L. He, A. Zhou, Convergence and optimality of adaptive finite element methods for elliptic partial differential equations, Int. J. Numer. Anal. Model., 8 (2011), 1721–1743.
[20]	H. Leng, Y. Chen, Convergence and quasi-optimality of an adaptive finite element method for optimal control problems with integral control constraint, Adv. Comput. Math., 44 (2018), 1367–1394.
[21]	R. Li, W. Liu, H. Ma, T. Tang, Adaptive finite element methods with convergence rates, Numer. Math., 41 (2002), 1321–1349.
[22]	W. Liu, N. Yan, Adaptive Finite Element Methods for Optimal Control Governed by PDEs, Science Press, Beijing, 2008.
[23]	Z. Lu, S. Zhang, $L^\infty$ -error estimates of rectangular mixed finite element methods for bilinear optimal control problem, Appl. Math. Comput., 300 (2017), 79–94.
[24]	P. Morin, R. H. Nochetto, K. G. Siebert, Data oscillation and convergence of adaptive FEM, SIAM J. Numer. Anal., 33 (1996), 1106–1124. doi: 10.1137/0733054
[25]	P. Morin, R. H. Nochetto, K. G. Siebert, Convergence of adaptive finite element methods, SIAM Reviews, 44 (2000), 466–488.
[26]	R. Stevenson, Optimality of a standard adaptive finite element method, Found. Comput. Math., 7 (2007), 245–269. doi: 10.1007/s10208-005-0183-0
[27]	R. Verfürth, A Review of A Posteriori Error Estimation and Adaptive Mesh Refinement, Comput. Methods Appl. Mech. Engrg., Wiley-Teubner, London, 1996.
[28]	J. Xu, A. Zhou, Local and parallel finite element algorithms based on two-grid discretizations, Math. Comput., 69 (1996), 881–909.

© 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
