
Quantile regression has been widely used in many fields because of its robustness and comprehensiveness. However, it remains challenging to perform quantile regression (QR) on streaming data with conventional methods, as they are all based on the assumption that the memory can fit all the data. To address this issue, this paper proposes a Bayesian QR approach for streaming data, in which the posterior distribution is updated by utilizing aggregated statistics of the current and historical data. In addition, theoretical results are presented to confirm that the streaming posterior distribution is theoretically equivalent to the oracle posterior distribution calculated using the entire dataset at once. Moreover, we provide an algorithmic procedure for the proposed method. The algorithm shows that our proposed method only needs to store the parameters of the historical posterior distribution of the streaming data. Thus, it is computationally simple and not storage-intensive. Both simulations and real data analysis are conducted to illustrate the good performance of the proposed method.
Citation: Zixuan Tian, Xiaoyue Xie, Jian Shi. Bayesian quantile regression for streaming data[J]. AIMS Mathematics, 2024, 9(9): 26114-26138. doi: 10.3934/math.20241276
With the development of computer technology, the finite element method emerged in the early 1960s. Research on the reliability and validity of finite element analysis promoted the development of the finite element method [3,4,5,25,27]. In applying finite element methods, the emergence of errors has captured the attention of scholars. One of the main sources of error is the discretisation of the model, and researchers have dissected it from various aspects of finite element analysis. In addition, the generation of the finite element mesh is also a concern for scholars. In early conventional FEA, scholars usually used experience, intuition, or even guesses to generate meshes and judged only roughly whether the approximation results were reasonable. If they were not, the mesh had to be redesigned, whenever necessary, for the efficiency of the analysis and the reliability of the results. The adaptive finite element method therefore emerged: after checking the current solution, the computer decides, according to the error information obtained during the adjustment process, whether the solution is accurate enough to meet the prescribed requirements.
To the best of our knowledge, the adaptive finite element method is a numerical method that automatically adjusts the mesh to improve the solution process [2]. An appropriate mesh can greatly reduce the errors arising from the discretisation in the finite element approximation process. At present, closed-form solutions of optimal control problems for nonlinear systems are usually not available. Given the complexity and diversity of nonlinear equations, it is therefore very practical to use the adaptive finite element method for solving nonlinear equations.
Adaptive finite element methods have been widely and successfully applied to various linear optimal control problems. For example, Eriksson and Johnson proposed an adaptive finite element algorithm that produces a sequence of successively refined meshes in which the final mesh satisfies a given error tolerance [11]. Gaevskaya and Hoppe et al. analysed an adaptive finite element method for a class of distributed optimal control problems with control constraints, applying the reliability and discrete local efficiency of the a posteriori estimator and the quasi-orthogonality property as basic tools [13]. Braess and Carstensen et al. investigated the residual jump contributions of a standard explicit residual-based a posteriori error estimator for an adaptive finite element method [1]. Hu and Xu et al. studied the convergence and optimality of the adaptive nonconforming linear element method for the Stokes problem [17].
However, the literature shows that research on adaptive finite element methods for nonlinear optimal control problems has not yet reached its peak.
Recently, for instance, Wriggers and Scherf proposed an adaptive finite element technique for nonlinear contact problems [29]. Eriksson and Johnson extended adaptive finite element methods for parabolic problems to a class of nonlinear scalar problems, obtaining a posteriori error estimates and designing corresponding adaptive algorithms [12]. Nowadays, nonlinear optimal control problems, like big data, are a focus of scholars worldwide. Hence it is worth investigating the adaptive finite element method for such nonlinear problems.
Many scholars have also studied a priori error estimates for the bilinear optimal control problem. For example, Yang, Demlow, and Dobrowolski et al. investigated a priori error estimates and superconvergence of optimal control problems for bilinear models and gave optimal L2-norm error estimates and almost optimal L∞-norm estimates for the state and co-state variables [7,8,31]. Lu investigated a second-order parabolic bilinear optimal control problem and provided a priori error estimates for the finite element solutions of the state equations describing the system [24]. Shen and Yang et al. investigated a quadratic optimal control problem governed by a linear hyperbolic integro-differential equation and its finite element approximation; a priori estimates were carried out using standard functional analysis techniques, and the existence and regularity of the solution were established by means of these estimates. At the same time, some scholars have analysed a posteriori error estimates of the finite element method for bilinear optimal control problems [15,26]. Lu, Chen, and Leng et al. discussed the Raviart-Thomas mixed finite element discretisation of general bilinear optimal control problems, deriving a posteriori error estimates for both the coupled state and the control solutions [18,23]. Although bilinear optimal control problems are frequently met in applications, they are much more difficult to handle than linear cases, and there is still little work on nonlinear optimal control problems.
In this paper, we focus on a nonlinear optimal control problem with integral control constraints, where we discretize the control by piecewise constants while applying continuous piecewise linear discretization to the state and the co-state, respectively. A posteriori error estimates are then derived. We prove convergence and quasi-optimality relying on quasi-orthogonality and a discrete local upper bound. Under a mild assumption on the initial grids, we obtain the proofs of convergence and quasi-optimality by means of the solution operator of nonlinear elliptic equations. Finally, some numerical simulations are provided to support our theoretical analysis.
Here are some notations that will be used in this paper. Let Ω be a bounded Lipschitz domain in R2 and ∂Ω denote the boundary of Ω. We use the standard notation Wm,q(ω) with norm ‖⋅‖m,q,ω and seminorm |⋅|m,q,ω to express the standard Sobolev space for ω⊂Ω. Moreover, we will omit the subscript if ω=Ω. For q=2, we denote Wm,2(Ω) by Hm(Ω) and ‖⋅‖m=‖⋅‖m,2. Also, for m=0 and q=2, we denote W0,2(ω)=L2(ω) and ‖⋅‖0,2,ω=‖⋅‖0,ω. Additionally, H10(Ω)={v∈H1(Ω):v=0 on ∂Ω}. Let Th0 be the initial partition of ˉΩ into disjoint triangles. By newest-vertex bisection of Th0, we can obtain a class T of conforming partitions. For Th,˜Th∈T, we use Th⊂˜Th to indicate that ˜Th is a refinement of Th, and hT=|T|1/2 denotes the size of the element T. In addition, (⋅,⋅) denotes the L2 inner product. Beyond that, let c and C be generic positive constants independent of the grid size; we use A≈B to represent cA≤B≤CA.
The rest of the paper is organized as follows. In Section 2, we introduce the optimal control problem of our interest and obtain a posteriori error estimates. The relevant algorithms are introduced in Section 3. In Section 4, we use quasi-orthogonality and discrete local upper bounds to prove the convergence of the adaptive finite element method, as well as quasi-optimality in Section 5. We provide an adaptive finite element algorithm and some numerical simulations to verify our theoretical analysis in Section 6. Finally, we summarize the results of this paper and develop a plan for future work.
In this paper we mainly discuss the following nonlinear optimal control problem governed by nonlinear elliptic equations:
minu∈Uad{12‖y−yd‖20+α2‖u‖20},−Δy+ϕ(y)=f+u,in Ω,y=0,on ∂Ω, |
where y is the state variable, u is the control variable, f is a given function, α is a constant greater than zero, yd∈L2(Ω), Uad={v:v∈L2(Ω), ∫Ωv dx≥0} is a closed convex subset of U=L2(Ω), and ϕ(⋅)∈W2,∞(−R,R) for any R>0, ϕ′(y)∈L2(Ω) for any y∈H1(Ω), ϕ′≥0. Let V=H10(Ω); the weak formulation of the state equation reads: find y∈V such that
a(y,v)+(ϕ(y),v)=(f+u,v),∀ v∈V, |
where
a(y,v)=∫Ω∇y⋅∇v dx. |
Then the nonlinear optimal control problem can be restated as follows
minu∈Uad{12‖y−yd‖20+α2‖u‖20}, | (2.1) |
a(y,v)+(ϕ(y),v)=(f+u,v),∀ v∈V. | (2.2) |
It is well known [14,20] that the nonlinear optimal control problem has at least one solution (y,u), and that if a pair (y,u) is a solution of the nonlinear optimal control problem, then there is a co-state p∈V such that the triplet (y,p,u) satisfies the following optimality conditions:
a(y,v)+(ϕ(y),v)=(f+u,v),∀ v∈V, | (2.3) |
a(q,p)+(ϕ′(y)p,q)=(y−yd,q),∀ q∈V, | (2.4) |
(αu+p,v−u)≥0,∀ v∈Uad. | (2.5) |
Due to the coercivity of a(⋅,⋅), we can define the solution operator S:L2(Ω)→H10(Ω) such that S(f+u)=y, and let S∗ be the adjoint of S such that S∗(y−yd)=p. For Th∈T, let Vh be the continuous piecewise linear finite element space with respect to the partition Th, and let Uh be the piecewise constant finite element space with respect to Th. Let Uhad={vh∈Uh:∫Ωvhdx≥0}. Then we derive the standard finite element discretization for the nonlinear optimal control problem as follows:
minuh∈Uhad{12‖yh−yd‖20+α2‖uh‖20}, | (2.6) |
a(yh,v)+(ϕ(yh),v)=(f+uh,v),∀ v∈Vh. | (2.7) |
Similarly, the nonlinear optimal control problem (2.6)–(2.7) has at least one solution (yh,uh), and if a pair (yh,uh)∈Vh×Uhad is a solution of (2.6)–(2.7), then there is a co-state ph∈Vh such that the triplet (yh,ph,uh) satisfies the following optimality conditions:
a(yh,v)+(ϕ(yh),v)=(f+uh,v),∀ v∈Vh, | (2.8) |
a(q,ph)+(ϕ′(yh)ph,q)=(yh−yd,q),∀ q∈Vh, | (2.9) |
(αuh+ph,vh−uh)≥0,∀ vh∈Uhad. | (2.10) |
Here we define the error indicators η(⋅) and the data oscillations osc(⋅) that will be used frequently in this paper. For Th∈T, T∈Th, we define
η21,Th(ph,T)=h2T‖∇ph‖20,T,η22,Th(uh,yh,T)=h2T‖f+uh−ϕ(yh)‖20,T+hT‖[∇yh]⋅n‖20,∂T∖∂Ω,η23,Th(yh,ph,T)=h2T‖yh−yd−ϕ′(yh)ph‖20,T+hT‖[∇ph]⋅n‖20,∂T∖∂Ω,osc2Th(f,T)=h2T‖f−fT‖20,T,osc2Th(yh−yd,T)=h2T‖(yh−yd)−(yh−yd)T‖20,T, |
where uh∈Uhad, yh,ph∈Vh, and fT is the L2-projection of f onto the piecewise constant space on T, i.e., fT=∫Tf dx/|T|. For ω⊂Th, we have
η21,Th(ph,ω)=∑T∈ωη21,Th(ph,T),osc2Th(f,ω)=∑T∈ωosc2Th(f,T), |
by which η22,Th(uh,yh,ω), η23,Th(yh,ph,ω) and osc2Th(yh−yd,ω) can be denoted similarly.
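To make the elementwise structure of these indicators concrete, here is a minimal NumPy sketch of how squared indicators aggregate over a subset ω by summation, using h2T=|T|. All the mesh data below (element areas, residual norms, the subset ω) are illustrative placeholders, not values from the paper.

```python
import numpy as np

# Hypothetical per-element data for a tiny mesh (illustrative only):
areas = np.array([0.5, 0.25, 0.25, 0.125])      # |T| for each element T
grad_ph_sq = np.array([0.9, 0.4, 0.3, 0.2])     # ||grad p_h||_{0,T}^2 per element

h_sq = areas                                    # h_T^2 = |T| since h_T = |T|^{1/2}
eta1_sq = h_sq * grad_ph_sq                     # eta_{1,T_h}^2(p_h, T) = h_T^2 ||grad p_h||_{0,T}^2

# Aggregating over a subset omega of T_h is just a partial sum of element terms.
omega = [0, 2]                                  # indices of elements belonging to omega
eta1_sq_omega = eta1_sq[omega].sum()
eta1_sq_total = eta1_sq.sum()
assert eta1_sq_omega <= eta1_sq_total           # a subset never exceeds the global indicator
```

The same partial-sum pattern applies verbatim to η22,Th, η23,Th, and osc2Th, since each is a sum of nonnegative elementwise contributions.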
Also, a reliable and efficient a posteriori error estimate for the nonlinear optimal control problem (2.1)–(2.2) is presented in the next theorem.
Theorem 2.1. For Th∈T, let (y,p,u) be the exact solution of problems (2.3), (2.4) and (2.5) and (yh,ph,uh) be the solution of problems (2.8), (2.9) and (2.10) with respect to Th. Then there exist constants c and C such that
‖u−uh‖20+‖y−yh‖21+‖p−ph‖21≤C(η21,Th(ph,Th)+η22,Th(uh,yh,Th)+η23,Th(yh,ph,Th)), | (2.11) |
and
c(η21,Th(ph,Th)+η22,Th(uh,yh,Th)+η23,Th(yh,ph,Th))≤‖u−uh‖20+‖y−yh‖21+‖p−ph‖21+osc2Th(f,Th)+osc2Th(yh−yd,Th). | (2.12) |
Proof. Step 1. We seek functions yh,ph∈V satisfying the following auxiliary problems
a(yh,v)+(ϕ(yh),v)=(f+uh,v),∀ v∈V, | (2.13) |
a(q,ph)+(ϕ′(yh)ph,q)=(yh−yd,q),∀ q∈V. | (2.14) |
It follows that
‖u−uh‖20≤C(αu,u−uh)−C(αuh,u−uh)≤−C(αuh,u−uh)=C(αuh,uh−u)+C(αuh−αuh,u−uh). |
With the help of the proof of Lemma 3.4 in [14] and Lemma 3.4 in [21], we have
(αuh,uh−u)=(αuh+ph,uh−u)≤C(∑T∈Th‖ph−πhph‖20,T+‖u−uh‖20)≤C(σ)η21,Th(ph,Th)+Cσ‖u−uh‖20, | (2.15) |
where η21,Th(ph,Th)=∑T∈Thη21,Th(ph,T), and
(αuh−αuh,u−uh)=(ph−ph,u−uh)≤C(σ)‖ph−ph‖20+Cσ‖u−uh‖20, | (2.16) |
where πh is the L2-projection operator onto the piecewise constant space on Th, σ is an arbitrary positive number, C(σ) is a constant depending on σ, and C is a generic constant, which may include C(σ). Obviously, from (2.15) and (2.16) there holds
‖u−uh‖20≤C(η21,Th(ph,Th)+‖ph−ph‖20). |
Let ey=yh−yh, and eyI=ˆπhey, where ˆπh is the average interpolation operator defined in Lemma 3.2 of [14], then we can obtain
c‖ey‖21≤(∇(yh−yh),∇ey)+(ϕ(yh)−ϕ(yh),ey)=(∇(yh−yh),∇(ey−eyI))+(ϕ(yh)−ϕ(yh),ey−eyI)=∑T∈Th∫T(f+uh−ϕ(yh))(ey−eyI)dx−∑T∈Th∫∂T([∇yh]⋅n)(ey−eyI)dx≤C(σ)∑T∈Thh2T∫T(f+uh−ϕ(yh))2dx+C(σ)∑∂T∖∂ΩhT∫([∇yh]⋅n)2dx+Cσ‖ey‖21=C(σ)η22,Th(uh,yh,Th)+Cσ‖ey‖21, |
where σ is an arbitrary positive number. Letting σ=c/(2C), we have
‖yh−yh‖21≤Cη22,Th(uh,yh,Th). |
Similarly, let ep=ph−ph and epI be the average interpolation of ep, then we can get
c‖ep‖21≤(∇ep,∇(ph−ph))+((ϕ′(yh)(ph−ph),ep)=(∇ep,∇(ph−ph))+(ϕ′(yh)ph−ϕ′(yh)ph,ep)+((ϕ′(yh)−ϕ′(yh))ph,ep)=(∇(ep−epI),∇(ph−ph))+(ϕ′(yh)ph−ϕ′(yh)ph,ep−epI)+(∇epI,∇(ph−ph))+(ϕ′(yh)ph−ϕ′(yh)ph,epI)+((ϕ′(yh)−ϕ′(yh))ph,ep)=∑T∈Th∫T(yh−yd−ϕ′(yh)ph)(ep−epI)dx−∑∂T∖∂Ω∫∂T([∇ph]⋅n)(ep−epI)−(yh−yh,epI)+((ϕ′(yh)−ϕ′(yh))ph,ep)dx≤C(σ)∑T∈Thh2T∫T(yh−yd−ϕ′(yh)ph)2dx+C(σ)∑∂T∖∂ΩhT∫∂T([∇ph]⋅n)2dx+Cσ∑T∈Thh2T∫T|∇ep|2dx+C‖yh−yh‖0‖ep‖0+C‖ϕ′(yh)−ϕ′(yh)‖0‖ph‖0,4‖ep‖0,4≤C(σ)η23,Th(yh,ph,Th)+C(σ)‖yh−yh‖20+C(σ)‖ph‖21‖yh−yh‖20+Cσ‖ph−ph‖21)≤C(σ)η23,Th(yh,ph,Th)+C‖yh−yh‖20+Cσ‖ph−ph‖21, |
in which we apply the embedding theorem ‖v‖0,4≤C‖v‖1 (see [6]) and the property ‖ph‖1≤C, so that C(σ)‖yh−yh‖20+C(σ)‖ph‖21‖yh−yh‖20≤C‖yh−yh‖20. Consequently, we obtain
‖ph−ph‖21≤Cη23,Th(yh,ph,Th)+C‖yh−yh‖20. |
Hence we gain
‖ph−ph‖20≤C‖ph−ph‖21+C‖yh−yh‖21≤C(η22,Th(uh,yh,Th)+η23,Th(yh,ph,Th)). |
By the triangle inequality we obtain that
‖y−yh‖1≤‖y−yh‖1+‖yh−yh‖1≤C(‖u−uh‖0+‖yh−yh‖1),‖p−ph‖1≤‖p−ph‖1+‖ph−ph‖1≤C(‖y−yh‖1+‖ph−ph‖1). |
Combining the above estimates, we have
‖u−uh‖20+‖p−ph‖21+‖y−yh‖21≤C(η21,Th(ph,Th)+η22,Th(uh,yh,Th)+η23,Th(yh,ph,Th)). |
Step 2. Now we are in a position to derive the lower bound. Referring to the proofs of Lemma 3.6 in [14] and Lemma 3.4 in [21], we conclude that
∑T∈Thh2T‖∇ph‖20,T≈∑T∈Th‖ph−πhph‖20,T≤C(‖u−uh‖20+‖p−ph‖21). | (2.17) |
Indeed, it is easily seen that
∑T∈Th‖ph−πhph‖20,T=∑T∈Th‖ph−πhph‖0,T‖ph−p+p−πhp+πhp−πhph‖0,T≤∑T∈Th‖ph−πhph‖0,T‖p−πhp‖0,T+(1/3)∑T∈Th‖ph−πhph‖20,T+C‖ph−p‖21. |
Since u+p=max(0,ˉp) is constant, we have
πh(u+p)=u+p, |
such that
∑T∈Th‖ph−πhph‖0,T‖p−πhp‖0,T=∑T∈Th‖ph−πhph‖0,T‖p+u−πh(p+u)+πhu−u‖0,T=∑T∈Th‖ph−πhph‖0,T‖πh(u−uh)−(u−uh)‖0,T≤(1/3)∑T∈Th‖ph−πhph‖20,T+C‖uh−u‖20. |
This completes the proof.
According to [28], we employ the standard bubble function technique to estimate the error indicators η2,Th(uh,yh,Th) and η3,Th(yh,ph,Th). Similar to Chapter 7.2 in [22], there exist polynomials wT∈H10(T) and w∂T∈H10(∂T∖∂Ω) such that
∫∂ThT([∇ph]⋅n)2dx=∫∂T([∇ph]⋅n)w∂Tdx, | (2.18) |
∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx=∫T((yh−yd)T−ϕ′(yh)ph)wTdx, | (2.19) |
and apparently
‖w∂T‖21,∂T∖∂Ω≤C∫∂ThT([∇ph]⋅n)2dx, | (2.20) |
h−2T‖w∂T‖20,∂T∖∂Ω≤C∫∂ThT([∇ph]⋅n)2dx, | (2.21) |
‖wT‖21,T≤C∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx, | (2.22) |
h−2T‖wT‖20,T≤C∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx, | (2.23) |
By using the Schwarz inequality, it follows from (2.18), (2.20), and (2.21) that
∫∂ThT([∇ph]⋅n)2dx=∫∂T([∇ph]⋅n)w∂Tdx=∫∂T([∇ph]⋅n−[∇p]⋅n)w∂Tdx=∫∂T∇(ph−p)∇w∂Tdx+div(Δ(ph−p))w∂T=∫∂T∇(ph−p)∇w∂Tdx+∫∂T(y−yd)dx−ϕ′(y)p)w∂Tdx=∫∂T∇(ph−p)∇w∂Tdx+∫∂T(yh−yd−ϕ′(yh)ph)w∂Tdx+∫∂T((y−yd)−(yh−yd))w∂Tdx+∫∂T(ϕ′(y)p−ϕ′(yh)ph)w∂Tdx≤C(σ)‖ph−p‖21,∂T∖∂Ω+C(σ)‖y−yh‖20,∂T∖∂Ω+∫∂Tϕ′(y)(p−ph)w∂Tdx+∫∂T(ϕ′(y)−ϕ′(yh))phw∂Tdx+C(σ)∫∂T(yh−yd−ϕ′(yh)ph)2dx+Cσ(‖w∂T‖21,∂T∖∂Ω+h−2T‖w∂T‖20,∂T∖∂Ω)≤C(σ)‖ph−p‖21,∂T∖∂Ω+C(σ)‖y−yh‖20,∂T∖∂Ω+C(σ)‖ϕ′(y)‖20,∂T∖∂Ω‖(p−ph)‖20,∂T∖∂Ω+∫∂T˜ϕ″(yh)(y−yh)phw∂Tdx+C(σ)∫∂T(yh−yd−ϕ′(yh)ph)2dx+Cσ(‖w∂T‖21,∂T∖∂Ω+h−2T‖w∂T‖20,∂T∖∂Ω)≤C(σ)‖ph−p‖21,∂T∖∂Ω+C(σ)‖y−yh‖20,∂T∖∂Ω+C(σ)‖˜ϕ″(yh)‖20,∂T∖∂Ω‖y−yh‖20,∂T∖∂Ω‖ph‖20,∂T∖∂Ω+C(σ)∫∂T(yh−yd−ϕ′(yh)ph)2dx+Cσ(‖w∂T‖21,∂T∖∂Ω+h−2T‖w∂T‖20,∂T∖∂Ω)≤C(σ)‖ph−p‖21,∂T∖∂Ω+C(σ)‖y−yh‖20,∂T∖∂Ω+C(σ)∫∂T(yh−yd−ϕ′(yh)ph)2dx+Cσ∫∂ThT([∇ph]⋅n)2dx, |
where σ is an arbitrary positive number and ϕ(⋅)∈W2,∞(Ω) has been used. Letting σ=1/(2C), we have
∫∂ThT([∇ph]⋅n)2dx≤C‖ph−p‖21,∂T∖∂Ω+C‖y−yh‖20,∂T∖∂Ω+C∫∂T(yh−yd−ϕ′(yh)ph)2dx. | (2.24) |
Next, it follows from (2.19), (2.22), and (2.23) that
∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx=∫T((yh−yd)T−ϕ′(yh)ph)wTdx=∫T(yh−yd−ϕ′(yh)ph)wTdx+∫T((yh−yd)−(yh−yd)T)wTdx≤∫T(yh−yd−ϕ′(yh)ph−(y−yd)+ϕ′(y)p)wTdx+C(σ)∫Th2T((yh−yd)−(yh−yd)T)2dx+Cσh−2T‖wT‖20,T=−∫T∇(ph−p)∇wTdx+∫T((yh−yd)−(y−yd))wTdx+∫T(ϕ′(yh)ph−ϕ′(y)p)wTdx+C(σ)∫Th2T((yh−yd)−(yh−yd)T)2dx+Cσh−2T‖wT‖20,T≤C(σ)‖ph−p‖21,T+C(σ)‖(y−yd)−(yh−yd)‖20,T+∫Tϕ′(yh)(ph−p)wTdx+∫T(ϕ′(yh)−ϕ′(y))pwTdx+C(σ)∫Th2T((yh−yd)−(yh−yd)T)2dx+Cσ(‖wT‖21,T+h−2T‖wT‖20,T)≤C(σ)‖ph−p‖21,T+C(σ)‖y−yh‖20,T+C(σ)∫Th2T((yh−yd)−(yh−yd)T)2dx+C(σ)‖ϕ′(yh)‖20,T‖ph−p‖20+C(σ)‖˜ϕ″(y)‖20,T‖yh−y‖20,T‖p‖20,T+Cσ(‖wT‖21,T+h−2T‖wT‖20,T)≤C(σ)‖ph−p‖21,T+C(σ)‖y−yh‖20,T+C(σ)∫Th2T((yh−yd)−(yh−yd)T)2dx+Cσ∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx, |
where ϕ(⋅)∈W2,∞(Ω) has been used. Consequently, we can deduce that
∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx≤C‖ph−p‖21,T+C‖y−yh‖20,T+C∫Th2T((yh−yd)−(yh−yd)T)2dx. |
Then we have
∫Th2T(yh−yd−ϕ′(yh)ph)2dx≤C∫Th2T((yh−yd)T−ϕ′(yh)ph)2dx+C∫Th2T((yh−yd)−(yh−yd)T)2dx≤C‖ph−p‖21,T+C‖y−yh‖20,T+C∫Th2T((yh−yd)−(yh−yd)T)2dx. | (2.25) |
Combining (2.24) and (2.25), we easily obtain
η23,Th(yh,ph,Th)=∑T∈Th∫Th2T(yh−yd−ϕ′(yh)ph)2dx+∑∂T∖∂Ω∫∂ThT([∇ph]⋅n)2dx≤C‖ph−p‖21+C‖y−yh‖20+C∑T∈Th∫Th2T((yh−yd)−(yh−yd)T)2dx≤C‖ph−p‖21+C‖y−yh‖21+Cosc2Th(yh−yd,Th). |
It can also be deduced that
η22,Th(uh,yh,Th)=∑T∈Th∫Th2T(f+uh−ϕ(yh))2dx+∑∂T∖∂Ω∫∂ThT([∇yh]⋅n)2dx≤C‖yh−y‖21+C‖u−uh‖20+C∑T∈Th∫Th2T(f−fT)2dx≤C‖yh−y‖21+C‖u−uh‖20+Cosc2Th(f,Th). |
The above results complete the proof of Theorem 2.1.
In this section, we introduce two related algorithms as follows:
Algorithm 3.1. Adaptive finite element algorithm for nonlinear optimal control problems:
(0) Given an initial grid Th0, construct the finite element spaces Uh0ad and Vh0. Select a marking parameter 0<θ≤1 and set k:=0.
(1) Solve the discrete nonlinear optimal control problem (2.8)–(2.10), then obtain approximate solution (uhk,yhk,phk) with respect to Thk.
(2) Compute the local error estimator ηThk(T) for all T∈Thk.
(3) Select a minimal subset Mhk of Thk such that
η2Thk(Mhk)≥θη2Thk(Thk), |
where η2Thk(ω)=η21,Th(ph,ω)+η22,Th(uh,yh,ω)+η23,Th(yh,ph,ω) for all ω⊂Thk.
(4) Refine Mhk by bisecting b≥1 times in passing from Thk to Thk+1; generally, additional elements are refined in the process in order to ensure that Thk+1 is conforming.
(5) Solve the discrete nonlinear optimal control problem (2.8)–(2.10), then obtain approximate solution (uhk+1,yhk+1,phk+1) with respect to Thk+1.
(6) Set k=k+1 and go to step (2).
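Steps (1)–(6) above form a standard solve–estimate–mark–refine loop. The following Python sketch shows its shape under stated assumptions: `solve`, `estimate`, and `refine` are hypothetical placeholders for a real discretization, and only the Dörfler-type marking of step (3), which picks a minimal set of elements carrying a θ-fraction of the total squared indicator, is implemented concretely.

```python
import numpy as np

def dorfler_mark(eta_sq, theta):
    """Step (3): select a minimal subset M with sum(eta_sq[M]) >= theta * sum(eta_sq),
    obtained by taking elements in decreasing order of their indicators."""
    order = np.argsort(eta_sq)[::-1]            # largest indicators first
    cumulative = np.cumsum(eta_sq[order])
    k = int(np.searchsorted(cumulative, theta * eta_sq.sum())) + 1
    return order[:k]

def adaptive_loop(mesh, solve, estimate, refine, theta=0.5, max_iter=10, tol=1e-6):
    """Skeleton of Algorithm 3.1; solve/estimate/refine are assumed interfaces."""
    for _ in range(max_iter):
        u, y, p = solve(mesh)                   # step (1): discrete optimality system
        eta_sq = estimate(mesh, u, y, p)        # step (2): per-element eta^2 values
        if eta_sq.sum() < tol:                  # stop once the estimator is small
            break
        marked = dorfler_mark(eta_sq, theta)    # step (3): Dorfler marking
        mesh = refine(mesh, marked)             # step (4): bisection + conformity closure
    return mesh
```

The minimality of the marked set is exactly what makes the quasi-optimality analysis of Section 5 possible: marking more elements than necessary would still converge, but could over-refine.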
Algorithm 3.2. Given an initial control u0h∈Uhad, seek (ykh,pkh,ukh) such that
a(ykh,wh)+(ϕ(ykh),wh)=(f+uk−1h,wh),∀ wh∈Vh,a(qh,pkh)+(ϕ′(ykh)pkh,qh)=(ykh−yd,qh),∀ qh∈Vh,(αukh+pkh,vh−ukh)≥0,∀ vh∈Uhad, |
for k=1,2,⋯, and apparently
ukh=(1/α)(−Phpkh+max(0,ˉpkh)), |
where Ph is the L2-projection from L2(Ω) to Uh and ˉpkh=∫Ωpkh dx/|Ω|.
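The control update of Algorithm 3.2 can be checked on a toy piecewise constant space. In this sketch the projection Ph reduces to elementwise means; the one-dimensional mesh, the value of α, and the co-state values are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Illustrative data: a 1-D partition of Omega with |Omega| = 1.
alpha = 2.0
cell_sizes = np.array([0.25, 0.25, 0.5])     # |T| per element
p_cell = np.array([-1.0, 0.5, 0.25])         # elementwise means of p_h^k, i.e., P_h p_h^k

# p_bar = (1/|Omega|) * integral of p over Omega
p_bar = np.sum(cell_sizes * p_cell) / cell_sizes.sum()

# u_h^k = (1/alpha) * (-P_h p_h^k + max(0, p_bar)), applied elementwise
u_cell = (-p_cell + max(0.0, p_bar)) / alpha

# The update keeps u_h^k admissible: integral of u equals
# (max(0, p_bar) - p_bar) * |Omega| / alpha >= 0, so u_h^k is in U_ad^h.
integral_u = np.sum(cell_sizes * u_cell)
assert integral_u >= -1e-12
```

The final assertion illustrates why the max(0,ˉpkh) shift appears in the formula: it is precisely what enforces the integral constraint ∫Ωu dx≥0 defining Uhad.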
In this section we first consider the following nonlinear elliptic equation:
{−Δy+ϕ(y)=f,in Ω,y=0,on ∂Ω, | (4.1) |
where f∈L2(Ω). We introduce the quantity J(h) in view of the idea in [30] as follows:
J(h):=supf∈L2(Ω), ‖f‖0,Ω=1infvh∈Vh‖Sf−vh‖1,
where S is the solution operator for nonlinear elliptic equations. Obviously, J(h)≪1 for h∈(0,h0) if h0≪1. The following lemma concerning this quantity holds; see [10,30].
Lemma 4.1. For each f∈L2(Ω), there exists a constant C such that
‖Sf−Shf‖1≤CJ(h)‖f‖0, | (4.2) |
and
‖Sf−Shf‖0≤CJ(h)‖Sf−Shf‖1, | (4.3) |
where Sh is the discrete solution operator for nonlinear elliptic equations.
Here, Lemma 4.1 is a preparation for Lemma 4.4.
According to [19], the local perturbation property plays an important role in the proof of convergence, since it allows us to combine the elementwise error estimates.
Lemma 4.2. For Th∈T, T∈Th, let uh1,uh2∈Uhad, yh1,yh2,ph1,ph2∈Vh, we have
η1,Th(ph1,T)−η1,Th(ph2,T)≤ChT‖ph1−ph2‖1,T, | (4.4) |
η2,Th(uh1,yh1,T)−η2,Th(uh2,yh2,T)≤C(hT‖uh1−uh2‖0,T+‖yh1−yh2‖1,T), | (4.5) |
η3,Th(yh1,ph1,T)−η3,Th(yh2,ph2,T)≤C(hT‖yh1−yh2‖0,T+‖ph1−ph2‖1,T), | (4.6) |
oscTh(yh1−yd,T)−oscTh(yh2−yd,T)≤Ch2T‖yh1−yh2‖1,T. | (4.7) |
Proof. Step 1. According to [16,21] we have
‖v‖0,∂T∖∂Ω≤C(h−1/2T‖v‖0,T+h1/2T‖v‖1,T). | (4.8) |
By adopting the inverse estimates and (4.8) we obtain
‖[∇(yh1−yh2)]⋅n‖0,∂T∖∂Ω≤Ch−1/2T‖yh1−yh2‖1,ωT, | (4.9) |
‖[∇(ph1−ph2)]⋅n‖0,∂T∖∂Ω≤Ch−1/2T‖ph1−ph2‖1,ωT, | (4.10) |
where ωT denotes the patch of elements that share an edge with T. By the definition of η1,Th(ph,T) and (4.10) we can deduce that
η1,Th(ph1,T)≤η1,Th(ph2,T)+ChT‖[∇(ph1−ph2)]⋅n‖0,∂T∖∂Ω. |
Then we have
η1,Th(ph1,T)−η1,Th(ph2,T)≤ChT‖[∇(ph1−ph2)]⋅n‖0,∂T∖∂Ω≤ChT‖ph1−ph2‖1,T, |
where the proof of (4.4) is finished.
Step 2. By the definition of \eta_{2, \mathcal{T}_h}(u_h, y_h, T) and (4.9), we can deduce that
\begin{align*} &\eta_{2, \mathcal{T}_h}(u_{h_1}, y_{h_1}, T)-\eta_{2, \mathcal{T}_h}(u_{h_2}, y_{h_2}, T)\\ \leq&h_T\|u_{h_1}-u_{h_2}\|_{0, T}+h_T^{1/2}\|[\nabla (y_{h_1}-y_{h_2})]\cdot\mathbf{n}\|_{0, \partial T\backslash\partial\Omega}+h_T\|\phi(y_{h_1})-\phi(y_{h_2})\|_{0, T}\\ \leq&h_T\|u_{h_1}-u_{h_2}\|_{0, T}+C\|y_{h_1}-y_{h_2}\|_{1, \omega_{T}}+Ch_T\|\tilde{\phi}'(y_{h_1})\|_{0, T}\|y_{h_1}-y_{h_2}\|_{0, T}\\ \leq&C(h_T\|u_{h_1}-u_{h_2}\|_{0, T}+\|y_{h_1}-y_{h_2}\|_{1, T}), \end{align*} |
where \phi(\cdot)\in W^{2, \infty}(\Omega) has been used and the proof of (4.5) is finished.
Step 3. By the definition of \eta_{3, \mathcal{T}_h}(y_h, p_h, T) and (4.10) we can derive that
\begin{align*} &\eta_{3, \mathcal{T}_h}(y_{h_1}, p_{h_1}, T)-\eta_{3, \mathcal{T}_h}(y_{h_2}, p_{h_2}, T)\\ \leq&h_T\|y_{h_1}-y_{h_2}\|_{0, T}+h_T^{1/2}\|[\nabla (p_{h_1}-p_{h_2})]\cdot\mathbf{n}\|_{0, \partial T\backslash\partial\Omega}+h_T\|\phi'(y_{h_1})p_{h_1}-\phi'(y_{h_2})p_{h_2}\|_{0, T}\\ \leq&h_T\|y_{h_1}-y_{h_2}\|_{0, T}+C\|p_{h_1}-p_{h_2}\|_{1, \omega_{T}}+h_T\|\phi'(y_{h_1})\|_{0, T}\|p_{h_1}-p_{h_2}\|_{0, T}\\ &+h_T\|\tilde{\phi}''(y_{h_1})\|_{0, T}\|y_{h_1}-y_{h_2}\|_{0, T}\|p_{h_2}\|_{0, T}\\ \leq&h_T\|y_{h_1}-y_{h_2}\|_{0, T}+C\|p_{h_1}-p_{h_2}\|_{1, \omega_{T}}+Ch_T\|p_{h_1}-p_{h_2}\|_{0, T} \\\leq&C( h_T\|y_{h_1}-y_{h_2}\|_{0, T}+\|p_{h_1}-p_{h_2}\|_{1, T}), \end{align*} |
where \phi(\cdot)\in W^{2, \infty}(\Omega) has been used and the proof of (4.6) is finished. \\Step 4. Similarly, we have
\begin{align*} \ osc_{\mathcal{T}_h}(y_{h_1}-y_d, T)-osc_{\mathcal{T}_h}(y_{h_2}-y_d, T)\leq&h_T^2\|y_{h_1}-y_{h_2}\|_{0, T}. \end{align*} |
In brief, Lemma 4.2 is proved.
The authors in [25] demonstrate an error reduction provided the current errors are larger than the desired errors; that is to say, the errors may not be reduced during coarse-grid refinement until a node of the refined grid is introduced inside each marked element, while Dörfler proves a similar result under an analogous assumption [9].
Lemma 4.3. Let \mathcal{T}_h\subset\tilde{\mathcal{T}_h} for \mathcal{T}_h, \tilde{\mathcal{T}_h}\in\mathbb{T} , and let \mathcal{M}_h\subset\mathcal{T}_h denote the set of elements which are marked from \mathcal{T}_h to \tilde{\mathcal{T}}_h . Then for u_h\in U_{ad}^h, \ \tilde{u}_h \in U_{ad}^{\tilde{h}}, \ y_h, p_h\in V_{h}, \ \tilde{y}_h, \tilde{p}_h\in V_{\tilde{h}} and any \delta, \delta_{1}\in(0, 1] , we have
\begin{align} &\quad\ \eta_{1, \tilde{\mathcal{T}}_h}^2(\tilde{p}_h, \tilde{\mathcal{T}_h})-(1+\delta_{1})\bigg\{\eta_{1, \mathcal{T}_h}^2(p_h, \mathcal{T}_h)-(1-2^{-1/2})\eta_{1, \mathcal{T}_h}^2(p_h, \mathcal{R}_h)\bigg\}\\ &\leq C\left(1+\delta_{1}^{-1}\right)h_{0}^2\|p_h-\tilde{p}_h\|_{1}^2, \end{align} | (4.11) |
and
\begin{align} &\quad\ \eta_{2, \tilde{\mathcal{T}}_h}^2(\tilde{u}_h, \tilde{y}_h, \tilde{\mathcal{T}}_h)-(1+\delta)\bigg\{\eta_{2, \mathcal{T}}^2(u_h, y_h, \mathcal{T}_h) -\lambda\eta_{2, \mathcal{T}_h}^2(u_h, y_h, \mathcal{M}_h)\bigg\}\\ &\leq C(1+\delta^{-1})\left(h_{0}^2\|u_h-\tilde{u}_h\|_{0}^2+\|y_h-\tilde{y}_h\|_{1}^2\right), \end{align} | (4.12) |
and
\begin{align} &\quad\ \eta_{3, \tilde{\mathcal{T}}_h}^2(\tilde{y}_h, \tilde{p}_h, \tilde{\mathcal{T}}_h)-(1+\delta)\bigg\{\eta_{3, \mathcal{T}_h}^2(y_h, p_h, \mathcal{T}_h) -\lambda\eta_{3, \mathcal{T}_h}^2(u_h, y_h, \mathcal{M}_h)\bigg\}\\ &\leq C(1+\delta^{-1})\bigg(h_{0}^2\|y_h-\tilde{y}_h\|_{0}^2+\|p_h-\tilde{p}_h\|_{1}^2\bigg), \end{align} | (4.13) |
and
\begin{align} &\quad\ osc_{\mathcal{T}_h}^2(y_h-y_{d}, \mathcal{T}_h\cap\tilde{\mathcal{T}}_h)-2osc_{\tilde{\mathcal{T}}_h}^2(\tilde{y}_h-y_{d}, \mathcal{T}_h\cap\tilde{\mathcal{T}}_h)\\ &\leq 2Ch_{0}^4\|y_h-\tilde{y}_h\|_{1}^2, \end{align} | (4.14) |
where \lambda = 1-2^{-\frac{b}{2}}, \ h_{0} = \max\limits_{{T\in\mathcal{T}_{h_{0}}}}h_{T} and \mathcal{R}_h denotes the set of elements which are refined from \mathcal{T}_h to \tilde{\mathcal{T}_h} .
Proof. Step 1. Applying Young's inequality with parameter \delta_{1} and (4.4), we get
\begin{align} \eta_{1, \tilde{\mathcal{T}_h}}^2(\tilde{p}_h, \tilde{\mathcal{T}_h})-\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \tilde{\mathcal{T}_h})\leq&Ch_0^2\|p_h-\tilde{p}_h\|_1^2+ 2\eta_{1, \tilde{\mathcal{T}_h}}(\tilde{p}_h, \tilde{\mathcal{T}_h})\cdot\eta_{1, \tilde{\mathcal{T}_h}}(p_h, \tilde{\mathcal{T}_h})\\ \leq&C(1+\delta_1^{-1})h_0^2\|p_h-\tilde{p}_h\|_1^2+\delta_1\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \tilde{\mathcal{T}_h}) . \end{align} | (4.15) |
Note that each element T\in \mathcal{R}_h\subset\mathcal{T}_h will be bisected at least once; then we have
\sum\limits_{T'\subset T}\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, T')\leq2^{-1/2}\eta_{1, \mathcal{T}_h}^2(p_h, T).
For T\in \mathcal{T}_h\backslash\mathcal{R}_h , we gain
\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, T) = \eta_{1, \mathcal{T}_h}^2(p_h, T). |
In connection with the above estimates we demonstrate that
\begin{align*} &\eta_{1, \tilde{\mathcal{T}}_h}^2(\tilde{p}_h, \tilde{\mathcal{T}_h})-(1+\delta_{1})\bigg\{\eta_{1, \mathcal{T}_h}^2(p_h, \mathcal{T}_h)-(1-2^{-1/2})\eta_{1, \mathcal{T}_h}(p_h, \mathcal{R}_h)\bigg\}\\ = &\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \mathcal{R}_h)+\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \mathcal{T}_h\backslash\mathcal{R}_h)-(1+\delta_{1})\bigg\{\eta_{1, \mathcal{T}_h}^2(p_h, \mathcal{T}_h) -(1-2^{-1/2})\eta_{1, \mathcal{T}_h}(p_h, \mathcal{R}_h)\bigg\}\\ \leq&2^{-1/2}\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \mathcal{R}_h)+\eta_{1, \tilde{\mathcal{T}_h}}^2(p_h, \mathcal{T}_h\backslash\mathcal{R}_h)-(1+\delta_{1})\bigg\{\eta_{1, \mathcal{T}_h}^2 (p_h, \mathcal{T}_h)-(1-2^{-1/2})\eta_{1, \mathcal{T}_h}(p_h, \mathcal{R}_h)\bigg\}\\ \leq&C\left(1+\delta_{1}^{-1}\right)h_{0}^2\|p_h-\tilde{p}_h\|_{1}^2, \end{align*} |
which illustrates (4.11) has been proved.
Step 2. Employing Young's inequality with parameter \delta and (4.6), we obtain
\begin{align*} &\eta_{3, \tilde{\mathcal{T}_h}}^2(\tilde{y}_h, \tilde{p}_h, \tilde{\mathcal{T}_h})-\eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, \tilde{\mathcal{T}_h})\\\leq& C(1+\delta^{-1})h_T^2\bigg(\sum\limits_{T\in\mathcal{T}_h}h_{T}\|\tilde{y}_h-y_h\|_{0, T}^2+\|\tilde{p}_h-p_h\|_1^2\bigg) +\delta\eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, \tilde{\mathcal{T}_h}), \end{align*} |
where the first step is similar to (4.15). Then let \tilde{\mathcal{T}}_{h_{T'}} = \{T\in \tilde{\mathcal{T}_h}:T\subset T'\} where T'\in\mathcal{M}_h is a marked element. For arbitrary p_h\in V_{\mathcal{T}_h}\subset V_{\tilde{\mathcal{T}_h}} , the jump [\nabla p_h] = 0 on the interior sides of \cup\tilde{\mathcal{T}}_{h_{T'}} . Supposing b is the number of bisections, we can deduce that
h_{T} = |T|^{1/2}\leq(2^{-b}|T'|)^{1/2}\leq 2^{-\frac{b}{2}}h_{T'}, |
due to refinement by bisection, then we obtain
\begin{align*} \sum\limits_{T\in\tilde{\mathcal{T}_h}_{T'}}\eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, T)\leq2^{-\frac{b}{2}}\eta_{3, \mathcal{T}_h}^2(y_h, p_h, T'). \end{align*} |
It is easy to find that
\begin{align*} \eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, T)\leq\eta_{3, \mathcal{T}_h}^2(y_h, p_h, T), \end{align*} |
for any T\in\mathcal{T}_h\backslash\mathcal{M}_h . In connection with the above estimates we expound that
\begin{align*} \eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, \tilde{\mathcal{T}_h})& = \eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, \mathcal{M}_h) +\eta_{3, \tilde{\mathcal{T}_h}}^2(y_h, p_h, \mathcal{T}_h\backslash\mathcal{M}_h)\\ &\leq 2^{-\frac{b}{2}}\eta_{3, {\mathcal{T}_h}}^2(y_h, p_h, \mathcal{M}_h)+\eta_{3, \mathcal{T}_h}^2(y_h, p_h, \mathcal{T}_h\backslash\mathcal{M}_h)\\ & = \eta_{3, {\mathcal{T}_h}}^2(y_h, p_h, \mathcal{T}_h)-(1-2^{-\frac{b}{2}})\eta_{3, {\mathcal{T}_h}}^2(y_h, p_h, \mathcal{M}_h). \end{align*} |
Combining the above estimates yields (4.13); the proof of (4.12) is similar.
Step 3. For arbitrary T\in\mathcal{T}_h\cap\tilde{\mathcal{T}_h} , by using (4.7) and Young's inequality we obtain
\begin{align*} &osc_{\mathcal{T}_h}(y_h-y_d, T) = osc_{\tilde{\mathcal{T}_h}}(y_h-y_d, T), \\ &osc_{\mathcal{T}_h}^2(y_h-y_d, T)-2osc_{\tilde{\mathcal{T}_h}}^2(\tilde{y}_h-y_d, T)\leq 2Ch_T^4\|\tilde{y}_h-y_h\|_{1, T}^2. \end{align*} |
Summing the above inequality over T\in\mathcal{T}_h\cap\tilde{\mathcal{T}_h} gives (4.14). This completes the proof of Lemma 4.3.
A main obstacle in the proof of convergence is the lack of orthogonality, which is vital to the argument. We therefore resort instead to quasi-orthogonality, which is widely adopted in adaptive mixed and nonconforming adaptive finite element methods [19]. The following basic relationships hold for \mathcal{T}_{h_k}, \mathcal{T}_{h_{k+1}}\in\mathbb{T} with \mathcal{T}_{h_k}\subset\mathcal{T}_{h_{k+1}} :
\begin{align} &\|u-u_{h_{k+1}}\|_{0}^2 = \|u-u_{h_k}\|_{0}^2-\|u_{h_k}-u_{h_{k+1}}\|_{0}^2-2(u-u_{h_{k+1}}, u_{h_{k+1}}-u_{h_k}), \end{align} | (4.16) |
\begin{align} &\|y-y_{h_{k+1}}\|_{1}^2 = \|y-y_{h_k}\|_{1}^2-\|y_{h_k}-y_{h_{k+1}}\|_{1}^2-2a(y-y_{h_{k+1}}, y_{h_{k+1}}-y_{h_k}), \end{align} | (4.17) |
\begin{align} &\|p-p_{h_{k+1}}\|_{1}^2 = \|p-p_{h_k}\|_{1}^2-\|p_{h_k}-p_{h_{k+1}}\|_{1}^2-2a(p-p_{h_{k+1}}, p_{h_{k+1}}-p_{h_k}), \end{align} | (4.18) |
where (u, y, p) is the solution of (2.3)–(2.5), and (u_{h_k}, y_{h_k}, p_{h_k}) and (u_{h_{k+1}}, y_{h_{k+1}}, p_{h_{k+1}}) are the solutions of (2.8)–(2.10) with respect to \mathcal{T}_{h_k} and \mathcal{T}_{h_{k+1}} , respectively. Accordingly, we have the following quasi-orthogonality.
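The relationships (4.16)–(4.18) are just the polarization expansion \|a+b\|^2 = \|a\|^2+\|b\|^2+2(a, b) in disguise. The following small numeric check (an illustration with arbitrary vectors, not the paper's data) confirms the identity of (4.16) term by term.

```python
def dot(a, b):
    """Euclidean inner product, standing in for (., .)."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

u  = [0.3, -1.2, 0.7]   # plays the role of u
u1 = [1.0, 0.5, -0.4]   # plays the role of u_{h_k}
u2 = [0.2, 0.1, 0.9]    # plays the role of u_{h_{k+1}}

# (4.16): ||u-u2||^2 = ||u-u1||^2 - ||u1-u2||^2 - 2(u-u2, u2-u1)
lhs = dot(sub(u, u2), sub(u, u2))
rhs = (dot(sub(u, u1), sub(u, u1))
       - dot(sub(u1, u2), sub(u1, u2))
       - 2 * dot(sub(u, u2), sub(u2, u1)))
assert abs(lhs - rhs) < 1e-12
```

The same check applies verbatim to (4.17) and (4.18) with the inner product replaced by the bilinear form a(\cdot, \cdot) .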
Lemma 4.4. For \mathcal{T}_{h_k}, \mathcal{T}_{h_{k+1}}\in\mathbb{T} and \mathcal{T}_{h_k}\subset\mathcal{T}_{h_{k+1}} , we have
\begin{align} &\quad\ (1-\delta)\|u-u_{h_{k+1}}\|_{0}^2-\|u-u_{h_k}\|_{0}^2+\|u_{h_k}-u_{h_{k+1}}\|_{0}^2\\ &\leq C\delta^{-1}\left(\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2, \mathcal{T}_{h_k}}^2(u_{h_k}, y_{h_k}, \mathcal{R}_h) +\eta_{3, \mathcal{T}_{h_k}}^2(y_{h_k}, p_{h_k}, \mathcal{R}_h)\big)\right), \end{align} | (4.19) |
and
\begin{align} &\quad\ (1-\delta)\|y-y_{h_{k+1}}\|_{1}^2-\|y-y_{h_k}\|_{1}^2+\|y_{h_k}-y_{h_{k+1}}\|_{1}^2-\delta\|u-u_{h_{k+1}}\|_{0}^2\\ &\leq C\delta^{-1}\left(\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2, \mathcal{T}_{h_k}}^2(u_{h_k}, y_{h_k}, \mathcal{R}_h) +\eta_{3, \mathcal{T}_{h_k}}^2(y_{h_k}, p_{h_k}, \mathcal{R}_h)\big)\right), \end{align} | (4.20) |
and
\begin{align} &\quad\ (1-\delta)\|p-p_{h_{k+1}}\|_{1}^2-\|p-p_{h_k}\|_{1}^2+\|p_{h_k}-p_{h_{k+1}}\|_{1}^2-\delta\|y-y_{h_{k+1}}\|_{1}^2\\ &\leq C\delta^{-1}\left(\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_h)+\mathcal{J}^2(h_0)\big(\eta_{2, \mathcal{T}_{h_k}}^2(u_{h_k}, y_{h_k}, \mathcal{R}_h) +\eta_{3, \mathcal{T}_{h_k}}^2(y_{h_k}, p_{h_k}, \mathcal{R}_h)\big)\right). \end{align} | (4.21) |
Proof. Step 1. It follows from Lemma 4.3 in [19] that for U_{ad}^{h_k}\subset U_{ad}^{h_{k+1}} we have
\begin{align} \alpha\|u_{h_{k+1}}-u_{h_k}\|_0^2\leq&(p_{h_{k+1}}-p_{h_k}, u_{h_k}-u_{h_{k+1}})+(\alpha u_{h_k}+p_{h_k}, u_{h_k}-u_{h_{k+1}})\\ \leq&(S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_k}^*(S_{h_k}(f+u_{h_k})-y_{d}), u_{h_k}-u_{h_{k+1}})\\ &+C\eta_{1, \mathcal{T}_{h_k}}(p_{h_k}, \mathcal{R}_h)\|u_{h_k}-u_{h_{k+1}}\|_{0}, \end{align} | (4.22) |
where \mathcal{R}_h is the set of elements which are refined from \mathcal{T}_{h_k} to \mathcal{T}_{h_{k+1}} .
For the first term on the right-hand side of (4.22), let \zeta_h\in H_{0}^1(\Omega) be the solution of the following problem, based on Lemma 4.1:
\begin{align*} &a(\zeta_h, q) = (S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_k}^*(S_{h_k}(f+u_{h_k})-y_{d}), q), \quad\forall \ q\in H_{0}^1(\Omega). \end{align*} |
Hence we can get
\begin{align*} &\|S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0}^2\\ = &a(\zeta_h, S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}))\\ = &a(\zeta_h-\zeta_{h_k}, S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}))\\ &+(S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), \zeta_{h_k}-\zeta_h)+(S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), \zeta_h). \end{align*} |
From the proof of Lemma 3.3 in [19], we infer that
\begin{align*} &a(\zeta_h-\zeta_{h_k}, S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}))\\ \leq&\mathcal{J}(h_0)\|S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d})-S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0} \\ &\cdot\big(\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{1}+\|S_{h_{k+1}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{1}\big), \end{align*} |
and
\begin{align*} &(S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), \zeta_{h_k}-\zeta_h)\\ \leq&C\mathcal{J}(h_0)\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{0}\\& \cdot\|S_{h_{k+1}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0}, \end{align*} |
and
\begin{align*} &(S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), \zeta_h)\\ \leq&\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{0}\cdot\|S_{h_{k+1}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0}. \end{align*} |
Similar to Lemma 4.6 in [4], we derive that
\begin{align*} &\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{1}\leq C\eta_{2, \mathcal{T}_{h_{k}}}(u_{h_k}, y_{h_k}, \mathcal{R}_h), \\ &\|S_{h_{k+1}}^*(S_{h_{k}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{1}\leq C\eta_{3, \mathcal{T}_{h_{k}}}(y_{h_k}, p_{h_k}, \mathcal{R}_h). \end{align*} |
Combining the above estimates, we conclude that
\begin{align*} &\quad\ \|S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0}\\ &\leq C\Big(\mathcal{J}(h_0)(\eta_{2, \mathcal{T}_{h_{k}}}(u_{h_k}, y_{h_k}, \mathcal{R}_h)+\eta_{3, \mathcal{T}_{h_{k}}}(y_{h_k}, p_{h_k}, \mathcal{R}_h)) +\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{0}\Big). \end{align*} |
For the third term on the right-hand side of the above inequality, let \varphi_h\in H_{0}^1(\Omega) be the solution of the following problem:
\begin{align*} &a(q, \varphi_h) = (S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), q), \quad\forall\ q\in H_{0}^1(\Omega). \end{align*} |
According to the standard duality theory, we can deduce that
\begin{align*} &\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{0}^2\\ = &a(S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k}), \varphi_h-\varphi_{h_{k}})\\ \leq&C\mathcal{J}(h_0)\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{1}\cdot\|S_{h_{k+1}}(f+u_{h_k}) -S_{h_{k}}(f+u_{h_k})\|_{0}, \end{align*} |
where \varphi_{h_k} is the standard finite element approximation of \varphi_h in V_{\mathcal{T}_{h_k}} . So we have
\begin{align*} &\|S_{h_{k+1}}^*(S_{h_{k+1}}(f+u_{h_k})-y_{d}) -S_{h_{k}}^*(S_{h_{k}}(f+u_{h_k})-y_{d})\|_{0}\\ \leq& C\Big(\mathcal{J}(h_0)(\eta_{2, \mathcal{T}_{h_{k}}}(u_{h_k}, y_{h_k}, \mathcal{R}_h)+\eta_{3, \mathcal{T}_{h_{k}}}(y_{h_k}, p_{h_k}, \mathcal{R}_h))\Big). \end{align*} |
From the above, we get
\begin{align*} (p_{h_{k+1}}-p_{h_k}, u_{h_k}-u_{h_{k+1}})\leq& C\Big(\mathcal{J}(h_0)(\eta_{2, \mathcal{T}_{h_{k}}}(u_{h_k}, y_{h_k}, \mathcal{R}_h)\\ &+\eta_{3, \mathcal{T}_{h_{k}}}(y_{h_k}, p_{h_k}, \mathcal{R}_h))\|u_{h_k}-u_{h_{k+1}}\|_{0}\Big). \end{align*} |
Combining (4.22) and the above inequality, we deduce that
\begin{equation} \|u_{h_k}-u_{h_{k+1}}\|_{0}\leq C\Big(\eta_{1, \mathcal{T}_{h_k}}(p_{h_k}, \mathcal{R}_h)+\mathcal{J}(h_0)(\eta_{2, \mathcal{T}_{h_k}}(u_{h_k}, y_{h_k}, \mathcal{R}_h)+\eta_{3, \mathcal{T}_{h_k}}(y_{h_k}, p_{h_k}, \mathcal{R}_h))\Big). \end{equation} | (4.23) |
It is easy to derive the desired result (4.19) with the help of (4.16) and (4.23).
Step 2. We now prove (4.20); the proof of (4.21) is similar. Obviously, we have
\begin{align} \|y_{h_{k+1}}-y_{h_k}\|_{0}& = \|S_{h_{k+1}}(f+u_{h_{k+1}})-S_{h_{k}}(f+u_{h_k})\|_{0}\\ &\leq\|S_{h_{k+1}}(f+u_{h_{k+1}})-S_{h_{k+1}}(f+u_{h_k})\|_{0}+\|S_{h_{k+1}}(f+u_{h_k})-S_{h_{k}}(f+u_{h_k})\|_{0}\\ &\leq C(\|u_{h_k}-u_{h_{k+1}}\|_{0}+\mathcal{J}(h_0)\eta_{2, \mathcal{T}_{h_k}}(u_{h_k}, y_{h_k}, \mathcal{R}_h)). \end{align} | (4.24) |
By using the Cauchy inequality, we obtain
\begin{align} &2a(y-y_{h_{k+1}}, y_{h_{k+1}}-y_{h_k})\\ = &2(u-u_{h_{k+1}}, y_{h_{k+1}}-y_{h_k})-2(\phi(y)-\phi(y_{h_{k+1}}), y_{h_{k+1}}-y_{h_k})\\ \leq&\delta\|u-u_{h_{k+1}}\|_{0}^2+\frac{1}{\delta}\|y_{h_{k+1}}-y_{h_k}\|_{0}^2+(\tilde{\phi}'(y)(y-y_{h_{k+1}}), y_{h_{k+1}}-y_{h_k})\\ \leq&\delta\|u-u_{h_{k+1}}\|_{0}^2+\frac{1}{\delta}\|y_{h_{k+1}}-y_{h_k}\|_{0}^2+\|\tilde{\phi}'(y)\|_0\|y-y_{h_{k+1}}\|_0\|y_{h_{k+1}}-y_{h_k}\|_0\\ \leq&\delta\|u-u_{h_{k+1}}\|_{0}^2+\frac{1}{\delta}\|y_{h_{k+1}}-y_{h_k}\|_{0}^2+C\left(\delta\|y-y_{h_{k+1}}\|_0^2+\frac{1}{\delta}\|y_{h_{k+1}}-y_{h_k}\|_0^2\right). \end{align} | (4.25) |
It is easy to derive the desired result (4.20) with the assistance of (4.17) and (4.23)–(4.25).
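The weighted Young inequality 2ab\leq\delta a^2+\delta^{-1}b^2 used repeatedly in Steps 1 and 2 can be sanity-checked numerically; the values below are hypothetical and only illustrate that the margin is always nonnegative.

```python
# Weighted Young inequality: 2ab <= delta*a^2 + (1/delta)*b^2 for delta > 0.
delta = 0.25
pairs = [(1.0, 2.0), (0.3, -0.7), (5.0, 0.1)]

# margin = RHS - LHS; nonnegativity is exactly the inequality
margins = [delta * a * a + b * b / delta - 2 * a * b for a, b in pairs]
assert all(m >= 0.0 for m in margins)
```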
For \mathcal{T}_{h_k}\in \mathbb{T} , let U_{ad}^{h_{k}} and V_{h_{k}} denote the admissible set and the finite element space with respect to \mathcal{T}_{h_{k}} , and let (u_{h_{k}}, y_{h_{k}}, p_{h_{k}}) be the solution of (2.8)–(2.10) on \mathcal{T}_{h_{k}} . Before proving the convergence of Algorithm 3.1, we define the following notations:
\begin{align*} &e_{h_k}^2 = \|u-u_{h_k}\|_{0}^2+\|y-y_{h_k}\|_{1}^2+\|p-p_{h_k}\|_{1}^2, \\ &E_{h_k}^2 = \|u_{h_k}-u_{h_{k+1}}\|_{0}^2+\|y_{h_k}-y_{h_{k+1}}\|_{1}^2+\|p_{h_k}-p_{h_{k+1}}\|_{1}^2, \\ &\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\omega) = \eta_{2, \mathcal{T}_{h_k}}^2(u_{h_k}, y_{h_k}, \omega)+\eta_{3, \mathcal{T}_{h_k}}^2(y_{h_k}, p_{h_k}, \omega), \end{align*} |
for \omega\subset\Omega .
Theorem 4.1. Let (\mathcal{T}_{h_k}, U_{ad}^{h_k}, V_{h_k}, u_{h_k}, y_{h_k}, p_{h_k}) be the sequence of grids, finite element spaces and discrete solutions produced by Algorithm 3.1. Then there exist constants \gamma_{1} > 0, \ \gamma_{2} > 0 and \alpha\in (0, 1) , only depending on the shape regularity of the initial grid \mathcal{T}_{h_0}, \ b , and the marking parameter \theta\in(0, 1] , such that
\begin{equation} e_{h_{k+1}}^2+\gamma_{1}\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})+\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}}) \leq\alpha\left(e_{h_k}^2+\gamma_{1}\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\right), \end{equation} | (4.26) |
provided h_{0}\ll 1 .
Proof. We get the following results from Theorem 2.1, Lemma 4.3 and Lemma 4.4,
\begin{align} e_{h_k}^2&\leq C\eta_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}), \end{align} | (4.27) |
\begin{align} \tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})&\leq(1+\delta)\Big\{\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}) -\lambda\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{M}_{h_k})\Big\}+C\left(1+\frac{1}{\delta}\right)E_{h_k}^2, \end{align} | (4.28) |
\begin{align} \eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})&\leq(1+\delta_{1})\Big\{\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k}) -(1-2^{-1/2})\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_{h_k})\Big\}\\ &\quad + C\left(1+\frac{1}{\delta_{1}}\right)h_{0}^2\|p_{h_k}-p_{h_{k+1}}\|_{1}^2, \end{align} | (4.29) |
\begin{align} (1-2\delta)e_{h_{k+1}}^2&\leq e_{h_k}^2-E_{h_k}^2+C\frac{1}{\delta}\Big(\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_{h_k})+\mathcal{J}^2(h_0)\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big), \end{align} | (4.30) |
where \mathcal{R}_{h_k} is the set of elements which are refined from \mathcal{T}_{h_k} to \mathcal{T}_{h_{k+1}} . Applying the upper bound in Theorem 2.1, (4.29) can be simplified into
\begin{align} \eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})&\leq (1+\delta_{1})\Big\{\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})- (1-2^{-1/2})\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_{h_k})\Big\}\\ &\quad+C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big(\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})+\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})\\ &\quad+\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big). \end{align} | (4.31) |
Multiplying (4.28) and (4.31) with \tilde{\gamma_{2}} and \tilde{\gamma_{1}} , respectively, and adding the results to (4.30) yields
\begin{align*} &\ \quad(1-2\delta)e_{h_{k+1}}^2+\tilde{\gamma_{1}}\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})+\tilde{\gamma_{2}}\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})\\ &\leq e_{h_k}^2+\tilde{\gamma_{2}}(1+\delta)\Big\{\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})-\lambda\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{M}_{h_k})\Big\}\\ &\quad +\tilde{\gamma_{2}}C\Big(1+\frac{1}{\delta}\Big)E_{h_k}^2-E_{h_k}^2-\Big(\tilde{\gamma_{1}}(1+\delta_{1})(1-2^{-1/2})-C\frac{1}{\delta}\Big)\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{R}_{h_k})\\ &\quad +\tilde{\gamma_{1}}(1+\delta_{1})\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+C\frac{1}{\delta}\mathcal{J}^2(h_0)\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\\ &\quad +\tilde{\gamma_{1}}C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big(\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})+\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})+ \eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big). \end{align*} |
If \tilde{\gamma_{1}} is such that
\tilde{\gamma_{1}}(1+\delta_{1})(1-2^{-1/2})-C\frac{1}{\delta} > 0, |
and one chooses \tilde{\gamma_{2}} such that
\begin{equation} \tilde{\gamma_{2}}C\Big(1+\frac{1}{\delta}\Big) = 1, \end{equation} | (4.32) |
then we have
\begin{align*} &\quad(1-2\delta)e_{h_{k+1}}^2+\tilde{\gamma_{1}}\Big(1-C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})\\ &\quad +\Big(\tilde{\gamma_{2}}-\tilde{\gamma_{1}}C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})\\ &\leq e_{h_k}^2+\tilde{\gamma_{1}}\Big((1+\delta_{1})+C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})\\ &\quad +\Big(\tilde{\gamma_{2}}(1+\delta)+\tilde{\gamma_{1}}C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2+C\frac{1}{\delta}\mathcal{J}^2(h_0)\Big)\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\\ &\quad -c\Big(\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{M}_{h_k})+\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{M}_{h_k})\Big), \end{align*} |
where
c = \min\Big\{\tilde{\gamma_{2}}(1+\delta)\lambda, \tilde{\gamma_{1}}(1+\delta_{1})(1-2^{-1/2})-C\frac{1}{\delta}\Big\}. |
Using the marking strategy in Algorithm 3.1 and the upper bound in Theorem 2.1, we arrive at
\begin{align*} &\quad(1-2\delta)e_{h_{k+1}}^2+\tilde{\gamma_{1}}\Big(1-C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})\\ &\quad +\Big(\tilde{\gamma_{2}}-\tilde{\gamma_{1}}C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})\\ &\leq(1-C\theta\beta)e_{h_k}^2+\Big(\tilde{\gamma_{1}}\Big((1+\delta_{1})+C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2\Big)-c\theta(1-\beta)\Big)\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})\\ &\quad +\Big(\tilde{\gamma_{2}}(1+\delta)+\tilde{\gamma_{1}}C\Big(1+\frac{1}{\delta_{1}}\Big)h_{0}^2+C\frac{1}{\delta}\mathcal{J}^2(h_{0})-c\theta(1-\beta)\Big) \tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}), \end{align*} |
where \beta\in(0, 1) . Then we deduce that
\begin{align} &\ \quad e_{h_{k+1}}^2+\gamma_{1}\eta_{1, \mathcal{T}_{h_{k+1}}}^2(p_{h_{k+1}}, \mathcal{T}_{h_{k+1}})+\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_{k+1}}}^2(\mathcal{T}_{h_{k+1}})\\ &\leq\alpha_{1}e_{h_k}^2+\alpha_{2}\gamma_{1}\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+\alpha_{3}\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}), \end{align} |
where
\begin{align*} &\alpha_{1} = \frac{1-C\theta\beta}{1-2\delta}, \\ &\gamma_{1} = \frac{\tilde{\gamma_{1}}\Big(1-C(1+\delta_{1}^{-1})h_{0}^2\Big)}{1-2\delta}, \\ &\gamma_{2} = \frac{\tilde{\gamma_{2}}-\tilde{\gamma_{1}}C(1+\delta_{1}^{-1})h_{0}^2}{1-2\delta}, \\ &\alpha_{2} = \frac{\tilde{\gamma_{1}}\Big((1+\delta_{1})+C(1+\delta_{1}^{-1})h_{0}^2\Big)-c\theta(1-\beta)}{\tilde{\gamma_{1}}\Big(1-C(1+\delta_{1}^{-1})h_{0}^2\Big)}, \\ &\alpha_{3} = \frac{\tilde{\gamma_{2}}(1+\delta)+\tilde{\gamma_{1}}C(1+\delta_{1}^{-1})h_{0}^2+C\delta^{-1}\mathcal{J}^2(h_0)-c\theta(1-\beta)} {\tilde{\gamma_{2}}-\tilde{\gamma_{1}}C(1+\delta_{1}^{-1})h_{0}^2}. \end{align*} |
As long as \delta < 1 and \beta is small enough, \alpha_{1}\in(0, 1) is guaranteed. To facilitate the verification, we rewrite the above formulas as:
\begin{align*} &\alpha_{2} = \frac{(1-C(1+\delta_{1}^{-1})h_{0}^2)+\delta_{1}+2C(1+\delta_{1}^{-1})h_{0}^2-\frac{c\theta(1-\beta)}{\tilde{\gamma_{1}}}}{1-C(1+\delta_{1}^{-1})h_{0}^2}, \\ &\alpha_{3} = \frac{\tilde{\gamma_{2}}(1+\delta)-\tilde{\gamma_{1}} C(1+\delta_{1}^{-1})h_{0}^2+2\tilde{\gamma_{1}} C(1+\delta_{1}^{-1})h_{0}^2+C\delta^{-1}\mathcal{J}^2(h_{0})-c\theta(1-\beta)}{\tilde{\gamma_{2}}-\tilde{\gamma_{1}}C(1+\delta_{1}^{-1})h_{0}^2}. \end{align*} |
It is clear that
\alpha_{2}\in(0, 1), |
if h_{0}\ll 1 and \delta_{1} is sufficiently small. Then, by (4.32), we deduce that
\gamma_{2} = \frac{\delta^2}{C(1+\delta)}, |
which implies
\alpha_{3}\in(0, 1), |
if h_{0}\ll 1 and \delta is sufficiently small. Therefore, choosing \alpha = \max\{\alpha_{1}, \alpha_{2}, \alpha_{3}\} , we derive the expected result.
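The contraction (4.26) guarantees that the combined quantity Q_k = e_{h_k}^2+\gamma_{1}\eta_{1, \mathcal{T}_{h_k}}^2+\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_k}}^2 decays at least geometrically. A minimal sketch of the decay this implies, with a hypothetical contraction factor \alpha :

```python
# Geometric decay implied by the contraction Q_{k+1} <= alpha * Q_k,
# with hypothetical alpha in (0, 1) and initial value Q0.
alpha, Q0 = 0.7, 1.0
Q = [Q0]
for _ in range(20):
    Q.append(alpha * Q[-1])   # worst case allowed by (4.26)

# after 20 steps the quantity is below alpha^20 * Q0 (up to rounding)
assert Q[-1] <= alpha ** 20 * Q0 + 1e-15
```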
Theorem 4.2. Let (\mathcal{T}_{h_k}, U_{ad}^{h_k}, V_{h_k}, u_{h_k}, y_{h_k}, p_{h_k}) be the sequence of grids, finite element spaces and discrete solutions produced by Algorithm 3.1, and let the conditions of Theorem 4.1 hold. Then we have
\begin{align*} \|u-u_{h_k}\|_0^2+\|y-y_{h_k}\|^2_1+\|p-p_{h_k}\|_1^2\rightarrow 0 \quad\mathrm{as}\quad k\rightarrow \infty. \end{align*} |
Proof. This follows directly by combining Theorem 2.1 and Theorem 4.1.
In this section we consider the quasi-optimality of the adaptive finite element method. First we explain the notation. For \mathcal{T}_h, \mathcal{T}_{h_1}, \mathcal{T}_{h_2}\in \mathbb{T} , let \#\mathcal{T}_h be the number of elements in \mathcal{T}_h , and let \mathcal{T}_{h_1}\oplus\mathcal{T}_{h_2} be the smallest common conforming refinement of \mathcal{T}_{h_1} and \mathcal{T}_{h_2} , which satisfies [4,27]
\begin{equation} \#(\mathcal{T}_{h_1}\oplus\mathcal{T}_{h_2})\leq \#\mathcal{T}_{h_1}+\#\mathcal{T}_{h_2}-\#\mathcal{T}_{h_0}. \end{equation} | (5.1) |
Following [19], we define a function approximation class
\begin{align*} \mathcal{A}^s: = &\{(u, y, p, y_{d}, f)\in L^2(\Omega)\times H_{0}^1(\Omega)\times H_{0}^1(\Omega)\times L^2(\Omega)\\ &\times L^2(\Omega):|(u, y, p, y_{d}, f)|_{s} < +\infty\}, \end{align*} |
where
\begin{align*} |(u, y, p, y_{d}, f)|_{s}: = &\sup\limits_{N > 0}N^s\inf\limits_{\mathcal{T}_h\in\mathbb{T}_{N}}\inf\limits_{(u_{h}, y_{h}, p_{h})\in U_{ad}^{h}\times V_{h}\times V_{h}}\{\|u-u_{h}\|_{0}^2\\ &+\|y-y_{h}\|_{1}^2+\|p-p_{h}\|_{1}^2+osc_{\mathcal{T}_h}^2(f, \mathcal{T}_h)+osc_{\mathcal{T}_h}^2(y_{h}-y_{d}, \mathcal{T}_h)\}^{\frac{1}{2}}, \end{align*} |
and
\mathbb{T}_{N}: = \{\mathcal{T}_h\in\mathbb{T}:\#\mathcal{T}_h-\#\mathcal{T}_{h_0}\leq N\}. |
To illustrate the quasi-optimality of the adaptive finite element method, we need a local upper bound on the distance between nested solutions [4], since the error of this method can only be estimated by using the indicators of refined elements without a buffer layer.
Lemma 5.1. For \mathcal{T}_h, \tilde{\mathcal{T}}_h\in \mathbb{T} and \mathcal{T}_h\subset\tilde{\mathcal{T}}_h , let \mathcal{R}_h bethe set of refined elements from \mathcal{T}_h to \tilde{\mathcal{T}}_h . Let (u_{h}, y_{h}, p_{h}) and (\tilde{u}, \tilde{y}, \tilde{p}) be the solutions of (2.8)–(2.10) with respect to \mathcal{T}_h and \tilde{\mathcal{T}}_h , respectively. Then there exists a constant C , depending on the shape regularity of initial grids \mathcal{T}_{h_{0}} and b such that
\begin{equation} \|u_{h}-\tilde{u}\|_{0}^2+\|y_{h}-\tilde{y}\|_{1}^2+\|p_{h}-\tilde{p}\|_{1}^2\leq C\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \end{equation} | (5.2) |
where
\eta_{\mathcal{T}_h}^2(\mathcal{R}_h) = \eta_{1, \mathcal{T}_h}^2(p_{h}, \mathcal{R}_h)+\eta_{2, \mathcal{T}_h}^2(u_{h}, y_{h}, \mathcal{R}_h)+\eta_{3, \mathcal{T}_h}^2(y_{h}, p_{h}, \mathcal{R}_h). |
Proof. From (4.23) of Lemma 4.4, we have
\begin{equation} \|u_{h}-\tilde{u}\|_{0}^2\leq\eta_{\mathcal{T}_h}^2(\mathcal{R}_h). \end{equation} | (5.3) |
By Lemma 4.6 in [4], we deduce that
\begin{align} \|y_{h}-\tilde{y}\|_{1}& = \|S_{h}(f+u_{h})-S_{\tilde{h}}(f+\tilde{u})\|_{1}\\ &\leq\|S_{h}(f+u_{h})-S_{\tilde{h}}(f+u_{h})\|_{1}+\|S_{\tilde{h}}(f+u_{h})-S_{\tilde{h}}(f+\tilde{u})\|_{1}\\ &\leq C(\eta_{2, \mathcal{T}_h}(u_{h}, y_{h}, \mathcal{R}_h)+\|u_{h}-\tilde{u}\|_{0}). \end{align} | (5.4) |
The corresponding result for the adjoint state is obtained by the same argument:
\begin{align} \|p_{h}-\tilde{p}\|_{1}& = \|S_{h}^*(S_{h}(f+u_{h})-y_{d})-S_{\tilde{h}}^*(S_{\tilde{h}}(f+\tilde{u})-y_{d})\|_{1}\\ \ &\leq C(\eta_{3, \mathcal{T}_h}(y_{h}, p_{h}, \mathcal{R}_h)+\|y_{h}-\tilde{y}\|_{1}). \end{align} | (5.5) |
In connection with (5.3), (5.4) and (5.5) we can derive (5.2).
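The combined indicator \eta_{\mathcal{T}_h}^2(\mathcal{R}_h) on the right of (5.2) is simply the elementwise sum of the three squared contributions over the refined set. A minimal sketch with hypothetical per-element values:

```python
# Hypothetical squared indicator contributions per refined element.
eta1_sq = {"T1": 0.20, "T2": 0.10}   # eta_{1,T_h}^2(p_h, .)
eta2_sq = {"T1": 0.05, "T2": 0.15}   # eta_{2,T_h}^2(u_h, y_h, .)
eta3_sq = {"T1": 0.10, "T2": 0.20}   # eta_{3,T_h}^2(y_h, p_h, .)
refined = ["T1", "T2"]               # the set R_h

# eta^2(R_h) = eta_1^2 + eta_2^2 + eta_3^2, accumulated elementwise
total = sum(eta1_sq[t] + eta2_sq[t] + eta3_sq[t] for t in refined)
assert abs(total - 0.8) < 1e-12
```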
Dörfler introduced a crucial marking strategy and proved strict energy error reduction for the Laplacian, provided the initial grid \mathcal{T}_{h_0} satisfies a mild assumption [9]. If the sum of the errors satisfies a suitable error reduction, the error indicators on the coarse grid must satisfy a Dörfler property with respect to the refined one [19].
Lemma 5.2. Assume that the marking parameter \theta\in(0, \theta^*) , where
\theta^* = \frac{C}{2C(1+h_{0}^4)+1}. |
For \mathcal{T}_h, \tilde{\mathcal{T}}_h\in\mathbb{T} and \mathcal{T}_h\subset\tilde{\mathcal{T}}_h , let (u_{h}, y_{h}, p_{h}) and (\tilde{u}, \tilde{y}, \tilde{p}) be thesolutions of (2.8)–(2.10) with respect to \mathcal{T}_h and \tilde{\mathcal{T}}_h , respectively. If
\begin{equation} e_{\tilde{\mathcal{T}}_h}^2+osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h)\leq \mu[e_{\mathcal{T}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h)], \end{equation} | (5.6) |
is satisfied for \mu: = \frac{1}{2}\Big(1-\frac{\theta}{\theta^*}\Big) . Then, the set \mathcal{R}_h of elements which are refined from \mathcal{T}_h to \tilde{\mathcal{T}}_h satisfies the Dörfler property
\eta_{\mathcal{T}_h}^2(\mathcal{R}_h)\geq\theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h), |
where
\begin{align*} &e_{\mathcal{T}_h}^2 = \|u-u_{h}\|_{0}^2+\|y-y_{h}\|_{1}^2+\|p-p_{h}\|_{1}^2, \\ &osc_{\mathcal{T}_h}^2(\omega) = osc_{\mathcal{T}_h}^2(f, \omega)+osc_{\mathcal{T}_h}^2(y_{h}-y_{d}, \omega), \end{align*} |
for \omega\subset\mathcal{T}_h and e_{\tilde{\mathcal{T}}_h}^2 , osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h) similarly defined.
Proof. By the lower bound in Theorem 2.1 and (5.6), we obtain
\begin{align} (1-2\mu)C\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)&\leq(1-2\mu)(e_{\mathcal{T}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h))\\ &\leq e_{\mathcal{T}_h}^2-2e_{\tilde{\mathcal{T}}_h}^2+osc_{\mathcal{T}_h}^2(\mathcal{T}_h)-2osc_{\tilde{\mathcal{T}}_h}^2(\tilde{\mathcal{T}}_h). \end{align} |
It is well-known that the following fundamental relationships hold:
\begin{align} &\|u-u_{h}\|_{0}^2\leq2\|u-\tilde{u}\|_{0}^2+2\|u_{h}-\tilde{u}\|_{0}^2, \\ &\|y-y_{h}\|_{1}^2\leq2\|y-\tilde{y}\|_{1}^2+2\|y_{h}-\tilde{y}\|_{1}^2, \\ &\|p-p_{h}\|_{1}^2\leq2\|p-\tilde{p}\|_{1}^2+2\|p_{h}-\tilde{p}\|_{1}^2. \end{align} |
Hence we can get the following result from Lemma 5.1:
\begin{equation} e_{\mathcal{T}_h}^2-2e_{\tilde{\mathcal{T}}_h}^2\leq2C\eta_{\mathcal{T}_h}^2(\mathcal{R}_h). \end{equation} | (5.7) |
For T\in \mathcal{T}_h\cap\tilde{\mathcal{T}}_h , we can get the following result from (4.14) of Lemma 4.3
\begin{equation} osc_{\mathcal{T}_h}^2(y_{h}-y_{d}, \mathcal{T}_h\cap\tilde{\mathcal{T}}_h)-2osc_{\tilde{\mathcal{T}}_h}^2(\tilde{y}-y_{d}, \mathcal{T}_h\cap\tilde{\mathcal{T}}_h)\leq 2Ch_{0}^4\eta_{\mathcal{T}_h}^2(\mathcal{R}_h). \end{equation} | (5.8) |
According to Remark 2.1 in [4], since the indicator \eta_{\mathcal{T}_h}(T) dominates the oscillation osc_{\mathcal{T}_h}(T) , we have
\begin{equation} osc_{\mathcal{T}_h}^2(T)\leq\eta_{\mathcal{T}_h}^2(T), \end{equation} | (5.9) |
for all T\in\mathcal{R}_h . Then, combining (5.7), (5.8) and (5.9), one obtains
\begin{align*} (1-2\mu)C\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)&\leq(2C(1+h_{0}^4)+1)\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \\ (1-2\mu)\theta^*\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)&\leq\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \\ \theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h)&\leq\eta_{\mathcal{T}_h}^2(\mathcal{R}_h), \end{align*} |
where
\begin{align*} &\theta^* = \frac{C}{2C(1+h_{0}^4)+1}, \quad\mathrm{and}\quad\theta = (1-2\mu)\theta^*. \end{align*} |
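The Dörfler (bulk) property \eta_{\mathcal{T}_h}^2(\mathcal{R}_h)\geq\theta\eta_{\mathcal{T}_h}^2(\mathcal{T}_h) is realized in practice by marking a (near-)minimal set of elements whose squared indicators capture the fraction \theta of the total. A hedged sketch of such a marking routine (the element values are hypothetical; the paper's Algorithm 3.1 specifies the actual strategy):

```python
def dorfler_mark(eta_sq, theta):
    """Greedy Doerfler marking: pick elements by decreasing squared indicator
    until the marked set M satisfies eta^2(M) >= theta * eta^2(T_h)."""
    total = sum(eta_sq.values())
    marked, acc = set(), 0.0
    for elem, val in sorted(eta_sq.items(), key=lambda kv: -kv[1]):
        if acc >= theta * total:
            break
        marked.add(elem)
        acc += val
    return marked

# hypothetical squared indicators on a 4-element mesh
eta_sq = {"T1": 0.5, "T2": 0.3, "T3": 0.15, "T4": 0.05}
M = dorfler_mark(eta_sq, theta=0.6)

# the bulk criterion holds for the marked set
assert sum(eta_sq[t] for t in M) >= 0.6 * sum(eta_sq.values())
```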
Lemma 5.3. Let (u, y, p) and (\mathcal{T}_{h_k}, U_{ad}^{h_k}, V_{h_k}, u_{h_k}, y_{h_k}, p_{h_k}) be the solution of (2.3)–(2.5) and the sequence of grids, finite element spaces and discrete solutions produced by Algorithm 3.1, respectively. Assume that the marking parameter \theta satisfies the condition in Lemma 5.2; then the following estimate is valid
\begin{equation} \#\mathcal{M}_{h_k}\leq C\Big( N^{\frac{1}{2s}}|(u, y, p, y_{d}, f)|_{s}^{\frac{1}{s}}\mu^{-\frac{1}{2s}}(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}))^{-\frac{1}{2s}}\Big), \end{equation} | (5.10) |
if (u, y, p, y_{d}, f)\in \mathcal{A}^s .
Proof. Let \epsilon^2: = \mu N^{-1}(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})) , where N shall be produced in the proof of (5.13) and \mu is defined in Lemma 5.2. Since (u, y, p, y_{d}, f)\in \mathcal{A}^s , there exist a \mathcal{T}_{h_\epsilon}\in\mathbb{T} and a (u_{h_\epsilon}, y_{h_\epsilon}, p_{h_\epsilon})\in U_{ad}^{h_{\epsilon}}\times V_{h_{\epsilon}}\times V_{h_{\epsilon}} such that
\begin{equation} \#\mathcal{T}_{h_\epsilon}-\#\mathcal{T}_{h_0}\leq|(u, y, p, y_{d}, f)|_{s}^{1/s}\epsilon^{-1/s}, \end{equation} | (5.11) |
and
\begin{equation} \|u-u_{h_\epsilon}\|_{0}^2+\|y-y_{h_\epsilon}\|_{1}^2+\|p-p_{h_\epsilon}\|_{1}^2+osc_{\mathcal{T}_{h_\epsilon}}^2(f, \mathcal{T}_{h_\epsilon}) +osc_{\mathcal{T}_{h_\epsilon}}^2(y_{h_\epsilon}-y_{d}, \mathcal{T}_{h_\epsilon})\leq\epsilon^2. \end{equation} | (5.12) |
Let (u_{h_*}, y_{h_*}, p_{h_*}) be the solution of (2.8)–(2.10) with respect to \mathcal{T}_{h_*} , where \mathcal{T}_{h_*} = \mathcal{T}_{h_\epsilon}\oplus\mathcal{T}_{h_k} is the smallest common refinement of \mathcal{T}_{h_\epsilon} and \mathcal{T}_{h_k} . We first prove the following inequality:
\begin{equation} e_{h_*}^2+osc_{\mathcal{T}_{h_*}}^2(\mathcal{T}_{h_*})\leq N\Big(e_{\mathcal{T}_{h_\epsilon}}^2+osc_{\mathcal{T}_{h_\epsilon}}^2(\mathcal{T}_{h_\epsilon})\Big), \end{equation} | (5.13) |
where
\begin{align*} &e_{\mathcal{T}_{h_\epsilon}}^2: = \|u-u_{h_\epsilon}\|_{0}^2+\|y-y_{h_\epsilon}\|_{1}^2+\|p-p_{h_\epsilon}\|_{1}^2, \\ &osc_{\mathcal{T}_{h_\epsilon}}^2(\mathcal{T}_{h_\epsilon}): = osc_{\mathcal{T}_{h_\epsilon}}^2(f, \mathcal{T}_{h_\epsilon}) +osc_{\mathcal{T}_{h_\epsilon}}^2(y_{h_\epsilon}-y_{d}, \mathcal{T}_{h_\epsilon}). \end{align*} |
By adding and subtracting terms, we have
\begin{align*} &\|u-u_{h_\epsilon}\|_{0}^2 = \|u-u_{h_*}\|^2_{0}+\|u_{h_\epsilon}-u_{h_*}\|_{0}^2+2(u-u_{h_*}, u_{h_*}-u_{h_\epsilon}), \\ &\|y-y_{h_\epsilon}\|_{1}^2 = \|y-y_{h_*}\|_{1}^2+\|y_{h_\epsilon}-y_{h_*}\|_{1}^2+2a(y-y_{h_*}, y_{h_*}-y_{h_\epsilon}), \\ &\|p-p_{h_\epsilon}\|_{1}^2 = \|p-p_{h_*}\|_{1}^2+\|p_{h_\epsilon}-p_{h_*}\|_{1}^2+2a(p-p_{h_*}, p_{h_*}-p_{h_\epsilon}). \end{align*} |
With the help of Young's inequality, one obtains
\begin{align*} (u-u_{h_*}, u_{h_*}-u_{h_\epsilon})& = (u-u_{h_\epsilon}, u_{h_*}-u_{h_\epsilon})-(u_{h_*}-u_{h_\epsilon}, u_{h_*}-u_{h_\epsilon})\\ &\leq (u-u_{h_\epsilon}, u_{h_*}-u_{h_\epsilon})\\ &\leq\|u-u_{h_\epsilon}\|_{0}^2+\frac{1}{4}\|u_{h_*}-u_{h_\epsilon}\|_{0}^2, \end{align*} |
and similarly for a(y-y_{h_*}, y_{h_*}-y_{h_\epsilon}) and a(p-p_{h_*}, p_{h_*}-p_{h_\epsilon}) . Hence, combining the above, we deduce that
\begin{align} &\ \quad \|u-u_{h_*}\|_{0}^2+\|u_{h_*}-u_{h_\epsilon}\|_{0}^2+\|y-y_{h_*}\|_{1}^2+\|y_{h_*}-y_{h_\epsilon}\|_{1}^2+\|p-p_{h_*}\|_{1}^2+\|p_{h_*}-p_{h_\epsilon}\|_{1}^2\\ &\leq 6\Big(\|u-u_{h_\epsilon}\|_{0}^2+\|y-y_{h_\epsilon}\|_{1}^2+\|p-p_{h_\epsilon}\|_{1}^2\Big). \end{align} | (5.14) |
From Remark 2.1 in [4] and (4.14) in Lemma 4.3 with \mathcal{T}_h = \tilde{\mathcal{T}}_h = \mathcal{T}_{h_*}, \ y = y_{h_*} and \tilde{y} = y_{h_\epsilon} , we obtain that
\begin{align} osc_{\mathcal{T}_{h_*}}^2(y_{h_*}-y_{d}, \mathcal{T}_{h_*})-2osc_{\mathcal{T}_{h_\epsilon}}^2(y_{h_\epsilon}-y_{d}, \mathcal{T}_{h_\epsilon})&\leq osc_{\mathcal{T}_{h_*}}^2(y_{h_*}-y_{d}, \mathcal{T}_{h_*})-2osc_{\mathcal{T}_{h_*}}^2(y_{h_\epsilon}-y_{d}, \mathcal{T}_{h_*})\\ &\leq2Nh_{0}^4\|y_{h_*}-y_{h_\epsilon}\|_{1}^2. \end{align} | (5.15) |
For any T'\in\mathcal{T}_{h_\epsilon} , let \mathcal{T}_{h_{T'}}: = \{T\in \mathcal{T}_{h_*}:T\subset T'\} . From the proof of Lemma 4.3 in [19], we derive that
\begin{align*} \sum\limits_{T\in\mathcal{T}_{h_{T'}}}\|f-f_{T}\|_{0, T}^2\leq N\|f-f_{T'}\|_{0, T'}^2. \end{align*} |
Then we get
\begin{equation} osc_{\mathcal{T}_{h_*}}^2(f, \mathcal{T}_{h_*})\leq N\, osc_{\mathcal{T}_{h_\epsilon}}^2(f, \mathcal{T}_{h_\epsilon}). \end{equation} | (5.16) |
Combining (5.14)–(5.16) yields (5.13). Then, using (5.12) and the definition of \epsilon^2 , we have
\begin{align} e_{h_*}^2+osc_{\mathcal{T}_{h_*}}^2(\mathcal{T}_{h_*})\leq N\Big(e_{\mathcal{T}_{h_\epsilon}}^2+osc_{\mathcal{T}_{h_\epsilon}}^2(\mathcal{T}_{h_\epsilon})\Big)\leq N\epsilon^2 = \mu\Big(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big). \end{align} |
From Lemma 5.2, the following result holds:
\begin{equation} \#\mathcal{M}_{h_k}\leq\#\mathcal{R}_h\leq\#\mathcal{T}_{h_*}-\#\mathcal{T}_{h_k}\leq\#\mathcal{T}_{h_\epsilon}-\#\mathcal{T}_{h_0}. \end{equation} | (5.17) |
Combining (5.11), (5.17) and the definition of \epsilon^2 , we derive the desired result (5.10).
The following result is a consequence of the previous estimates, relating the number of elements to the error. Namely, if the exact solution can be approximated at a certain convergence rate, the iteratively constructed grid sequence achieves this rate up to a constant factor.
Theorem 5.1. Let (u, y, p) and (\mathcal{T}_{h_k}, U_{ad}^{h_k}, V_{h_k}, u_{h_k}, y_{h_k}, p_{h_k}) be the solution of (2.3)–(2.5) and the sequence of grids, finite element spaces and discrete solutions produced by Algorithm 3.1, respectively. Assume that \mathcal{T}_{h_0} satisfies condition (b) of Section 4 in [27]. Let (u, y, p, y_{d}, f)\in \mathcal{A}^{s} ; then we have
\begin{equation} \#\mathcal{T}_{h_k}-\#\mathcal{T}_{h_0}\leq C|(u, y, p, y_{d}, f)|_{s}^{\frac{1}{s}}\Big(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big)^{-\frac{1}{2s}}, \tag{5.18} \end{equation}
provided h_{0}\ll 1 .
Proof. It follows from Theorem 2.1 and Lemma 5.3 that
\begin{equation} e_{h_k}^2+\gamma_{1}\eta_{1, \mathcal{T}_{h_k}}^2(p_{h_k}, \mathcal{T}_{h_k})+\gamma_{2}\tilde{\eta}_{\mathcal{T}_{h_k}}(\mathcal{T}_{h_k})\approx e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k}). \tag{5.19} \end{equation}
From Lemma 2.3 in [4], we have
\begin{equation} \#\mathcal{T}_{h_k}-\#\mathcal{T}_{h_0}\leq C\sum\limits_{i = 0}^{k-1}\#\mathcal{M}_{h_i}. \tag{5.20} \end{equation}
With the assistance of Lemma 5.3 and (5.20), we obtain
\begin{equation} \#\mathcal{T}_{h_k}-\#\mathcal{T}_{h_0}\leq C\sum\limits_{i = 0}^{k-1}\#\mathcal{M}_{h_i}\leq C M\sum\limits_{i = 0}^{k-1}\Big(e_{h_i}^2+osc_{\mathcal{T}_{h_i}}^2(\mathcal{T}_{h_i})\Big)^{-\frac{1}{2s}}, \tag{5.21} \end{equation}
where
M = N^{\frac{1}{2s}}|(u, y, p, y_{d}, f)|_{s}^{\frac{1}{s}}\alpha^{-\frac{1}{2s}}.
Then it follows from (5.19), (5.21) and Theorem 4.1 that
\begin{align} \#\mathcal{T}_{h_k}-\#\mathcal{T}_{h_0}&\leq C M\Big(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big)^{-\frac{1}{2s}}\sum\limits_{i = 1}^{k}\alpha^{\frac{i}{2s}}\\ &\leq C|(u, y, p, y_{d}, f)|_{s}^{\frac{1}{s}}\Big(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big)^{-\frac{1}{2s}}, \end{align}
which completes the proof of Theorem 5.1.
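As a remark (ours, not part of the original theorem), the complexity bound (5.18) can be read as a rate statement by solving for the total error:

```latex
% Rearranging (5.18): raise both sides to the power s and solve for the
% total error E_k := (e_{h_k}^2 + osc_{T_{h_k}}^2(T_{h_k}))^{1/2}, giving
\begin{equation*}
  \Big(e_{h_k}^2+osc_{\mathcal{T}_{h_k}}^2(\mathcal{T}_{h_k})\Big)^{\frac{1}{2}}
  \leq C^{s}\,|(u, y, p, y_{d}, f)|_{s}\,
  \big(\#\mathcal{T}_{h_k}-\#\mathcal{T}_{h_0}\big)^{-s}.
\end{equation*}
```

In other words, the adaptive algorithm realizes the best approximation rate s of the class \mathcal{A}_{s} up to a constant factor.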
In this section, we present two numerical examples and run the adaptive finite element iteration of Algorithm 3.1 on them, in order to provide empirical support for our theory.
Example 1. We consider the nonlinear optimal control problem governed by the nonlinear elliptic state and adjoint equations
\begin{align*} -\Delta y+y^3 = f+u, \quad-\Delta p+3y^2p = y-y_d, \end{align*}
where we choose \alpha = 1 and \Omega = [0, 1]\times[0, 1] , with the exact solution
\begin{align*} u = &\frac{1}{\alpha}(\max(0, \bar{p})-p), \\ y = &\sin(\pi x_1)+\sin(\pi x_2), \\ p = &-y. \end{align*}
By a simple calculation we have \int_\Omega p\,dx = -\frac{4}{\pi} , so that u\in U_{ad} is satisfied.
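This identity is easy to check numerically. The following sketch (the quadrature routine and grid size are our own choices, not from the paper) approximates \int_\Omega p\,dx with a composite midpoint rule:

```python
import math

def p(x1, x2):
    # Adjoint state of Example 1: p = -y = -(sin(pi x1) + sin(pi x2))
    return -(math.sin(math.pi * x1) + math.sin(math.pi * x2))

def integrate_unit_square(f, n=400):
    # Composite midpoint rule on an n x n grid over [0, 1] x [0, 1]
    h = 1.0 / n
    return h * h * sum(
        f((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n)
    )

val = integrate_unit_square(p)
print(val, -4.0 / math.pi)  # the two values agree closely
```

Since \int_0^1 \sin(\pi x)\,dx = 2/\pi in each variable, the quadrature value converges to -4/\pi \approx -1.2732 as the grid is refined.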
We run 15 adaptive loops for Example 1 and plot the profiles of the exact state and the numerical state on adaptively refined grids with \theta = 0.5 in Figure 1. The solution is smooth, but it has larger gradients in some regions; hence, compared with uniform refinement, the adaptive finite element method provides a smaller error. In Figure 2, we show the triangular grids after 6 and 12 adaptive iterations of Algorithm 3.1.
In Figure 3, we plot the convergence history of the errors, with adaptive refinement (\theta = 0.5) on the left and uniform refinement (\theta = 1) on the right. Against the reference line of slope -1 , the optimal convergence rate for linear finite elements, the expected error reduction can be observed.
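The slope reported in such convergence plots is typically estimated by a least-squares fit in log-log coordinates. Here is a small self-contained sketch of that computation (the function name and the synthetic convergence history are ours, for illustration only):

```python
import math

def loglog_slope(ndofs, errors):
    # Least-squares slope of log(error) against log(#dofs);
    # a slope near -1 matches the optimal rate for linear elements in 2D.
    xs = [math.log(n) for n in ndofs]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    xbar, ybar = sum(xs) / k, sum(ys) / k
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den

# Synthetic convergence history with error ~ C / ndofs
ndofs = [100, 400, 1600, 6400]
errors = [0.5 / n for n in ndofs]
slope = loglog_slope(ndofs, errors)
print(slope)  # approximately -1.0
```

On the data from Figure 3, the same fit would report the empirical convergence order of the adaptive iteration.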
Example 2. We consider the same nonlinear optimal control problem as Example 1 with \alpha = 0.1 , \Omega = (-1, 1)\times(0, 1)\cup(-1, 0)\times(-1, 0] , and the exact solution
\begin{align*} &p = \begin{cases} -5\times10^{10}e^\frac{1}{m}, \quad&m < 0, \\0, &m\geq 0, \end{cases}\\ &y = -p, \\ &m = (x_1-0.2)^2+(x_2-0.6)^2-0.04, \end{align*}
where u\in U_{ad} can be guaranteed.
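For intuition, the adjoint state of Example 2 is a smooth bump supported on the disk m < 0 , which is easy to evaluate directly (the helper names below are ours):

```python
import math

def m(x1, x2):
    return (x1 - 0.2) ** 2 + (x2 - 0.6) ** 2 - 0.04

def p(x1, x2):
    # p vanishes for m >= 0 and equals -5e10 * exp(1/m) for m < 0;
    # since exp(1/m) -> 0 as m -> 0^-, p is smooth but develops steep
    # gradients near the circle m = 0.
    mm = m(x1, x2)
    return -5e10 * math.exp(1.0 / mm) if mm < 0 else 0.0

print(p(0.0, 0.0))  # 0.0 (outside the disk, m >= 0)
print(p(0.2, 0.6))  # about -0.69 (center of the disk, m = -0.04)
```

The steep gradients near the circle m = 0 are exactly where the adaptive grids of Figure 6 concentrate.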
In analogy with Example 1, we provide some plots for Example 2 with 21 adaptive loops and \theta = 0.5 . Figure 4 shows that the solution is smooth but has large gradients in some regions, which again illustrates that adaptive refinement attains smaller errors than uniform refinement; the error estimate graphs make this explicit. In Figure 5, the left plot shows the error estimates under adaptive refinement (\theta = 0.5) and the right plot shows those under uniform refinement (\theta = 1) . With slope -1 being the expected optimal convergence rate, the error reduction is visible in Figure 5. Moreover, the convergence curves of the total error and of the error estimate indicators approach straight lines of slope -1 and are roughly parallel, which shows that the a posteriori error estimates obtained in Section 2 are reliable.
In Figure 6, we show the adaptive grids after 9 and 19 of the 21 adaptive iterations for Example 2 with \theta = 0.5 . The grids are concentrated on the regions where the solution has larger gradients. Note that reduced convergence orders are observed for uniform refinement because of the singularity of the solution.
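Algorithm 3.1 itself is not reproduced in this excerpt; the grid concentration seen above is the effect of the marking step driven by the bulk parameter \theta . As an illustration, here is a minimal sketch of Dörfler (bulk) marking, the standard strategy in AFEM convergence analyses (the function and sample data are ours, not the paper's exact procedure):

```python
def dorfler_mark(indicators, theta):
    # Greedy Dörfler marking: return indices of a minimal set M with
    #   sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2,
    # obtained by taking elements in decreasing order of indicator.
    order = sorted(range(len(indicators)), key=lambda i: -indicators[i])
    total = sum(indicators)
    marked, acc = [], 0.0
    for i in order:
        if acc >= theta * total:
            break
        marked.append(i)
        acc += indicators[i]
    return marked

eta2 = [0.5, 0.3, 0.1, 0.05, 0.05]  # squared error indicators per element
print(dorfler_mark(eta2, 0.5))  # -> [0]
print(dorfler_mark(eta2, 0.7))  # -> [0, 1]
```

Larger \theta marks more elements per loop (closer to uniform refinement, \theta = 1 ), while smaller \theta refines only where the indicators, and hence the gradients, are largest.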
In this paper, we study the adaptive finite element method for nonlinear optimal control problems and give the corresponding adaptive algorithm. To evaluate the method, we obtain a posteriori error estimates for the nonlinear elliptic equations, together with upper and lower bounds, convergence, and optimality, which are also important indicators for evaluating the algorithm. We prove that the sum of the a posteriori errors of the control, state, and co-state variables is convergent, as shown in Theorem 4.2. Based on the local upper bound, we prove the quasi-optimality of the proposed adaptive algorithm; see Section 5. To verify our theoretical analysis, we finally provide some numerical simulations. Previous research studied finite element methods for linear optimal control problems; our contribution is to extend these techniques to a class of nonlinear optimal control problems.
Several problems remain open, such as the L^2-L^2 a posteriori error estimates for nonlinear elliptic equations, as well as the convergence and quasi-optimality for nonlinear parabolic equations. Furthermore, we note that the analysis in this paper can be generalized to common nonlinear parabolic problems and boundary problems, and we will work on these problems in the future.
This work is supported by National Science Foundation of China (11201510), National Social Science Fund of China (19BGL190), Chongqing Research Program of Basic Research and Frontier Technology (cstc2019jcyj-msxmX0280), General project of Chongqing Natural Science Foundation (cstc2021jcyj-msxmX0949), Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJZD-K20200120), Chongqing Key Laboratory of Water Environment Evolution and Pollution Control in Three Gorges Reservoir Area (WEPKL2018YB-04), Science and technology research project of Chongqing Education Commission (KJQN202101212), Research Center for Sustainable Development of Three Gorges Reservoir Area (2019sxxyjd07), Guangdong Basic and Applied Basic Research Foundation of Joint Fund Project (2021A1515111048), and Guangdong Province Characteristic Innovation Project (2021WTSCX120).
The authors declare that they have no competing interests.
[1] M. Hilbert, Big data for development: A review of promises and challenges, Dev. Policy Rev., 34 (2016), 135–174. https://doi.org/10.1111/dpr.12142
[2] C. Wang, J. Wu, J. Yan, Statistical methods and computing for big data, Stat. Interface, 9 (2016), 399. https://doi.org/10.4310/SII.2016.v9.n4.a1
[3] H. Wang, Y. Ma, Optimal subsampling for quantile regression in big data, Biometrika, 108 (2021), 99–112. https://doi.org/10.1093/biomet/asaa043
[4] H. Wang, R. Zhu, P. Ma, Optimal subsampling for large sample logistic regression, J. Am. Stat. Assoc., 117 (2022), 265–276. https://doi.org/10.1080/01621459.2020.1773832
[5] X. Chen, W. Liu, X. Mao, Z. Yang, Distributed high-dimensional regression under a quantile loss function, J. Mach. Learn. Res., 21 (2020), 7432–7474.
[6] A. Hu, Y. Jiao, Y. Liu, Y. Shi, Y. Wu, Distributed quantile regression for massive heterogeneous data, Neurocomputing, 448 (2021), 249–262. https://doi.org/10.1016/j.neucom.2021.03.041
[7] R. Jiang, K. Yu, Smoothing quantile regression for a distributed system, Neurocomputing, 466 (2021), 311–326. https://doi.org/10.1016/j.neucom.2021.08.101
[8] M. I. Jordan, J. D. Lee, Y. Yang, Communication-efficient distributed statistical inference, J. Am. Stat. Assoc., 114 (2019), 668–681. https://doi.org/10.1080/01621459.2018.1429274
[9] N. Lin, R. Xi, Aggregated estimating equation estimation, Stat. Interface, 4 (2011), 73–83. https://doi.org/10.4310/SII.2011.v4.n1.a8
[10] L. Luo, P. Song, Renewable estimation and incremental inference in generalized linear models with streaming data sets, J. R. Stat. Soc. B, 82 (2020), 69–97. https://doi.org/10.1111/rssb.12352
[11] C. Shi, R. Song, W. Lu, R. Li, Statistical inference for high-dimensional models via recursive online-score estimation, J. Am. Stat. Assoc., 116 (2021), 1307–1318. https://doi.org/10.1080/01621459.2019.1710154
[12] E. D. Schifano, J. Wu, C. Wang, J. Yan, M. Chen, Online updating of statistical inference in the big data setting, Technometrics, 58 (2016), 393–403. https://doi.org/10.1080/00401706.2016.1142900
[13] S. Mohamad, A. Bouchachia, Deep online hierarchical dynamic unsupervised learning for pattern mining from utility usage data, Neurocomputing, 390 (2020), 359–373. https://doi.org/10.1016/j.neucom.2019.08.093
[14] H. M. Gomes, J. Read, A. Bifet, J. Paul, J. Gama, Machine learning for streaming data: State of the art, challenges, and opportunities, ACM SIGKDD Explor. Newsl., 21 (2019), 6–22. https://doi.org/10.1145/3373464.3373470
[15] L. Lin, W. Li, J. Lu, Unified rules of renewable weighted sums for various online updating estimations, arXiv preprint, 2020. https://doi.org/10.48550/arXiv.2008.08824
[16] C. Wang, M. Chen, J. Wu, J. Yan, Y. Zhang, E. Schifano, Online updating method with new variables for big data streams, Can. J. Stat., 46 (2018), 123–146. https://doi.org/10.1002/cjs.11330
[17] J. Wu, M. Chen, Online updating of survival analysis, J. Comput. Graph. Stat., 30 (2021), 1209–1223. https://doi.org/10.1080/10618600.2020.1870481
[18] Y. Xue, H. Wang, J. Yan, E. D. Schifano, An online updating approach for testing the proportional hazards assumption with streams of survival data, Biometrics, 76 (2020), 171–182. https://doi.org/10.1111/biom.13137
[19] S. Balakrishnan, D. Madigan, A one-pass sequential Monte Carlo method for Bayesian analysis of massive datasets, Bayesian Anal., 1 (2006), 345–361. https://doi.org/10.1214/06-BA112
[20] L. N. Geppert, K. Ickstadt, A. Munteanu, J. Quedenfeld, C. Sohler, Random projections for Bayesian regression, Stat. Comput., 27 (2017), 79–101. https://doi.org/10.1007/s11222-015-9608-z
[21] R. Koenker, G. Bassett, Regression quantiles, Econometrica, 46 (1978), 33–50. https://doi.org/10.2307/1913643
[22] Y. Wei, A. Pere, R. Koenker, X. He, Quantile regression methods for reference growth charts, Stat. Med., 25 (2006), 1369–1382. https://doi.org/10.1002/sim.2271
[23] H. Wang, Z. Zhu, J. Zhou, Quantile regression in partially linear varying coefficient models, Ann. Stat., 37 (2009), 3841–3866. https://doi.org/10.1214/09-AOS695
[24] X. He, B. Fu, W. K. Fung, Median regression for longitudinal data, Stat. Med., 22 (2003), 3655–3669. https://doi.org/10.1002/sim.1581
[25] M. Buchinsky, Changes in the US wage structure 1963–1987: Application of quantile regression, Econometrica, 62 (1994), 405–458. https://doi.org/10.2307/2951618
[26] A. J. Cannon, Quantile regression neural networks: Implementation in R and application to precipitation downscaling, Comput. Geosci., 37 (2011), 1277–1284. https://doi.org/10.1016/j.cageo.2010.07.005
[27] Q. Xu, K. Deng, C. Jiang, F. Sun, X. Huang, Composite quantile regression neural network with applications, Expert Syst. Appl., 76 (2017), 129–139. https://doi.org/10.1016/j.eswa.2017.01.054
[28] X. Chen, W. Liu, Y. Zhang, Quantile regression under memory constraint, Ann. Stat., 47 (2019), 3244–3273. https://doi.org/10.1214/18-AOS1777
[29] L. Chen, Y. Zhou, Quantile regression in big data: A divide and conquer based strategy, Comput. Stat. Data Anal., 144 (2020), 106892. https://doi.org/10.1016/j.csda.2019.106892
[30] K. Wang, H. Wang, S. Li, Renewable quantile regression for streaming datasets, Knowl.-Based Syst., 235 (2022), 107675. https://doi.org/10.1016/j.knosys.2021.107675
[31] Y. Chu, Z. Yin, K. Yu, Bayesian scale mixtures of normals linear regression and Bayesian quantile regression with big data and variable selection, J. Comput. Appl. Math., 428 (2023), 115192. https://doi.org/10.1016/j.cam.2023.115192
[32] K. Lum, A. E. Gelfand, Spatial quantile multiple regression using the asymmetric Laplace process, Bayesian Anal., 7 (2012), 235–258. https://doi.org/10.1214/12-BA708
[33] M. Smith, R. Kohn, Nonparametric regression using Bayesian variable selection, J. Econometrics, 75 (1996), 317–343. https://doi.org/10.1016/0304-4076(95)01763-1
[34] M. Dao, M. Wang, S. Ghosh, K. Ye, Bayesian variable selection and estimation in quantile regression using a quantile-specific prior, Comput. Stat., 37 (2022), 1339–1368. https://doi.org/10.1007/s00180-021-01181-5
[35] K. E. Lee, N. Sha, E. R. Dougherty, M. Vannucci, B. K. Mallick, Gene selection: A Bayesian variable selection approach, Bioinformatics, 19 (2003), 90–97. https://doi.org/10.1093/bioinformatics/19.1.90
[36] R. Chen, C. Chu, T. Lai, Y. Wu, Stochastic matching pursuit for Bayesian variable selection, Stat. Comput., 21 (2011), 247–259. https://doi.org/10.1007/s11222-009-9165-4
[37] R. Jiang, K. Yu, Renewable quantile regression for streaming data sets, Neurocomputing, 508 (2022), 208–224.
[38] X. Li, The influencing factors on PM_{2.5} concentration of Lanzhou based on quantile regression, HGU J., 41 (2018), 61–68. https://doi.org/10.13937/j.cnki.hbdzdxxb.2018.06.009
[39] X. Zhang, W. Zhang, Spatial and temporal variation of PM_{2.5} in Beijing city after rain, Ecol. Environ. Sci., 23 (2014), 797–805. https://doi.org/10.3969/j.issn.1674-5906.2014.05.011
[40] R. Tibshirani, Regression shrinkage and selection via the Lasso, J. R. Stat. Soc. B, 58 (1996), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
[41] J. Fan, R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Stat. Assoc., 96 (2001), 1348–1360. https://doi.org/10.1198/016214501753382273
[42] F. Emmert-Streib, M. Dehmer, High-dimensional LASSO-based computational regression models: Regularization, shrinkage, and selection, Mach. Learn. Knowl. Extr., 1 (2019), 359–383. https://doi.org/10.3390/make1010021
[43] X. Ma, L. Lin, Y. Gai, A general framework of online updating variable selection for generalized linear models with streaming datasets, J. Stat. Comput. Sim., 93 (2023), 325–340. https://doi.org/10.1080/00949655.2022.2107207
[44] A. Liu, J. Lu, F. Liu, G. Zhang, Accumulating regional density dissimilarity for concept drift detection in data streams, Pattern Recogn., 76 (2018), 256–272. https://doi.org/10.1016/j.patcog.2017.11.009
[45] J. Wang, J. Shen, P. Li, Provable variable selection for streaming features, International Conference on Machine Learning, 80 (2018), 5171–5179. Available from: https://proceedings.mlr.press/v80/wang18g.html.
[46] J. Lu, A. Liu, F. Dong, F. Gu, J. Gama, G. Zhang, Learning under concept drift: A review, IEEE T. Knowl. Data En., 31 (2018), 2346–2363. https://doi.org/10.1109/TKDE.2018.2876857
[47] R. Elwell, R. Polikar, Incremental learning of concept drift in nonstationary environments, IEEE T. Neural Networ., 22 (2011), 1517–1531. https://doi.org/10.1109/TNN.2011.2160459
[48] D. Rezende, S. Mohamed, Variational inference with normalizing flows, International Conference on Machine Learning, 37 (2015), 1530–1538. Available from: https://proceedings.mlr.press/v37/rezende15.
[49] P. Müller, F. A. Quintana, A. Jara, T. Hanson, Bayesian nonparametric data analysis, New York: Springer, 2015. https://doi.org/10.1007/978-3-319-18968-0
[50] R. Koenker, J. A. Machado, Goodness of fit and related inference processes for quantile regression, J. Am. Stat. Assoc., 94 (1999), 1296–1310. https://doi.org/10.1080/01621459.1999.10473882
[51] K. Yu, R. A. Moyeed, Bayesian quantile regression, Stat. Probab. Lett., 54 (2001), 437–447. https://doi.org/10.1016/S0167-7152(01)00124-9
[52] M. Geraci, Linear quantile mixed models: The lqmm package for Laplace quantile regression, J. Stat. Softw., 57 (2014), 1–29. https://doi.org/10.18637/jss.v057.i13
[53] M. Geraci, M. Bottai, Quantile regression for longitudinal data using the asymmetric Laplace distribution, Biostatistics, 8 (2007), 140–154. https://doi.org/10.1093/biostatistics/kxj039
[54] D. F. Benoit, D. V. den Poel, bayesQR: A Bayesian approach to quantile regression, J. Stat. Softw., 76 (2017), 1–32. https://doi.org/10.18637/jss.v076.i07