First, we construct a new type of feedforward neural network operator on finite intervals and give pointwise and global estimates of the approximation by the new operators. The new operators can approximate continuous functions at a rate that cannot be obtained by polynomial approximation. Second, we construct a new type of feedforward neural network operator on infinite intervals and estimate the rate of approximation by the new operators. Finally, we investigate the weighted approximation properties of the new operators on infinite intervals and show that our new neural networks are dense in a very wide class of function spaces. Thus, we demonstrate that approximation by feedforward neural networks has better properties than approximation by polynomials on infinite intervals.
Citation: Bing Jiang. Rate of approximation by some neural network operators[J]. AIMS Mathematics, 2024, 9(11): 31679-31695. doi: 10.3934/math.20241523
We derive the integrals indicated in the abstract in terms of special functions. Some special cases of these integrals have been reported in Gradshteyn and Ryzhik [5]. In 1867, David Bierens de Haan derived hyperbolic integrals of the form
$$\int_{0}^{\infty}\left((\log(a)-ix)^{k}+(\log(a)+ix)^{k}\right)\log(\cos(\alpha)\operatorname{sech}(x)+1)\,dx \tag{1.1}$$
In our case the constants in the formulas are general complex numbers subject to the restrictions given below. The derivations follow the method we used in [6]. The generalized Cauchy integral formula is given by
$$\frac{y^{k}}{k!}=\frac{1}{2\pi i}\int_{C}\frac{e^{wy}}{w^{k+1}}\,dw. \tag{1.2}$$
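As a quick numerical sanity check of Eq (1.2), the following Python/mpmath sketch (an addition for illustration; the paper's own verification used Mathematica) evaluates the contour integral for an integer k, where a simple closed circle around the origin can stand in for the Hankel-type contour described in the next paragraph; the values of k and y are illustrative, not taken from the text.

```python
# Minimal numerical check of Eq (1.2) for integer k (assumption: a closed
# circle |w| = r can replace the Hankel-type contour in that case).
from mpmath import mp, quad, exp, pi, mpc, factorial

mp.dps = 25
k, y = 3, mpc(0.4, 0.7)            # illustrative values, not from the paper
r = mp.mpf(1)                      # radius of the circular contour

def integrand(t):
    w = r * exp(1j * t)            # w runs over the circle |w| = r
    dw = 1j * r * exp(1j * t)      # dw/dt
    return exp(w * y) / w**(k + 1) * dw

print(quad(integrand, [0, 2 * pi]) / (2j * pi))   # contour integral
print(y**k / factorial(k))                        # should match y^k / k!
```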
We use the method in [6]. Here the contour lies in the upper left quadrant with ℜ(w)<0 and goes around the origin with zero radius. Using a generalization of the Cauchy integral formula, we first replace y by ix+log(a) to get the first equation, and then replace y by −ix+log(a) to get the second equation. We then add these two equations and multiply both sides by (1/2)log(cos(α)sech(x)+1) to get
$$\frac{\left((\log(a)-ix)^{k}+(\log(a)+ix)^{k}\right)\log(\cos(\alpha)\operatorname{sech}(x)+1)}{2\,k!}=\frac{1}{2\pi i}\int_{C}a^{w}w^{-k-1}\cos(wx)\log(\cos(\alpha)\operatorname{sech}(x)+1)\,dw \tag{2.1}$$
where the logarithmic function is defined in Eq (4.1.2) in [2]. We then take the definite integral over x∈[0,∞) of both sides to get
$$\begin{aligned}
\int_{0}^{\infty}\frac{\left((\log(a)-ix)^{k}+(\log(a)+ix)^{k}\right)\log(\cos(\alpha)\operatorname{sech}(x)+1)}{2\,k!}\,dx
&=\frac{1}{2\pi i}\int_{0}^{\infty}\int_{C}a^{w}w^{-k-1}\cos(wx)\log(\cos(\alpha)\operatorname{sech}(x)+1)\,dw\,dx\\
&=\frac{1}{2\pi i}\int_{C}\int_{0}^{\infty}a^{w}w^{-k-1}\cos(wx)\log(\cos(\alpha)\operatorname{sech}(x)+1)\,dx\,dw\\
&=\frac{1}{2\pi i}\int_{C}\pi a^{w}w^{-k-2}\cosh\left(\frac{\pi w}{2}\right)\operatorname{csch}(\pi w)\,dw-\frac{1}{2\pi i}\int_{C}\pi a^{w}w^{-k-2}\operatorname{csch}(\pi w)\cosh(\alpha w)\,dw
\end{aligned} \tag{2.2}$$
from Eq (1.7.7.120) in [1]; the integral is valid for α, a, and k complex with |ℜ(α)|<π.
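The passage from (2.1) to (2.2) rests on the tabulated x-integral quoted from [1]; reading its closed form off the right-hand side of (2.2), a brief Python/mpmath spot check at illustrative real parameter values (not taken from the text) is:

```python
# Spot check of the x-integral behind (2.2):
#   int_0^inf cos(w x) log(cos(alpha) sech(x) + 1) dx
#     = (pi/w) csch(pi w) (cosh(pi w / 2) - cosh(alpha w))
from mpmath import mp, quad, cos, cosh, sech, csch, log, pi, inf

mp.dps = 25
alpha, w = mp.mpf('0.8'), mp.mpf('0.6')   # sample real values

lhs = quad(lambda x: cos(w * x) * log(cos(alpha) * sech(x) + 1), [0, inf])
rhs = pi / w * csch(pi * w) * (cosh(pi * w / 2) - cosh(alpha * w))
print(lhs, rhs)   # the two values should agree to working precision
```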
In this section we will again use the generalized Cauchy integral formula to derive equivalent contour integrals. First we replace y by y−π/2 to get the first equation and y by y+π/2 to get the second, then add these two equations to get
$$\frac{\left(y-\frac{\pi}{2}\right)^{k}+\left(y+\frac{\pi}{2}\right)^{k}}{k!}=\frac{1}{2\pi i}\int_{C}2w^{-k-1}e^{wy}\cosh\left(\frac{\pi w}{2}\right)\,dw \tag{3.1}$$
Next we replace y by log(a)+π(2p+1), then take the infinite sum over p∈[0,∞) to get
$$\begin{aligned}
\sum_{p=0}^{\infty}\frac{2\pi\left(\left(\log(a)+\pi(2p+1)-\frac{\pi}{2}\right)^{k}+\left(\log(a)+\pi(2p+1)+\frac{\pi}{2}\right)^{k}\right)}{k!}
&=\frac{1}{2\pi i}\sum_{p=0}^{\infty}\int_{C}4\pi w^{-k-1}\cosh\left(\frac{\pi w}{2}\right)e^{w(\log(a)+\pi(2p+1))}\,dw\\
&=\frac{1}{2\pi i}\int_{C}\sum_{p=0}^{\infty}4\pi w^{-k-1}\cosh\left(\frac{\pi w}{2}\right)e^{w(\log(a)+\pi(2p+1))}\,dw
\end{aligned} \tag{3.2}$$
where ℜ(w)<0 according to (1.232.3) in [5]; for such w the geometric series sums to ∑_{p=0}^{∞} e^{w(log(a)+π(2p+1))} = −(1/2)a^w csch(πw). We then simplify the left-hand side in terms of the Hurwitz zeta function and, after dividing both sides by −2 and replacing k with k+1, we get
$$-\frac{2^{k+1}\pi^{k+2}}{(k+1)!}\left(\zeta\left(-k-1,\frac{2\log(a)+\pi}{4\pi}\right)+\zeta\left(-k-1,\frac{2\log(a)+3\pi}{4\pi}\right)\right)=\frac{1}{2\pi i}\int_{C}\pi a^{w}w^{-k-2}\cosh\left(\frac{\pi w}{2}\right)\operatorname{csch}(\pi w)\,dw \tag{3.3}$$
Then, following the procedure of (3.1) and (3.2), we replace y by y+α and by y−α and add the two equations to get the second contour integral, given by
$$\frac{(y-\alpha)^{k}+(\alpha+y)^{k}}{k!}=\frac{1}{2\pi i}\int_{C}2w^{-k-1}e^{wy}\cosh(\alpha w)\,dw \tag{3.4}$$
Next we replace y by log(a)+π(2p+1) and take the infinite sum over p∈[0,∞) to get
$$\begin{aligned}
\sum_{p=0}^{\infty}\frac{(-\alpha+\log(a)+\pi(2p+1))^{k}+(\alpha+\log(a)+\pi(2p+1))^{k}}{k!}
&=\frac{1}{2\pi i}\sum_{p=0}^{\infty}\int_{C}2w^{-k-1}\cosh(\alpha w)e^{w(\log(a)+\pi(2p+1))}\,dw\\
&=\frac{1}{2\pi i}\int_{C}\sum_{p=0}^{\infty}2w^{-k-1}\cosh(\alpha w)e^{w(\log(a)+\pi(2p+1))}\,dw
\end{aligned} \tag{3.5}$$
Then, using the same geometric series, we simplify the left-hand side in terms of the Hurwitz zeta function and, after multiplying both sides by π and replacing k with k+1, we get
$$\frac{2^{k+1}\pi^{k+2}}{(k+1)!}\left(\zeta\left(-k-1,\frac{-\alpha+\log(a)+\pi}{2\pi}\right)+\zeta\left(-k-1,\frac{\alpha+\log(a)+\pi}{2\pi}\right)\right)=\frac{1}{2\pi i}\int_{C}\pi\left(-a^{w}\right)w^{-k-2}\operatorname{csch}(\pi w)\cosh(\alpha w)\,dw \tag{3.6}$$
Since the right-hand side of Eq (2.2) is the sum of the right-hand sides of Eqs (3.3) and (3.6), we can equate the left-hand sides and multiply through by 2k! to get
$$\begin{aligned}
\int_{0}^{\infty}\left((\log(a)-ix)^{k}+(\log(a)+ix)^{k}\right)\log(\cos(\alpha)\operatorname{sech}(x)+1)\,dx
&=2\left(\frac{2^{k+1}\pi^{k+2}\left(\zeta\left(-k-1,\frac{-\alpha+\log(a)+\pi}{2\pi}\right)+\zeta\left(-k-1,\frac{\alpha+\log(a)+\pi}{2\pi}\right)\right)}{k+1}\right)\\
&\quad-2\left(\frac{2^{k+1}\pi^{k+2}\left(\zeta\left(-k-1,\frac{2\log(a)+\pi}{4\pi}\right)+\zeta\left(-k-1,\frac{2\log(a)+3\pi}{4\pi}\right)\right)}{k+1}\right)
\end{aligned} \tag{4.1}$$
from (9.521) in [5], where ζ(z,q) is the Hurwitz zeta function. Note that the left-hand side of Eq (4.1) converges for all finite k. The integral in Eq (4.1) can be used as an alternative method for evaluating the Hurwitz zeta function. The Hurwitz zeta function has a series representation given by
$$\zeta(z,q)=\sum_{n=0}^{\infty}\frac{1}{(q+n)^{z}} \tag{4.2}$$
where ℜ(z)>1 and q≠0,−1,−2,…, and is continued analytically by (9.541.1) in [5]; z=1 is the only singular point.
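The following Python/mpmath sketch spot-checks Eq (4.1) by comparing direct quadrature with the Hurwitz zeta expression at sample real parameters (here a=2, α=0.7, k=2; the choice is illustrative and not taken from the text).

```python
# Spot check of the main result, Eq (4.1).
from mpmath import mp, quad, zeta, log, cos, sech, pi, inf

mp.dps = 25
a, alpha, k = mp.mpf(2), mp.mpf('0.7'), 2
la = log(a)

lhs = quad(lambda x: ((la - 1j*x)**k + (la + 1j*x)**k)
                     * log(cos(alpha) * sech(x) + 1), [0, inf])

coef = 2 * 2**(k + 1) * pi**(k + 2) / (k + 1)
rhs = coef * (zeta(-k - 1, (-alpha + la + pi) / (2*pi))
              + zeta(-k - 1, (alpha + la + pi) / (2*pi))) \
    - coef * (zeta(-k - 1, (2*la + pi) / (4*pi))
              + zeta(-k - 1, (2*la + 3*pi) / (4*pi)))
print(lhs, rhs)   # real part of lhs should match rhs
```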
In this section we evaluate further integrals and extend the range of the parameters over which they are valid. The aim of this section is to derive a few integrals in [5] in terms of the Lerch function, of which the Hurwitz zeta function is a special case. We also present errata for one of the integrals and give faster-converging closed forms.
Using Eq (4.1), taking the first partial derivative with respect to α, setting a=1, and simplifying the left-hand side, we get
$$\int_{0}^{\infty}\frac{x^{k}}{\cos(\alpha)+\cosh(x)}\,dx=2^{k}\pi^{k+1}\csc(\alpha)\sec\left(\frac{\pi k}{2}\right)\left(\zeta\left(-k,\frac{\pi-\alpha}{2\pi}\right)-\zeta\left(-k,\frac{\alpha+\pi}{2\pi}\right)\right) \tag{5.1}$$
from Eq (7.102) in [3].
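A Python/mpmath spot check of Eq (5.1), again with illustrative values (a non-integer k=3/2 and α=0.9):

```python
# Spot check of Eq (5.1).
from mpmath import mp, quad, zeta, cos, cosh, csc, sec, pi, inf

mp.dps = 25
alpha, k = mp.mpf('0.9'), mp.mpf('1.5')

lhs = quad(lambda x: x**k / (cos(alpha) + cosh(x)), [0, inf])
rhs = 2**k * pi**(k + 1) * csc(alpha) * sec(pi*k/2) * (
        zeta(-k, (pi - alpha) / (2*pi)) - zeta(-k, (alpha + pi) / (2*pi)))
print(lhs, rhs)
```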
Using Eq (5.1), taking the first partial derivative with respect to k, setting k=0, and simplifying the left-hand side, we get
$$\begin{aligned}
\int_{0}^{\infty}\frac{\log(x)}{\cos(\alpha)+\cosh(x)}\,dx
&=\csc(\alpha)\left(\alpha\log(2\pi)-\pi\log(-\alpha-\pi)+\pi\log(\alpha-\pi)-\pi\log\Gamma\left(-\frac{\alpha+\pi}{2\pi}\right)+\pi\log\Gamma\left(-\frac{\pi-\alpha}{2\pi}\right)\right)\\
&=\csc(\alpha)\left(\alpha\log(2\pi)+\pi\log\left(\frac{\Gamma\left(\frac{\alpha+\pi}{2\pi}\right)}{\Gamma\left(\frac{\pi-\alpha}{2\pi}\right)}\right)\right)
\end{aligned} \tag{5.2}$$
from (7.105) in [3].
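A similar Python/mpmath spot check of the simplified form of Eq (5.2) at the illustrative value α=0.9:

```python
# Spot check of Eq (5.2) (simplified form).
from mpmath import mp, quad, gamma, log, cos, cosh, csc, pi, inf

mp.dps = 25
alpha = mp.mpf('0.9')

lhs = quad(lambda x: log(x) / (cos(alpha) + cosh(x)), [0, inf])
rhs = csc(alpha) * (alpha * log(2*pi)
                    + pi * log(gamma((alpha + pi) / (2*pi))
                               / gamma((pi - alpha) / (2*pi))))
print(lhs, rhs)
```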
Using Eq (5.2), setting α=π/2, and simplifying, we get
$$\int_{0}^{\infty}\log(x)\operatorname{sech}(x)\,dx=\pi\log\left(\frac{\sqrt{2\pi}\,\Gamma\left(\frac{3}{4}\right)}{\Gamma\left(\frac{1}{4}\right)}\right) \tag{5.3}$$
Using Eq (5.2), taking the first derivative with respect to α, setting α=π/2, and simplifying, we get
$$\int_{0}^{\infty}\log(x)\operatorname{sech}^{2}(x)\,dx=\log\left(\frac{\pi}{4}\right)-\gamma \tag{5.4}$$
where γ is Euler's constant.
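Both special cases admit the same kind of quick numerical confirmation (a minimal Python/mpmath sketch):

```python
# Spot checks of Eqs (5.3) and (5.4).
from mpmath import mp, quad, gamma, log, sech, sqrt, pi, euler, inf

mp.dps = 25
i53 = quad(lambda x: log(x) * sech(x), [0, inf])
print(i53, pi * log(sqrt(2*pi) * gamma(mp.mpf(3)/4) / gamma(mp.mpf(1)/4)))

i54 = quad(lambda x: log(x) * sech(x)**2, [0, inf])
print(i54, log(pi/4) - euler)
```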
Using Eq (5.1), taking the first partial derivative with respect to k, then setting k=−1/2 and α=π/2 and simplifying, we get
$$\int_{0}^{\infty}\frac{\log(x)\operatorname{sech}(x)}{\sqrt{x}}\,dx=\frac{\sqrt{\pi}}{2}\left(-2\zeta'\left(\frac{1}{2},\frac{1}{4}\right)+2\zeta'\left(\frac{1}{2},\frac{3}{4}\right)+\left(\zeta\left(\frac{1}{2},\frac{3}{4}\right)-\zeta\left(\frac{1}{2},\frac{1}{4}\right)\right)\left(\pi+\log\left(\frac{1}{4\pi^{2}}\right)\right)\right) \tag{5.5}$$
The expression in [4] is correct but converges much more slowly than Eq (5.5).
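For Eq (5.5), mpmath's Hurwitz zeta function with the optional derivative argument gives a direct check (a sketch; zeta(s, q, 1) below denotes the derivative of ζ(s,q) with respect to s):

```python
# Spot check of Eq (5.5).
from mpmath import mp, quad, zeta, log, sech, sqrt, pi, inf

mp.dps = 25
half, q1, q3 = mp.mpf('0.5'), mp.mpf('0.25'), mp.mpf('0.75')

lhs = quad(lambda x: log(x) * sech(x) / sqrt(x), [0, inf])
rhs = sqrt(pi)/2 * (-2*zeta(half, q1, 1) + 2*zeta(half, q3, 1)
                    + (zeta(half, q3) - zeta(half, q1))
                      * (pi + log(1/(4*pi**2))))
print(lhs, rhs)
```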
Using Eq (5.1) and taking the first partial derivative with respect to α we get
$$\begin{aligned}
\int_{0}^{\infty}\frac{x^{k}}{(\cos(\alpha)+\cosh(x))^{2}}\,dx
&=-2^{k-1}\pi^{k}\csc^{2}(\alpha)\sec\left(\frac{\pi k}{2}\right)\left(k\left(\zeta\left(1-k,\frac{\pi-\alpha}{2\pi}\right)+\zeta\left(1-k,\frac{\alpha+\pi}{2\pi}\right)\right)\right)\\
&\quad-2^{k-1}\pi^{k}\csc^{2}(\alpha)\sec\left(\frac{\pi k}{2}\right)\left(2\pi\cot(\alpha)\left(\zeta\left(-k,\frac{\pi-\alpha}{2\pi}\right)-\zeta\left(-k,\frac{\alpha+\pi}{2\pi}\right)\right)\right)
\end{aligned} \tag{5.6}$$
from (7.102) in [3]. Next we use L'Hopital's rule and take the limit as α→0 to get
$$\int_{0}^{\infty}\frac{x^{k}}{(\cosh(x)+1)^{2}}\,dx=-\frac{1}{3}2^{1-k}\left(\left(2^{k}-8\right)\zeta(k-2)-\left(2^{k}-2\right)\zeta(k)\right)\Gamma(k+1) \tag{5.7}$$
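A Python/mpmath spot check of Eq (5.7) at the illustrative non-integer value k=2.5 (values of k near 1 or 3 would hit the pole of the Riemann zeta function):

```python
# Spot check of Eq (5.7).
from mpmath import mp, quad, zeta, cosh, gamma, inf

mp.dps = 25
k = mp.mpf('2.5')

lhs = quad(lambda x: x**k / (cosh(x) + 1)**2, [0, inf])
rhs = (-mp.mpf(1)/3 * 2**(1 - k)
       * ((2**k - 8)*zeta(k - 2) - (2**k - 2)*zeta(k)) * gamma(k + 1))
print(lhs, rhs)
```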
Then we take the first partial derivative with respect to k to get
$$\begin{aligned}
\int_{0}^{\infty}\frac{x^{k}\log(x)}{(\cosh(x)+1)^{2}}\,dx
&=\frac{1}{3}2^{4-k}k\Gamma(k)\zeta'(k-2)-\frac{2}{3}k\Gamma(k)\zeta'(k-2)-\frac{1}{3}2^{2-k}k\Gamma(k)\zeta'(k)+\frac{2}{3}k\Gamma(k)\zeta'(k)\\
&\quad-\frac{1}{3}2^{1-k}k\log(256)\zeta(k-2)\Gamma(k)+\frac{1}{3}2^{1-k}k\log(4)\zeta(k)\Gamma(k)\\
&\quad+\frac{1}{3}2^{4-k}k\zeta(k-2)\Gamma(k)\psi^{(0)}(k+1)-\frac{2}{3}k\zeta(k-2)\Gamma(k)\psi^{(0)}(k+1)\\
&\quad-\frac{1}{3}2^{2-k}k\zeta(k)\Gamma(k)\psi^{(0)}(k+1)+\frac{2}{3}k\zeta(k)\Gamma(k)\psi^{(0)}(k+1)
\end{aligned} \tag{5.8}$$
Finally we set k=0 to get
$$\int_{0}^{\infty}\frac{\log(x)}{(\cosh(x)+1)^{2}}\,dx=\frac{1}{3}\left(14\zeta'(-2)-\gamma+\log\left(\frac{\pi}{2}\right)\right) \tag{5.9}$$
The integral listed in [4] appears with an error in the integrand.
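Eq (5.9) can be confirmed in the same way; in the sketch below, zeta(-2, 1, 1) is mpmath's ζ′(−2), which equals −ζ(3)/(4π²):

```python
# Spot check of Eq (5.9).
from mpmath import mp, quad, zeta, log, cosh, pi, euler, inf

mp.dps = 25
lhs = quad(lambda x: log(x) / (cosh(x) + 1)**2, [0, inf])
rhs = (14*zeta(-2, 1, 1) - euler + log(pi/2)) / 3
print(lhs, rhs)
```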
Using Eq (4.1), we first take the limit as k→−1 by applying L'Hopital's rule; then, using (7.105) in [3] and simplifying the right-hand side, we get
$$\int_{0}^{\infty}\frac{\log(\cos(\alpha)\operatorname{sech}(x)+1)}{a^{2}+x^{2}}\,dx=\frac{\pi}{a}\log\left(\frac{\sqrt{\pi}\,2^{\frac{1}{2}-\frac{a}{\pi}}\,\Gamma\left(\frac{a}{\pi}+\frac{1}{2}\right)}{\Gamma\left(\frac{a-\alpha+\pi}{2\pi}\right)\Gamma\left(\frac{a+\alpha+\pi}{2\pi}\right)}\right) \tag{5.10}$$
Next we take the first partial derivative with respect to α and set α=π/2 to get
$$\int_{0}^{\infty}\frac{\operatorname{sech}(x)}{a^{2}+x^{2}}\,dx=-\frac{\psi^{(0)}\left(\frac{2a+\pi}{4\pi}\right)-\psi^{(0)}\left(\frac{2a+3\pi}{4\pi}\right)}{2a} \tag{5.11}$$
from Eq (8.360.1) in [5], where ℜ(a)>0. Next we replace x with bx to get
$$\int_{0}^{\infty}\frac{\operatorname{sech}(bx)}{a^{2}+b^{2}x^{2}}\,dx=-\frac{\psi^{(0)}\left(\frac{2a+\pi}{4\pi}\right)-\psi^{(0)}\left(\frac{2a+3\pi}{4\pi}\right)}{2ab} \tag{5.12}$$
Next we set a=b=π to get
$$\int_{0}^{\infty}\frac{\operatorname{sech}(\pi x)}{x^{2}+1}\,dx=\frac{1}{2}\left(\psi^{(0)}\left(\frac{5}{4}\right)-\psi^{(0)}\left(\frac{3}{4}\right)\right)=2-\frac{\pi}{2} \tag{5.13}$$
from Eq (8.363.8) in [5].
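A quick Python/mpmath check of Eq (5.11) at an illustrative value of a, together with the special case (5.13):

```python
# Spot checks of Eqs (5.11) and (5.13).
from mpmath import mp, quad, sech, psi, pi, inf

mp.dps = 25
a = mp.mpf('1.3')
lhs = quad(lambda x: sech(x) / (a**2 + x**2), [0, inf])
rhs = -(psi(0, (2*a + pi)/(4*pi)) - psi(0, (2*a + 3*pi)/(4*pi))) / (2*a)
print(lhs, rhs)

print(quad(lambda x: sech(pi*x) / (x**2 + 1), [0, inf]), 2 - pi/2)
```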
Using Eq (5.12) and setting a=b=π/2 we get
$$\int_{0}^{\infty}\frac{\operatorname{sech}\left(\frac{\pi x}{2}\right)}{x^{2}+1}\,dx=\frac{1}{2}\left(-\gamma-\psi^{(0)}\left(\frac{1}{2}\right)\right)=\log(2) \tag{5.14}$$
from Eq (8.363.8) in [5].
Using Eq (5.12) and setting a=b=π/4 we get
$$\int_{0}^{\infty}\frac{\operatorname{sech}\left(\frac{\pi x}{4}\right)}{x^{2}+1}\,dx=\frac{1}{2}\left(\psi^{(0)}\left(\frac{7}{8}\right)-\psi^{(0)}\left(\frac{3}{8}\right)\right)=\frac{\pi-2\coth^{-1}(\sqrt{2})}{\sqrt{2}} \tag{5.15}$$
from Eq (8.363.8) in [5].
Using Eq (5.10), taking the second partial derivative with respect to α, and setting α=π/2, we get
$$\int_{0}^{\infty}\frac{\operatorname{sech}^{2}(x)}{a^{2}+x^{2}}\,dx=\frac{\psi^{(1)}\left(\frac{a}{\pi}+\frac{1}{2}\right)}{\pi a} \tag{5.16}$$
from Eq (8.363.8) in [5] where ℜ(a)>0.
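Finally, a Python/mpmath spot check of the trigamma formula (5.16) at the illustrative value a=1.7:

```python
# Spot check of Eq (5.16).
from mpmath import mp, quad, sech, psi, pi, inf

mp.dps = 25
a = mp.mpf('1.7')
lhs = quad(lambda x: sech(x)**2 / (a**2 + x**2), [0, inf])
print(lhs, psi(1, a/pi + mp.mpf('0.5')) / (pi * a))
```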
In this paper we were able to present errata and to express our closed-form solutions in terms of special functions and fundamental constants such as π, Euler's constant, and log(2). The trigamma function is often needed in statistical problems involving beta or gamma distributions. This work provides both accurate solutions of the derived integrals and an extended range of parameters over which they hold.
We have presented a novel method, based on contour integration, for deriving some of the interesting definite integrals of Bierens de Haan. The results were numerically verified for real, imaginary, and complex values of the parameters in the integrals using Mathematica by Wolfram.
This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant 504070.
The authors declare there are no conflicts of interest.
[1] G. A. Anastassiou, Univariate hyperbolic tangent neural network approximation, Math. Comput. Model., 53 (2011), 1111–1132. https://doi.org/10.1016/j.mcm.2010.11.072
[2] G. A. Anastassiou, Multivariate sigmoidal neural networks approximation, Neural Netw., 24 (2011), 378–386. https://doi.org/10.1016/j.neunet.2011.01.003
[3] F. L. Cao, T. F. Xie, Z. B. Xu, The estimate for approximation error of neural networks: A constructive approach, Neurocomputing, 71 (2008), 626–630. https://doi.org/10.1016/j.neucom.2007.07.024
[4] F. L. Cao, Y. Q. Zhang, Z. R. He, Interpolation and rates of convergence for a class of neural networks, Appl. Math. Model., 33 (2009), 1441–1456. https://doi.org/10.1016/j.apm.2008.02.009
[5] F. L. Cao, Z. C. Li, J. W. Zhao, K. Lv, Approximation of functions defined on full axis of real by a class of neural networks: Density, complexity and constructive algorithm, Chinese J. Comput., 35 (2012), 786–795. http://dx.doi.org/10.3724/SP.J.1016.2012.00786
[6] Z. X. Chen, F. L. Cao, The approximation operators with sigmoidal functions, Comput. Math. Appl., 58 (2009), 758–765. https://doi.org/10.1016/j.camwa.2009.05.001
[7] D. X. Zhou, Universality of deep convolutional neural networks, Appl. Comput. Harmon. Anal., 48 (2019), 787–794. https://doi.org/10.1016/j.acha.2019.06.004
[8] C. K. Chui, S. B. Lin, B. Zhang, D. X. Zhou, Realization of spatial sparseness by deep ReLU nets with massive data, IEEE Trans. Neural Netw. Learn. Syst., 33 (2022), 229–243. https://doi.org/10.1109/TNNLS.2020.3027613
[9] X. Liu, Approximating smooth and sparse functions by deep neural networks: Optimal approximation rates and saturation, J. Complexity, 79 (2023), 101783. https://doi.org/10.1016/j.jco.2023.101783
[10] D. X. Zhou, Theory of deep convolutional neural networks: Downsampling, Neural Netw., 124 (2020), 319–327. https://doi.org/10.1016/j.neunet.2020.01.018
[11] D. X. Zhou, Deep distributed convolutional neural networks: Universality, Anal. Appl., 16 (2018), 895–919. https://doi.org/10.1142/s0219530518500124
[12] G. S. Wang, D. S. Yu, L. M. Guan, Neural network interpolation operators of multivariate functions, J. Comput. Appl. Math., 431 (2023), 115266. https://doi.org/10.1016/j.cam.2023.115266
[13] D. S. Yu, Approximation by neural networks with sigmoidal functions, Acta Math. Sin. English Ser., 29 (2013), 2013–2026. https://doi.org/10.1007/s10114-013-1730-2
[14] D. S. Yu, Approximation by neural networks with sigmoidal functions, Acta Math. Sin. English Ser., 29 (2013), 2013–2026. https://doi.org/10.1007/s10114-013-1730-2
[15] D. S. Yu, F. L. Cao, Construction and approximation rate for feedforward neural networks operators with sigmoidal functions, J. Comput. Appl. Math., 453 (2025), 116150. https://doi.org/10.1016/j.cam.2024.116150
[16] D. S. Yu, Y. Zhao, P. Zhou, Error estimates for the modified truncations of approximate approximation with Gaussian kernels, Calcolo, 50 (2013), 195–208. https://doi.org/10.1007/s10092-012-0064-2
[17] I. E. Gopengauz, A theorem of A. F. Timan on the approximation of functions by polynomials on a finite segment, Math. Notes Acad. Sci. USSR, 1 (1967), 110–116. https://doi.org/10.1007/BF01268059
[18] D. S. Yu, S. P. Zhou, Approximation by rational operators in Lp spaces, Math. Nachr., 282 (2009), 1600–1618. https://doi.org/10.1002/mana.200610812
[19] Z. Ditzian, V. Totik, Moduli of smoothness, New York: Springer, 1987. https://doi.org/10.1007/978-1-4612-4778-4
[20] G. Mastroianni, J. Szabados, Balázs-Shepard operators on infinite intervals, II, J. Approx. Theory, 90 (1997), 1–8. https://doi.org/10.1006/jath.1996.3075