
Relaxed conditions for universal approximation by radial basis function neural networks of Hankel translates

  • Radial basis function neural networks (RBFNNs) of Hankel translates of order $\mu>-1/2$ with varying widths whose activation function $\sigma$ is a.e. continuous, such that $z^{-\mu-1/2}\sigma(z)$ is locally essentially bounded and not an even polynomial, are shown to enjoy the universal approximation property (UAP) in appropriate spaces of continuous and integrable functions. In this way, the requirement that $\sigma$ be continuous for this kind of network to achieve the UAP is weakened, and some results that hold true for RBFNNs of standard translates are extended to RBFNNs of Hankel translates.

    Citation: Isabel Marrero. Relaxed conditions for universal approximation by radial basis function neural networks of Hankel translates. AIMS Mathematics, 2025, 10(5): 10852-10865. doi: 10.3934/math.2025493




    Many complex problems are nowadays modeled and solved by means of neural networks (NNs), which have become a fundamental tool in machine learning and artificial intelligence. While NNs admit many possible architectures, radial basis function neural networks (RBFNNs) may be classified as single hidden layer, feedforward nonlinear NNs. In fact, they consist of three sequential layers: the first or input layer, the last or output layer, and an intermediate one, referred to as the hidden layer. Information flows only in one direction, from the input layer to the output one. Each layer is composed of several nodes, which act as neurons in the network. Once an input is received by the neurons in the first layer, it is processed by the neurons in the hidden layer by means of a locally biased activation function, thus producing partial outputs that are linearly combined by the neurons in the last layer to render a final output. The nonlinearity of the model comes from the activation function, which, in the case of RBFNNs, is some radial kernel, often a Gaussian.

    More specifically, given $d\in\mathbb{N}$, an RBFNN is any function $v:\mathbb{R}^d\to\mathbb{R}$ expressible as

$$v(x)=\sum_{i=1}^{N}w_i\,h\!\left(\frac{\|x-z_i\|}{\theta_i}\right),\qquad(1.1)$$

    where $h:[0,\infty)\to\mathbb{R}$ represents the activation function; $x\in\mathbb{R}^d$ is the input; $N\in\mathbb{N}$ is the number of hidden layer nodes; $(w_1,\dots,w_N)\in\mathbb{R}^N$ is the $N$-tuple of weights, $w_i$ connecting the $i$-th node to the output layer; and $z_i\in\mathbb{R}^d$, $\theta_i>0$ respectively denote the centroid and width of the kernel at the $i$-th node ($1\le i\le N$). The kernel widths can either remain uniform across all nodes or vary individually for each node.
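    For concreteness, the following minimal NumPy sketch (illustrative only, not taken from the paper; the function name rbfnn, the Gaussian choice of $h$, and all sample parameters are ours) evaluates the model (1.1):

```python
import numpy as np

def rbfnn(x, weights, centroids, widths, h=lambda r: np.exp(-r**2)):
    """Evaluate v(x) = sum_i w_i * h(||x - z_i|| / theta_i), as in Eq. (1.1)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))   # shape (n_samples, d)
    out = np.zeros(len(x))
    for w, z, theta in zip(weights, centroids, widths):
        out += w * h(np.linalg.norm(x - z, axis=1) / theta)
    return out

# A network with N = 2 Gaussian nodes in dimension d = 2.
print(rbfnn([[0.0, 0.0], [1.0, 1.0]],
            weights=[1.0, -0.5],
            centroids=np.array([[0.0, 0.0], [1.0, 0.0]]),
            widths=[1.0, 2.0]))
```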

    Soon after their introduction by Broomhead and Lowe [1] in the 1980s, RBFNNs were applied to supervised learning tasks like classification, pattern recognition, regression, and time series prediction [2,3]. Their theoretical appeal rests on their capacity to be dense in appropriate spaces of integrable or continuous functions, which, in NN terminology, is referred to as the universal approximation property (UAP). A substantial corpus of literature has been devoted to studying this property in terms of the activation function $h$. For instance, Park and Sandberg [4,5] demonstrated that relatively mild conditions on $h$ (such as being integrable with a nonzero integral, bounded, and a.e. continuous) suffice to guarantee this property in $L^p(\mathbb{R}^d)$ ($1\le p<\infty$). Later on, Liao et al. [6] established that RBFNNs can uniformly approximate any continuous function provided that $h$ is a.e. continuous, locally essentially bounded, and not a polynomial. Moreover, for $1\le p<\infty$, any function in an $L^p$ space with respect to a finite measure can be approximated by some RBFNN with an essentially bounded activation function $h$ that is not a polynomial. For further insights on $p$-mean approximation capabilities of RBFNNs, see [7] and the references therein. Although the nonpolynomiality of $h$ is clearly necessary, it has also been shown to suffice for other classes of networks to achieve the UAP [8,9].

    The Hankel transformation, being particularly well-suited to handle radial functions, motivated Arteaga and Marrero [10] to propose and study a radial basis function (RBF) interpolation scheme where the interpolants are given by

$$u(x)=\sum_{i=1}^{n}\alpha_i(\tau_{a_i}\phi)(x)+\sum_{j=0}^{m-1}\beta_j\,p_{\mu,j}(x)\qquad(x\in I).$$

    Here, $I=(0,\infty)$, $\phi$ is a complex basis function on $I$, $\mu\ge-1/2$, and $\tau_z=\tau_{\mu,z}$ stands for the operator of Hankel translation with order $\mu$ and symbol $z\in I$, while, for $1\le i\le n$ and $0\le j\le m-1$, $a_i\in I$ are the interpolation nodes, $p_{\mu,j}(x)=x^{2j+\mu+1/2}$ are monomials of Müntz type, and $\alpha_i$, $\beta_j$ are complex coefficients.

    Details on the Hankel transformation and its associated translation and convolution operators will be provided in Section 2 below, as the present paper develops this approach in the framework of NNs. In fact, by replacing the standard translation in (1.1) with the Hankel translation $\tau_z$ ($z\in I$), we arrive at the following definition.

    Definition 1.1 ([11,12]). An RBFNN of Hankel translates is any real function $v$ on $I$ that can be expressed as

$$v(x)=\sum_{i=1}^{N}w_i\,\tau_{z_i}(\lambda_{\sigma_i}\phi)(x)\qquad(x\in I),$$

    where $\phi$ is the activation function, $N\in\mathbb{N}$ accounts for the number of nodes in the hidden layer, and $w_i\in\mathbb{R}$ stands for the weight from the $i$-th node to the output one, while $z_i,\sigma_i\in I$ represent the centroid and width, respectively, of the $i$-th node ($1\le i\le N$). Also, $(\lambda_r\phi)(t)=\phi(rt)$ ($t\in I$) is a homothety of ratio $r\in I$.

    The class of all RBFNNs of Hankel translates will be denoted by $S_1(\phi)=S_{\mu,1}(\phi)$.
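    To make Definition 1.1 concrete, here is a sketch of ours (not from the paper) that evaluates such a network numerically, anticipating the closed-form Delsarte kernel $D_\mu$ recalled in Section 2 to compute the Hankel translates by quadrature; SciPy is assumed, and all parameter values are arbitrary.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

mu = 0.75

def D(x, y, z):
    """Closed-form Delsarte kernel D_mu(x, y, z); see Section 2."""
    if not (abs(x - y) < z < x + y):
        return 0.0
    return ((z**2 - (x - y)**2) * ((x + y)**2 - z**2))**(mu - 0.5) / (
        2**(3*mu - 1) * np.sqrt(np.pi) * gamma(mu + 0.5) * (x*y*z)**(mu - 0.5))

def hankel_net(x, weights, centroids, widths, phi):
    """v(x) = sum_i w_i tau_{z_i}(lambda_{sigma_i} phi)(x), with tau as in (2.1)."""
    total = 0.0
    for w, zi, si in zip(weights, centroids, widths):
        # (tau_{z_i} f)(x) = int f(t) D(z_i, x, t) dt, supported on |z_i - x| < t < z_i + x
        val, _ = quad(lambda t: phi(si * t) * D(zi, x, t), abs(zi - x), zi + x)
        total += w * val
    return total

phi = lambda t: t**(mu + 0.5) * np.exp(-t**2)   # a sample non-polynomial activation
print(hankel_net(1.0, weights=[1.0, -0.3], centroids=[0.5, 1.5], widths=[1.0, 2.0], phi=phi))
```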

    It should be remarked that the UAP of closely related structures (termed RBFNNs of Delsarte translates) was investigated by Arteaga and the author in a series of papers, beginning with [13]. By considering RBFNNs of Hankel (or Delsarte) translates, a new parameter $\mu$ is introduced, which provides the practitioner with a greater variety of manageable kernels. This might be useful in handling mathematical models built upon a class of RBFs depending on the order $\mu$ [14,15], as network performance can be improved just by finely tuning this extra parameter, without increasing the number of centroids. Indeed, numerical and graphical examples illustrating the effect of $\mu$ on the approximation of functions can be found in [12, Section 5].

    Unless otherwise stated, henceforth we let $\mu>-1/2$. The following function spaces are to be considered:

    $L_{\mu,c}=z^{\mu+1/2}L^\infty([0,c],z^{2\mu+1}\,dz)$ ($c\in I$). The usual norm of this space will be denoted by $\|\cdot\|_{\mu,\infty,c}$.

    $L_{\mu,\infty}$ is the space of functions belonging to $L_{\mu,c}$ for all $c\in I$, topologized by the sequence of seminorms $\{\|\cdot\|_{\mu,\infty,n}\}_{n\in\mathbb{N}}$.

    $C_{\mu,c}$ ($c\in I$) is the space of functions $u$, continuous on $(0,c]$, for which

$$\lim_{z\to0^+}z^{-\mu-1/2}u(z)\qquad(1.2)$$

    exists and is finite, normed by $\|\cdot\|_{\mu,\infty,c}$. The correspondence $u\mapsto z^{-\mu-1/2}u(z)$ sets up an isometric isomorphism between $C_{\mu,c}$ and the Banach space $C[0,c]$ of the functions that are continuous on the interval $[0,c]$, with the supremum norm. Therefore, $C_{\mu,c}$ is Banach, too.

    $C_\mu$ is the space of functions $u$, continuous on $I$, for which (1.2) exists and is finite. Topologized by the sequence of seminorms $\{\|\cdot\|_{\mu,\infty,n}\}_{n\in\mathbb{N}}$, $C_\mu$ becomes a Fréchet space.
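    As a quick illustration (ours, assuming NumPy, with a sample function of our choosing), the weighted norm $\|u\|_{\mu,\infty,c}=\sup_{0<z\le c}|z^{-\mu-1/2}u(z)|$ underlying all of these spaces can be approximated on a grid:

```python
import numpy as np

def weighted_sup_norm(u, mu, c, n=10_000):
    """Approximate ||u||_{mu,infty,c} = sup_{0 < z <= c} |z^(-mu-1/2) u(z)|."""
    z = np.linspace(c / n, c, n)
    return np.max(np.abs(z**(-mu - 0.5) * u(z)))

mu, c = 0.75, 2.0
u = lambda z: z**(mu + 0.5) * np.cos(z)   # z^(-mu-1/2) u(z) = cos z -> 1 as z -> 0+
print(weighted_sup_norm(u, mu, c))        # approximately 1; u belongs to C_{mu,c}
```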

    In [12], Marrero proved the following: when $\phi\in C_\mu$, the class $S_1(\phi)$ is dense in $C_\mu$ if, and only if, $\phi\notin\pi_\mu$, where

$$\pi_\mu=\operatorname{span}\{t^{2r+\mu+1/2}:r\in\mathbb{N}_0\}.\qquad(1.3)$$

    This generalizes to RBFNNs of Hankel translates a result of Pinkus [9, Theorem 12] for standard translates. Here we aim to extend the results in [6] to the Hankel setting as well: we will show that the density of $S_1(\phi)$ in $C_\mu$ (in the sense that the closure of $S_1(\phi)$ as a subspace of $L_{\mu,\infty}$ contains $C_\mu$) can be achieved under relaxed conditions on $\phi$, namely, membership in $L_{\mu,\infty}\setminus\pi_\mu$ and a.e. continuity, instead of membership in $C_\mu$.

    The structure and main results of the paper are as follows: after gathering in Section 2 the basic preliminaries on the translation and convolution operators associated with the Hankel transformation, the UAP is addressed. In Section 3, we recall from [12] the UAP for the case of activation functions in $C_\mu$ (Theorem 3.2), along with an auxiliary lemma, which gets slightly improved. In Section 4, the UAP for a.e. continuous activation functions in $L_{\mu,\infty}$ is established (Theorems 4.6 and 4.7). We remark that, in any event, nonpolynomiality of the activation function in the hidden layer, understood as exclusion from the class (1.3), plays a pivotal role.

    Let $\mu\in\mathbb{R}$, let $J_\mu$ denote the well-known Bessel function of the first kind and order $\mu$, and set $\mathcal{J}_\mu(z)=z^{1/2}J_\mu(z)$ ($z\in I$). Whenever the involved integral exists, the Hankel transform of a function $\phi=\phi(x)$ ($x\in I$) is typically defined as

$$(h_\mu\phi)(x)=\int_0^\infty\phi(t)\,\mathcal{J}_\mu(xt)\,dt\qquad(x\in I).$$
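    As a numerical sanity check (ours, not part of the paper; SciPy assumed), $h_\mu$ can be evaluated by quadrature and tested on the classical self-reciprocal function $t^{\mu+1/2}e^{-t^2/2}$, which $h_\mu$ maps to itself:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv

def hankel_transform(phi, x, mu):
    """(h_mu phi)(x) = int_0^inf phi(t) sqrt(x t) J_mu(x t) dt, by quadrature."""
    val, _ = quad(lambda t: phi(t) * np.sqrt(x * t) * jv(mu, x * t),
                  0.0, np.inf, limit=200)
    return val

mu = 0.75
phi = lambda t: t**(mu + 0.5) * np.exp(-t**2 / 2)   # self-reciprocal under h_mu
for x in (0.5, 1.0, 2.0):
    print(hankel_transform(phi, x, mu), phi(x))     # the two columns should agree
```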

    Zemanian extended the Hankel transformation to spaces of distributions by adapting the ideas that led Schwartz [16] to produce a distributional theory of the Fourier transformation. In fact, the Zemanian class $H_\mu$ [17,18] of all complex functions $\phi\in C^\infty(I)$ such that

$$\nu_{\mu,r}(\phi)=\max_{0\le k\le r}\sup_{x\in I}\left|(1+x^2)^r\,(x^{-1}D)^k\,x^{-\mu-1/2}\phi(x)\right|<\infty\qquad(r\in\mathbb{N}_0),$$

    where $D=d/dx$, plays in the Hankel transformation setting the same role as the Schwartz space of rapidly decreasing functions with respect to the Fourier transformation. When $\mu\ge-1/2$, the sequence of norms $\{\nu_{\mu,r}\}_{r\in\mathbb{N}_0}$ makes $H_\mu$ into a Fréchet space, and $h_\mu$ a self-isomorphism of $H_\mu$. Hence, its adjoint $h'_\mu$ is also a self-isomorphism of the dual $H'_\mu$ when either its weak or its strong topology is considered.

    Zemanian [19] further introduced the class $B_\mu$, which plays with respect to the Hankel transformation the same role as the test space of infinitely differentiable, compactly supported functions in the context of the Fourier transformation. Given $a\in I$, the space $B_{\mu,a}$ consists of all complex functions $\phi\in C^\infty(I)$ satisfying $\phi(x)=0$ for $x>a$, and

$$\delta_{\mu,r}(\phi)=\sup_{x\in I}\left|(x^{-1}D)^r\,x^{-\mu-1/2}\phi(x)\right|<\infty\qquad(r\in\mathbb{N}_0).$$

    Topologized by means of the seminorms $\{\delta_{\mu,r}\}_{r\in\mathbb{N}_0}$, this space is Fréchet. The strict inductive limit $B_\mu$ of $\{B_{\mu,a}\}_{a\in I}$ is a dense subspace of $H_\mu$; consequently, its dual $B'_\mu$ can be viewed as a superspace of $H'_\mu$.

    Sousa Pinto [20] pioneered the study of the distributional Hankel convolution, although focusing on distributions of compact support, with $\mu=0$. Betancor and the author [21,22,23] subsequently extended this theory to wider distribution spaces for any $\mu>-1/2$. The Hankel #-convolution of $\varphi,\phi\in H_\mu$, in the classical sense, is defined as

$$(\varphi\#\phi)(x)=\int_0^\infty\varphi(y)(\tau_x\phi)(y)\,dy\qquad(x\in I),$$

    where

$$(\tau_x\phi)(y)=\int_0^\infty\phi(z)D_\mu(x,y,z)\,dz\qquad(y\in I)\qquad(2.1)$$

    is the Hankel translate of $\phi$, with symbol $x\in I$. For $x,y,z\in I$, the nonnegative function

$$D_\mu(x,y,z)=\int_0^\infty t^{-\mu-1/2}\mathcal{J}_\mu(xt)\mathcal{J}_\mu(yt)\mathcal{J}_\mu(zt)\,dt=\begin{cases}\dfrac{[z^2-(x-y)^2]^{\mu-1/2}\,[(x+y)^2-z^2]^{\mu-1/2}}{2^{3\mu-1}\pi^{1/2}\Gamma(\mu+1/2)\,(xyz)^{\mu-1/2}},&|x-y|<z<x+y,\\[1.5ex]0,&\text{otherwise},\end{cases}$$

    occurring in (2.1) is known as the Delsarte kernel. It is symmetric in its variables and satisfies the duplication formula

$$\int_0^\infty\mathcal{J}_\mu(zt)D_\mu(x,y,z)\,dz=t^{-\mu-1/2}\mathcal{J}_\mu(xt)\mathcal{J}_\mu(yt)\qquad(x,y,t\in I)$$

    along with the integrability property

$$\int_0^\infty D_\mu(x,y,z)\,z^{\mu+1/2}\,dz=c_\mu^{-1}(xy)^{\mu+1/2}\qquad(x,y\in I),\qquad(2.2)$$

    where $c_\mu=2^\mu\Gamma(\mu+1)$. In particular,

$$(\tau_x\phi)(y)=(\tau_y\phi)(x)\qquad(\phi\in H_\mu,\ x,y\in I).$$
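    As another sanity check (ours; SciPy assumed, sample orders and arguments arbitrary), the closed form of $D_\mu$ can be tested against the integrability property (2.2) by quadrature:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def delsarte_kernel(x, y, z, mu):
    """Closed form of D_mu(x, y, z); zero outside |x - y| < z < x + y."""
    if not (abs(x - y) < z < x + y):
        return 0.0
    return ((z**2 - (x - y)**2) * ((x + y)**2 - z**2))**(mu - 0.5) / (
        2**(3*mu - 1) * np.sqrt(np.pi) * gamma(mu + 0.5) * (x*y*z)**(mu - 0.5))

mu, x, y = 0.75, 1.2, 0.7
lhs, _ = quad(lambda z: delsarte_kernel(x, y, z, mu) * z**(mu + 0.5),
              abs(x - y), x + y)
c_mu = 2**mu * gamma(mu + 1)
print(lhs, (x * y)**(mu + 0.5) / c_mu)   # both sides of (2.2) should agree
```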

    Other key results include the shifting formula

$$h_\mu(\tau_y\phi)(x)=x^{-\mu-1/2}\mathcal{J}_\mu(xy)(h_\mu\phi)(x)\qquad(\phi\in H_\mu,\ x,y\in I),$$

    and the exchange formula

$$h_\mu(\varphi\#\phi)(x)=x^{-\mu-1/2}(h_\mu\varphi)(x)(h_\mu\phi)(x)\qquad(\varphi,\phi\in H_\mu,\ x\in I).$$

    The translation operator extends up to $H'_\mu$ by transposition. Given $f\in H'_\mu$ and $\phi\in H_\mu$, their Hankel convolution $f\#\phi$ is defined by

$$(f\#\phi)(x)=\langle f,\tau_x\phi\rangle\qquad(x\in I)$$

    [23, Definition 3.1].

    The shifting and exchange formulas

$$h'_\mu(\tau_yf)(x)=x^{-\mu-1/2}\mathcal{J}_\mu(xy)(h'_\mu f)(x)$$

    and

$$h'_\mu(f\#\phi)(x)=x^{-\mu-1/2}(h_\mu\phi)(x)(h'_\mu f)(x)$$

    are valid in the distributional sense (cf. [23, Proposition 3.5]). The interested reader is especially referred to [18,21,22,23] for a more extensive study of the generalized Hankel transformation and its associated translation and convolution.

    Except for the a.e. pointwise convergence stated in part (i), the next lemma is contained in [12, Lemma 2.1].

    Lemma 3.1. For $z\in I$ and $\phi\in L_{\mu,\infty}$, let $\tau_z\phi$ be as in (2.1), and define

$$(T_z\phi)(x)=\phi_z(x)=c_\mu z^{-\mu-1/2}(\tau_z\phi)(x)\qquad(x\in I).$$

    Then, the following holds:

    (i) The function $x\mapsto(\tau_z\phi)(x)$ is well defined and continuous on $I$. Both operators $T_z$ and $\tau_z$ are linear and continuous from $L_{\mu,\infty}$ into itself. If, moreover, $\phi$ is a.e. continuous, then $\lim_{z\to0^+}\phi_z(x)=\phi(x)$ for a.e. $x\in I$.

    (ii) When restricted to $C_\mu$, both $T_z$ and $\tau_z$ define continuous linear operators into $C_\mu$. Also, if $\phi\in C_\mu$, then $\lim_{z\to0^+}\phi_z=\phi$ in $C_\mu$.

    Proof. As said above, it only remains to show that $\lim_{z\to0^+}\phi_z(x)=\phi(x)$ for a.e. $x\in I$ whenever $\phi\in L_{\mu,\infty}$ is a.e. continuous, that is, whenever the set of its discontinuity points has null measure.

    Assume $x\in I$ is a continuity point of $\phi$; then, given any $\varepsilon>0$, for some $\delta=\delta(x,\varepsilon)>0$, the conditions $t\in I$ and $|t-x|<\delta$ imply

$$\left|t^{-\mu-1/2}\phi(t)-x^{-\mu-1/2}\phi(x)\right|<\varepsilon.$$

    Furthermore, if $0<z<\delta$ and $t\in I$ with $|t-x|\ge\delta>z$, then $D_\mu(x,z,t)=0$. Thus, using (2.2), we may write

$$\begin{aligned}\left|x^{-\mu-1/2}\phi_z(x)-x^{-\mu-1/2}\phi(x)\right|&=\left|c_\mu(xz)^{-\mu-1/2}(\tau_z\phi)(x)-x^{-\mu-1/2}\phi(x)\right|\\&=\left|c_\mu(xz)^{-\mu-1/2}\int_0^\infty\phi(t)D_\mu(x,z,t)\,dt-c_\mu(xz)^{-\mu-1/2}x^{-\mu-1/2}\phi(x)\int_0^\infty D_\mu(x,z,t)\,t^{\mu+1/2}\,dt\right|\\&\le c_\mu(xz)^{-\mu-1/2}\int_{|t-x|<\delta}\left|t^{-\mu-1/2}\phi(t)-x^{-\mu-1/2}\phi(x)\right|D_\mu(x,z,t)\,t^{\mu+1/2}\,dt\\&<\varepsilon\qquad(0<z<\delta),\end{aligned}$$

    which settles the lemma.
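    The convergence $\phi_z\to\phi$ in Lemma 3.1(i) can be observed numerically (an illustration of ours, with SciPy and a sample $\phi$ of our choosing):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

mu = 0.75
c_mu = 2**mu * gamma(mu + 1)

def D(x, y, z):
    """Closed-form Delsarte kernel of Section 2."""
    if not (abs(x - y) < z < x + y):
        return 0.0
    return ((z**2 - (x - y)**2) * ((x + y)**2 - z**2))**(mu - 0.5) / (
        2**(3*mu - 1) * np.sqrt(np.pi) * gamma(mu + 0.5) * (x*y*z)**(mu - 0.5))

phi = lambda t: t**(mu + 0.5) * np.exp(-t)   # a sample element of L_{mu,infty}

def phi_z(x, z):
    """phi_z(x) = c_mu z^(-mu-1/2) (tau_z phi)(x), with tau_z as in (2.1)."""
    tau, _ = quad(lambda t: phi(t) * D(z, x, t), abs(x - z), x + z)
    return c_mu * z**(-mu - 0.5) * tau

x = 1.0
for z in (0.5, 0.1, 0.02):
    print(z, phi_z(x, z), phi(x))   # phi_z(x) approaches phi(x) as z -> 0+
```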

    We end this section with a main result from [12] and some comments about its proof.

    Theorem 3.2 ([12, Theorem 3.3]). Let $\phi\in C_\mu\setminus\pi_\mu$. Then, $S_1(\phi)=\operatorname{span}\{\tau_s(\lambda_r\phi):s,r\in I\}\subset C_\mu$ is dense in $C_\mu$, i.e., for any $f\in C_\mu$, $c\in I$, and $\varepsilon>0$, some $g\in S_1(\phi)$ satisfies $\|f-g\|_{\mu,\infty,c}<\varepsilon$.

    Conversely, if $\phi\in\pi_\mu$, then $S_1(\phi)$ has finite dimension, which prevents it from being dense in $C_\mu$.

    Proof. The description of $S_1(\phi)$ is clear. A proof of the converse part was given in [12, Theorem 2.5]; however, we include it here for completeness. Let

$$S_\mu=x^{-\mu-1/2}Dx^{2\mu+1}Dx^{-\mu-1/2}$$

    denote the Bessel differential operator of order $\mu$. Given $m\in\mathbb{N}_0$, a distribution $f\in H'_\mu$ solves the differential equation $S_\mu^{m+1}f=0$ if, and only if, $f\in\pi_\mu$ and the degree of the even polynomial $t^{-\mu-1/2}f(t)$ is not greater than $2m$ [10, Theorem 2.19]. Assume $\phi\in\pi_\mu$ and $z^{-\mu-1/2}\phi(z)$ has degree $2m$, so that $S_\mu^{m+1}\phi=0$. The commutativity of $S_\mu$ with Hankel translations (cf. [24]), followed by a simple computation, yields

$$S_\mu^{m+1}[\tau_s(\lambda_r\phi)]=r^{2(m+1)}\tau_s[\lambda_r(S_\mu^{m+1}\phi)]=0\qquad(s,r\in I).$$

    This means that the linear space $S_1(\phi)$ is contained in the solution space in $H'_\mu$ of the equation $S_\mu^{m+1}f=0$, which, by the characterization above, has dimension $m+1$. Being finite-dimensional and hence closed, $S_1(\phi)$ cannot be dense in an infinite-dimensional space.
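    The key computation here, namely that $S_\mu^{m+1}$ annihilates every element of $\pi_\mu$ whose even polynomial part has degree at most $2m$, is easy to verify symbolically (a check of ours, assuming SymPy):

```python
import sympy as sp

x, mu = sp.symbols('x mu', positive=True)
half = sp.Rational(1, 2)

def S(f):
    """Bessel operator S_mu = x^(-mu-1/2) D x^(2 mu + 1) D x^(-mu-1/2)."""
    g = sp.diff(x**(-mu - half) * f, x)
    return sp.simplify(x**(-mu - half) * sp.diff(x**(2*mu + 1) * g, x))

r = 3
f = x**(2*r + mu + half)      # an element of pi_mu; deg of t^(-mu-1/2) f(t) is 2r
g = f
for _ in range(r + 1):        # apply S_mu exactly m + 1 = r + 1 times
    g = S(g)
print(sp.simplify(g))         # prints 0
```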

    In this section, a series of lemmas will lead us to our main result. We begin with the following basic fact.

    Lemma 4.1. Assume $A\subset X_\mu$, where $X_\mu=L_{\mu,\infty}$ or $X_\mu=C_\mu$, and let $\overline{A}$, respectively $\overline{A}^{\,c}$, denote the closure of $A$ in the topology of $X_\mu$, respectively in the norm of $X_{\mu,c}$, where, for any $c\in I$, $X_{\mu,c}=L_{\mu,c}$ or $X_{\mu,c}=C_{\mu,c}$. Then,

$$\overline{A}=\bigcap_{c\in I}\overline{A}^{\,c}.$$

    Proof. The inclusion map $X_\mu\hookrightarrow X_{\mu,c}$ being continuous, it is evident that $\overline{A}\subset\overline{A}^{\,c}$ for all $c\in I$.

    Conversely, suppose $g\in\overline{A}^{\,c}$ whenever $c\in I$. Then, in particular, for every $n\in\mathbb{N}$, there exists $g_n\in A$ such that $\|g-g_n\|_{\mu,\infty,n}<n^{-1}$. Given $b\in I$ and $\varepsilon>0$, choose $m\in\mathbb{N}$ with $m\ge\max\{b,\varepsilon^{-1}\}$. We have

$$\|g-g_n\|_{\mu,\infty,b}\le\|g-g_n\|_{\mu,\infty,m}\le\|g-g_n\|_{\mu,\infty,n}<\frac1n\le\frac1m\le\varepsilon\qquad(n\ge m).$$

    The arbitrariness of $b\in I$ shows that $\lim_{n\to\infty}g_n=g$ in the topology of $X_\mu$, so that $g\in\overline{A}$.

    Lemma 4.2. Let $\sigma\in L_{\mu,\infty}$ be a.e. continuous, and let $b,c\in I$. Then, given $\rho\in B_{\mu,b}$, the convolution

$$(\sigma\#\rho)(x)=\int_0^\infty(\tau_x\sigma)(t)\rho(t)\,dt\qquad(x\in I)\qquad(4.1)$$

    lies in $C_{\mu,c}$ and can be approximated from $\operatorname{span}\{\tau_s\sigma:s\in I\}$ in the norm of $L_{\mu,c}$. In other words, for any $\rho\in B_\mu$ we have that $\sigma\#\rho$ lies in $C_\mu$ and belongs to the closure of $\operatorname{span}\{\tau_s\sigma:s\in I\}$ in $L_{\mu,\infty}$.

    Proof. It can be adapted from that of [12, Lemma 3.1]. Fix $\rho\in B_{\mu,b}$. By virtue of Lemma 3.1(i), $\tau_x\sigma\in L_{\mu,\infty}$ for each $x\in I$; consequently, the function (4.1) is well defined.

    We begin by showing the continuity of $\sigma\#\rho$ on $(0,c]$. To this end, pick $x_0\in(0,c]$. We have

$$\begin{aligned}|(\sigma\#\rho)(x)-(\sigma\#\rho)(x_0)|&\le\int_0^\infty|(\tau_x\sigma)(z)-(\tau_{x_0}\sigma)(z)||\rho(z)|\,dz\\&\le b^{\mu+1/2}\int_0^b|(\tau_x\sigma)(z)-(\tau_{x_0}\sigma)(z)|\,|z^{-\mu-1/2}\rho(z)|\,dz\\&\le b^{\mu+1/2}\sup_{z\in I}|z^{-\mu-1/2}\rho(z)|\int_0^b|(\tau_x\sigma)(z)-(\tau_{x_0}\sigma)(z)|\,dz\qquad(x\in(0,c]).\end{aligned}$$

    Moreover, for each $z\in(0,b]$, using (2.2) we may write

$$\begin{aligned}|(\tau_x\sigma)(z)-(\tau_{x_0}\sigma)(z)|&\le\operatorname*{ess\,sup}_{t\in[0,b+c]}|t^{-\mu-1/2}\sigma(t)|\int_0^{b+c}|D_\mu(x,z,t)-D_\mu(x_0,z,t)|\,t^{\mu+1/2}\,dt\\&\le c_\mu^{-1}z^{\mu+1/2}\left(x^{\mu+1/2}+x_0^{\mu+1/2}\right)\operatorname*{ess\,sup}_{t\in[0,b+c]}|t^{-\mu-1/2}\sigma(t)|\\&\le2c_\mu^{-1}(bc)^{\mu+1/2}\operatorname*{ess\,sup}_{t\in[0,b+c]}|t^{-\mu-1/2}\sigma(t)|\qquad(x\in(0,c]).\end{aligned}$$

    Lemma 3.1(i) guarantees that

$$\lim_{x\to x_0}|(\tau_x\sigma)(z)-(\tau_{x_0}\sigma)(z)|=0\qquad(z\in(0,b]).$$

    The desired continuity now follows from an application of the Lebesgue theorem of dominated convergence.

    Similarly, owing to Lemma 3.1(i), the estimate

$$\begin{aligned}\left|c_\mu x^{-\mu-1/2}(\sigma\#\rho)(x)-\int_0^\infty\sigma(z)\rho(z)\,dz\right|&=\left|\int_0^b c_\mu x^{-\mu-1/2}(\tau_x\sigma)(z)\rho(z)\,dz-\int_0^b\sigma(z)\rho(z)\,dz\right|\\&\le\int_0^b\left|c_\mu(xz)^{-\mu-1/2}(\tau_x\sigma)(z)-z^{-\mu-1/2}\sigma(z)\right||\rho(z)|\,z^{\mu+1/2}\,dz\\&=\int_0^b\left|z^{-\mu-1/2}\sigma_x(z)-z^{-\mu-1/2}\sigma(z)\right|\,|z^{-\mu-1/2}\rho(z)|\,z^{2\mu+1}\,dz\\&\le\sup_{z\in I}|z^{-\mu-1/2}\rho(z)|\int_0^b\left|z^{-\mu-1/2}\sigma_x(z)-z^{-\mu-1/2}\sigma(z)\right|z^{2\mu+1}\,dz\qquad(x\in I),\end{aligned}$$

    together with the domination

$$\begin{aligned}\left|z^{-\mu-1/2}\sigma_x(z)-z^{-\mu-1/2}\sigma(z)\right|&\le\left|z^{-\mu-1/2}\sigma_x(z)\right|+\left|z^{-\mu-1/2}\sigma(z)\right|\\&\le\left|c_\mu(xz)^{-\mu-1/2}\int_0^{b+x}D_\mu(x,z,t)\sigma(t)\,dt\right|+\left|z^{-\mu-1/2}\sigma(z)\right|\\&\le2\operatorname*{ess\,sup}_{t\in[0,b+c]}|t^{-\mu-1/2}\sigma(t)|\qquad(x\in(0,c],\ z\in(0,b]),\end{aligned}$$

    allows us to apply dominated convergence once more and arrive at

$$\lim_{x\to0^+}x^{-\mu-1/2}(\sigma\#\rho)(x)=c_\mu^{-1}\int_0^\infty\sigma(z)\rho(z)\,dz.$$

    Thus, $\sigma\#\rho\in C_{\mu,c}$.

    Next, fix $x\in(0,c]$. For each $n\in\mathbb{N}$, consider the partition $\{t_i=ib/n:0\le i\le n\}$ of $[0,b]$, and let $\varepsilon>0$. The following estimate is easily obtained:

$$\begin{aligned}\left|(\sigma\#\rho)(x)-\sum_{i=1}^n\frac{b\rho(t_i)}{n}(\tau_{t_i}\sigma)(x)\right|&\le\left|\int_0^\infty(\tau_x\sigma)(t)\rho(t)\,dt-\sum_{i=1}^n\int_{t_{i-1}}^{t_i}t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\,t^{\mu+1/2}\rho(t)\,dt\right|\\&\quad+\left|\sum_{i=1}^n\int_{t_{i-1}}^{t_i}t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\,t^{\mu+1/2}\rho(t)\,dt-\frac{b}{n}\sum_{i=1}^n(\tau_x\sigma)(t_i)\rho(t_i)\right|.\end{aligned}\qquad(4.2)$$

    As $z^{2\mu+1}$ and $z^{-\mu-1/2}\rho(z)$ are uniformly continuous on $[0,b]$ (cf. [18, Lemma 5.2-1]), for large enough $n$, the second term on the right-hand side of (4.2) can be bounded as follows:

$$\begin{aligned}&\left|\sum_{i=1}^n\int_{t_{i-1}}^{t_i}t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\,t^{\mu+1/2}\rho(t)\,dt-\frac{b}{n}\sum_{i=1}^n(\tau_x\sigma)(t_i)\rho(t_i)\right|\\&\qquad\le x^{\mu+1/2}c_\mu^{-1}\operatorname*{ess\,sup}_{z\in[0,b+c]}|z^{-\mu-1/2}\sigma(z)|\sum_{i=1}^n\int_{t_{i-1}}^{t_i}\left|t^{\mu+1/2}\rho(t)-t_i^{\mu+1/2}\rho(t_i)\right|dt\\&\qquad\le x^{\mu+1/2}c_\mu^{-1}\operatorname*{ess\,sup}_{z\in[0,b+c]}|z^{-\mu-1/2}\sigma(z)|\\&\qquad\qquad\times\sum_{i=1}^n\int_{t_{i-1}}^{t_i}\left[\sup_{t\in I}|t^{-\mu-1/2}\rho(t)|\left|t^{2\mu+1}-t_i^{2\mu+1}\right|+\left|t^{-\mu-1/2}\rho(t)-t_i^{-\mu-1/2}\rho(t_i)\right|t_i^{2\mu+1}\right]dt\\&\qquad<x^{\mu+1/2}\frac{\varepsilon}{2}.\end{aligned}\qquad(4.3)$$

    Concerning the first term on the right-hand side of (4.2), recall that $\sigma$ is a.e. continuous, and note that the representation (2.1), jointly with Lemma 3.1, renders the map $(x,t)\mapsto(xt)^{-\mu-1/2}(\tau_x\sigma)(t)$ continuous on $(I\setminus U)\times[0,\infty)$, where $U$ is some open set containing the points of discontinuity of $\sigma$, with measure less than a given $\lambda>0$. Therefore, this map is uniformly continuous over compacta: to every $\alpha,\beta>0$, there corresponds $N\in\mathbb{N}$, independent of $x\in[\alpha,c]\setminus U$, such that $n\ge N$ implies

$$\left|(xt)^{-\mu-1/2}(\tau_x\sigma)(t)-(xt_i)^{-\mu-1/2}(\tau_x\sigma)(t_i)\right|<\beta\qquad(t\in[t_{i-1},t_i],\ 1\le i\le n).$$

    In particular, given $\alpha,\eta>0$, we may arrange for

$$\begin{aligned}&\left|\int_0^\infty(\tau_x\sigma)(t)\rho(t)\,dt-\sum_{i=1}^n\int_{t_{i-1}}^{t_i}t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\,t^{\mu+1/2}\rho(t)\,dt\right|\\&\qquad\le\sum_{i=1}^n\int_{t_{i-1}}^{t_i}\left|t^{-\mu-1/2}(\tau_x\sigma)(t)-t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\right|\,|t^{-\mu-1/2}\rho(t)|\,t^{2\mu+1}\,dt\\&\qquad\le x^{\mu+1/2}\sup_{t\in I}|t^{-\mu-1/2}\rho(t)|\sum_{i=1}^n\int_{t_{i-1}}^{t_i}\left|(xt)^{-\mu-1/2}(\tau_x\sigma)(t)-(xt_i)^{-\mu-1/2}(\tau_x\sigma)(t_i)\right|t^{2\mu+1}\,dt\\&\qquad<x^{\mu+1/2}\eta\qquad(x\in[\alpha,c]\setminus U),\end{aligned}\qquad(4.4)$$

    provided that $n$ is large enough. This way, given $\eta,\delta>0$, there exists $N\in\mathbb{N}$ such that, whenever $n\ge N$, the measure of the set of points $x\in(0,c]$ for which the left-hand side of (4.4), weighted by $x^{-\mu-1/2}$, is greater than or equal to $\eta$ does not exceed $\delta$; that is, the sequence of such measures converges to zero or, in other words, the corresponding functional sequence converges to zero in measure. By passing to a subsequence if necessary, a.e. convergence is achieved; thus, we obtain

$$\left|\int_0^\infty(\tau_x\sigma)(t)\rho(t)\,dt-\sum_{i=1}^n\int_{t_{i-1}}^{t_i}t_i^{-\mu-1/2}(\tau_x\sigma)(t_i)\,t^{\mu+1/2}\rho(t)\,dt\right|<x^{\mu+1/2}\frac{\varepsilon}{2}\qquad(4.5)$$

    for a.e. $x\in[0,c]$ and sufficiently large $n$. A combination of (4.2), (4.3), and (4.5) results in the estimate

$$\left\|\sigma\#\rho-\sum_{i=1}^n\frac{b\rho(t_i)}{n}\tau_{t_i}\sigma\right\|_{\mu,\infty,c}=\operatorname*{ess\,sup}_{x\in[0,c]}\left|x^{-\mu-1/2}(\sigma\#\rho)(x)-x^{-\mu-1/2}\sum_{i=1}^n\frac{b\rho(t_i)}{n}(\tau_{t_i}\sigma)(x)\right|<\varepsilon,$$

    valid for large $n$, which establishes the first part of the proof.

    Now, for any $\rho\in B_\mu$, we have that $\sigma\#\rho\in C_\mu$ lies in the closure of $\operatorname{span}\{\tau_s\sigma:s\in I\}$ in $L_{\mu,c}$ whenever $c\in I$. Since, by Lemma 3.1(i), $\operatorname{span}\{\tau_s\sigma:s\in I\}\subset L_{\mu,\infty}$, a direct application of Lemma 4.1 reveals that $\sigma\#\rho$ belongs to the closure of $\operatorname{span}\{\tau_s\sigma:s\in I\}$ in $L_{\mu,\infty}$. The proof is complete.

    Remark 4.3. Observe that, in the notation and conditions of Lemma 4.2, both

$$\left\{\sum_{i=1}^n\frac{b\rho(t_i)}{n}\,\tau_{t_i}\sigma\right\}_{n\in\mathbb{N}}$$

    and

$$\left\{\sum_{i=1}^n\left[t_i^{-\mu-1/2}\int_{t_{i-1}}^{t_i}t^{\mu+1/2}\rho(t)\,dt\right]\tau_{t_i}\sigma\right\}_{n\in\mathbb{N}}$$

    are approximating sequences to $\sigma\#\rho$ from $\operatorname{span}\{\tau_s\sigma:s\in I\}$.
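    The first of these sequences can be watched converging numerically. The sketch below is ours (SciPy assumed); $\sigma$ is a sample a.e. continuous element of $L_{\mu,\infty}$ and $\rho$ is a smooth bump standing in for an element of $B_{\mu,b}$, both of our choosing:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

mu, b = 0.75, 1.0

def D(x, y, z):
    """Closed-form Delsarte kernel of Section 2."""
    if not (abs(x - y) < z < x + y):
        return 0.0
    return ((z**2 - (x - y)**2) * ((x + y)**2 - z**2))**(mu - 0.5) / (
        2**(3*mu - 1) * np.sqrt(np.pi) * gamma(mu + 0.5) * (x*y*z)**(mu - 0.5))

sigma = lambda t: t**(mu + 0.5) * np.exp(-t)

def rho(t):
    """A smooth bump supported in [0, b], standing in for an element of B_{mu,b}."""
    if t <= 0.0 or t >= b:
        return 0.0
    return t**(mu + 0.5) * np.exp(-1.0 / (1.0 - (t / b)**2))

def tau_sigma(s, x):
    """(tau_s sigma)(x) as in (2.1), supported on |s - x| < z < s + x."""
    val, _ = quad(lambda z: sigma(z) * D(s, x, z), abs(s - x), s + x)
    return val

def conv(x):
    """(sigma # rho)(x) = int_0^b (tau_x sigma)(t) rho(t) dt, cf. (4.1)."""
    val, _ = quad(lambda t: tau_sigma(x, t) * rho(t), 0.0, b)
    return val

def riemann_net(x, n):
    """First approximating sum of Remark 4.3: sum_i (b rho(t_i)/n) (tau_{t_i} sigma)(x)."""
    return sum(b * rho(i * b / n) / n * tau_sigma(i * b / n, x) for i in range(1, n + 1))

x = 0.8
print(conv(x), riemann_net(x, 20), riemann_net(x, 80))   # the sums approach the convolution
```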

    Lemma 4.4. Assume $\sigma\in L_{\mu,\infty}$ is a.e. continuous and does not lie in $\pi_\mu$. Then, some $\rho\in B_\mu$ is such that $\sigma\#\rho$ does not lie in $\pi_\mu$, either.

    Proof. Lemma 4.2 allows us to argue as in the proof of [12, Lemma 3.2].

    Lemma 4.5. If $\sigma\in L_{\mu,\infty}$, $\rho\in B_\mu$, and $a\in I$, then $\tau_a(\sigma\#\rho)=\sigma\#\tau_a\rho$.

    Proof. Defined as in (4.1), the convolution $\sigma\#\tau_a\rho$ makes sense, because $B_\mu$ is stable under Hankel translations [21, Corollary 3.3].

    Let $b\in I$ be such that $\rho(t)=0$ for $t>b$. There holds

$$\begin{aligned}&\int_0^\infty D_\mu(a,x,z)\,dz\int_0^\infty|\rho(s)|\,ds\int_0^\infty|\sigma(t)|D_\mu(z,s,t)\,dt\\&\qquad\le\int_0^{x+a}D_\mu(a,x,z)\,dz\int_0^b|\rho(s)|\,ds\int_0^{x+a+b}|\sigma(t)|D_\mu(z,s,t)\,dt\\&\qquad\le\sup_{s\in I}|s^{-\mu-1/2}\rho(s)|\int_0^{x+a}D_\mu(a,x,z)\,dz\int_0^{x+a+b}|\sigma(t)|\,dt\int_0^\infty D_\mu(z,s,t)\,s^{\mu+1/2}\,ds\\&\qquad=c_\mu^{-1}\sup_{s\in I}|s^{-\mu-1/2}\rho(s)|\int_0^{x+a}D_\mu(a,x,z)\,z^{\mu+1/2}\,dz\int_0^{x+a+b}|t^{-\mu-1/2}\sigma(t)|\,t^{2\mu+1}\,dt\\&\qquad\le c_\mu^{-2}(ax)^{\mu+1/2}\operatorname*{ess\,sup}_{t\in[0,x+a+b]}|t^{-\mu-1/2}\sigma(t)|\,\sup_{s\in I}|s^{-\mu-1/2}\rho(s)|\int_0^{x+a+b}t^{2\mu+1}\,dt<\infty\qquad(x\in I).\end{aligned}$$

    Thus, the Fubini theorem may be applied to obtain

$$\begin{aligned}\tau_a(\sigma\#\rho)(x)&=\int_0^\infty(\sigma\#\rho)(z)D_\mu(a,x,z)\,dz=\int_0^\infty D_\mu(a,x,z)\,dz\int_0^\infty\rho(s)\,ds\int_0^\infty\sigma(t)D_\mu(z,s,t)\,dt\\&=\int_0^\infty\sigma(t)\,dt\int_0^\infty\rho(s)\,ds\int_0^\infty D_\mu(a,x,z)D_\mu(z,s,t)\,dz\\&=\int_0^\infty\sigma(t)\,dt\int_0^\infty\rho(s)\,ds\int_0^\infty D_\mu(a,z,s)D_\mu(x,z,t)\,dz\\&=\int_0^\infty dz\int_0^\infty\sigma(t)D_\mu(x,z,t)\,dt\int_0^\infty\rho(s)D_\mu(a,z,s)\,ds\\&=\int_0^\infty(\tau_x\sigma)(z)(\tau_a\rho)(z)\,dz=(\sigma\#\tau_a\rho)(x)\qquad(x\in I),\end{aligned}$$

    as claimed.

    Theorem 4.6. Let $\sigma\in L_{\mu,\infty}\setminus\pi_\mu$ be a.e. continuous. Then,

$$S_1(\sigma)=\operatorname{span}\{\tau_s(\lambda_r\sigma):s,r\in I\}\subset L_{\mu,\infty}$$

    is dense in $C_\mu$, i.e., for any $f\in C_\mu$, $c\in I$, and $\varepsilon>0$, some $g\in S_1(\sigma)$ satisfies $\|f-g\|_{\mu,\infty,c}<\varepsilon$.

    Conversely, if $\sigma\in\pi_\mu$, then $S_1(\sigma)$ has finite dimension, which prevents it from being dense in $C_\mu$.

    Proof. The converse statement is contained in Theorem 3.2.

    For the direct one, use Lemmas 4.2 and 4.4 to get some $\rho\in B_\mu$ such that $\sigma\#\rho\in C_\mu\setminus\pi_\mu$. The identity

$$\lambda_r(\tau_q\sigma)=r^{\mu+1/2}\,\tau_{q/r}(\lambda_r\sigma)\qquad(r,q\in I)\qquad(4.6)$$

    can be derived by simple changes of variables. A combination of Theorem 3.2 with (4.6) and Lemma 4.5 yields the density of

$$S_1(\sigma\#\rho)=\operatorname{span}\{\lambda_r(\sigma\#\tau_q\rho):r,q\in I\}$$

    in $C_\mu$. Recalling that $B_\mu$ is stable under Hankel translations, invoke Lemma 4.2 again, this time to approximate $\sigma\#\tau_q\rho$ from $\operatorname{span}\{\tau_s\sigma:s\in I\}$ in the topology of $L_{\mu,\infty}$. After a new application of (4.6), we are done.
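    The change-of-variables identity (4.6) lends itself to a quick numerical check (our sketch; SciPy assumed, parameter values arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

mu = 0.75

def D(x, y, z):
    """Closed-form Delsarte kernel of Section 2."""
    if not (abs(x - y) < z < x + y):
        return 0.0
    return ((z**2 - (x - y)**2) * ((x + y)**2 - z**2))**(mu - 0.5) / (
        2**(3*mu - 1) * np.sqrt(np.pi) * gamma(mu + 0.5) * (x*y*z)**(mu - 0.5))

def tau(q, f, x):
    """(tau_q f)(x) = int f(z) D(q, x, z) dz over |q - x| < z < q + x."""
    val, _ = quad(lambda z: f(z) * D(q, x, z), abs(q - x), q + x)
    return val

sigma = lambda t: t**(mu + 0.5) * np.exp(-t**2)
r, q, x = 1.7, 0.9, 1.1

lhs = tau(q, sigma, r * x)                                    # (lambda_r (tau_q sigma))(x)
rhs = r**(mu + 0.5) * tau(q / r, lambda t: sigma(r * t), x)   # r^(mu+1/2) (tau_{q/r} (lambda_r sigma))(x)
print(lhs, rhs)                                               # the two values should agree
```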

    As a consequence of Theorem 4.6, the hypotheses imposed on the activation function in [12, Theorem 4.1] can be weakened.

    Theorem 4.7. Let $\sigma\in L_{\mu,\infty}$ be a.e. continuous, and let $1\le p<\infty$. Given $c\in I$, let $\gamma$ be a Radon measure on $[0,c]$ satisfying

$$\int_0^c t^{\mu+1/2}\,d|\gamma|(t)<\infty.$$

    Then, for $S_1(\sigma)=\operatorname{span}\{\tau_s(\lambda_r\sigma):s,r\in I\}$ to be dense in $L^p([0,c],d\gamma)$, it is necessary and sufficient that $\sigma\notin\pi_\mu$.

    Proof. If $\sigma\in\pi_\mu$ then, as shown above, $S_1(\sigma)$ has finite dimension, which prevents it from being dense in $L^p([0,c],d\gamma)$.

    Conversely, if $\sigma\notin\pi_\mu$ then, by Theorem 4.6, $S_1(\sigma)$ is dense in $C_{\mu,c}$, and hence in $L^p([0,c],d\gamma)$.

    The universal approximation property (UAP) of three-layered radial basis function neural networks of Hankel translates with varying widths has been studied. The requirement on the activation function $\sigma$ in the hidden layer for such networks to approximate continuous functions locally in the ess-sup norm has been satisfactorily weakened from continuity to local essential boundedness and a.e. continuity, provided that $z^{-\mu-1/2}\sigma(z)$ ($z\in I$) is not an even polynomial. The UAP in $p$-mean ($1\le p<\infty$) with respect to a suitable finite measure can therefore be attained under the same relaxed condition.

    The author declares she has not used Artificial Intelligence (AI) tools in the creation of this article.

    The author wants to express her gratitude to the anonymous reviewers for valuable comments that helped improve the presentation of the paper.

    There is no conflict of interest to disclose.



    [1] D. S. Broomhead, D. Lowe, Multivariable functional interpolation and adaptive networks, Complex Syst., 2 (1988), 321–355.
    [2] R. P. Lippmann, Pattern classification using neural networks, IEEE Commun. Mag., 27 (1989), 47–64. https://doi.org/10.1109/35.41401
    [3] S. Renals, R. Rohwer, Phoneme classification experiments using radial basis functions, International 1989 Joint Conference on Neural Networks, Washington, DC (USA), 1989, 461–467. https://doi.org/10.1109/IJCNN.1989.118620
    [4] J. Park, I. W. Sandberg, Universal approximation using radial-basis-function networks, Neural Comput., 3 (1991), 246–257. https://doi.org/10.1162/neco.1991.3.2.246
    [5] J. Park, I. W. Sandberg, Approximation and radial-basis-function networks, Neural Comput., 5 (1993), 305–316. https://doi.org/10.1162/neco.1993.5.2.305
    [6] Y. Liao, S. C. Fang, H. L. W. Nuttle, Relaxed conditions for radial-basis function networks to be universal approximators, Neural Netw., 16 (2003), 1019–1028. https://doi.org/10.1016/S0893-6080(02)00227-7
    [7] D. Nan, W. Wu, J. L. Long, Y. M. Ma, L. J. Sun, $L^p$ approximation capability of RBF neural networks, Acta Math. Sin.-Engl. Ser., 24 (2008), 1533–1540. https://doi.org/10.1007/s10114-008-6423-x
    [8] M. Leshno, V. Y. Lin, A. Pinkus, S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Netw., 6 (1993), 861–867. https://doi.org/10.1016/S0893-6080(05)80131-5
    [9] A. Pinkus, TDI-subspaces of $C(\mathbb{R}^d)$ and some density problems from neural networks, J. Approx. Theory, 85 (1996), 269–287. https://doi.org/10.1006/jath.1996.0042
    [10] C. Arteaga, I. Marrero, A scheme for interpolation by Hankel translates of a basis function, J. Approx. Theory, 164 (2012), 1540–1576. https://doi.org/10.1016/j.jat.2012.08.005
    [11] I. Marrero, The role of nonpolynomiality in uniform approximation by RBF networks of Hankel translates, J. Funct. Spaces, 2019 (2019), 1845491. https://doi.org/10.1155/2019/1845491
    [12] I. Marrero, Radial basis function neural networks of Hankel translates as universal approximators, Anal. Appl. (Singap.), 17 (2019), 897–930. https://doi.org/10.1142/S0219530519500064
    [13] C. Arteaga, I. Marrero, Universal approximation by radial basis function networks of Delsarte translates, Neural Netw., 46 (2013), 299–305. https://doi.org/10.1016/j.neunet.2013.06.011
    [14] H. Corrada, K. Lee, B. Klein, R. Klein, S. Iyengar, G. Wahba, Examining the relative influence of familial, genetic, and environmental covariate information in flexible risk models, Proc. Natl. Acad. Sci. USA, 106 (2009), 8128–8133. https://doi.org/10.1073/pnas.0902906106
    [15] S. Hamzehei Javaran, N. Khaji, A. Noorzad, First kind Bessel function (J-Bessel) as radial basis function for plane dynamic analysis using dual reciprocity boundary element method, Acta Mech., 218 (2011), 247–258. https://doi.org/10.1007/s00707-010-0421-7
    [16] L. Schwartz, Théorie des distributions, Vols. I, II, Publications de l'Institut de Mathématique de l'Université de Strasbourg, Paris: Hermann & Cie, 1950–1951.
    [17] A. H. Zemanian, A distributional Hankel transformation, SIAM J. Appl. Math., 14 (1966), 561–576. https://doi.org/10.1137/0114049
    [18] A. H. Zemanian, Generalized integral transformations, Pure and Applied Mathematics, Vol. 18, New York: John Wiley & Sons, 1968.
    [19] A. H. Zemanian, The Hankel transformation of certain distributions of rapid growth, SIAM J. Appl. Math., 14 (1966), 678–690. https://doi.org/10.1137/0114056
    [20] J. de Sousa Pinto, A generalised Hankel convolution, SIAM J. Math. Anal., 16 (1985), 1335–1346. https://doi.org/10.1137/0516097
    [21] J. J. Betancor, I. Marrero, The Hankel convolution and the Zemanian spaces $B_\mu$ and $B'_\mu$, Math. Nachr., 160 (1993), 277–298. https://doi.org/10.1002/mana.3211600113
    [22] J. J. Betancor, I. Marrero, Structure and convergence in certain spaces of distributions and the generalized Hankel convolution, Math. Japon., 38 (1993), 1141–1155.
    [23] I. Marrero, J. J. Betancor, Hankel convolution of generalized functions, Rend. Mat. Ser. VII, 15 (1995), 351–380.
    [24] J. J. Betancor, A new characterization of the bounded operators commuting with Hankel translation, Arch. Math., 69 (1997), 403–408. https://doi.org/10.1007/s000130050138
  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)