U-statistics were first introduced by [115] in connection with unbiased estimators, following initial work by [106]. In brief, the U-statistic of order $m$ with measurable kernel $h:S^m\to\mathbb{R}$, based on a sequence $\{X_i\}_{i=1}^{\infty}$ of random variables with values in a measurable space $(S,\mathcal{S})$, is given by
$$U_n(h)=\frac{(n-m)!}{n!}\sum_{(i_1,\ldots,i_m)\in I_n^m}h(X_{i_1},\ldots,X_{i_m}),\qquad n\ge m,$$
where
$$I_n^m=\big\{(i_1,\ldots,i_m):i_j\in\mathbb{N},\ 1\le i_j\le n,\ i_j\ne i_k\ \text{if}\ j\ne k\big\}.$$
$U_n(h)$ is the nonparametric uniformly minimum variance unbiased estimator of $\theta=\mathbb{E}\big(h(X_1,\ldots,X_m)\big)$. It is the minimizer with respect to $\alpha$ of
$$\sum_{1\le i_1<\cdots<i_m\le n}\big(h(X_{i_1},\ldots,X_{i_m})-\alpha\big)^2.$$
Empirical variance, Gini's mean difference, and Kendall's rank correlation coefficient are common examples of estimators based on U-statistics. The Wilcoxon signed rank test for the hypothesis of location at zero is a classical test based on U-statistics, as discussed by [190] in Example 12.4. Asymptotic results for the case of independent and identically distributed underlying random variables were first provided by [115], who also referred to related work by [57,83,172,176]. Similar results were obtained for V-statistics by [94,198]. Extensive literature on the theory of U-statistics has been developed, as reviewed by [10,138,176], and others. A detailed review and major historical developments in this field can be found in the book by [20]. U-processes are sets of U-statistics indexed by a family of kernels, which can be viewed as infinite-dimensional variants of U-statistics with a single kernel or as stochastic processes that are nonlinear extensions of empirical processes. U-processes have been applied to solve complex statistical problems such as density estimation, nonparametric regression tests, and goodness-of-fit tests. Considering a large group of statistics instead of a single statistic is more statistically interesting, and ideas from the theory of empirical processes can be used to construct limit or approximation theorems for U-processes. However, obtaining results for U-processes is not easy and requires significant effort and distinct methodologies. Generalizing from empirical processes to U-processes is particularly difficult, especially in the stationary setting. U-processes appear in statistics in many instances, such as the components of higher-order terms in von Mises expansions, and play a role in analyzing estimators (including function estimators) with varying degrees of smoothness. For instance, the product limit estimator for truncated data is analyzed in [183] using a.s. uniform bounds for P-canonical U-processes. In addition, [11] introduce two new tests for normality based on U-processes, while [173] use weighted L1-distances between the standard normal density and local U-statistics based on standardized observations to propose new tests for normality, utilizing the results of [101]. Moreover, in [122], the median-of-means approach, which is based on U-statistics, is introduced to estimate the mean of multivariate functions in the case of possibly heavy-tailed distributions. U-processes play a significant role in various statistical applications, including testing for qualitative features of functions in nonparametric statistics [1,100], cross-validation for density estimation [157], and establishing limiting distributions of M-estimators [10,68]. [177] provide necessary and sufficient conditions for the law of large numbers and sufficient conditions for the central limit theorem for U-processes. For further references on U-statistics and U-processes, interested readers may refer to [28,40,46,47,48,54,138,179], while a comprehensive insight into the theory of U-processes is provided by [68]. U-statistics are also naturally found in other contexts, such as the theory of random graphs, where they count occurrences of specific subgraphs like triangles, as presented in [120]. In machine learning, U-statistics arise naturally in various problems such as clustering, image recognition, ranking, and learning on graphs, where natural risk estimates take the form of U-statistics, as discussed in [63].
For instance, the empirical ranking error of any given prediction rule is a U-statistic of order 2, as stated in [62]. For U-statistics with random kernels of diverging orders, readers may refer to [97,112,178,180]. Infinite-order U-statistics are also useful for constructing simultaneous prediction intervals that quantify the uncertainty of ensemble methods such as subbagging and random forests, as presented in [161]. The MeanNN estimator of differential entropy, introduced by [90], is a particular application of U-statistics. Additionally, [143] proposed a new test statistic for goodness-of-fit tests using U-statistics. Moreover, [65] explored a model-free approach for clustering and classifying genetic data based on U-statistics, leading to alternative ways of addressing genetic problems; their motivation was the versatility and adaptability of U-statistics to various genetic problems and different data types. [140] proposed using U-statistics, in a natural way, for analyzing random compressed sensing matrices in the non-asymptotic regime. Extending the above exploration to conditional U-processes is practically useful and technically more challenging.
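To make the definition concrete, the following minimal Python sketch (our illustration; the sample size and kernel choices are arbitrary assumptions, not taken from the cited works) evaluates $U_n(h)$ by brute force for two classical order-2 kernels.

```python
import itertools
import math
import numpy as np

def u_statistic(data, h, m):
    """U_n(h): average of h over all ordered m-tuples of distinct indices."""
    n = len(data)
    total = sum(h(*(data[i] for i in idx))
                for idx in itertools.permutations(range(n), m))
    return total / math.perm(n, m)   # math.perm(n, m) = n! / (n - m)!

rng = np.random.default_rng(0)
x = rng.normal(size=30)

# Gini's mean difference: symmetric order-2 kernel h(x1, x2) = |x1 - x2|.
gini = u_statistic(x, lambda a, b: abs(a - b), m=2)

# Kernel h(x1, x2) = (x1 - x2)^2 / 2 yields exactly the unbiased sample variance.
var_u = u_statistic(x, lambda a, b: 0.5 * (a - b) ** 2, m=2)

print(gini, var_u, np.var(x, ddof=1))  # the last two numbers coincide
```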
We first introduce Stute's estimators. Let us consider a regular sequence of random elements $\{(X_i,Y_i),\,i\in\mathbb{N}^*\}$, with $X_i\in\mathbb{R}^d$ and $Y_i\in\mathcal{Y}$, some Polish space, where $\mathbb{N}^*=\mathbb{N}\setminus\{0\}$. Let $\varphi:\mathcal{Y}^m\to\mathbb{R}$ be a measurable function. In this paper, we are primarily concerned with the estimation of the conditional expectation, or regression function, for $\mathbf{t}\in\mathbb{R}^{dm}$,
$$r^{(m)}(\varphi,\mathbf{t})=\mathbb{E}\big(\varphi(Y_1,\ldots,Y_m)\mid(X_1,\ldots,X_m)=\mathbf{t}\big),$$ (1.1)
whenever it exists, i.e.,
$$\mathbb{E}\big(|\varphi(Y_1,\ldots,Y_m)|\big)<\infty.$$
We now introduce a kernel function $K:\mathbb{R}^d\to\mathbb{R}$ with support contained in $[-B,B]^d$, $B>0$, satisfying:
$$\sup_{x\in\mathbb{R}^d}|K(x)|=:\kappa<\infty\quad\text{and}\quad\int K(x)\,dx=1.$$ (1.2)
[182] introduced a class of estimators for $r^{(m)}(\varphi,\mathbf{t})$, called conditional U-statistics, which is defined for each $\mathbf{t}\in\mathbb{R}^{dm}$ to be:
$$\widehat{r}^{(m)}_n(\varphi,\mathbf{t};h_K)=\frac{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}\varphi(Y_{i_1},\ldots,Y_{i_m})\,K\!\Big(\frac{t_1-X_{i_1}}{h_K}\Big)\cdots K\!\Big(\frac{t_m-X_{i_m}}{h_K}\Big)}{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}K\!\Big(\frac{t_1-X_{i_1}}{h_K}\Big)\cdots K\!\Big(\frac{t_m-X_{i_m}}{h_K}\Big)},$$ (1.3)
where
$$I(m,n)=\big\{\mathbf{i}=(i_1,\ldots,i_m):1\le i_j\le n\ \text{and}\ i_j\ne i_r\ \text{if}\ j\ne r\big\},$$
is the set of all $m$-tuples of distinct integers between $1$ and $n$, and $\{h_K:=h_n\}_{n\ge1}$ is a sequence of positive constants converging to zero at the rate $nh_K^m\to\infty$. In the particular case $m=1$, $r^{(m)}(\varphi,\mathbf{t})$ reduces to
$$r^{(1)}(\varphi,t)=\mathbb{E}\big(\varphi(Y)\mid X=t\big),$$
and Stute's estimator becomes the Nadaraya-Watson [153,200] estimator of $r^{(1)}(\varphi,t)$; refer to [50] for details. The work of [175] was devoted to estimating the rate of the uniform convergence in $\mathbf{t}$ of $\widehat{r}^{(m)}_n(\varphi,\mathbf{t};h_K)$ to $r^{(m)}(\varphi,\mathbf{t})$. In [165], the limit distributions of $\widehat{r}^{(m)}_n(\varphi,\mathbf{t};h_K)$ are analyzed and compared to those derived by Stute. Under proper mixing settings, [111] extended the results of [182] to weakly dependent data and applied their findings to validate the Bayes risk consistency of the corresponding discrimination rules. As alternatives to the conventional kernel-type estimators, [186] offered symmetrized nearest neighbor conditional U-statistics. [98] evaluated the functional conditional U-statistic and determined its finite-dimensional asymptotic normality. Nonparametric estimation of conditional U-statistics in a functional data context has received comparatively little attention despite the subject's significance. Recent developments are described in [42,44,45,52,56], in which the authors examine challenges associated with the uniform in bandwidth consistency in general settings. [119] evaluated the test of independence in the functional framework using Kendall statistics, which may be viewed as special cases of U-statistics. [14] introduced a comprehensive framework for clustering within multiple groups, employing a U-statistics-based approach specifically designed for high-dimensional datasets. This method classifies data into three groups while evaluating the significance of these partitions. In a related context, [128] focused on dimension-agnostic inference, devising methods whose validity remains independent of assumptions regarding dimension versus sample size. Their approach utilized variational representations of existing test statistics, incorporating sample splitting and self-normalization to yield a refined test statistic with a Gaussian limiting distribution. This modification of degenerate U-statistics involved dropping diagonal blocks and retaining off-diagonal blocks. Exploring further, [59] delved into U-statistics-based empirical risk minimization, while [121] examined asymmetric U-statistics based on a stationary sequence of m-dependent variables, with applications motivated by pattern matching in random strings and permutations. Additionally, [188] developed innovative U-statistics considering left truncation and right censoring. As an application, they proposed a straightforward non-parametric test for assessing the independence between time to failure and the cause of failure in competing risks, particularly when observations are subject to left truncation and right censoring. In a different context, [136] investigated the quadruplet U-statistic, offering applications in statistical inference for network analysis. It would be interesting to connect U-statistics with the problems investigated in [187,201,204,205]. The extension of the preceding investigation to conditional empirical U-processes is both practically beneficial and technically difficult.
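A direct numerical transcription of Stute's estimator (1.3) for $d=1$ and $m=2$ may be sketched as follows; the Epanechnikov kernel, the bandwidth value, and the concordance-type kernel $\varphi$ are illustrative assumptions of ours, not prescriptions of [182].

```python
import itertools
import numpy as np

def epanechnikov(u):
    """One common kernel satisfying (1.2): bounded and integrating to one."""
    return 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1.0)

def stute_estimator(X, Y, phi, t, h, m=2):
    """Conditional U-statistic (1.3) with d = 1: weighted mean of phi over
    m-tuples of distinct indices, weights given by a product of kernels."""
    num = den = 0.0
    for idx in itertools.permutations(range(len(X)), m):
        w = np.prod([epanechnikov((t[j] - X[i]) / h) for j, i in enumerate(idx)])
        num += phi(*(Y[i] for i in idx)) * w
        den += w
    return num / den if den > 0 else np.nan

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=60)
Y = X + 0.3 * rng.normal(size=60)

# phi(y1, y2) = 1{y1 + y2 > 0}: estimates P(Y1 + Y2 > 0 | X1 = 0.5, X2 = -0.2).
print(stute_estimator(X, Y, lambda y1, y2: float(y1 + y2 > 0),
                      t=(0.5, -0.2), h=0.4))
```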
Recently, there has been a growing interest in regression models in which the response variable is real-valued and the explanatory variable is represented by smooth functions that vary arbitrarily between repeated observations or measurements. This form of data, known as functional data, appears in numerous disciplines, such as climatology (hourly concentration of pollutants), medicine (the knee angle of children as a function of time), economics, linguistics, etc. Functional time series are commonly encountered in practice, for example, when a long continuous-time process is divided into smaller natural units, such as days. In this instance, every intraday curve is a functional random variable. This paper focuses mostly on the instance of functional data and the theory of U-processes. We give an excerpt from [6]: Functional data analysis (FDA) is a branch of statistics that focuses on the study of variables having an unlimited number of dimensions, such as curves, sets, and images. It has had spectacular growth over the past two decades, fueled partly by technological advances that have led to the "Big Data" revolution. FDA is today one of the most active and significant disciplines of research in data science, despite its reputation at the turn of the century as a fairly obscure area of study. The reader is referred to the works of [167,168] and [91] for an introduction to this field. These sources offer a variety of case studies in numerous fields, including criminology, economics, archaeology, and neurophysiology, as well as basic analysis techniques. It should be noted that the extension of probability theory to random variables with values in normed vector spaces (such as Banach and Hilbert spaces), as well as extensions of certain classical asymptotic limit theorems, predates the recent literature on functional data; the reader is referred to [7]. [99] examined density and mode estimates for data occupying a normed vector space; that study also addresses the topic of the curse of dimensionality for functional data and proposes solutions to the problem. In the context of regression estimation, [91] examined nonparametric models. We may also refer to [21,116]. [130] provided a nice mix of foundational material, accessible theory, and practical examples. Recently, the contemporary theory has been used in the analysis of functional data: [93] provided the consistency rates of several functionals of the conditional distribution, such as the regression function, the conditional cumulative distribution, the conditional density, and others, uniformly over a subset of the explanatory variable. [125] established the consistency rates for some functional nonparametric models, such as the regression function, the conditional distribution, the conditional density, and the conditional hazard function, uniformly in bandwidth (UIB consistency). [35] extended these results to the ergodic setting. [12] examined the issue of local linear estimation of the regression function when the regressor is functional and demonstrated strong convergence (with rates) uniformly in bandwidth parameters. [141] examined the k-nearest neighbors (kNN) estimate of the nonparametric regression model for strong mixing functional time series data and demonstrated the uniform nearly complete convergence rate of the kNN estimator under several moderate conditions.
[30] provided several limiting law results for the conditional mode in the functional setting for ergodic data; for more recent references, see [3,4,5,53,55,151].
The primary objective of this study is to examine a generic framework and characterize the weak convergence and the uniform convergence of kNN conditional U-processes based on a regular sequence of random functions. This is motivated by the fact that the k-NN method is a fundamental statistical method presenting several advantages. Recall that the k-NN method considers the $k$ neighbors of $X_i$ nearest to $x$ with respect to some distance $d(\cdot,\cdot)$. The local bandwidth of the k-NN method is random and depends on the data $X_i$, thereby respecting the local structure of the data, which is essential in infinite dimensions. Historically, the k-NN method was first introduced by [95] (see also [96]) in the context of nonparametric discrimination, and further investigated by [144]; for more details, we refer to [17]. It is commonly used in practice (see [91]) and is simple to handle because the user has only one parameter to control, the number $k$ of nearest neighbors, valued in a finite set. In addition, it allows us to build a neighborhood adapted to the data at any point. The k-NN method is widely studied when the explanatory variable is an element of a finite-dimensional space; for instance, see [16,64,77,104,145]. In an infinite-dimensional space, i.e., a functional framework, there are three different approaches for k-NN regression estimation. The first, published by [135], examines a k-NN kernel estimate when the functional variable is an element of a separable Hilbert space $\mathcal{H}$. In this approach, [135] established a weak consistency result. The strategy of [135] is to reduce the infinite dimension of $\mathcal{H}$ by using a projection on a finite-dimensional subspace, considering only the first $m$ coefficients of an expansion of $X$ in an orthonormal system of $\mathcal{H}$, and then applying the multivariate techniques on the projected data to perform the k-NN regression. The second approach is based on the k-NN procedure and functional local linear estimation; the consistency with the convergence rate is obtained in [60] and [142].
More precisely, in this paper, we are interested in establishing the a.co. uniform consistency and the a.co. uniform in the number of neighbors (UINN) consistency of the nonparametric functional regression estimator, and also of the functional conditional U-processes (statistics). [157] were the first to introduce the notion of uniform in bandwidth consistency for kernel density estimators, and they applied empirical process methods in their study. This motivated a series of papers, among many others [15,25,33,36,37,38,42,44,51,52,54,71,78,87,88], in which the authors established uniform in bandwidth (UIB) consistency results for such estimators in the i.i.d. finite-dimensional setting, where $h_n$ varies within suitably chosen intervals indexed by $n$. In the FDA, several authors have been interested in studying nonparametric functional estimators. For example, [93] provided the consistency rates of some functionals of the conditional distribution, including the regression function, the conditional cumulative distribution, the conditional density, and some others, uniformly over a certain subset of the explicative variable. [125] established the uniform consistency rate for some conditional models, including the regression function, the conditional distribution, the conditional density, and the conditional hazard function. The last mentioned paper is extended by [42]. [124] and [3] established the almost complete convergence of the k-nearest neighbors (k-NN) estimators, uniform in the number of neighbors, under some classical assumptions on the kernel and on the small ball probabilities of the functional variable, in connection with the entropy condition controlling the space complexity. [12] considered the problem of local linear estimation of the regression function when the covariate is functional and proved the strong uniform-in-bandwidth (UIB) convergence. [141] investigated the k-NN estimation of the nonparametric regression model for strong mixing functional time series data and established the uniform a.co. convergence rate of the k-NN estimator under some mild conditions. [158] stated some new uniform asymptotic results for kernel estimates in the functional single-index model. Most of this literature focuses on UIB or UINN consistency or on uniform consistency on some functional subset, but never both together, which was investigated in [44] in the independent framework. We aim to fill this gap in the literature by combining results from the FDA and the empirical processes theory in the dependent setting. The second problem that we investigate, the weak convergence, is not simple, and the main merits of our contribution are the control of the asymptotic equi-continuity under minimal conditions in this general setting, which constitutes a fundamentally unresolved open problem in the literature. We intend to fill this gap in the literature by integrating the results of [8] and [27] with the strategies described in [147] and [43] for handling functional data. But, as will be demonstrated in the following section, the challenge requires much more than "just" merging concepts from the existing results. In reality, intricate mathematical derivations will be necessary to deal with the typical functional data in our framework. This necessitates the effective application of large sample theory tools, which were established for dependent empirical processes and for which we have used results from the work of [8,27,43]. Even with i.i.d. functional data, no weak convergence result for the kNN conditional U-processes has been proven up to the present.
The current paper looks into the challenges that high-dimensional functional data present and provides important findings that are applicable to high-dimensional data models. This research expands the classical kernel estimator beyond samples that are independent and identically distributed to include stationary random processes. In addition, the main focus is on the kNN kernel estimator. It specifically examines four fundamental aspects that pertain to the kNN kernel estimator for regression and its uniform consistency. The study investigates the UINN and UIB scenarios to guarantee consistent results within the domain of functional regression (the UIB in Theorem 3.3 and the UINN in Theorem 3.1). The outcomes include the prediction of relative error, which enhances the overall comprehension of the kNN kernel estimator's performance (the UIB in Corollary 3.6 and the UINN in Corollary 3.5). The paper extends the kNN method to functional conditional U-statistics with consistency results, thereby extending the application area of these estimators. The study additionally investigates the uniform consistency and UINN consistency of functional conditional U-statistics, offering a comprehensive assessment of their limiting behaviour (the UIB in Theorems 3.8–3.10 and Corollary 3.12 and the UINN in Theorems 3.14, 3.16 and Corollary 3.18). In addition, the study makes an advanced contribution to the field by establishing a uniform central limit theorem for classes of functions that, subject to specific moment conditions, are either bounded or unbounded (for the conditional process, the normality is provided in Theorem 4.1 and the equicontinuity in Theorem 4.2; for the conditional U-process, the normality is provided in Theorem 4.5 and Corollary 4.6 and the equicontinuity in Theorem 4.7). The process of establishing our main results uses advanced methodologies, including the k-nearest neighbours (kNN) method, covering numbers, small-ball probability, the Hoeffding decomposition, decoupling methods, and the modern theory of the empirical process indexed by functions. At this stage, we mention that the Hoeffding decomposition cannot be used directly in our setting and needs some intricate preparation. All these results are established under fairly general conditions on function classes and underlying distributions, the majority of which are derived from prior works, thus guaranteeing their feasibility. The obtained results have the potential to be utilised in numerous statistical domains, such as time series prediction, set-indexed conditional U-statistics, and the Kendall rank correlation coefficient. Key technical tools in the proofs are the maximal moment inequalities for U-processes and [84]'s results on β-mixing.
The layout of the present article is as follows. Section 2 is devoted to introducing the functional framework and the definitions that we need in our work; we also give the assumptions used in our asymptotic analysis, with a short discussion. Section 3 is devoted to the strong uniform convergence with rate. Section 4.1 provides the weak convergence of empirical processes in the functional framework. Section 4.2 gives the main results of the paper concerning the uniform CLT for the conditional U-processes. In Section 5, we collect some potential applications, including the set-indexed conditional U-statistics in Section 5.1, the Kendall rank correlation coefficient in Section 5.2, the discrimination problems in Section 5.3, and the time series prediction from a continuous set of past values in Section 5.4. We discuss a bandwidth choice for practical use in Section 6. Some concluding remarks and possible future developments are relegated to Section 7. To prevent interrupting the flow of the presentation, all proofs, based upon modern empirical process theory, are gathered in Section 8. Due to the lengthiness of the proofs, we limit ourselves to the most important arguments. A few relevant technical results are given in the Appendix.
Let {(Xi,Yi):i≥1} be a sequence of stationary† random copies of the random vector [rv] (X,Y), where X takes its values in some abstract space X and Y in the abstract space Y. Suppose that X is endowed with a semi-metric d(⋅,⋅)‡ defining a topology to measure the proximity between two elements of X and which is disconnected from the definition of X to avoid measurability problems. We are mainly interested in establishing the weak convergence of the conditional U-process based on the following U-statistic in the k-NN setting introduced in [44] by
$$\widehat{r}^{*(m)}_n\big(\varphi,\mathbf{t};H_{n,k}(\mathbf{t})\big)=\frac{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}\varphi(Y_{i_1},\ldots,Y_{i_m})\,K\!\Big(\frac{d(t_1,X_{i_1})}{H_{n,k}(t_1)}\Big)\cdots K\!\Big(\frac{d(t_m,X_{i_m})}{H_{n,k}(t_m)}\Big)}{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}K\!\Big(\frac{d(t_1,X_{i_1})}{H_{n,k}(t_1)}\Big)\cdots K\!\Big(\frac{d(t_m,X_{i_m})}{H_{n,k}(t_m)}\Big)},$$ (2.1)
†In the case of Hilbert-space-valued elements, strict stationarity is not necessary; second-order stationarity suffices. A Hilbert-space-valued sequence $\{X_t\}_{t\in\mathbb{Z}}$ is second-order (or weakly) stationary if $\mathbb{E}\|X_t\|^2<\infty$, the mean $\mathbb{E}X_t$ does not depend on $t$, and the covariance $\mathrm{Cov}(X_t,X_s)$ depends only on $t-s$, for all $t,s\in\mathbb{Z}$. We say that $\{X_t\}_{t\in\mathbb{Z}}$ is strictly stationary if the joint distribution of $(X_{t_1},\ldots,X_{t_k})$ and the joint distribution of $(X_{t_1+h},\ldots,X_{t_k+h})$ coincide, for all $k\ge1$, $t_1,\ldots,t_k\in\mathbb{Z}$, and $h\in\mathbb{Z}$.
‡A semi-metric (sometimes called a pseudo-metric) is a metric which allows $d(x_1,x_2)=0$ for some $x_1\ne x_2$.
as an estimator for the multivariate regression function
$$r^{(m)}(\varphi,\mathbf{t})=\mathbb{E}\big(\varphi(Y_1,\ldots,Y_m)\mid(X_1,\ldots,X_m)=\mathbf{t}\big),\qquad\mathbf{t}\in\mathcal{X}^m,$$ (2.2)
where
(2.3) |
and $\varphi:\mathcal{Y}^m\to\mathbb{R}$ is a symmetric measurable function belonging to some class of functions $\mathscr{F}_m$, and $\big(H_{n,k}(t_1),\ldots,H_{n,k}(t_m)\big)$ is a vector of positive random variables that depend on $(X_1,\ldots,X_n)$ such that, for all $1\le j\le m$ and $t_j\in\mathcal{X}$,
$$\sum_{i=1}^{n}\mathbb{1}_{B(t_j,\,H_{n,k}(t_j))}(X_i)=k,$$ (2.4)
where $B(t,h)=\{x\in\mathcal{X}:d(x,t)\le h\}$ is a ball in $\mathcal{X}$ with center $t$ and radius $h$, and $\mathbb{1}_A$ is the indicator function of the set $A$. In fact, this $k$-NN estimate can be considered as an extension, to random and locally adaptive neighborhoods, of the functional conditional $U$-statistics estimate of $r^{(m)}(\varphi,\mathbf{t})$, defined for all $\mathbf{t}\in\mathcal{X}^m$ as:
$$\widehat{r}^{(m)}_n(\varphi,\mathbf{t};h_n)=\frac{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}\varphi(Y_{i_1},\ldots,Y_{i_m})\,K\!\Big(\frac{d(t_1,X_{i_1})}{h_n}\Big)\cdots K\!\Big(\frac{d(t_m,X_{i_m})}{h_n}\Big)}{\displaystyle\sum_{(i_1,\ldots,i_m)\in I(m,n)}K\!\Big(\frac{d(t_1,X_{i_1})}{h_n}\Big)\cdots K\!\Big(\frac{d(t_m,X_{i_m})}{h_n}\Big)},$$ (2.5)
where $\{h_n\}_{n\ge1}$ is a sequence of positive real numbers decreasing to zero as $n$ goes to infinity. At this stage, we highlight that kernel estimation has been popular since the classical Akaike-Parzen-Rosenblatt kernel density estimation [refer to [2,160,171]]. However, the first appearance of kernel estimators is likely to be [95]: as the original technical report is difficult to find, it has been re-published as [96]. [160] has shown, under some assumptions on the kernel, that the kernel density estimator is asymptotically unbiased and consistent whenever the bandwidth tends to zero suitably and the point of estimation is a continuity point of the density. Under some additional assumptions on the kernel and the bandwidth, he obtained an asymptotic normality result, too. The kernel estimators have been extensively studied in the literature; see, e.g., [28,31,34,49,73,75,76,85,86,108,154,174,199] and the references therein. The $k$-NN method is a fundamental statistical tool with various advantages. Generally, the procedure is computationally efficient and requires minimal parameter adjustment. In addition, the $k$-NN approaches are nonparametric, which allows them to automatically adapt to any continuous underlying distributions without relying on any particular models. The $k$-NN techniques were proven consistent for various significant statistical problems, including density estimation, classification, and regression, provided that a suitable $k$ is chosen. It should be noted that, since our objective is to generalize the results obtained for the estimator defined in (2.5), and given the fact that one of the main differences is that the smoothing parameter is a vector of random variables instead of a univariate parameter, our first course of action is to extend the results of [42,43] to the multivariate setting. First, we need to introduce some notation. Let $\mathscr{F}_m$ denote a pointwise measurable class of real-valued symmetric functions on $\mathcal{Y}^m$ with a measurable envelope function
$$F(\mathbf{y})\ \ge\ \sup_{\varphi\in\mathscr{F}_m}|\varphi(\mathbf{y})|,\qquad\mathbf{y}\in\mathcal{Y}^m.$$ (2.6)
For a kernel function $K$ and a subset $S_{\mathcal{X}}\subset\mathcal{X}$, we define the pointwise measurable class of functions, for $m\ge1$:
$$\mathscr{K}^m:=\Big\{(x_1,\ldots,x_m)\mapsto\prod_{j=1}^{m}K\Big(\frac{d(t_j,x_j)}{h}\Big):\ h>0,\ (t_1,\ldots,t_m)\in S_{\mathcal{X}}^m\Big\}.$$
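To illustrate how the random bandwidths $H_{n,k}(t_j)$ in (2.1) and (2.4) operate in practice, here is a minimal Python sketch for discretized curves compared through an $L^2$-type semi-metric; the grid, the uniform kernel, and the choice of $k$ are illustrative assumptions of ours, not specifications from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 50)          # discretization grid for the curves
n = 40
X = np.array([np.sin(2 * np.pi * (grid + rng.uniform()))
              + 0.1 * rng.normal(size=grid.size) for _ in range(n)])
Y = X.mean(axis=1) + 0.1 * rng.normal(size=n)

def semi_metric(x1, x2):
    """An L2-type semi-metric between two discretized curves."""
    return np.sqrt(np.mean((x1 - x2) ** 2))

def knn_bandwidth(X, t, k):
    """H_{n,k}(t): distance to the k-th nearest curve, so that the ball
    B(t, H_{n,k}(t)) contains exactly k observations, as in (2.4)."""
    return np.sort([semi_metric(t, x) for x in X])[k - 1]

def knn_conditional_u(X, Y, phi, t1, t2, k):
    """Estimator (2.1) for m = 2 with the uniform kernel K = 1_{[0,1]}."""
    H1, H2 = knn_bandwidth(X, t1, k), knn_bandwidth(X, t2, k)
    num = den = 0.0
    for i, j in itertools.permutations(range(len(X)), 2):
        w = float(semi_metric(t1, X[i]) <= H1) * float(semi_metric(t2, X[j]) <= H2)
        num += phi(Y[i], Y[j]) * w
        den += w
    return num / den

# Conditional Gini-type kernel phi(y1, y2) = |y1 - y2| / 2 at two sample curves.
print(knn_conditional_u(X, Y, lambda a, b: 0.5 * abs(a - b), X[0], X[1], k=7))
```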
Statistical observations are not always independent but are often close to being so. Dependence may lead to severe repercussions on statistical inference if it is not taken into consideration. The notion of mixing quantifies how close to independence a sequence of random variables is, allowing us to extend standard results for independent sequences to weakly dependent or mixing sequences. Let us specify the dependence structure that will be the focus of this study. Let $\{Z_i\}_{i\in\mathbb{Z}}$ be a stationary sequence of random variables on some probability space $(\Omega,\mathcal{A},\mathbb{P})$, and let $\mathcal{F}_j^l$ be the $\sigma$-field generated by $\{Z_i:j\le i\le l\}$ for $j\le l$. The sequence is said to be $\beta$-mixing, or absolutely regular, refer to [170,197], if:
$$\beta(n):=\sup_{k\ge1}\ \mathbb{E}\sup\Big\{\big|\mathbb{P}\big(A\mid\mathcal{F}_1^k\big)-\mathbb{P}(A)\big|:A\in\mathcal{F}_{k+n}^{\infty}\Big\}\longrightarrow0\quad\text{as }n\to\infty.$$
It should be noted that [118] obtained a complete description of stationary Gaussian processes satisfying the last property. Throughout the sequel, we assume tacitly that the sequence of random elements is absolutely regular. Markov chains, for instance, are $\beta$-mixing under the milder Harris recurrence condition, or if the underlying space is finite [19,46,66,180]. We also need to introduce some concepts related to the topological structure of functional spaces. First, we define the small-ball probability, for a fixed $t\in\mathcal{X}$ and for all $h>0$, by
$$\phi_t(h):=\mathbb{P}\big(X\in B(t,h)\big)=\mathbb{P}\big(d(X,t)\le h\big);$$ (2.7)
this notion is widely used in nonparametric functional data analysis to avoid introducing density assumptions on the functional variable and address the issues associated with the infinite-dimensional nature of the functional spaces. At this point, we can refer to [91,99,147]. We also need to deal with the VC-subgraph classes ("VC" for Vapnik and Chervonenkis, for instance, see [132,193,194]).
Definition 2.1. A class of subsets $\mathscr{C}$ on a set $C$ is called a VC-class if there exists a polynomial $P(\cdot)$ such that, for every set of $N$ points in $C$, the class $\mathscr{C}$ picks out at most $P(N)$ distinct subsets.
Definition 2.2. A class of functions $\mathscr{F}$ is called a VC-subgraph class if the graphs of the functions in $\mathscr{F}$ form a VC-class of sets; that is, if we define the subgraph of a real-valued function $f$ on $S$ as the following subset $G_f$ of $S\times\mathbb{R}$:
$$G_f=\big\{(s,u)\in S\times\mathbb{R}:u<f(s)\big\},$$
the class $\{G_f:f\in\mathscr{F}\}$ is a VC-class of sets on $S\times\mathbb{R}$. Informally, a VC-class of functions is characterized by having a polynomial covering number (the minimal number of functions required to cover the entire class).
Definition 2.3. Let $S_{\mathcal{X}}$ be a subset of a semi-metric space $\mathcal{X}$. A finite set of points $\{x_1,\ldots,x_N\}\subset\mathcal{X}$ is called, for a given $\epsilon>0$, an $\epsilon$-net of $S_{\mathcal{X}}$ if:
$$S_{\mathcal{X}}\subset\bigcup_{j=1}^{N}B(x_j,\epsilon).$$
If $N_\epsilon(S_{\mathcal{X}})$ is the cardinality of the smallest $\epsilon$-net (the minimal number of open balls of radius $\epsilon$) needed to cover $S_{\mathcal{X}}$, then we call Kolmogorov's entropy (metric entropy) of the set $S_{\mathcal{X}}$ the quantity
$$\psi_{S_{\mathcal{X}}}(\epsilon):=\log\big(N_\epsilon(S_{\mathcal{X}})\big).$$
From its name, one can figure that this concept of metric entropy was introduced by Kolmogorov [131] and was studied subsequently for numerous metric spaces. This concept was used by [80] to give sufficient conditions for continuity of Gaussian processes, and was the basis for striking generalizations of Donsker's theorem on the weak convergence of the empirical process. Suppose that $S_{\mathcal{X}_1}$ and $S_{\mathcal{X}_2}$ are two subsets of semi-metric spaces with Kolmogorov entropies (for the radius $\epsilon$) $\psi_{S_{\mathcal{X}_1}}(\epsilon)$ and $\psi_{S_{\mathcal{X}_2}}(\epsilon)$, respectively; then the Kolmogorov entropy of the subset $S_{\mathcal{X}_1}\times S_{\mathcal{X}_2}$ of the product semi-metric space satisfies
$$\psi_{S_{\mathcal{X}_1}\times S_{\mathcal{X}_2}}(\epsilon)\le\psi_{S_{\mathcal{X}_1}}(\epsilon)+\psi_{S_{\mathcal{X}_2}}(\epsilon).$$
Hence, $m\,\psi_{S_{\mathcal{X}}}(\epsilon)$ bounds the Kolmogorov entropy of the subset $S_{\mathcal{X}}^m$ of the semi-metric space $\mathcal{X}^m$. Noting that if we designate by $d(\cdot,\cdot)$ the semi-metric on $\mathcal{X}$, then we can define a semi-metric $d_m(\cdot,\cdot)$ on $\mathcal{X}^m$ by:
$$d_m(\mathbf{s},\mathbf{t}):=\max_{1\le j\le m}d(s_j,t_j),\qquad\text{for }\mathbf{s},\mathbf{t}\in\mathcal{X}^m.$$
Notice that the semi-metric plays an important role in this kind of study. The reader will find useful discussions about how to choose the semi-metric in [91] (see Chapters 3 and 13).
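Kolmogorov's entropy is easy to approximate empirically: a greedy construction yields an $\epsilon$-net of a finite sample of curves, and the logarithm of its size upper-bounds the entropy $\log N_\epsilon$ of that sample. The following sketch, with an arbitrary family of curves and an $L^2$-type semi-metric, is purely illustrative.

```python
import numpy as np

def greedy_eps_net(points, dist, eps):
    """Greedily pick centers so that every point lies within eps of some center;
    log(len(net)) then upper-bounds the entropy log N_eps of the sample."""
    net = []
    for p in points:
        if all(dist(p, c) > eps for c in net):
            net.append(p)
    return net

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 30)
curves = [np.sin(2 * np.pi * f * grid) for f in rng.uniform(0.5, 3.0, size=200)]
dist = lambda a, b: np.sqrt(np.mean((a - b) ** 2))  # L2-type semi-metric

for eps in (0.5, 0.25, 0.1):
    net = greedy_eps_net(curves, dist, eps)
    print(eps, len(net), np.log(len(net)))  # entropy estimate grows as eps shrinks
```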
Let us present the conditions that we need in our analysis.
(C.1.) On the distributions/small-ball probabilities
(C.1.1) For and , we have
where $f_1(\cdot)$ is a non-negative functional, and $\phi(\cdot)$ is an invertible function, absolutely continuous in a neighborhood of the origin.
(C.1.2) For let , we have
where is a non-negative function, and as satisfying is bounded.
(C.2) On the smoothness of the model
(C.2.1) The regression satisfies for and for
where for and the semi metric on :
for
(C.2.2) The conditional variance, defined for is continuous in some neighborhood of
Further, assume that for some and
is continuous in some neighborhood of
(C.2.3) For the function does not depend on and is continuous in some neighborhood of
(C.3) On the kernel function
(C.3.1) The kernel function $K$ is supported within $[0,1]$, and there exist some constants such that
and
(C.3.2) The kernel $K$ is a positive and differentiable function on its support, with derivative $K'$ such that
$$-\infty<C_3\le K'(u)\le C_4<0,\qquad u\in[0,1].$$ (2.8)
(C.4) On the classes of functions
(C.4.1) The class of functions is bounded and its envelope function satisfies for some
(C.4.2) The class of functions is unbounded and its envelope function satisfies for some
(C.4.3) The metric entropy of the class satisfies, for some :
where
(C.4.4) The class of functions is supposed to be of VC-type with envelope function previously defined. Hence, there are two finite constants and such that:
for any and each probability measure such that .
(C.5) On the dependence of the random variables
(C.5.1) Absolute regularity
for some and
(C.5.2) There is a sequence of positive integers such that, as ,
(C.6.) On the entropy
For $n$ large enough and for some constants, Kolmogorov's entropy satisfies:
(2.9) |
(2.10) |
where and is the minimal number of open balls of radius in , needed to cover
(C.7.) The sequences and (resp. and ) verify
(2.11) |
(C.8.) There exist sequences , and () and constants , such that
and
(2.12) |
(2.13) |
(2.14) |
(2.15) |
Additional/alternative conditions
(C.1'.) For
(C.1'.1) We have:
with , and is absolutely continuous in a neighborhood of the origin.
(C.1'.2) We have:
where is a non-negative function, , and as satisfying is bounded.
(C.3'.) The kernel function $K$ is supported within $[0,1]$, and there exist some constants such that:
Comments
In our nonparametric functional regression model, we deal with a complex theoretical challenge: mainly, establishing functional central limit theorems for the conditional empirical process and the conditional U-process under functional absolutely regular data. We also use random (or data-dependent) bandwidths based on the nearest neighbors (kNN) approach. Although standard statistical methods cannot be utilized in the functional setting, most imposed conditions overlap with some characteristics of the infinite-dimensional spaces, such as the topological structure of $\mathcal{X}$, the probability distribution of the functional variable, and the measurability concept for the classes of functions involved. It is worth mentioning that most of the conditions that we will be using throughout this paper are inspired by [43,91,99,141,147]. Let us start with assumption (C.1.1), which was adapted from [147], who in turn was inspired by [99]. As explained by [147], in the finite-dimensional setting condition (C.1.1) coincides with the fundamental axioms of probability calculus. Furthermore, if $\mathcal{X}$ is an infinite-dimensional Hilbert space, then the small-ball probability can converge to zero exponentially fast as $h\to0$. Condition (C.1.1) can be considered a standard condition on the small ball probability, which is used to control the behavior of $\phi_t(h)$ around zero. It shows that we can approximately write the small ball probability as a product of two independent functions, one of $t$ and one of $h$; see, for instance, [148] for the diffusion process, [18] for a Gaussian measure, and [139] for a general Gaussian process. The most frequent result available in the literature is of the form $\phi_t(h)\approx C_t\,h^{\gamma}\exp(-C/h^{p})$ with $\gamma\ge0$ and $p\ge0$; it corresponds to the Ornstein-Uhlenbeck and general diffusion processes and to the fractal processes (for the latter, the exponential factor disappears). For more examples, refer to [92]. It is worth noting that, in general, when we deal with functional data, we need some information about the variability of the small-ball probability to adapt to the bias of nonparametric estimators; this information is usually obtained by supposing that:
(C.1.1'')
Condition (C.2) concerns the regularity of the model; it consists of mild conditions on the continuity of certain conditional moments, in addition to the standard Lipschitz assumption on the regression ((C.2.1)). Assumption (C.3) is another classical condition in nonparametric estimation models, which concerns the kernel function $K$. It should be noted that condition (C.3.1) can be replaced with condition (C.3') to find an expression for the asymptotic variance. We use condition (C.4.1) when dealing with bounded functions. However, our interest also extends to conditional $U$-processes indexed by an unbounded class of functions. In this case, we replace (C.4.1) by (C.4.2). Keep in mind that there is a trade-off between the moment order in (C.4.2) and the decay rate of the mixing coefficient imposed in (C.5): the larger the moment order is, the weaker the required decay of the mixing coefficient. Also note that if the mixing coefficient decays exponentially fast, then (C.5.) is automatically satisfied. Furthermore, this condition is indispensable in our work, since studying the weak convergence of the empirical processes entails establishing asymptotic equi-continuity. For Assumption (C.4.4), [163, Examples 26 and 38], [157, Lemma 22], [82, § 4.7.], [193, Theorem 2.6.7], and [132, § 9.1] provide a number of sufficient conditions under which (C.4.4) holds; we may also refer to [70, § 3.2] for further discussions. For instance, it is satisfied, for general $m$, whenever $\varphi=g\circ p$, with $p$ being a polynomial in $m$ variables and $g$ being a real-valued function of bounded variation; we refer the reader to [88, p. 1381]. We also mention that the class of functions is assumed, in general, to be pointwise measurable, which is satisfied whenever the functions are of bounded variation (in the sense of Hardy and Krause [110,133,195]; see, e.g., [61,114,156,196]). Condition (C.6.) takes the topological considerations into account by controlling the Kolmogorov entropy§ of the set $S_{\mathcal{X}}$, which is standard in nonparametric models when we study the uniform consistency and the uniform in bandwidth consistency; we refer to [93] and [134] for discussions. As mentioned in [134], there are special cases of functional spaces and subsets for which the Kolmogorov entropy can be evaluated explicitly. Some examples are the closed ball in a Sobolev space, the unit ball of the Cameron-Martin space, and a compact subset in a Hilbert space with a projection semi-metric (see [93,131,192], respectively, for further details). In all these cases, it is easy to see that the entropy condition is verified. Assumption (C.7.) is essential to establish the rates of convergence (consistency) of the estimator defined in (2.5), while assumption (C.8.) adapts condition (C.7.) to the case of the functional conditional $U$-statistics in the $k$-NN setting.
§The concept of metric entropy was introduced by Kolmogorov [131] and studied subsequently for numerous metric spaces. This concept was used by [80] to give sufficient conditions for the continuity of Gaussian processes, and was the basis for striking generalizations of Donsker's theorem on the weak convergence of the empirical process; refer to [191] for the connection of this notion with Le Cam's work.
Remark 2.4. Note that the condition (C.4.2) may be replaced by more general hypotheses upon the moments of the envelope function, as in [70]. That is:
(M.1) We denote by $M(\cdot)$ a nonnegative continuous function, increasing on $[0,\infty)$, and such that, for some $s>2$, ultimately as $x\uparrow\infty$,
(2.16) |
For each , we define by . We assume further that:
The following choices of $M(\cdot)$ are of particular interest:
(ⅰ) $M(x)=x^{p}$ for some $p>2$;
(ⅱ) $M(x)=\exp(sx)$ for some $s>0$.
For simplicity, the condition (C.1.1) on the small ball probability will be replaced by:
(H.1) For and
(3.1) |
this standard condition can be considered an extension of the multivariate case where we assume that the density function of the variable is strictly positive. Also, it is worth mentioning that, if in particular, we denote , we can find two positive constants , such that
(3.2) |
which is similar to condition (C.1) used in [42,181], so we will deal with instead of (whenever we encounter a similar situation in the proofs). This approach is not only for notational purposes but also to make it easier to bridge the UIB and the UINN results.
In this section, we consider the uniform consistency of the functional regression operator in its general form, which is given for all , by
(3.3) |
where depending on and
In fact, the $k$-NN operator presented in (3.3) can be considered as a generalization of the usual kernel regression
(3.4) |
where the bandwidth depends on $n$ (but does not depend on the point $t$).
Recall the bandwidth sequences given in the condition (C.7.). The following theorem will play an instrumental role in the sequel.
Theorem 3.1. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.1), (C.4.3), (C.6.) and (C.7.) (for ), we have, as ,
(3.5) |
¶Let $(z_n)_{n\ge1}$ be a sequence of real r.v.'s. We say that $z_n$ converges almost-completely (a.co.) toward zero if, and only if, for all $\epsilon>0$, $\sum_{n=1}^{\infty}\mathbb{P}(|z_n|>\epsilon)<\infty$.
Moreover, we say that the rate of the almost-complete convergence of $z_n$ toward zero is of order $u_n$ (with $u_n\to0$), and we write $z_n=O_{\text{a.co.}}(u_n)$, if, and only if, there exists $\epsilon_0>0$ such that $\sum_{n=1}^{\infty}\mathbb{P}(|z_n|>\epsilon_0 u_n)<\infty$.
This kind of convergence implies both the almost-sure convergence and the convergence in probability.
The following result gives uniform consistency when the class of functions is unbounded.
Corollary 3.2. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.2), (C.4.3), (C.6.) and (C.7.) (for ), we have
(3.6) |
Now, we can state the main results of this section concerning the $k$-NN functional regression. Recall the bandwidth sequences given in the condition (C.8.).
Theorem 3.3. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.1), (C.4.3), (C.6.) and (C.7.) (for ), if, in addition, condition (C.8.) is satisfied, then, we have
The following result gives uniform consistency when the class of functions is unbounded.
Corollary 3.4. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.2), (C.4.3), (C.6.) and (C.7.) (for ), and if condition (C.8.) is satisfied, then we have
(3.7) |
Recall that the regression operator is usually estimated by minimizing the expected squared loss function. Nonetheless, this loss function, which is regarded as a measure of prediction performance, may be inappropriate in certain circumstances. In fact, the application of least-squares regression translates to giving all variables in the study equal weight. Consequently, the prevalence of outliers can render results irrelevant. In this paper, we therefore circumvent the limitations of classical regression by estimating the operator with regard to the minimization of the mean squared relative error (MSRE):
$$\mathbb{E}\left[\left(\frac{Y-\theta(X)}{Y}\right)^{2}\,\Bigg|\,X\right],\qquad\text{for }Y>0.$$ (3.8)
This criterion is clearly a more meaningful measure of the prediction performance than the least square error, in particular when the range of predicted values is large. Moreover, the solution of (3.8) can be explicitly expressed as the ratio of the first two conditional inverse moments of $Y$ given $X$. In fact, in order to construct the regression estimator allowing the best MSRE prediction, we assume that the first two conditional inverse moments of $Y$ given $X$, that is, $\mathbb{E}(Y^{-\ell}\mid X)$ for $\ell=1,2$, exist and are finite almost-surely (a.s.). Then, one can show easily, cf. [74,123,159], that the best mean squared relative error predictor of $Y$ given $X$ is:
$$\widetilde{r}(X)=\frac{\mathbb{E}(Y^{-1}\mid X)}{\mathbb{E}(Y^{-2}\mid X)}.$$
Thus, we estimate the regression operator $\widetilde{r}$, which minimizes the MSRE, by:
$$\widetilde{r}_n(t;h_n)=\frac{\displaystyle\sum_{i=1}^{n}Y_i^{-1}\,K\Big(\frac{d(t,X_i)}{h_n}\Big)}{\displaystyle\sum_{i=1}^{n}Y_i^{-2}\,K\Big(\frac{d(t,X_i)}{h_n}\Big)}$$ (3.9)
and
$$\widetilde{r}^{\,*}_n(t;k)=\frac{\displaystyle\sum_{i=1}^{n}Y_i^{-1}\,K\Big(\frac{d(t,X_i)}{H_{n,k}(t)}\Big)}{\displaystyle\sum_{i=1}^{n}Y_i^{-2}\,K\Big(\frac{d(t,X_i)}{H_{n,k}(t)}\Big)}.$$ (3.10)
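The ratio form of the MSRE-optimal predictor translates directly into plug-in estimators of the form (3.9)-(3.10). The following sketch (Gaussian weights on a scalar covariate, our illustrative simplification of the functional setting) shows the idea.

```python
import numpy as np

def msre_regression(X, Y, t, h):
    """Plug-in version of E(Y^{-1} | X = t) / E(Y^{-2} | X = t), the predictor
    minimizing the mean squared relative error (requires Y > 0 so that the
    conditional inverse moments exist)."""
    w = np.exp(-0.5 * ((t - X) / h) ** 2)       # illustrative Gaussian weights
    return np.sum(w * Y ** (-1.0)) / np.sum(w * Y ** (-2.0))

rng = np.random.default_rng(4)
X = rng.uniform(1.0, 3.0, size=300)
Y = X ** 2 * np.exp(0.1 * rng.normal(size=300))  # positive response

print(msre_regression(X, Y, t=2.0, h=0.2))       # close to 2^2 = 4
```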
By considering the special cases $\varphi(y)=y^{-1}$ and $\varphi(y)=y^{-2}$ in Corollaries 3.2 and 3.4, we obtain the following results, complementing the work of [22,23,24,74].
Corollary 3.5. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.2), (C.4.3), (C.6.) and (C.7.) (for ), we have
(3.11) |
The following result has not been considered previously in the literature.
Corollary 3.6. Under the assumptions (H.1.), (C.2.1), (C.3.1), (C.4.2), (C.4.3), (C.6.) and (C.7.) (for ), and if condition (C.8.) is satisfied, then we have
(3.12) |
In addition to the conditions imposed before, the following assumptions are essential to obtain exponential inequalities for dependent data later in the proofs:
(A1) Assume is strictly stationary, and there exists an absolute constant such that for any , we have the -mixing coefficient, corresponding to , satisfies .
(A2) Assume, uniformly, for any integer such that and arbitrary , conditional on , the sequence satisfies, for the -mixing coefficient corresponding to it,
where stands for the conditional probability. In particular, we have, for the -mixing coefficient corresponding to itself
The $\beta$-mixing condition (A1) is typically necessary to obtain asymptotic normality for $U$-statistics in the absence of a strict Lipschitz-continuity assumption on the kernel functions; for instance, see [72,203] and Remarks 2.2 and 2.3 of [107]. Assumption (A2) is often less restrictive than the $\beta$-mixing condition. As shown in [107], finite-state and vector-valued absolutely continuous data sequences with exponentially decaying $\beta$-mixing rate satisfy (A2). We note that this is more restrictive than a polynomially decaying mixing rate [203]. This is because we need to calculate higher moments of the $U$-statistics to obtain a sharp concentration inequality. The "exponentially decaying rate" condition is routine in the literature on deriving concentration inequalities for weakly dependent data; see [149] and Remarks 2.4 and 2.5 in [107].
This section considers the uniform consistency, the UIB consistency, and the UINN consistency of the functional conditional $U$-statistic given by (2.1). First, let us introduce some notation. For some interval, we denote
where
In the sequel, we denote (unless stated otherwise)
For all , let us denote
and
For notational convenience, in the case of , we denote and simply and The same goes for other similar notation unless stated otherwise. Set
and for some symmetric measurable function define the -canonical function see [10] and [68] (we replace the index with to avoid confusing it with the smoothing parameter ), by
where for measures on we let
and $\delta_x$ denotes the Dirac measure at the point $x$. This decomposition follows easily by expanding
into terms of the form
It is very simple to check that symmetric is -degenerate of order ‖ if . For example,
‖
Definition 3.7. A $P^m$-integrable symmetric function of $m$ variables, $h:S^m\to\mathbb{R}$, is $P$-degenerate of order $r-1$, $1<r\le m$, if
$$\int h(x_1,\ldots,x_m)\,\mathrm{d}P^{m-r+1}(x_r,\ldots,x_m)=\int h\,\mathrm{d}P^{m}\qquad\text{for all }x_1,\ldots,x_{r-1}\in S,$$
whereas
$$\int h(x_1,\ldots,x_m)\,\mathrm{d}P^{m-r}(x_{r+1},\ldots,x_m)$$
is not a constant function. If $h$ is $P$-centered and is $P$-degenerate of order $m-1$, that is, if
$$\int h(x_1,\ldots,x_m)\,\mathrm{d}P(x_1)=0\qquad\text{for all }x_2,\ldots,x_m\in S,$$
then $h$ is said to be canonical or completely degenerate with respect to $P$. If $h$ is not degenerate of any positive order, we say it is nondegenerate or degenerate of order zero.
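For $m=2$ the Hoeffding decomposition reads $h(x_1,x_2)=\theta+h_1(x_1)+h_1(x_2)+h_2(x_1,x_2)$, where $h_1(x)=\mathbb{E}[h(x,X_2)]-\theta$ is the first projection and $h_2$ is canonical. The following Monte Carlo sketch (Gini kernel under a standard normal law; sample sizes are arbitrary) checks the degeneracy numerically.

```python
import numpy as np

rng = np.random.default_rng(5)
h = lambda x, y: np.abs(x - y)      # Gini kernel, m = 2, P = N(0, 1)

Z = rng.normal(size=2000)           # sample used to build the projections
theta = h(Z[:, None], Z[None, :]).mean()     # ~ E h(X1, X2)

def h1(x):
    """First Hoeffding projection h1(x) = E[h(x, X2)] - theta (Monte Carlo)."""
    return h(np.asarray(x)[..., None], Z).mean(-1) - theta

def h2(x, y):
    """Canonical part h2 = h - theta - h1(x) - h1(y); completely degenerate."""
    return h(x, y) - theta - h1(x) - h1(y)

# Degeneracy on a fresh sample W ~ P: both averages should be close to zero.
W = rng.normal(size=2000)
x0 = 1.3
print(h1(W).mean(), h2(x0, W).mean())
```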
It's clear that, for all
and is a classical $U$-statistic with the corresponding kernel. However, the study of the uniform consistency of to cannot be done with a straightforward approach, due to the randomness of the bandwidth vector, which poses some technical problems. To circumvent this, our strategy is first to study the uniform consistency of , where is a multivariate bandwidth that does not depend on and . Hence, we study the uniform consistency and the UIB consistency of to when and when , and we shall consider a centering factor more appropriate than the expectation ; hence we define:
(3.13) |
The second step will be the use of a general lemma given in [44], adapted to our setting and similar to that of [134] (see Subsection 8.1.1), to derive the results for the bandwidth
Next, we will give the UIB results for all and . We start by announcing the result concerning the uniform deviation of the estimate with respect to when the class of functions is bounded.
Theorem 3.8. Suppose that the conditions (H.1.), (C.3.1), (C.4.1), (C.4.4), (C.6.) and (C.7.) are fulfilled, we infer that, as ,
(3.14) |
The following result covers the uniform deviation of the estimate with respect to when the class of functions is unbounded, satisfying a general moment condition.
Theorem 3.9. Suppose that the conditions (H.1.), (C.3.1), (C.4.2), (C.4.4), (C.6.) and (C.7.) are fulfilled. For all , we infer that, as ,
(3.15) |
The following result handles the uniform deviation of the estimate with respect to in both situations, where the class of functions is bounded or unbounded satisfying a general moment condition.
Theorem 3.10. Suppose that the conditions (H.1.), (C.3.1), (C.4.1), (C.4.4), (C.6.) and (C.7.) (or the following (H.1.), (C.3.1), (C.4.2), (C.4.4), (C.6.) and (C.7.)) are fulfilled. For all , we infer, as ,
(3.16) |
Theorem 3.11. Suppose that the conditions (H.1.), (C.2.1), (C.3.1) and (C.6.) are fulfilled. For all we infer, as ,
(3.17) |
Corollary 3.12. Under the assumptions of Theorems 3.10 and 3.11 it follows that, as ,
(3.18) |
Remark 3.13. As in [9,27,43], we will divide the $U$-statistics into different parts: some parts can be approximated by $U$-statistics of independent blocks, while, by conditioning on one block, others are empirical processes of independent blocks. To prove that the nonlinear terms are negligible, we will also need some symmetrization and maximal inequalities; we refer to [10,67].
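The blocking device of Remark 3.13 is the classical Bernstein scheme: indices are grouped into alternating big and small blocks; after a coupling step, the big blocks behave as if independent, while the small blocks are asymptotically negligible. A schematic sketch (the block lengths below are illustrative, not the ones used in the proofs):

```python
def bernstein_blocks(n, a_n, b_n):
    """Partition {0, ..., n-1} into alternating big blocks (length a_n) and
    small blocks (length b_n). Big blocks are ~independent after coupling
    (their separation b_n tames the beta-mixing dependence); small blocks
    are asymptotically negligible."""
    big, small, i = [], [], 0
    while i < n:
        big.append(list(range(i, min(i + a_n, n))))
        i += a_n
        small.append(list(range(i, min(i + b_n, n))))
        i += b_n
    return big, [s for s in small if s]

big, small = bernstein_blocks(n=20, a_n=4, b_n=2)
print(big)    # [[0..3], [6..9], [12..15], [18, 19]]
print(small)  # [[4, 5], [10, 11], [16, 17]]
```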
Let be some constants and is a sequence chosen in such a way that
and
The following result deals with the uniform deviation of the estimate with respect to when the class of functions is bounded.
Theorem 3.14. Suppose that the conditions (H.1.), (C.3.1), (C.4.1), (C.4.4), (C.6.) and (C.7.) are fulfilled. If in addition assumption (C.8.) holds, we infer that, as ,
(3.19) |
The following result deals with the uniform deviation of the estimate with respect to when the class of functions is unbounded, satisfying a general moment condition.
Theorem 3.15. Suppose that the conditions (H.1.), (C.3.1), (C.4.2), (C.4.4), (C.6.) and (C.7.) are fulfilled. If in addition assumption (C.8.) holds, we infer that, as ,
(3.20) |
The next results give uniform consistency when the class of functions is bounded or unbounded.
Theorem 3.16. Suppose that the conditions (H.1.), (C.3.1), (C.4.1), (C.4.4), (C.6.) and (C.7.) (or the following (H.1.), (C.3.1), (C.4.2), (C.4.4), (C.6.) and (C.7.)) are fulfilled. If in addition assumption (C.8.) holds, we infer that, as ,
(3.21) |
Theorem 3.17. Suppose that the conditions (H.1.), (C.2.1), (C.3.1) and (C.6.) are fulfilled. If in addition assumption (C.8.) holds, we infer that, as ,
(3.22) |
Corollary 3.18. Under the assumptions of Theorems 3.16 and 3.17 it follows that, as ,
(3.23) |
Remark 3.19. The choice of the parameters defined in a similar way as in condition (C.8.) affects the rate of convergence of the $k$-NN estimator. We can choose these parameters depending on the small-ball probability function.
Remark 3.20. The present work largely extends and completes the work of [42,44] in several ways. There are basically no restrictions on the choice of the kernel function in our setup, apart from the mild conditions stated above. The selection of the bandwidth or the number of neighbors, however, is more problematic. It is worth noticing that the choice of bandwidth is crucial to obtain a good rate of consistency; for example, it has a big influence on the size of the estimate's bias. In general, we are interested in a selection of bandwidth and neighbors that produces an estimator with a good balance between the bias and the variance of the considered estimators. It is then more appropriate to consider the bandwidth and neighbors varying according to the criteria applied and to the available data and location, which cannot be achieved by using the classical methods. The interested reader may refer to [34,146] for more details and discussion on the subject. In the present section, we have provided a response to this delicate problem in the FDA associated with the dependent data setting. In the present setting, we divide the $U$-statistics into different parts: some parts can be approximated by $U$-statistics of independent blocks, while, by conditioning on one block, others are empirical processes of independent blocks. This decomposition is the key tool, but it makes the proof very involved, which is the price of the extension to the dependent framework. This allows us to use, in a nontrivial way, the techniques used for independent variables, and mostly we will be using the results of [44]. We highlight that in the present paper, we have used a novel exponential inequality of [107] tailored to the dependent framework.
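A common data-driven way to select the number of neighbors, in the spirit of the local adaptivity discussed in Remark 3.20, is leave-one-out cross-validation; this sketch for the $m=1$ scalar case is our illustration, not the procedure discussed in Section 6.

```python
import numpy as np

def loo_cv_k(X, Y, dist, k_grid):
    """Leave-one-out CV for the number of neighbors in m = 1 kNN regression
    (uniform kernel): score(k) = mean squared error of predicting Y_i from
    the k nearest neighbors of X_i among the remaining points."""
    n = len(X)
    D = np.array([[dist(X[i], X[j]) for j in range(n)] for i in range(n)])
    scores = {}
    for k in k_grid:
        err = 0.0
        for i in range(n):
            others = np.delete(np.arange(n), i)         # leave point i out
            order = np.argsort(np.delete(D[i], i))      # sort remaining distances
            err += (Y[i] - Y[others[order[:k]]].mean()) ** 2
        scores[k] = err / n
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=100)
Y = np.sin(3 * X) + 0.2 * rng.normal(size=100)
best_k, _ = loo_cv_k(X, Y, dist=lambda a, b: abs(a - b), k_grid=range(2, 30))
print(best_k)
```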
We define the functional conditional empirical process for univariate bandwidth by:
(4.1) |
where designates (2.5) when , and refers to the regression function (2.2), with
If, for , where is the probability measure and, for each ,
then is a random element with values in consisting of all functional on such that
Then, it will be important to investigate the following weak convergence
It is known that the weak convergence to a Gaussian limit with a version of uniformly bounded and uniformly continuous paths (with respect to the norm) is equivalent to the finite-dimensional convergence and the existence of a pseudo-metric on such that it is a totally bounded pseudo-metric space and
(4.2) |
Below, we write whenever the random vector follows a normal law with expectation vector and covariance matrix; denotes convergence in distribution. The following theorem is adapted from [147] to the setting of the $k$-NN estimators. The main objective of this section is to investigate the central limit theorems for the functional conditional empirical process defined by
(4.3) |
Theorem 4.1. Consider the class of functions, and suppose that conditions (C.1'.), (C.1.2), (C.2.1), (C.2.2), (C.3'.), (C.5.) and (C.8.) hold. If the smoothing parameter satisfies for all
we get for:
where is the covariance matrix with:
Theorem 4.2. Suppose that the conditions (C.3.1), (C.4.2)–(C.5.1), and (C.8.) hold and for each
Then, we have
The two previous theorems can be summarized as follows:
Theorem 4.3. Under conditions (C.1'.), (C.1.2), (C.2.), (C.3.1), (C.3'.), (C.4.4), (C.5.1), (C.5.2) and (C.8.), the process, as ,
converges in law to a Gaussian process that admits a version with uniformly bounded and uniformly continuous paths with respect to -norm.
Remark 4.4. We mention that other types of applications can be obtained from Theorem 4.3 including the conditional distribution, conditional density, and conditional hazard function. This, and other applications of interest, will not be considered here due to lack of space.
In this section, we are interested in studying the weak convergence of conditional -processes under absolutely regular observations. Recall that the class of functions considered is given in Section 2. The conditional -process indexed by is:
(4.4) |
The -empirical process is defined by
It should be noted that, to establish the weak convergence of (4.4), it is first necessary to go through that of (4.6) below. Indeed, we will develop some details that will be used later. Because condition (C.6.) is satisfied, for each , we have
(4.5) |
We can write the -statistic as follows
(4.6) |
We call the first term on the right-hand side of (4.6) the truncated part and the second the remainder part. First, we are interested in . An application of Hoeffding's decomposition gives
(4.7) |
where is a sequence of i.i.d. r.v.'s with for each , and and are respectively defined as and . In view of (4.7), we have
The stationarity assumption and some algebra show that
Therefore,
(4.8) |
By the fact that is -canonical, we have to show that
Thus, to establish the weak convergence of the -process , it is enough to show
where is a Gaussian process indexed by , and for
We have to prove after, that the remaining part is negligible, in the sense that
Nevertheless, when we deal with finite-dimensional convergence, the truncation does not matter, which means that establishing the finite-dimensional convergence of is equivalent to establishing that of
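For orientation, recall the classical Hoeffding decomposition of a degree-2 U-statistic in its unconditional form (the conditional, kernel-weighted version used above follows the same pattern):

$$U_n(h)=\theta+\frac{2}{n}\sum_{i=1}^{n}h_1(X_i)+\frac{2}{n(n-1)}\sum_{1\le i<j\le n}h_2(X_i,X_j),$$

where $\theta=\mathbb{E}\,h(X_1,X_2)$, $h_1(x)=\mathbb{E}\,h(x,X_2)-\theta$ and $h_2(x,y)=h(x,y)-h_1(x)-h_1(y)-\theta$. The linear term drives the Gaussian limit, while $h_2$ is canonical, $\mathbb{E}[h_2(x,X_2)]=0$ for every $x$, which is what makes the nonlinear part negligible.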
Theorem 4.5. (a) Under conditions (C.1'.), (C.1.2), (C.2.), (C.3'.), (C.5.1), (C.5.2) and if is continuous at then, as ,
(4.9) |
where
(4.10) |
(b) If, in addition, the smoothing parameter satisfies the condition (C.8.), then we have, as ,
(4.11) |
where is defined as in (4.10), with
Corollary 4.6. Under conditions (C.1'.), (C.1.2), (C.2.), (C.3'.), (C.5.) and if as , then we infer that:
(4.12) |
Theorem 4.7. Assume conditions (C.1'.), (C.1.2), (C.2.), (C.3'.), (C.5.1), (C.5.2), (C.8.) and as . Let be a measurable VC-subgraph class of functions from such that condition (C.4.2) is satisfied and, if the -coefficients of the mixing stationary sequence fulfill:
(4.13) |
for some , then converges in law to a Gaussian process which has a version with uniformly bounded and uniformly continuous paths with respect to norm.
Remark 4.8. It is worth noting the price to pay for the nice features of the -NN-based estimators: remembering that is a random variable (which depends on ), one should expect additional technical difficulties to appear along the proofs of the asymptotic properties. To fix ideas on this point, note that the random elements involved in (2.1), , cannot be decomposed as sums of independent variables (as is the case, for instance, with kernel-based estimators), and hence their treatment requires more sophisticated probabilistic developments than standard limit theorems for sums of i.i.d. variables. Also, the Hoeffding decomposition, which is the main tool for the study of -statistics, cannot be applied directly to (2.1). The first step in proving Theorem 4.7 is the extension of [27,43] to the multivariate bandwidth problem. In addition, we have considered new applications: the set-indexed conditional U-statistic, Kendall's rank correlation coefficient, and time series prediction from a continuous set of past values. Another delicate problem lies in the fact that some maximal inequalities and symmetrization techniques of [10,67] are not applicable directly in our framework, making the proof quite lengthy, in particular the equicontinuity of the empirical processes.
Remark 4.9. It is straightforward to modify the proofs of our results to show that they remain true when the entropy condition is substituted by the bracketing condition: For some and ,
Refer to p. 270 of [190] for the definition of .
Although only four examples will be given here, they stand as archetypes for a variety of problems that can be investigated in a similar way.
We aim to study the links between and by estimating functional operators associated with the conditional distribution of given , such as the regression operator, for in a class of sets ,
We define metric entropy with the inclusion of the class of sets . For each , the covering number is defined as :
the quantity is called the metric entropy with inclusion of with respect to . Estimates for such covering numbers are known for many classes (see, e.g., [81]). We will often assume below that either or behaves like a power of : we say that the condition () holds if
(5.1) |
where
for some constants . As in [164], it is worth noticing that condition (5.1), , holds for intervals, rectangles, balls, ellipsoids, and for classes constructed from these by performing the set operations of union, intersection, and complement finitely many times. The classes of convex sets in () fulfill condition (5.1), . This and other classes of sets satisfying (5.1) with can be found in [81]. As a particular case of (2.5), we estimate
(5.2) |
One can apply Corollary 3.18 to infer that
(5.3) |
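For the reader's convenience, we recall a standard formulation of the covering number with inclusion referred to in condition (5.1) (stated here in the form we understand it, in the spirit of [81]):

$$N_I(\varepsilon,\mathcal{C},G)=\min\Big\{n\in\mathbb{N}:\ \exists\,C_1^{-},C_1^{+},\dots,C_n^{-},C_n^{+}\ \text{measurable such that }\forall C\in\mathcal{C}\ \exists i\ \text{with}\ C_i^{-}\subseteq C\subseteq C_i^{+}\ \text{and}\ G\big(C_i^{+}\setminus C_i^{-}\big)\le\varepsilon\Big\}.$$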
Remark 5.1. Another point of view is to consider the following situation, for a compact ,
Let be a distribution in and let be the number of neighbors associated with . One can estimate by
One can use Corollary 3.18 to infer that, as ,
(5.4) |
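As a computational illustration of the set-indexed estimators of this subsection, the following sketch estimates the conditional distribution over a grid, taking the class of sets to be half-lines purely for illustration (data layout, kernel, and semi-metric are again assumptions, not prescriptions):

```python
import numpy as np

def knn_set_indexed(X, Y, t, k, s_grid, metric):
    """k-NN estimate of s -> P(Y <= s | X = t), a set-indexed special case,
    with the class of sets taken, for illustration only, as half-lines (-inf, s]."""
    d = np.array([metric(x, t) for x in X])
    h = np.sort(d)[k - 1]                    # k-NN radius at t
    w = np.maximum(1.0 - d / h, 0.0)         # triangular kernel weights
    w = w / w.sum()                          # normalized k-NN weights
    return np.array([(w * (Y <= s)).sum() for s in s_grid])
```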
To test the independence of one-dimensional random variables and , [127] proposed a method based on the -statistic with the kernel function :
(5.5) |
Its rejection region is of the form ; for more general tests, refer to [29,32]. In this example, we consider a multivariate case. To test the conditional independence of given , we propose a method based on the conditional U-statistic :
where and is Kendall's kernel (5.5). Suppose that and are - and -dimensional random vectors, respectively, and . Furthermore, suppose that are observations of . We are interested in testing:
(5.6) |
Let be such that and , and let and be the distribution functions of and , respectively. Suppose and to be continuous for any unit vector , where and , and denotes the transpose of the vector . For , let and be such that and for , and :
An application of Corollary 3.18 gives, as ,
(5.7) |
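A minimal sketch of the proposed statistic, assuming the usual bivariate Kendall kernel sign(s₁ − s₂)·sign(t₁ − t₂) and k-NN product weights at the conditioning point (function names and data layout are hypothetical):

```python
import numpy as np

def conditional_kendall_tau(X, Y1, Y2, x, k, metric):
    """Conditional U-statistic with Kendall's kernel: estimates the conditional
    Kendall tau of (Y1, Y2) given X = x using k-NN kernel weights. Under the
    conditional independence hypothesis the statistic should be close to 0."""
    d = np.array([metric(xi, x) for xi in X])
    h = np.sort(d)[k - 1]                    # k-NN radius at the conditioning point
    w = np.maximum(1.0 - d / h, 0.0)         # k-NN kernel weights
    n = len(X)
    num, den = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            kern = np.sign(Y1[i] - Y1[j]) * np.sign(Y2[i] - Y2[j])  # Kendall kernel
            num += w[i] * w[j] * kern
            den += w[i] * w[j]
    return num / den
```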
Now, we apply the results to the problem of discrimination described in Section 3 of [185]; refer also to [184]. We will use similar notation and setting. Let be any function taking at most finitely many values, say . The sets
then yield a partition of the feature space. Predicting the value of is tantamount to predicting the set in the partition to which belongs. For any discrimination rule , we have
where
The above inequality becomes equality if
is called the Bayes rule, and the pertaining probability of error
is called the Bayes risk. Each of the above unknown functions can be consistently estimated by one of the methods discussed in the preceding sections. Let, for ,
(5.8) |
Set
Let us introduce
The discrimination rule is asymptotically Bayes' risk consistent, as ,
This follows from Corollary 3.18 and the obvious relation
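A sketch of the resulting plug-in rule: estimate each posterior probability with the k-NN smoother and classify a new curve to the class with the largest estimate (an illustrative implementation, not the paper's prescribed one):

```python
import numpy as np

def knn_bayes_rule(X, labels, t, k, metric, classes):
    """Plug-in discrimination rule: pick the class whose estimated posterior
    probability m_j(t) = P(Y = j | X = t) is largest (ties broken arbitrarily)."""
    d = np.array([metric(x, t) for x in X])
    h = np.sort(d)[k - 1]
    w = np.maximum(1.0 - d / h, 0.0)
    w = w / w.sum()
    posteriors = {j: (w * (labels == j)).sum() for j in classes}  # m_j estimates
    return max(posteriors, key=posteriors.get)
```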
Let denote a sequence of processes with values in . Let denote a fixed positive real number. In this model, we suppose that the process is observed from until , and assume without loss of generality that . The method consists of splitting the observed process into fixed-length segments. Let us denote each piece of the process by
The response value is therefore , and this can be formulated as a regression problem:
(5.9) |
provided we assume that a function of this kind, , does not depend on (which is satisfied if the process is stationary, for example). Consequently, at time , we can use the following predictor, directly derived from our estimator, to predict the value of the process at time
where , for . Corollary 3.12 provides mathematical support for this nonparametric functional predictor and extends previous results of [56,91] in numerous ways. Notice that this modelization encompasses a wide variety of practical applications, as this procedure allows for the consideration of a large number of past process values without being affected by the curse of dimensionality. We believe that our findings will find applications beyond the scope of this work, in particular because many popular measures of dependence, such as the distance covariance and the Hilbert-Schmidt independence criterion, can be estimated using -statistics.
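The prediction scheme just described can be sketched as follows; the segment length, the response choice (the endpoint of the following segment), the semi-metric, and the kernel are all illustrative assumptions:

```python
import numpy as np

def predict_next(Z, seg_len, k):
    """Segment the observed path Z into length-seg_len pieces and predict the
    value one segment ahead via k-NN functional regression on past segments."""
    n_seg = len(Z) // seg_len
    segs = np.array([Z[i * seg_len:(i + 1) * seg_len] for i in range(n_seg)])
    X, Y = segs[:-1], segs[1:, -1]            # covariate: a segment; response: next segment's endpoint
    t = Z[-seg_len:]                          # the most recently observed segment
    d = np.sqrt(np.mean((X - t) ** 2, axis=1))  # L2 semi-metric on segments
    h = np.sort(d)[k - 1]                     # k-NN radius
    w = np.maximum(1.0 - d / h, 0.0)          # triangular kernel weights
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(1)
Z = np.cumsum(rng.standard_normal(1000))      # toy discretized path
print(predict_next(Z, seg_len=20, k=10))
```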
In the next section, we provide more details about how some of the methodologies of a number of neighbor choices in the literature can be combined with our results.
Many methods have been established and developed to construct, in asymptotically optimal ways, bandwidth selection rules for nonparametric kernel estimators, especially for the Nadaraya-Watson regression estimator; we quote among them [15,44,45,50,105,109,166]. This parameter has to be selected suitably, either in the standard finite-dimensional case or in the infinite-dimensional framework, to ensure good practical performance. However, to our knowledge, no such studies presently exist for treating such a general functional conditional -statistic (except that, in the real-valued case, the paper of [79] contains a paragraph devoted to the selection of the number ). Nevertheless, an extension of the leave-one-out cross-validation procedure allows us to define, for any fixed :
(6.1) |
where
Equation (6.1) represents the leave-out- estimator of the functional regression and can also be considered a predictor of . In order to minimize the quadratic loss, we introduce the following criterion: for some (known) non-negative weight function ,
(6.2) |
where
Following the ideas developed by [166], a natural way of choosing the bandwidth is to minimize the preceding criterion; thus, let us choose minimizing over :
we can conclude, by Corollary 3.18, that, as ,
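A sketch of this selection rule, with a triangular kernel and an optional weight function standing in for the (known) non-negative weight of the criterion (6.2):

```python
import numpy as np

def choose_k_cv(X, Y, k_grid, metric, weight=None):
    """Leave-one-out cross-validation for the number of neighbors k: for each
    candidate k (k <= n - 1), compute the weighted quadratic criterion and
    return the minimizer."""
    n = len(X)
    W = np.ones(n) if weight is None else np.array([weight(x) for x in X])
    D = np.array([[metric(X[i], X[j]) for j in range(n)] for i in range(n)])
    best_k, best_cv = None, np.inf
    for k in k_grid:
        err = 0.0
        for i in range(n):
            d = np.delete(D[i], i)            # leave observation i out
            y = np.delete(Y, i)
            h = np.sort(d)[k - 1]             # k-NN radius without observation i
            w = np.maximum(1.0 - d / h, 0.0)
            err += W[i] * (Y[i] - np.sum(w * y) / np.sum(w)) ** 2
        if err < best_cv:
            best_k, best_cv = k, err
    return best_k
```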
The main interest of our results is the possibility of deriving asymptotics for automatic data-driven parameters. Let be a density function in and let be the number of neighbors associated with . One can estimate the conditional density by
Hence, the leave-one-out estimator is given by
While the cross-validation procedures described above aim to approximate quadratic errors of estimation, alternative ways of choosing the smoothing parameters could be introduced, aiming instead to optimize the predictive power of the method. The criterion is given by
In this paper, we consider the -NN kernel-type estimator for conditional -statistics, with the Nadaraya-Watson estimator as a special case, in a functional setting with absolutely regular data. To obtain our results, we need some regularity on the conditional -statistics and conditional moments, decay rates on the probability of variables belonging to shrinking open balls, and convenient decrease rates on the mixing coefficients. In particular, the conditional moment assumption allows unbounded classes of functions to be considered. The proof of weak convergence adheres to a standard method: finite-dimensional convergence and the intricate equicontinuity of the conditional -processes. Approximating independence with a block decomposition technique, and then proving a central limit theorem for independent variables, leads to finite-dimensional convergence. The equicontinuity requires more intricate control, and the details, which are lengthy due to the general and complex framework we have considered, are presented in the following section. Observe that mixing is a type of asymptotic independence assumption that is commonly used for simplicity, but it can be implausible when there is strong dependence between the data. In [69] it is argued that -mixing is the weakest mixing assumption that allows for a "complete" empirical process theory incorporating maximal inequalities and uniform central limit theorems. There exist explicit upper bounds for -mixing coefficients for Markov chains (cf. [113]) and for so-called -geometric mixing coefficients (cf. [169]), as well as for several stationary time series models such as linear processes (cf. [202] for -mixing), ARMA models (cf. [189]), and nonlinear AR models (cf. [126]). A common assumption in these results is that the observed process or, more often, the innovations of the corresponding process, have a continuous distribution. This is a crucial assumption for handling the relatively complicated mixing coefficients, which are defined through a supremum over two different sigma-algebras. A relaxation of the -mixing coefficients was investigated by ([103], Theorem 1) and is specifically designed for the analysis of the EDF; for more details, refer to [162]. The application of nonparametric functional concepts to general dependence structures is a relatively underdeveloped field. Notably, the ergodic framework eschews the commonly employed strong mixing condition and its variants for measuring dependence, as well as the extremely involved probabilistic calculations that this condition necessitates. It would be interesting to extend our work to the case of functional ergodic data, but this would require nontrivial mathematics and is well outside the scope of this paper. The primary obstacle lies in the necessity of formulating novel probabilistic results, as the ones employed in our current work, as demonstrated in [8], are tailored specifically to -mixing samples. Another direction is to reduce the predictor's dimensionality by employing a Single Functional Index Model (SFIM) to estimate the regression [53]; SFIM has demonstrated its effectiveness in enhancing the consistency of the regression operator estimator. Change-point detection is widely employed to pinpoint positions within a data sequence where a stochastic system undergoes abrupt external influences. This method finds application across various scientific disciplines. The identification of these changes is crucial for exploring their diverse causes and enables appropriate responses.
The challenge of detecting disruptions in a sequence of random variables has a rich historical background; refer to [26,39,41]. It would be of interest to find applications of our results in this direction.
This section is devoted to the proofs of our results. The aforementioned notation is also used in what follows. The proofs of the theorems are quite involved and will be decomposed into several lemmas proved in Section A.
We present Lemma 8.1 in a general setting (see, for instance, [44]); it could be useful in many situations other than ours and generalizes a result obtained in [58]. More generally, this technical tool could be useful for dealing with random bandwidths.
Let be random vectors valued in , a general space. Let be a fixed subset of and we note that a function such that, , is measurable and
We define the pointwise measurable class of functions, for :
Let be a sequence of random real vectors (r.r.v.) in such a way that for all , , and be a measurable function belonging to some class of functions and let be a nonrandom function such that,
Now, for all , and we define
where , and .
Lemma 8.1. Let be a decreasing positive sequence such that . If, for all increasing sequence with , there exist two sequences of r.r.v. and such that
Then, as ,
(8.1) |
We refer to [44] for proof of this lemma.
Proof of Theorem 3.1
In order to establish the convergence rates, the following notation is necessary. For all , set
This allows us to write
Now, let us consider the following decomposition
Therefore, the proof of (8.2) is based on the following lemmas.
Lemma 8.2. Under assumptions (C.1.1), (C.3.1), (C.4.1), (C.5.1), (C.5.2'), (C.6.) and (C.7.), we have, as ,
(8.2) |
and
(8.3) |
This lemma gives us the rates of consistency of the stochastic part when the class of functions is bounded. The following lemma will give us the result when the class of functions is unbounded.
Lemma 8.3. Under assumptions the (C.1.1), (C.3.1), (C.4.2), (C.5.1), (C.5.2'), (C.6.) and (C.7.), we have, as ,
(8.4) |
Finally, we only need the following result for the bias term. This lemma can be obtained in a similar way as in [147], where more details are given.
Lemma 8.4. Under the condition (C.2.1), we have, as ,
(8.5) |
Proof of Theorem 3.3
Similarly to [44], to prove Theorem 3.3 we need to check the conditions of Lemma 8.1 in the case of . For that, we first identify the variables as follows: , ,
Choosing and such that
(8.6) |
(8.7) |
We denote , and
for all increasing sequence such that Note that for all , and we have
Using condition (2.12), we can easily deduce that the bandwidths and both belong to the interval
Checking the conditions () and ()
Let us start with checking (). The fact that is bounded by and the local bandwidth satisfies the conditions of Theorem 3.1 gives
which is equivalent to
We use the same reasoning to check () and we readily obtain
Hence, () and () are checked.
Checking the condition ()
To check () we show that for all and ,
(8.8) |
Let be fixed. Let be an -net for for all , we have
Now, we use a lemma similar to that of [124] (see Lemma B.5), which is stated in the Appendix for completeness. Making use of Lemma B.5, we infer that
(8.9) |
This implies that
In a similar way, we obtain
(8.10) |
It follows that
Therefore, by the fact that , , we obtain
(8.11) |
(8.12) |
Checking the condition ()
We consider the following quantities:
The condition () can be written as
Hence, by the fact that , our claimed result is
(8.13) |
The proof of (8.13) is based on the following results
(8.14) |
(8.15) |
(8.16) |
Proof of (8.14)
Using the condition (C.3.1) one has
Now using the condition (C.1.1) we directly obtain
(8.17) |
Proof of (8.15)
We have
(8.18) |
To prove this, we use Lemma 8.2, which gives
(8.19) |
and
(8.20) |
Moreover, combining (8.19), (8.20) with the fact that
(8.21) |
it follows that
(8.22) |
Proof of (8.16)
In the corresponding part of their proof, [134] use Lemma 1 of [89]; we, on the other hand, will use computations similar to the steps of the proof of Lemma 5.7.0.3 in [42]. Let us consider the following quantity:
using the fact that , and supposing that the condition holds, which means
and assuming that the conditions and to be satisfied, then for all , and in , one gets
Keeping in mind the condition and the fact that , we obtain
(8.23) |
Finally, rewriting (8.23) with gives us
which is equivalent to
(8.24) |
Combining the results of (8.17), (8.22) and (8.24) and the fact that , implies that
Hence, () is checked. Note that () is obviously satisfied by (C.3.1), and that () is also trivially satisfied by construction of and . So one can apply Lemma 8.1, and (8.1) with is exactly the result of Theorem 3.3.
Preliminaries of the proofs
This part is mainly dedicated to the study of the functional conditional -statistics. Just as in the case of , is covered by
for some radius . Hence, for each there exists , where , such that
So for each the closest center is and the ball with the closest center
The proofs of the UIB consistency for the multivariate bandwidth follow the same lines as those for the univariate smoothing parameter in [42,44]. Furthermore, as in the proof of Theorem 3.1, we divide the sequence into alternating blocks; here the sizes are different, satisfying
(8.25) |
and set, for
Proof of Theorem 3.8
In this section, we consider a bandwidth . To prove Theorem 3.8, we can write the -statistic for each as follows
(8.26) |
(8.27) |
Let us begin with the term , we have
By applying a telescoping decomposition to the binomial products, we get
(8.28) |
(8.29) |
From condition (C.4.1), we could claim that
similarly, we have
So, (8.28) satisfies :
where
Therefore, we infer that
(8.30) |
(8.31) |
The transition from Eq (8.30) to (8.31) is done thanks to the fact that the kernel function is Lipschitz. Uniformly on and , we get
by (3.2), where and , with component by component. The idea is to apply Lemma B.6 on the function
which satisfies for all :
Notice that the existence of the constant on the right-hand side of the preceding inequality is deduced from condition (2.11). Now, we can apply Lemma B.6 with , which gives us
(8.32) |
(8.33) |
such that . By developing the computation while respecting the imposed conditions, mainly (C.6.) and (C.7.), we get
(8.34) |
The study of the term is deduced from the previous one. In fact:
(8.35) |
(8.36) |
To pass from (8.35) to (8.36), we apply Jensen's inequality in connection with some properties of the absolute value function. Then, following the same approach as before, we get
where
Notice that we have
That implies
This gives that
(8.37) |
We continue now with .
Supposing that the kernel function is symmetric, we decompose our -statistic according to the decomposition of [115]; we have
(8.38) |
Define new classes of functions, for , , and
These classes are VC-type classes of functions with the same characteristics and the envelope function satisfying
Let us start with the linear term of (8.38), which is
From Hoeffding's projection, we have
One can see that
is an empirical process based on a VC-type class of functions contained in with the same characteristics and the elements are defined by:
Hence, the proof of this part is similar to that of the Lemma 8.2 and then:
Pass now to the nonlinear terms. The purpose is to prove that, for :
(8.39) |
to do that, we need to decompose the interval into smaller intervals. First, let us consider the intervals for all , we note
where and and we set and We can observe that
and
(8.40) |
Now, we set the following new classes, for and , , and
Thus, to prove (8.39), we need to prove that for and :
Notice that for each and , we have
At this stage, we will focus on studying the above equation for to simplify the proof (the same steps remain valid for ). We have
and
(8.41) |
For , the formula becomes cumbersome and is given by
Let us start by considering the term . Suppose that the sequence of independent blocks is of size . An application of (A.1), shows that
We keep the choice of and such that , which implies that as , so the term to consider is the second summand. The key idea is to apply Lemma B.4. We see clearly that the class of functions is uniformly bounded, i.e.,
Moreover, by applying Proposition 2.6 of [10] we have for each and Rademacher variables :
(8.42) |
where
and
We see that
So, using the fact that is a VC-type class of functions satisfying (C.4.4) which implies that the class is also a VC-type class of functions with the same characteristics as , then,
(8.43) |
All the conditions of Lemma B.4 are fulfilled, so a direct application gives for each ,
(8.44) |
where
Next, let us study the same blocks , we have
Following the same argument as the blocks , we obtain
(8.45) |
where
and
Again, using Lemma B.4, we readily obtain
(8.46) |
The results for the remaining blocks can be obtained by following the same strategy as above. Consequently, we have
where
This completes the proof of the theorem.
Proof of Theorem 3.10:
Notice that
Under the imposed hypothesis and the previously obtained results, and for some , we get that
Therefore, we can now apply Theorem 3.8 to handle , and Theorems 3.8 and 3.9 to handle , depending on whether the class satisfies (C.4.1) or (C.4.2). We get, for some , with probability 1:
Hence, the proof is complete.
Proof of Theorem 3.11:
Under the conditions (C.3.1), we have
Taking into consideration the hypotheses (H.1), (C.2.1), (C.3.1) and (C.6.), we get and
where
This completes the proof of the theorem.
Proof of Corollary 3.18:
In this section, we will prove the corollary using Lemma 8.1. Following the same reasoning as in the case of the functional regression, we use the notation: , ,
Choosing and such that for all
where are increasing sequences that belong to , and
We denote where
We can easily see that, for all :
(8.47) |
(8.48) |
Using the condition (2.12) one gets, for all there exist constants , such that
we put , and thus and belong to the interval
We denote and , therefore,
We also note for all :
and
Finally, we can choose constants and a sequence , while respecting the condition (C.8.), in a way that makes and
It is clear that is satisfied due to the condition (C.3.1), and from (8.47) and (8.48), we can easily verify that the construction of and satisfies the condition
Checking the conditions () and ()
A direct application of Corollary 3.12 gives
which is equivalent to
(8.49) |
Applying the same reasoning with , we obtain
(8.50) |
Thus, the conditions () and () are checked.
Checking the condition ()
To check () we show that for all and
(8.51) |
We have
now, using
(8.52) |
and
(8.53) |
Consequently, we obtain
(8.54) |
and
(8.55) |
Thus is checked.
Checking the condition ()
Notice that
(8.56) |
The study of (8.56) is similar to the proofs of Theorems 3.10 and 3.11, as we can clearly see that
(8.57) |
(8.58) |
Let us start with (8.57), we have
Applying the same calculation as in the proof of Theorem 3.10, it follows that:
Now, for (8.58), using the fact that
therefore,
following the same steps as in Theorem 3.11, we can easily conclude that
Consequently, we have
Since , we can also conclude that
which implies that
(8.59) |
Finally, by putting in (8.59), we get
(8.60) |
Hence, is checked. Now, with all the conditions of Lemma 8.1 satisfied, it follows that
which is exactly the desired result, hence, the proof is completed.
Preliminaries of the proofs
As mentioned before, a straightforward approach does not work when dealing with random bandwidths. Therefore, we often use some general lemmas (see, for example, [44]) in order to exploit the results for non-random bandwidths. In this section, we present the results of [147] and [43] obtained for some positive bandwidth . These results are key instruments in the proofs. We denote by the following quantities the bias term and the centered variate, respectively:
(8.61) |
The decomposition (8.61) plays a key role in our proof. Indeed, following the method adopted by [147], we will show that converges in quadratic mean to and that the bias satisfies
Lemma 8.5. Under conditions (C.1.), (C.3.1), and (C.5.1) (or equivalently (C.1'.), (C.3'.) and (C.5.1)), and if as , then, we have for each :
(8.62) |
Before we present the next result, the following notation is needed :
(8.63) |
(8.64) |
Set
and
Lemma 8.6. Under conditions (C.1.), (C.2.), (C.3.1) and (C.5.1) (with and for condition (C.5.1)), we have for and positive constants :
(8.65) |
whenever . We have
(8.66) |
(8.67) |
Lemma 8.7. Under conditions (C.1'.), (C.2.), (C.3'.) and (C.5.1) (with and for condition (C.5.1)), we have for as and :
(8.68) |
whenever and are constants specified previously.
(8.69) |
(8.70) |
To unburden the notation a bit and for simplicity, we denote
Lemma 8.8. Under conditions (C.1'.), (C.2.), (C.3'.), (C.5.1) and (C.5.2), if as then we have for , as ,
(8.71) |
Lemma 8.9. Under conditions (C.1.), (C.2.), (C.3.1) and (C.5.1), and if
then we have, as ,
(8.72) |
Proof of Theorem 4.1
Using the Cramér-Wold device, it is sufficient to prove the convergence of the one-dimensional distributions in order to prove Theorem 4.1. Indeed, by the linearity of , it suffices to show that
for all of the form
Therefore, we shall only demonstrate convergence in a single dimension. Recall that we are dealing with
(8.73) |
Set
To obtain the desired result, we write
where
(8.74) |
and
(8.75) |
To obtain the desired results, we follow the strategy of [152].
Lemma 8.10. Under the assumptions (C.2.1), (C.2.2), (C.3.2), (C.4.2) and (C.8.), we have
Lemma 8.11. Under the assumptions (C.1'.), (C.2.1.), (C.2.2.), (C.3'.) and (C.5.) and if
we have
Lemma 8.12. Under the assumptions (C.1), (C.3.1), and (C.5.1) (or equivalently (C.1'.), (C.3'.) and (C.5.1)), and if
then we have for each :
Lemma 8.13. Under conditions (C.1.), (C.3.1), and (C.5.1) (or equivalently (C.1'.), (C.3'.) and (C.5.1)), and if as , then we have, for each :
(8.76) |
as
We highlight that this lemma is more general than Lemma 8.5. This result is slightly weaker than the uniform almost-complete convergence with rate obtained in Section 3; however, the conditions imposed in this lemma are less restrictive.
Proof of Theorem 4.2
In this section, we will use the same method as in [43] and earlier in [8], that is, the blocking approach, which entails breaking down the strictly stationary sequence into equal-sized blocks, each of length , keeping in mind the notation given in the proof of Lemma 8.2, in order to establish the asymptotic equi-continuity of the conditional empirical process.
Let us introduce, for any and
(8.77) |
(8.78) |
Then, we have
where for , we have
We study the asymptotic equi-continuity of each of the previous terms. For a class of functions let be an empirical process based on and indexed by :
and for a measurable function and , set
That implies
Again, keeping in mind that when , we will establish the asymptotic equi-continuity of
which means, for every , that
where
The idea is to work with the independent block sequence instead of the dependent one, which is possible through (A.1); then we have
(8.79) |
where is defined by
(8.80) |
We choose
Note that in our setting is equivalent to:
Making use of condition (C.5.1), we get as , so only the right-hand term of (8.79) remains to be treated. Let us begin: the blocks being independent, we symmetrize using a sequence of i.i.d. Rademacher variables, i.e., r.v.'s with
It should be noted that the sequence is independent of the sequence , thus it remains to establish, for all ,
Again, using the fact that when , and (A.8), it suffices to show that
Since the -conditional moment satisfies (C.4.2), we can truncate and we get, for each , as ,
(8.81) |
(8.82) |
Hence,
Then, it suffices to show
We have the following
This is done by using the chaining method. [8] gave , where
(8.83) |
and let the class of measurable functions of
There is a map that takes each to its closest function in such that
Applying the chaining method
(8.84) |
Let be in such a way that
(8.85) |
Let be chosen so small that
Therefore, from (8.84), we readily infer that
By the fact that the terms composing are bounded by , and by applying Bernstein's inequality, we obtain
By using (8.83), we have
that means
In view of (8.85), we assume that . In a similar way, we have
Finally, by (8.83) it suffices to prove, for each ,
Making use of the square root trick (Lemma 5.2 of [102]; see also [137]), in a similar way as in [8], we get
(8.87) |
Let us introduce the semi-norm
and the covering number defined for any class of functions by
By the latter we can bound (the calculations are detailed in [27]). In the same way as in [27], and earlier in [8], as a result of the independence between the blocks and condition (C.4.3), we apply again Lemma 5.2 of [102] and get
Therefore, the theorem is proved.
Proof of Theorem 4.5
Lemma 8.14. Under assumptions of Theorem 4.5, we have
(8.88) |
and, if in addition, condition (C.8) is satisfied, then we have
(8.89) |
Lemma 8.15. Under the assumptions of Theorem 4.5, we have
(8.90) |
where
Proof of Theorem 4.5
As mentioned earlier, the study of the weak convergence of the conditional -process is based on the study of two parts: the truncated part and the remainder part.
Lemma 8.16. Let be a uniformly bounded class of measurable canonical functions from . Suppose that there are finite constants and such that the covering number satisfies:
(8.91) |
for every and every probability measure . If the mixing coefficient of the stationary sequence fulfills
(8.92) |
for some , then
Proof of Theorem 4.7
It is known that the weak convergence of an empirical process is obtained from its finite-dimensional convergence and its asymptotic equi-continuity (while respecting certain criteria). Theorem 4.5 gives the finite-dimensional convergence of the conditional U-process , so what remains to be seen is its asymptotic equi-continuity. We decompose the -process into two parts, the truncated part and the remainder part:
Following the same reasoning to obtain (A.48), we also know that:
(8.93) |
Therefore, it suffices to prove the weak convergence of and instead of studying . The steps of the proof are similar to [43], while taking into account a multivariate bandwidth instead of a univariate bandwidth . In this section we only show the proof for ; the argument can be replicated for .
As shown earlier, the truncated part is decomposed according to the Hoeffding's decomposition:
We shall first investigate the linear term . Notice that
We can write
We need to introduce a new function
Hence,
The linear term of the process is given by
Therefore, the linear term of the -process is an empirical process indexed by the class of functions defined by
therefore, its weak convergence may be established in a similar way as in the proof of Theorem 4.5. It is clear that . We consider now the nonlinear part; we have to show that
This is a consequence of the Lemma 8.16. Note that the choice of the number and size of the blocks must be made in such a way that the terms converge to We need to prove that
Again, for clarity purposes, we restrict ourselves to . We have
We will use blocking arguments and treat the resulting terms. We start by considering the first term Ⅰ'. We have
Notice that (4.13) readily implies that and recall that for all
By the symmetry of the function , it holds that
(8.94) |
We use, in order, Chebyshev's inequality and Hoeffding's trick; then
(8.95) |
Under (C.6.), we have for each
which tends to as . The terms Ⅱ', Ⅴ' and Ⅵ' are treated in the same way as the first, except that for Ⅱ' and Ⅵ' we do not need to apply Hoeffding's trick, because the variables (or , for Ⅵ') lie in the same blocks; the study of the term Ⅳ' is deduced from those of Ⅰ' and Ⅲ'. Let us consider the term Ⅲ'. As for the truncated part, we have
(8.96) |
We also have
Since the Eq (8.94) is still satisfied, the problem is reduced to
we follow the same procedure as in (8.91). The rest has just been shown to be asymptotically negligible, so the process converges in law to a Gaussian process which has a version with uniformly bounded and uniformly continuous paths with respect to the norm. By repeating the same steps, this also holds true for the process . Consequently, by (8.93) it follows that the process also converges in law to a Gaussian process which has a version with uniformly bounded and uniformly continuous paths with respect to the norm. In a similar way we treat and ; for and , the treatment is done as in the proof of Theorem 4.5.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors would like to thank the Editor-in-Chief, an Associate-Editor, and four anonymous referees for their constructive remarks, which resulted in a substantial improvement of the work's original form and a more sharply focused presentation.
The second author gratefully acknowledges the funding received towards his PhD from the Algerian government PhD fellowship.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Proof of Lemma 8.2:
Following the same notation as [43] (refer also to [27] and [8]), the proof of this lemma is based on the blocking approach, which consists of breaking down the strictly stationary sequence into equal-sized blocks, each of length ; that is, for ,
The values of are given in the following. Another important component in this proof is the sequence of independent blocks satisfying
As in [27,43], applying the results of [84] on -mixing gives us, for any measurable set ,
(A.1) |
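For concreteness, the index bookkeeping behind the blocking device can be sketched as follows; the block length is a free parameter of the proof, to be tuned against the mixing rate:

```python
def alternating_blocks(n, block_len):
    """Split indices 0..n-1 into alternating blocks of equal length block_len:
    even-numbered blocks are kept for the main part (and replaced by independent
    copies via the coupling step), odd-numbered blocks are shown negligible."""
    blocks = [list(range(s, min(s + block_len, n))) for s in range(0, n, block_len)]
    big = blocks[0::2]      # blocks kept for the main U-statistic / empirical process
    small = blocks[1::2]    # interleaved blocks, asymptotically negligible
    return big, small

big, small = alternating_blocks(n=20, block_len=3)
print(big)    # [[0, 1, 2], [6, 7, 8], [12, 13, 14], [18, 19]]
print(small)  # [[3, 4, 5], [9, 10, 11], [15, 16, 17]]
```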
Let be an -net of and . Observing the following decomposition
(A.2) |
Let us start with the term . We need to prove that there exists and , such that
The key strategy here is to work with the independent block sequence instead of the dependent one; this can be achieved by (A.1). Then we can clearly see that, for :
(A.3) |
If we put
then, by condition (C.5.1), it follows that as . Hence, only the first term in the right-hand side of (A.3) remains to be dealt with. Set
Then, we have . The empirical measure on is defined by:
For the original sequence, we write
For the constructed independent block sequence , define
Hence, we have
We consider also the following class of functions for and :
with the envelope function ; also keep in mind that . Now that we have all the necessary elements, and using the fact that we are now dealing with sums of independent blocks, we can use the standard inequalities of the independent setting. We have
(A.4) |
Now, to bound the probability in (A.4), we apply Bernstein's inequality with
and
We know that the envelope function verifies
Hence, by (C.1.1), (C.3.1), (C.4.1) combined with Lemma B.1 and Lemma B.2, we get that
Now, we can apply Bernstein's inequality to the empirical processes, for
Making use of Lemma B.3, we get
(A.5) |
(A.6) |
where . Moreover, from Eq (2.10), and the fact that and by choosing
we get from (A.6) and (A.4),
Now, let us study the term . By conditions (C.4.1) and (C.3.1), for some constant , we have
Using the fact that the kernel is assumed to be Lipschitz, we obtain
where for
Taking into account that and using condition (C.7.), we get for
where we assume that is the bound of the sequence , which converges to . Hence, by applying a standard inequality (see Corollary A.8 [91]), whose conditions are satisfied here, uniformly on and on :
in combination with condition (C.6.) and (C.7.), we readily obtain
(A.7) |
Finally, we treat . Noticing that
similarly to the treatment of once again, it follows directly that
(A.8) |
Hence, by combining (A.2) and (A.8), we obtain (8.2), as sought.
Proof of (8.3):
Notice that (8.3) is a direct result of (8.2) when the function . This completes the proof of the lemma.
Proof of Lemma 8.10
Let us consider the term such that when , and making use of condition (C.3.2), we have
(A.9) |
We have
(A.10) |
(A.11) |
Equation (A.10) gives
(A.12) |
(A.13) |
Similarly, (A.11) gives us:
(A.14) |
The following decomposition will be used to treat (A.12)
(A.15) |
(A.16) |
(A.17) |
For some , recall that and are defined in (8.6) and (8.7) respectively. Moreover, observe that
Under the assumptions (C.4.2) and (C.8.), using Hölder's inequality, for and such that , we can write
To obtain the convergence in probability of (A.15), it suffices to use assumptions (C.4.2) and (C.8.) together with Markov's inequality. Indeed, for all , we obtain
for some large enough and by condition (C.8.). Now, for the second term of (A.14), Lemma 8.2 gives
(A.18) |
Using the fact that and by condition (C.2.1), for , (A.13) gives
(A.19) |
On the other side, recall that , then, using the fact that the regression function satisfies the Lipschitz condition and under condition (C.4.1), we have
(A.20) |
For the second term of the right-hand side of the Eq (8.74), we have
(A.21) |
(A.22) |
making use of the Lemma 8.2, we readily infer
(A.23) |
The proof of is complete. Hence, the convergence of to 0 follows by combining (8.74) with (A.18)–(A.20) and (A.23).
Proof of Lemma 8.3:
In this lemma, we suppose that the class of functions is not necessarily bounded but satisfies condition (C.4.2). In order to prove Lemma 8.3, we proceed as follows, for an arbitrary and each function :
where we take So, we write
Let us start with the truncated part. Following the same reasoning as in the proof of Lemma 8.2, we have:
(A.24) |
First, let us start with the term , using the same notation as the proof of Lemma 8.2, we have:
where is an empirical process indexed by the class of functions , where we define the class of functions of by
where is the moment order given in (C.4.2). We consider also the following class of functions, for and ,
whose envelope function is denoted by ; then we have
(A.25) |
(A.26) |
We apply Bernstein's inequality to bound (A.26), so, we have to study the asymptotic behavior of the quantities
and
From condition (C.4.2), we remark that
so thanks to the condition (C.4.2), we get
Furthermore, making use of Lemma B.1 in combination with Lemma B.2 in appendix, and since condition (C.4.3) is satisfied, we infer that
which means that,
Now, we can apply Bernstein's inequality for empirical processes with:
in Lemma B.3, we obtain, for ,
(A.27) |
where . From Eq (2.10), the fact that , and by choosing
we get
Next, we prove that the result is valid for the term . By conditions (C.4.2) and (C.3.1), for some constant , we have
where
Observe that
Furthermore, following the same steps as in the treatment of in the previous lemma's proof, we have:
where, for
We get uniformly on and on :
So, for , we obtain
(A.28) |
(A.29) |
The transition from (A.28) to (A.29) is done by using Jensen's inequality applied to the concave function , for . Moreover, from condition (C.7.), we deduce that the quantity is bounded; then for
where . Hence, by applying a standard inequality (see Corollary A.8 [91]), whose conditions are satisfied here, uniformly on , we get
Finally, all that is left is to evaluate the term . We have
which means that
Let us consider ; following the same steps, and using the fact that is a Lipschitz function, we get
(A.30) |
where, for
We get for and uniformly on and on :
(A.31) |
(A.32) |
The transition from (A.31) to (A.32) is done by using Minkowski's and Jensen's inequalities. Then, following the same approach as in the treatment of , we get
Therefore, the proof is done for the truncated part. Regarding the remainder term, the idea consists of proving its asymptotic negligibility, that is
which could be derived directly from the proof of the remainder part of the -statistics developed in the sequel.
Proof of Lemma 8.11
We recall that
Therefore, the proof of this lemma is based on the following decomposition:
(A.33) |
By the same arguments as those involved in the proof of , we get
(A.34) |
and
(A.35) |
By combining (A.34) and (A.35) and by similar arguments used to treat the and , we get
We then obtain that tends to zero as tends to infinity. We now evaluate the term on the right-hand side of (A.33). Let us introduce the following sum
where
and
Thus, the claimed result now is
The asymptotic normality of was proved in Lemma 8.8 by choosing the bandwidth parameter as . For and , we obtain by using Lemma 8.2 and the fact that with or
Using Lemma 8.5, we get
Consequently, we have
Hence, the proof is complete.
Proof of Lemma 8.12:
For the proof of this lemma, it suffices to use the result of [58] in inequality (A.9), Chebyshev's inequality, and Lemma 8.2. For , we readily infer
Making use of the fact that
We finally obtain
Hence, the proof is complete.
Proof of Lemma 8.13:
We have
where
Taking into account condition (C.4.2) and using Hölder's inequality, for such that , we can write for all :
(A.36) |
the latter inequality is due to condition (C.1.2). Next, we have
Let us start with the term ; considering conditions (C.1.1) and (C.3.1), it follows that
(A.37) |
and we have
Hence, combining the latter inequality with (A.37) gives us:
(A.38) |
whenever . Next, we consider . We have
(A.39) |
whenever satisfies, , and by conditions (C.1.2)–(C.3.1), we readily infer that:
Thus, by the previous equation in combination with (A.39), we have
Making use of (A.37), we obtain
This when combined with (A.38) implies that
where is chosen in such a way that the above bound tends to as . Now, let us consider . For any two -algebras and , applying Davydov's lemma for strong-mixing sequences and taking into account condition (C.4.2), we infer
However, satisfies (A.37), then we have
This implies
Once more by using (A.37) and by a simple calculation (reduction of the double sum), we get:
The latter, with the bound on , implies:
(A.40) |
Choosing and making use of condition (C.5.1), we get that:
(A.41) |
By the same value chosen for :
By the fact that is assumed to be bounded and , the first term of the last inequality tends to as . Furthermore, the second term also tends to since . Hence, the proof is complete.
Proof of Lemma 8.14:
Based on (4.8), we have
Remark that the first term of the last sum is an empirical process indexed by , so from Section 4.1, we have
(A.42) |
So, it is enough to show, for that
(A.43) |
To avoid ambiguity, we consider the case where (the other cases are treated in the same way). Then, we have
The projections being -canonical, to show (A.43) it suffices to show that
We have
where
We first remark that
Following reasoning similar to that used to obtain (A.41), the following bound suffices for our needs
Treating the other terms as in Lemma 2 of [203] (refer to [13, Lemma 3] for a detailed proof in the conditional setting), we get
Hence, the proof of (8.88) is complete. The statement (8.89) is a direct consequence of (8.88) in connection with condition (C.8).
Proof of Lemma 8.15
For any two constants , the sum of the products of these constants with the components of the vector (8.90) is a centered -statistic, i.e., for ,
where
So, by the Cramér-Wold theorem the proof of the present lemma is directly deduced from Lemma 8.14.
What remains to be done is to complete the proof of the theorem in question based on the two previous lemmas. Indeed,
and because
then all we have to do is prove that
(A.44) |
We have
The latter, together with the fact that is continuous on , leads to the desired result.
Proof of Lemma 8.16:
For clarity of exposition, we present the proof for ; this case already contains the main idea. As in the proof of Theorem 4.2, we divide the sequence into alternating blocks; here the sizes are different, satisfying
(A.45) |
and set, for :
Note that the notation used here and in the proof of Theorem 4.7 denotes the size of the alternating blocks, whereas in the proof of Theorem 4.2 it denotes the radius of the nets of the class of functions. Then, we have
(A.46) |
We have to treat each of the terms Ⅰ–Ⅵ. The treatment of Ⅴ and Ⅵ is readily achieved through techniques similar to those used to investigate Ⅰ and Ⅱ, and is therefore omitted.
The same type of block but not the same block (Ⅰ):
Suppose that the sequence of independent blocks is of size . An application of (A.1), shows that
We keep the choice of and such that :
(A.47) |
which implies that as , so the term to consider is the second summand. For some , we choose and such that for
Then, using the results of [44], with when , and making use of condition (C.3.2), we have:
(A.48) |
where and . This implies that
Now, combining Lemma A.1 of [67] with Proposition B.8 in the Appendix, we obtain
(A.49) |
where is the diameter of according to the distance which are defined respectively by
and
Let us consider another semi-norm
One can see that we have
We readily infer that
where Notice that as we have
where and are chosen in such a way that the following relation will be fulfilled
(A.50) |
Making use of the triangle inequality in combination with Hoeffding's trick (see, for instance, [8, page 62]), we readily obtain that
(A.51) |
where are independent copies of . By imposing
(A.52) |
we readily infer that
By symmetrizing the expression in (A.51) and applying again Proposition B.8 in the Appendix, we get
(A.53) |
where
and for
The fact that
so,
(A.54) |
we have the convergence of (A.53) to zero. For the choice of , and , it should be noted that all values satisfying (A.45), (A.47), (A.50), (A.52) and (A.54) are acceptable.
The same blocks (Ⅱ):
Remark that we have
In a similar way as in the preceding proof, it suffices to prove that
By the computation of [138, p. 53] and the fact that the classes of functions are uniformly bounded, we obtain, uniformly in ,
This implies that we have to prove that
(A.55) |
As for empirical processes, to prove (A.55) it suffices to symmetrize and show that
In a similar way as in (A.49), we infer that
where
and the semi-metric is defined by
Since we are treating uniformly bounded classes of functions, we infer that
Since , , we obtain Ⅱ as .
Different types of blocks (Ⅲ)
An application of (A.1), shows that
By the last choice of the parameters and condition (8.92) imposed on the -coefficients, we have
For and , since we have independent exchangeable blocks, we infer that
For , we obtain
therefore, it suffices to treat the convergence
By similar arguments as in [8], the usual symmetrization gives
(A.56) |
where
In a similar way as in (A.49), we infer that
(A.57) |
where
Since we have
and by considering the semi-metric
We show that the expression in (A.57) is bounded as follows
by choosing for some , we get the convergence to zero of the previous quantity. To bound the second term on the right-hand side of (A.56), we remark that
(A.58) |
(A.59) |
We now apply the square root trick to the last expression conditionally on . We denote by the expectation with respect to and we get by (C.6.), for , (in the notation in Lemma 5.2 of [102])
Since we need and , by arguments similar to those of [8, page 69], we get the convergence of (A.57) and (A.59) to zero.
Different types of blocks (Ⅳ)
We have :
Hence, the proof of the lemma is complete.
This appendix contains supplementary material that is essential for a more comprehensive understanding of the paper.
In the sequel, we define to be i.i.d. random variables defined on the probability space and taking values in some measurable space and to be a -measurable class of measurable functions with envelope function such that :
We further assume that has the following property:
For any sequence of i.i.d. -valued random variables it holds that
where is a constant depending on only.
Lemma B.1 (Theorem 2.14.1 [193]). For an empirical process indexed by the class of functions with the notation:
and meaning
we have, for ,
Lemma B.2 (Theorem 3.1 [78]). Let be a pointwise measurable function class satisfying the above assumptions. If we suppose that the empirical process satisfies:
(B.1) |
then for any measurable subset :
From Theorem 3.2 of [78] it follows that a VC-type class of functions always satisfies condition (B.1).
Lemma B.3 (Bernstein type inequality Fact 4.2 [78]). Assume that for some and the r.vs satisfy:
then for we have for any :
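For orientation, the classical scalar form of Bernstein's inequality reads as follows (the empirical-process version invoked in the proofs has the same exponential flavor): for independent centered random variables $\xi_1,\dots,\xi_n$ with $|\xi_i|\le M$ and $\operatorname{Var}(\xi_i)\le\sigma^{2}$,

$$\mathbb{P}\left(\Big|\sum_{i=1}^{n}\xi_i\Big|>t\right)\le 2\exp\left(-\frac{t^{2}}{2n\sigma^{2}+\tfrac{2}{3}Mt}\right),\qquad t>0.$$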
Lemma B.4 (Proposition 4. [9]). Let be i.i.d. random variables with values in some measurable space . Let be a class of symmetric functions from satisfying some measurability conditions. Suppose that there exists a finite constant such that for each we have:
and that there is a finite constant such that a.s. Then for each :
where the variables are Rademacher variables.
Lemma B.5 (Lemma 6.1 in [124], p.186). Let be independent Bernoulli random variables with , for all Set and Then, for any we have
and if , we have
Lemma B.6. Let be a data sequence, along with the kernel function , satisfying Assumptions (A1)–(A3). Then there exist absolute constants , depending only on and , such that for any and sufficiently large,
Proposition B.7. Let be a strong-mixing, non-stationary sequence of random variables, and let be -measurable and be -measurable. If and , then:
where :
Proposition B.8 (Proposition 3.6 of [10]). Let be a process satisfying, for :
and the semi-metric : There exists a constant such that:
where is the -diameter of .
Remark B.9. In a similar way as in [35], Theorem 4.2 can be used to investigate the following problems.
1) (Expectile regression). For , let ; then the zero of with respect to leads to quantities called expectiles by [155]. Expectiles, as defined by [155], may be introduced either as a generalization of the mean or as an alternative to quantiles. Indeed, expectile regression, like classical mean regression, is highly sensitive to extreme values, allowing for more reactive risk management, whereas quantile regression provides exhaustive information on the effect of the explanatory variable on the response variable by examining its conditional distribution; refer to [3,4,150,151] for further details on expectiles in the functional data setting.
2) (Quantile regression). For , let . Then the zero of with respect to is the conditional -th quantile, initially introduced in [129] in the real and linear framework, for more general setting, refer to [35].
3) (Conditional winsorized mean). As in [117], if we consider if , , or , then the zero of with respect to will be the conditional winsorized mean. Notably, this parameter was not considered in the literature on nonparametric functional data analysis involving wavelet estimators. Our paper offers asymptotic results for the conditional winsorized mean when the covariates are functions.
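For the reader's convenience, the standard score functions behind these three examples may be written as follows (our notation, following [155], [129], and [117], respectively); in each case the parameter of interest is the zero in $t$ of $\mathbb{E}[\psi(Y,t)\mid X=x]$:

$$\psi_\alpha^{\mathrm{exp}}(y,t)=\big|\alpha-\mathbb{1}\{y\le t\}\big|\,(y-t),\qquad \psi_\alpha^{\mathrm{quant}}(y,t)=\mathbb{1}\{y\le t\}-\alpha,\qquad \psi_c^{\mathrm{wins}}(y,t)=\max\big(\min(y-t,\,c),\,-c\big),$$

with $\alpha\in(0,1)$ and a truncation constant $c>0$.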
[1] |
Le, A., Satkunam, L. and Jaime, C.Y., Residents' Perceptions of a Novel Virtual Livestream Cadaveric Teaching Series for Musculoskeletal Anatomy Education. American Journal of Physical Medicine & Rehabilitation, 2023,102(12): e165–e168. https://doi.org/10.1097/PHM.0000000000002284 doi: 10.1097/PHM.0000000000002284
![]() |
[2] |
Hawdon, J.M. and Bernot, J.P., Teaching Parasitology Lab Remotely Using Livestreaming. The American Biology Teacher: Journal of the National Association of Biology Teachers, 2022, 84(5): 312–314. https://doi.org/10.1525/abt.2022.84.5.312 doi: 10.1525/abt.2022.84.5.312
![]() |
[3] | UNESCO, Global education monitoring report, 2023: technology in education: a tool on whose terms? 2023. |
[4] | Ma, L.P. and Bu, S.C., Is Synchronous Online Education Better than Asynchronous? An Empirical Study Based on Survey and Administration Data. Peking University Education Review, 2022, 20(3): 2-24+187. |
[5] | Li, S., Huang, J.J. and Liu, S.Z., Analysis of Patterns and Characteristics of Teacher-Student Dialogue Interaction in Live Teaching. Modern Distance Education Research, 2022, 91–112. |
[6] |
Lin, X.F., Deng, C., Hu, Q. and Tsai, C.C., Chinese undergraduate students' perceptions of mobile learning: Conceptions, learning profiles, and approaches. Journal of Computer Assisted Learning, 2019, 35(3): 317–333. https://doi.org/10.1111/jcal.12333 doi: 10.1111/jcal.12333
![]() |
[7] | Alqurashi, E., Technology Tools for Teaching and Learning in Real Time. 2019. Educational Technology and Resources for Synchronous Learning in Higher Education. |
[8] | Wei, B.S., Analysis of Problems and Countermeasures in Open Education Livestream Teaching. Journal of Jilin Radio and TV University, 2018, 98-99+112. |
[9] |
Dunlap, J.C. and Lowenthal, P.R., Online educators' recommendations for teaching online: Crowdsourcing in action. Open Praxis, 2018, 10(1): 79–89. https://doi.org/10.5944/openpraxis.10.1.721 doi: 10.5944/openpraxis.10.1.721
![]() |
[10] | Lin, X.F. and S.Q. Liu, Live Broadcast Teaching Strategies for Cultivating Higher-order Thinking Skills. Modern Educational Technology, 2019(03): 99–105. |
[11] |
Cui, X.P., Zhao, L., Su, W. and Lu, C.C., Research on the Structural Relationship and Effects of Influencing Factors of Live Learning Outcomes——From the Perspective of Interactive Distance Theory. e-Education Research, 2022, 63–70. https://doi.org/10.13811/j.cnki.eer.2022.01.008 doi: 10.13811/j.cnki.eer.2022.01.008
![]() |
[12] |
Kozan, K. and Richardson, J.C., Interrelationships Between and Among Social, Teaching, and Cognitive Presence. The Internet and higher education, 2014, 21: 68–73. https://doi.org/10.1016/j.iheduc.2013.10.007 doi: 10.1016/j.iheduc.2013.10.007
![]() |
[13] |
Wu, X.E., Chen, X.H. and Wu, J., On the Effects of Presence on Online Learning Performance. Modern Distance Education, 2017, 24–30. https://doi.org/10.13927/j.cnki.yuan.2017.0014 doi: 10.13927/j.cnki.yuan.2017.0014
![]() |
[14] |
Zhang, X.F., Shi, Y., Li, T., Guan, Y. and Cui, X., How Do Virtual AI Streamers Influence Viewers' Livestream Shopping Behavior? The Effects of Persuasive Factors and the Mediating Role of Arousal. Information Systems Frontiers, 2023, 1–32. https://doi.org/10.1007/s10796-023-10425-2 doi: 10.1007/s10796-023-10425-2
![]() |
[15] |
Garrison, D.R., Anderson, T. and Archer, W., Critical Inquiry in a Text-Based Environment: Computer Conferencing in Higher Education. The internet and higher education, 1999, 2(2-3): 87–105. https://doi.org/10.1016/S1096-7516(00)00016-6 doi: 10.1016/S1096-7516(00)00016-6
![]() |
[16] Garrison, D.R. and Arbaugh, J.B., Researching the community of inquiry framework: Review, issues, and future directions. The Internet and Higher Education, 2007, 10(3): 157–172. https://doi.org/10.1016/j.iheduc.2007.04.001
[17] Garrison, D.R., Online Community of Inquiry Review: Social, Cognitive, and Teaching Presence Issues. Online Learning, 2007, 11(1). https://doi.org/10.24059/olj.v11i1.1737
[18] Wan, L.Y., David, S. and Xie, K., Two Decades of Research on Community of Inquiry Framework: Retrospect and Prospect. Open Education Research, 2020, 26(06): 57–68. https://doi.org/10.13966/j.cnki.kfjyyj.2020.06.006
[19] Ma, Z.Q., Liu, Y.Q. and Kong, L.L., The Context and Inspiration of Theoretical and Empirical Research on the Community of Inquiry. Modern Distance Education Research, 2018, 39–48.
[20] Goshtasbpour, F., Swinnerton, B. and Morris, N.P., Look Who's Talking: Exploring Instructors' Contributions to Massive Open Online Courses. British Journal of Educational Technology, 2020, 51(1): 228–244. https://doi.org/10.1111/bjet.12787
[21] Shea, P. and Bidjerano, T., Community of inquiry as a theoretical framework to foster "epistemic engagement" and "cognitive presence" in online education. Computers & Education, 2009, 52(3): 543–553. https://doi.org/10.1016/j.compedu.2008.10.007
[22] Hardin-Pierce, M., Hampton, D., Melander, S., Wheeler, K., Scott, L., Inman, D., et al., Faculty and Student Perspectives of a Graduate Online Delivery Model Supported by On-Campus Immersion. Clinical Nurse Specialist, 2020, 34(1): 23–29. https://doi.org/10.1097/NUR.0000000000000494
[23] Bai, X.M., Ma, H.L. and Wu, H.M., Relationships among Teaching, Cognitive and Social Presence in a MOOC-based Blended Course. Open Education Research, 2016, 22(04): 71–78. https://doi.org/10.13966/j.cnki.kfjyyj.2016.04.009
[24] Rolim, V., Ferreira, R., Lins, R.D. and Gašević, D., A network-based analytic approach to uncovering the relationship between social and cognitive presences in communities of inquiry. The Internet and Higher Education, 2019, 42: 53–65. https://doi.org/10.1016/j.iheduc.2019.05.001
[25] Sun, K.T., Lin, Y.C. and Yu, C.J., A Study on Learning Effect among Different Learning Styles in a Web-Based Lab of Science at Elementary Schools. Computers & Education, 2008, 50(4): 1411–1422. https://doi.org/10.1016/j.compedu.2007.01.003
[26] Cheung, W., Li, E.Y. and Yee, L.W., Multimedia learning system and its effect on self-efficacy in database modeling and design: an exploratory study. Computers & Education, 2003, 41(3): 249–270. https://doi.org/10.1016/S0360-1315(03)00048-4
[27] Latham, G.P. and Brown, T.C., The Effect of Learning vs. Outcome Goals on Self-Efficacy, Satisfaction and Performance in an MBA Program. Applied Psychology, 2006, 55(4): 606–623. https://doi.org/10.1111/j.1464-0597.2006.00246.x
[28] Sung, Y.T., Chang, K.E. and Liu, T.C., The effects of integrating mobile devices with teaching and learning on students' learning performance. Computers & Education, 2016, 94: 252–275. https://doi.org/10.1016/j.compedu.2015.11.008
[29] Zhong, B.C. and Xia, L.Y., Effects of new coopetition designs on learning performance in robotics education. Journal of Computer Assisted Learning, 2021, 38(1): 223–236. https://doi.org/10.1111/jcal.12606
[30] Fung, C.J.L. and Lang, Q.C., Modelling relationships between students' academic achievement and community of inquiry in an online learning environment for a blended course. Australasian Journal of Educational Technology, 2016, 32(4). https://doi.org/10.14742/ajet.2500
[31] Garrison, D.R., Cleveland-Innes, M. and Fung, T.S., Exploring causal relationships among teaching, cognitive and social presence: Student perceptions of the community of inquiry framework. The Internet and Higher Education, 2010, 13(1): 31–36. https://doi.org/10.1016/j.iheduc.2009.10.002
[32] Guoshuai, L., Qiuju, Z., Caijie, L.V., Yating, S. and Jiacai, W., Construction of a Chinese Version of the Community of Inquiry Measurement Instrument. Open Education Research, 2018, 24(03): 68–76. https://doi.org/10.13966/j.cnki.kfjyyj.2018.03.008
[33] Aldhahi, M.I., Alqahtani, A.S., Baattaiah, B.A. and Al-Mohammed, H.I., Exploring the relationship between students' learning satisfaction and self-efficacy during the emergency transition to remote learning amid the coronavirus pandemic: A cross-sectional study. Education and Information Technologies, 2022, 27(1): 1323–1340. https://doi.org/10.1007/s10639-021-10644-7
[34] Thomas, L.J., Parsons, M. and Whitcombe, D., Assessment in Smart Learning Environments: Psychological Factors Affecting Perceived Learning. Computers in Human Behavior, 2019, 95: 197–207. https://doi.org/10.1016/j.chb.2018.11.037
[35] Wen, Z.L. and Ye, B.J., Analyses of Mediating Effects: The Development of Methods and Models. Advances in Psychological Science, 2014, 22(5): 731–745. https://doi.org/10.3724/SP.J.1042.2014.00731
[36] Li, S.W., Luo, J.L. and Ge, Y.Q., Effects of Big Data Analysis Capability on Breakthrough Product Innovation. Journal of Management Science, 2021, 3–15.
[37] Dixson, M.D., Creating effective student engagement in online courses: What do students find engaging? Journal of the Scholarship of Teaching and Learning, 2010, 10(2): 1–13.
[38] Contreras, C.P., Picazo, D., Cordero-Hidalgo, A. and Chaparro-Medina, P.M., Challenges of virtual education during the Covid-19 pandemic: Experiences of Mexican university professors and students. International Journal of Learning, Teaching and Educational Research, 2021, 20(3): 188–204. https://doi.org/10.26803/ijlter.20.3.12
[39] Kwiatkowska, W. and Wiśniewska-Nogaj, L., Motives, benefits and difficulties in online collaborative learning versus the field of study: An empirical research project concerning Polish students. e-mentor, 2021, 90(3): 11–21. https://doi.org/10.15219/em90.1518
[40] Zhang, W.W., Zhu, J. and Jiang, X., Research on the Impact of Social Learning on the Knowledge Contribution Behavior of Different Types of Users in Professional Virtual Communities. Information and Documentation Services, 2021, 42(5): 94–103.
[41] Li, Y.F., Zhou, X.Z. and Yang, X.H., The Optimization Design of Online Open Courses for Subjects' Needs. Modern Educational Technology, 2022, 104–110.
[42] Ghazali, A.F., Othman, A.K., Sokman, Y., Zainuddi, N.A., Suhaimi, A., Mokhtar, N.A., et al., Investigating Social Cognitive Theory in Online Distance and Learning for Decision Support: The Case for Community of Inquiry. International Journal of Asian Social Science, 2021, 11(11): 522–538.
[43] Yang, S.J., How to Ensure the Teaching Quality of Online Live Courses: Taking the Learning Environment and Learning Content Design of Online Live Courses during the Epidemic as an Example. Modern Educational Technology, 2023, 112–118.
[44] Valentini, M. and Guarnacci, S., Embodied cognition, effective learning and physical activity as a shared feature: Systematic review. Journal of Human Sport and Exercise, 2021, 16(2): S539–S552. https://doi.org/10.14198/JHSE.2021.16.PROC2.38
[45] Guo, S.Q., Gao, H.Y. and Hua, X.Y., Theoretical Research on Design of "Internet +" Unit Teaching Mode. e-Education Research, 2022, 43(06): 91–103+112. https://doi.org/10.13811/j.cnki.eer.2022.06.014
[46] Cui, X.P., et al., Visualization Analysis of Online Learning Activity Design Research in China. Journal of Longdong University, 2023, 32(4): 135–139.
1. | Nour-Eddine Berrahou, Salim Bouzebda, Lahcen Douge, Functional Uniform-in-Bandwidth Moderate Deviation Principle for the Local Empirical Processes Involving Functional Data, 2024, 33, 1066-5307, 26, 10.3103/S1066530724700030 | |
2. | Salim Bouzebda, Limit Theorems in the Nonparametric Conditional Single-Index U-Processes for Locally Stationary Functional Random Fields under Stochastic Sampling Design, 2024, 12, 2227-7390, 1996, 10.3390/math12131996 | |
3. | Salim Bouzebda, Nourelhouda Taachouche, Oracle inequalities and upper bounds for kernel conditional U-statistics estimators on manifolds and more general metric spaces associated with operators, 2024, 1744-2508, 1, 10.1080/17442508.2024.2391898 | |
4. | Salim Bouzebda, Youssouf Souddi, Fethi Madani, Weak Convergence of the Conditional Set-Indexed Empirical Process for Missing at Random Functional Ergodic Data, 2024, 12, 2227-7390, 448, 10.3390/math12030448 | |
5. | Salim Bouzebda, Inass Soukarieh, Limit theorems for a class of processes generalizing the U -empirical process , 2024, 96, 1744-2508, 799, 10.1080/17442508.2024.2320402 | |
6. | Salim Bouzebda, Weak convergence of the conditional single index -statistics for locally stationary functional time series, 2024, 9, 2473-6988, 14807, 10.3934/math.2024720 | |
7. | Oussama Bouanani, Salim Bouzebda, Limit theorems for local polynomial estimation of regression for functional dependent data, 2024, 9, 2473-6988, 23651, 10.3934/math.20241150 | |
8. | Salim Bouzebda, Amel Nezzal, Issam Elhattab, Limit theorems for nonparametric conditional U-statistics smoothed by asymmetric kernels, 2024, 9, 2473-6988, 26195, 10.3934/math.20241280 | |
9. | Salim Bouzebda, Uniform in Number of Neighbor Consistency and Weak Convergence of k-Nearest Neighbor Single Index Conditional Processes and k-Nearest Neighbor Single Index Conditional U-Processes Involving Functional Mixing Data, 2024, 16, 2073-8994, 1576, 10.3390/sym16121576 | |
10. | Alain Desgagné, Christian Genest, Frédéric Ouimet, Asymptotics for non-degenerate multivariate U-statistics with estimated nuisance parameters under the null and local alternative hypotheses, 2024, 0047259X, 105398, 10.1016/j.jmva.2024.105398 | |
11. | Breix Michael Agua, Salim Bouzebda, Single index regression for locally stationary functional time series, 2024, 9, 2473-6988, 36202, 10.3934/math.20241719 | |
12. | Youssouf Souddi, Salim Bouzebda, k-Nearest Neighbour Estimation of the Conditional Set-Indexed Empirical Process for Functional Data: Asymptotic Properties, 2025, 14, 2075-1680, 76, 10.3390/axioms14020076 | |
13. | Ibrahim M. Almanjahie, Hanan Abood, Salim Bouzebda, Fatimah Alshahrani, Ali Laksaci, Nonparametric expectile shortfall regression for functional data, 2025, 58, 2391-4661, 10.1515/dema-2025-0125 | |
14. | Salim Bouzebda, Nourelhouda Taachouche, Limit theorems for conditional U-statistics analysis on hyperspheres for missing at random data in the presence of measurement error, 2026, 472, 03770427, 116811, 10.1016/j.cam.2025.116811 |