
Weak convergence of the conditional single index U-statistics for locally stationary functional time series

  • In recent years, there has been a notable shift in focus towards the analysis of non-stationary time series, driven largely by the complexities associated with delineating significant asymptotic behaviors inherent to such processes. The genesis of the theory of locally stationary processes arises from the quest for asymptotic inference grounded in nonparametric statistics. This paper endeavors to formulate a comprehensive framework for conducting inference within the realm of locally stationary functional time series by harnessing the conditional U-statistics methodology as propounded by W. Stute in 1991. The proposed methodology extends the Nadaraya-Watson regression function estimates. Within this context, a novel estimator is introduced for the single index conditional U-statistics operator, adept at accommodating the non-stationary attributes inherent to the data-generating process. The primary objective of this paper is to establish the weak convergence of conditional U-processes within the domain of locally stationary functional mixing data. Specifically, the investigation delves into scenarios of weak convergence involving functional explanatory variables, considering both bounded and unbounded sets of functions while adhering to specific moment requirements. The derived findings emanate from broad structural specifications applicable to the class of functions and models under scrutiny. The theoretical insights expounded in this study constitute pivotal tools for advancing the domain of functional data analysis.

    Citation: Salim Bouzebda. Weak convergence of the conditional single index U-statistics for locally stationary functional time series[J]. AIMS Mathematics, 2024, 9(6): 14807-14898. doi: 10.3934/math.2024720




    In the evolutionary trajectory of asymptotic outcomes related to U-statistics, with a particular emphasis on independent and identically distributed random variables, pivotal contributions can be ascribed to esteemed figures like [93,99,181], among others. When extending these advancements to accommodate scenarios involving weak dependency assumptions, notable references include [30,39,40,66,119,120]. For a comprehensive grasp of U-statistics and U-processes, scholars are directed to seminal works such as [11,12,14,31,113,117]. A substantial leap forward in the theoretical landscape of U-processes is accredited to [64], who made significant contributions by assimilating insights from empirical process theory. Their introduction of innovative techniques, including decoupling inequality and randomization, played a pivotal role in propelling the theoretical framework forward. The applications of U-processes traverse diverse statistical domains, encompassing testing for qualitative features of functions in nonparametric statistics [1,88,118], cross-validation for density estimation [144], and establishing limiting distributions of M-estimators [11,64,157,158]. In the realm of machine learning, U-statistics find multifaceted applications in clustering, image recognition, ranking, and learning on graphs. The natural estimates of risk prevalent in various machine learning contexts often manifest in the form of U-statistics, as elucidated in [54]. Instances of U-statistics are also discerned in various contexts, such as empirical performance measures in metric learning, exemplified by [50]. When confronted with U-statistics characterized by random kernels exhibiting diverging orders, pertinent literature includes contributions from [85,98,152,162]. Infinite-order U-statistics manifest as invaluable tools for constructing simultaneous prediction intervals, providing insights into the uncertainty inherent in ensemble methods like subbagging and random forests, as explicated in [146]. The MeanNN approach, introduced by [75] for estimating differential entropy, intricately involves the utilization of the U-statistic. Additionally, [129] proposes a novel test statistic for goodness-of-fit tests, employing U-statistics. A model-free approach to clustering and classifying genetic data based on U-statistics is explored by [55], presenting alternative perspectives driven by the adaptability of U-statistics to a diverse array of genetic issues and their capability to accommodate various data types. Furthermore, [125] advocates for the natural application of U-statistics in examining random compressed sensing matrices in the non-asymptotic regime. For the latest references in this context, please consult [43,163,164]. In the realm of nonparametric density and regression function estimation, [166] introduces a class of estimators for \(r^{(m)}(\varphi, \mathbf{ t}) \), referred to as conditional \(U\)-statistics. These estimators can be perceived as an extension of the Nadaraya-Watson estimates for regression functions, initially proposed by [139,184]. The nonparametric domain of density and regression function estimation has been a focal point for statisticians and probabilists over numerous years, resulting in the evolution of various methodologies. Kernel nonparametric function estimation methods, in particular, have garnered substantial attention. 
For a thorough exploration of the research literature and statistical applications in this field, one is encouraged to consult [67,74,95,140,160,182], and the pertinent references therein.

This investigation delves into the intricacies of nonparametric conditional U-statistics. To facilitate our exploration, we commence by introducing the estimators proposed by [166]. Consider a regular sequence of random elements \(\{(\mathbf{X}_i, Y_i) : i \in \mathbb{N}^*\}\), where \(\mathbf{X}_i \in \mathbb{R}^d\) and \(Y_i \in \mathcal{Y}\), a Polish space, with \(\mathbb{N}^* = \mathbb{N} \setminus \{0\}\). Let \(\varphi : \mathcal{Y}^m \to \mathbb{R}\) be a measurable function. In this paper, our central focus revolves around the estimation of the conditional expectation or regression function:

\[ r^{(m)}(\varphi, \mathbf{t}) = \mathbb{E}\left( \varphi(Y_1, \ldots, Y_m) \mid (\mathbf{X}_1, \ldots, \mathbf{X}_m) = \mathbf{t} \right), \]

for \(\mathbf{t} \in \mathbb{R}^{dm}\), provided it exists, namely, when \(\mathbb{E}(|\varphi(Y_1, \ldots, Y_m)|) < \infty\). We introduce a kernel function \(K : \mathbb{R}^d \to \mathbb{R}\) with support contained in \([-B, B]^d\), where \(B > 0\), adhering to the following conditions:

\[ \sup_{\mathbf{x} \in \mathbb{R}^d} |K(\mathbf{x})| =: \kappa < \infty \quad \text{and} \quad \int K(\mathbf{x})\, d\mathbf{x} = 1. \]

Reference [166] introduced a class of estimators for \(r^{(m)}(\varphi, \mathbf{t})\), known as conditional U-statistics, defined for each \(\mathbf{t} \in \mathbb{R}^{dm}\) as:

\[ \widehat{r}^{(m)}_n(\varphi, \mathbf{t}; h_n) = \frac{\displaystyle\sum_{(i_1, \ldots, i_m) \in I^m_n} \varphi(Y_{i_1}, \ldots, Y_{i_m})\, K\!\left( \frac{\mathbf{t}_1 - \mathbf{X}_{i_1}}{h_n} \right) \cdots K\!\left( \frac{\mathbf{t}_m - \mathbf{X}_{i_m}}{h_n} \right)}{\displaystyle\sum_{(i_1, \ldots, i_m) \in I^m_n} K\!\left( \frac{\mathbf{t}_1 - \mathbf{X}_{i_1}}{h_n} \right) \cdots K\!\left( \frac{\mathbf{t}_m - \mathbf{X}_{i_m}}{h_n} \right)}, \tag{1.1} \]

where \(I^m_n\) is the set of all \(m\)-tuples of different integers between \(1\) and \(n\):

\[ I^m_n = \left\{ \mathbf{i} = (i_1, \ldots, i_m) : 1 \le i_j \le n \ \text{and} \ i_j \ne i_r \ \text{if} \ j \ne r \right\}, \]

and \(\{h_n\}_{n \ge 1}\) is a sequence of positive constants converging to zero at the rate \(n h_n^{dm} \to \infty\). In the specific scenario of \(m = 1\), where \(r^{(m)}(\varphi, \mathbf{t})\) simplifies to \(r^{(1)}(\varphi, t) = \mathbb{E}(\varphi(Y) \mid \mathbf{X} = t)\), the estimator by Stute reduces to the Nadaraya-Watson estimator of \(r^{(1)}(\varphi, t)\). The study conducted by [154] focused on estimating the rate of uniform convergence in \(\mathbf{t}\) of \(\widehat{r}^{(m)}_n(\varphi, \mathbf{t}; h_n)\) to \(r^{(m)}(\varphi, \mathbf{t})\). In [148], the paper discusses and compares the limit distributions of \(\widehat{r}^{(m)}_n(\varphi, \mathbf{t}; h_n)\) with those obtained by Stute. [97] extended the results of [166] to weakly dependent data under appropriate mixing conditions (see also [17]). They applied these findings to verify the Bayes risk consistency of corresponding discrimination rules, similar to [167] and Section 5.1. In [169], symmetrized nearest neighbor conditional U-statistics are proposed as alternatives to the usual kernel-type estimators; reference can also be made to [48]. [86] explored the functional conditional U-statistic and established its finite-dimensional asymptotic normality. Despite the subject's importance, nonparametric estimation of conditional U-statistics in a functional data framework has received relatively limited attention. Recent advancements are presented in [48], addressing problems related to uniform bandwidth consistency in a general setting. In [104], the test of independence in the functional framework based on Kendall statistics was investigated; these can be considered particular cases of U-statistics. Extending this exploration to conditional empirical U-processes in the functional setting is practically useful and technically more challenging. Two perspectives on conditional U-processes are available: 1) they are infinite-dimensional versions of conditional U-statistics (with one kernel), and 2) they are stochastic processes that are nonlinear generalizations of conditional empirical processes. Both views are valuable because: 1) from a statistical standpoint, considering a rich class of statistics is more interesting than a single statistic; 2) mathematically, insights from empirical process theory can be applied to derive limit or approximation theorems for U-processes. Importantly, extending U-statistics to U-processes demands substantial effort and different techniques, and the generalization from conditional empirical processes to conditional U-processes is highly nontrivial.
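To fix ideas, the following sketch (Python with numpy; the choice \(\varphi(y_1, y_2) = y_1 y_2\), the Epanechnikov kernel, and all numerical values are illustrative assumptions, not taken from the paper) implements the estimator (1.1) for \(m = 2\) with scalar covariates:

```python
import numpy as np

def epanechnikov(u):
    """One-dimensional Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def conditional_u_stat(X, Y, t, h, phi=lambda y1, y2: y1 * y2):
    """Stute-type conditional U-statistic (1.1) of order m = 2, scalar X:
    a ratio of kernel-weighted sums of phi(Y_i, Y_j) over pairs i != j."""
    w1 = epanechnikov((t[0] - X) / h)
    w2 = epanechnikov((t[1] - X) / h)
    num = den = 0.0
    n = len(X)
    for i in range(n):
        for j in range(n):
            if i != j:
                w = w1[i] * w2[j]
                num += phi(Y[i], Y[j]) * w
                den += w
    return num / den if den > 0 else np.nan

# Toy check: Y = X + small noise, so E[Y_i Y_j | X_i = 1, X_j = 2] is about 2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 3, 400)
Y = X + 0.1 * rng.standard_normal(400)
print(conditional_u_stat(X, Y, t=(1.0, 2.0), h=0.3))
```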

The prevalent practice of assuming stationarity in time series modeling has prompted the development of various models, techniques, research, and methodologies. However, this assumption may not always be suitable for spatio-temporal data, even with detrending and deseasonalization. Many pivotal time series models exhibit nonstationarity, observed in diverse physical phenomena and economic data, rendering classical methods ineffective. To address this challenge, the concept of the locally stationary random process was introduced by [161]. This type of process approximates a non-stationary process by a stationary one locally over short periods. The intuitive concept of local stationarity is also explored in the works of [57,58,142,149,153], among others. The groundbreaking work of [57] notably serves as a robust foundation for the inference of locally stationary processes. In addition to generalizing stationary processes, this innovative approach accommodates time-varying parameters. Over the past decade, the theory of empirical processes for locally stationary time series has garnered significant attention. Empirical process theory plays a crucial role in addressing statistical problems and has expanded into time series analysis and regression estimation. Relevant references in this context include [59,179] and more recent contributions such as [134,147]. The extension of the previously discussed exploration to conditional empirical U-processes bears significant interest from both practical and theoretical standpoints. We specifically delve into the domain of conditional U-processes indexed by a class of functions within the framework of functional data. Building upon insights from [8], functional data analysis (FDA) emerges as a statistical field dedicated to analyzing infinite-dimensional variables such as curves, sets, and images. Experiencing remarkable growth over the past two decades, FDA has become a crucial area of investigation in data science, fueled by advancements in data collection technology during the "Big Data" revolution. For an introduction to FDA, readers can refer to the books by [78,151], providing fundamental analysis methods and case studies across various domains like criminology, economics, archaeology, and neurophysiology. Notably, the extension of probability theory to random variables taking values in normed spaces predates recent literature on functional data, with foundational knowledge available in [9,87]. In the context of regression estimation and nonparametric models for data in normed vector spaces, valuable references include [78,136], along with additional contributions from [32,102,126]. Modern empirical process theory has been applied to functional data, as demonstrated by [82], who established uniform consistency rates for functionals of the conditional distribution, including the regression function, conditional cumulative distribution, and conditional density. [109] extended this by providing consistency rates for various functional nonparametric models, uniformly in bandwidth (UIB consistency). Recent advancements in this field can be explored through references such as [3,40,42,49,68,69,137]. This strongly motivates the consideration of regression models that offer dimension reduction. Single index models are widely used to achieve this by assuming that the predictors' influence on the response can be simplified to a single index.
This index represents a projection in a specified direction and is combined with a nonparametric link function, simplifying the predictors to a one-dimensional index while still incorporating important characteristics. Moreover, because the nonparametric link function operates only on a one-dimensional index, these models do not suffer from the curse of dimensionality. The single index model extends the concept of linear regression, in which the link function is the identity; for further details, interested readers can refer to [23,91,123,138,171,183].

    Recent progress in functional data analysis underscores the need for developing models to address the challenges of dimensionality reduction (refer to [90,126] for recent surveys, and also [4,5,40,41,165]). In response to this, semiparametric approaches emerge as promising solutions. The functional single-index model (FSIM) has gained attention in this context, with exploration by [2,16,79]. Furthermore, [106] proposed functional single-index composite quantile regression, estimating the unknown slope function and link function through B-spline basis functions. A functional single index model with coefficient functions restricted to a subregion was introduced by [143]. The estimation of a general functional single index model, where the conditional distribution depends on the functional predictor via a single index structure, was investigated by [188]. Innovatively, [173] developed a new estimation method that combines functional principal component analysis, B-spline modeling, and profile estimation for parameters and functions. Addressing the estimation of the functional single index regression model with missing responses for strong mixing time series data, [127,128] made valuable contributions. [76] introduced a functional single-index varying coefficient model with the functional predictor as the single-index part. Utilizing functional principal component analysis and basis function approximation, they obtained estimators for slope and coefficient functions, proposing an iterative estimating procedure. An automatic and location-adaptive procedure for estimating regression in an FSIM based on k-Nearest Neighbors (kNN) principles was presented by [145]. Motivated by imaging data analysis, [121] proposed a novel functional varying-coefficient single-index model for regression analysis of functional response data on a set of covariates. Investigating a functional Hilbertian regressor for nonparametric estimation of the conditional cumulative distribution with a scalar response variable in a single index structure, [15] made notable contributions. An alternative approach was introduced by [52], extending to the multi-index case without anchoring the true parameter on a prespecified sieve. Their detailed theoretical analysis of a direct kernel-based estimation scheme establishes a polynomial convergence rate.

    The primary objective of this paper is to scrutinize a comprehensive framework for the single-index conditional U-process of any fixed order, indexed by a class of functions within a nonparametric context. Specifically, we explore the conditional U-process in the realm of functional covariates, considering the potential non-stationary nature of functional time series. The main aim of this study is to offer an initial and comprehensive theoretical examination within this specific context. To achieve this, we skillfully apply large sample theory techniques developed for empirical processes and U-empirical processes. This paper meticulously tackles various technical hurdles. Initially, it delves into the nonlinear expansion of the single index concept and conditional U-statistics. Subsequently, it addresses the extension of the Hoeffding decomposition to non-stationary time series. Finally, it confronts the complexity stemming from the unbounded function class, leading to extensive and intricate proofs.

The manuscript's organization is structured as follows. Section 2 provides a detailed exposition of our theoretical framework, elucidating essential definitions and contextual explanations while introducing technical assumptions. Our principal findings are presented in Sections 3 and 4. Specifically, Section 3 unveils convergence rate results, reintroducing the pivotal Hoeffding decomposition technique. Section 4 is devoted to our results on weak convergence. Section 5 accentuates selected applications. In Section 6, we explore bandwidth selection methodologies utilizing cross-validation procedures. Concluding reflections are encapsulated in Section 7. The comprehensive proofs are furnished in Section 8. Lastly, the Appendix provides technical properties and lemmas for easy reference.

In this document, the notation \(a_n \lesssim b_n\) is employed to signify the existence of a constant \(C\), which is independent of \(n\) and may vary between lines, unless explicitly specified otherwise. This constant is such that \(a_n \le C b_n\) holds for all \(n\). Additionally, the notation \(a_n \ll b_n\) indicates that \(a_n / b_n \to 0\) as \(n \to \infty\). When both \(a_n \lesssim b_n\) and \(b_n \lesssim a_n\) hold, their equivalence is denoted as \(a_n \asymp b_n\). Moreover, \((i_1, \ldots, i_m)\) is denoted as \(\mathbf{i}\), and \((i_1/n, \ldots, i_m/n)\) as \(\mathbf{i}/n\). For any \(c, d \in \mathbb{R}\), the expressions \(c \vee d = \max\{c, d\}\) and \(c \wedge d = \min\{c, d\}\) are employed. The notation \(\lfloor a \rfloor\) signifies the integer part of a number \(a\). Additionally, for \(m < n\), where \(m\) and \(n\) are positive integers, \(C^m_n = \frac{n!}{(n-m)!\, m!}\) is defined. The set \(I^m_n\) is introduced as

\[ I^m_n := \left\{ \mathbf{i} = (i_1, \ldots, i_m) : 1 \le i_j \le n \ \text{and} \ i_j \ne i_r \ \text{if} \ j \ne r \right\}, \]

comprising all \(m\)-tuples of distinct integers between \(1\) and \(n\).

Consider the stochastic processes \(\{Y_{i,n}, X_{i,n}\}_{i=1}^{n}\), where \(Y_{i,n}\) takes values in a space \(\mathcal{Y}\), and \(X_{i,n}\) is in an abstract space \(\mathcal{H}\). We assume \(\mathcal{H}\) is a semi-metric vector space with a semi-metric \(d(\cdot,\cdot)\)* which, in most applications, would be a Hilbert or Banach space. We consider the semi-metric \(d_\theta(\cdot,\cdot)\) associated with the single index \(\theta \in \mathcal{H}\), defined as

*A semi-metric (or pseudo-metric) \(d(\cdot,\cdot)\) is a metric allowing \(d(x_1, x_2) = 0\) for some \(x_1 \ne x_2\).

\[ d_\theta(u, v) := |\langle \theta, u - v \rangle|, \quad \text{for } u, v \in \mathcal{H}. \]

Consider any function \(\varphi(\cdot)\) of \(m\) variables (the \(U\)-kernel) such that \(\varphi(Y_1, \ldots, Y_m)\) is integrable. For \(\mathbf{x} = (x_1, \ldots, x_m) \in \mathcal{H}^m\) and \(\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) \in \Theta^m \subset \mathcal{H}^m\), define the regression functional parameter as

\[ r^{(m)}\!\left( \varphi, \tfrac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta} \right) := \mathbb{E}\left( \varphi(Y_1, \ldots, Y_m) \mid \langle X_1, \theta_1 \rangle = \langle x_1, \theta_1 \rangle, \ldots, \langle X_m, \theta_m \rangle = \langle x_m, \theta_m \rangle \right) =: \mathbb{E}\left( \varphi(\mathbf{Y}_{\mathbf{i}}) \mid \langle \mathbf{X}_{\mathbf{i}}, \boldsymbol{\theta} \rangle = \langle \mathbf{x}, \boldsymbol{\theta} \rangle \right), \quad \mathbf{i} = (1, \ldots, m). \tag{2.1} \]

    In this study, we consider the following model:

\[ \varphi(\mathbf{Y}_{\mathbf{i},n}) = r^{(m)}\!\left( \varphi, \tfrac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta} \right) + \sigma\!\left( \tfrac{\mathbf{i}}{n}, \mathbf{X}_{\mathbf{i},n} \right) \varepsilon_{\mathbf{i}}, \quad \mathbf{i} = (i_1, \ldots, i_m), \ 1 \le i_j \le n, \tag{2.2} \]

where \(\{\varepsilon_{\mathbf{i}}\}_{\mathbf{i} \in I^m_n}\) is a sequence of univariate independent and identically distributed random variables, independent of \(\{X_{i,n}\}_{i=1}^{n}\). We denote \(\sigma(\tfrac{\mathbf{i}}{n}, \mathbf{X}_{\mathbf{i},n}) \varepsilon_{\mathbf{i}}\) as \(\varepsilon_{\mathbf{i},n}\). Furthermore, we assume that the process is a locally stationary functional time series. In a heuristic sense, a process \(\{X_{i,n}\}\) is considered locally stationary if it displays approximately stationary behavior locally in time. The regression function \(r^{(m)}(\varphi, \cdot, \mathbf{x}, \boldsymbol{\theta})\) is allowed to change smoothly over time, depending on a rescaled quantity \(\mathbf{i}/n\) rather than on the specific point \(\mathbf{i}\) (where \(i\) typically represents time in a time series framework).

We delve into the exploration of non-stationary processes characterized by dynamics evolving gradually over time, manifesting behaviors akin to stationarity at a local level. This conceptual realm has undergone comprehensive scrutiny, exemplified by references such as [57,105,115,131,141,185]. For illustration, consider a continuous function \(a : [0,1] \to \mathbb{R}\) and a sequence of i.i.d. random variables \((\varepsilon_i)_{i \in \mathbb{N}}\). The stochastic process \(X_{i,n} = a(i/n) + \varepsilon_i\), where \(i \in \{1, \ldots, n\}\) and \(n \in \mathbb{N}\), can exhibit "almost" stationary behavior for \(i\) close to a specific point \(i_0 \in \{1, \ldots, n\}\), given that \(a(i/n) \approx a(i_0/n)\). However, this process is not strictly weakly stationary. To capture this type of gradual change, the concept of local stationarity was introduced by [57], wherein the spectral representation of the underlying stochastic process is locally approximated. In our framework, the process \(\{X_{i,n}\}\) can be stochastically approximated by a stationary process \(\{X_i^{(u)}\}\) around each rescaled time point \(u\), specifically for those values of \(i\) where \(i/n - u\) is small. Since our focus is on functional data, we define a functional time series as locally stationary if it can be locally approximated by a stationary functional time series. We will provide a standard definition of local stationarity.
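Before stating the definition, a minimal simulation sketch of the toy example above (Python/numpy; the trend \(a(u) = \sin(2\pi u)\) is an arbitrary illustrative choice) makes the local approximation explicit:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
a = lambda u: np.sin(2 * np.pi * u)      # smooth (Lipschitz) trend a: [0,1] -> R
eps = rng.standard_normal(n)
i = np.arange(1, n + 1)
X = a(i / n) + eps                        # the non-stationary process X_{i,n}

# Stationary approximation at rescaled time u: X_i^{(u)} = a(u) + eps_i, so
# |X_{i,n} - X_i^{(u)}| = |a(i/n) - a(u)| <= 2*pi*|i/n - u|, matching (2.3)
# with a deterministic bound in place of U_{i,n}^{(u)}.
u = 0.5
X_u = a(u) + eps
near_u = np.abs(i / n - u) <= 0.05
print(np.max(np.abs(X - X_u)[near_u]))   # small in the window around u
```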

Definition 2.1 (local stationarity). For a sequence of stochastic processes, indexed by \(n \in \mathbb{N}\) and taking values in \(\mathcal{H}\), denoted as \(\{X_{i,n}\}\), it is deemed locally stationary if, for all rescaled times \(u \in [0,1]\), there exists an associated \(\mathcal{H}\)-valued process \(\{X_i^{(u)}\}\) that is strictly stationary. This association is characterized by the inequality:

\[ d_\theta\!\left( X_{i,n}, X_i^{(u)} \right) \le \left( \left| \tfrac{i}{n} - u \right| + \tfrac{1}{n} \right) U_{i,n}^{(u)} \quad \text{a.s.} \tag{2.3} \]

This holds for all \(1 \le i \le n\), where \(\{U_{i,n}^{(u)}\}\) is a positive-valued process satisfying \(\mathbb{E}[(U_{i,n}^{(u)})^\rho] < C\) for some \(\rho > 0\) and \(C < \infty\). These conditions are independent of \(u\), \(i\), and \(n\).

Definition 2.1 represents a natural extension of the concept of local stationarity for real-valued time series introduced in [57]. In a more specific context, [176] and [175] elaborate on this definition, considering \(\mathcal{H}\) as the Hilbert space \(L^2_{\mathbb{R}}([0,1])\) of all real-valued functions that are square-integrable with respect to the Lebesgue measure on the interval \([0,1]\), equipped with the inner product and induced \(L^2\)-norm:

\[ \|f\|_2 = \sqrt{\langle f, f \rangle}, \qquad \langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt, \]

where \(f, g \in L^2_{\mathbb{R}}([0,1])\). These authors also provide sufficient conditions for an \(L^2_{\mathbb{R}}([0,1])\)-valued stochastic process \(\{X_{i,n}\}\) to satisfy (2.3) with \(d(f,g) = \|f - g\|_2\) and \(\rho = 2\). Additionally, they define \(L^p_E(T, \mu)\) as the Banach space of all strongly measurable functions \(f : T \to E\) with finite norm:

\[ \|f\|_p = \|f\|_{L^p_E(T,\mu)} = \left( \int \|f(\tau)\|_E^p \, d\mu(\tau) \right)^{1/p}, \]

for \(1 \le p < \infty\), and with finite norm

\[ \|f\|_\infty = \|f\|_{L^\infty_E(T,\mu)} = \inf_{\mu(N) = 0} \sup_{\tau \in T \setminus N} \|f(\tau)\|_E, \]

for \(p = \infty\). In this context, \(\mathcal{H}_{\mathbb{C}} = L^2_{\mathbb{C}}([0,1])\).

Remark 2.2. [176] generalizes the definition of locally stationary processes, initially proposed by [56], to the functional setting in the frequency domain. This extension is made under the following assumptions:

(A1) (i) \(\{\varepsilon_i\}_{i \in \mathbb{Z}}\) is a weakly stationary white noise process taking values in \(\mathcal{H}\) with a spectral representation given by

\[ \varepsilon_j = \int_{-\pi}^{\pi} e^{i\omega j} \, dZ_\omega, \]

where \(Z_\omega\) is a \(2\pi\)-periodic orthogonal increment process taking values in \(\mathcal{H}_{\mathbb{C}}\);

(ii) the functional process \(X_{i,n}\), with \(i = 1, \ldots, n\) and \(n \in \mathbb{N}\), is given by

\[ X_{j,n} = \int_{-\pi}^{\pi} e^{i\omega j} A^{(n)}_{j,\omega} \, dZ_\omega \quad \text{a.e. in } \mathcal{H}, \]

with the transfer operator \(A^{(n)}_{j,\omega} \in S_p(\mathcal{H}_{\mathbb{C}})\) and an orthogonal increment process \(Z_\omega\).

(A2) There exists a mapping \(A : [0,1] \times [-\pi, \pi] \to S_p(\mathcal{H}_{\mathbb{C}})\), \((u, \omega) \mapsto A_{u,\omega}\), with \(A_{u,\omega}\) continuous in \(u\), such that for all \(n \in \mathbb{N}\),

\[ \sup_{\omega, t} \left\| A^{(n)}_{t,\omega} - A_{\frac{t}{n},\omega} \right\|_p = O\!\left( \frac{1}{n} \right). \]

    They have proved in [176, Proposition 2.2] that:

Proposition 2.3. Suppose that assumptions (A1) and (A2) hold. Then, \(\{X_{i,n}\}\) is a locally stationary process in \(\mathcal{H}\).

As illustrated above, we are concerned with non-stationary processes whose dynamics evolve gradually over time and which behave approximately stationarily at a local level. The following result, building upon the foundation laid out in [176], gives sufficient conditions under which functional autoregressive processes with time-varying operators satisfy (A2) and are, therefore, locally stationary.

Theorem 2.4. Consider a white noise process \(\{\varepsilon_i\}_{i \in \mathbb{Z}}\) in \(L^2_{\mathcal{H}}(\Omega, \mathbb{P})\), where \(\mathcal{H} = L^2([0,1])\), and let \(\{X_{i,n}\}\) be a sequence of functional autoregressive processes defined as

\[ \sum_{j=0}^{m} B_{\frac{i}{n},j}\left( X_{i-j,n} \right) = C_{\frac{i}{n}}(\varepsilon_i), \]

with \(B_{u,j} = B_{0,j}\), \(C_u = C_0\) for \(u < 0\), and \(B_{u,j} = B_{1,j}\), \(C_u = C_1\) for \(u > 1\). If the process satisfies, for all \(u \in [0,1]\) and \(p = 2\) or \(p = \infty\), the conditions

(i) \(C_u\) is an invertible element of \(S_\infty(\mathcal{H})\);

(ii) \(B_{u,j} \in S_p(\mathcal{H})\) for \(j = 1, \ldots, m\), with \(\sum_{j=1}^{m} \|B_{u,j}\|_{\mathcal{L}} < 1\) and \(B_{u,0} = I_{\mathcal{H}}\);

(iii) the mappings \(u \mapsto B_{u,j}\) for \(j = 1, \ldots, m\) and \(u \mapsto C_u\) are continuous in \(u \in [0,1]\) and differentiable on \(u \in (0,1)\) with bounded derivatives,

then the process \(\{X_{i,n}\}\) satisfies (A2) with

\[ A^{(n)}_{\frac{i}{n},\omega} = \frac{1}{\sqrt{2\pi}} \left( \sum_{j=0}^{m} e^{-i\omega j} B_{\frac{i}{n},j} \right)^{-1} C_{\frac{i}{n}}, \]

    and, consequently, is locally stationary.
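A hedged simulation sketch of a time-varying functional AR(1) of this type (Python/numpy; the integral kernel of \(B_{u,1}\), the choice \(C_u = I\), and the Brownian-bridge noise are illustrative assumptions) is:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 50, 300
s = np.linspace(0, 1, T)
dt = s[1] - s[0]

def B_kernel(u):
    """Integral kernel of the time-varying operator B_{u,1}; the scaling keeps
    its operator norm below 1, as required by condition (ii) above."""
    return 0.4 * (1 + 0.5 * np.sin(2 * np.pi * u)) * np.outer(
        np.sin(np.pi * s), np.sin(np.pi * s))

def noise_curve():
    """Crude functional white noise: a Brownian bridge on the grid."""
    w = np.cumsum(rng.standard_normal(T)) * np.sqrt(dt)
    return w - s * w[-1]

# tvFAR(1): X_{i,n} = B_{i/n,1}(X_{i-1,n}) + eps_i, with C_u = identity.
X = np.zeros((n, T))
for i in range(1, n):
    X[i] = B_kernel(i / n) @ X[i - 1] * dt + noise_curve()
```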

Handling infinite-dimensional spaces presents a notable technical hurdle due to the absence of a universal reference measure, such as the Lebesgue measure. Consequently, defining a density function for the functional variable becomes elusive. To surmount this challenge, we leverage the concept of "small-ball probability". In particular, we address the concentration of the probability measure for the functional variable within a small ball using the function \(\phi_x(\cdot)\). For a fixed \(x \in \mathcal{H}\) and every \(r > 0\), the function \(\phi_{x,\theta}(r)\) is defined as:

\[ \mathbb{P}\left( X \in B_\theta(x, r) \right) =: \phi_{x,\theta}(r) > 0. \tag{2.4} \]

Here, \(\mathcal{H}\) is equipped with the semi-metric \(d(\cdot,\cdot)\), and \(B_\theta(x, r)\) represents a ball in \(\mathcal{H}\) with center \(x \in \mathcal{H}\) and radius \(r\). For \(\mathbf{x} = (x_1, \ldots, x_m) \in \mathcal{H}^m\) and \(\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) \in \Theta^m\), we define

\[ \phi_{\mathbf{x},\boldsymbol{\theta}}(r) = \prod_{i=1}^{m} \phi_{x_i,\theta_i}(r). \]

    Further elucidation and examples concerning small-ball probability can be explored in Remark 3.6.

Statistical observations commonly exhibit a certain degree of dependence rather than complete independence. The concept of mixing serves as a quantitative measure of the proximity of a sequence of random variables to independence, facilitating the extension of traditional results applicable to independent sequences to sequences that are weakly dependent or mixing. The development of the theory of mixing conditions has emerged from the recognition that time series manifest "asymptotic independence" properties, thereby facilitating their analysis and statistical inference. Consider a probability space \((\Omega, \mathcal{F}, \mathbb{P})\), and let \(Z_{1,n}, Z_{2,n}, \ldots\) be a sequence of random variables defined on it. For an array \(\{Z_{i,n} : 1 \le i \le n\}\), the coefficients are defined as

\[ \beta(k) = \sup_{i,n : 1 \le i \le n-k} \beta\left( \sigma(Z_{s,n}, 1 \le s \le i),\ \sigma(Z_{s,n}, i + k \le s \le n) \right), \]

where \(\sigma(Z)\) represents the \(\sigma\)-field generated by \(Z\). The array \(\{Z_{i,n}\}\) is considered \(\beta\)-mixing if \(\beta(k) \to 0\) as \(k \to \infty\). It is crucial to note that \(\beta\)-mixing implies \(\alpha\)-mixing. Throughout the ensuing discussion, we assume that the sequence of random elements \(\{(X_{i,n}, Y_{i,n}), i = 1, \ldots, n;\ n \ge 1\}\) is absolutely regular. Remarkably, Markov chains exhibit \(\beta\)-mixing under the milder Harris recurrence condition, provided the underlying space is finite [62]. Additional rationale for favoring absolutely regular processes over strongly mixing processes is provided in the concluding remarks (Section 7).

    We endeavor to estimate the regression function as denoted in (2.1). The kernel estimator is formally defined as

\[ \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) = \frac{\displaystyle\sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\} \varphi(\mathbf{Y}_{\mathbf{i},n})}{\displaystyle\sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\}}, \tag{2.5} \]

where \(K_1(\cdot)\) and \(K_2(\cdot)\) denote one-dimensional kernel functions. Here, let \(h = h_n\) be a bandwidth with the property that \(h \to 0\) as \(n \to \infty\). The function \(\varphi : \mathcal{Y}^m \to \mathbb{R}\) is symmetric and measurable, belonging to a class of functions denoted as \(\mathcal{F}_m\). Importantly, this estimator is a conditional U-statistic utilizing the sequence of random variables \(\{Y_{i,n}, X_{i,n}\}_{i=1}^n\) and the kernel \(\varphi \times K_1 \times K_2\). The introduction of such statistics was pioneered by [166]. To investigate the weak convergence of the conditional empirical process and the conditional U-process within the functional data framework, we introduce some necessary notations. Consider the class

\[ \mathcal{F}_m = \{ \varphi : \mathcal{Y}^m \to \mathbb{R} \}, \]

which consists of real-valued symmetric measurable functions on \(\mathcal{Y}^m\) with a measurable envelope function \(F\):

\[ F(\mathbf{y}) \ge \sup_{\varphi \in \mathcal{F}_m} |\varphi(\mathbf{y})|, \quad \text{for } \mathbf{y} \in \mathcal{Y}^m. \tag{2.6} \]

For kernel functions \(K_1(\cdot)\) and \(K_2(\cdot)\), as well as a subset \(S_{\mathcal{H}} \subset \mathcal{H}\), we define the pointwise measurable class of functions for \(1 \le m \le n\) and \(\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)\):

\[ \mathcal{K}^m_{\boldsymbol{\theta}} := \left\{ (v_1, \chi_1, \ldots, v_m, \chi_m) \mapsto \prod_{i=1}^{m} K_1\!\left( \frac{u_i - v_i}{h_n} \right) K_2\!\left( \frac{d_{\theta_i}(x_i, \chi_i)}{h_n} \right),\ (\mathbf{x}, \mathbf{u}) \in \mathcal{H}^m \times [0,1]^m \right\} \]

    and

\[ \mathcal{K}^m_{\Theta} := \bigcup_{\boldsymbol{\theta} \in \Theta^m} \left\{ (v_1, \chi_1, \ldots, v_m, \chi_m) \mapsto \prod_{i=1}^{m} K_1\!\left( \frac{u_i - v_i}{h_n} \right) K_2\!\left( \frac{d_{\theta_i}(x_i, \chi_i)}{h_n} \right),\ (\mathbf{x}, \mathbf{u}) \in \mathcal{H}^m \times [0,1]^m \right\}. \]

The conditional U-process indexed by \(\mathcal{F}_m \mathcal{K}^m_\Theta\) is defined by

\[ \left\{ G_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) := \sqrt{n h^m \phi^{1/m}_{\mathbf{x},\boldsymbol{\theta}}(h_n)} \left( \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) \right\}_{\mathcal{F}_m \mathcal{K}^m_\Theta}. \tag{2.7} \]

Point-wise measurability is important in our context: it enables us to state our results conventionally, adhering to the classical definition of probability, without invoking the abstract notions of outer probability or outer expectation [178].

    Remark 2.5. The bandwidth h remains consistent across all directions, simplifying the analysis in cases involving product kernels. Nevertheless, the results can be readily adapted to scenarios with non-product kernels and varying bandwidths.

Remark 2.6. Our estimator differs from the conventional conditional U-statistics not only in the nature of the sequence \(\{X_{i,n}\}\) but also in the inclusion of a kernel in the time direction. As a result, we attain smoothness from both the covariate direction (\(X_{i,n}\)) and the temporal dimension, allowing us to capture the characteristics of a regression model evolving over time.
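For concreteness, the following sketch (Python/numpy; all helper names, the kernels, and the toy data are illustrative assumptions) evaluates the estimator (2.5) in the special case \(m = 1\), combining the time kernel \(K_1\), the covariate kernel \(K_2\), and the single-index semi-metric \(d_\theta\) on discretized curves:

```python
import numpy as np

def K1(v):
    """Symmetric, compactly supported time kernel (Epanechnikov)."""
    return 0.75 * (1 - v**2) * (np.abs(v) <= 1)

def K2(v):
    """Asymmetrical triangular kernel on [0, 1], cf. Assumption 2 ii)."""
    return (1 - v) * ((v >= 0) & (v <= 1))

def d_theta(theta, x, chi, dt):
    """Single-index semi-metric d_theta(x, chi) = |<theta, x - chi>| on a grid."""
    return np.abs(np.sum(theta * (x - chi)) * dt)

def r_hat(u, x, theta, h, X, Y, dt):
    """Estimator (2.5) with m = 1: one kernel in time, one in the covariate."""
    n = len(Y)
    idx = np.arange(1, n + 1)
    w = K1((u - idx / n) / h) * np.array(
        [K2(d_theta(theta, x, X[k], dt) / h) for k in range(n)])
    return np.sum(w * Y) / np.sum(w) if np.sum(w) > 0 else np.nan

# Toy usage: rough random curves, scalar response Y_i = <1, X_i> + noise.
rng = np.random.default_rng(8)
T, n = 50, 300
grid = np.linspace(0, 1, T)
dt = grid[1] - grid[0]
X = rng.standard_normal((n, T)).cumsum(axis=1) * np.sqrt(dt)
Y = np.sum(X, axis=1) * dt + 0.1 * rng.standard_normal(n)
print(r_hat(0.5, X[0], np.ones(T), 0.2, X, Y, dt))
```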

    Examining functional data through asymptotic methods involves delving into concentration properties elucidated by the concept of small-ball probability. When scrutinizing a process characterized by a set of functions, it becomes imperative to consider additional topological concepts, including metric entropy and VC-subgraph classes (referred to as "VC", inspired by Vapnik and Cervonenkis).

Definition 2.7. Let \(S_E\) be a subset of a semi-metric space \(E\). A finite set of points \(\{e_1, \ldots, e_N\} \subset E\) is considered an \(\varepsilon\)-net of \(S_E\) for a given \(\varepsilon > 0\) if:

\[ S_E \subseteq \bigcup_{j=1}^{N} B(e_j, \varepsilon). \]

If \(N_\varepsilon(S_E)\) is the cardinality of the smallest \(\varepsilon\)-net (i.e., the minimal number of open balls of radius \(\varepsilon\)) in \(E\) needed to cover \(S_E\), then the Kolmogorov entropy (metric entropy) of the set \(S_E\) is defined as the quantity:

\[ \psi_{S_E}(\varepsilon) := \log N_\varepsilon(S_E). \]
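A greedy construction gives a computable handle on this quantity: a maximal \(\varepsilon\)-separated subset is an \(\varepsilon\)-net, so the logarithm of its cardinality bounds \(\psi_{S_E}(\varepsilon)\) up to the usual covering/packing comparison. A sketch (Python/numpy; the curve family and the \(L^2\) semi-metric are illustrative assumptions):

```python
import numpy as np

def greedy_eps_net(points, eps, dist):
    """Greedy selection: the result is eps-separated and covers the input set,
    so log(len(net)) bounds the metric entropy up to the covering/packing gap."""
    net = []
    for p in points:
        if all(dist(p, q) > eps for q in net):
            net.append(p)
    return net

rng = np.random.default_rng(3)
grid = np.linspace(0, 1, 100)
du = grid[1] - grid[0]
curves = [np.sin(2 * np.pi * f * grid) for f in rng.uniform(0.5, 3.0, 200)]
d = lambda f, g: np.sqrt(np.sum((f - g) ** 2) * du)   # L2 semi-metric
net = greedy_eps_net(curves, eps=0.5, dist=d)
print(len(net), np.log(len(net)))   # covering-number and entropy bounds
```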

Kolmogorov introduced the concept of metric entropy, extensively explored in various metric spaces, as indicated by its name (cf. [112]). Dudley ([70]) utilized this concept to establish sufficient conditions for the continuity of Gaussian processes, forming the foundation for significant generalizations of Donsker's theorem regarding the weak convergence of the empirical process. Consider two subsets, \(B_{\mathcal{H}}\) and \(S_{\mathcal{H}}\), in the semi-metric space \(\mathcal{H}\) with Kolmogorov entropy (for radius \(\varepsilon\)) denoted as \(\psi_{B_{\mathcal{H}}}(\varepsilon)\) and \(\psi_{S_{\mathcal{H}}}(\varepsilon)\), respectively. The Kolmogorov entropy for the subset \(B_{\mathcal{H}} \times S_{\mathcal{H}}\) of the semi-metric space \(\mathcal{H}^2\) is given by:

\[ \psi_{B_{\mathcal{H}} \times S_{\mathcal{H}}}(\varepsilon) = \psi_{B_{\mathcal{H}}}(\varepsilon) + \psi_{S_{\mathcal{H}}}(\varepsilon). \]

Thus, \(m\, \psi_{S_{\mathcal{H}}}(\varepsilon)\) represents the Kolmogorov entropy of the subset \(S^m_{\mathcal{H}}\) in the semi-metric space \(\mathcal{H}^m\). If \(d\) denotes the semi-metric on \(\mathcal{H}\), a semi-metric on \(\mathcal{H}^m\) can be defined as:

\[ d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z}) := \frac{1}{m}\, d_{\theta_1}(x_1, z_1) + \cdots + \frac{1}{m}\, d_{\theta_m}(x_m, z_m), \tag{2.8} \]

for \(\mathbf{x} = (x_1, \ldots, x_m)\), \(\mathbf{z} = (z_1, \ldots, z_m) \in \mathcal{H}^m\). The choice of the semi-metric is crucial in this type of analysis, and readers can find insightful discussions on this topic in [78, Chapters 3 and 11]. Furthermore, in this context, we must also address another topological concept: VC-subgraph classes.

Definition 2.8. A class of subsets \(\mathcal{C}\) on a set \(C\) is called a VC-class if there exists a polynomial \(P(\cdot)\) such that, for every set of \(N\) points in \(C\), the class \(\mathcal{C}\) picks out at most \(P(N)\) distinct subsets.

Definition 2.9. A class of functions \(\mathcal{F}\) is called a VC-subgraph class if the graphs of the functions in \(\mathcal{F}\) form a VC-class of sets. In other words, if we define the subgraph of a real-valued function \(f\) on \(S\) as the following subset \(G_f\) of \(S \times \mathbb{R}\):

\[ G_f = \{ (s, t) : 0 \le t \le f(s) \ \text{or} \ f(s) \le t \le 0 \}, \]

the class \(\{G_f : f \in \mathcal{F}\}\) is a VC-class of sets on \(S \times \mathbb{R}\). Informally, a VC-class of functions is characterized by having a polynomial covering number (the minimal number of functions required to cover the entire class).

A VC-class of functions \(\mathcal{F}\) with an envelope function \(F\) has the following entropy property: for a given \(1 \le q < \infty\), there exist constants \(a\) and \(b\) such that

\[ N(\epsilon, \mathcal{F}, \|\cdot\|_{L_q(Q)}) \le a \left( \frac{(Q F^q)^{1/q}}{\epsilon} \right)^{b}, \tag{2.9} \]

for any \(\epsilon > 0\) and each probability measure \(Q\) such that \(Q F^q < \infty\). Several references provide sufficient conditions under which (2.9) holds, such as [144, Lemma 22], [72, §4.7], [178, Theorem 2.6.7], [114, §9.1], [65, §3.2], and [33,34,45], offering further discussions.

    For the reader's convenience, we have compiled the essential assumptions as follows:

    Assumption 1. [Model and distribution assumptions]

i) The process \(\{X_{i,n}\}\) is locally stationary and satisfies that for each time point \(u \in [0,1]\), there exists a stationary process \(\{X_i^{(u)}\}\) such that

\[ d_{\theta_i}\!\left( X_{i,n}, X_i^{(u)} \right) \le \left( \left| \tfrac{i}{n} - u \right| + \tfrac{1}{n} \right) U_{i,n}^{(u)} \quad \text{a.s.}, \]

with \(\mathbb{E}[(U_{i,n}^{(u)})^\rho] < C\) for some \(\rho > 0\) and \(C < \infty\).

ii) Let \(B(x, h)\) be a ball centered at \(x \in \mathcal{H}\) with radius \(h\), defined in Section 2.4, and let \(c_d < C_d\) be positive constants. For all \(\mathbf{u} \in [0,1]^m\),

\[ 0 < c_d\, \phi^m(h_n)\, f_1(\mathbf{x}) \le \mathbb{P}\left( \left( X_{i_1}^{(u_1)}, \ldots, X_{i_m}^{(u_m)} \right) \in B_{\boldsymbol{\theta}}(\mathbf{x}, h) \right) =: F_{\mathbf{u},\boldsymbol{\theta}}(h; \mathbf{x}) \le C_d\, \phi^m(h_n)\, f_1(\mathbf{x}), \tag{2.10} \]

where \(\phi(0) = 0\), \(\phi(u)\) is absolutely continuous in a neighborhood of the origin, \(f_1(\mathbf{x})\) is a non-negative functional in \(\mathbf{x} \in \mathcal{H}^m\), and

\[ B_{\boldsymbol{\theta}}(\mathbf{x}, h) = \prod_{i=1}^{m} B_{\theta_i}(x_i, h). \]

iii) There exist constants \(C_\phi > 0\) and \(\varepsilon_0 > 0\) such that for any \(0 < \varepsilon < \varepsilon_0\),

\[ \int_0^{\varepsilon} \phi(u)\, du > C_\phi\, \varepsilon\, \phi(\varepsilon). \tag{2.11} \]

iv) Let \(\psi(h_n) \to 0\) as \(h \to 0\), and let \(f_2(\mathbf{x})\) be a non-negative functional in \(\mathbf{x} := (x_1, \ldots, x_m) \in \mathcal{H}^m\) such that

\[ \sup_{\mathbf{i} \ne \mathbf{j} \in I^m_n} \mathbb{P}\left( \left( (X_{i_1,n}, \ldots, X_{i_m,n}), (X_{j_1,n}, \ldots, X_{j_m,n}) \right) \in B_{\boldsymbol{\theta}}(\mathbf{x}, h) \times B_{\boldsymbol{\theta}}(\mathbf{x}, h) \right) \le \psi^m(h_n)\, f_2(\mathbf{x}). \]

We will also assume that the ratio \(\psi(h_n) / \phi^2(h_n)\) is bounded.

    Assumption 2. [Kernel assumptions]

i) \(K_1(\cdot)\) is a kernel symmetric around zero, bounded, with compact support, i.e., \(K_1(v) = 0\) for all \(|v| > C_1\) for some \(C_1 < \infty\). Additionally,

\[ \int K_1(z)\, dz = 1, \]

and \(K_1(\cdot)\) is Lipschitz continuous, i.e.,

\[ |K_1(v_1) - K_1(v_2)| \le C_2 |v_1 - v_2| \]

for some \(C_2 < \infty\) and all \(v_1, v_2 \in \mathbb{R}\).

ii) The kernel \(K_2(\cdot)\) is non-negative, bounded, and has compact support in \([0,1]\), such that \(K_2(0) > 0\) and \(K_2(1) = 0\). For instance, \(K_2(\cdot)\) can be taken as the asymmetrical triangular kernel, i.e., \(K_2(x) = (1 - x)\mathbb{1}(x \in [0,1])\). Moreover, \(K_2(\cdot)\) is Lipschitz continuous, i.e.,

\[ |K_2(v_1) - K_2(v_2)| \le C_2 |v_1 - v_2|. \]

Moreover, \(K_2'(v) = dK_2(v)/dv\) exists on \([0,1]\), and for two real constants \(-\infty < C_1' < C_2' < 0\), we have:

\[ C_1' \le K_2'(v) \le C_2'. \]

    Assumption 3. [Smoothness]

i) \(r^{(m)}(\mathbf{u}, \mathbf{x})\) is twice continuously partially differentiable with respect to \(\mathbf{u}\). We also assume that

\[ \sup_{\mathbf{u}_1, \mathbf{u}_2 \in [0,1]^m} \left| r^{(m)}(\mathbf{u}_1, \mathbf{x}, \boldsymbol{\theta}) - r^{(m)}(\mathbf{u}_2, \mathbf{z}, \boldsymbol{\theta}) \right| \le c_m \left( d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z})^{\alpha} + \| \mathbf{u}_1 - \mathbf{u}_2 \|^{\alpha} \right) \tag{2.12} \]

for some \(c_m > 0\), \(\alpha > 0\), and \(\mathbf{x} = (x_1, \ldots, x_m), \mathbf{z} = (z_1, \ldots, z_m) \in \mathcal{H}^m\).

ii) \(\sigma : [0,1] \times \mathcal{H}^m \to \mathbb{R}\) is bounded from above by some constant \(C_\sigma < \infty\) and from below by some constant \(c_\sigma > 0\), that is, for all \(\mathbf{u}\) and \(\mathbf{x}\),

\[ 0 < c_\sigma \le \sigma(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \le C_\sigma < \infty. \]

iii) \(\sigma(\cdot,\cdot,\cdot)\) is Lipschitz continuous with respect to \(\mathbf{u}\).

iv) \(\sup_{\mathbf{u} \in [0,1]^m} \sup_{\mathbf{z} : d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z}) \le \varepsilon} \left| \sigma(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \sigma(\boldsymbol{\theta}, \mathbf{u}, \mathbf{z}) \right| = o(1)\) as \(\varepsilon \to 0\).

Let \(W_{\mathbf{i},\varphi,n}\) be an array of one-dimensional random variables. In this study, this array will be equal to \(W_{\mathbf{i},\varphi,n} = 1\) or \(W_{\mathbf{i},\varphi,n} = \varepsilon_{\mathbf{i},n}\).

    Assumption 4. [Mixing]

i) For \(\zeta > 2\) and \(C < \infty\), we have

\[ \mathbb{E}|W_{\mathbf{i},n}|^{\zeta} \le C, \]

    and

\[ \sup_{\mathbf{x} \in \mathcal{H}^m} \mathbb{E}\left[ |W_{\mathbf{i},n}|^{\zeta} \mid \mathbf{X}_{\mathbf{i},n} = \mathbf{x} \right] \le C. \]

ii) The \(\beta\)-mixing coefficients of the array \(\{X_{i,n}, W_{i,n}\}\) satisfy \(\beta(k) \le A k^{-\gamma}\) for some \(A > 0\) and \(\gamma > 2\). Additionally, we assume that \(\delta + 1 < \gamma\left(1 - \frac{2}{\nu}\right)\) for some \(\nu > 2\) and \(\delta > 1 - \frac{2}{\nu}\), along with the condition

\[ h^{2(1 \wedge \alpha) - 1} \left( \phi(h_n)\, a_n + \sum_{k = a_n}^{\infty} k^{\delta} \left( \beta(k) \right)^{1 - \frac{2}{\nu}} \right) \to 0, \tag{2.13} \]

as \(n \to \infty\), where \(a_n = \left\lfloor (\phi(h_n))^{-(1 - \frac{2}{\nu})/\delta} \right\rfloor\), and for all \(\alpha > 0\).

iii) For some \(\zeta_0 > 0\), as \(n \to \infty\), we have

\[ \frac{(\log n)^{\frac{m+\gamma+1}{2} + \zeta_0 (\gamma+1)}}{n^{\frac{m+\gamma+1}{2} - 1 - \frac{\gamma+1}{\zeta}}\, h_n^{\frac{m+\gamma+1}{2}}\, \phi(h_n)^{\frac{m+\gamma+1}{2}}} \to 0. \]

iv) Both \(n h_n^{2m+1}\) and \(n h^m \phi(h_n)^m\) tend to infinity as \(n\) goes to infinity.

Assumption 5. [Blocking assumptions] There exists a sequence of positive integers \(\{v_n\}\) satisfying \(v_n \to \infty\), \(v_n = o\!\left( \sqrt{n h \phi(h_n)} \right)\), and \(\sqrt{n / (h\, \phi(h_n))}\, \beta(v_n) \to 0\) as \(n \to \infty\).

    Assumption 6. [Class of functions assumptions]

The classes of functions \(\mathcal{K}^m_\Theta\) and \(\mathcal{F}_m\) are such that:

i) The class of functions \(\mathcal{F}_m\) is bounded, and its envelope function satisfies, for some \(0 < M < \infty\):

\[ F(\mathbf{y}) \le M, \quad \mathbf{y} \in \mathcal{Y}^m. \]

ii) The class of functions \(\mathcal{F}_m \mathcal{K}^m_\Theta\) is supposed to be of VC-type with the envelope function previously defined. Hence, there are two finite constants \(b\) and \(\nu\) such that:

\[ N\left( \epsilon, \mathcal{F}_m \mathcal{K}^m_\Theta, \|\cdot\|_{L_2(Q)} \right) \le \left( \frac{b\, \| F \kappa^m \|_{L_2(Q)}}{\epsilon} \right)^{\nu} \]

for any \(\epsilon > 0\) and each probability measure \(Q\) such that \(Q(F \kappa^m)^2 < \infty\).

iii) The class of functions \(\mathcal{F}_m\) is unbounded, and its envelope function satisfies, for some \(\zeta > 2\):

\[ \theta_\zeta := \sup_{\mathbf{x} \in S^m_{\mathcal{H}}} \mathbb{E}\left( F^{\zeta}(\mathbf{Y}) \mid \mathbf{X} = \mathbf{x} \right) < \infty, \quad S^m_{\mathcal{H}} \subset \mathcal{H}^m. \]

iv) The metric entropy of the class \(\mathcal{F}_m \mathcal{K}^m_\Theta\) satisfies, for some \(1 \le \zeta < \infty\):

\[ \int_0^{\infty} \left( \log N(u, \mathcal{F}_m \mathcal{K}^m_\Theta, \|\cdot\|_{\zeta}) \right)^{\frac{1}{2}} du < \infty. \]

To establish the groundwork for our analysis, we draw inspiration from seminal works such as [78,87,116,133,179]. Our assumptions play a pivotal role in shaping the properties of the random processes under consideration. Starting with Assumption 1, we formalize the local stationarity property of \(X_{i,n}\) and introduce conditions related to the distribution behavior of the variables. Equation (2.10) governs the small-ball probability around zero, representing the standard condition for small-ball probability. This equation implies that the small-ball probability can be approximated as the product of two independent functions \(\phi^m(\cdot)\) and \(f_1(\cdot)\). For instance, when \(m = 1\), references such as [135] (diffusion processes), [25] (Gaussian measure), [122] (general Gaussian processes), and [133] (strongly mixing processes) provide context. The function \(\phi(\cdot)\) can take various forms, such as \(\phi(\epsilon) = \epsilon^{\delta} \exp(-C/\epsilon^{a})\) for Ornstein-Uhlenbeck and general diffusion processes. Further examples and discussions can be found in [81] and Remark 3.6. Assumption 1 iv) details the behavior of the joint distribution near the origin, aligning with assumptions made by [87] in the context of density estimation for functional data. Assumption 2 encompasses the kernel assumptions commonly used in nonparametric functional estimation. Notably, the Parzen symmetric kernel is inadequate due to the positivity of the random process \(D_i = d(x, X_i)\); thus, \(K_2(\cdot)\) with support \([0,1]\) is considered. The kernel \(K_2(\cdot)\) is a type II kernel belonging to the family of continuous kernels (triangle, quadratic, etc.). Compact support on \([0,1]\) is assumed for the kernels to derive an expression for the asymptotic variance. The Lipschitz-type assumptions on \(K_2(\cdot)\) and \(\sigma(\cdot,\cdot)\) (Assumption 2 ii) and Assumption 3 iii)) are crucial for obtaining the convergence rate. Assumption 3 restricts the growth of \(r^{(m)}(\cdot)\) and \(\sigma(\cdot)\) and places bounds on these functions to prevent rapid growth outside a large bound. It is tailored to ensure the convergence rate and forms an integral part of the overall analysis. Assumption 4 ii) is a standard mixing condition necessary for establishing asymptotic normality and the asymptotic negligibility of the bias, consistent with [133]. The variables \(W_{\mathbf{i},n}\) are not necessarily bounded, and there is a tradeoff between the decay of the mixing coefficients and the order \(\zeta\) of the moment in Assumption 4 i). Assumptions 4 iii) and iv) are technical conditions crucial for obtaining the desired results, addressing the uniform convergence rate and the bias and convergence rate of the general estimator. Assumption 6 asserts that the class of functions satisfies certain entropy conditions. Parts ii) and iii) are interconnected, with the former stating that the class is bounded. However, in the context of proving the functional central limit theorem for conditional U-processes indexed by an unbounded class of functions, part iii) supersedes the first one. Assumption 6 ii) ensures that \(\mathcal{F}_m \mathcal{K}^m_\Theta\) is of VC type with characteristics \(b\) and \(\nu\) for the envelope \(F\kappa^m\). Since \(F \in L_2(P)\) by Assumption 6, Dudley's criterion on the sample continuity of Gaussian processes implies that the function class is \(P\)-pre-Gaussian. These general assumptions, inclusive of the mentioned conditions, provide adequate flexibility given the diverse components in our main results. They encapsulate and leverage the topological structure of functional variables, the probability measure within the functional space, the concept of measurability applied to the function class, and the uniformity regulated by entropy properties.

Remark 2.10. It is worth noting that Assumption 6 iii) can be replaced by more general hypotheses regarding the moments of \(\mathbf{Y}\), as discussed in [65]. The alternative assumption takes the following form:

iii) We introduce \(\{M(x) : x \ge 0\}\) as a non-negative continuous function, increasing on \([0, \infty)\), and such that, for some \(s > 2\), eventually as \(x \to \infty\):

\[ x^{-s} M(x) \nearrow; \qquad x^{-1} M(x) \nearrow. \tag{2.14} \]

For each \(t \ge M(0)\), we define \(M^{\mathrm{inv}}(t) \ge 0\) such that \(M(M^{\mathrm{inv}}(t)) = t\). Additionally, we assume that:

\[ \mathbb{E}\left( M(|F(\mathbf{Y})|) \right) < \infty. \]

The following choices for \(M(\cdot)\) are particularly interesting:

(i) \(M(x) = x^{\xi}\) for some \(\xi > 2\);

(ii) \(M(x) = \exp(sx)\) for some \(s > 0\).

    These alternative formulations provide broader flexibility in defining the moments of Y, accommodating various scenarios and enhancing the applicability of the analysis.

    Before expressing the asymptotic behavior of our estimator represented in (2.5), we will generalize the study to a U-statistic estimator defined by:

\[ \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) = \frac{(n-m)!}{n!\, h^m\, \phi^m(h_n)} \sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\} W_{\mathbf{i},\varphi,n}, \tag{3.1} \]

where \(W_{\mathbf{i},\varphi,n}\) is an array of one-dimensional random variables. In this study, we use the results with \(W_{\mathbf{i},\varphi,n} = 1\) and \(W_{\mathbf{i},\varphi,n} = \varepsilon_{\mathbf{i},n}\).

Note that \(\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)\) is a classical U-statistic with a kernel depending on \(n\). We define

\[ \xi_k := \frac{1}{h} K_1\!\left( \frac{u_k - k/n}{h_n} \right), \qquad H(Z_1, \ldots, Z_m) := \prod_{k=1}^{m} \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_k}(x_k, X_{k,n})}{h_n} \right) W_{\mathbf{i},\varphi,n}; \]

thus, the U-statistic in (3.1) can be viewed as a weighted U-statistic of degree \(m\):

\[ \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) = \frac{(n-m)!}{n!} \sum_{\mathbf{i} \in I^m_n} \xi_{i_1} \cdots \xi_{i_m}\, H(Z_{i_1}, \ldots, Z_{i_m}). \tag{3.2} \]

We can write Hoeffding's decomposition in this case as in [94]. If we do not assume symmetry for \(W_{\mathbf{i},\varphi,n}\) or \(H\), we must define:

● The expectation of \(H(Z_{i_1}, \ldots, Z_{i_m})\):

\[ \vartheta(\mathbf{i}) := \mathbb{E}\, H(Z_{i_1}, \ldots, Z_{i_m}) = \int W_{\mathbf{i},\varphi,n} \prod_{k=1}^{m} \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_k}(x_k, \nu_{k,n})}{h_n} \right) d\mathbb{P}_{\mathbf{i}}(\mathbf{z}_{\mathbf{i}}). \tag{3.3} \]

● For all \(\ell \in \{1, \ldots, m\}\), denoting the position of the argument, construct the function \(\pi_\ell\) such that:

\[ \pi_\ell(z; z_1, \ldots, z_{m-1}) := (z_1, \ldots, z_{\ell-1}, z, z_{\ell}, \ldots, z_{m-1}). \tag{3.4} \]

    ● Define:

\[ H^{(\ell)}(z; z_1, \ldots, z_{m-1}) := H\{\pi_\ell(z; z_1, \ldots, z_{m-1})\}, \tag{3.5} \]
\[ \vartheta^{(\ell)}(i; i_1, \ldots, i_{m-1}) := \vartheta\{\pi_\ell(i; i_1, \ldots, i_{m-1})\}. \tag{3.6} \]

Hence, the first-order expansion of \(H^{(\ell)}\) will be seen as:

\[ \begin{aligned} \widetilde{H}^{(\ell)}(z) &:= \mathbb{E}\left\{ H^{(\ell)}(z, Z_1, \ldots, Z_{m-1}) \right\} \\ &= \int W_{(1,\ldots,\ell-1,\ell,\ldots,m-1)} \prod_{\substack{k=1 \\ k \ne \ell}}^{m-1} \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_k}(x_k, \nu_k)}{h_n} \right) \times \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_\ell}(x_\ell, \nu_\ell)}{h_n} \right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \\ &= \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_\ell}(x_\ell, x)}{h_n} \right) \times w \times \int W_{(1,\ldots,\ell-1,\ell,\ldots,m-1)} \prod_{\substack{k=1 \\ k \ne \ell}}^{m-1} \frac{1}{\phi(h_n)} K_2\!\left( \frac{d_{\theta_k}(x_k, \nu_k)}{h_n} \right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}), \end{aligned} \tag{3.7} \]

with \(\mathbb{P}\) as the underlying probability measure and \(z = (x, w)\), and define

\[ f_{i, i_1, \ldots, i_{m-1}} := \sum_{\ell=1}^{m} \xi_{i_1} \cdots \xi_{i_{\ell-1}}\, \xi_i\, \xi_{i_\ell} \cdots \xi_{i_{m-1}} \left( \widetilde{H}^{(\ell)}(z) - \vartheta^{(\ell)}(i; i_1, \ldots, i_{m-1}) \right). \tag{3.8} \]

    Then, the first-order projection can be defined as:

\[ \widehat{H}_{1,i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) := \frac{(n-m)!}{(n-1)!} \sum_{I^{m-1}_{n-1}(i)} f_{i, i_1, \ldots, i_{m-1}}, \tag{3.9} \]

    where

\[ I^{m-1}_{n-1}(i) := \left\{ 1 \le i_1 < \cdots < i_{m-1} \le n \ \text{and} \ i_j \ne i \ \text{for all} \ j \in \{1, \ldots, m-1\} \right\}. \]

For the remainder terms, we denote \(\mathbf{i}_{-\ell} := (i_1, \ldots, i_{\ell-1}, i_{\ell+1}, \ldots, i_m)\), and for \(\ell \in \{1, \ldots, m\}\), let

\[ H_{2,\mathbf{i}}(\mathbf{z}) := H(\mathbf{z}) - \sum_{\ell=1}^{m} \widetilde{H}^{(\ell)}_{\mathbf{i}_{-\ell}}(\mathbf{z}) + (m-1)\, \vartheta(\mathbf{i}), \tag{3.10} \]

    where

\[ \widetilde{H}^{(\ell)}_{\mathbf{i}_{-\ell}}(z) = \mathbb{E}\left\{ H(Z_1, \ldots, Z_{\ell-1}, z, Z_{\ell+1}, \ldots, Z_m) \right\}, \]

with \(\widetilde{H}^{(\ell)}\) as defined in (3.7). This projection leads to the following remainder term:

\[ \widehat{\psi}_{2}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) := \frac{(n-m)!}{n!} \sum_{\mathbf{i} \in I^m_n} \xi_{i_1} \cdots \xi_{i_m}\, H_{2,\mathbf{i}}(\mathbf{z}). \tag{3.11} \]

Finally, using Eqs (3.9) and (3.11), and under the conditions that

\[ \mathbb{E}\left\{ \widehat{H}_{1,i}(\mathbf{u}, \mathbf{X}, \boldsymbol{\theta}, \varphi) \right\} = 0, \tag{3.12} \]
\[ \mathbb{E}\left\{ H_{2,\mathbf{i}}(\mathbf{Z}) \mid Z_k \right\} = 0 \quad \text{a.s.}, \tag{3.13} \]

    we get the [99] decomposition:

\[ \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) - \mathbb{E}\left\{ \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) \right\} = \frac{1}{n} \sum_{i=1}^{n} \widehat{H}_{1,i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) + \widehat{\psi}_{2}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) =: \widehat{\psi}_{1}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) + \widehat{\psi}_{2}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi). \tag{3.14} \]

    For more details, the interested reader can refer to [94, Lemma 2.2].
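To make the decomposition concrete, the following sketch (Python/numpy) verifies (3.14) numerically in the simplest unweighted, stationary case \(m = 2\), \(\xi_k \equiv 1\), with kernel \(H(z_1, z_2) = |z_1 - z_2|\) and i.i.d. Exp(1) data, for which \(\vartheta = \mathbb{E}|Z_1 - Z_2| = 1\). The projections are approximated empirically, so the identity below holds exactly by the algebra of the decomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
Z = rng.exponential(size=n)
theta = 1.0                                   # E|Z_1 - Z_2| = 1 for Exp(1) data

# Kernel matrix of the unweighted degree-2 U-statistic (xi_k = 1 in (3.2)).
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
Hmat = np.abs(Z[i] - Z[j])
mask = i != j
U = Hmat[mask].mean()

# First-order projection H_tilde(z) = E[H(z, Z')], approximated by the sample
# average (the diagonal contributes 0, so dividing by n - 1 is exact here).
H_tilde = Hmat.sum(axis=1) / (n - 1)
linear = 2.0 * (H_tilde - theta).mean()       # Hajek (first-order) part

# Degenerate remainder H_2 = H - H_tilde(z1) - H_tilde(z2) + theta, cf. (3.10);
# with empirically centered projections the identity below is exact.
H2 = Hmat - H_tilde[:, None] - H_tilde[None, :] + theta
remainder = H2[mask].mean()
print(U - theta, linear + remainder)          # both sides agree, cf. (3.14)
```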

    We commence by presenting the following general proposition.

Proposition 3.1. Let \(\mathcal{F}_m \mathcal{K}^m_\Theta\) denote a measurable VC-subgraph class of functions, adhering to Assumption 6. Suppose that Assumptions 1–4 are satisfied. In such case, the ensuing result holds:

\[ \sup_{\mathcal{F}_m \mathcal{K}^m_\Theta} \sup_{\boldsymbol{\theta} \in \Theta^m} \sup_{\mathbf{x} \in \mathcal{H}^m} \sup_{\mathbf{u} \in [0,1]^m} \left| \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) - \mathbb{E}\left[ \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) \right] \right| = O_{\mathbb{P}}\left( \sqrt{\frac{\log n}{n\, h^m_n\, \phi^m(h_n)}} \right). \]

The proof of Proposition 3.1 is deferred to Section 8.

Remark 3.2. Elaborating on Proposition 3.1, we can delve into the uniform convergence rate of the kernel estimator \(\widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n)\). It is crucial to emphasize that when \(m = 1\) and the function \(\varphi\) remains constant, the outcomes align with the pointwise convergence rate of the regression function for a strictly stationary functional time series, as discussed in [78].

    The subsequent theorem presents the uniform convergence rate of the kernel estimator (2.5).

Theorem 3.3. Let \(\mathcal{F}_m \mathcal{K}^m_\Theta\) be a measurable VC-subgraph class of functions complying with Assumption 6. Suppose Assumptions 1–4 are fulfilled. Then, we have:

\[ \sup_{\mathcal{F}_m \mathcal{K}^m_\Theta} \sup_{\boldsymbol{\theta} \in \Theta^m} \sup_{\mathbf{x} \in \mathcal{H}^m} \sup_{\mathbf{u} \in [C_1 h, 1 - C_1 h]^m} \left| \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right| = O_{\mathbb{P}}\left( \sqrt{\frac{\log n}{n\, h^m_n\, \phi^m(h_n)}} + h_n^{2 \wedge m\alpha} \right). \tag{3.15} \]

The proof of Theorem 3.3 is postponed until Section 8.

Remark 3.4. It is possible to consider the setting of \(\Theta = \Theta_n\) and assume that

\[ \operatorname{card}(\Theta_n) = n^{\alpha} \quad \text{with } \alpha > 0 \]

    and

\[ \forall \theta \in \Theta_n, \quad \langle \theta - \theta_0, \theta - \theta_0 \rangle^{1/2} \le C_7\, b_n, \]

where \(b_n\) tends to zero, as in [145].

Remark 3.5. In contrast to Theorem 4.2 in [179] and akin to Theorem 3.1 in [116], our formulation excludes the bias term arising from the approximation error of \(X_{i,n}\) by \(X_i^{(u)}\). Under our assumptions, the approximation error is negligibly small compared to \(h_n^{2 \wedge m\beta}\).

Remark 3.6. In nonparametric problems, the infinite dimensionality of the target function is typically determined by the smoothness condition, specifically Assumptions 2 i) and 3 i). This primarily affects the bias component of the convergence rates, represented by terms like \(O(h_n^{2 \wedge m\alpha})\) in Theorem 3.3. Other terms in the convergence rates stem directly from dispersion effects and are inherently linked to the concentration properties of the probability measure of the variable \(X\). These terms can be expressed as

\[ O_{\mathbb{P}}\left( \sqrt{\frac{\log n}{n\, h^m_n\, \phi^m(h_n)}} \right), \]

where small-ball probabilities are quantified and controlled by means of the function \(\phi(\cdot)\) defined in (2.4). The rate of convergence is influenced by the concentration of the measure of the process \(X\); less concentration leads to a slower rate of convergence. Unfortunately, explicit evaluations of \(\mathbb{P}(X \in B(x, r))\) are known for very few random variables (or processes) \(X\), even when \(x = 0\). In certain functional spaces, considering \(x \ne 0\) introduces considerable difficulties that might not be surmountable. Authors often focus on Gaussian random elements, and for a comprehensive overview of the main results on small-ball probability, refer to [122]. In many scenarios, it is convenient to assume that

\[ \mathbb{P}(X \in B(x, r)) \sim \psi(x)\, \phi(r) \quad \text{as } r \to 0, \tag{3.16} \]

    where, to ensure the identifiability of the decomposition, a normalizing restriction is necessary, such as

\[ \mathbb{E}[\psi(X)] = 1. \]

The factorization (3.16) is not a stringent assumption; it holds under appropriate hypotheses (see, for instance, [27,122]). The advantage of assuming (3.16) is two-fold. First, the function \(\psi(x)\) can be viewed as a surrogate density of the functional random element \(X\) and can be leveraged in various contexts. The interested reader can explore its potential in works like [26,83,87], where the surrogate density is estimated and used to define a notion of mode or for classification purposes. Second, the function \(\phi(h_n)\) acts as the volumetric term and can be employed to assess the complexity of the probability law of the process \(X\) (see [28]). In the special multi-(but finite)-dimensional scenario where \(X \in \mathbb{R}^d\), the relation (2.4) is satisfied under standard assumptions with \(\phi_x(h_n) \sim C_x h^d\), commonly known as the curse of dimensionality (see [77,80]). Here, we would instead refer to it as the curse of infinite dimension, specifically highlighting the effects of small-ball probabilities. The inherent nature of these probability effects involving small balls becomes apparent within our infinite-dimensional framework. The remainder of this remark will focus on applying our approach to various continuous-time processes, where the probabilities associated with small balls have already been identified. For more details on the following examples, refer to [80].

(i) Consider the space \(\mathcal{C}([0,1], \mathbb{R})\) equipped with the supremum norm, and its associated Cameron-Martin space

\[ F = \mathcal{C}([0,1], \mathbb{R})_{\mathrm{CM}}. \]

Let us examine the fractional Brownian motion \(\zeta^{\mathrm{FBM}}\) with parameter \(\delta\), \(0 < \delta < 2\). Small-ball probabilities in this context have been extensively studied. According to [122, Theorems 3.1 and 4.6], we have

\[ \forall x_0 \in F, \quad C'_{x_0}\, e^{-h^{-2/\delta}} \le \mathbb{P}\left( \left\| \zeta^{\mathrm{FBM}} - x_0 \right\|_{\infty} \le h \right) \le C_{x_0}\, e^{-h^{-2/\delta}}. \]

Notably, our crucial relation (2.4) is trivially satisfied for the fractional Brownian motion by choosing the function \(\phi(\cdot)\) in the form

\[ \phi^{\mathrm{FBM}}_x(h_n) \sim C_x\, e^{-h^{-2/\delta}}. \]

(ii) Consider a centered Gaussian process \(\zeta^{\mathrm{GP}} = \{\zeta^{\mathrm{GP}}_t, 0 \le t \le 1\}\). This process can be expressed using the Karhunen-Loève decomposition as follows:

\[ \zeta^{\mathrm{GP}}_t = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, W_i\, f_i(t), \]

where the \(\lambda_i\) are the eigenvalues of the covariance operator of \(\zeta^{\mathrm{GP}}\), the \(f_i\) are the associated orthonormal eigenfunctions, and the \(W_i\) are independent standard normal real random variables. For any fixed \(k \in \mathbb{N}\), let \(\Pi_k\) be the orthogonal projection onto the subspace spanned by the eigenfunctions \(\{f_1, \ldots, f_k\}\). Define a semi-metric by

\[ d^2(x, y) = \int_0^1 \left[ \Pi_k(x - y)(t) \right]^2 dt. \]

    Using the Karhunen-Loève expansion, we obtain

\[ d^2(\zeta^{\mathrm{GP}}, x) = \sum_{i=1}^{k} \left( \sqrt{\lambda_i}\, W_i - x_i \right)^2 = \sum_{i=1}^{k} Z_i^2, \]

where \(x_i = \int_0^1 x(t) f_i(t)\, dt\), and the \(Z_i\) are the components of the vector \(Z = (Z_1, \ldots, Z_k)\), exhibiting the Euclidean norm structure on \(\mathbb{R}^k\). Due to the independence of the \(Z_i\), which have densities with respect to the Lebesgue measure, we have

\[ \mathbb{P}\left( d(\zeta^{\mathrm{GP}}, x) < h \right) \sim C_x\, h^{k}. \]

(iii) Consider the space \(\mathcal{C}([0,1], \mathbb{R})\) of continuous real-valued functions on \([0,1]\), equipped with the supremum norm denoted by \(\|\cdot\|_\infty\). Let \(\mathbb{P}_W\) be the Wiener measure on \(\mathcal{C}([0,1], \mathbb{R})\), and define the Cameron-Martin space of \(\mathcal{C}([0,1], \mathbb{R})\) as

\[ F = \mathcal{C}([0,1], \mathbb{R})_{\mathrm{CM}}. \]

Consider the Ornstein-Uhlenbeck process \(\zeta^{\mathrm{OU}}\) with \(\zeta^{\mathrm{OU}}_0 = 0\) and

\[ d\zeta^{\mathrm{OU}}_t = dW_t - \tfrac{1}{2}\, \zeta^{\mathrm{OU}}_t\, dt, \quad 0 < t \le 1. \]

The Wiener measure of small centered balls is known to be of the form [25, p. 187]:

\[ \mathbb{P}_W\left( \|x\|_{\infty} \le h \right) \sim \frac{4}{\pi}\, e^{-\pi^2/(8h^2)}. \]

    By extending this result to any small-ball probability measure through the Cameron-Martin space characterization, we have

\[ \forall x_0 \in F, \quad \mathbb{P}_W\left( \|x - x_0\|_{\infty} \le h \right) \sim C_{x_0}\, e^{-\pi^2/(8h^2)}. \]

Since the Ornstein-Uhlenbeck process has a probability measure absolutely continuous with respect to \(\mathbb{P}_W\), we can directly state

\[ \forall x_0 \in F, \quad \mathbb{P}\left( \zeta^{\mathrm{OU}} \in B(x_0, h) \right) \sim C_{x_0}\, e^{-\pi^2/(8h^2)}. \]

Our crucial relation (2.4) is trivially satisfied for this Ornstein-Uhlenbeck process by choosing the function \(\phi_x(\cdot)\) in the form

\[ \phi^{\mathrm{OU}}_x(h_n) \sim C_x\, e^{-\pi^2/(8h^2)}. \]
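When no closed form is available, small-ball probabilities such as these can be approximated by Monte Carlo. The sketch below (Python/numpy; grid and sample sizes are arbitrary choices) estimates \(\mathbb{P}(\|W\|_\infty \le r)\) for standard Brownian motion and compares it with the classical rate \((4/\pi)\, e^{-\pi^2/(8r^2)}\) quoted above; agreement improves as \(r\) decreases, at the cost of growing Monte Carlo error:

```python
import numpy as np

rng = np.random.default_rng(5)
T, n_mc = 100, 50000
# Monte Carlo estimate of the small-ball probability P(||W||_inf <= r) for
# standard Brownian motion on [0, 1] (center x_0 = 0).
W = np.cumsum(rng.standard_normal((n_mc, T)) / np.sqrt(T), axis=1)
sup_norm = np.max(np.abs(W), axis=1)
for r in (0.5, 0.7, 1.0):
    p_hat = np.mean(sup_norm <= r)
    p_asym = (4 / np.pi) * np.exp(-np.pi**2 / (8 * r**2))
    print(r, p_hat, p_asym)
```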

In this section, we are interested in studying the weak convergence of the conditional U-processes under absolutely regular observations. Observe that

\[ \begin{aligned} \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) &= \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} \left( \widehat{g}_1(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) + \widehat{g}_2(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - r^{(m)}(\varphi, \mathbf{x}, \mathbf{u})\, \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) \\ &= \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} \left( \widehat{g}_1(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) + \widehat{g}_B(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right), \end{aligned} \tag{4.1} \]

    where

\[ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{(n-m)!}{n!\, h^m\, \phi^m(h_n)} \sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\}, \]
\[ \widehat{g}_1(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{(n-m)!}{n!\, h^m\, \phi^m(h_n)} \sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\} W_{\mathbf{i},\varphi,n}, \]
\[ \widehat{g}_2(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{(n-m)!}{n!\, h^m\, \phi^m(h_n)} \sum_{\mathbf{i} \in I^m_n} \prod_{k=1}^{m} \left\{ K_1\!\left( \frac{u_k - i_k/n}{h_n} \right) K_2\!\left( \frac{d_{\theta_k}(x_k, X_{i_k,n})}{h_n} \right) \right\} r^{(m)}\!\left( \tfrac{\mathbf{i}}{n}, \mathbf{X}_{\mathbf{i},n}, \boldsymbol{\theta} \right). \]

Under the same assumptions as in Theorem 3.3, we will show in the next theorem that

\[ \operatorname{Var}\left( \widehat{g}_B(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) = o\!\left( \frac{1}{n h^m \phi_{\mathbf{x},\boldsymbol{\theta}}(h_n)} \right) \]

    and

\[ 1 / \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = O_{\mathbb{P}}(1). \]

    Then, we have

\[ \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{\widehat{g}_1(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x})}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} + B_n(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) + o_{\mathbb{P}}\!\left( \frac{1}{\sqrt{n h^m \phi_{\mathbf{x},\boldsymbol{\theta}}(h_n)}} \right), \]

    where

\[ B_n(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \mathbb{E}\left[ \widehat{g}_B(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right] / \mathbb{E}\left[ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right] \]

    is the "bias" term and ˆg1(θ,u,x)˜r1(φ,θ,u,x) is the "variance" term. Let us define, for φ1,φ2Fm

\[ \sigma(\varphi_1, \varphi_2) = \lim_{n \to \infty} n h^m \phi^{1/m}_{\mathbf{x},\boldsymbol{\theta}}(h_n)\, \mathbb{E}\Big( \big( \widetilde{r}^{(m)}_n(\varphi_1, \mathbf{u}, \mathbf{x}; h_n) - r^{(m)}(\varphi_1, \mathbf{u}, \mathbf{x}) \big) \times \big( \widetilde{r}^{(m)}_n(\varphi_2, \mathbf{u}, \mathbf{x}; h_n) - r^{(m)}(\varphi_2, \mathbf{u}, \mathbf{x}) \big) \Big). \tag{4.2} \]

In the following, we set \(K_2(\cdot)\) to be the asymmetrical triangular kernel, that is, \(K_2(x) = (1 - x)\mathbb{1}(x \in [0,1])\), to simplify the proofs. The main results of this section are given in the following theorems.

Theorem 4.1. Let \(\mathcal{F}_m \mathcal{K}^m_\Theta\) be a measurable VC-subgraph class of functions, and assume that all the assumptions of Section 2.8 are satisfied for both cases \(W_{\mathbf{i},\varphi,n} = 1\) and \(W_{\mathbf{i},\varphi,n} = \varepsilon_{\mathbf{i},n}\). Then, as \(n \to \infty\), the U-process

\[ \sqrt{n h^m \phi^{1/m}_{\mathbf{x},\boldsymbol{\theta}}(h_n)} \left( \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - B_n(\boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) \]

converges to a Gaussian process \(G\) over \(\mathcal{F}_m \mathcal{K}^m_\Theta\), whose sample paths are bounded and uniformly continuous with respect to the \(\|\cdot\|_2\)-norm, with the covariance function given in (4.2).

The proof of Theorem 4.1 is postponed until Section 8.

To examine the weak convergence of our estimator using the standard procedure, involving the Hoeffding decomposition, finite-dimensional convergence, and equicontinuity, we can turn to the following theorem. In the proof of this theorem, we express the conditional U-process in terms of a U-process based on a stationary sequence, illustrating its convergence to a Gaussian process. This convergence is established in the distribution sense within \(\ell^\infty(\mathcal{F}_m \mathcal{K}^m_\Theta)\), the space of bounded real functions on \(\mathcal{F}_m \mathcal{K}^m_\Theta\), as defined in [100]. For further details, refer to [6,71], or [178].

Theorem 4.2. Assume \(\mathcal{F}_m \mathcal{K}^m_\Theta\) is a measurable VC-subgraph class of functions, and all assumptions in Section 2.8 are satisfied. If, in addition, \(n\, \phi^{1/m}_{\mathbf{x},\boldsymbol{\theta}}(h_n)\, h^{m + 2(2 \wedge m\alpha)} \to 0\) as \(n \to \infty\), then

\[ \sqrt{n h^m \phi^{1/m}_{\mathbf{x},\boldsymbol{\theta}}(h_n)} \left( \widetilde{r}^{(m)}_n(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_n) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) \]

converges in law to a Gaussian process \{\mathbb{G}(\psi):\psi\in\mathscr{F}_m\mathfrak{K}^m_\Theta\} in \ell^{\infty}(\mathscr{F}_m\mathfrak{K}^m_\Theta) that admits a version with uniformly bounded and uniformly continuous paths with respect to the \|\cdot\|_{2} -norm, and its covariance function is given in (4.2).

    The proof of Theorem 4.2 is deferred to Section 8.

Remark 4.3. To eliminate the bias term, it is necessary to have n\phi^{1/m}_{x, \theta}(h_{n})\, h^{m+2(2m\wedge\alpha)}\rightarrow 0 as n\rightarrow\infty . As a consequence, the last condition, along with n h^{m}\phi^{1/m}_{x, \theta}(h_{n})\rightarrow\infty , holds as long as h_{n} = n^{-\xi} and \phi_{x, \theta}(h_{n}) = h_{n}^{mc} , where 0 < c < \frac{1}{\xi m}-1 and \frac{1}{m(1+c)+2(2m\wedge\alpha)} < \xi < \frac{1}{m(1+c)} .

Remark 4.4. The validity of the results remains intact even when the entropy condition is replaced with a bracketing condition; in particular, it suffices that there exist constants C_0 > 0 and v_0 > 0 for which the corresponding bracketing-number inequality holds. In our framework, the choice of the kernel function is flexible, with minimal restrictions, as long as some mild conditions are satisfied. However, the selection of the bandwidth introduces challenges, and it is crucial for achieving a favorable rate of consistency. The bandwidth choice significantly impacts the bias-variance trade-off of the estimator. Therefore, adopting a bandwidth that adjusts to a specific criterion, the available data, and the location is more suitable. The discussion on this topic can be found in [44,46,132]. Establishing uniform-in-bandwidth central limit theorems in our context would be particularly interesting.

Remark 4.5. We can consider the scenario where \Theta = \Theta_{n} , where \Theta_{n} satisfies the conditions \operatorname{card}(\Theta_{n}) = n^{\alpha} with \alpha > 0 , and for every \theta\in\Theta_{n} , we have

\langle\theta-\theta_{0}, \theta-\theta_{0}\rangle^{1/2}\leq C_{7}\, b_{n},

    where bn converges to zero, as discussed in [145].

The functional directions set, \Theta_{n} , is constructed following a similar approach to [2,145], as outlined below (a code sketch follows Step 3):

(i) Each direction \theta\in\Theta_{n} is derived from a d_{n} -dimensional space spanned by B-spline basis functions, denoted by \{e_{1}(\cdot), \ldots, e_{d_{n}}(\cdot)\} . Thus, we express directions as:

\begin{equation} \theta(\cdot) = \sum\limits_{j = 1}^{d_{n}}\alpha_{j}e_{j}(\cdot), \;\; \mbox{where}\;\; (\alpha_{1}, \ldots, \alpha_{d_{n}})\in\mathcal{V}. \end{equation} (4.3)

(ii) The set of coefficient vectors in (4.3), denoted by \mathcal{V} , is generated through the following steps:

Step 1. For each (\beta_{1}, \ldots, \beta_{d_{n}})\in\mathcal{C}^{d_{n}} , where \mathcal{C} = \{c_{1}, \ldots, c_{J}\}\subset\mathbb{R} represents a set of J 'seed-coefficients', construct the initial functional direction as

\theta_{\text{init}}(\cdot) = \sum\limits_{j = 1}^{d_{n}}\beta_{j}e_{j}(\cdot).

Step 2. For each \theta_{\text{init}} from Step 1 satisfying \theta_{\text{init}}(t_{0}) > 0 , where t_{0} denotes a fixed value in the domain of \theta_{\text{init}}(\cdot) , compute \langle\theta_{\text{init}}, \theta_{\text{init}}\rangle and form (\alpha_{1}, \ldots, \alpha_{d_{n}}) = (\beta_{1}, \ldots, \beta_{d_{n}})/\langle\theta_{\text{init}}, \theta_{\text{init}}\rangle^{1/2} .

Step 3. Define \mathcal{V} as the collection of vectors (\alpha_{1}, \ldots, \alpha_{d_{n}}) obtained in Step 2. Consequently, the final set of permissible functional directions is represented as

\Theta_{n} = \left\{\theta(\cdot) = \sum\limits_{j = 1}^{d_{n}}\alpha_{j}e_{j}(\cdot);\; (\alpha_{1}, \ldots, \alpha_{d_{n}})\in\mathcal{V}\right\}.
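A minimal sketch of Steps 1-3 is given below; it is only illustrative, and the spline degree, the knot sequence, the seed set \mathcal{C} = \{-1, 0, 1\} , the point t_{0} = 0 , and the Riemann approximation of the L^{2} inner product are all assumptions of ours.

```python
import itertools
import numpy as np
from scipy.interpolate import BSpline

def build_direction_set(d_n=3, seeds=(-1.0, 0.0, 1.0), t0=0.0, degree=2):
    """Construct the coefficient set V of Steps 1-3 from B-spline basis
    functions e_1, ..., e_{d_n} on [0, 1]."""
    # Open uniform knot vector carrying exactly d_n basis functions.
    n_knots = d_n + degree + 1
    knots = np.concatenate([np.zeros(degree),
                            np.linspace(0.0, 1.0, n_knots - 2 * degree),
                            np.ones(degree)])
    basis = [BSpline.basis_element(knots[j:j + degree + 2]) for j in range(d_n)]
    grid = np.linspace(0.0, 1.0, 201)
    E = np.array([b(grid) for b in basis])             # d_n x 201 design matrix

    V = []
    for beta in itertools.product(seeds, repeat=d_n):  # Step 1: seed coefficients
        beta = np.asarray(beta)
        theta_init = beta @ E                          # theta_init on the grid
        if theta_init[np.searchsorted(grid, t0)] <= 0: # Step 2: theta_init(t0) > 0
            continue
        # Riemann approximation of <theta_init, theta_init> in L^2[0, 1].
        norm2 = float(np.sum(theta_init ** 2) * (grid[1] - grid[0]))
        V.append(beta / np.sqrt(norm2))                # Step 3: normalization
    return V, basis

V, basis = build_direction_set()
print(len(V), "admissible directions in Theta_n")
```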

Although only the following examples are provided in this section, they serve as prototypes for a range of problems that can be treated in a comparable manner.

Now, we apply the results to the discrimination problem described in Section 3 of [168], also referring to [167]. We will employ similar notation and settings. Let \varphi(\cdot) be any function taking at most finitely many values, say 1, \ldots, M . The sets

A_{j} = \{(y_{1}, \ldots, y_{m}): \varphi(y_{1}, \ldots, y_{m}) = j\}, \quad 1\leq j\leq M,

    then yield a partition of the feature space. Predicting the value of φ(Y1,,Ym) is tantamount to predicting the set in the partition to which (Y1,,Ym) belongs. For any discrimination rule g, we have

\mathbb{P}\left(g(\mathbf{X}, \boldsymbol{\theta}) = \varphi(\mathbf{Y})\right)\leq\sum\limits_{j = 1}^{M}\int_{\{\mathbf{x}: g(\mathbf{x}) = j\}}\max\limits_{1\leq k\leq M} M_{k}\left(\frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right)\mathrm{d}\mathbb{P}(\mathbf{x}),

    where

M_{j}\left(\frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right) = \mathbb{P}\left(\varphi(\mathbf{Y}_{\mathbf{i}}) = j\mid\langle\mathbf{X}_{\mathbf{i}}, \boldsymbol{\theta}\rangle = \langle\mathbf{x}, \boldsymbol{\theta}\rangle\right), \quad \mathbf{x}\in\mathcal{H}^{m}.

    The above inequality becomes equality if

G_{0}(\mathbf{x}, \boldsymbol{\theta}) = \arg\max\limits_{1\leq j\leq M} M_{j}\left(\frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right).

G_{0}(\cdot) is called the Bayes rule, and the pertaining probability of error

L^{*} = 1-\mathbb{P}\left(G_{0}(\mathbf{X}, \boldsymbol{\theta}) = \varphi(\mathbf{Y})\right) = 1-\mathbb{E}\left\{\max\limits_{1\leq j\leq M} M_{j}\left(\frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right\}

is called the Bayes risk. Each of the above unknown functions M_{j} can be consistently estimated by one of the methods discussed in the preceding sections. Let, for 1\leq j\leq M ,

M_{jn}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) = \frac{ \sum\limits_{\mathbf{i}\in I_{n}^{m}} \mathbb{1}\{\varphi(Y_{i_{1}}, \ldots, Y_{i_{m}}) = j\}\prod\limits_{k = 1}^{m}\left\{K_{1}\left(\frac{u_{k}-i_{k}/n}{h_{n}}\right)K_{2}\left(\frac{d_{\theta_{k}}(x_{k}, X_{i_{k}, n})}{h_{n}}\right)\right\}}{ \sum\limits_{\mathbf{i}\in I_{n}^{m}}\prod\limits_{k = 1}^{m}\left\{K_{1}\left(\frac{u_{k}-i_{k}/n}{h_{n}}\right)K_{2}\left(\frac{d_{\theta_{k}}(x_{k}, X_{i_{k}, n})}{h_{n}}\right)\right\}}.

    Set

G_{0, n}(\mathbf{x}, \boldsymbol{\theta}) = \arg\max\limits_{1\leq j\leq M} M_{jn}\left(\frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right).

    Let us introduce

L_{n} = \mathbb{P}\left(G_{0, n}(\mathbf{X}, \boldsymbol{\theta})\neq\varphi(\mathbf{Y})\right).

Then, one can show that the discrimination rule G_{0, n}(\cdot) is asymptotically Bayes risk consistent:

L_{n}\longrightarrow L^{*}.

This follows from the evident relation:

\left\lvert L^{*}-L_{n}\right\rvert\leq 2\, \mathbb{E}\left[\max\limits_{1\leq j\leq M}\left\lvert M_{jn}\left(\frac{\mathbf{i}}{n}, \mathbf{X}, \boldsymbol{\theta}\right)-M_{j}\left(\frac{\mathbf{i}}{n}, \mathbf{X}, \boldsymbol{\theta}\right)\right\rvert\right].
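As an illustration of the rule G_{0, n} , the following minimal sketch treats the simplest case m = 1 , with a scalar covariate standing in for the projected functional covariate (so the semi-metric d_{\theta} is replaced by an absolute difference); the Epanechnikov choice for K_{1} and the toy data are our own assumptions.

```python
import numpy as np

def class_posteriors(u, x, U, X, Y, labels, h):
    """Kernel estimates M_{jn}(u, x) of the class posteriors for m = 1,
    with rescaled time U_i = i/n and scalar covariates X_i."""
    k1 = lambda t: 0.75 * np.maximum(1 - t ** 2, 0.0)        # K1: Epanechnikov
    k2 = lambda t: np.where((t >= 0) & (t <= 1), 1 - t, 0.0) # K2: (1 - t)1{[0,1]}
    w = k1((u - U) / h) * k2(np.abs(x - X) / h)              # kernel weights
    s = w.sum()                                              # assumed positive
    return {j: float((w * (Y == j)).sum() / s) for j in labels}

def bayes_rule_estimate(u, x, U, X, Y, labels, h):
    """Plug-in discrimination rule G_{0,n}: the label maximizing M_{jn}."""
    post = class_posteriors(u, x, U, X, Y, labels, h)
    return max(post, key=post.get)

# Toy usage on synthetic data (illustrative only).
rng = np.random.default_rng(1)
n = 500
U = np.arange(1, n + 1) / n
X = rng.normal(size=n)
Y = (X + 0.3 * rng.normal(size=n) > 0).astype(int)           # labels 0 and 1
print(bayes_rule_estimate(0.5, 1.0, U, X, Y, labels=(0, 1), h=0.2))
```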

Metric learning, a field that has garnered significant attention in recent years, revolves around adapting the metric to the underlying data. Extensive discussions on metric learning and its applications are available in [19,54]. This concept has proven valuable in diverse domains, including computer vision, information retrieval, and bioinformatics. To demonstrate the practicality of metric learning, we delve into the metric learning problem for supervised classification outlined in [54]. Consider independent copies (X_{1}, Y_{1}), \ldots, (X_{n}, Y_{n}) of a \mathcal{H}\times\mathcal{Y} -valued random couple (X, Y) , where \mathcal{H} is a feature space and \mathcal{Y} = \{1, \ldots, C\} (with C\geq 2 ) is a finite set of labels. Let \mathcal{D} be a set of distance measures D:\mathcal{H}\times\mathcal{H}\rightarrow\mathbb{R}_{+} . In this context, the objective of metric learning is to find a metric under which pairs of points with the same label are close to each other, while those with different labels are far apart. The natural way to define the risk of a metric D is given by:

\begin{equation} R(D) = \mathbb{E}\left[\phi\left(\left(D(X, X^{\prime})-1\right)\left(2 \mathbb{1}\{Y = Y^{\prime}\}-1\right)\right)\right], \end{equation} (5.1)

where (X^{\prime}, Y^{\prime}) is an independent copy of (X, Y) and \phi(u) is a convex loss function upper-bounding the indicator function \mathbb{1}\{u\leq 0\} , for instance, the hinge loss \phi(u) = \max(0, 1-u) . To estimate R(D) , we consider the natural empirical estimator:

\begin{equation} R_{n}(D) = \frac{2}{n(n-1)}\sum\limits_{1\leq i < j\leq n}\phi\left(\left(D(X_{i}, X_{j})-1\right)\left(2 \mathbb{1}\{Y_{i} = Y_{j}\}-1\right)\right), \end{equation} (5.2)

    which is a one-sample U-statistic of degree two with a kernel given by:

\varphi_{D}\left((x, y), (x^{\prime}, y^{\prime})\right) = \phi\left(\left(D(x, x^{\prime})-1\right)\left(2 \mathbb{1}\{y = y^{\prime}\}-1\right)\right).

    The convergence to (5.1) of a minimizer of (5.2) has been investigated within the frameworks of algorithmic stability [107], algorithmic robustness [18], and the theory of U-processes under suitable regularization [50].
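For concreteness, a minimal sketch of the empirical risk (5.2) with the hinge loss is given below; the Euclidean default for D and the toy data are our own assumptions.

```python
import numpy as np

def empirical_metric_risk(X, Y, D=None):
    """Empirical risk R_n(D) of (5.2): a one-sample U-statistic of degree
    two with the hinge loss phi(u) = max(0, 1 - u)."""
    n = len(Y)
    if D is None:
        D = lambda a, b: float(np.linalg.norm(a - b))  # assumed default metric
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            z = 2.0 * (Y[i] == Y[j]) - 1.0             # +1 same label, -1 otherwise
            u = (D(X[i], X[j]) - 1.0) * z              # argument of the loss
            total += max(0.0, 1.0 - u)                 # hinge loss
    return 2.0 * total / (n * (n - 1))

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
Y = rng.integers(0, 2, size=50)
print(empirical_metric_risk(X, Y))
```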

To test the independence of one-dimensional random variables Y_{1} and Y_{2} , [110] proposed a method based on the U-statistic K_{n} with the kernel function:

\begin{equation} \varphi\left((s_{1}, t_{1}), (s_{2}, t_{2})\right) = \mathbb{1}\{(s_{2}-s_{1})(t_{2}-t_{1}) > 0\}- \mathbb{1}\{(s_{2}-s_{1})(t_{2}-t_{1})\leq 0\}. \end{equation} (5.3)

Its rejection region is of the form \{\sqrt{n}\, K_{n} > \gamma\} . In this example, we consider a multivariate case. To test the conditional independence of \boldsymbol{\xi} and \boldsymbol{\eta} , where Y = (\boldsymbol{\xi}, \boldsymbol{\eta}) , given X , we propose a method based on the conditional U-statistic:

\widehat{r}^{(2)}_{n}(\varphi, \mathbf{u}, \mathbf{x}) = \frac{ \sum\limits_{\mathbf{i}\in I_{n}^{2}}\varphi(Y_{i_{1}}, Y_{i_{2}})\prod\limits_{k = 1}^{2}\left\{K_{1}\left(\frac{u_{k}-i_{k}/n}{h_{n}}\right)K_{2}\left(\frac{d_{\theta_{k}}(x_{k}, X_{i_{k}, n})}{h_{n}}\right)\right\}}{ \sum\limits_{\mathbf{i}\in I_{n}^{2}}\prod\limits_{k = 1}^{2}\left\{K_{1}\left(\frac{u_{k}-i_{k}/n}{h_{n}}\right)K_{2}\left(\frac{d_{\theta_{k}}(x_{k}, X_{i_{k}, n})}{h_{n}}\right)\right\}},

where \mathbf{x} = \left(x_{1}, x_{2}\right) \in \mathbb{I} \subset \mathbb{R}^{2} and \varphi(\cdot) is Kendall's kernel (5.3). Suppose that \boldsymbol{\xi} and \boldsymbol{\eta} are d_{1} - and d_{2} -dimensional random vectors, respectively, with d_{1}+d_{2} = d . Furthermore, supposing that \mathrm{Y}_{1}, \ldots, \mathrm{Y}_{n} are observations of (\boldsymbol{\xi}, \boldsymbol{\eta}) , we are interested in testing:

    \begin{equation} \mathrm{H}_{0}:\boldsymbol{\xi}\; \; \mbox{and}\; \; \boldsymbol{\eta}\; \; \mbox{are conditionally independent given}\; \; X.\; \; \mbox{vs}\; \; \mathrm{H}_{a}: \mathrm{H}_{0}\; \; \mbox{is not true}. \end{equation} (5.4)

Let \mathbf{a} = \left(\mathbf{a}_{1}, \mathbf{a}_{2}\right) \in \mathbb{R}^{d} be such that \|\mathbf{a}\| = 1 with \mathbf{a}_{1} \in \mathbb{R}^{d_{1}}, \mathbf{a}_{2} \in \mathbb{R}^{d_{2}} , and let \mathrm{F}(\cdot), \mathrm{G}(\cdot) be the distribution functions of \boldsymbol{\xi} and \boldsymbol{\eta} , respectively. Suppose that F^{\mathbf{a}_{1}}(\cdot) and G^{\mathbf{a}_{2}}(\cdot) are continuous for any unit vector \mathbf{a} = \left(\mathbf{a}_{1}, \mathbf{a}_{2}\right) , where \mathrm{F}^{\mathbf{a}_{1}}(t) = \mathbb{P}\left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi} < t\right) , \mathrm{G}^{\mathbf{a}_{2}}(t) = \mathbb{P}\left(\mathbf{a}_{2}^{\top} \boldsymbol{\eta} < t\right) , and \mathbf{a}_{i}^{\top} denotes the transpose of the vector \mathbf{a}_{i}, 1 \leqslant i \leqslant 2 . For n = 2 , let Y^{(1)} = \left(\boldsymbol{\xi}^{(1)}, \boldsymbol{\eta}^{(1)}\right) and Y^{(2)} = \left(\boldsymbol{\xi}^{(2)}, \boldsymbol{\eta}^{(2)}\right) be such that \boldsymbol{\xi}^{(i)} \in \mathbb{R}^{d_{1}} and \boldsymbol{\eta}^{(i)} \in \mathbb{R}^{d_{2}} for i = 1, 2 , and:

    \varphi^{a}\left(\mathrm{Y}^{(1)}, \mathrm{Y}^{(2)}\right) = \varphi\left(\left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi}^{(1)}, \mathbf{a}_{2}^{\top} \boldsymbol{\eta}^{(1)}\right), \left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi}^{(2)}, \mathbf{a}_{2}^{\top} \boldsymbol{\eta}^{(2)}\right)\right).

    An application of Theorem 3.3 gives

    \begin{equation} \left|\widehat{r}^{(2)}_{n}(\varphi^{a}, \mathbf{u}, \mathbf{x}) -{r}^{(2)}(\varphi^{a}, \mathbf{u}, \mathbf{x})\right|\longrightarrow 0, \quad a.s. \end{equation} (5.5)
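A minimal sketch of this conditional Kendall statistic for scalar projected covariates (with the absolute difference again standing in for d_{\theta} ; the kernels and toy data are our choices) is as follows; under \mathrm{H}_{0} the statistic should be close to zero.

```python
import numpy as np

def kendall_kernel(y1, y2):
    """Kendall's kernel (5.3) on two points y = (s, t) of R^2."""
    p = (y2[0] - y1[0]) * (y2[1] - y1[1])
    return 1.0 if p > 0 else -1.0

def conditional_kendall(u, x, U, X, Y, h):
    """Conditional U-statistic of degree two built from Kendall's kernel,
    sketched for m = 2 with scalar covariates."""
    k1 = lambda t: 0.75 * np.maximum(1 - t ** 2, 0.0)
    k2 = lambda t: np.where((t >= 0) & (t <= 1), 1 - t, 0.0)
    # Marginal weights K1((u_k - i/n)/h) K2(|x_k - X_i|/h), k = 1, 2.
    w1 = k1((u[0] - U) / h) * k2(np.abs(x[0] - X) / h)
    w2 = k1((u[1] - U) / h) * k2(np.abs(x[1] - X) / h)
    num = den = 0.0
    n = len(U)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            wij = w1[i] * w2[j]
            num += kendall_kernel(Y[i], Y[j]) * wij
            den += wij
    return num / den if den > 0 else float("nan")

rng = np.random.default_rng(3)
n = 200
U = np.arange(1, n + 1) / n
X = rng.normal(size=n)
Y = np.column_stack([X + rng.normal(size=n), rng.normal(size=n)])
print(conditional_kendall((0.5, 0.5), (0.0, 0.0), U, X, Y, h=0.3))
```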

Consider a triple (Y, C, X) of random variables defined on \mathbb{R} \times \mathbb{R} \times \mathcal{H} . Here, Y is the variable of interest, C is a censoring variable, and X is a concomitant variable. Throughout, we work with a sample \{(Y_{i}, C_{i}, X_{i})\}_{1\leq i\leq n} of independent and identically distributed replications of (Y, C, X) , n \geq 1 . Actually, in the right censorship model, the pairs (Y_{i}, C_{i}) , 1 \leq i \leq n , are not directly observed, and the corresponding information is given by Z_{i} : = \min\{Y_{i}, C_{i}\} and \Delta_{i} : = \mathbb{1}\{Y_{i}\leq C_{i}\} , 1 \leq i \leq n . Accordingly, the observed sample is

    \mathcal{D}_{n} = \{(Z_{i}, \Delta_{i}, { X}_{i}), i = 1, \ldots, n\}.

For example, survival data in clinical trials or failure time data in reliability studies are often subject to such censoring. To be more specific, many statistical experiments result in incomplete samples, even under well-controlled conditions. For example, clinical survival data for most types of diseases are usually censored by other competing risks to life which result in death. In the sequel, we impose the following assumptions upon the distribution of (X, Y) . For -\infty < t < \infty , set

    F_{Y}(t) = \mathbb{P}(Y \leq t), \; \; G(t) = \mathbb{P}(C \leq t), \; \; \mbox{and}\; \; H(t) = \mathbb{P}(Z \leq t),

    the right-continuous distribution functions of Y , C , and Z , respectively. For any right-continuous distribution function \mathfrak L defined on \mathbb{R} , denote by

    T_{\mathfrak L} = \sup\{t \in \mathbb{R} : \mathfrak L(t) < 1\}

    the upper point of the corresponding distribution. Now consider a pointwise measurable class \mathscr{F} of real measurable functions defined on \mathbb{R} , and assume that \mathscr{F} is of VC-type. We recall the regression function of \psi(Y) evaluated at \langle{ X}, \theta\rangle = \langle{ t}, \theta\rangle , for \psi \in \mathscr{F} and t \in\mathcal H , given by

    r^{(1)}(\psi, \frac{i}{n}, { t}, \theta) = \mathbb{E}(\psi(Y_i)\mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle),

    when Y is right-censored. To estimate r^{(1)}(\psi, \cdot) , we make use of the inverse probability of censoring weighted (I.P.C.W.) estimators that have recently gained popularity in the censored data literature (see [51,111]). The key ideas of I.P.C.W. estimators are as follows. Introduce the real-valued function \Phi_{\psi}(\cdot, \cdot) defined on \mathbb{R}^{2} by

    \begin{equation} \Phi_{\psi}(y, c) = \frac{ \mathbb{1}\{y \leq c\}\psi(y\wedge c)}{1-G(y\wedge c)}. \end{equation} (5.6)

    Assuming the function G(\cdot) to be known, first note that \Phi_{\psi}(Y_{i}, C_{i}) = \Delta_{i}\psi(Z_{i})/(1 -G(Z_{i})) is observed for every 1 \leq i \leq n . Moreover, under the Assumption (I) below:

    (I) C and (Y, \mathbf{ X}) are independent.

    We have

    \begin{eqnarray} r^{(1)}(\Phi_{\psi}, \frac{i}{n}, { t}, \theta) &: = &\mathbb{E}(\Phi_{\psi}(Y_i, C_i)\mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle) \\& = &\mathbb{E}\left\{\frac{ \mathbb{1}\{ Y_i\leq C_i\}\psi(Z_i)}{1-G(Z_i)} \mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle\right\}\\ & = &\mathbb{E}\left\{\frac{\psi(Y_i)}{1-G(Y_i)}\mathbb{E}( \mathbb{1}\{ Y_i\leq C_i\}\mid \mathbf{ X}_i, Y_i) \mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle\right\}\\& = &r^{(1)}(\psi, \frac{i}{n}, { t}, \theta). \end{eqnarray} (5.7)

    Therefore, any estimate of r^{(1)}(\Phi_{\psi}, \cdot) , which can be built on fully observed data, turns out to be an estimate for r^{(1)}(\psi, \cdot) too. Thanks to this property, most statistical procedures that provide estimates of the regression function in the uncensored case can be naturally extended to the censored case. For instance, kernel-type estimates are straightforward to construct. Set, for \mathbf{ x}\in \mathcal{I} , h\geq 0 , 1\leq i\leq n ,

    \begin{eqnarray} \overline{\omega}_{n, h, j}^{(1)}(u, x): = K_{1}\left(\frac{u-j/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x, X_{j, n})}{h_n}\right) \Big/\sum\limits_{j = 1}^{n}K_{1}\left(\frac{u-j/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x, X_{j, n})}{h_n}\right). \end{eqnarray} (5.8)

    In view of (5.6)–(5.8), whenever G(\cdot) is known, a kernel estimator of r^{(1)}(\psi, \cdot) is given by

    \begin{eqnarray} \breve{r}_{n}^{(1)}(\psi, u, x;h_{n}) = \sum\limits_{i = 1}^{n}\overline{\omega}_{n, h, i}^{(1)}(u, x)\frac{\Delta_{i}\psi(Z_{i})}{1-G(Z_{i})}. \end{eqnarray} (5.9)

    The function G(\cdot) is generally unknown and has to be estimated. We will denote by G^{*}_{n}(\cdot) the Kaplan-Meier estimator of the function G(\cdot) [108]. Namely, adopting the conventions

    \prod\limits_{\emptyset} = 1

    and 0^{0} = 1 and setting

    N_{n}(u) = \sum\limits_{i = 1}^{n} \mathbb{1}\{Z_{i}\geq u\},

    we have

    G_{n}^{*}(u) = 1-\prod\limits_{i:Z_{i}\leq u}\left\{\frac{N_{n}(Z_{i})-1}{N_{n}(Z_{i})}\right\}^{(1-\Delta_{i})}, \; \; \mbox{for}\; \; u \in \mathbb{R}.
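The following minimal sketch computes G_{n}^{*} from the observed pairs (Z_{i}, \Delta_{i}) ; it assumes for simplicity that the Z_{i} have no ties, and the exponential toy data are our own choice.

```python
import numpy as np

def kaplan_meier_censoring(Z, Delta):
    """Kaplan-Meier estimator G_n^* of the censoring distribution G, using
    N_n(u) = #{i : Z_i >= u} and the product over censored observations
    (Delta_i = 0); ties among the Z_i are assumed absent."""
    order = np.argsort(Z)
    Z_sorted = np.asarray(Z, dtype=float)[order]
    D_sorted = np.asarray(Delta, dtype=int)[order]
    n = len(Z_sorted)
    surv, times, values = 1.0, [], []
    for k in range(n):
        at_risk = n - k                      # N_n at the k-th ordered Z
        if D_sorted[k] == 0:                 # censoring events drive G_n^*
            surv *= (at_risk - 1) / at_risk
        times.append(Z_sorted[k])
        values.append(1.0 - surv)            # G_n^* evaluated at Z_(k)
    return np.array(times), np.array(values)

# Toy usage: exponential lifetimes, exponential censoring.
rng = np.random.default_rng(4)
Yv, Cv = rng.exponential(1.0, 300), rng.exponential(1.5, 300)
Z, Delta = np.minimum(Yv, Cv), (Yv <= Cv).astype(int)
t, G = kaplan_meier_censoring(Z, Delta)
print(G[-5:])
```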

    Given this notation, we will investigate the following estimator of r^{(1)}(\psi, \cdot)

    \begin{eqnarray} \breve{r}_{n}^{(1)*}(\psi, u, x;h_{n}) = \sum\limits_{i = 1}^{n}\overline{\omega}_{n, h, i}^{(1)}(u, x)\frac{\Delta_{i}\psi(Z_{i})}{1-G_{n}^{*}(Z_{i})}, \end{eqnarray} (5.10)

refer to [111,130]. Adopting the convention 0/0 = 0 , this quantity is well defined, since G_{n}^{*}(Z_{i}) = 1 if and only if Z_{i} = Z_{(n)} and \Delta_{(n)} = 0 , where Z_{(k)} is the k th order statistic associated with the sample (Z_{1}, \ldots, Z_{n}) for k = 1, \ldots, n and \Delta_{(k)} is the \Delta_{j} corresponding to Z_{(k)} = Z_{j} . A right-censored version of an unconditional U -statistic with a kernel of degree m\geq 1 was introduced via the principle of a mean-preserving reweighting scheme in [60]. Reference [170] proved almost sure convergence of multi-sample U -statistics under random censorship and provided an application by considering the consistency of a new class of tests designed for testing equality in distribution. To overcome potential biases arising from right-censoring of the outcomes and the presence of confounding covariates, [53] proposed adjustments to the classical U -statistics. [186] proposed a different estimation procedure for the U -statistic by using a substitution estimator of the conditional kernel given the observed data. We also refer to [45]. To the best of our knowledge, the problem of the estimation of conditional U -statistics in the censored setting with variable bandwidth has remained open up to the present, and it gives the primary motivation for the study of this section. A natural extension of the function defined in (5.6) is given by

\begin{eqnarray} { \Phi}_{\psi}(y_{1}, \ldots, y_{k}, c_{1}, \ldots, c_{k}) = \frac{ \psi(y_{1}\wedge c_{1}, \ldots, y_{k}\wedge c_{k})\prod\limits_{i = 1}^{k} \mathbb{1}\{y_{i} \leq c_{i}\}}{ \prod\limits_{i = 1}^{k}\{1-G(y_{i}\wedge c_{i})\}}. \end{eqnarray} (5.11)

    From this, we have an analogous relation to (5.7) given by

    \begin{aligned} &\mathbb{E}({ \Phi}_{\psi}(Y_{1}, \ldots, Y_{k}, C_{1}, \ldots, C_{k})\mid(\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t} )\\ &\;\;\; = \mathbb{E}\left(\frac{ \prod\limits_{i = 1}^{k} \mathbb{1}\{Y_{i} \leq C_{i}\}\psi(Y_{1}\wedge C_{1}, \ldots, Y_{k}\wedge C_{k})}{ \prod\limits_{i = 1}^{k}\{1-G(Y_{i}\wedge C_{i})\}}\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{t}\right)\\ &\;\;\; = \mathbb{E}\left(\frac{ \psi(Y_{1}, \ldots, Y_{k})}{ \prod\limits_{i = 1}^{k}\{1-G(Y_{i})\}}\mathbb{E}\left(\prod\limits_{i = 1}^{k} \mathbb{1}\{Y_{i} \leq C_{i}\}\mid (Y_{1}, X_{1}), \ldots (Y_{k}, X_{ k})\right)\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right)\\&\;\;\; = \mathbb{E}\left(\psi(Y_{1}, \ldots, Y_{k})\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right). \end{aligned} (5.12)

    An analogue estimator to (1.1) in the censored case is given by

\begin{eqnarray} \breve{r}_{n}^{(k)}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}}\cdots\Delta_{i_{k}}\psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ (1-G(Z_{i_{1}}))\cdots(1-G(Z_{i_{k}}))}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}), \end{eqnarray} (5.13)

    where, for \mathbf{i} = (i_{1}, \ldots, i_{k})\in I(k, n) ,

    \begin{equation} \overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \frac{ \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}. \end{equation} (5.14)

    The estimator that we will investigate is given by

\begin{eqnarray} \breve{r}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}}\cdots\Delta_{i_{k}}\psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ (1-G_{n}^{*}(Z_{i_{1}}))\cdots(1-G_{n}^{*}(Z_{i_{k}}))}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}). \end{eqnarray} (5.15)
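A minimal sketch of the resulting plug-in I.P.C.W. estimator, written for the case k = 1 (so that (5.15) reduces to (5.10)) with a scalar covariate in place of the functional one, is given below; the step function G_star built from the Kaplan-Meier sketch above and the vectorized transformation psi are assumed inputs.

```python
import numpy as np

def ipcw_regression(u, x, U, X, Z, Delta, psi, h, G_star):
    """Kernel I.P.C.W. estimator (5.10): a locally weighted average of the
    observable transformations Delta_i psi(Z_i) / (1 - G_n^*(Z_i))."""
    k1 = lambda t: 0.75 * np.maximum(1 - t ** 2, 0.0)
    k2 = lambda t: np.where((t >= 0) & (t <= 1), 1 - t, 0.0)
    w = k1((u - U) / h) * k2(np.abs(x - X) / h)
    w = w / w.sum()                                     # the weights of (5.8)
    denom = 1.0 - np.array([G_star(z) for z in Z])
    contrib = np.zeros(len(Z))
    mask = (Delta == 1) & (denom > 0)                   # convention 0/0 = 0
    contrib[mask] = psi(Z[mask]) / denom[mask]
    return float(np.sum(w * contrib))

# Step-function evaluation of G_n^* from the Kaplan-Meier sketch (t, G).
G_star = lambda z: 0.0 if z < t[0] else float(G[np.searchsorted(t, z, side="right") - 1])
```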

    The main result of this section is given in the following corollary.

    Corollary 5.1. Let \mathscr{F}_m\mathfrak{K}^m_\Theta be a measurable VC-subgraph class of functions complying with Assumption 6. Suppose Assumptions 1–4 are fulfilled. Then we have:

    \begin{aligned} &\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in [C_{1}h, 1-C_{1}h]^m} \left\lvert\breve{r}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert \\& \qquad \qquad\qquad \qquad = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m_n\phi^m(h_n)}} + h^{2m\wedge\alpha}_n\right). \end{aligned} (5.16)

This last result is a direct consequence of Theorem 3.3 and of the law of the iterated logarithm for G_{n}^{*}(\cdot) established in [84], which ensures that

\sup\limits_{t\leq\tau}|G_{n}^{*}(t)-G(t)| = O\left(\sqrt{\frac{\log\log n}{n}}\right)\;\;\;\;\;\text{almost surely}\;\;\text{as}\;\;\;n\rightarrow \infty.

    For more details refer to [34].

Keeping in mind the notation of the preceding section, we now introduce a truncation variable denoted as L and assume that (L, C) is independent of Y . Let us consider a situation where we have random vectors (Z_i \epsilon_i, \Delta_i) , with \epsilon_i = \mathbb 1 (L_i \leq Z_i) . In this section, we aim to define conditional U -statistics for data that are left truncated and right censored (LTRC), by following ideas from [172] in the unconditional setting. To achieve this, we propose an extension of the function (5.6) for LTRC data as follows:

    \tilde{ \Phi}_{\psi}(y_{1}, \ldots, y_{k}, l_{1}, \ldots, l_{k}, c_{1}, \ldots, c_{k}) = \frac{ \psi(y_{1}\wedge c_{1}, \ldots, y_{k}\wedge c_{k})\prod\limits_{i = 1}^{k} \mathbb{1}\{y_{i} \leq c_{i}\} \mathbb{1}\{l_{i} \leq z_{i}\}}{ \prod\limits_{i = 1}^{k}\mathbb P\left( l_{i} < z_{i} < c_{i}\right)}.

Analogously to (5.12), we get that

\begin{eqnarray*} &&\mathbb{E}(\tilde{ \Phi}_{\psi}(Y_{1}, \ldots, Y_{k}, L_{1}, \ldots, L_{k}, C_{1}, \ldots, C_{k})\mid(\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t} )\nonumber\\ && \qquad \qquad \qquad = \mathbb{E}\left(\psi(Y_{1}, \ldots, Y_{k})\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right). \end{eqnarray*}

    An analog estimator to (1.1) for LTRC data can be expressed as follows:

    \begin{eqnarray} \breve{\breve{r}}_{n}^{(k)}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}} \ \cdots\Delta_{i_{k}} \epsilon_{i_{1}} \cdots \epsilon_{i_{k}} \psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ \mathbb P\left( L_{i_1} < Z_{i_1} < C_{i_1}\right)\cdots \mathbb P\left( L_{i_k} < Z_{i_k} < C_{i_k}\right)}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}), \end{eqnarray} (5.17)

where \overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}) is defined as in (5.14). As \mathbb P\left(L_i < Z_i < C_i\right) is not known, we need to estimate it. We introduce N_i(t) = \mathbb{1}(L_i < Z_i \leq t, \Delta_i = 1) and N_i^c(t) = \mathbb{1}\left(L_i < Z_i \leq t, \Delta_i = 0\right) as the counting processes corresponding to the variable of interest and to the censoring variable, respectively. Furthermore, let

    N(t) = \sum\limits_{i = 1}^n N_i(t)

    and

    N^c(t) = \sum\limits_{i = 1}^n N_i^c(t).

    We introduce the risk indicators as R_i(t) = \mathbb{1}\left(Z_i \geq t \geq L_i\right) and

    R(t) = \sum\limits_{i = 1}^n R_i(t).

It is important to note that the risk set R(t) at t contains the subjects who entered the study before t and are still under study at t . Indeed, N_i^c(t) is a local sub-martingale with respect to the appropriate filtration \mathbb{F}_t . The martingale associated with the censoring counting process with filtration \mathbb{F}_t is given by

    M_i^c(t) = N_i^c(t)-\int_0^t R_i(u) \lambda_c(u) \mathrm{d} u, \quad i = 1, 2 \ldots, n.

    Here, \lambda_c(\cdot) represents the hazard function associated with the censoring variable C under left truncation. The cumulative hazard function for the censoring variable C is defined as

    \Lambda_c(t) = \int_0^t \lambda_c(u) \mathrm{d} u.

    Denote

    M^c(t) = \sum\limits_{i = 1}^n M_i^c(t).

    Now, we define the sub-distribution function of T_1 corresponding to \Delta_1 = 1 and \epsilon_1 = 1 as

    S(x) = \mathbb P\left(T_1 \leq x, \Delta_1 \epsilon_1 = 1\right) .

    Let

    w(t) = \int_0^{\infty} \frac{h_1(x)}{\mathbb P\left(L_1 \leq x \leq C_1\right)} \mathbb{1}(x > t) \mathrm{d} S(x),

    where h_1(x) = \mathbb E\left(\psi\left(\left(T_1, \Delta_1\right), \ldots, \left(T_k, \Delta_k\right)\right) \mid\left(T_1, \Delta_1\right) = \left(x, \Delta_1\right)\right) . Also, denote \tilde z(t) = \mathbb P\left(T_1 \geq t \geq L_1\right) . Then, an estimate for the survival function of the censoring variable C under left truncation, denoted as \widehat{K}_c(\cdot) , see [174], can be formulated as follows:

    \begin{eqnarray} \widehat{K}_c(\tau) = \prod\limits_{t \leq \tau}\left(1-\frac{d N^c(t)}{\tilde Z(t)}\right). \end{eqnarray} (5.18)

    Similar to the Nelson-Aalen estimator (for instance, see [7]), the estimator for the cumulative hazard function of the censoring variable C under left truncation is represented as:

    \begin{eqnarray} \widehat{\Lambda}_c(\tau) = \int_0^\tau \frac{d N^c(t)}{\tilde Z(t)} . \end{eqnarray} (5.19)

    In both the definitions presented in (5.18) and (5.19), we make the assumption that \tilde Z(t) is non-zero with probability one. The interrelation between \widehat{K}_c(\tau) and \widehat{\Lambda}_c(\tau) can be expressed as:

    \widehat{K}_c(\tau) = \exp \left[-\widehat{\Lambda}_c(\tau)\right].
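The following minimal sketch evaluates the estimators (5.18) and (5.19) from LTRC data; the no-ties simplification and the toy truncation and censoring mechanisms are assumptions of ours.

```python
import numpy as np

def ltrc_censoring_estimators(L, Z, Delta, tau):
    """Nelson-Aalen-type estimator (5.19) of the cumulative censoring hazard
    under left truncation, with both the product-limit form (5.18) and the
    exponential form exp(-Lambda_hat) of K_c(tau)."""
    L, Z, Delta = map(np.asarray, (L, Z, Delta))
    # Censoring event times up to tau (Delta_i = 0 marks a censoring event).
    event_times = np.sort(Z[(Delta == 0) & (Z <= tau)])
    Lambda_hat, K_prod = 0.0, 1.0
    for s in event_times:
        at_risk = int(np.sum((Z >= s) & (L <= s)))   # risk set size at s
        Lambda_hat += 1.0 / at_risk                  # increment dN^c(s)/risk
        K_prod *= 1.0 - 1.0 / at_risk                # product-limit form
    return Lambda_hat, K_prod, float(np.exp(-Lambda_hat))

# Toy LTRC data: a subject is observed only when L <= min(Y, C).
rng = np.random.default_rng(5)
n = 400
Yv, Cv, Lv = rng.exponential(1.0, n), rng.exponential(1.5, n), rng.uniform(0, 0.3, n)
keep = Lv <= np.minimum(Yv, Cv)
Z, Delta, L = np.minimum(Yv, Cv)[keep], (Yv <= Cv).astype(int)[keep], Lv[keep]
print(ltrc_censoring_estimators(L, Z, Delta, tau=1.0))
```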

    Let a_K = \inf \{t: K(t) > 0\} and b_K = \sup \{t: K(t) < 1\} denote the left and right endpoints of the support. For LTRC data, as in [187], F(\cdot) is identifiable if a_G \leqslant a_W and b_G \leqslant b_W . By Corollary 2.2. [187], for b < b_W we readily infer that

    \begin{eqnarray} \sup\limits_{a_W < \tau < b}|\widehat{K}_c(\tau)-K_c(\tau)| = O(\sqrt{n^{-1}\log \log n}). \end{eqnarray} (5.20)

    From the above, the estimator (5.17) can be rewritten directly as follows:

    \begin{eqnarray} \breve{\breve{r}}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}} \ \cdots\Delta_{i_{k}} \epsilon_{i_{1}} \cdots \epsilon_{i_{k}} \psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ \widehat{K}_c\left(Z_{i_1}\right) \cdots \widehat{K}_c\left(Z_{i_k}\right)}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}). \end{eqnarray} (5.21)

The last estimator is the conditional version of that studied in [172]. Following the same reasoning as in Corollary 5.1, one can infer that, as n\rightarrow \infty,

    \begin{eqnarray} &&\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in [C_{1}h, 1-C_{1}h]^m} \left\lvert\breve{\breve {r}}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert \\&&\qquad \qquad\qquad \qquad = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m\phi^m(h_n)}} + h^{2m\wedge\alpha}\right). \end{eqnarray} (5.22)

    Various methodologies have been devised to formulate asymptotically optimal bandwidth selection rules for nonparametric kernel estimators, particularly for the Nadaraya-Watson regression estimator. Key contributions have been made by [42,92,96,150]. The proper selection of bandwidth is pivotal, whether in the standard finite-dimensional case or in the infinite-dimensional framework, to ensure robust practical performance. Currently, to the best of our knowledge, such investigations do not exist for addressing a general functional conditional U -statistic. However, an extension of the leave-one-out cross-validation procedure allows us to define, for any fixed \mathbf{j} = (j_1, \ldots, j_m)\in I_n^m :

    \begin{equation} \widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i} \in I_n^m(\mathbf j)}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i} \in I_n^m(\mathbf j)}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}, \end{equation} (6.1)

    where:

    I^m_n(\mathbf{j}): = \left\{\mathbf{i}\in I_n^m\; \mbox{and}\; \mathbf{i}\neq \mathbf{j}\right\} = I_n^m\backslash \{\mathbf{j}\}.

Equation (6.1) defines the leave-out- \left(\mathbf{X}_{\mathbf{j}}, \mathbf{Y}_{\mathbf{j}}\right) estimator of the functional regression and can also serve as a predictor of \varphi(\mathbf{ Y}_{{j_{1}}}, \ldots, \mathbf{ Y}_{{j_{m}}}): = \varphi(\mathbf{ Y}_{\mathbf j}) . To minimize the quadratic loss function, we introduce the following criterion, where \mathcal{W}(\cdot) is a known non-negative weight function:

    \begin{equation} CV\left(\varphi, h_n\right): = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{j}\in I_n^m}\left(\varphi\left(\mathbf{Y}_{\mathbf{j}}\right)-\widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{X}_{\mathbf{j}}, \boldsymbol{\theta};h_{n})\right)^2\widetilde{\mathcal{W}}\left(\mathbf{X}_{\mathbf{j}}\right). \end{equation} (6.2)

Building upon the concepts advanced by [150], a judicious approach for selecting the bandwidth is to minimize the aforementioned criterion. Therefore, we choose \widehat{h}_{n} \in [a_n, b_n] minimizing, over h \in [a_n, b_n] :

CV\left(\varphi, h \right).

    Following the approach proposed by [20], where bandwidths are locally determined through a data-driven method involving the minimization of a functional version of a cross-validated criterion, we can substitute (6.2) with:

    \begin{equation} CV\left(\varphi, h_n\right): = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{j}\in I_n^m}\left(\varphi\left(\mathbf{Y}_{\mathbf{j}}\right)-\widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{X}_{\mathbf{j}};h_{n})\right)^2\widehat{\mathcal{W}}\left(\mathbf{X}_{\mathbf{j}}, \mathbf{ x}\right), \end{equation} (6.3)

    where

    \widehat{\mathcal{W}}\left(\mathbf{ s}, \mathbf{ t }\right): = \prod\limits_{i = 1}^m\widehat{W}(\mathbf s_{i}, \mathbf t_i).

    In practice, one takes, for \mathbf{ i} \in I_n^m , the uniform global weights \widetilde{\mathcal{W}}\left(\mathbf{X}_{\mathbf{i}}\right) = 1 , and the local weights

    \widehat{W}(\mathbf{X}_{\mathbf{i}}, \mathbf{ t }) = \left\{ \begin{array}{ccl} 1 & \mbox{if}& d_{\boldsymbol{\theta}}(\mathbf{X}_{\mathbf{i}}, \mathbf{ t }) \leq h_n, \\ 0 & & \mbox{otherwise}. \end{array}\right.
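A minimal sketch of the resulting selection rule, written for m = 1 with the uniform global weights and a scalar covariate in place of the functional one (the kernels, the bandwidth grid, and the toy data are assumptions of ours), reads as follows.

```python
import numpy as np

def cv_bandwidth(U, X, Y, grid, phi=lambda y: y):
    """Leave-one-out cross-validation criterion (6.2) for m = 1 and the
    bandwidth minimizing it over a finite grid in [a_n, b_n]."""
    k1 = lambda t: 0.75 * np.maximum(1 - t ** 2, 0.0)
    k2 = lambda t: np.where((t >= 0) & (t <= 1), 1 - t, 0.0)
    n = len(U)
    scores = []
    for h in grid:
        sq_err = 0.0
        for j in range(n):
            w = k1((U[j] - U) / h) * k2(np.abs(X[j] - X) / h)
            w[j] = 0.0                          # leave the j-th observation out
            s = w.sum()
            if s == 0:                          # no usable neighbor: skip point
                continue
            pred = np.sum(w * phi(Y)) / s       # leave-one-out prediction
            sq_err += (phi(Y[j]) - pred) ** 2
        scores.append(sq_err / n)
    return grid[int(np.argmin(scores))], scores

rng = np.random.default_rng(6)
n = 300
U = np.arange(1, n + 1) / n
X = rng.normal(size=n)
Y = np.sin(2 * X) + 0.3 * rng.normal(size=n)
h_hat, _ = cv_bandwidth(U, X, Y, grid=np.linspace(0.05, 0.6, 12))
print("selected bandwidth:", h_hat)
```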

    For conciseness, we have exclusively discussed the widely used cross-validated selected bandwidth method. However, this approach can be generalized to other bandwidth selectors, including those based on Bayesian principles [156].

This manuscript introduces the theory of single-index U -processes tailored specifically for locally stationary variables within the functional data paradigm. The primary objective is to leverage functional locally stationary approximations to facilitate asymptotic analyses in the statistical inference of non-stationary time series. We underscore the significance of adopting absolutely regular conditions or \beta -mixing conditions, which are independent of the entropy dimension of the class, a departure from other mixing conditions. In contrast to strong mixing, \beta -mixing offers greater flexibility, allowing decoupling and accommodation of diverse examples. Reference [103] provided a comprehensive characterization of stationary Gaussian processes satisfying the \beta -mixing condition. Additionally, \beta -mixing interacts naturally with the L_2(\mathbb{P}) -norm, which plays a crucial role here. Unlike strong mixing, which demands a polynomial rate of decay for the strong mixing coefficients contingent on the entropy dimension of the function class, \beta -mixing involves the L_1 -norm and the metric entropy function H(\cdot, T, d) . This function is defined with respect to the pseudo-metric d(s, t) = \sqrt{\operatorname{Var}(\mathbb{G}(s)-\mathbb{G}(t))} for a Gaussian process \mathbb{G}(\cdot) , and is required to satisfy the integrability condition:

    \int_{0}^{1} \sqrt{H(u, T, d)} du < +\infty.

    Consequently, we establish the rate of convergence, demonstrating that, under suitable conditions, the kernel estimator \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) constructed with bandwidth h converges to the regression operator r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) with a rate:

    O_{\mathbb{P}}\left(\sqrt{\frac{\log n}{n h^m\phi^m(h_n)}} + h^{2m\wedge\alpha}\right).

The presented rate underscores the importance of the small-ball probability function, impacting the concentration of functional variables X_i . The second term is linked to the bias of the estimate, dependent on the smoothness of the operator r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) and its parameter \alpha , as indicated by the Lipschitz condition. The interconnected nature of the concentration of functional variables X , the small-ball probability, and the convergence rate is crucial for achieving a more efficient estimator with less dispersed variables and a higher small-ball probability. In the context of empirical process settings, the rate of convergence is established over a subset [C_{1}h, 1-C_{1}h]^m \times \{\mathbf{x}\}^m , and for forecasting purposes, it can be extended to a subset [1-C_{1}h, 1]^m \times \{\mathbf{x}\}^m using one-sided kernels or boundary-corrected kernels. This extension necessitates ensuring that the kernels have compact support and are Lipschitz. The weak convergence is established through classical procedures, involving finite-dimensional convergence and the equicontinuity of conditional U -processes. Finite-dimensional convergence is achieved through the Hoeffding decomposition, followed by approaching independence through a block decomposition strategy, ultimately leading to proving a central limit theorem for independent variables. The equicontinuity aspect requires meticulous attention due to the comprehensive framework considered. These results provide a solid theoretical foundation for our methodologies, extending non-parametric functional principles to a generic dependent structure, a relatively new and significant research area. It is crucial to note that, when dealing with highly dependent data, mixing, often adopted for simplicity, may not be suitable. The ergodic framework eliminates the need for frequently used strong mixing conditions, along with their variations for measuring dependence, and the more intricate probabilistic computations they entail (see [36,37,47]). An intriguing avenue for exploration is the k NN estimator:

    \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{H_{n, k}(x_k)}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{H_{n, k}(x_k)}\right)\right\}},

    where

    H_{n, k}(x_j) = \min\left\{h\in \mathbb{R}^{+}: \sum\limits_{i = 1}^{n} \mathbb{1}_{B(x_j, h)}(X_{i}) = k\right\},

    with \(B(t, h) = \left\{z \in \mathcal{H}: d(z, t) \leqslant h\right\}\) representing a ball in \(\mathcal{H}\) with center \(t \in \mathcal{H} \) and radius \(h\), and \(\mathbb{1}_{A}\) being the indicator function of the set \(A\), as detailed in [3]. These findings open avenues for various applications, such as data-driven automatic bandwidth selection and confidence band construction. We propose the intriguing notion that bootstrap methods, as outlined in [35,38], provide valuable insights when applied in the functional context, especially the functional variant of the wild bootstrap, employed for smoothing parameter selection. It is crucial to acknowledge that the theoretical underpinnings of this bandwidth selection method using the functional wild bootstrap are still an area with unresolved challenges. Finally, change-point detection has emerged as a widely utilized technique for recognizing specific points within a data series when a stochastic system experiences abrupt external perturbations. The occurrence of these alterations can be attributed to several circumstances, hence their identification can greatly enhance their comprehension. The application of change-point analysis has been observed in numerous stochastic processes across a wide range of scientific domains. However, the investigation of change-point analysis for conditional U -statistics remains an unexplored and demanding research issue.
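Before turning to the proofs, we note that the kNN bandwidth H_{n, k}(x) displayed above is straightforward to compute; the following minimal sketch uses a scalar stand-in for the semi-metric d .

```python
import numpy as np

def knn_bandwidth(x0, X, k, d=lambda a, b: np.abs(a - b)):
    """kNN bandwidth H_{n,k}(x0): the smallest radius h such that the ball
    B(x0, h) contains exactly k of the observations X_1, ..., X_n."""
    dists = np.sort(np.array([d(x0, xi) for xi in X]))
    return float(dists[k - 1])               # k-th smallest distance to x0

rng = np.random.default_rng(7)
X = rng.normal(size=100)
print(knn_bandwidth(0.0, X, k=10))
```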

    In this section, we focus on proving our results, using the notation introduced earlier. We start by presenting the following lemma before delving into the proof of the main results. The proof techniques follow and extend those of [165] to the single index setting. Additionally, we incorporate certain intricate steps from [13], as observed in [39,40].

    Proof of Proposition 3.1. As mentioned earlier, our statistic is a weighted U -statistic, expressed as a sum of U -statistics through the Hoeffding decomposition. We will delve into the details of this decomposition in Sub-section 3.1 to achieve our desired outcomes. In that specific section, we observed that:

    \begin{equation*} \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)- \mathbb{E}\left(\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)\right) = \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)+\widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi), \end{equation*}

where the linear term \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) and the residual term \widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) are precisely defined in (3.9) and (3.11), respectively. Our goal is to establish that the linear term governs the convergence rate of this statistic, while the remaining term converges to zero almost surely as n\rightarrow \infty . We will initiate the analysis by addressing the first term in the decomposition. Consider B = [0, 1] , with \(\alpha_{n} = \sqrt{\log n / n h^m \phi^m(h_n)}\) and \(\tau_{n} = \rho_{n} n^{1 / \zeta}\), where \(\zeta\) is a positive constant specified in Assumption 4, part (i), and \(\rho_{n} = (\log n)^{\zeta_{0}}\) for some \(\zeta_{0}>0\). Define

\begin{align} \tilde{\mathbb H}^{(\ell)}_1(z) & : = \tilde{\mathbb H}^{(\ell)}(z) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert \leq \tau_{n}\right\}}, \end{align} (8.1)
\begin{align} \tilde{\mathbb H}^{(\ell)}_2(z) & : = \tilde{\mathbb H}^{(\ell)}(z) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n}\right\}}, \end{align} (8.2)

    and

    \begin{align} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) & = \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_\ell \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_1(z), \\ \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) & = \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_\ell \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_2(z). \end{align}

    It is evident that we have

    \begin{aligned} &\widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) - \mathbb{E} \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) \\ &\;\;\; = \left[ \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right]+\left[\widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]. \end{aligned} (8.3)

    First, we can see that

\begin{aligned} & \mathbb{P} \left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) \right\rvert > \alpha_{n}\right) \\ & = \mathbb{P} \left(\left\{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) \right\rvert > \alpha_{n}\right\}\right. \\ & \;\;\;\;\;\;\left.\bigcap \left(\left \{ \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m} \bigcup\limits_{i = 1}^{n}\left\lvert {\mathfrak W}_{i, n} \right\rvert > \tau_{n} \right \}\bigcup \left \{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m} \left\{\bigcup\limits_{i = 1}^{n} \left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n} \right\}^{\boldsymbol{c}}\right \}\right)\right) \\ &\leq \mathbb{P}\left\{ \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) \right\rvert > \alpha_{n}\right. \\ &\left.\qquad\qquad\;\;\;\;\;\;\; \bigcap \left \{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \bigcup\limits_{i = 1}^{n}\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n} \right \}\right\} \\ & \;\;\;\;\;\; + \mathbb{P}\left\{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert > \alpha_{n}\right. \\ &\;\;\;\;\;\;\;\left.\qquad \qquad \bigcap \left \{ \left\{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\bigcup\limits_{i = 1}^{n} \left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n} \right\}^{\boldsymbol{c}}\; \; \right \}\right\} \\ & \leq \mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n} \text { for some } i = 1, \ldots, n\right) + \mathbb{P}( \varnothing ) \\ & \leq {\tau_{n}^{-\zeta} \sum\limits_{i = 1}^{n} \mathbb{E}\left[\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in B^m}\left\lvert {\mathfrak W}_{i, n}\right\rvert^{\zeta}\right] \leq n \tau_{n}^{-\zeta} = \rho_{n}^{-\zeta} \rightarrow 0 }. \end{aligned}

    We deduce that

    \begin{align} \mathbb{E}\left[\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert\right] & \leq \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \mathbb{E}\left(\left\lvert \tilde{\mathbb H}^{(\ell)}_2(z)\right\rvert\right), \end{align} (8.4)

    where

    \begin{aligned} &\mathbb{E}\left(\left\lvert \tilde{\mathbb H}^{(\ell)}_2(z)\right\rvert\right) = \mathbb{E}\left[\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) {\mathfrak W}_{i, n}\times \int {\mathfrak W}_{( {1}, \ldots, \ell-1 , \ell, \ldots, {m})} \right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{\underset{k \neq i}{k = 1}}^m \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n}\right\}}\right] \\ & \;\;\;\lesssim \tau_{n}^{-(\zeta-1)} \mathbb{E}\left[\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) \left\rvert {\mathfrak W}_{i, n}\right\rvert ^\zeta\right] \\ & \;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ \phi(h_n)} \mathbb{E}\left[ K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right)\right] \\ &\;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ \phi(h_n)} \times \left[\frac{1}{n h}+ \phi(h_n)\right] \\ &\;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}, \end{aligned} (8.5)

    where

\begin{aligned} & \mathbb{E}\left(K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n} \right)}{h_n}\right)\right)\\ & \;\;\;= \mathbb{E}\left( K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right) +K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right) -K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right)\\ &\;\;\;\leqslant \mathbb{E}\left\vert K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right) -K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right\vert + \mathbb{E} \left\vert K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right\vert\\ &\;\;\;\lesssim \quad C h^{-1} \mathbb{E}\left\vert d_{\theta_i}\left(x_i, X_{i, n}\right) - d_{\theta_i}\left(x_i, X_{i}^{i/n}\right) \right\vert + \mathbb{E}\left[ \mathbb{1}_{\left(d\left(x, X_{i, n}^{(i / n)}\right) \leq h\right)}\right] \;(\text{since } K_{2} \text{ is Lipschitz})\\ &\;\;\;\lesssim \frac{1}{n h} \mathbb{E} \left\vert U_{i}^{(i /n)}\right\vert+F_{i / n}(h ; x_i) \;(\text{using Assumption 1 (i)})\\ &\;\;\;\lesssim \frac{1}{n h}+ \phi(h_n). \end{aligned}

    Thus, we acquire

    \begin{aligned} &\mathbb{E}\left[\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert\right] \leq \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \mathbb{E}\left(\left\lvert H^{(\ell)}_2(z)\right\rvert\right) \nonumber\\ &\lesssim \underset{\leq C \; \; \text{uniformly in} \;\textbf{u} }{\underbrace{\frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}}}} \times \left[\frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}\right]\\ &\lesssim \left[\frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}\right] \lesssim \tau_{n}^{-(\zeta-1)} = (\rho_{n} n^{1 / \zeta})^{-(\zeta-1)} \lesssim \alpha_n. \end{aligned}

    Consequently, we deduce that

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert = O_ \mathbb{P}(\alpha_n). \end{equation} (8.6)

    Next, let's consider

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert.

    To achieve the aimed result, we will cover the region B^m = [0, 1]^{m} by

    \bigcup\limits_{k_1, \ldots, k_m = 1}^{N_{(\mathbf{u})}}\prod\limits_{j = 1}^m\mathbf{B}(\mathbf{u}_{k_j}, r),

    for some radius r . Hence, for each \mathbf{u} = (\mathbf{u}_1, \ldots, \mathbf{u}_m)\in [0, 1]^{m} , there exists \mathbf{l}(\mathbf{u}) = (l(\mathbf{u}_1), \ldots, l(\mathbf{u}_m)) , where \forall 1 \leq i \leq m, 1\leq l(\mathbf{u}_i)\leq N_{(\mathbf{u})} and such that

    \mathbf{u} \in \prod\limits_{i = 1}^m \mathbf{B}(\mathbf{u}_{l(\mathbf{u}_i)}, r) \; \mbox{ and }\; \vert \mathbf{u}_i - \mathbf{u}_{l(\mathbf{u}_i)}\vert\leq r , \; \mbox{ for }\; 1\leq i\leq m,

    then, for each \mathbf{u}\in [0, 1]^{m} , the closest center will be \mathbf{u}_\mathbf{l}(\mathbf{u}) , and the ball with the closest centre will be defined by

\mathcal{B}(\mathbf{u}, \mathbf{l}(\mathbf{u}), r): = \prod\limits_{j = 1}^m\mathbf{B}(\mathbf{u}_{l(\mathbf{u}_j)}, r).

    In the same way, \Theta^m\times \mathcal{H}^m should be covered by

\bigcup\limits_{\tilde k_1, \ldots, \tilde k_m = 1}^{N_{(\theta)}}\bigcup\limits_{k_1, \ldots, k_m = 1}^{N_{(x)}}\prod\limits_{j = 1}^m\left(\mathbf{B}(\theta_{\tilde k_j}, r)\times\mathbf{B}(x_{k_j}, r)\right),

    for some radius r . Hence, for each \mathbf{x} = (x_1, \ldots, x_m)\in \mathcal{H}^m , there exists \mathbf{l}(\mathbf{x}) = (l(x_1), \ldots, l(x_m)) , where \forall 1 \leq i \leq m, 1\leq l(x_i)\leq {N_{(x)}} and such that

\mathbf{x} \in \prod\limits_{i = 1}^m B(x_{l(x_i)}, r) \; \mbox{ and }\; d_{\theta_i}( x_i, x_{l(x_i)})\leq r, \; \mbox{ for }\; 1\leq i\leq m,

    then, for each \mathbf{x}\in \mathcal{H}^m , the closest center will be \mathbf{x}_\mathbf{l}(\mathbf{x}) , and the ball with the closest centre will be defined by

    \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, \mathbf{l}(\mathbf{x}), r): = \prod\limits_{i = 1}^m B_{\theta_i}(x_{l(x_i)}, r).

    We define

K^*(\boldsymbol{\omega}, \boldsymbol{v}) = C \prod\limits_{k = 1}^m \mathbb{1}_{(\lvert\omega_k \rvert\leq 2C_1)}\prod\limits_{k = 1}^m K_2(v_k)\; \; \mbox{for}\; \; (\boldsymbol{\omega}, \boldsymbol{v}) \in \mathbb{R}^{m}\times\mathbb{R}^{m}.

    We can show that, for (u, x) \in B _{j, n} and n large enough,

    \begin{align*} &\left\lvert \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-\frac{i_k}{n}}{h_n}\right)-\prod\limits_{k = 1}^m K_{1}\left(\frac{u_{j, k}-\frac{i_k}{n}}{h_n}\right)\right\rvert K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) \\ &\quad \leq \alpha_{n} K^{*}\left(\frac{u_{n}-\frac{i}{n}, d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right). \end{align*}

    Let

    \begin{aligned} &\bar{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{1}{n h^m \phi(h_n)} \sum\limits_{i = 1}^{n} \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)}\prod\limits_{k = 1}^m K^{*}\left(\frac{u_k-\frac{i_k}{n}, d_{\theta_k}\left(x_{k}, X_{i_k, n}\right)}{h_n}\right){\mathfrak W}_{i, n} \\ &\;\;\;\;\times \int {\mathfrak W}_{( {1}, \ldots, \ell-1 , \ell, \ldots, {m})} \prod\limits_{\underset{k \neq i}{k = 1}}^m \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert \leq \tau_{n}\right\}}. \end{aligned}

    Note that \mathbb E\left[\left\lvert \bar{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert\right] \leq M < \infty for some sufficiently large M . Then, we obtain

\begin{aligned} & \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert \\ &\;\;\;\leq \;\;\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert \\ & \;\;\;\;\;\;\;\;\;\qquad + \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\alpha_{n}\left(\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right\vert+ \mathbb{E}\left[\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right\vert\right]\right) \\ &\;\;\;\leq\;\; \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert \\ & \qquad \;\;\;\;\;\;\;\;\; + \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert+2 M F(\mathbf{y}) \alpha_{n} . \end{aligned} (8.7)

    Therefore

    \begin{aligned} &\mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m}\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert > 4 M \alpha_{n}\right) \\ &\leq {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^mN_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\&\qquad \qquad \qquad \qquad \qquad \mathbb{P}\left(\left\vert\widehat{\psi}^{(1)}_{1}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert > 4 M \alpha_{n}\right) \\ &\leq Q_{1, n}+Q_{2, n}, \end{aligned} (8.8)

where N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} , N_{(\theta)}^m , N_{(x)}^m , and N_{(u)}^m denote the covering numbers related, respectively, to the class of functions \mathscr{F}_m\mathfrak{K}^m_\Theta , the balls that cover \Theta^m and \mathcal{H}^m , and the balls that cover [0, 1]^m .

    \begin{eqnarray} Q_{1, n}& = & {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^mN_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\ &&\;\;\;\qquad \qquad \qquad \qquad \mathbb P\left(\left\vert\widehat{\psi}_{1}\left(\mathbf{u}_{j}, \mathbf{x}\right)- \mathbb{E}\left[\widehat{\psi}_{1}\left(\mathbf{u}_{j}, \mathbf{x}\right)\right]\right\vert > M \alpha_{n}\right), \\ Q_{2, n}& = & {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^m N_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\ &&\;\;\;\qquad \qquad \qquad \qquad \mathbb P\left(\left\vert\bar{\psi}_{1}\left(\mathbf{u}_{j}, \mathbf{x}\right)- \mathbb{E}\left[\bar{\psi}_{1}\left(\mathbf{u}_{j}, \mathbf{x}\right)\right]\right\vert > M \alpha_{n}\right). \end{eqnarray}

Notice that Q_{1, n} and Q_{2, n} might be treated in the same way, so we restrict our attention to Q_{1, n} . Write:

    \begin{aligned} &\mathbb{P}\left(\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > M \alpha_{n}\right) \\ &\;\;\; = \mathbb{P}\left[\left\vert h^m \phi^m(h_n) \sum\limits_{i = 1}^n \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} H^{(\ell)}_1(z)\right.\right. \\ & \;\;\;\;\;\;\;\;-\left.\left. \mathbb{E}\left( h^m \phi^m(h_n)\sum\limits_{i = 1}^n \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} H^{(\ell)}_1(z)\right)\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\right. \\ &\qquad\;\;\;\;\;\;\;\;\;\qquad\qquad\qquad \left. > M n \frac{(n-1)!}{(n-m)!} \alpha_{n} h^m \phi^m(h_n)\right] \\ & \;\;\;= \mathbb{P}\left(\left\vert \sum\limits_{i = 1}^n \Phi_{i, n}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > M n \frac{(n-1)!}{(n-m)!} \alpha_{n} h^m \phi^m(h_n)\right) . \end{aligned}

    Note that the array \left\{\Phi_{i, n}(u, x)\right\} is \alpha -mixing for each fixed (u, x) with mixing coefficients \beta_{\Phi, n} such that \beta_{\Phi, n}(k) \leq \beta(k) . We apply Lemma A.4 with

    \varepsilon : = M n \frac{(n-1)!}{(n-m)!}h^m \phi^m(h_n) \alpha_{n},

and b_{n} = C \tau_{n} for sufficiently large C > 0 and S_n = \alpha_n^{-1} \tau_{n}^{-1} . Exactly as in [133, Theorem 2], we can see that \sigma_{S_{n}, n}^{2} \leq C^\prime S_n h^m \phi^m(h_n) , and we obtain:

    \begin{align*} { \mathbb{P} \left(\left\vert\sum\limits_{i = 1}^{n} Z_{i, n}(u, x)\right\vert \geq \varepsilon \right)} &\leq4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \beta\left(S_{n}\right) \nonumber \\ &\leq 4 \exp \left( - \frac{M^2 \alpha_{n}^2 n^2 \left(\frac{(n-1)!}{(n-m)!}\right)^2 h^{2m} \phi^{2m}(h_n) }{64 C^\prime S_n h \phi(h_n) \frac{n}{S_{n}} +\frac{8}{3} M n \frac{(n-1)!}{(n-m)!} h^m \phi^m(h_n) \alpha_{n} b_n S_{n} } \right) +4 \frac{n}{S_{n}} \beta\left(S_{n}\right) \nonumber \\ &\leq 4 \exp \left(- \frac{ M \left(\sqrt{\log n / n h^m \phi^m(h_n)}\right)^2 n \left(\frac{(n-1)!}{(n-m)!}\right) }{64 C^\prime h^m \phi^m(h_n) \frac{(n-m)!}{M(n-1)! } +\frac{8}{3} C h^m \phi^m(h_n) } \right) + 4 \frac{n}{S_{n}} \beta\left(S_{n}\right)\\ &\lesssim \exp \left( - \frac{M \frac{(n-1)!}{(n-m)!} \log n }{64 \frac{(n-m)!}{(n-1)!} \frac{C^\prime}{M}+ \frac{8}{3} C } \right)+ n S_{n}^{-\gamma-1}. \end{align*}

    To get the last inequality, we must choose M > C^\prime . Since N \leq C h^{-m}\phi(h_n) \alpha_{n}^{-m} , it follows that

    \widehat{Q}_{n} \leq O\left(R_{1n}\right)+O\left(R_{2 n}\right),

    with

    \begin{align*} &R_{1 n} = h^{-m}_n \alpha_{n}^{-m} n^{- \frac{M \frac{(n-1)!}{(n-m)!} }{64 \frac{(n-m)!}{(n-1)!} +3 C }}, \\ &R_{2 n} = h^{-m}_n \alpha_{n}^{-m} n S_{n}^{-\gamma-1}. \end{align*}

    For M sufficiently large, we can see that R_{1 n} \leq n^{-\varsigma} for some small \varsigma > 0 . As \frac{\phi(h_n) \log n}{n^{\theta} h^{d+1}_n} = o(1) by assumption, we further get that

    \begin{eqnarray*} R_{2 n} & = & h^{-m}_n\alpha_{n}^{-m} n S_{n}^{-\gamma-1} \\ & = & h^{-m}_nn \left(\sqrt{\frac{\log n }{ n h^m \phi^m(h_n)}}\right)^{-m} (\alpha_n^{-1} \tau_n^{-1})^{-\gamma-1} \\ & = & h^{-m}_n \left(\sqrt{\frac{\log n }{ n h^m \phi^m(h_n)}}\right)^{-m+\gamma+1}((\log n)^{ \zeta_0} n^{1 / \zeta})^{\gamma+1}\\ & = & \frac{(\log n) ^{\frac{-m+\gamma+1}{2}+ \zeta_0(\gamma+1)}}{n ^{\frac{-m+\gamma+1}{2} -1 -\frac{\gamma+1}{ \zeta} }h^{\frac{m+\gamma+1}{2}} \phi(h_n)^{\frac{-m+\gamma+1}{2}}}. \end{eqnarray*}

    Under Assumption 4 ⅱ), we establish that R_{2n} tends to zero, confirming the obtained result. Now, let's move on to the nonlinear segment of the Hoeffding decomposition. Here, the goal is to illustrate that

    \begin{equation*} \mathbb{P} \left[\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\vert\widehat{\psi}_{2, \mathbf{i}}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert > \lambda \right] \rightarrow 0 \; \; \text{as}\; \; n\rightarrow \infty. \end{equation*}

    The final step in establishing Proposition 3.1 is to apply Lemma 8.2 to show that the nonlinear term converges to zero.

    Proof of Theorem 3.3. Equation (4.1) in Section 4 shows that

    \begin{eqnarray} {\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} = \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right), \end{eqnarray}

    where

    \begin{eqnarray*} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}, \\ \widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} {\mathfrak W}_{\mathbf{i}, \varphi, n}, \\ \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n} , X_{\mathbf{i}, n}\right). \end{eqnarray*}
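    To fix ideas, the following is a minimal numerical sketch of these building blocks in the simplest case m = 1 ; the kernels, the projection-type semi-metric d_\theta , the weights, and the small-ball normalization \phi(h) are illustrative placeholders rather than the paper's specification.

```python
import numpy as np

# Minimal numerical sketch (not the paper's implementation) of the denominator
# \tilde{r}_1 and the weighted numerator \hat{g}_1 above, for the simplest
# case m = 1, with curves discretized on a grid.  K1 (time kernel), K2
# (distance kernel), d_theta, and phi_h are all illustrative choices.

rng = np.random.default_rng(0)

n, grid = 200, 50
X = rng.standard_normal((n, grid))        # X_{i,n}: n curves on `grid` points
W = rng.standard_normal(n)                # stand-in for the weights \mathfrak{W}
theta = np.ones(grid) / np.sqrt(grid)     # index direction, ||theta|| = 1

K1 = lambda t: 0.75 * np.maximum(1.0 - t ** 2, 0.0)   # Epanechnikov kernel
K2 = lambda t: np.exp(-t) * (t >= 0)                  # one-sided distance kernel

def d_theta(x, X_all):
    """Single-index semi-metric d_theta(x, X_i) = |<theta, x - X_i>| (one choice)."""
    return np.abs((x - X_all) @ theta)

def r1_and_g1(u, x, h, phi_h):
    """Return (tilde_r1, hat_g1) at rescaled time u and curve x, for m = 1."""
    i_over_n = np.arange(1, n + 1) / n
    w = K1((u - i_over_n) / h) * K2(d_theta(x, X) / h)
    scale = n * h * phi_h                 # (n-m)!/n! * h^m * phi^m(h) with m = 1
    return w.sum() / scale, (w * W).sum() / scale

print(r1_and_g1(u=0.5, x=np.zeros(grid), h=0.2, phi_h=0.1))
```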

    The proof of this theorem is intricate and is divided into the following four steps; in each step, our objective is to demonstrate that:

    Step 1.

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \vert\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) \vert = O_{ \mathbb{P}}\left(\sqrt{\log n / n h^m \phi(h_n)}\right).

    Step 2.

    \begin{aligned} & \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \vert \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}) \\ &\;\;\;\; - \mathbb{E}(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n})) \vert = O_{ \mathbb{P}}\left(\sqrt{\log n / n h^m \phi(h_n)}\right) . \end{aligned}

    Step 3.

    \begin{eqnarray} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert \mathbb{E}(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}))\right\vert = O(h^2) +O(h^\alpha) . \end{eqnarray}

    Step 4.

    \begin{equation*} \frac{ 1}{ \inf\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\inf\limits_{\mathbf{x} \in \mathcal{H}^m} \inf\limits_{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\vert} = O_{ \mathbb{P}}(1). \end{equation*}

    Step 1 is an immediate consequence of Proposition 3.1. The validity of the second step is ensured by substituting \varphi(Y_{i_1}, \ldots, Y_{i_m}) with

    \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}),

    and applying Proposition 3.1. We will now proceed with the demonstration of Step 4. Consider

    \begin{equation} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) + \bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}), \end{equation} (8.9)

    where

    \begin{eqnarray*} \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})& = &\frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}, \\ \bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})& = &\frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\\ &&\prod\limits_{k = 1}^m \left[K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]. \end{eqnarray*}

    For {\mathfrak W}_{\mathbf{i}, \varphi, n}\equiv 1 , the preceding proposition established that

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}\left(\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right)\right\vert = o_ \mathbb{P}(1).

    Therefore, it is evident that

    \begin{eqnarray} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) & = & \left(\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}( \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}))\right) + \mathbb{E}(\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})) \\ & = & o_ \mathbb{P}(1) + \mathbb{E}[\widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})]+ \mathbb{E}[\bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})]. \end{eqnarray} (8.10)

    Furthermore, we have

    \begin{eqnarray} { \mathbb{E}\left(\bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right)} & = & \mathbb{E}\left(\frac{(n-m)!}{n!h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\right. \\ &&\left.\prod\limits_{k = 1}^m\left[K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] \right) \\ & \lesssim& \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \left(\frac{m \phi^{m-1}(h_n)}{n h_n }\right) = o(1). \end{eqnarray} (8.11)

    The final bound follows from the Lipschitz continuity of K_2(\cdot) (as stipulated in Assumption 2 ⅰ)), together with Assumption 1 ⅰ) and Lemma A.2. This holds uniformly in \mathbf{u} . Furthermore,

    \begin{aligned} & \mathbb{E}\left[ \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right] = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\\ &\;\;\; = \frac{(n-m)!}{n!h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^mK_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \int_{0}^h \prod\limits_{k = 1}^m K_2\left(\frac{y_k}{h_n}\right) dF_{i_k/n}(y_k, x_k)\\ & \;\;\;\gtrsim \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \phi^m(h_n)f_1(\mathbf{x}) \sim f_1(\mathbf{x}) > 0, \end{aligned}

    uniformly in \mathbf{u} . Then, we obtain

    \begin{aligned} & \frac{1}{\underset{\mathscr{F}_m\mathfrak{K}^m_\Theta}{\inf} \underset{\mathbf{x} \in \mathcal{H}^m}{\inf} \underset{\mathbf{u} \in [C_1h, 1-C_1h]^m}{\inf} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\vert} \\ &\;\;\; = \frac{1}{\underset{\mathscr{F}_m\mathfrak{K}^m_\Theta}{\inf} \underset{\mathbf{x} \in \mathcal{H}^m}{\inf} \underset{\mathbf{u} \in [C_1h, 1-C_1h]^m}{\inf} o(1) + o_ \mathbb{P}(1) + \mathbb{E}\left[ \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right] } = O_ \mathbb{P}(1). \end{aligned} (8.12)

    Take K_0:[0, \infty)\rightarrow \mathbb{R} to be a Lipschitz continuous function with its support in [0, q] for some q > 1 , and ensure that K_0(x) = 1 for all x \in [0, 1] .
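    For instance, one admissible choice is the plateau function

    K_0(x) = \min\left\{1, \; \max\left\{0, \; \frac{q-x}{q-1}\right\}\right\},

    which equals 1 on [0, 1] , decreases linearly to 0 on [1, q] , and is Lipschitz with constant 1/(q-1) . Importantly,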

    \begin{equation} \mathbb{E}\left[\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}) \right] = \sum\limits_{i = 1}^{4} Q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x), \end{equation} (8.13)

    where Q_i can be defined as follows

    \begin{equation} Q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\}q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x), \end{equation} (8.14)

    such that

    \begin{eqnarray*} q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^mK_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\left\{\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right.\\ &&\left.\left.- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\} \times \left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right], \\ q_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^m\left\{ K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}\right.\\ &&\left.\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n})-r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})\right\}\right], \\ q_3(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\left\{\prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) - \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}\right.\\ &&\left. \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \times\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right], \\ q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right]. \end{eqnarray*}

    Observe that

    \begin{eqnarray*} Q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) &\lesssim & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right.\\ &&\left. \left. \left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert \right.\right.\\ &&\left. \left.\times\left\vert r^{(m)}(\varphi, \frac{\boldsymbol i} {n}, \boldsymbol X_{\boldsymbol i, n})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert\right]\right\}. \end{eqnarray*}

    Utilizing Assumption 3 (ⅰ), the properties of r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf u, \mathbf x) allow us to establish that

    \begin{aligned}& \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\left\vert r^{(m)}\left(\varphi, \frac{\boldsymbol i}{n}, \boldsymbol X_{\boldsymbol i, n}\right)-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert \\ & \;\;\;\;\;\lesssim \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \left(d_{\mathcal{H}^m}\left( \boldsymbol X_{\boldsymbol i, n}, \mathbf{x}\right) + \|\mathbf{u} - \frac{\boldsymbol i}{n} \|\right)^{\alpha} \lesssim h^{m \wedge \alpha}. \end{aligned}

    With Assumption 2, part ⅱ) in mind, we will employ Lemma 8.1 and Eq (8.55) to verify that:

    \begin{eqnarray} Q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) &\lesssim & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\right\} \times h^{m \wedge \alpha} \times \frac{m \phi^{m-1}(h_n)}{n h_n } \\ &\lesssim& \frac{1}{n \phi(h_n) h^{m-(m \wedge \alpha)}}\; \; \; \; \text{uniformly in}\; \mathbf{u} . \end{eqnarray} (8.15)

    Likewise, it can be observed that

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} Q_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) \lesssim \frac{1}{n \phi(h_n) h^{m-(m \wedge \alpha)}}, \end{equation} (8.16)

    and

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in [C_1h, 1-C_1h]^m} Q_3(\boldsymbol{\theta}, \mathbf u, \mathbf x) \lesssim \frac{1}{n \phi(h_n) h^{m-(1 \wedge \alpha)}}. \end{equation} (8.17)

    Concerning the final term, we derive

    \begin{eqnarray*} Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \\ && \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right]. \end{eqnarray*}

    By utilizing Lemma A.1 along with the inequality (2.12) and considering Assumption 1, it becomes evident that

    \begin{aligned}& \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right\vert \\ &\;\;\;\leq \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \right\vert \\ & \;\;\;\;\;\;\;\mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; \left\vert\left\{r^{(m)}\left(\varphi, \frac{\mathbf i}{n}, X_{\mathbf i, n}^{(\mathbf i/n)}\right)-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right\vert \right] \\ &\;\;\;\lesssim \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \right\vert \\ & \;\;\;\;\;\;\;\mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; \left(d_{\mathcal{H}^m}\left(X_{\mathbf i, n}^{(\mathbf i/n)}, \mathbf{x}\right) + \|\mathbf{u} - \frac{\mathbf i}{n} \|\right)^{\alpha} \right] \\ &\;\;\;\lesssim \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right. \\ & \;\;\;\;\;\;\;\left. -\int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{(u_k-v_k)}{h_n}\right) d v_k \right\vert \mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; h^{\alpha}_n\right] \\ & \;\;\;\;\;\;\;+ \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m} \int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-v_k}{h_n}\right) d v_k \\ &\;\;\;\;\;\;\;\times \mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; h^{\alpha}_n\right] \\ &\;\;\;\lesssim O\left(\frac{1}{n^mh^{2m}_n}\right) \; \; h^{\alpha}_n + h^{\alpha}_n. \end{aligned} (8.18)

    Keeping in mind that

    n^{-m} h^{\alpha -2m}_n \lesssim h^{2m}_n\phi^m(h_n)\ll h^{2m}_n,

    we deduce that

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right\vert \ll h^{2m}_n+h^{\alpha}_n.

    Within the framework of our assumptions, the approximation error can be bounded as

    \begin{equation} O \left(\frac{1}{n^m \phi(h_n) h^{m-(1 \wedge \alpha)}_n}\right) \ll h^{2m\wedge \alpha}_n. \end{equation} (8.19)

    With this inequality, the proof is concluded.

    Proof of Theorem 4.1. The goal is to establish weak convergence, encompassing finite-dimensional convergence and asymptotic equicontinuity, for the stochastic U -process:

    \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(\varphi) = \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left(\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf x, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf x, \boldsymbol{\theta})-B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right),

    across all the specified function classes within the framework. According to [64, Section 4.2], finite-dimensional convergence asserts that, for every finite set of functions f_1, \ldots, f_q in L_2 , the vector

    \begin{equation} \left(\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_1), \ldots, \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_q)\right) \end{equation} (8.20)

    converges to the corresponding finite-dimensional distributions of the process G_p . By the Cramér-Wold device and the countability of the various classes, we can reduce the weak convergence of the U -process to the weak convergence of U -statistics with the kernel f_r for all r \in \{1, \ldots, q\} . As the U -process is a linear operator, our focus narrows down to demonstrating the convergence of \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_r) towards a Gaussian distribution. Therefore, for a fixed kernel, we have:

    \begin{aligned}& \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) \\&\;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) \right) \\ &\;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+ \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \right), \end{aligned} (8.21)

    where

    \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}).

    We treat each term in turn. To this end, we first calculate the variance of \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) . Take

    \Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \prod\limits_{k = 1}^mK_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \left[ r^{(m)}(\varphi, \frac{\mathbf{ i}}{n} , \mathbf{u}, X_{\mathbf{i}, n})- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right].

    Observe that

    \begin{eqnarray} Var(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)) & = & Var\left(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n})\right) \\ & = & Var\left(\frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ && + \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{i_1\neq \ldots\neq i_m, \\ i_1^\prime\neq \ldots\neq i_m^\prime, \\ \exists j / i_j\neq i_j^\prime}} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k^\prime/n}{h_n}\right) \\ && Cov\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x), \Delta_{\boldsymbol i^\prime, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ &: = & V_1 + V_2. \end{eqnarray} (8.22)

    Concerning V_1 , we have

    \begin{eqnarray} \vert V_1\vert & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left[ \mathbb{E} \left(\Delta_{\boldsymbol i, n}^2(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) - \left( \mathbb{E} \left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right)\right)^2\right] \\ &\leq& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E} \left(\Delta_{\boldsymbol i, n}^2(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ &\leq& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E} \left(\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right. \\ && \;\;\;\;\;\qquad \qquad \qquad \left.\left[r^{(m)}(\varphi, \frac{\mathbf{ i}}{n} , \mathbf{u}, X_{\mathbf{i}, n})- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right]^2\right) \\ &\leq& \frac{((n-m)!)^{2} h^{2\alpha}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left\{ \mathbb{E} \left[\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right. \\ && \;\;\;\qquad \qquad\left.\left.- \prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] + \mathbb{E} \left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\right\}. \end{eqnarray} (8.23)

    The last step in (8.23) follows from the smoothness assumption on r^{(m)} in Assumption 3 ⅰ):

    \begin{eqnarray} &&\left\vert r^{(m)}\left(\varphi, \frac{\mathbf{ i}}{n} , \mathbf{u}, X_{i, n}\right)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right\vert ^2{ \lesssim h^{2\alpha}}. \end{eqnarray} (8.24)

    We combine the latter inequalities with Eqs (8.60) and (8.58) from Lemma 8.1, namely

    \begin{aligned} & \mathbb{E} \left(\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \\ &\;\;\;\lesssim \mathbb{E} \left(\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right), \end{aligned}

    to get

    \begin{eqnarray} \vert V_1\vert &\lesssim& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \\ &\lesssim& \frac{h^{2\alpha}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \left[ \frac{m \phi^{m-1}(h_n)}{n h_n }+\phi^{m}(h_n)\right] \ll \frac{1}{n h^{m}\phi(h_n) }. \end{eqnarray} (8.25)

    A close look at the work of [10], especially Lemma 2, allows us to treat V_2 as follows:

    \begin{eqnarray} \vert V_2\vert & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \\ &&\times Cov\left(\Delta_{i_{\sigma(1)}, \ldots, i_{\sigma(m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x), \Delta_{i_{\sigma(m+1)}, \ldots, i_{\sigma(2 m)}, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ &\leq & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \\ && \left\vert \mathbb{E}\left[\Delta_{i_{\sigma(1)}, \ldots, i_{\sigma(m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \Delta_{i_{\sigma(m+1)}, \ldots, i_{\sigma(2 m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right]\right\vert \\ &\leq & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \times c M^{2}\left(1+\sum\limits_{k = 1}^{n-1} k^{m-1} \beta_{k}^{(p-2) / p}\right) \\ &\lesssim& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \times M^{2} \ll \frac{1}{n h^{m}\phi(h_n) }, \end{eqnarray} (8.26)

    where

    M: = \sup\limits_{1 \leq i_{1} < \cdots < i_{m} < \infty} \mathbb{E}\left[\left\vert \Delta_{\boldsymbol i, n}\right\vert^{\zeta}\right]^{1 / \zeta},

    and j_{1} = i_{2}-i_{1} , j_{l} = \min \left(i_{2 l-1}-i_{2 l-2}, i_{2 l}-i_{2 l-1}\right) for 2 \leq l \leq m-1 , and j_{m} = i_{2 m}-i_{2 m-1} . Assuming, without loss of generality, that j_{1} = \max \left(j_{1}, \ldots, j_{m}\right) , we can couple the original sequence \left\{X_{1}, \ldots, X_{n}\right\} with a sequence built from independent blocks of indices \{i_{1}, i_{2}, \ldots, i_{2 m}\} while preserving the identical block distribution. It is straightforward to demonstrate now that

    \begin{equation} Var(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)) \leq \vert V_1\vert +\vert V_2\vert = o\left( \frac{1}{n h^{m}_n\phi(h_n)} \right). \end{equation} (8.27)

    This indicates the quadratic-mean convergence of \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) with the specified rate as follows:

    \begin{equation} \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) - \mathbb{E} \left(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) = o\left( \frac{1}{n h_n^m\phi(h_n)} \right) \text{in probability.} \end{equation} (8.28)

    Recall that

    \begin{eqnarray} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})& = & \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+ \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \right), \\ B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)& = & \mathbb{E}[\widehat{g}^{B}(u, x)]/ \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})], \\ \widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) & = & \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})]+ o_ \mathbb{P}(1), \\\lim\limits_{n\rightarrow \infty } \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})] & > &0, \end{eqnarray}

    then

    \begin{eqnarray} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})& = & \frac{\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}+ B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) +o_ \mathbb{P}\left( \frac{1}{n h^{m}_n\phi(h_n)} \right). \end{eqnarray}

    In the next step, we will consider the first part of the last equation.

    \begin{aligned} & \sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} Var(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)) \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \varepsilon_{\mathbf{i}, n}\right] \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \varepsilon^2_{\mathbf{i}, n}\right] \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \sigma^2\left(\boldsymbol{\theta}, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) \varepsilon^2_{\mathbf{i}}\right] \\ & \;\;\;\;\;\left(\left\{\varepsilon_{\mathbf{i}}\right\}_{\mathbf{i} \in \mathbb{Z}} \text{ is a sequence of i.i.d. r.v.'s, independent of }\left\{X_{\mathbf{i}, n}\right\}_{\mathbf{i} = 1}^n\right) \\ &\;\; = \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}})\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \right]\left[ \sigma^2\left(\boldsymbol{\theta}, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) \right] \\ &\;\; = \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}})\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \right]\left[ \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) + o(1) \right] \quad \text{(according to Assumption 3 [ⅱ)-ⅲ)-ⅳ)])} \\ & \;\;\;\;\;+ \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}}) o(\phi^m(h_n))\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left[ \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x} \right)+ o(1) \right] \\ &\;\; = \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}})\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \right]\left[ \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x} \right) + o(1)\right] + o(1) \\ &\;\;\sim \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}}) (\sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) + o(1))\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi_{\mathbf x, \boldsymbol\theta}^{1/m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \\ &\;\;\sim \frac{1}{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}} \mathbb{E}(\varepsilon^2_{\mathbf{i}}) \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) \int_{[0, h]^m} \prod\limits_{k = 1}^m K_{1}^2( \mathbf{z}) d \mathbb{P}(z_1, \ldots, z_m).\end{aligned} (8.29)

    It remains to establish the weak convergence of \widehat{g}_1 . To this end, we undertake the following steps:

    (1) Truncation of the function \widehat{g}_1 is performed, given the unbounded nature of the function class.

    (2) The convergence of the remainder term resulting from truncation to zero is established.

    (3) Hoeffding's decomposition is applied to the truncated part.

    (4) The convergence to zero of the non-linear term in this decomposition is validated.

    (5) The weak convergence to the linear term is established by demonstrating finite-dimensional convergence and asymptotic equicontinuity.

    These steps closely parallel the proof strategy employed in Theorem 4.2. Consequently, the proof is concluded.

    Proof of Theorem 4.2. Keep in mind that

    \begin{equation} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}. \end{equation} (8.30)

    Let's define

    \begin{eqnarray*} \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y})&: = &\frac{ \prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})} { \mathbb{E}\prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}} \; \; \mbox{for}\; \; \textbf{x}\in \mathcal{H}^m, \mathbf{y}\in \mathcal{Y}^m, \\ \\ \mathcal{G} &: = & \left\{ \mathfrak G_{\varphi, \mathbf{i}} (\cdot , \cdot) : \varphi \in \mathscr{F}_m, \; \mathbf{i} = (i_1, \ldots , i_m) \right\}, \\ \mathcal{G}^{(k)}& : = &\left\{\pi_{k, m} \mathfrak G_{\varphi, \mathbf{i}} (\cdot , \cdot) : \varphi \in \mathscr{F}_m \right\}, \\ \mathfrak{U}_n(\varphi, \mathbf{i})& = & \mathfrak{U}_n^{(m)} (\mathfrak G_{\varphi, \mathbf{i}}) : = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m \xi_{i_k}{\mathfrak G_{\varphi, \mathbf{i}}(\textbf{X}_{\mathbf{i}}, \textbf{Y}_{\mathbf{i}})}, \end{eqnarray*}

    and the U -empirical process is defined to be

    \mu_n(\varphi, \mathbf{i}): = \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n(\varphi, \mathbf{i})- \mathbb{E} (\mathfrak{U}_n(\varphi, \mathbf{i}))\right\}.

    Subsequently, we have

    \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \mathfrak{U}_n(\varphi, \mathbf{i})}{ \mathfrak{U}_n(1, \mathbf{i})}.

    To ensure the weak convergence of our estimator, it is essential to first establish it for \mu_n(\varphi, \mathbf{i}) . As mentioned earlier, dealing with unbounded function classes necessitates truncating the function \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) . Specifically, for \lambda_n = n^{1/\zeta} , where \zeta > 2 , we have:

    \begin{eqnarray} \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y})& = & \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) \mathbb{1}_{\left\{F(\textbf{y})\leq \lambda_n \right\}}+ \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) \mathbb{1}_{\left\{F(\textbf{y}) > \lambda_n \right\}} \\ &: = & \mathfrak G_{\varphi, \mathbf{i}}^{(T)}(\textbf{x}, \textbf{y}) + \mathfrak G_{\varphi, \mathbf{i}}^{(R)}(\textbf{x}, \textbf{y}) . \end{eqnarray}

    We can write the U -statistic as follows:

    \begin{eqnarray} \mu_n(\varphi, \mathbf{i}) & = & \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(T)}\right)- \mathbb{E} \left(\mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(T)}\right)\right)\right\}\\&& + \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\right)- \mathbb{E} \left(\mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\right)\right)\right\} \\ &: = & \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(T)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(T)}({\varphi, \mathbf{i}})\right)\right\} \\&&+ \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\} \\ &: = &\mu_n^{(T)}(\varphi, \mathbf{i}) + \mu_n^{(R)}(\varphi, \mathbf{i}) . \end{eqnarray} (8.31)

    The first term is the truncated part and the second is the remaining one. We have to prove that:

    (1) \mu_n^{(T)}(\varphi, \mathbf{i}) converges to a Gaussian process.

    (2) The remainder part is negligible, in the sense that

    \left\Vert\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\}\right\Vert_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\overset{\mathbb{P}}{\longrightarrow} 0.
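    As an aside, a minimal sketch of this envelope truncation (with illustrative stand-ins for the kernel values and the envelope F ; the decomposition is exact by design) reads:

```python
import numpy as np

# Minimal sketch (illustrative only) of the envelope truncation used above:
# kernel evaluations are split according to whether the envelope F exceeds
# lambda_n = n**(1/zeta) with zeta > 2.

def truncate(G_vals, F_vals, n, zeta=3.0):
    """Split kernel evaluations into truncated part G^(T) and remainder G^(R)."""
    lam = n ** (1.0 / zeta)
    keep = F_vals <= lam
    return G_vals * keep, G_vals * (~keep)

rng = np.random.default_rng(1)
G = rng.standard_t(df=4, size=1000)      # heavy-tailed stand-in for G(x, y)
F = np.abs(G)                            # a valid envelope: |G| <= F
GT, GR = truncate(G, F, n=1000)
assert np.allclose(GT + GR, G)           # G = G^(T) + G^(R) exactly
print(GR.nonzero()[0].size, "of", G.size, "evaluations fall in the remainder")
```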

    For the initial step, we will utilize the Hoeffding decomposition, akin to the one presented in the previous Subsection 3.1, with the sole modification of replacing {\mathfrak W}_{\mathbf{i}, n} with \varphi(Y_{\mathbf{i}, n}) :

    \begin{equation} \mathfrak{U}_n^{(T)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(T)}({\varphi, \mathbf{i}})\right) : = \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) \nonumber+\mathfrak{U}_{2, n}(\varphi, \mathbf{i}), \end{equation}

    where

    \begin{eqnarray} \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) &: = & \frac{1}{n} \sum\limits_{i = 1}^n \widehat{H}_{1, i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi), \end{eqnarray} (8.32)
    \begin{eqnarray} \mathfrak{U}_{2, n}(\varphi, \mathbf{i}) &: = & \frac{(n-m)!}{(n)!} \sum\limits_{\mathbf{i}\in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H_{2, \mathbf{i}}(\boldsymbol{z}). \end{eqnarray} (8.33)

    The convergence of \mathfrak{U}_{2, n}(\varphi, \mathbf{i}) to zero in probability has been established by Lemma 8.2. Therefore, it suffices to demonstrate the weak convergence of \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) to a Gaussian process denoted as \mathbb{G}(\varphi) . To accomplish this, we will proceed with finite-dimensional convergence and equicontinuity. Finite-dimensional convergence asserts that, for every finite set of functions f_1, \ldots, f_q in L_2 (where \tilde{\mathfrak{U}} represents the centered form of {\mathfrak{U}} ), the vector

    \begin{equation} \left(\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} \tilde{\mathfrak{U}}_{1, n}(f_1, \mathbf{i}), \ldots, \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} \tilde{\mathfrak{U}}_{1, n}(f_q, \mathbf{i})\right) \end{equation} (8.34)

    converges to the corresponding finite-dimensional distributions of the process \mathbb{G}(\varphi) . We only need to demonstrate that, for every fixed collection (a_1, \ldots, a_q) \in \mathbb{R}^q , we have

    \begin{equation*} \sum\limits_{j = 1}^{q}a_j \tilde{\mathfrak{U}}_{1, n}(f_j, \mathbf{i}) \rightarrow N \Big(0, \sigma ^2\Big), \end{equation*}

    where

    \begin{equation} \sigma ^2 = \sum\limits_{j = 1}^{q}a_j^2 {\rm Var}\Big( \tilde{\mathfrak{U}}_{1, n}(f_j, \mathbf{i})\Big) + \sum\limits_{s \neq r} a_s a_r {\rm Cov}\Big(\tilde{\mathfrak{U}}_{1, n}(f_s, \mathbf{i}), \tilde{\mathfrak{U}}_{1, n}(f_r, \mathbf{i}) \Big). \end{equation} (8.35)

    Take

    \Psi(\cdot) = \sum\limits_{j = 1}^{q}a_j f_{j}(\cdot).

    By linearity, it suffices to show that \tilde{\mathfrak{U}}_{1, n}(\Psi, \mathbf{i}) \rightarrow \mathbb{G}(\Psi) . Let us denote

    N = \mathbb{E}\prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}.

    We have

    \begin{eqnarray*} \tilde{\mathfrak{U}}_{1, n}(h_n, \mathbf{i})& = & N^{-1} \times \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}}\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \nonumber \\ && \times \int h(y_{ {1}}, \ldots, y_{\ell-1} , Y_i, y_\ell, \ldots, y_{m-1}) \prod\limits_{\underset{k \neq i}{k = 1}}^{m-1} \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \nonumber \\ &&\qquad \qquad \mathbb{P}(d(\nu_1, y_1), \ldots, d(\nu_{\ell-1}, y_{\ell-1}), d(\nu_{\ell}, y_\ell), \ldots, d(\nu_{m-1}, y_{m-1})), \nonumber\\ & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i). \end{eqnarray*}

    Now, we shall employ the blocking procedure for this empirical process. We intend to partition the set \{1, \ldots, n\} into 2\nu_n+1 subsets, each containing small and large blocks. In alignment with the notation used in Lemma 8.2, we denote the size of large blocks as a_n and the size of small blocks as b_n , satisfying:

    \begin{equation} \nu_n : = \left\lfloor \frac{n}{a_n+b_n} \right\rfloor, \qquad \frac{b_n}{a_n}\rightarrow 0, \qquad \frac{a_n}{n}\rightarrow 0, \qquad \frac{n}{a_n} \beta(b_n)\rightarrow 0. \end{equation} (8.36)

    In this case, we can see that:

    \begin{eqnarray} \tilde{\mathfrak{U}}_{1, n}(h, \mathbf{j}) & = & \sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(1)}_{j, n} + \sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(2)}_{j, n}+ \widehat{ \mathfrak{U}}^{(3)}_{j, n} \\ &: = &\tilde{\mathfrak{U}}^{(1)}_{1, n}+\tilde{\mathfrak{U}}^{(2)}_{1, n}+\tilde{\mathfrak{U}}^{(3)}_{1, n}, \end{eqnarray} (8.37)

    where

    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(1)}_{j, n} & = & N^{-1}\sum\limits_{i = j(a_n+b_n)+1}^{j(a_n+b_n)+a_n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), \end{eqnarray} (8.38)
    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(2)}_{j, n}& = &N^{-1}\sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), \end{eqnarray} (8.39)
    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(3)}_{j, n}& = & N^{-1}\sum\limits_{i = \nu_n(a_n+b_n)+1}^{n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i). \end{eqnarray} (8.40)
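    A minimal sketch of this partition, with the purely illustrative choices a_n = \lfloor n^{2/3}\rfloor and b_n = \lfloor n^{1/3}\rfloor (which satisfy (8.36) whenever \beta(k) decays geometrically), reads:

```python
def blocks(n):
    """Big-block / small-block partition of {1, ..., n} as in (8.38)-(8.40)."""
    a_n = int(n ** (2 / 3))                 # large-block size (illustrative)
    b_n = int(n ** (1 / 3))                 # small-block size (illustrative)
    nu_n = n // (a_n + b_n)
    big, small = [], []
    for j in range(nu_n):                   # here j runs over 0, ..., nu_n - 1
        start = j * (a_n + b_n)
        big.append(range(start + 1, start + a_n + 1))                # U^(1) terms
        small.append(range(start + a_n + 1, start + a_n + b_n + 1))  # U^(2) terms
    rest = range(nu_n * (a_n + b_n) + 1, n + 1)                      # U^(3) terms
    return big, small, rest

big, small, rest = blocks(10_000)
print(len(big), "large blocks,", len(small), "small blocks,", len(rest), "residual indices")
```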

    First, we aim to prove that

    \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2\rightarrow 0 \; \mbox{ and }\; \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(3)}_{1, n})^2\rightarrow 0

    to show that the summation over the small blocks and the summation over the last (residual) block are asymptotically negligible. Hence, we infer

    \begin{eqnarray} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2 = Var\left(\sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) = \sum\limits_{j = 1}^{\nu_n-1} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) + \underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) . \end{eqnarray}

    We have:

    \begin{eqnarray} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) & = & Var\left(N^{-1}\sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ & = & N^{-2}\frac{1}{ \phi^2(h_n)} \sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi^2_i Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\lesssim& \frac{b_n}{ \phi^2(h_n) \left(m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{\omega}{h_n}\right)d\omega\right] \\ &&\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right).\qquad \mbox{(Using Lemma 8.1 ⅱ))} \end{eqnarray}

    Thus, we have

    \begin{eqnarray} \sum\limits_{j = 1}^{\nu_n-1} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) &\lesssim& \nu_n b_n \frac{1}{\phi^{2(m+1)}(h_n)} \left[\int_{[0, h]} K_1^2\left(\frac{\omega}{h_n}\right)d\omega\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\sim& \frac{n}{a_n+b_n} b_n \sim \frac{nb_n}{a_n} = o_ \mathbb{P}(n), \qquad \qquad \qquad\text{(by (8.36)) }. \end{eqnarray} (8.41)

    and

    \begin{eqnarray} {\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right)} & = &\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \sum\limits_{i^\prime = k(a_n+b_n)+a_n+1}^{(k+1)(a_n+b_n)}\frac{N^{-2}}{ \phi^2(h_n)} \xi_i \xi_{i^\prime} \\ &Cov&\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), K_2 \left(\frac{d_{\theta_{i^\prime}}(x_{i^\prime}, X_{i^\prime})}{h_n}\right) \tilde{\mathbb H}(Y_{i^\prime})\right) \\ & = &\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}}\sum\limits_{l_1 = 1}^{b_n}\sum\limits_{l_2 = 1}^{b_n} \frac{1}{ N^{2}\phi^2(h_n)} \xi_{\lambda_i+l_1} \xi_{\lambda_{i^\prime}+l_2} \\ &Cov&\left(K_2 \left(\frac{d_{\theta_{{\lambda_{i}+l_1}}}(x_{\lambda_i+l_1}, X_{\lambda_i+l_1})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_1}), K_2 \left(\frac{d_{\theta_{{\lambda_{i^\prime}+l_2}}}(x_{{\lambda_{i^\prime}+l_2}}, X_{\lambda_{i^\prime}+l_2})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_{i^\prime}+l_2})\right), \end{eqnarray}

    where \lambda_i = j(a_n+b_n)+a_n and \lambda_{i^\prime} = k(a_n+b_n)+a_n . For j \neq k , \vert \lambda_i- \lambda_{i^\prime} +l_1-l_2\vert \geq b_n , and then

    \begin{eqnarray} {\left\vert\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) \right\vert } \leq \underset{\vert j-k \vert\geq b_n }{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \frac{1}{ N^{2}\phi^2(h_n)} \xi_j \xi_{k} \left\vert Cov\left(K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j), K_2 \left(\frac{d_{\theta_k}(x_{k}, X_{k})}{h_n}\right) \tilde{\mathbb H}(Y_{k})\right)\right\vert , \end{eqnarray}

    here, the use of Davydov's lemma (Lemma A.7) is necessary.
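    In the form needed below (a classical covariance inequality for mixing sequences, stated here as it is applied, with both moments of the same order p > 2 ; Lemma A.7 may be phrased slightly differently):

    \left\vert Cov(\xi, \eta)\right\vert \leq 8 \left( \mathbb{E}\vert \xi \vert^p\right)^{1/p} \left( \mathbb{E}\vert \eta \vert^p\right)^{1/p} \beta(k)^{1-2/p},

    for variables \xi and \eta measurable with respect to \sigma -fields separated by a lag k . Applying this to the two kernel factors, we have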

    \begin{aligned}& \left\vert Cov\left(K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j), K_2 \left(\frac{d_{\theta_k}(x_{k}, X_{k})}{h_n}\right) \tilde{\mathbb H}(Y_{k})\right)\right\vert \\ &\;\;\;\;\leq 8 \left( \mathbb{E}\left\vert K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right)\right\vert^p\right)^{1/p} \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p} \\ &\;\;\;\;\lesssim \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p}, \end{aligned}

    it follows that

    \begin{eqnarray} {\left\vert\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) \right\vert } &\lesssim & \underset{\vert j-k \vert\geq b_n }{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \frac{1}{ N^{2}\phi^2(h_n)} \xi_j \xi_{k}\phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p} \\ &\lesssim& \frac{1}{b_n^{\varrho} N^{2}\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} \sum\limits_{l = b_n+1}^{\infty} l^{\varrho}\beta(l)^{1-2/p} \\ &\lesssim& \frac{1}{b_n^{\varrho} N^{2}\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} n \varrho = o_ \mathbb{P}(n), \end{eqnarray} (8.42)

    where the last inequality also follows from (8.36) and the size of b_n . Then, (8.41) and (8.42) show that

    \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2\rightarrow 0.

    Following the same steps, we find that

    \begin{aligned}& Var\left(\tilde{ \mathfrak{U}}^{(3)}_{1, n}\right) = Var \left( N^{-1}\sum\limits_{i = \nu_n(a_n+b_n)+1}^{n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ & \;\;\;= N^{-2}\sum\limits_{i = \nu_n(a_n+b_n)+1}^{n} \xi_i^2 \frac{1}{ \phi^2(h_n)} Var\left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{ N^{2}\phi^2(h_n)} \underset{\vert i-j \vert > 0}{\sum\limits_{i = \nu_n(a_n+b_n)+1}^{n} \sum\limits_{j = \nu_n(a_n+b_n)+1}^{n} } \xi_i \xi_{j} Cov \left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j)\right) \\ & \;\;\;= N^{-2}\sum\limits_{i = \nu_n(a_n+b_n)+1}^{n} \xi_i^2 \frac{1}{ \phi^2(h_n)} Var\left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\&\;\;\;\;\;\;\;\;+\frac{1}{ N^{2}\phi^2(h_n)} \underset{l_1 \neq l_2}{\sum\limits_{l_1 = 1}^{n- \nu_n(a_n+b_n)} \sum\limits_{l_2 = 1}^{n-\nu_n(a_n+b_n)} } \xi_{\lambda_i+l_1} \xi_{\lambda_i+l_2} \\ & \qquad \;\;\;\;\;\;\;\;\; Cov \left( K_2 \left(\frac{d_{\theta_{\lambda_i+l_1}}(x_{\lambda_i+l_1}, X_{\lambda_i+l_1})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_1}), K_2 \left(\frac{d_{\theta_{\lambda_i+l_2}}(x_{\lambda_i+l_2}, X_{\lambda_i+l_2})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_2})\right) \\ & \qquad \qquad\qquad\qquad\qquad\qquad\qquad \qquad\qquad\qquad\qquad\qquad\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\left(\text { for } \lambda_i : = \nu_n\left(a_n+b_n\right)\right) \\ &\;\;\;\lesssim \frac{n-\nu_n(a_n+b_n)}{ \phi^2(h_n) \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{(n-\nu_n(a_n+b_n))^{\varrho} \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} \\&\;\;\;\;\;\;\;\;\times \sum\limits_{l = (n-\nu_n(a_n+b_n))+1}^{\infty} l^{\varrho}\beta(l)^{1-2/p} \\ & \qquad\qquad\qquad\qquad \qquad\qquad\qquad\qquad\qquad\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\qquad\mbox{(Using Lemma 8.1 ⅱ) and Lemma A.7) } \\ &\;\;\;\lesssim \frac{n-\nu_n(a_n+b_n)}{ \phi^2(h_n) \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{(n-\nu_n(a_n+b_n))^{\varrho} \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} n \varrho. \end{aligned} (8.43)

    By (8.36), we find that

    \frac{1}{n}Var\left(\tilde{ \mathfrak{U}}^{(3)}_{1, n}\right) \rightarrow 0.

    Now, we need to establish that the summands in \tilde{ \mathfrak{U}}^{(1)}_{1, n} are asymptotically independent, allowing us to apply the Lindeberg-Feller conditions for asymptotic normality. We can utilize Lemma A.8, where \widehat{ \mathfrak{U}}^{(1)}_{a, n} is \mathcal{F}_{i_a}^{j_a} -measurable with i_a = a(a_n +b_n)+1 and j_a = a(a_n+b_n)+a_n , giving us

    \begin{eqnarray} \left\vert \mathbb{E} \left( \exp \left( it n^{-1/2} \tilde{ \mathfrak{U}}^{(1)}_{1, n}\right)\right) - \prod\limits_{i = 0}^{\nu_n-1} \mathbb{E} \left( \exp \left( it n^{-1/2} \widehat{ \mathfrak{U}}^{(1)}_{i, n}\right)\right)\right\vert \leq 16 \nu_n \beta(b_n+1) , \end{eqnarray} (8.44)

    which tends to zero by (8.36); hence, asymptotic independence is achieved. We can also see that

    \begin{aligned} & \frac{1}{n} Var\left(\tilde{\mathfrak{U}}^{(1)}_{1, n}\right) \lesssim \frac{\nu_n a_n}{n \phi^2(h_n) N^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\rightarrow \frac{1}{ \phi^2(h_n) N^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \quad \text{(since } \nu_n a_n / n \rightarrow 1 \text{)} \\&\;\;\;\;: = \mathbb{V}(X, Y). \end{aligned} (8.45)

    Up to this point, we have addressed the final condition for finite-dimensional convergence. It is important to observe that, for sufficiently large n , the set

    \{\vert\widehat{ \mathfrak{U}}^{(1)}_{i, n}\vert > \epsilon \mathbb{V}(X, Y) \sqrt{n}\}

    is empty, thus

    \begin{equation} \frac{1}{n}\sum\limits_{i = 0}^{\nu_n-1} \mathbb{E} \left( \widehat{ \mathfrak{U}}^{(1)2}_{i, n} \mathbb{1}_{\{ \vert\widehat{ \mathfrak{U}}^{(1)}_{i, n}\vert > \epsilon \mathbb{V}(X, Y) \sqrt{n} \}}\right) \rightarrow 0. \end{equation} (8.46)

    Therefore, we conclude the proof of finite-dimensional convergence. Now, we shift our focus to asymptotic equicontinuity, aiming to demonstrate that:

    \begin{eqnarray} \lim\limits_{\delta\rightarrow 0} \lim\limits_{n \rightarrow \infty} \mathbb{P} \left\{ {\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}} \left\| \tilde{\mathfrak{U}}_{1, n}(h_n, \mathbf{i}) \right\|_{\mathcal{FK}(\delta, \|.\|_\zeta)} > \epsilon \right\} = 0, \end{eqnarray} (8.47)

    where

    \begin{eqnarray} \mathcal{FK}(\delta, \|.\|_\zeta)&: = & \left\{\tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})- \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) :\right. \\ && \left.\left\|\tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})- \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) \right\| < \delta, \tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i}), \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) \in \mathcal{FK} \right\}, \end{eqnarray} (8.48)

    for

    \begin{eqnarray} \tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i}) & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_{2, 1} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}_1(Y_i) - \mathbb{E}\left({\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})\right), \\ \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_{2, 2} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}_2(Y_i) - \mathbb{E}\left({\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i})\right). \end{eqnarray} (8.49)

    Now, we will employ the chaining technique from [13] and follow the approach outlined in [39] for the conditional setting. The fundamental idea is to decompose a sequence ({ X}_1, \ldots, { X}_n) into 2\upsilon_n equal-sized blocks, each of length a_n , and a residual block of length n-2\upsilon_na_n , where, for 1\leqslant j\leqslant \upsilon_n :

    \begin{eqnarray*} H_j& = & \{ i : 2(j-1)a_n+1 \leqslant i \leqslant (2j-1)a_n \}, \\ T_j& = & \{ i : (2j-1)a_n+1 \leqslant i \leqslant 2ja_n \}, \\ R& = & \{ i : 2\upsilon_na_n+1 \leqslant i \leqslant n \}. \end{eqnarray*}
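    For concreteness, the following minimal sketch (in Python; the values n = 100 and a_n = 12 are purely illustrative assumptions of the sketch) builds exactly these index sets:

```python
import numpy as np

def equal_blocks(n: int, a_n: int):
    """Index sets H_j, T_j (length a_n each) and the residual block R:
    H_j = {2(j-1)a_n+1, ..., (2j-1)a_n},
    T_j = {(2j-1)a_n+1, ..., 2j a_n},  j = 1, ..., v_n,
    R   = {2 v_n a_n + 1, ..., n}."""
    v_n = n // (2 * a_n)  # number of (H_j, T_j) pairs
    H = [np.arange(2*(j-1)*a_n + 1, (2*j - 1)*a_n + 1) for j in range(1, v_n + 1)]
    T = [np.arange((2*j - 1)*a_n + 1, 2*j*a_n + 1) for j in range(1, v_n + 1)]
    R = np.arange(2*v_n*a_n + 1, n + 1)
    return H, T, R

H, T, R = equal_blocks(n=100, a_n=12)
print(len(H), H[0], T[0], R)  # 4 pairs; H_1 = 1..12, T_1 = 13..24, R = 97..100
```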

    The values of \upsilon_n and a_n are specified below. A further essential ingredient of this proof is a sequence of independent blocks ({\zeta}_1, \ldots, {\zeta}_n) such that:

    \mathcal{L}(\zeta_1, \ldots, {\zeta}_n) = \mathcal{L}({ X}_1, \ldots, { X}_{a_n})\times \mathcal{L}({ X}_{a_n+1}, \ldots, { X}_{2a_n}) \times \cdots.

    Along the same lines as [39], we apply the results of [73] on \beta -mixing and obtain, for any measurable set A :

    \begin{aligned} &\left\vert\mathbb{P}\left\{ \left(\zeta_{1}, \ldots, \zeta_{a_n}, {\zeta}_{2a_n+1}, \ldots, {\zeta}_{3a_n}, \ldots, {\zeta}_{2(\upsilon_n-1)a_n+1}, \ldots, {\zeta}_{2\upsilon_na_n} \right) \in A \right\}\right. \\ & \;\;\;\;\;\;\;\left.- \mathbb{P}\left\{\left({ X}_1, \ldots, { X}_{a_n}, { X}_{2a_n+1}, \ldots, { X}_{3a_n}, \ldots, { X}_{2(\upsilon_n-1)a_n+1}, \ldots, { X}_{2\upsilon_na_n}\right) \in A\right\} \right\vert\\ &\;\;\; \leqslant 2(\upsilon_n-1)\beta_{a_n}. \end{aligned} (8.50)

    We will focus solely on the independent blocks represented as \zeta_i = (\eta_i, {\varsigma}_i) sequences, rather than dealing with the dependent variables. We will utilize a strategy akin to the one employed in Lemma 8.2 to transition from the sequence of locally stationary random variables to the stationary one:

    \begin{aligned} &\mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{i = 1}^n\left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right)\tilde{h}(Y_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(\delta, \Vert\cdot\Vert_\zeta)}} > \epsilon\right\} \\ \;\;\;&\leq2\mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j} \right.\right.\left. \left. \left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(\delta, \Vert\cdot\Vert_\zeta)}} > \epsilon^\prime \right\} \\ &\;\;\;\;\;\;+ 2(\nu_n-1)\beta_{a_n} + o_ \mathbb{P}(1). \end{aligned} (8.51)

    We choose

    a_n = [(\log{n})^{-1}(n^{p-2}\phi^p(h_n))^{1/2(p-1)}]

    and

    \upsilon_n = \left[\frac{n}{2a_n}\right]-1.
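    As an illustration only, the following sketch computes these choices for a few sample sizes; the moment order p = 6 , the small-ball rate \phi(h_n) = n^{-1/10} , and the mixing decay \beta(k) = e^{-3k} are assumptions of the sketch, not values prescribed by the paper:

```python
import numpy as np

p = 6.0  # assumed moment order (> 2)

def block_length(n: int, phi_hn: float) -> int:
    """a_n = [(log n)^{-1} (n^{p-2} phi^p(h_n))^{1/(2(p-1))}]."""
    return max(1, int((n ** (p - 2) * phi_hn ** p) ** (1 / (2 * (p - 1))) / np.log(n)))

for n in [10**3, 10**4, 10**5, 10**6]:
    phi_hn = n ** -0.1            # assumed small-ball rate phi(h_n)
    a_n = block_length(n, phi_hn)
    v_n = n // (2 * a_n) - 1      # upsilon_n = [n / (2 a_n)] - 1
    beta_an = np.exp(-3 * a_n)    # assumed geometric beta-mixing decay
    print(n, a_n, v_n, (v_n - 1) * beta_an)  # last column decreases toward 0
```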

    Exploiting condition (v) from Assumption 4, we deduce that (\upsilon_n-1)\beta_{a_n}\longrightarrow 0 as n\rightarrow \infty . It therefore remains to treat the first term on the right-hand side of (8.51). Given the independence of the blocks, we symmetrize the sequence using a set \{\epsilon_j\}_{j\in \mathbb{N}^*} of i.i.d. Rademacher variables, where

    \mathbb{P}(\epsilon_j = 1) = \mathbb{P}(\epsilon_j = -1) = 1/2.
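    A toy illustration of this symmetrization step follows (in Python; the Gaussian block sums are stand-ins for the actual centered block sums of the proof, an assumption of the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def symmetrized_block_sum(contributions):
    """One Rademacher sign eps_j per block multiplies that block's centered
    contribution; the eps_j are drawn independently of the data."""
    eps = rng.choice([-1.0, 1.0], size=len(contributions))  # P(eps = +-1) = 1/2
    return np.sum(eps * np.asarray(contributions))

blocks = [rng.normal(size=12).sum() for _ in range(50)]  # stand-in block sums
print(symmetrized_block_sum(blocks))
```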

    It is noteworthy that the sequence \{\epsilon_j\}_{j\in \mathbb{N}^*} is independent of the sequence of blocks \left\{{\zeta}_i = (\eta_i, {\varsigma}_i)\right\}_{i \in \mathbb{N}^*} . Now, our objective is to establish, for all \epsilon > 0 ,

    \begin{eqnarray} {\lim\limits_{\delta\rightarrow 0} \lim\limits_{n \rightarrow \infty} \mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j} \right.\right.}\left. \left. \left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(\delta, \Vert\cdot\Vert_\zeta)}} > \epsilon \right\} = 0. \end{eqnarray} (8.52)

    Define the semi-norm:

    \begin{eqnarray} \widetilde{d}_{n\phi, 2}&: = & \left((n\phi\left(h_n\right))^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j}\left\vert\left( \xi_i K_{2, 1} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}_1(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}^\prime}_{1, n}(h_n, \mathbf{i})\right)\right) \right.\right. \\ && \left.\left.-\left( \xi_i K_{2, 2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}_2(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}^{\prime\prime}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\vert^2\right)^{1/2}, \end{eqnarray} (8.53)

    and the covering number, defined for any class of functions \mathscr{E} by:

    \widetilde{N}_{n\phi, 2}(u, \mathscr{E}) : = N(u, \mathscr{E}, \widetilde{d}_{n\phi, 2}).
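    A crude way to visualize such covering numbers is the greedy net construction below (a sketch only: the Euclidean distance on a finite set of function evaluations stands in for the semi-norm \widetilde{d}_{n\phi, 2} , and the random matrix is illustrative):

```python
import numpy as np

def covering_number(points: np.ndarray, u: float) -> int:
    """Greedy upper bound on N(u, E, d): the number of u-balls (Euclidean
    distance, as a stand-in for the semi-norm) needed to cover `points`."""
    remaining = list(range(len(points)))
    centers = 0
    while remaining:
        c = remaining[0]
        remaining = [i for i in remaining
                     if np.linalg.norm(points[i] - points[c]) > u]
        centers += 1
    return centers

rng = np.random.default_rng(1)
evals = rng.normal(size=(200, 10))  # 200 "functions" evaluated at 10 points
print([covering_number(evals, u) for u in (1.0, 2.0, 4.0)])  # decreasing in u
```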

    Building upon the preceding discussion, we can bound (8.52); further elaboration is available in [39]. Similarly, by following the methodology in [39] and referring back to [13], the independence between the blocks, coupled with Assumption 6 ii) and [89, Lemma 5.2], ensures equicontinuity, thus laying the foundation for weak convergence. Our remaining task is to establish that:

    \mathbb{P}\left\{\left\Vert\mu_n^{(R)}(\varphi, \mathbf{t})\right\Vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > \lambda\right\} \to 0 \; \; \mbox{as}\; \; n\to \infty.

    In this proof, for clarity, we present the case where m = 2 , with different sizes for a_n and b_n , where b_n denotes the size of the alternate blocks. Both a_n and b_n satisfy:

    \begin{equation} b_{n} \ll a_{n}, \quad\left(v_{n}-1\right)\left(a_{n}+b_{n}\right) < n \leqslant v_{n}\left(a_{n}+b_{n}\right), \end{equation} (8.54)

    and set, for 1 \leqslant j \leqslant v_{n}-1:

    \begin{align*} \mathrm{H}_{j}^{(\mathrm{U})} & = \left\{i:(j-1)\left(a_{n}+b_{n}\right)+1 \leqslant i \leqslant(j-1)\left(a_{n}+b_{n}\right)+a_{n}\right\}, \\ \mathrm{T}_{j}^{(\mathrm{U})} & = \left\{i:(j-1)\left(a_{n}+b_{n}\right)+a_{n}+1 \leqslant i \leqslant(j-1)\left(a_{n}+b_{n}\right)+a_{n}+b_{n}\right\}, \\ \mathrm{H}_{v_{n}}^{(\mathrm{U})} & = \left\{i:\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+1 \leqslant i \leqslant n \wedge\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+a_{n}\right\} , \\ \mathrm{T}_{v_{n}}^{(\mathrm{U})} & = \left\{i:\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+a_{n}+1 \leqslant i \leqslant n\right\}. \end{align*}
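    A short sketch of this unequal-block scheme may help fix the indexing, including the truncated final pair (in Python; n = 103 , a_n = 20 , b_n = 5 are illustrative values of the sketch):

```python
import numpy as np

def bernstein_blocks(n: int, a_n: int, b_n: int):
    """Index sets H_j^(U) (length a_n) and T_j^(U) (length b_n), b_n << a_n;
    the last pair is truncated at n as in the displayed definitions."""
    v_n = -(-n // (a_n + b_n))  # smallest v_n with n <= v_n (a_n + b_n)
    H, T = [], []
    for j in range(1, v_n + 1):
        s = (j - 1) * (a_n + b_n)
        H.append(np.arange(s + 1, min(n, s + a_n) + 1))
        T.append(np.arange(s + a_n + 1, min(n, s + a_n + b_n) + 1))
    return H, T

H, T = bernstein_blocks(n=103, a_n=20, b_n=5)
print(len(H), H[0], T[0], H[-1], T[-1])  # last T is empty here: 103 < 121
```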

    We have:

    \begin{eqnarray*} \nonumber \mu_n^{(R)}(\varphi, \mathbf{i}) & = & \sqrt{n\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\} \nonumber\\ & = &\frac{\sqrt{n\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}}{n(n-1)}\sum\limits_{i_1 \neq i_2}^{n} \xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &\leqslant&\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & &+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1\neq i_2;\, i_1, i_2 \in \mathrm{H}_p^{(\mathrm{U})} }\phi(h_n)\xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & &+2 \frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &&+2 \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &&+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{T}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &&+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2;\, i_1, i_2 \in \mathrm{T}_{p}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & = : & \rm I^{\prime} + \rm II^{\prime}+ \rm III^{\prime} + \rm IV^{\prime} + \rm V^{\prime} + \rm VI^{\prime}. \end{eqnarray*}

    We will employ blocking arguments and address the resulting terms, beginning with the first term \rm I^{\prime} . We have:

    \begin{array}{r} \mathbb{P}\left\{\left\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p \neq q}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(X_{i_1}, X_{i_2}\right), \left(Y_{i_1}, Y_{i_2}\right)\right)\right.\right.\right. \\ \left.\left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(X_{i_1}, X_{i_2}\right), \left(Y_{i_1}, Y_{i_2}\right)\right)\right]\right\} \right\|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\} \\ \;\;\;\leqslant \mathbb{P}\left\{\left\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p \neq q}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right.\right.\right. \\ \;\;\;\left.\left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right]\right\} \right\|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\}+2 v_n \beta\left(b_n\right) . \end{array}

    Notice that \upsilon_n \beta_{b_n}\to 0 , and recall that, for all \varphi \in \mathscr{F}_m and all

    \boldsymbol{\theta} \in \Theta^2, \mathbf{x} \in \mathcal{H}^2, \mathbf{y}\in\mathcal{Y}^2: \mathbb{1}_{\left\{d_{\boldsymbol{\theta}}\left(\mathbf{x}, X_{i, n}\right)\leqslant h\right\}} F(\mathbf{y}) \geqslant {\varphi}(\mathbf{y})K_2\left(\frac{d_{\boldsymbol{\theta}}\left(\mathbf{x}, X_{i, n}\right)}{h_n}\right).

    Hence, by the symmetry of F(\cdot) :

    \begin{eqnarray} &&\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right) - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &&\lesssim \left\vert\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{ F({\zeta}_{i_1}, {\zeta}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n\right\}} - \mathbb{E}\left[ F({\zeta}_{i_1}, {\zeta}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n\right\}}\right]\right\}\right\vert. \end{eqnarray} (8.55)

    We apply Chebyshev's inequality, Hoeffding's trick, and Hoeffding's inequality, respectively, to obtain:

    \begin{eqnarray} &&{\mathbb{P}\left\{\left\vert\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{ F({\zeta}_{i_1}, {\zeta}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n \right\}} - \mathbb{E}\left[ F({\zeta}_{i_1}, {\zeta}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right]\right\}\right\vert > \delta\right\}}\\ &&\lesssim\delta^{-2}n^{-1}\phi^{-1}(h_n) Var\left(\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} F({\zeta}_{i_1}, {\zeta}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right)\\ &&\lesssim c_2 \upsilon_n \delta^{-2}n^{-1}\phi^{-1}(h_n) Var\left(\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} F({\zeta}_{i_1}, {\zeta}^{\prime}_{i_2}) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right)\\ &&\lesssim 2c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \mathbb{E}\left[\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right]. \end{eqnarray} (8.56)

    Under Assumption 6 iii), we have, for each \lambda > 0 :

    \begin{aligned} & c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \mathbb{E}\left[\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right] \nonumber\\ & \;\;\; = c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_0^{\infty} {\mathbb{P}\left\{\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}} \geqslant t \right\} dt} \nonumber\\ & \;\;\; = c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_0^{\lambda_n} {\mathbb{P}\left\{ F > \lambda_n\right\} dt } \nonumber\\ &\;\;\;\;\;\;\;\; +c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n)\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_{\lambda_n}^{\infty} {\mathbb{P}\left\{\left( F\right)^2 > t\right\}dt}, \end{aligned}

    which tends to 0 as n\to\infty . Terms \rm II^{\prime} , \rm V^{\prime} , and \rm VI^{\prime} can be handled similarly to the previous term. However, \rm II^{\prime} and \rm VI^{\prime} deviate slightly from this pattern because the variables \{{\zeta}_{i_1}, {\zeta}_{i_2}\}_{i_1, i_2 \in H_p^{(U)} } (or \{{\zeta}_{i_1}, {\zeta}_{i_2}\}_{i_1, i_2 \in T_p^{(U)} } for \rm VI^{\prime} ) belong to the same blocks. Regarding term \rm IV^{\prime} , its analysis can be inferred from the study of terms \rm I^{\prime} and \rm III^{\prime} . Considering term \rm III^{\prime} , we have:

    \begin{eqnarray} &\mathbb{P}&\left\{\left\Vert\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right.\right.\right.\\ &&\qquad \left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\} \\ && \leqslant \mathbb{P}\left\{\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right.\right.\right.\\ &&\qquad\left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\} + \frac{\upsilon_n^2 a_n b_n \beta({a_n})}{\sqrt{n\phi(h_n)}} . \end{eqnarray} (8.57)

    We also have

    \begin{gathered}&\mathbb{P}\left\{\left\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{q:|q-p| \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathrm{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right.\right.\right. \\ &\left.\left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right]\right\} \right\|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\}\\ &\;\leqslant \mathbb{P}\left\{\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right.\right.\right.\\ &\qquad\qquad\qquad\left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\}. \end{gathered}

    Since Eq (8.55) still applies, the problem reduces to

    \mathbb{P}\left\{\left\vert \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathrm{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{F\left(\zeta_{i_1}, \zeta_{i_2}\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right.\right.\right.\\ \;\;\;\;\;\;\;\;\left.\left.\left.-\mathbb{E}\left[F\left(\zeta_{i_1}, \zeta_{i_2}\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right]\right\} \right\vert > \delta\right\}\\ \;\;\;\lesssim \delta^{-2} n^{-1} \phi^{-1}\left(h_n\right) Var\left(\sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathrm{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2} F\left(\zeta_{i_1}, \zeta_{i_2}\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right),

    and we follow the same procedure as in (8.56). The remaining terms have already been shown to be asymptotically negligible.

    Finally, since

    |r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}\left(\mathfrak{U}_n(\varphi, \mathbf{i})\right)|\rightarrow 0,

    and

    \mathfrak{U}_n(1, \mathbf{i}) \underset{ \mathbb{P}}{\rightarrow } 1,

    the weak convergence of our estimator follows.

    Technical lemmas

    The forthcoming proof relies on the arguments delineated in [39,40,165], extended to the single index model framework.

    Lemma 8.1. Let K_2(\cdot) denote a one-dimensional kernel function satisfying Assumption 2 part i). If Assumption 1 holds, then:

    \begin{equation} i)\; \; \mathbb{E}\left\lvert\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert\lesssim \frac{m \phi^{m-1}(h_n)}{n h_n }; \end{equation} (8.58)
    \begin{equation} ii)\; \; \; \; \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2} \left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right] \lesssim \frac{m \phi^{m-1}(h_n)}{n h_n }+\phi^{m}(h_n); \end{equation} (8.59)
    \begin{equation} iii)\; \; \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\sim \phi^{m}(h_n). \end{equation} (8.60)

    Proof of Lemma 8.1. For the first inequality i) , assuming that the kernel function K_2(\cdot) is the asymmetric triangular kernel, that is, K_2(x) = (1 -x) \mathbb{1}_{(x \in [0, 1])} , we have

    \begin{aligned} & \mathbb{E}\left\lvert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert \\ &\;\;\; = \mathbb{E} \left[\left\lvert \sum\limits_{k = 1}^m \left(K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert\right. \\ & \;\;\;\;\;\;\;\;\left. \times \left\lvert\prod\limits_{i = 1}^{k-1} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert\times \left\lvert\prod\limits_{j = k+1}^{m} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert\right] \\ & \qquad \qquad\qquad \qquad\qquad\qquad\qquad \qquad \qquad\qquad \qquad \qquad \qquad \qquad\text{(Using a telescoping argument)} \\ &\;\;\;\leq \left\{ \mathbb{E} \left[\left\lvert \sum\limits_{k = 1}^m \left( K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert\right]^3 \right\}^{1/3} \\ & \;\;\;\;\;\;\;\;\times \left\{ \mathbb{E} \left\lvert\prod\limits_{i = 1}^{k-1} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert^3\right\}^{1/3} \times \left\{ \mathbb{E}\left\lvert\prod\limits_{j = k+1}^{m} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert^3 \right\}^{1/3} \\ &\qquad\qquad\qquad\qquad \qquad\qquad \;\;\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad\text{(By Hölder's inequality)} \\ &\;\;\;\leq \left\{ \mathbb{E}\left[\sum\limits_{k = 1}^m \left\lvert \left( K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert \right]^3 \right\}^{1/3} \\ &\;\;\;\;\;\;\;\;\times\left\{\prod\limits_{i = 1}^{k-1} \left( \mathbb{E}\left\lvert K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert ^{3p_i}\right)^{1/p_i} \right\}^{1/3} \\ & \;\;\;\;\;\;\;\;\times \left\{\prod\limits_{j = k+1}^{m} \mathbb{E} \left(\left\lvert K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert^{3q_j}\right)^{1/q_j} \right\}^{1/3} \qquad\text{(By Hölder's inequality)} \\ &\;\;\;\lesssim \left\{\sum\limits_{k = 1}^m \mathbb{E} \left\lvert \frac{1}{n h_n } U_{i_k, n}^{(i_k/n)} \right\rvert^3 \right\}^{1/3} \times \left\{\prod\limits_{i = 1}^{k-1} \left( \mathbb{E}\left\lvert \mathbb{1}_{\{d_{\theta_k}(x_k, X_{i_k, n})\leq h\}} \right\rvert^{3p_i}\right)^{1/p_i}\right. \\ &\;\;\;\;\;\;\;\;\left.\times \prod\limits_{j = k+1}^{m}\left( \mathbb{E}\left\lvert \mathbb{1}_{\{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})\leq h\}}\right\rvert^{3q_j}\right)^{1/q_j}\right\}^{1/3}\qquad \qquad\qquad\;\;\text{(By Assumption 1)} \\ &\;\;\;\lesssim \left\{\sum\limits_{k = 1}^m \frac{1}{n^3 h^3 } \mathbb{E} \left\lvert U_{i_k, n}^{(i_k/n)} \right\rvert^3\right\}^{1/3}\times \left\{ \prod\limits_{i = 1}^{k-1} \left(F^{3p_i}(h, x_k)\right)^{1/p_i} \prod\limits_{j = k+1}^{m}\left(F^{3q_j}_{i_k/n}(h, x_k)\right)^{1/q_j}\right\}^{1/3} \\ &\;\;\;\lesssim\frac{1}{n h_n } \left\{\sum\limits_{k = 1}^m \mathbb{E} \left\lvert U_{i_k, n}^{(i_k/n)} \right\rvert^3\right\}^{1/3}\times \left\{ \prod\limits_{i = 1}^{k-1} C_d \phi^3(h_n) f_1^3(x_k) \times \prod\limits_{j = k+1}^{m}C_d \phi^3(h_n) f_1^3(x_k)\right\}^{1/3} \\ & \;\;\;\lesssim \frac{m\phi^{m-1}(h_n)}{n h_n }. \end{aligned} (8.61)

    For the second inequality ii) , we have:

    \begin{aligned} & \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2} \left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right] \\ &\;\;\; = \mathbb{E}\left[ \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)+\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] . \end{aligned}

    By the linearity of the expectation, inequality i), and the bound

    \begin{equation*} \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \right]\lesssim \phi^{m}(h_n), \end{equation*}

    obtained using Assumption 1 part iv), the proof of this inequality is complete. Now, we consider the last one. Set

    \widetilde{K}^2_2(x_k): = K_{2}^2\left({d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}\right).

    We have

    \begin{aligned} & \mathbb{E}\left[\prod\limits_{k = 1}^m \widetilde{K}^2_2\left(\frac{x_k}{h_n}\right)\right] = \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \widetilde{K}_{2}^{2}\left(\frac{y_k}{h_n}\right) \mathbb{P}(dy_1, \ldots, dy_m) \\ &\;\;\; = - \frac{2}{h_n} \int_{0}^h\cdots\int_{0}^h \prod\limits_{j = 1, j\neq \ell }^m\widetilde{K}_{2}\left(y_j\right) \\&\;\;\;\;\;\;\;\;\times\int_{0}^h \widetilde{K}_{2}\left(y_\ell\right) \widetilde{K}_{2}^{\prime}\left(y_\ell\right) \mathbb{P}(dy_1, \ldots, dy_{\ell-1}, y_\ell, dy_{\ell+1}, \ldots, dy_m)\, dy_\ell\; \; \text{(Integration by parts)} \\ &\;\;\; = \frac{(-2)^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1 }^m\widetilde{K}_{2}\left(y_k\right) \widetilde{K}_{2}^{\prime}\left(y_k\right) \mathbb{P}(y_1, \ldots, y_m)\, dy_1\ldots dy_m \\ &\;\;\;\sim \frac{2^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \widetilde{K}_{2}\left(y_k\right) \widetilde{K}_{2}^\prime\left(y_k\right) \phi_k(y_k)\, d\mathbb{P}(y_1, \ldots, y_m) \\ &\;\;\; = \frac{2^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \left(1-\frac{y_k}{h_n}\right)\phi_k(y_k)\, d\mathbb{P}(y_1, \ldots, y_m) \\ & \quad \qquad\qquad \qquad\qquad \qquad\text { (Using Assumption } \left.2 \text { ii) and } K_2(x) = (1-x) \mathbb{1}_{(x \in[0, 1])}\right) \\ &\;\;\; = \frac{2^m}{h^{2m}} \int_{0}^h\cdots\int_{0}^h \left(\int_{0}^{y_1} \cdots\int_{0}^{y_m} \prod\limits_{k = 1}^m\phi_k(\epsilon_k)\, d\epsilon_1\ldots d\epsilon_m\right) d\mathbb{P}(y_1, \ldots, y_m) \\ & \qquad \quad\qquad \qquad\qquad \qquad\text{(By an integration by parts)} \\ &\;\;\;\sim \frac{2^m}{h^{2m}} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m y_k \phi_k(y_k)\, d\mathbb{P}(y_1, \ldots, y_m)\sim \frac{1}{h^{2m}}\phi^{m}(h_n) h^{2m} \\&\;\;\;\sim \phi^{m}(h_n). \end{aligned}

    The final result is established by utilizing the small-ball lower bound provided in (2.11). Consequently, inequality (8.60) follows.
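    To illustrate the order \phi^m(h_n) in (8.60), one can run a small Monte Carlo with a scalar stand-in for the functional variable (the assumptions of the sketch: X uniform on [0, 1] , d(x, X) = \vert x-X\vert , so that \phi(h) \asymp 2h at an interior point; constants are not tracked):

```python
import numpy as np

rng = np.random.default_rng(2)

def K2(x):
    """Asymmetric triangular kernel K_2(x) = (1 - x) on [0, 1], else 0."""
    return np.where((x >= 0) & (x <= 1), 1.0 - x, 0.0)

m, x, h = 3, 0.5, 0.05
X = rng.uniform(size=(10**6, m))                      # scalar stand-in for X_i
prod_K2sq = np.prod(K2(np.abs(x - X) / h) ** 2, axis=1)
# Both quantities scale like h^m (constants differ): E[prod K2^2] ~ phi^m(h).
print(prod_K2sq.mean(), (2 * h) ** m)
```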

    In the ensuing discussion, we will present a lemma that can be regarded as a technical result in the proof of our proposition.

    Lemma 8.2. Let \mathscr{F}_{m}\mathfrak{K}^m_\Theta be a uniformly bounded class of measurable canonical functions, where m\geq 2 . Suppose there exist finite constants \boldsymbol{a} and \boldsymbol{b} such that the covering number of \mathscr{F}_{m}\mathfrak{K}^m_\Theta satisfies:

    \begin{equation} N(\epsilon, \mathscr{F}_m\mathfrak{K}^m_\Theta, \Vert\cdot\Vert_{L_2(Q)}) \leq \boldsymbol{a}\epsilon^{-\boldsymbol{b}}, \end{equation} (8.62)

    for every \epsilon > 0 and every probability measure Q . If the mixing coefficients \beta of the local stationary sequence \{Z_i = (X_{\mathbf{i}, n}, {\mathfrak W}_{\mathbf{i}, n})\}_{i \in \mathbb{N}^\star} satisfy:

    \begin{equation} \beta(k) k^r \rightarrow 0, \; \; \mathit{\text{as}}\; \; k \rightarrow \infty, \end{equation} (8.63)

    for some r > 1 , then:

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \mathbb{E} \left\vert h^{m/2}_n \phi^{m/2}(h_n) n^{-m+1/2} \sum\limits_{\mathbf{i} \in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H(Z_{i_1}, \ldots, Z_{i_m}) \right\vert \rightarrow 0. \end{equation} (8.64)

    Remark 8.3. As mentioned before, {\mathfrak W}_{\mathbf{i}, n} will be equal to 1 or to \varepsilon_{\mathbf{i}, n} = \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}} . In the proof of the previous lemma, {\mathfrak W}_{\mathbf{i}, n} will be equal to \varepsilon_{\mathbf{i}, n} = \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}} , and we will use the notation \mathfrak{W}^{(u)}_{\mathbf{i}, n} to indicate \sigma\left(\mathbf{u}, \mathbf{x}\right)\varepsilon_{\mathbf{i}} .

    Proof of Lemma 8.2. The proof of this lemma relies on the blocking method, specifically drawing upon techniques introduced by [13]. The central idea involves partitioning the strictly stationary sequence (Z_1, \ldots, Z_n) into 2\upsilon_n blocks, each of length a_n , along with a residual block of length n-2\upsilon_n a_n . This approach, known as Bernstein's method and discussed in [22], facilitates the application of symmetrization and various techniques designed for i.i.d. random variables. To establish the independence between the blocks, the smaller blocks are placed between two consecutive larger blocks, and their contribution should be asymptotically negligible. Next, introduce the sequence of independent blocks \left(\eta_{1}, \ldots, \eta_{n}\right) such that:

    \mathscr{L}\left(\eta_{1}, \ldots, \eta_{n}\right) = \mathscr{L}\left(\mathrm{Z}_{1}, \ldots, \mathrm{Z}_{a_{n}}\right) \times \mathscr{L}\left(\mathrm{Z}_{a_{n}+1}, \ldots, \mathrm{Z}_{2 a_{n}}\right) \times \cdots
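    The following toy sketch (Python) mimics this construction for an AR(1) stand-in, an assumption of the sketch rather than the actual functional data-generating process: each length- a_n block of \eta has the same law as the corresponding block of Z , while distinct blocks come from independent runs of the same mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)

def ar1_path(length: int, rho: float = 0.5) -> np.ndarray:
    """One stationary AR(1) run; the common mechanism for Z and each eta-block."""
    x, prev = np.empty(length), rng.normal() / np.sqrt(1 - rho**2)
    for t in range(length):
        prev = rho * prev + rng.normal()
        x[t] = prev
    return x

n, a_n = 10_000, 25
Z = ar1_path(n)                               # a beta-mixing sequence
# eta: same within-block law, independent across blocks.
eta = np.concatenate([ar1_path(min(a_n, n - s)) for s in range(0, n, a_n)])
last, first = eta[a_n - 1::a_n][:-1], eta[a_n::a_n]  # boundary pairs of eta
print(np.corrcoef(Z[:-1], Z[1:])[0, 1])       # approx rho = 0.5 within Z
print(np.corrcoef(last, first)[0, 1])         # approx 0 across eta blocks
```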

    An application of the result of [73] implies that for any measurable set A :

    \begin{aligned} & \mid \mathbb{P}\left\{\left(\eta_1, \ldots, \eta_{a_n}, \eta_{2 a_n+1}, \ldots, \eta_{3 a_n}, \ldots, \eta_{2\left(v_n-1\right) a_n+1}, \ldots, \eta_{2 v_n a_n}\right) \in \mathrm{A}\right\} \\ & \quad\;\;\;\;\;\;\;\;\;\;\;-\mathbb{P}\left\{\left(\mathrm{Z}_1, \ldots, \mathrm{Z}_{a_n}, \mathrm{Z}_{2 a_n+1}, \ldots, \mathrm{Z}_{3 a_n}, \ldots, \mathrm{Z}_{2\left(v_n-1\right) a_n+1}, \ldots, \mathrm{Z}_{2 v_n a_n}\right) \in \mathrm{A}\right\} \mid \\ & \leqslant 2\left(v_n-1\right) \beta\left(a_n\right) . \end{aligned} (8.65)

    Since we are working with a locally stationary sequence (X_1, \ldots, X_n) , the sequence of independent blocks used subsequently is denoted by \{\eta_i\}_{i\in \mathbb{N}^*} . We decompose the process based on the distribution of these blocks:

    \begin{aligned} & \sum\limits_{i_1 \neq i_2}^{n}\frac{1}{h^{2}\phi^2(h_n)}\prod\limits_{k = 1}^2\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}{\mathfrak W}_{i_1, i_2, \varphi, n} \\ &\;\;\; = \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \frac{ {\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+2 \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+2 \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{T}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2} \sum\limits_{i_1, i_2 \in \mathrm{T}_{p}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;: = \mathrm{I}+\mathrm{II}+\mathrm{III}+\mathrm{IV}+\mathrm{V}+\mathrm{VI} . \end{aligned} (8.66)

    (I): The same type of block but not the same block: Assume that the sequence of independent blocks \{\eta_i\}_{i\in \mathbb{N}^*} consists of blocks of size a_n . An application of (8.61) shows that:

    \begin{eqnarray} &&{ \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\right.}\left. \left.\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n} \right\vert > \delta\right) \\ &&\leq \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ && \qquad\qquad\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &&+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &&\qquad\qquad\qquad \qquad\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ &&+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &&\leq \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \qquad\qquad\qquad\left. \left.K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert > \delta\right)+ 2\upsilon_n \beta({b_n}) + o_ \mathbb{P}(1)+o_ \mathbb{P}(1). \end{eqnarray}

    By the fact that

    \begin{eqnarray} &&{ \mathbb{E} \left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right.}\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \left\vert \mathbb{E}\left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \\ &&\qquad \mathbb{E}\left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right)\right] \sigma\left(\frac{\mathbf{i}}{n}, X_{i, n}\right)\varepsilon_{\mathbf{i}}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) \mathbb{E} \left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right. \\ && \left.\left.-\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right)\right] \left[ \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) - \sigma\left(\mathbf{u}, X_{\mathbf{i}, n}\right) + \sigma\left(\mathbf{u}, X_{\mathbf{i}, n}\right)\right] \right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (\sigma\left(\mathbf{u}, \mathbf{x}\right)+o_ \mathbb{P}(1)) \\ &&\qquad\qquad \mathbb{E}\left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right)\right] \right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (\sigma\left(\mathbf{u}, \mathbf{x}\right)+o_ \mathbb{P}(1)) \left[ \frac{m \phi^{m-1}(h_n)}{n h_n }\right] \\&&\qquad\qquad \mbox{(where m = 2 and using Lemma 8.1 Equation (8.58))} \\ &\sim& o_ \mathbb{P}(1), \end{eqnarray}

    and

    \begin{eqnarray} &&{ \mathbb{E}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right.}\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}\left\vert \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right) \left[\sigma\left(\frac{\mathbf{i}}{n}, X_{i, n}\right)\varepsilon_{\mathbf{i}} - \sigma\left(\mathbf{u}, \mathbf{x}\right)\varepsilon_{\mathbf{i}}\right]\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) \mathbb{E}\left\vert \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i, n}^{(i/n)})}{h_n}\right) \left[\sigma\left(\frac{\mathbf{i}}{n}, X_{i, n}\right) - \sigma\left(\mathbf{u}, {\mathbf{x}}\right)\right]\right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (o_ \mathbb{P}(1)) \int_{0}^h \prod\limits_{k = 1}^m K_2\left(\frac{y_k}{h_n}\right) dF_{i_k/n}(y_k, x_k) \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (o_ \mathbb{P}(1))( \phi^2(h_n)) \\&\sim& o_ \mathbb{P}(1). \end{eqnarray} (8.67)

    We keep the choice of b_n and \upsilon_n such that

    \begin{equation} \upsilon_nb_n^r \leqslant 1, \end{equation} (8.68)

    which implies that 2\upsilon_n \beta_{b_n} \to 0 as n \to \infty ; hence, only the second summand remains to be considered. For this term, we turn to the work of [11] in the non-fixed kernel setting. Specifically, we define

    f_{i_{1}, \ldots, i_m} = \prod\limits_{k = 1}^m \xi_{i_k} \times H ,

    and \mathcal{F}_{i_{1}, \ldots, i_m} represents a collection of kernels and the corresponding class of functions associated with this kernel. Subsequently, we will apply [64, Theorem 3.1.1 and Remarks 3.5.4 part 2] for decoupling and randomization. Given our assumption that m = 2 , we can observe that:

    \begin{aligned} & \mathbb{E} \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ & \;\;\; = \mathbb{E} \left\Vert n^{-3/2} h\phi(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta}) \right\Vert_{\mathcal{F}_{i_1, i_2}}\\ &\;\;\; \leq c_2 \mathbb{E} \left\Vert n^{-3/2} h \phi(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta})\right\Vert_{\mathcal{F}_{i_1, i_2}} \\ & \;\;\;\leqslant c_2 \mathbb{E} \int_0^{D_{nh}^{(U_1)}}{N\left(t, \mathcal{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(1)}\right)} dt, \; \; \; \text{(by Lemma A.5 and Proposition A.6)} \end{aligned} (8.69)

    where D_{nh}^{(U_1)} is the diameter of \mathcal{F}_{i_1, i_2} with respect to the distance \widetilde{d}_{nh, 2}^{(1)} , defined respectively as

    \begin{aligned} &D_{nh}^{(U_1)} : = \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2} h \phi(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta}) \right\vert \right\Vert_{\mathcal{F}_{i_1, i_2}} \\ &\;\;\; = \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h \phi^{-1}(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} } \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert \right\Vert_{\mathcal{F}_{2}\mathcal{K}^{2}}, \end{aligned}

    and :

    \begin{aligned} &\widetilde{d}_{nh, 2}^{(1)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \\ & \;\;\;: = \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h{\phi^{-1}(h_n)}\sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} } \left[\xi_{1i_1}\xi_{1i_2}\prod\limits_{k = 1}^2 K_{2, 1} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right. \right. \\ & \;\;\;\;\;\;\;\;-\left. \left.\xi_{2i_1}\xi_{2i_2} \prod\limits_{k = 1}^2 K_{2, 2} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n} \right] \right\vert .\end{aligned}

    Let us consider another semi-norm \widetilde{d}_{nh, 2}^{(2)} :

    \begin{eqnarray*} {\widetilde{d}_{ nh, 2}^{(2)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)} && = \frac{1}{n h^2\phi^2(h_n)}\left[\sum\limits_{i_1\neq i_2}^{\upsilon_n}\left(\xi_{1i_1}\xi_{1i_2}\prod\limits_{k = 1}^2 K_{2, 1} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right. \right. \nonumber\\ && -\left. \left.\xi_{2i_1}\xi_{2i_2} \prod\limits_{k = 1}^2 K_{2, 2} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n} \right)^2\right]^{1/2}. \end{eqnarray*}

    One can see that

    \widetilde{d}_{nh, 2}^{(1)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \leqslant a_n n^{-1/2}h \phi(h_n) \widetilde{d}_{ nh, 2}^{(2)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right).

    We readily infer that

    \begin{aligned} & \mathbb{E} \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &\;\;\; \leqslant c_2 \mathbb{E} \int_0^{D_{nh}^{(U_1)}}{N\left(t a_n^{-1} n^{1/2}, \mathscr{F}_{i, j}, \widetilde{d}_{nh, 2}^{(2)}\right)} dt \\ & \;\;\;\leqslant c_2 a_n n^{-1/2}\mathbb{P}\left\{D_{nh}^{(U_1)}a_n^{-1} n^{1/2}\geqslant \lambda_n \right\} + c_m a_n n^{-1/2} \int_0^{\lambda_n}{\log{t^{-1}}dt}, \end{aligned} (8.70)

    where \lambda_n \to 0 . Since

    \int_0^{\lambda_n}{\log{t^{-1}}\, dt} = \lambda_n\left(1+ \log{\lambda_n^{-1}}\right) = O\left(\lambda_n \log{\lambda_n^{-1}}\right),

    the sequences a_n and \lambda_n must be chosen in such a way that the following relation is achieved:

    \begin{equation} a_n \lambda_n n^{-1/2} \log{\lambda_n ^{-1}} \to 0. \end{equation} (8.71)
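    A quick numerical check of the elementary integral identity used above (Python; midpoint Riemann sums, purely illustrative):

```python
import numpy as np

# Check: int_0^lam log(1/t) dt = lam * (1 + log(1/lam)) = O(lam log(1/lam)).
for lam in [1e-1, 1e-2, 1e-3]:
    m = 10**6
    t = (np.arange(1, m + 1) - 0.5) * (lam / m)    # midpoints of [0, lam]
    numeric = np.sum(np.log(1.0 / t)) * (lam / m)  # midpoint Riemann sum
    closed = lam * (1.0 + np.log(1.0 / lam))
    print(lam, numeric, closed)                    # the two columns agree
```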

    Utilizing the triangle inequality along with Hoeffding's trick, we readily obtain that

    \begin{aligned} & a_n n^{-1/2} \mathbb{P}\left\{D_{nh}^{(U_1)}\geqslant \lambda_na_n n^{-1/2} \right\} \\ &\;\;\;\leqslant \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p\neq q}^{\upsilon_n}\left[\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ & \;\;\;\leqslant c_2 \upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{aligned} (8.72)

    where \left\{{\eta}^\prime_i\right\}_{i \in \mathbb{N}^*} are independent copies of ({\eta}_i)_{i \in \mathbb{N}^*} . By imposing:

    \begin{equation} \lambda_n ^{-2}a_n^{1-r}n^{-1/2} \to 0, \end{equation} (8.73)

    we readily infer that

    \begin{eqnarray*} {\left\Vert \upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}} \leqslant O\left(\lambda_n ^{-2}a_n^{1-r}n^{-1/2}\right). \nonumber \end{eqnarray*}

    Symmetrizing the last inequality in (8.72) and subsequently applying Proposition A.6 from the Appendix yields

    \begin{eqnarray} &&\upsilon_n \lambda_n ^{-2} a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ && \leqslant c_2 \mathbb{E}\left(\int_0^{D_{nh}^{(U_2)}}\left(\log{N(u, \mathscr{F}_{i, j}, \widetilde{d}_{nh, 2}^\prime)}\right)^{1/2}du\right), \end{eqnarray} (8.74)

    where

    \begin{eqnarray} D_{nh}^{(U_2)} && = \left\Vert \mathbb{E}_{\epsilon}\left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\right.\right. \\ &&\left.\left.\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2}K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\vert \right\Vert_{\mathscr{F}_2\mathscr{K}^2}. \end{eqnarray}

    and, for \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^\prime\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime} \in \mathcal{F}_{i_1, i_2} :

    \begin{aligned} &\widetilde{d}_{nh, 2}^\prime \left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)\\ &\;\;\;: = \mathbb{E}_{\epsilon}\left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left[\left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta^{\prime}_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right)^2\right.\right.\nonumber\\ &\;\;\;\;\;\;\;\; - \left. \left. \left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{2i_1}\xi_{2i_2} {K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, {\eta_{i_1}})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta^{\prime}_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right)^2\right]\right\vert.\nonumber \end{aligned}

    By the fact that:

    \begin{aligned} & \mathbb{E}_{\epsilon} \left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2}K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2 \left(\frac{d_{\theta_2}(x_2, \eta^{\prime}_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right\vert \\ & \;\;\;\leqslant a_n^{3/2}\lambda_n^{-2}n^{-1}\left[\upsilon_n^{-1}a_n^{-2}\phi^{-2}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\sum\limits_{i_1, i_2\in H_p^{(U)} }\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta^{\prime}_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^4\right]^{1/2}, \end{aligned}

    it suffices to impose:

    \begin{equation} a_n^{3/2}\lambda_n^{-2}n^{-1}\to 0, \end{equation} (8.75)

    to obtain the convergence of (8.74) to zero. Concerning the choice of a_n, b_n , and \upsilon_n , note that any values satisfying (8.54), (8.68), (8.71), (8.73), and (8.75) are suitable.

    (II): The same block:

    \begin{aligned} & \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \frac{1}{\phi^2(h_n)} \xi_{i_1} \xi_{i_2} \right.\right. \\ & \;\;\;\;\;\;\;\;\left.\left.K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1, n})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2, n})}{h_n}\right) {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \\ &\;\;\;\leq \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \frac{1}{\phi^2(h_n)} \xi_{i_1} \xi_{i_2} \right.\right. \\ &\;\;\;\;\;\;\;\;\left.\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \right) \\ &\;\;\;+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &\;\;\;\;\;\;\;\;\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ &\;\;\;+\mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right. \\ &\left. \;\;\;\;\;\;\;\; \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &\;\;\;\leqslant \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)\right.\right. \\ & \;\;\;\;\;\;\;\;\left.\left. K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert > \delta\right)+ 2\upsilon_n \beta_{b_n} . \end{aligned} (8.76)

Similarly to {\rm I} , we can show that both the first and second terms on the right-hand side of the previous inequality are o_{\mathbb{P}}(1) . Therefore, as in the preceding proof, it is enough to establish

\begin{eqnarray} \mathbb{E}\left(\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} } \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left.\left. K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{eqnarray}

Notice that, when we consider a uniformly bounded class of functions, we obtain, uniformly in B^m \times \mathscr{F}_2\mathscr{K}^2 ,

\mathbb{E}\left(\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} } \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n} \right) = O(a_n).

    This implies that we have to prove that, for \mathbf{u} \in B^m

\begin{aligned} & \mathbb{E}\left(\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left[\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right. \right. \right. \\ & \qquad - \left. \left. \left. \mathbb{E}\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right)\right]\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{aligned}

    As for empirical processes, to prove (8.77), it suffices to symmetrize and show that

\begin{eqnarray} \mathbb{E}\left(\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \right. \left. \left.K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{eqnarray}

In a similar way as in (8.69), we infer that:

    \begin{eqnarray*} &&{ \mathbb{E}\left(\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.} \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \nonumber\\ &&\leqslant \mathbb{E}\left(\int_0^{D_{nh}^{(U_3)}}{\left(\log N \left(u, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(3)}\right)\right)^{1/2}du}\right), \end{eqnarray*}

    where

    \begin{eqnarray} D_{nh}^{(U_3)}& = & \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert\right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{eqnarray} (8.77)

    and the semi-metric \widetilde{d}_{nh, 2}^{(3)} is defined by

\begin{aligned}& \widetilde{d}_{nh, 2}^{(3)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)\\ &\;\;\; = \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left(\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \right. \right. \\ &\qquad\qquad\qquad \;\;\; \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}- \left. \left. \xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right)\right\vert. \end{aligned}

Since we are treating uniformly bounded classes of functions, we infer that

\begin{eqnarray*} && \mathbb{E}_{\epsilon} \left\vert n^{-3/2} h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert \nonumber\\ && \leqslant a_n^{3/2}n^{-1}h\phi^{-1}(h_n)\left[\frac{1}{\upsilon_na_n^2}\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)\right.\right. \nonumber\\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right)^2\right]^{1/2}\leqslant O\left( a_n^{3/2}n^{-1}\phi^{-1}(h_n)\right). \end{eqnarray*}

Since a_n^{3/2}n^{-1}\phi^{-1}(h_n)\to 0 , we have D_{nh}^{(U_3)}\to 0 , and we obtain {\rm II}\to 0 as n \to \infty .

    (III): Different types of blocks:

\begin{aligned} & \mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{1}{\phi^2(h_n)}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) \right.\right. \\ &\;\;\;\;\;\;\;\; \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \\&\;\;\;\leq \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{1}{\phi^2(h_n)} \xi_{i_1} \xi_{i_2} \right.\right. \\ &\;\;\;\;\;\;\;\; \left.\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \right) \\ &\;\;\;\;\;\;\;\;+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ & \;\;\;\;\;\;\;\;+\mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg). \end{aligned}

As mentioned earlier, we have already addressed the first and second summands in the previous inequality. What remains is the last summand, for which an application of (8.61) shows that

\begin{aligned} &\sum\limits_{p = 1}^{\upsilon_n} \mathbb{E}\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2\in T_q^{(U)}}\xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ & \;\;\; \leqslant \sum\limits_{p = 1}^{\upsilon_n} \mathbb{E}\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} \right. \\ & \;\;\;\;\;\;\;\;\left.K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} +n^{-3/2} h \phi^{-1}(h_n) \upsilon_n^2 a_n b_n \beta({a_n}) , \end{aligned}

where

n^{-3/2} h \phi^{-1}(h_n) \upsilon_n^2 a_n b_n \beta({a_n})\to 0,

by Condition (8.63) and the choice of a_n , b_n , and \upsilon_n . For p = 1 and p = \upsilon_n :

    \begin{aligned} & \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\; = \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} . \end{aligned}

    For 2\leqslant p\leqslant \upsilon_n-1 , we obtain:

    \begin{aligned}& \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\;\; = \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 4}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\;\;\leqslant \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{aligned}

    therefore it suffices to treat the convergence:

    \begin{eqnarray} \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\longrightarrow 0 . \end{eqnarray} (8.78)

Using arguments similar to those in [13], we apply the standard symmetrization and obtain:

    \begin{eqnarray} &&{ \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.} \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &&\leqslant 2 \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ && = 2 \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)}\leqslant \gamma_n\right\}}\right\}\\ &&+ 2 \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)} > \gamma_n\right\}}\right\}\\ && = 2\rm{III}_1 + 2 \rm{III}_2, \end{eqnarray} (8.79)

    where

    \begin{eqnarray} D_{nh}^{(U_4)} & = & \left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\left[\sum\limits_{q = 3}^{\upsilon_n} \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\right. \\ && \left.\left. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right]^{1/2}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}. \end{eqnarray} (8.80)

    In a similar way as in (8.69), we infer that

    \begin{eqnarray} {\rm{III}_1} \leqslant c_2 \int_0^{\gamma_n}{\left(\log{N\left(t, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(4)}\right)}\right)^{1/2} dt}, \end{eqnarray} (8.81)

    where

    \begin{eqnarray} &&{\widetilde{d}_{nh, 2}^{(4)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)} : = \mathbb{E}_{\epsilon}\left\vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\left[\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\\ && \times {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n} - \left.\left.\xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right]\right\vert. \end{eqnarray}

    Since we have

    \begin{eqnarray} &&{ \mathbb{E}_{\epsilon}\left\vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.}\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert \\ &&\leqslant a_n^{-1/2}b_n h^2\phi(h_n)\left(\frac{1}{ a_n b_n \upsilon_n h^2 \phi^{4}(h_n)}\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\left[\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\\ &&\left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2\right)^{1/2}, \end{eqnarray}

    and considering the semi-metric

\begin{aligned} &\widetilde{d}_{nh, 2}^{(5)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \nonumber\\ &\;\;\;\;\;\;\;\;: = \left(\frac{1}{ a_n b_n \upsilon_n h^2 \phi^4(h_n)}\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\left[\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right){K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right.\right.\nonumber\\ &\;\;\;\;\;\;\;\; - \left.\left.\xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right]^2\right)^{1/2}, \end{aligned}

we can show that the expression in (8.81) is bounded by

    \begin{equation*} \upsilon_n^{1/2} b_n n^{-1/2}h^2\phi(h_n)\int_0^{\upsilon_n^{-1/2}b_n^{-1}n^{1/2}h^2\gamma_n}{\left(\log{N\left(t, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(5)}\right)}\right)^{1/2}}dt, \end{equation*}

by choosing \gamma_n = n^{-\alpha} for some \alpha > (17r-26)/(60r) , we obtain the convergence to zero of the previous quantity. To bound the second term on the right-hand side of (8.79), we observe that

    \begin{eqnarray} \rm{III}_2 & = & \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left. K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)} > \gamma_n\right\}}\right\}\\ & \leqslant &a_n^{-1}b_n n^{1/2}h\phi^{-1}(h_n)\mathbb{P}\left\{ \left\Vert \upsilon_n^2 n^{-3}h^2\phi^{-2}(h_n)\sum\limits_{q = 3}^{\upsilon_n} \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} \right.\right. \right.\\ && \left.\left. K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\geqslant \gamma_n ^2\Bigg\}. \end{eqnarray} (8.82)

Now, we apply the square root trick to the last expression, conditionally on H_{1}^{(U)} . Denote by \mathbb{E}_T the expectation with respect to \sigma\{{\eta}_{i_2}: i_2 \in T_q, q\geqslant 3\} . We now consider the case where the class of functions \mathscr{F}_m is unbounded, with an envelope function satisfying, for some p > 2 :

    \begin{equation} \theta_p: = \sup\limits_{\mathbf{t}\in\mathcal{S}^m_\mathscr{H}} \mathbb{E}\left( F^p(\mathbf{Y})\vert\mathbf{X} = \mathbf{t}\right) < \infty , \end{equation} (8.83)

for 2r/(r-1) < s < \infty (in the notation of [89, Lemma 5.2]). Set

\begin{eqnarray*} M_n& = & \upsilon_n^{1/2} \mathbb{E}_T \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2, \end{eqnarray*}

    where

    t = \gamma_n^2a_n^{5/2}n^{1/2}h\phi^{-1}(h_n), \; \; \rho = \lambda = 2^{-4}\gamma_na_n^{5/4}n^{1/4}h^{1/2}\phi^{-1/2}(h_n),

    and

    m = \exp{\left(\gamma_n^2n h^2\phi^{-2}(h_n)b_n^{-2}\right)}.

Since we require t > 8M_n and m \to \infty , using arguments similar to those in [13, page 69], we obtain the convergence of (8.81) and (8.82) to zero.

(IV): Neighboring blocks: The goal here is to prove that:

\begin{eqnarray} \mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) \right.\right.\left.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \rightarrow 0. \end{eqnarray}

    We have

    \begin{eqnarray*} && \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\leqslant1}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1}^{(i_1/n)})}{h_n}\right) \right.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2}^{(i_2/n)})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &&\leqslant c_2 \upsilon_n a_n b_n n^{-3/2} h \phi^{-1}(h_n) \to 0 . \end{eqnarray*}

    Hence the proof of the lemma is complete.

The author declares that he has not used Artificial Intelligence (AI) tools in the creation of this article.

The author would like to thank the Editor-in-Chief, an Associate Editor, and three referees for their extremely helpful remarks, which resulted in a substantial improvement of the original version of the work and a more sharply focused presentation.

    The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

This appendix collects auxiliary results and classical examples that are needed for a more comprehensive understanding of the paper.

Lemma A.1 ([165]). Let I_{h} = \left[C_{1} h, 1-C_{1} h\right] . Suppose that the kernel K_{1} satisfies Assumption 2 part (i). Then, for q = 0, 1, 2 and m > 1 :

    \begin{aligned} &\sup _{\mathbf u \in I_{h}^m} \left\lvert \frac{1}{n^m h^m} \sum\limits_{\mathbf{i}\in I_n^m}\right. \prod\limits_{k = 1}^mK_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\left(\frac{u_k-i_k /n}{h_n}\right)^{q}\\ & \;\;\;\;\;\;\left. -\int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m \left\{ K_{1}\left(\frac{(u_k-v_k)}{h_n}\right)\left(\frac{u_k-v_k}{h_n}\right)^{q} \right\} \prod\limits_{k = 1}^m d v_k\right\rvert = O\left(\frac{1}{nh^{m+1}}\right). \end{aligned}

Lemma A.2 ([165]). Suppose that the kernel K_{1} satisfies Assumption 2 part (i) and let g:[0, 1] \times \mathscr{H} \rightarrow \mathbb{R} , (u, x) \mapsto g(u, x) , be continuously differentiable with respect to u . Then,

\begin{eqnarray} {\sup _{\mathbf u \in I_{h}^m}\left\lvert\frac{1}{n^m h^m} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k /n}{h_n}\right) g\left(\frac{i_k}{n}, x_{k}\right)- \prod\limits_{k = 1}^m g(u_k, x_k)\right\rvert } = O\left(\frac{1}{nh^{m+1}}\right)+o(h_n). \end{eqnarray} (A.1)
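For intuition, the following minimal Python sketch illustrates Lemma A.2 in the case m = 1 . The Epanechnikov kernel and the choice g(u) = \sin(2\pi u) are hypothetical stand-ins used only for this illustration (any kernel integrating to one and satisfying Assumption 2 part (i) would serve); they are not objects from the paper.

```python
import numpy as np

# Illustration of Lemma A.2 for m = 1 (hypothetical choices: Epanechnikov
# kernel K_1, which integrates to one, and g(u) = sin(2*pi*u)).
def K1(t):
    return 0.75 * (1.0 - t ** 2) * (np.abs(t) <= 1.0)

def g(u):
    return np.sin(2.0 * np.pi * u)

n, h = 2000, 0.05
i = np.arange(1, n + 1)

for u in (0.3, 0.5, 0.7):  # interior points of I_h = [C_1 h, 1 - C_1 h]
    riemann = np.sum(K1((u - i / n) / h) * g(i / n)) / (n * h)
    # The kernel-weighted Riemann sum reproduces g(u) up to the stated error terms.
    print(u, riemann, g(u))
```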

Lemma A.3 ([124]). Let \left\{Z_{i, n}\right\} be a zero-mean triangular array such that \left\lvert Z_{i, n}\right\rvert \leq b_{n} , with \alpha -mixing coefficients \alpha(k) . Then, for any \varepsilon > 0 and S_{n} \leq n with \varepsilon > 4 S_{n} b_{n} ,

\begin{equation} \mathbb P\left(\left\lvert\sum\limits_{i = 1}^{n} Z_{i, n}\right\rvert \geq \varepsilon\right) \leq 4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \alpha\left(S_{n}\right). \end{equation} (A.2)

Lemma A.4. Let \left\{Z_{i, n}\right\} be a zero-mean triangular array such that \left\lvert Z_{i, n}\right\rvert \leq b_{n} , with \beta -mixing coefficients \beta(k) . Then, for any \varepsilon > 0 and S_{n} \leq n with \varepsilon > 4 S_{n} b_{n} ,

\begin{equation} \mathbb P\left(\left\lvert\sum\limits_{i = 1}^{n} Z_{i, n}\right\rvert \geq \varepsilon\right) \leq 4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \beta\left(S_{n}\right). \end{equation} (A.3)

Proof of Lemma A.4. Using Lemma A.3 and the fact that, for any \sigma -algebras \mathcal{A} and \mathcal{B} , \alpha(\mathcal{A}, \mathcal{B}) \leq \beta(\mathcal{A}, \mathcal{B}) , Lemma A.4 holds.

    Lemma A.5 ([63]). Let X_{1}, \ldots, X_{n} be a sequence of independent random elements taking values in a Banach space (B, \|\cdot\|) with \mathbb E X_{i} = 0 for all i . Let \left\{\varepsilon_{i}\right\} be a sequence of independent Bernoulli random variables independent of \left\{X_{i}\right\} . Then, for any convex increasing function \Phi ,

    \mathbb E \Phi\left(\frac{1}{2}\left\|\sum\limits_{i = 1}^{n} X_{i} \varepsilon_{i}\right\|\right) \leq\mathbb E \Phi\left(\left\|\sum\limits_{i = 1}^{n} X_{i}\right\|\right) \leq\mathbb E \Phi\left(2\left\|\sum\limits_{i = 1}^{n} X_{i} \varepsilon_{i}\right\|\right).

Proposition A.6 ([11]). Let \{{X}_i : i \in T\} be a process indexed by a set T , satisfying, for m \geq 1 :

    \left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{p}\right)^{1/p} \leq \left(\frac{p-1}{q-1}\right)^{m/2}\left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{q}\right)^{1/q} , \quad 1 < q < p < \infty,

and the semi-metric:

    \rho(j, i) = \left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{2}\right)^{1/2}.

There exists a constant K = K(m) such that:

\mathbb E\sup\limits_{i, j \in T}\left\Vert{X}_i-{X}_j\right\Vert \leq K \int_0^{D} [\log{N(\epsilon, T, \rho)}]^{m/2}d\epsilon,

where D is the \rho -diameter of T .

Lemma A.7 ([61]). Suppose that X and Y are random variables which are \mathscr{G} - and \mathscr{H} -measurable, respectively, and that \mathbb E|X|^{p} < \infty , \mathbb E|Y|^{q} < \infty , where p > 1 , q > 1 , and p^{-1}+q^{-1} < 1 .

    Then,

    \lvert \mathbb E X Y- \mathbb E X \mathbb E Y\rvert \leq 8\lVert X\rVert_{p}\lVert Y \rVert_{q}[\beta(\mathscr{G}, \mathscr{H})]^{1-p^{-1}-q^{-1}}.

Proof of Lemma A.7. This lemma follows directly from the corresponding \alpha -mixing inequality of [61] and the fact that, for any \sigma -algebras \mathcal{A} and \mathcal{B} , \alpha(\mathcal{A}, \mathcal{B}) \leq \beta(\mathcal{A}, \mathcal{B}) .

    Lemma A.8 ([180]). Let V_{1}, \ldots, V_{L} be strongly mixing random variables measurable with respect to the \sigma -algebras \mathscr{F}_{i_{1}}^{j_{1}}, \ldots, \mathscr{F}_{i_{L}}^{j_{L}} respectively with 1 \leq i_{1} < j_{1} < i_{2} < \cdots < j_{L} \leq n, i_{l+1}-j_{l} \geq w \geq 1 and \left\lvert V_{j}\right\rvert \leq 1 for j = 1, \ldots, L . Then,

    \left\lvert \mathbb E\left(\prod\limits_{j = 1}^{L} V_{j}\right)-\prod\limits_{j = 1}^{L} \mathbb E\left(V_{j}\right)\right\rvert \leq 16(L-1) \alpha(w),

    where \alpha(w) is the strongly mixing coefficient.

Example A.9. The set \mathscr{F} of all indicator functions {\rm 1\!I}_{(-\infty, t]} of cells in \mathbb{R} satisfies:

    {N}\left(\epsilon, \mathscr{F}, d_{\mathbb{P}}^{(2)}\right)\leq \frac{2}{\epsilon^{2}},

for any probability measure \mathbb{P} and any \epsilon\leq 1 . Notice that:

    \int_{0}^{1}\sqrt{\log\left(\frac{1}{\epsilon}\right)}d\epsilon\leq\int_{0}^{\infty}u^{1/2}\exp(-u)du\leq 1.

For more details and discussion on this example, refer to Example 2.5.4 of [178] and [114, p. 157]. The covering numbers of the class of cells (-\infty, t] in higher dimensions satisfy a similar bound, but with a higher power of (1/\epsilon) ; see Theorem 9.19 of [114].
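As a quick numerical sanity check (not part of the original argument), the substitution \epsilon = e^{-u} turns the entropy integral above into \Gamma(3/2) = \sqrt{\pi}/2 \approx 0.8862 \leq 1 ; the short midpoint-rule computation below, a sketch in Python, confirms this.

```python
import numpy as np

# Midpoint-rule check of the entropy integral in Example A.9:
# int_0^1 sqrt(log(1/eps)) d(eps) = Gamma(3/2) = sqrt(pi)/2 ~ 0.8862 <= 1.
m = 1_000_000
eps = (np.arange(m) + 0.5) / m          # midpoints of a uniform grid on [0, 1]
val = np.mean(np.sqrt(np.log(1.0 / eps)))
print(val, np.sqrt(np.pi) / 2.0)        # both ~ 0.8862
```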

    Example A.10. (Classes of functions that are Lipschitz in a parameter, Section 2.7.4 in [178]). Let \mathscr{F} be the class of functions x\mapsto \varphi(t, x) that are Lipschitz in the index parameter t\in T . Suppose that:

    |\varphi(t_1, x)-\varphi(t_2, x)|\leq d({t}_1, {t}_2)\kappa(x)

    for some metric d on the index set T , the function \kappa(\cdot) defined on the sample space \mathcal{X} , and all x . According to Theorem 2.7.11 of [178] and Lemma 9.18 of [114], it follows, for any norm \|\cdot\|_{{\mathscr F}} on \mathscr{F} , that :

    N(\epsilon\|F\|_{{\mathscr F}}, {\mathscr F}, \|\cdot\|_{{\mathscr F}})\leq{N}(\epsilon/2, T, d).

Hence, if (T, d) satisfies

    J(\infty, T, d) = \int_{0}^{\infty}\sqrt{\log {N}(\epsilon, T, d)} d\epsilon < \infty,

then the conclusion holds for \mathscr{F} .

    Example A.11. Let us consider as an example the classes of functions that are smooth up to order \alpha defined as follows, see Section 2.7.1 of [178] and Section 2 of [177]. For 0 < \alpha < \infty let \lfloor \alpha \rfloor be the greatest integer strictly smaller than \alpha . For any vector k = (k_{1}, \ldots, k_{d}) of d integers define the differential operator

D^{k_{.}}: = \frac{\partial^{k_{.}}}{\partial x_{1}^{k_{1}}\cdots \partial x_{d}^{k_{d}}}, \quad \text{where} \quad k_{.}: = \sum\limits_{i = 1}^{d}k_{i}.

    Then, for a function f:\mathcal{X}\rightarrow \mathbb{R} , let

\|f\|_{\alpha}: = \max\limits_{k_{.}\leq \lfloor \alpha \rfloor}\sup\limits_{x}\left|D^{k}f(x)\right|+\max\limits_{k_{.} = \lfloor \alpha \rfloor}\sup\limits_{x, y}\frac{\left|D^{k}f(x)-D^{k}f(y)\right|}{\|x-y\|^{\alpha-\lfloor \alpha \rfloor}},

    where the suprema are taken over all x, y in the interior of \mathcal{X} with x \neq y . Let C_{M}^{\alpha}(\mathcal{X}) be the set of all continuous functions f: \mathcal{X}\rightarrow \mathbb{R} with

    \|f\|_{\alpha}\leq M.

Note that for \alpha \leq 1 this class consists of bounded functions f that satisfy a Lipschitz condition. [112] computed the entropy of the classes C_{M}^{\alpha}(\mathcal{X}) for the uniform norm. As a consequence of their results, [177] showed that there exists a constant K depending only on \alpha, d , and the diameter of \mathcal{X} such that, for every measure \gamma and every \epsilon > 0 ,

    \log \mathcal{N}_{[\; ]}(\epsilon M\gamma(\mathcal{X}), C_{M}^{\alpha}(\mathcal{X}), L_{2}(\gamma) )\leq K\left(\frac{1}{\epsilon}\right)^{d/\alpha},

where {\mathcal N}_{[\; ]} is the bracketing number; refer to Definition 2.1.6 of [178], and to Theorem 2.7.1 of [178] for a variant of the last inequality. By Lemma 9.18 of [114], we have

    \log \mathcal{N}(\epsilon M\gamma(\mathcal{X}), C_{M}^{\alpha}(\mathcal{X}), L_{2}(\gamma) )\leq K\left(\frac{1}{2\epsilon}\right)^{d/\alpha}.

    In this section, we present some classical U -kernels.

    Example A.12. [99] introduced the parameter

    \triangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} D^{2}(y_1, y_2) d F(y_1, y_2),

    where D(y_1, y_2) = F(y_1, y_2)-F(y_1, \infty) F(\infty, y_2) and F(\cdot, \cdot) is the distribution function of Y_1 and Y_2 . The parameter \triangle has the property that \triangle = 0 if and only if Y_1 and Y_2 are independent. From [117], an alternative expression for \triangle can be developed by introducing the functions

    \psi\left(y_{1}, y_{2}, y_{3}\right) = \left\{\begin{array}{rcl} 1 & if& y_{2} \leq y_{1} < y_{3}, \\ 0 & if & y_{1} < y_{2}, y_{3} \text { or } y_{1} \geq y_{2}, y_{3}, \\ -1 & if& y_{3} \leq y_{1} < y_{2}, \end{array}\right.

    and

h\left(y_{1, 1}, y_{1, 2} , \ldots , y_{5, 1}, y_{5, 2}\right) = \frac{1}{4} \psi\left(y_{1, 1}, y_{2, 1}, y_{3, 1}\right) \psi\left(y_{1, 1}, y_{4, 1}, y_{5, 1}\right) \psi\left(y_{1, 2}, y_{2, 2}, y_{3, 2}\right) \psi\left(y_{1, 2}, y_{4, 2}, y_{5, 2}\right).

    We have

\triangle = \int \cdots \int h\left(y_{1, 1}, y_{1, 2} , \ldots , y_{5, 1}, y_{5, 2}\right)d F\left(y_{1, 1}, y_{1, 2}\right) \cdots d F\left(y_{5, 1}, y_{5, 2}\right).

    Example A.13. (Hoeffding's D ). From the symmetric kernel,

    \begin{aligned} h_{D} &\left(z_{1}, \ldots, z_{5}\right) \\ : = & \frac{1}{16} \sum\limits_{\left(i_{1}, \ldots, i_{5}\right) \in \mathcal{P}_{5}}\left[\left\{ \mathbb{1}\left(z_{i_{1}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1} \leq z_{i_{5}, 1}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{4}, 1} \leq z_{i_{5}, 1}\right)\right\}\right] \\ & \times\left[\left\{ \mathbb{1}\left(z_{i_{1}, 2} \leq z_{i_{5}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2} \leq z_{i_{5}, 2}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 2} \leq z_{i_{5}, 2}\right)- \mathbb{1}\left(z_{i_{4}, 2} \leq z_{i_{5}, 2}\right)\right\}\right], \end{aligned}

we recover Hoeffding's D statistic, a rank-based U -statistic of order 5, which gives rise to Hoeffding's D correlation measure \mathbb{E} h_{D} .
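To make the combinatorics concrete, the following Python sketch evaluates h_D verbatim from the display above (reading \mathcal{P}_5 as the set of all permutations of \{1, \ldots, 5\} ) and averages it over all 5-element subsets of a small sample. The sample size, seed, and Gaussian data are hypothetical choices for illustration only, and the same brute-force pattern applies to the kernels h_R and h_{\tau^*} in the next two examples.

```python
import numpy as np
from itertools import combinations, permutations

# Brute-force evaluation of the kernel h_D as displayed, and of the order-5
# U-statistic it induces.  Purely illustrative: the O(n^5) cost forces n tiny.
def h_D(z):  # z: array of shape (5, 2), rows z_1, ..., z_5
    total = 0.0
    for i1, i2, i3, i4, i5 in permutations(range(5)):
        a = int(z[i1, 0] <= z[i5, 0]) - int(z[i2, 0] <= z[i5, 0])
        b = int(z[i3, 0] <= z[i5, 0]) - int(z[i4, 0] <= z[i5, 0])
        c = int(z[i1, 1] <= z[i5, 1]) - int(z[i2, 1] <= z[i5, 1])
        d = int(z[i3, 1] <= z[i5, 1]) - int(z[i4, 1] <= z[i5, 1])
        total += a * b * c * d
    return total / 16.0

rng = np.random.default_rng(0)
Z = rng.standard_normal((12, 2))                     # independent coordinates
U = np.mean([h_D(Z[list(s)]) for s in combinations(range(12), 5)])
print(U)                                             # close to 0 under independence
```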

    Example A.14. (Blum-Kiefer-Rosenblatt's R ). The symmetric kernel

    \begin{aligned} h_{R}&\left(z_{1}, \ldots, z_{6}\right) & \\ : = & \frac{1}{32} \sum\limits_{\left(i_{1}, \ldots, i_{6}\right) \in \mathcal{P}_{6}}\left[\left\{ \mathbb{1}\left(z_{i_{1}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1} \leq z_{i_{5}, 1}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{4}, 1} \leq z_{i_{5}, 1}\right)\right\}\right] \\ & \times\left[\left\{ \mathbb{1}\left(z_{i_{1}, 2} \leq z_{i_{6}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2} \leq z_{i_{6}, 2}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 2} \leq z_{i_{6}, 2}\right)- \mathbb{1}\left(z_{i_{4}, 2} \leq z_{i_{6}, 2}\right)\right\}\right] \end{aligned}

yields Blum-Kiefer-Rosenblatt's R statistic ([24]).

    Example A.15. (Bergsma-Dassios-Yanagimoto's \tau^{*} ). [21] introduced a rank correlation statistic as a U -statistic of order 4 with the symmetric kernel

    \begin{aligned} h_{\tau^{*}}\left(z_{1}\right.&\left., \ldots, z_{4}\right) \\ : = & \frac{1}{16} \sum\limits_{\left(i_{1}, \ldots, i_{4}\right) \in \mathcal{P}_{4}}\left\{ \mathbb{1}\left(z_{i_{1}, 1}, z_{i_{3}, 1} < z_{i_{2}, 1}, z_{i_{4}, 1}\right)+ \mathbb{1}\left(z_{i_{2}, 1}, z_{i_{4}, 1} < z_{i_{1}, 1}, z_{i_{3}, 1}\right)\right.\\ &\left.- \mathbb{1}\left(z_{i_{1}, 1}, z_{i_{4}, 1} < z_{i_{2}, 1}, z_{i_{3}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1}, z_{i_{3}, 1} < z_{i_{1}, 1}, z_{i_{4}, 1}\right)\right\} \\ & \times\left\{ \mathbb{1}\left(z_{i_{1}, 2}, z_{i_{3}, 2} < z_{i_{2}, 2}, z_{i_{4}, 2}\right)+ \mathbb{1}\left(z_{i_{2}, 2}, z_{i_{4}, 2} < z_{i_{1}, 2}, z_{i_{3}, 2}\right)\right.\\ &\left.- \mathbb{1}\left(z_{i_{1}, 2}, z_{i_{4}, 2} < z_{i_{2}, 2}, z_{i_{3}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2}, z_{i_{3}, 2} < z_{i_{1}, 2}, z_{i_{4}, 2}\right)\right\} \end{aligned}

    Here, \mathbb{1}\left(y_{1}, y_{2} < y_{3}, y_{4}\right): = \mathbb{1}\left(y_{1} < y_{3}\right) \mathbb{1}\left(y_{1} < y_{4}\right) \mathbb{1}\left(y_{2} < y_{3}\right) \mathbb{1}\left(y_{2} < y_{4}\right).

    Example A.16. The Wilcoxon Statistic. Suppose that E \subset \mathbb{R} is symmetric around zero. As an estimate of the quantity

    \int_{(x, y) \in E^2}\left\{2 \mathbb{1}_{\{x+y > 0\}}-1\right\} dF(x) dF(y),

    it is pertinent to consider the statistic

    W_n = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n}\left\{2 \cdot \mathbb{1}_{\left\{X_i+X_j > 0\right\}}-1\right\},

which is relevant for testing whether or not the location parameter \mu is at zero.
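A minimal Python sketch of W_n follows; the sample sizes, seed, and Gaussian data are hypothetical choices made only for this illustration.

```python
import numpy as np

# The Wilcoxon-type U-statistic W_n displayed above: the average of
# 2 * 1{X_i + X_j > 0} - 1 over the n(n-1)/2 unordered pairs.
def wilcoxon_U(x):
    s = x[:, None] + x[None, :]        # all pairwise sums X_i + X_j
    iu = np.triu_indices(len(x), k=1)  # keep only pairs with i < j
    return np.mean(2.0 * (s[iu] > 0) - 1.0)

rng = np.random.default_rng(1)
print(wilcoxon_U(rng.standard_normal(500)))         # ~ 0: location at zero
print(wilcoxon_U(rng.standard_normal(500) + 0.5))   # markedly positive: shifted
```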

    Example A.17. The Takens estimator. Denote by \|\cdot\| the usual Euclidean norm on \mathbb{R}^d . In [29], the following estimate of the correlation integral,

    C_F(r) = \int \mathbb{I}_{\left\{\left\|x-x^{\prime}\right\| \leq r\right\}} dF(x) dF\left(x^{\prime}\right), \quad r > 0,

    is considered:

    C_n(r) = \frac{1}{n(n-1)} \sum\limits_{1 \leq i \neq j \leq n} \mathbb{I}_{\left\{\left\|X_i-X_j\right\| \leq r\right\}}.

In the case where a scaling law holds for the correlation integral, i.e., when there exists \left(\alpha, r_0, c\right) \in \mathbb{R}_{+}^{* 3} such that C_F(r) = c \cdot r^{\alpha} for 0 < r \leq r_0 , the U -statistic

    T_n = \frac{1}{n(n-1)} \sum\limits_{1 \leq i \neq j \leq n} \log \left(\frac{\left\|X_i-X_j\right\|}{r_0}\right),

is used to build the Takens estimator \hat{\alpha}_n = -T_n^{-1} of the correlation dimension \alpha .
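The following Python sketch illustrates the Takens estimator on simulated data. Two caveats are flagged explicitly: the data (uniform points in the unit square, whose correlation dimension is \alpha = 2 ) and the cutoff r_0 are hypothetical choices, and, since the scaling law is assumed only for 0 < r \leq r_0 , the average defining T_n is restricted to the pairs with \Vert X_i - X_j\Vert \leq r_0 , as is customary in practice.

```python
import numpy as np

# Takens-type estimate of the correlation dimension (illustrative sketch).
rng = np.random.default_rng(2)
X = rng.uniform(size=(1000, 2))                  # alpha = 2 for the unit square
d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
iu = np.triu_indices(len(X), k=1)
dist = d[iu]

r0 = 0.1                                         # small cutoff: scaling-law regime
T = np.mean(np.log(dist[dist <= r0] / r0))       # T_n restricted to close pairs
print(-1.0 / T)                                  # alpha_hat ~ 2, up to boundary effects
```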

Example A.18. Let \widehat{{Y}_1{Y}_2} denote the oriented angle between {Y}_1, {Y}_2 \in T , where T is the circle of radius 1 centered at 0 in \mathbb{R}^{2} . Let:

    h_{t}({Y}_1, {Y}_2) = \mathbb{1}\{\widehat{{Y}_1{Y}_2}\leq {t}\} -{ t}/\pi, \; \; \mbox{for}\; \; {t} \in[0, \pi).

Reference [159] used this kernel to propose a U -process for testing uniformity on the circle.

    Example A.19. For m = 3 , let :

\varphi(Y_1, Y_2, Y_3) = \mathbb{1}\{Y_1-Y_2-Y_3 > 0\}.

    We have

    r^{(3)}(\varphi, {t}_1, {t}_2, {t}_3) = \mathbb{P}(Y_1 > Y_2+Y_3\mid {X}_{1} = {X}_2 = { X}_{3} = { t})

and the corresponding conditional U -statistic can be considered a conditional analog of the Hollander-Proschan test statistic ([101]). It may be used to test the hypothesis that the conditional distribution of Y_1 , given {X}_1 = { t} , is exponential, against the alternative that it is of the new-better-than-used type.

    Example A.20. The Gini mean difference. The Gini index provides another popular measure of dispersion. It corresponds to the case where E \subset \mathbb{R} and h(x, y) = |x-y| :

    G_n = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n}\left|X_i-X_j\right|.
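A minimal Python sketch of G_n follows (the Gaussian data and seed are hypothetical choices; for X, X' independent standard normal, \mathbb{E}|X - X'| = 2/\sqrt{\pi} , which the sketch uses as a reference value).

```python
import numpy as np

# The Gini mean difference G_n displayed above: the average of |X_i - X_j|
# over the n(n-1)/2 unordered pairs.
def gini_mean_difference(x):
    iu = np.triu_indices(len(x), k=1)
    return np.mean(np.abs(x[:, None] - x[None, :])[iu])

rng = np.random.default_rng(3)
x = rng.standard_normal(2000)
print(gini_mean_difference(x), 2.0 / np.sqrt(np.pi))  # both ~ 1.128
```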

Example A.21 ([113]). Let the central moment of any order m = 2, 3, \ldots be given by

    \theta_m(F) = \mathbb E\left(X_1-\mathbb E X_1\right)^m = \int(x-\mathbb E X_1)^m d F(x).

    In this case, the U -statistic has a symmetric kernel

    \begin{aligned} & h\left(x_1, \ldots, x_m\right) = \frac{1}{m !} \sum\left[x_{i_1}^m-\left(\begin{array}{c} m \\ 1 \end{array}\right) x_{i_1}^{m-1} x_{i_2}\right. \\ &\left.\quad+\left(\begin{array}{c} m \\ 2 \end{array}\right) x_{i_1}^{m-2} x_{i_2} x_{i_3}-\cdots+(-1)^{m-1}\left(\left(\begin{array}{c} m \\ m-1 \end{array}\right)-1\right) x_{i_1} x_{i_2} \ldots x_{i_m}\right], \end{aligned}

    where summation is carried out over all permutations \left(i_1, \ldots, i_m\right) of the numbers (1, \ldots, m) . In particular, if m = 3 , then

\begin{aligned} h\left(x_1, x_2, x_3\right) = \frac{1}{3}\left(x_1^3+x_2^3+x_3^3\right) -\frac{1}{2}\left(x_1^2 x_2+x_2^2 x_1+x_1^2 x_3+x_3^2 x_1+x_2^2 x_3+x_3^2 x_2\right)+2 x_1 x_2 x_3 . \end{aligned}

    In the case of m = 2 ,

    \theta_2(F) = \mathbb E\left(X_1-\mathbb E X_1\right)^2 = \int(x-\mathbb E X_1)^2 d F(x).

    For the kernel

h\left(x_1, x_2\right) = \frac{x_1^2+x_2^2-2 x_1 x_2}{2} = \frac{1}{2}\left(x_1-x_2\right)^2 ,

    the corresponding U -statistic is the sample variance

    \begin{aligned} U_n\left(h\right) & = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n} h\left(X_i, X_j\right) \\ & = \frac{1}{n-1}\left(\sum\limits_{i = 1}^n X_i^2-n \left\{\frac{1}{n}\sum\limits_{i = 1}^n X_i\right\}^2\right) = \frac{1}{n-1}\left(\sum\limits_{i = 1}^n X_i^2-n\bar X_n^2\right), \end{aligned}

see also [155].
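This identity is easy to confirm numerically; the following minimal Python sketch (sample size and seed are arbitrary choices for illustration) compares the pairwise U -statistic with the usual unbiased sample variance.

```python
import numpy as np

# Numerical confirmation that the order-2 U-statistic with kernel
# h(x1, x2) = (x1 - x2)^2 / 2 coincides with the unbiased sample variance.
rng = np.random.default_rng(4)
x = rng.standard_normal(300)
iu = np.triu_indices(len(x), k=1)
h_vals = 0.5 * (x[:, None] - x[None, :]) ** 2
print(np.mean(h_vals[iu]), np.var(x, ddof=1))  # equal up to rounding
```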



    [1] J. Abrevaya, W. Jiang, A nonparametric approach to measuring and testing curvature, J. Bus. Econom. Statist., 23 (2005), 1–19. https://doi.org/10.1198/073500104000000316 doi: 10.1198/073500104000000316
    [2] A. Ait-Saïdi, F. Ferraty, R. Kassa, P. Vieu, Cross-validated estimations in the single-functional index model, Statistics, 42 (2008), 475–494. https://doi.org/10.1080/02331880801980377 doi: 10.1080/02331880801980377
[3] I. M. Almanjahie, S. Bouzebda, Z. Chikr Elmezouar, A. Laksaci, The functional k\text{NN} estimator of the conditional expectile: uniform consistency in number of neighbors, Stat. Risk Model., 38 (2022), 47–63. https://doi.org/10.1515/strm-2019-0029 doi: 10.1515/strm-2019-0029
    [4] I. M. Almanjahie, S. Bouzebda, Z. Kaid, A. Laksaci, Nonparametric estimation of expectile regression in functional dependent data, J. Nonparametr. Stat., 34 (2022), 250–281. https://doi.org/10.1080/10485252.2022.2027412 doi: 10.1080/10485252.2022.2027412
    [5] I. M. Almanjahie, S. Bouzebda, Z. Kaid, A. Laksaci, The local linear functional kNN estimator of the conditional expectile: uniform consistency in number of neighbors, Metrika, 34 (2024), 1–29. https://doi.org/10.1007/s00184-023-00942-0 doi: 10.1007/s00184-023-00942-0
    [6] N. T. Andersen, The central limit theorem for non-separable valued functions, Z. Wahrscheinlichkeitstheor. Verw. Geb., 70 (1985), 445–455.
    [7] P. K. Andersen, O. R. Borgan, R. D. Gill, N. Keiding, Statistical Models Based on Counting Processes, New York: Springer, 1993.
    [8] G. Aneiros, R. Cao, R. Fraiman, C. Genest, P. Vieu, Recent advances in functional data analysis and high-dimensional statistics, J. Multivariate Anal., 170 (2019), 3–9. https://doi.org/10.1016/j.jmva.2018.11.007 doi: 10.1016/j.jmva.2018.11.007
    [9] A. Araujo, E. Giné, The Central Limit Theorem for Real and Banach Valued Random Variables, New York: John Wiley & Sons, 1980.
    [10] M. A. Arcones, The law of large numbers for U-statistics under absolute regularity, Electron. Comm. Probab., 3 (1998), 13–19.
    [11] M. A. Arcones, E. Giné, Limit theorems for U-processes, Ann. Probab., 21 (1993), 1494–1542.
    [12] M. A. Arcones, E. Giné, On the law of the iterated logarithm for canonical U-statistics and processes, Stochast. Process. Appl., 58 (1995), 217–245. https://doi.org/10.1016/0304-4149(94)00023-M doi: 10.1016/0304-4149(94)00023-M
    [13] M. A. Arcones, B. Yu, Central limit theorems for empirical and U-processes of stationary mixing sequences, J. Theor. Probab., 7 (1994), 47–71. https://doi.org/10.1007/BF02213360 doi: 10.1007/BF02213360
    [14] M. A. Arcones, Z. Chen, E. Giné, Estimators related to U-processes with applications to multivariate medians: asymptotic normality, Ann. Statist., 22 (1994), 1460–1477.
    [15] S. Attaoui, N. Ling, Asymptotic results of a nonparametric conditional cumulative distribution estimator in the single functional index modeling for time series data with applications, Metrika, 79 (2016), 485–511. https://doi.org/10.1007/s00184-015-0564-6 doi: 10.1007/s00184-015-0564-6
    [16] S. Attaoui, B. Bentat, S. Bouzebda, A. Laksaci, The strong consistency and asymptotic normality of the kernel estimator type in functional single index model in presence of censored data, AIMS Math., 9 (2024), 7340–7371. http://dx.doi.org/10.3934/math.2024356 doi: 10.3934/math.2024356
    [17] A. K. Basu, A. Kundu, Limit distribution for conditional U-statistics for dependent processes, Calcutta Statist. Assoc. Bull., 52 (2002), 381–407. https://doi.org/10.1177/0008068320020522 doi: 10.1177/0008068320020522
    [18] A. Bellet, A. Habrard, Robustness and generalization for metric learning, Neurocomputing, 151 (2015), 259–267. https://doi.org/10.1016/j.neucom.2014.09.044 doi: 10.1016/j.neucom.2014.09.044
    [19] A. Bellet, A. Habrard, M. Sebban, A survey on metric learning for feature vectors and structured data, preprint paper, 2013. https://doi.org/10.48550/arXiv.1306.6709
    [20] K. Benhenni, F. Ferraty, M. Rachdi, P. Vieu, Local smoothing regression with functional data, Comput. Statist., 22 (2007), 353–369. https://doi.org/10.1007/s00180-007-0045-0 doi: 10.1007/s00180-007-0045-0
    [21] W. Bergsma, A. Dassios, A consistent test of independence based on a sign covariance related to Kendall's tau, Bernoulli, 20 (2014), 1006–1028.
    [22] S. Bernstein, Sur l'extension du théoréme limite du calcul des probabilités aux sommes de quantités dépendantes, Math. Ann., 97 (1927), 1–59. https://doi.org/10.1007/BF01447859 doi: 10.1007/BF01447859
    [23] S. Bhattacharjee, H. G. Müller, Single index Fréchet regression, Ann. Statist., 51 (2023), 1770–1798. https://doi.org/10.1214/23-AOS2307 doi: 10.1214/23-AOS2307
    [24] J. R. Blum, J. Kiefer, M. Rosenblatt, Distribution free tests of independence based on the sample distribution function, Ann. Math. Statist., 32 (1961), 485–498.
    [25] V. I. Bogachev, Gaussian measures, In: Mathematical Surveys and Monographs, Providence: American Mathematical Society, 1998.
    [26] E. G. Bongiorno, A. Goia, Classification methods for Hilbert data based on surrogate density, Comput. Statist. Data Anal., 99 (2016), 204–222. https://doi.org/10.1016/j.csda.2016.01.019 doi: 10.1016/j.csda.2016.01.019
    [27] E. G. Bongiorno, A. Goia, Some insights about the small-ball probability factorization for Hilbert random elements, Statist. Sinica, 27 (2017), 1949–1965.
    [28] E. G. Bongiorno, A. Goia, P. Vieu, Evaluating the complexity of some families of functional data, SORT, 42 (2018), 27–44.
    [29] S. Borovkova, R. Burton, H. Dehling, Consistency of the Takens estimator for the correlation dimension, Ann. Appl. Probab., 9 (1999), 376–390.
    [30] S. Borovkova, R. Burton, H. Dehling, Limit theorems for functionals of mixing processes with applications to U-statistics and dimension estimation, Trans. Amer. Math. Soc., 353 (2001), 4261–4318.
    [31] Y. V. Borovskikh, U-Statistics in Banach Spaces, Utrecht: VSP, 1996.
    [32] D. Bosq, Linear processes in function spaces, In: Lecture Notes in Statistics, New York: Springer, 2000.
    [33] S. Bouzebda, General tests of conditional independence based on empirical processes indexed by functions, Jpn. J. Stat. Data Sci., 6 (2023), 115–177. https://doi.org/10.1007/s42081-023-00193-3 doi: 10.1007/s42081-023-00193-3
    [34] S. Bouzebda, On the weak convergence and the uniform-in-bandwidth consistency of the general conditional U-processes based on the copula representation: multivariate setting, Hacet. J. Math. Stat., 52 (2023), 1303–1348.
    [35] S. Bouzebda, M. Cherfi, General bootstrap for dual \phi-divergence estimates, J. Probab. Stat., 2012 (2012), 834107. https://doi.org/10.1155/2012/834107 doi: 10.1155/2012/834107
    [36] S. Bouzebda, S. Didi, Multivariate wavelet density and regression estimators for stationary and ergodic discrete time processes: asymptotic results, Comm. Statist. Theory Methods, 46 (2017), 1367–1406. https://doi.org/10.1080/03610926.2015.1019144 doi: 10.1080/03610926.2015.1019144
    [37] S. Bouzebda, S. Didi, Some asymptotic properties of kernel regression estimators of the mode for stationary and ergodic continuous time processes, Rev. Mat. Complut., 34 (2021), 811–852. https://doi.org/10.1007/s13163-020-00368-6 doi: 10.1007/s13163-020-00368-6
    [38] S. Bouzebda, A. Keziou, A semiparametric maximum likelihood ratio test for the change point in copula models, Stat. Methodol., 14 (2013), 39–61. https://doi.org/10.1016/j.stamet.2013.02.003 doi: 10.1016/j.stamet.2013.02.003
    [39] S. Bouzebda, B. Nemouchi, Central limit theorems for conditional empirical and conditional U-processes of stationary mixing sequences, Math. Meth. Stat., 28 (2019), 169–207. https://doi.org/10.3103/S1066530719030013 doi: 10.3103/S1066530719030013
    [40] S. Bouzebda, B. Nemouchi, Weak-convergence of empirical conditional processes and conditional U-processes involving functional mixing data, Stat. Inference Stoch. Process., 26 (2023), 33–88. https://doi.org/10.1007/s11203-022-09276-6 doi: 10.1007/s11203-022-09276-6
    [41] S. Bouzebda, A. Nezzal, Uniform in number of neighbors consistency and weak convergence of kNN empirical conditional processes and kNN conditional U-processes involving functional mixing data, AIMS Math., 9 (2024), 4427–4550. https://doi.org/10.3934/math.2024218 doi: 10.3934/math.2024218
[42] S. Bouzebda, I. Soukarieh, Non-parametric conditional U-processes for locally stationary functional random fields under stochastic sampling design, Mathematics, 11 (2023), 16. https://doi.org/10.3390/math11010016 doi: 10.3390/math11010016
    [43] S. Bouzebda, I. Soukarieh, Limit theorems for a class of processes generalizing the U-empirical process, Stochastics, 96 (2024), 799–845. https://doi.org/10.1080/17442508.2024.2320402 doi: 10.1080/17442508.2024.2320402
    [44] S. Bouzebda, N. Taachouche, On the variable bandwidth kernel estimation of conditional U-statistics at optimal rates in sup-norm, Phys. A Stat. Mechan. Appl., 625 (2023), 129000. https://doi.org/10.1016/j.physa.2023.129000 doi: 10.1016/j.physa.2023.129000
    [45] S. Bouzebda, N. Taachouche, Rates of the strong uniform consistency for the kernel-type regression function estimators with general kernels on manifolds, Math. Meth. Stat., 32 (2023), 27–80. https://doi.org/10.3103/S1066530723010027 doi: 10.3103/S1066530723010027
    [46] S. Bouzebda, N. Taachouche, Rates of the strong uniform consistency with rates for conditional U-statistics estimators with general kernels on manifolds, Math. Meth. Stat., in press, 2023.
    [47] S. Bouzebda, S. Didi, L. El Hajj, Multivariate wavelet density and regression estimators for stationary and ergodic continuous time processes: asymptotic results, Math. Meth. Stat., 24 (2015), 163–199. https://doi.org/10.3103/S1066530715030011 doi: 10.3103/S1066530715030011
    [48] S. Bouzebda, I. Elhattab, B. Nemouchi, On the uniform-in-bandwidth consistency of the general conditional U-statistics based on the copula representation, J. Nonparametr. Stat., 33 (2021), 321–358. https://doi.org/10.1080/10485252.2021.1937621 doi: 10.1080/10485252.2021.1937621
    [49] S. Bouzebda, A. Laksaci, M. Mohammedi, The k-nearest neighbors method in single index regression model for functional quasi-associated time series data, Rev. Mat. Complut., 36 (2023), 361–391. https://doi.org/10.1007/s13163-022-00436-z doi: 10.1007/s13163-022-00436-z
    [50] Q. Cao, Z. C. Guo, Y. Ying, Generalization bounds for metric and similarity learning, Mach. Learn., 102 (2016), 115–132. https://doi.org/10.1007/s10994-015-5499-7 doi: 10.1007/s10994-015-5499-7
    [51] A. Carbonez, L. Györfi, E. C. van der Meulen, Partitioning-estimates of a regression function under random censoring, Stat. Risk Model., 13 (1995), 21–37. https://doi.org/10.1524/strm.1995.13.1.21 doi: 10.1524/strm.1995.13.1.21
    [52] D. Chen, P. Hall, H. G. Müller, Single and multiple index functional regression models with nonparametric link, Ann. Statist., 39 (2011), 1720–1747. https://doi.org/10.1214/11-AOS882 doi: 10.1214/11-AOS882
    [53] Y. Chen, S. Datta, Adjustments of multi-sample U-statistics to right censored data and confounding covariates, Comput. Statist. Data Anal., 135 (2019), 1–14. https://doi.org/10.1016/j.csda.2019.01.012 doi: 10.1016/j.csda.2019.01.012
    [54] S. Clémençon, I. Colin, A. Bellet, Scaling-up empirical risk minimization: optimization of incomplete U-statistics, J. Mach. Learn. Res., 17 (2016), 1–36.
    [55] G. B. Cybis, M. Valk, S. R. C. Lopes, Clustering and classification problems in genetics through U-statistics, J. Stat. Comput. Simul., 88 (2018), 1882–1902. https://doi.org/10.1080/00949655.2017.1374387 doi: 10.1080/00949655.2017.1374387
    [56] R. Dahlhaus, On the Kullback-Leibler information divergence of locally stationary processes, Stochastic Process. Appl., 62 (1996), 139–168. https://doi.org/10.1016/0304-4149(95)00090-9 doi: 10.1016/0304-4149(95)00090-9
    [57] R. Dahlhaus, Fitting time series models to non-stationary processes, Ann. Statist., 25 (1997), 1–37. https://doi.org/10.1214/aos/1034276620 doi: 10.1214/aos/1034276620
    [58] R. Dahlhaus, W. Polonik, Nonparametric quasi-maximum likelihood estimation for Gaussian locally stationary processes, Ann. Statist., 34 (2006), 2790–2824. https://doi.org/10.1214/009053606000000867 doi: 10.1214/009053606000000867
    [59] R. Dahlhaus, W. Polonik, Empirical spectral processes for locally stationary time series, Bernoulli, 15 (2009), 1–39. https://doi.org/10.3150/08-BEJ137 doi: 10.3150/08-BEJ137
    [60] S. Datta, D. Bandyopadhyay, G. A. Satten, Inverse probability of censoring weighted U-statistics for right-censored data with an application to testing hypotheses, Scand. J. Stat., 37 (2010), 680–700. https://doi.org/10.1111/j.1467-9469.2010.00697.x doi: 10.1111/j.1467-9469.2010.00697.x
    [61] J. A. Davydov, Convergence of distributions generated by stationary stochastic processes, Theory Probab. Appl., 13 (1968), 691–696. https://doi.org/10.1137/1113086 doi: 10.1137/1113086
    [62] J. A. Davydov, Mixing conditions for Markov chains, Theory Probab. Appl., 18 (1973), 312–328. https://doi.org/10.1137/1118033 doi: 10.1137/1118033
    [63] V. H. de la Peña, Decoupling and Khintchine's inequalities for U-statistics, Ann. Probab., 20 (1992), 1877–1892.
    [64] V. H. de la Peña, E. Giné, Decoupling, In: Probability and its Applications, New York: Springer, 1999.
    [65] P. Deheuvels, One bootstrap suffices to generate sharp uniform bounds in functional estimation, Kybernetika, 47 (2011), 855–865.
    [66] M. Denker, G. Keller, On U-statistics and v. Mises' statistics for weakly dependent processes, Z. Wahrsch. Verw. Gebiete, 64 (1983), 505–522. https://doi.org/10.1007/BF00534953 doi: 10.1007/BF00534953
    [67] L. Devroye, G. Lugosi, Combinatorial Methods in Density Estimation, New York: Springer, 2001.
    [68] S. Didi, S. Bouzebda, Wavelet density and regression estimators for continuous time functional stationary and ergodic processes, Mathematics, 10 (2022), 4356. https://doi.org/10.3390/math10224356 doi: 10.3390/math10224356
    [69] S. Didi, A. Al Harby, S. Bouzebda, Wavelet density and regression estimators for functional stationary and ergodic data: discrete time, Mathematics, 10 (2022), 3433. https://doi.org/10.3390/math10193433 doi: 10.3390/math10193433
    [70] R. M. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gaussian processes, J. Funct. Anal., 1 (1967), 290–330.
    [71] R. M. Dudley, An extended Wichura theorem, definitions of Donsker class, and weighted empirical distributions, In: Probability in Banach Spaces V. Lecture Notes in Mathematics, Springer, 1153 (1985), 141–178. https://doi.org/10.1007/BFb0074949
    [72] R. M. Dudley, Uniform central limit theorems, In: Cambridge Studies in Advanced Mathematics, 2 Eds., New York: Cambridge University Press, 2014.
    [73] E. Eberlein, Weak convergence of partial sums of absolutely regular sequences, Statist. Probab. Lett., 2 (1984), 291–293. https://doi.org/10.1016/0167-7152(84)90067-1 doi: 10.1016/0167-7152(84)90067-1
    [74] P. P. B. Eggermont, V. N. LaRiccia, Maximum Penalized Likelihood Estimation, New York: Springer, 2001.
    [75] L. Faivishevsky, J. Goldberger, ICA based on a smooth estimation of the differential entropy, In: Advances in Neural Information Processing Systems, Inc: Curran Associates, 2008.
    [76] S. Feng, P. Tian, Y. Hu, G. Li, Estimation in functional single-index varying coefficient model, J. Statist. Plann. Inference, 214 (2021), 62–75. https://doi.org/10.1016/j.jspi.2021.01.003 doi: 10.1016/j.jspi.2021.01.003
    [77] F. Ferraty, P. Vieu, Nonparametric models for functional data, with application in regression, time-series prediction and curve discrimination, In: The International Conference on Recent Trends and Directions in Nonparametric Statistics, J. Nonparametr. Stat., 16 (2004), 111–125. https://doi.org/10.1080/10485250310001622686 doi: 10.1080/10485250310001622686
    [78] F. Ferraty, P. Vieu, Nonparametric Functional Data Analysis, New York: Springer, 2006.
    [79] F. Ferraty, A. Peuch, P. Vieu, Modèle à indice fonctionnel simple, C. R. Math. Acad. Sci. Paris, 336 (2003), 1025–1028. https://doi.org/10.1016/S1631-073X(03)00239-5 doi: 10.1016/S1631-073X(03)00239-5
    [80] F. Ferraty, A. Laksaci, P. Vieu, Estimating some characteristics of the conditional distribution in nonparametric functional models, Stat. Infer. Stoch. Process., 9 (2006), 47–76. https://doi.org/10.1007/s11203-004-3561-3 doi: 10.1007/s11203-004-3561-3
    [81] F. Ferraty, A. Mas, P. Vieu, Nonparametric regression on functional data: inference and practical aspects, Aust. N.Z. J. Stat., 49 (2007), 267–286. https://doi.org/10.1111/j.1467-842X.2007.00480.x doi: 10.1111/j.1467-842X.2007.00480.x
    [82] F. Ferraty, A. Laksaci, A. Tadj, P. Vieu, Rate of uniform consistency for nonparametric estimates with functional variables, J. Statist. Plann. Inference, 140 (2010), 335–352. https://doi.org/10.1016/j.jspi.2009.07.019 doi: 10.1016/j.jspi.2009.07.019
    [83] F. Ferraty, N. Kudraszow, P. Vieu, Nonparametric estimation of a surrogate density function in infinite-dimensional spaces, J. Nonparametr. Stat., 24 (2012), 447–464. https://doi.org/10.1080/10485252.2012.671943 doi: 10.1080/10485252.2012.671943
    [84] A. Földes, L. Rejtő, A LIL type result for the product limit estimator, Z. Wahrsch. Verw. Gebiete, 56 (1981), 75–86. https://doi.org/10.1007/BF00531975 doi: 10.1007/BF00531975
    [85] E. W. Frees, Infinite order U-statistics, Scand. J. Statist., 16 (1989), 29–45.
    [86] K. A. Fu, An application of U-statistics to nonparametric functional data analysis, Commun. Stat. Theory Meth., 41 (2012), 1532–1542. https://doi.org/10.1080/03610926.2010.526747 doi: 10.1080/03610926.2010.526747
    [87] T. Gasser, P. Hall, B. Presnell, Nonparametric estimation of the mode of a distribution of random curves, J. R. Stat. Soc. Ser. B Stat. Methodol., 60 (1998), 681–691. https://doi.org/10.1111/1467-9868.00148 doi: 10.1111/1467-9868.00148
    [88] S. Ghosal, A. Sen, A. W. van der Vaart, Testing monotonicity of regression, Ann. Statist., 28 (2000), 1054–1082.
    [89] E. Giné, J. Zinn, Some limit theorems for empirical processes, Ann. Probab., 12 (1984), 929–998.
    [90] A. Goia, P. Vieu, An introduction to recent advances in high/infinite dimensional statistics, J. Multivar. Anal., 146 (2016), 1–6. https://doi.org/10.1016/j.jmva.2015.12.001 doi: 10.1016/j.jmva.2015.12.001
    [91] L. Gu, L. Yang, Oracally efficient estimation for single-index link function with simultaneous confidence band, Electron. J. Stat., 9 (2015), 1540–1561. https://doi.org/10.1214/15-EJS1051 doi: 10.1214/15-EJS1051
    [92] P. Hall, Asymptotic properties of integrated square error and cross-validation for kernel estimation of a regression function, Z. Wahrsch. Verw. Gebiete, 67 (1984), 175–196. https://doi.org/10.1007/BF00535267 doi: 10.1007/BF00535267
    [93] P. R. Halmos, The theory of unbiased estimation, Ann. Math. Statist., 17 (1946), 34–43. https://doi.org/10.1214/aoms/1177731020
    [94] F. Han, T. Qian, On inference validity of weighted U-statistics under data heterogeneity, Electron. J. Statist., 12 (2018), 2637–2708. https://doi.org/10.1214/18-EJS1462 doi: 10.1214/18-EJS1462
    [95] W. Härdle, Applied nonparametric regression, In: Econometric Society Monographs, Cambridge: Cambridge University Press, 1990.
    [96] W. Härdle, J. S. Marron, Optimal bandwidth selection in nonparametric regression function estimation, Ann. Statist., 13 (1985), 1465–1481.
    [97] M. Harel, M. L. Puri, Conditional U-statistics for dependent random variables, J. Multivar. Anal., 57 (1996), 84–100. https://doi.org/10.1006/jmva.1996.0023 doi: 10.1006/jmva.1996.0023
    [98] C. Heilig, D. Nolan, Limit theorems for the infinite-degree U-process, Statist. Sinica, 11 (2001), 289–302.
    [99] W. Hoeffding, A class of statistics with asymptotically normal distribution, Ann. Math. Stat., 19 (1948), 293–325.
    [100] J. Hoffmann-Jørgensen, Stochastic processes on Polish spaces, In: Various Publications Series, Aarhus: Aarhus Universitet, Matematisk Institut, 1991.
    [101] M. Hollander, F. Proschan, Testing whether new is better than used, Ann. Math. Statist., 43 (1972), 1136–1146. https://doi.org/10.1214/aoms/1177692466 doi: 10.1214/aoms/1177692466
    [102] L. Horváth, P. Kokoszka, Inference for Functional Data with Applications, New York: Springer, 2012.
    [103] I. A. Ibragimov, V. N. Solev, A certain condition for the regularity of Gaussian stationary sequence, Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI), 12 (1969), 113–125.
    [104] S. Jadhav, S. Ma, An association test for functional data based on Kendall's tau, J. Multivar. Anal., 184 (2021), 104740. https://doi.org/10.1016/j.jmva.2021.104740 doi: 10.1016/j.jmva.2021.104740
    [105] C. Jentsch, S. Subba Rao, A test for second order stationarity of a multivariate time series, J. Econometrics, 185 (2015), 124–161. https://doi.org/10.1016/j.jeconom.2014.09.010 doi: 10.1016/j.jeconom.2014.09.010
    [106] Z. Jiang, Z. Huang, J. Zhang, Functional single-index composite quantile regression, Metrika, 86 (2023), 595–603. https://doi.org/10.1007/s00184-022-00887-w doi: 10.1007/s00184-022-00887-w
    [107] R. Jin, S. Wang, Y. Zhou, Regularized distance metric learning: theory and algorithm, In: Advances in Neural Information Processing Systems, Curran Associates, Inc., 2009.
    [108] E. L. Kaplan, P. Meier, Nonparametric estimation from incomplete observations, J. Amer. Statist. Assoc., 53 (1958), 457–481.
    [109] L. Kara-Zaitri, A. Laksaci, M. Rachdi, P. Vieu, Uniform in bandwidth consistency for various kernel estimators involving functional data, J. Nonparametr. Stat., 29 (2017), 85–107. https://doi.org/10.1080/10485252.2016.1254780 doi: 10.1080/10485252.2016.1254780
    [110] M. G. Kendall, A new measure of rank correlation, Biometrika, 30 (1938), 81–93.
    [111] M. Kohler, K. Máthé, M. Pintér, Prediction from randomly right censored data, J. Multivar. Anal., 80 (2002), 73–100. https://doi.org/10.1006/jmva.2000.1973 doi: 10.1006/jmva.2000.1973
    [112] A. N. Kolmogorov, V. M. Tihomirov, $\varepsilon$-entropy and $\varepsilon$-capacity of sets in function spaces, Uspehi Mat. Nauk, 14 (1959), 3–86.
    [113] V. S. Koroljuk, Y. V. Borovskich, Theory of U-statistics, In: Mathematics and its Applications, Dordrecht: Kluwer Academic Publishers Group, 1994.
    [114] M. R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference, New York: Springer, 2008.
    [115] J. P. Kreiss, E. Paparoditis, Bootstrapping locally stationary processes, J. R. Stat. Soc. Ser. B. Stat. Methodol., 77 (2015), 267–290. https://doi.org/10.1111/rssb.12068 doi: 10.1111/rssb.12068
    [116] D. Kurisu, Nonparametric regression for locally stationary functional time series, Electron. J. Statist., 16 (2022), 3973–3995. https://doi.org/10.1214/22-EJS2041 doi: 10.1214/22-EJS2041
    [117] A. J. Lee, U-statistics, In: Statistics: Textbooks and Monographs, New York: Marcel Dekker, 1990.
    [118] S. Lee, O. Linton, Y. J. Whang, Testing for stochastic monotonicity, Econometrica, 77 (2009), 585–602. https://doi.org/10.3982/ECTA7145 doi: 10.3982/ECTA7145
    [119] A. Leucht, Degenerate U- and V-statistics under weak dependence: asymptotic theory and bootstrap consistency, Bernoulli, 18 (2012), 552–585. https://doi.org/10.3150/11-BEJ354 doi: 10.3150/11-BEJ354
    [120] A. Leucht, M. H. Neumann, Degenerate U- and V-statistics under ergodicity: asymptotics, bootstrap and applications in statistics, Ann. Inst. Stat. Math., 65 (2013), 349–386. https://doi.org/10.1007/s10463-012-0374-9 doi: 10.1007/s10463-012-0374-9
    [121] J. Li, C. Huang, H. Zhu, A functional varying-coefficient single-index model for functional response data, J. Amer. Stat. Assoc., 112 (2017), 1169–1181. https://doi.org/10.1080/01621459.2016.1195742 doi: 10.1080/01621459.2016.1195742
    [122] W. V. Li, Q. M. Shao, Gaussian processes: inequalities, small-ball probabilities and applications, Handbook Stat., 19 (2001), 533–597. https://doi.org/10.1016/S0169-7161(01)19019-X doi: 10.1016/S0169-7161(01)19019-X
    [123] H. Liang, X. Liu, R. Li, C. L. Tsai, Estimation and testing for partially linear single-index models, Ann. Stat., 38 (2010), 3811–3836. https://doi.org/10.1214/10-AOS835 doi: 10.1214/10-AOS835
    [124] E. Liebscher, Strong convergence of sums of $\alpha$-mixing random variables with applications to density estimation, Stochast. Process. Appl., 65 (1996), 69–80. https://doi.org/10.1016/S0304-4149(96)00096-8 doi: 10.1016/S0304-4149(96)00096-8
    [125] F. Lim, V. M. Stojanovic, On U-statistics and compressed sensing I: non-asymptotic average-case analysis, IEEE T. Signal Process., 61 (2013), 2473–2485. https://doi.org/10.1109/TSP.2013.2247598 doi: 10.1109/TSP.2013.2247598
    [126] N. Ling, P. Vieu, Nonparametric modelling for functional data: selected survey and tracks for future, Statistics, 52 (2018), 934–949. https://doi.org/10.1080/02331888.2018.1487120 doi: 10.1080/02331888.2018.1487120
    [127] N. Ling, L. Cheng, P. Vieu, Single functional index model under responses MAR and dependent observations, In: Functional and High-Dimensional Statistics and Related Fields. IWFOS 2020. Contributions to Statistics, Cham: Springer, 2020.
    [128] N. Ling, L. Cheng, P. Vieu, H. Ding, Missing responses at random in functional single index model for time series data, Stat. Papers, 63 (2022), 665–692. https://doi.org/10.1007/s00362-021-01251-2 doi: 10.1007/s00362-021-01251-2
    [129] Q. Liu, J. Lee, M. Jordan, A kernelized stein discrepancy for goodness-of-fit tests, In: Proceedings of The 33rd International Conference on Machine Learning, PMLR, 48 (2016), 276–284.
    [130] B. Maillot, V. Viallon, Uniform limit laws of the logarithm for nonparametric estimators of the regression function in presence of censored data, Math. Meth. Stat., 18 (2009), 159–184. https://doi.org/10.3103/S1066530709020045 doi: 10.3103/S1066530709020045
    [131] T. Masak, S. Sarkar, V. M. Panaretos, Principal separable component analysis via the partial inner product, preprint paper, 2020.
    [132] D. M. Mason, Proving consistency of non-standard kernel estimators, Stat. Inference Stoch. Process., 15 (2012), 151–176. https://doi.org/10.1007/s11203-012-9068-4 doi: 10.1007/s11203-012-9068-4
    [133] E. Masry, Nonparametric regression estimation for dependent functional data: asymptotic normality, Stochast. Process. Appl., 115 (2005), 155–177. https://doi.org/10.1016/j.spa.2004.07.006 doi: 10.1016/j.spa.2004.07.006
    [134] U. Mayer, H. Zähle, Z. Zhou, Functional weak limit theorem for a local empirical process of non-stationary time series and its application, Bernoulli, 26 (2020), 1891–1911. https://doi.org/10.3150/19-BEJ1174 doi: 10.3150/19-BEJ1174
    [135] E. Mayer-Wolf, O. Zeitouni, The probability of small Gaussian ellipsoids and associated conditional moments, Ann. Probab., 21 (1993), 14–24.
    [136] M. Mohammedi, S. Bouzebda, A. Laksaci, The consistency and asymptotic normality of the kernel type expectile regression estimator for functional data, J. Multivar. Anal., 181 (2021), 104673. https://doi.org/10.1016/j.jmva.2020.104673 doi: 10.1016/j.jmva.2020.104673
    [137] M. Mohammedi, S. Bouzebda, A. Laksaci, O. Bouanani, Asymptotic normality of the k-NN single index regression estimator for functional weak dependence data, Commun. Stat. Theory Meth., 53 (2024), 3143–3168. https://doi.org/10.1080/03610926.2022.2150823 doi: 10.1080/03610926.2022.2150823
    [138] J. S. Morris, Functional regression, Annu. Rev. Stat. Appl., 2 (2015), 321–359. https://doi.org/10.1146/annurev-statistics-010814-020413
    [139] E. A. Nadaraya, On a regression estimate, Teor. Verojatnost. Primenen., 9 (1964), 157–159.
    [140] E. A. Nadaraya, Nonparametric estimation of probability densities and regression curves, In: Mathematics and its Applications (Soviet Series), Dordrecht: Kluwer Academic Publishers Group, 1989.
    [141] G. P. Nason, R. von Sachs, G. Kroisandt, Wavelet processes and adaptive estimation of the evolutionary wavelet spectrum, J. R. Stat. Soc. Ser. B Stat. Methodol., 62 (2000), 271–292. https://doi.org/10.1111/1467-9868.00231 doi: 10.1111/1467-9868.00231
    [142] M. H. Neumann, R. von Sachs, Wavelet thresholding in anisotropic function classes and application to adaptive estimation of evolutionary spectra, Ann. Statist., 25 (1997), 38–76. https://doi.org/10.1214/aos/1034276621 doi: 10.1214/aos/1034276621
    [143] Y. Nie, L. Wang, J. Cao, Estimating functional single index models with compact support, Environmetrics, 34 (2023), e2784. https://doi.org/10.1002/env.2784 doi: 10.1002/env.2784
    [144] D. Nolan, D. Pollard, U-processes: rates of convergence, Ann. Statist., 15 (1987), 780–799.
    [145] S. Novo, G. Aneiros, P. Vieu, Automatic and location-adaptive estimation in functional single-index regression, J. Nonparametr. Stat., 31 (2019), 364–392. https://doi.org/10.1080/10485252.2019.1567726 doi: 10.1080/10485252.2019.1567726
    [146] W. Peng, T. Coleman, L. Mentch, Rates of convergence for random forests via generalized U-statistics, Electron. J. Stat., 16 (2022), 232–292. https://doi.org/10.1214/21-EJS1958 doi: 10.1214/21-EJS1958
    [147] N. Phandoidaen, S. Richter, Empirical process theory for locally stationary processes, Bernoulli, 28 (2022), 453–480. https://doi.org/10.3150/21-BEJ1351 doi: 10.3150/21-BEJ1351
    [148] B. L. S. Prakasa Rao, A. Sen, Limit distributions of conditional U-statistics, J. Theoret. Probab., 8 (1995), 261–301. https://doi.org/10.1007/BF02212880 doi: 10.1007/BF02212880
    [149] M. B. Priestley, Evolutionary spectra and non-stationary processes, J. Roy. Statist. Soc. Ser. B, 27 (1965), 204–237. https://doi.org/10.1111/j.2517-6161.1965.tb01488.x doi: 10.1111/j.2517-6161.1965.tb01488.x
    [150] M. Rachdi, P. Vieu, Nonparametric regression for functional data: automatic smoothing parameter selection, J. Statist. Plann. Inference, 137 (2007), 2784–2801. https://doi.org/10.1016/j.jspi.2006.10.001 doi: 10.1016/j.jspi.2006.10.001
    [151] J. O. Ramsay, B. W. Silverman, Applied Functional Data Analysis, New York: Springer, 2002.
    [152] G. Rempala, A. Gupta, Weak limits of U-statistics of infinite order, Random Oper. Stoch. Equ., 7 (1999), 39–52. https://doi.org/10.1515/rose.1999.7.1.39 doi: 10.1515/rose.1999.7.1.39
    [153] K. Sakiyama, M. Taniguchi, Discriminant analysis for locally stationary processes, J. Multivar. Anal., 90 (2004), 282–300. https://doi.org/10.1016/j.jmva.2003.08.002 doi: 10.1016/j.jmva.2003.08.002
    [154] A. Sen, Uniform strong consistency rates for conditional U-statistics, Sankhyā Ind. J. Stat. Ser. A, 56 (1994), 179–194.
    [155] R. J. Serfling, Approximation Theorems of Mathematical Statistics, New York: John Wiley & Sons, 1980.
    [156] H. L. Shang, Bayesian bandwidth estimation for a functional nonparametric regression model with mixed types of regressors and unknown error density, J. Nonparametr. Stat., 26 (2014), 599–615. https://doi.org/10.1080/10485252.2014.916806 doi: 10.1080/10485252.2014.916806
    [157] R. P. Sherman, The limiting distribution of the maximum rank correlation estimator, Econometrica, 61 (1993), 123–137.
    [158] R. P. Sherman, Maximal inequalities for degenerate U-processes with applications to optimization estimators, Ann. Statist., 22 (1994), 439–459. https://doi.org/10.1214/aos/1176325377 doi: 10.1214/aos/1176325377
    [159] B. W. Silverman, Distances on circles, toruses and spheres, J. Appl. Probab., 15 (1978), 136–143. https://doi.org/10.2307/3213243
    [160] B. W. Silverman, Density Estimation for Statistics and Data Analysis, London: Chapman & Hall, 1986.
    [161] R. A. Silverman, Locally stationary random processes, IRE T. Inform. Theory, 3 (1957), 182–187. https://doi.org/10.1109/TIT.1957.1057413 doi: 10.1109/TIT.1957.1057413
    [162] Y. Song, X. Chen, K. Kato, Approximating high-dimensional infinite-order U-statistics: statistical and computational guarantees, Electron. J. Stat., 13 (2019), 4794–4848. https://doi.org/10.1214/19-EJS1643 doi: 10.1214/19-EJS1643
    [163] I. Soukarieh, S. Bouzebda, Exchangeably weighted bootstraps of general Markov U-process, Mathematics, 10 (2022), 3745. https://doi.org/10.3390/math10203745 doi: 10.3390/math10203745
    [164] I. Soukarieh, S. Bouzebda, Renewal type bootstrap for increasing degree U-process of a Markov chain, J. Multivar. Anal., 195 (2023), 105143. https://doi.org/10.1016/j.jmva.2022.105143 doi: 10.1016/j.jmva.2022.105143
    [165] I. Soukarieh, S. Bouzebda, Weak convergence of the conditional U-statistics for locally stationary functional time series, Stat. Inference Stoch. Process., 17 (2024), 227–304. https://doi.org/10.1007/s11203-023-09305-y doi: 10.1007/s11203-023-09305-y
    [166] W. Stute, Conditional U-statistics, Ann. Probab., 19 (1991), 812–825.
    [167] W. Stute, $L^p$-convergence of conditional U-statistics, J. Multivar. Anal., 51 (1994), 71–82. https://doi.org/10.1006/jmva.1994.1050 doi: 10.1006/jmva.1994.1050
    [168] W. Stute, Universally consistent conditional U-statistics, Ann. Statist., 22 (1994), 460–473. https://doi.org/10.1214/aos/1176325378 doi: 10.1214/aos/1176325378
    [169] W. Stute, Symmetrized NN-conditional U-statistics, In: Research Developments in Probability and Statistics, 231–237, 1996.
    [170] W. Stute, J. L. Wang, Multi-sample U-statistics for censored data, Scand. J. Statist., 20 (1993), 369–374.
    [171] W. Stute, L. X. Zhu, Nonparametric checks for single-index models, Ann. Statist., 33 (2005), 1048–1083. https://doi.org/10.1214/009053605000000020
    [172] K. K. Sudheesh, S. Anjana, M. Xie, U-statistics for left truncated and right censored data, Statistics, 57 (2023), 900–917. https://doi.org/10.1080/02331888.2023.2217314 doi: 10.1080/02331888.2023.2217314
    [173] Q. Tang, L. Kong, D. Ruppert, R. J. Karunamuni, Partial functional partially linear single-index models, Statist. Sinica, 31 (2021), 107–133.
    [174] W. Y. Tsai, N. P. Jewell, M. C. Wang, A note on the product-limit estimator under right censoring and left truncation, Biometrika, 74 (1987), 883–886. https://doi.org/10.1093/biomet/74.4.883 doi: 10.1093/biomet/74.4.883
    [175] A. van Delft, H. Dette, A general framework to quantify deviations from structural assumptions in the analysis of non-stationary function-valued processes, preprint paper, 2022. https://doi.org/10.48550/arXiv.2208.10158
    [176] A. van Delft, M. Eichler, Locally stationary functional time series, Electron. J. Stat., 12 (2018), 107–170. https://doi.org/10.1214/17-EJS1384 doi: 10.1214/17-EJS1384
    [177] A. van der Vaart, New Donsker classes, Ann. Probab., 24 (1996), 2128–2140. https://doi.org/10.1214/aop/1041903221
    [178] A. W. van der Vaart, J. A. Wellner, Weak Convergence and Empirical Processes, New York: Springer, 1996.
    [179] M. Vogt, Nonparametric regression for locally stationary time series, Ann. Statist., 40 (2012), 2601–2633. https://doi.org/10.1214/12-AOS1043 doi: 10.1214/12-AOS1043
    [180] V. A. Volkonskii, Y. A. Rozanov, Some limit theorems for random functions I, Theory Probab. Appl., 4 (1959), 178–197. https://doi.org/10.1137/1104015 doi: 10.1137/1104015
    [181] R. von Mises, On the asymptotic distribution of differentiable statistical functions, Ann. Math. Stat., 18 (1947), 309–348.
    [182] M. P. Wand, M. C. Jones, Kernel smoothing, In: Monographs on Statistics and Applied Probability, London: Chapman and Hall, 1995.
    [183] J. L. Wang, J. M. Chiou, H. G. Müller, Functional data analysis, Annu. Rev. Stat. Appl., 3 (2016), 257–295. https://doi.org/10.1146/annurev-statistics-041715-033624 doi: 10.1146/annurev-statistics-041715-033624
    [184] G. S. Watson, Smooth regression analysis, Sankhyā Ind. J. Stat. Ser. A, 26 (1964), 359–372.
    [185] J. Yang, Z. Zhou, Spectral inference under complex temporal dynamics, J. Amer. Statist. Assoc., 117 (2022), 133–155. https://doi.org/10.1080/01621459.2020.1764365 doi: 10.1080/01621459.2020.1764365
    [186] A. Yuan, M. Giurcanu, G. Luta, M. T. Tan, U-statistics with conditional kernels for incomplete data models, Ann. Inst. Statist. Math., 69 (2017), 271–302. https://doi.org/10.1007/s10463-015-0537-6 doi: 10.1007/s10463-015-0537-6
    [187] Y. Zhou, P. S. F. Yip, A strong representation of the product-limit estimator for left truncated and right censored data, J. Multivar. Anal., 69 (1999), 261–280. https://doi.org/10.1006/jmva.1998.1806 doi: 10.1006/jmva.1998.1806
    [188] H. Zhu, R. Zhang, Y. Liu, H. Ding, Robust estimation for a general functional single index model via quantile regression, J. Korean Stat. Soc., 51 (2022), 1041–1070. https://doi.org/10.1007/s42952-022-00174-4 doi: 10.1007/s42952-022-00174-4
  • This article has been cited by:

    1. Salim Bouzebda, Nourelhouda Taachouche, Oracle inequalities and upper bounds for kernel conditional U-statistics estimators on manifolds and more general metric spaces associated with operators, 2024, 1744-2508, 1, 10.1080/17442508.2024.2391898
    2. Salim Bouzebda, Amel Nezzal, Issam Elhattab, Limit theorems for nonparametric conditional U-statistics smoothed by asymmetric kernels, 2024, 9, 2473-6988, 26195, 10.3934/math.20241280
    3. Salim Bouzebda, Limit Theorems in the Nonparametric Conditional Single-Index U-Processes for Locally Stationary Functional Random Fields under Stochastic Sampling Design, 2024, 12, 2227-7390, 1996, 10.3390/math12131996
    4. Oussama Bouanani, Salim Bouzebda, Limit theorems for local polynomial estimation of regression for functional dependent data, 2024, 9, 2473-6988, 23651, 10.3934/math.20241150
    5. Breix Michael Agua, Salim Bouzebda, Single index regression for locally stationary functional time series, 2024, 9, 2473-6988, 36202, 10.3934/math.20241719
    6. Youssouf Souddi, Salim Bouzebda, k-Nearest Neighbour Estimation of the Conditional Set-Indexed Empirical Process for Functional Data: Asymptotic Properties, 2025, 14, 2075-1680, 76, 10.3390/axioms14020076
    7. Khalid Chokri, Salim Bouzebda, Optimal Almost Sure Rate of Convergence for the Wavelets Estimator in the Partially Linear Additive Models, 2025, 17, 2073-8994, 394, 10.3390/sym17030394
    8. Sultana Didi, Salim Bouzebda, Linear Wavelet-Based Estimators of Partial Derivatives of Multivariate Density Function for Stationary and Ergodic Continuous Time Processes, 2025, 27, 1099-4300, 389, 10.3390/e27040389
    9. Ibrahim M. Almanjahie, Hanan Abood, Salim Bouzebda, Fatimah Alshahrani, Ali Laksaci, Nonparametric expectile shortfall regression for functional data, 2025, 58, 2391-4661, 10.1515/dema-2025-0125
    10. Sultana Didi, Salim Bouzebda, Wavelet Estimation of Partial Derivatives in Multivariate Regression Under Discrete-Time Stationary Ergodic Processes, 2025, 13, 2227-7390, 1587, 10.3390/math13101587
    11. Sultana Didi, Salim Bouzebda, Wavelet-based estimators of partial derivatives of a multivariate density function for discrete stationary and ergodic processes, 2025, 10, 2473-6988, 12519, 10.3934/math.2025565
    12. Mohammed B. Alamari, Fatimah A. Almulhim, Ibrahim M. Almanjahie, Salim Bouzebda, Ali Laksaci, Scalar-on-Function Mode Estimation Using Entropy and Ergodic Properties of Functional Time Series Data, 2025, 27, 1099-4300, 552, 10.3390/e27060552
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)