
Determining acute ischemic stroke onset time using machine learning and radiomics features of infarct lesions and whole brain


  • Accurate determination of the onset time in acute ischemic stroke (AIS) patients helps to formulate more beneficial treatment plans and plays a vital role in the recovery of patients. Considering that the whole brain may contain some critical information, we combined the radiomics features of infarct lesions and the whole brain to improve the prediction accuracy. First, the radiomics features of infarct lesions and the whole brain were separately calculated using apparent diffusion coefficient (ADC), diffusion-weighted imaging (DWI) and fluid-attenuated inversion recovery (FLAIR) sequences of AIS patients with a clear onset time. Then, the least absolute shrinkage and selection operator (Lasso) was used to select features. Four experimental groups were generated according to combination strategies: features in infarct lesions (IL), features in the whole brain (WB), direct combination of them (IW) and Lasso selection again after direct combination (IWS), which were used to evaluate the predictive performance. The results of ten-fold cross-validation showed that IWS achieved the best AUC of 0.904, which improved by 13.5% compared with IL (0.769), by 18.7% compared with WB (0.717) and by 4.2% compared with IW (0.862). In conclusion, combining infarct lesion and whole-brain features from multiple sequences can further improve the accuracy of AIS onset time determination.

    Citation: Jiaxi Lu, Yingwei Guo, Mingming Wang, Yu Luo, Xueqiang Zeng, Xiaoqiang Miao, Asim Zaman, Huihui Yang, Anbo Cao, Yan Kang. Determining acute ischemic stroke onset time using machine learning and radiomics features of infarct lesions and whole brain[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 34-48. doi: 10.3934/mbe.2024002




    In the evolutionary trajectory of asymptotic outcomes related to U-statistics, with a particular emphasis on independent and identically distributed random variables, pivotal contributions can be ascribed to esteemed figures like [93,99,181], among others. When extending these advancements to accommodate scenarios involving weak dependency assumptions, notable references include [30,39,40,66,119,120]. For a comprehensive grasp of U-statistics and U-processes, scholars are directed to seminal works such as [11,12,14,31,113,117]. A substantial leap forward in the theoretical landscape of U-processes is accredited to [64], who made significant contributions by assimilating insights from empirical process theory. Their introduction of innovative techniques, including decoupling inequality and randomization, played a pivotal role in propelling the theoretical framework forward. The applications of U-processes traverse diverse statistical domains, encompassing testing for qualitative features of functions in nonparametric statistics [1,88,118], cross-validation for density estimation [144], and establishing limiting distributions of M-estimators [11,64,157,158]. In the realm of machine learning, U-statistics find multifaceted applications in clustering, image recognition, ranking, and learning on graphs. The natural estimates of risk prevalent in various machine learning contexts often manifest in the form of U-statistics, as elucidated in [54]. Instances of U-statistics are also discerned in various contexts, such as empirical performance measures in metric learning, exemplified by [50]. When confronted with U-statistics characterized by random kernels exhibiting diverging orders, pertinent literature includes contributions from [85,98,152,162]. Infinite-order U-statistics manifest as invaluable tools for constructing simultaneous prediction intervals, providing insights into the uncertainty inherent in ensemble methods like subbagging and random forests, as explicated in [146]. The MeanNN approach, introduced by [75] for estimating differential entropy, intricately involves the utilization of the U-statistic. Additionally, [129] proposes a novel test statistic for goodness-of-fit tests, employing U-statistics. A model-free approach to clustering and classifying genetic data based on U-statistics is explored by [55], presenting alternative perspectives driven by the adaptability of U-statistics to a diverse array of genetic issues and their capability to accommodate various data types. Furthermore, [125] advocates for the natural application of U-statistics in examining random compressed sensing matrices in the non-asymptotic regime. For the latest references in this context, please consult [43,163,164]. In the realm of nonparametric density and regression function estimation, [166] introduces a class of estimators for \(r^{(m)}(\varphi, \mathbf{ t}) \), referred to as conditional \(U\)-statistics. These estimators can be perceived as an extension of the Nadaraya-Watson estimates for regression functions, initially proposed by [139,184]. The nonparametric domain of density and regression function estimation has been a focal point for statisticians and probabilists over numerous years, resulting in the evolution of various methodologies. Kernel nonparametric function estimation methods, in particular, have garnered substantial attention. 
For a thorough exploration of the research literature and statistical applications in this field, one is encouraged to consult [67,74,95,140,160,182], and the pertinent references therein.

This investigation delves into the intricacies of nonparametric conditional U-statistics. To facilitate our exploration, we commence by introducing the estimators proposed by [166]. Consider a regular sequence of random elements \{(X_i, Y_i) : i \in \mathbb{N}^*\} , where X_i \in \mathbb{R}^d and Y_i \in \mathcal{Y} , a Polish space, with \mathbb{N}^* = \mathbb{N} \setminus \{0\} . Let \varphi : \mathcal{Y}^m \rightarrow \mathbb{R} be a measurable function. In this paper, our central focus revolves around the estimation of the conditional expectation or regression function:

r^{(m)}(\varphi, \mathbf{t}) = \mathbb{E}\left(\varphi(Y_1, \ldots, Y_m) \mid (X_1, \ldots, X_m) = \mathbf{t}\right),

for \mathbf{t} \in \mathbb{R}^{dm} , provided it exists, namely, when \mathbb{E}\left(\lvert\varphi(Y_1, \ldots, Y_m)\rvert\right) < \infty . We introduce a kernel function K : \mathbb{R}^d \rightarrow \mathbb{R} with support contained in [-B, B]^d , where B > 0 , adhering to the following conditions:

\sup\limits_{x \in \mathbb{R}^d} \lvert K(x) \rvert = : \kappa < \infty \quad \mbox{and} \quad \int K(x)\, dx = 1.

Reference [166] introduced a class of estimators for r^{(m)}(\varphi, \mathbf{t}) , known as conditional U -statistics, defined for each \mathbf{t} \in \mathbb{R}^{dm} as:

\begin{equation} \widehat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n) = \frac{\sum\limits_{(i_1, \ldots, i_m) \in I_n^m} \varphi(Y_{i_1}, \ldots, Y_{i_m}) K\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\left(\frac{t_m - X_{i_m}}{h_n}\right)}{\sum\limits_{(i_1, \ldots, i_m) \in I_n^m} K\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\left(\frac{t_m - X_{i_m}}{h_n}\right)}, \end{equation} (1.1)

where I_n^m is the set of all m -tuples of distinct integers between 1 and n :

I_n^m = \left\{\mathbf{i} = (i_1, \ldots, i_m) : 1 \leq i_j \leq n \mbox{ and } i_j \neq i_r \mbox{ if } j \neq r\right\},

and \{h_n\}_{n \geq 1} is a sequence of positive constants converging to zero at a rate such that n h_n^{dm} \rightarrow \infty . In the specific scenario of m = 1 , where r^{(m)}(\varphi, \mathbf{t}) simplifies to r^{(1)}(\varphi, t) = \mathbb{E}(\varphi(Y) \mid X = t) , the estimator by Stute transforms into the Nadaraya-Watson estimator of r^{(1)}(\varphi, t) . The study conducted by [154] focused on estimating the rate of uniform convergence in \mathbf{t} of \widehat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n) to r^{(m)}(\varphi, \mathbf{t}) . In [148], the paper discusses and compares the limit distributions of \widehat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n) with those obtained by Stute. [97] extended the results of [166] to weakly dependent data under appropriate mixing conditions (also see [17]). They applied these findings to verify the Bayes risk consistency of corresponding discrimination rules similar to [167] and Section 5.1. In [169], symmetrized nearest neighbor conditional U -statistics are proposed as alternatives to the usual kernel-type estimators, and reference can also be made to [48]. [86] explored the functional conditional U -statistic and established its finite-dimensional asymptotic normality. Despite the subject's importance, nonparametric estimation of conditional U -statistics in a functional data framework has received relatively limited attention. Recent advancements are presented in [48], addressing problems related to uniform bandwidth consistency in a general setting. In [104], the test of independence in the functional framework based on the Kendall statistics was investigated; such statistics can be considered particular cases of U -statistics. Extending this exploration to conditional empirical U -processes in the functional setting is practically useful and technically more challenging. Two perspectives on conditional U -processes are available: 1) they are infinite-dimensional versions of conditional U -statistics (with one kernel) and 2) they are stochastic processes that are nonlinear generalizations of conditional empirical processes. Both views are valuable because: 1) from a statistical standpoint, considering a rich class of statistics is more interesting than a single statistic; 2) mathematically, insights from empirical process theory can be applied to derive limit or approximation theorems for U -processes. Importantly, extending U -statistics to U -processes demands substantial effort and different techniques, and the generalization from conditional empirical processes to conditional U -processes is highly nontrivial.
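To make (1.1) concrete, the following minimal sketch evaluates the estimator for m = 2 and d = 1 ; the Gaussian kernel and the product kernel \varphi(y_1, y_2) = y_1 y_2 are illustrative stand-ins, not choices made in the paper.

```python
import numpy as np
from itertools import permutations

def conditional_u_stat(X, Y, phi, t, h, m=2):
    """Stute-type conditional U-statistic (1.1) for d = 1, with a Gaussian
    kernel standing in for the compactly supported K of the text."""
    K = lambda v: np.exp(-0.5 * v ** 2)
    num = den = 0.0
    for idx in permutations(range(len(X)), m):   # m-tuples of distinct indices
        w = np.prod([K((t[k] - X[i]) / h) for k, i in enumerate(idx)])
        num += phi(*(Y[i] for i in idx)) * w
        den += w
    return num / den

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, 50)
Y = X + 0.1 * rng.standard_normal(50)
# phi(y1, y2) = y1*y2 targets E[Y_1 Y_2 | X_1 = 0.5, X_2 = -0.5], about -0.25 here
print(conditional_u_stat(X, Y, lambda y1, y2: y1 * y2, t=(0.5, -0.5), h=0.2))
```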

The prevalent practice of assuming stationarity in time series modeling has prompted the development of various models, techniques, research, and methodologies. However, this assumption may not always be suitable for spatio-temporal data, even with detrending and deseasonalization. Many pivotal time series models exhibit nonstationarity, observed in diverse physical phenomena and economic data, rendering classical methods ineffective. To address this challenge, the concept of the locally stationary random process was introduced by [161]. This type of process approximates a non-stationary process by a stationary one locally over short periods. The intuitive concept of local stationarity is also explored in the works of [57,58,142,149,153], among others. The groundbreaking work of [57] notably serves as a robust foundation for the inference of locally stationary processes. In addition to generalizing stationary processes, this innovative approach accommodates time-varying parameters. Over the past decade, the theory of empirical processes for locally stationary time series has garnered significant attention. Empirical process theory plays a crucial role in addressing statistical problems and has expanded into time series analysis and regression estimation. Relevant references in this context include [59,179] and more recent contributions such as [134,147]. The extension of the previously discussed exploration to conditional empirical U-processes bears significant interest from both practical and theoretical standpoints. We specifically delve into the domain of conditional U-processes indexed by a class of functions within the framework of functional data. Building upon insights from [8], functional data analysis (FDA) emerges as a statistical field dedicated to analyzing infinite-dimensional variables such as curves, sets, and images. Experiencing remarkable growth over the past two decades, FDA has become a crucial area of investigation in data science, fueled by advancements in data collection technology during the "Big Data" revolution. For an introduction to FDA, readers can refer to the books by [78,151], providing fundamental analysis methods and case studies across various domains like criminology, economics, archaeology, and neurophysiology. Notably, the extension of probability theory to random variables taking values in normed spaces predates recent literature on functional data, with foundational knowledge available in [9,87]. In the context of regression estimation and nonparametric models for data in normed vector spaces, valuable references include [78,136], along with additional contributions from [32,102,126]. Modern empirical process theory has been applied to functional data, as demonstrated by [82], who established uniform consistency rates for functionals of the conditional distribution, including the regression function, conditional cumulative distribution, and conditional density. [109] extended this by providing consistency rates for various functional nonparametric models, uniformly in bandwidth (UIB consistency). Recent advancements in this field can be explored through references such as [3,40,42,49,68,69,137]. This strongly motivates the consideration of regression models that offer dimension reduction. Single index models are widely used to achieve this by assuming that the predictors' influence on the response can be simplified to a single index.
This index represents a projection in a specified direction and is combined with a nonparametric link function, simplifying the predictors to a one-dimensional index while still incorporating important characteristics. Additionally, because the nonparametric link function only operates on a one-dimensional index, these models are not affected by the problem of having a high number of dimensions, known as the curse of dimensionality. The single index model extends the concept of linear regression by incorporating a link function equivalent to the identity function; for further details, interested readers can refer to [23,91,123,138,171,183].

    Recent progress in functional data analysis underscores the need for developing models to address the challenges of dimensionality reduction (refer to [90,126] for recent surveys, and also [4,5,40,41,165]). In response to this, semiparametric approaches emerge as promising solutions. The functional single-index model (FSIM) has gained attention in this context, with exploration by [2,16,79]. Furthermore, [106] proposed functional single-index composite quantile regression, estimating the unknown slope function and link function through B-spline basis functions. A functional single index model with coefficient functions restricted to a subregion was introduced by [143]. The estimation of a general functional single index model, where the conditional distribution depends on the functional predictor via a single index structure, was investigated by [188]. Innovatively, [173] developed a new estimation method that combines functional principal component analysis, B-spline modeling, and profile estimation for parameters and functions. Addressing the estimation of the functional single index regression model with missing responses for strong mixing time series data, [127,128] made valuable contributions. [76] introduced a functional single-index varying coefficient model with the functional predictor as the single-index part. Utilizing functional principal component analysis and basis function approximation, they obtained estimators for slope and coefficient functions, proposing an iterative estimating procedure. An automatic and location-adaptive procedure for estimating regression in an FSIM based on k-Nearest Neighbors (kNN) principles was presented by [145]. Motivated by imaging data analysis, [121] proposed a novel functional varying-coefficient single-index model for regression analysis of functional response data on a set of covariates. Investigating a functional Hilbertian regressor for nonparametric estimation of the conditional cumulative distribution with a scalar response variable in a single index structure, [15] made notable contributions. An alternative approach was introduced by [52], extending to the multi-index case without anchoring the true parameter on a prespecified sieve. Their detailed theoretical analysis of a direct kernel-based estimation scheme establishes a polynomial convergence rate.

    The primary objective of this paper is to scrutinize a comprehensive framework for the single-index conditional U-process of any fixed order, indexed by a class of functions within a nonparametric context. Specifically, we explore the conditional U-process in the realm of functional covariates, considering the potential non-stationary nature of functional time series. The main aim of this study is to offer an initial and comprehensive theoretical examination within this specific context. To achieve this, we skillfully apply large sample theory techniques developed for empirical processes and U-empirical processes. This paper meticulously tackles various technical hurdles. Initially, it delves into the nonlinear expansion of the single index concept and conditional U-statistics. Subsequently, it addresses the extension of the Hoeffding decomposition to non-stationary time series. Finally, it confronts the complexity stemming from the unbounded function class, leading to extensive and intricate proofs.

    The manuscript's organization is structured as follows. Section 2 provides a detailed exposition of our theoretical framework, elucidating essential definitions and contextual explanations while introducing technical assumptions. Our principal findings are presented in Sections 3 and 4. Specifically, Section 3 unveils convergence rate results, reintroducing the pivotal Hoeffding decomposition technique. Accommodating our outcomes on weak convergence, Section 4 delves into the details of these results. Section 5 accentuates selected applications. In Section 6, we explore bandwidth selection methodologies utilizing cross-validation procedures. Concluding reflections are encapsulated in Section 7. The comprehensive proofs are furnished in Section 8. Lastly, Appendix 13 provides technical properties and lemmas for easy reference.

In this document, the notation a_n \lesssim b_n is employed to signify the existence of a constant C , which is independent of n and may vary between lines, unless explicitly specified otherwise, such that a_n \leq C b_n holds for all n . Additionally, the notation a_n \ll b_n indicates that a_n / b_n \rightarrow 0 as n \rightarrow \infty . When both a_n \lesssim b_n and b_n \lesssim a_n hold, their equivalence is denoted as a_n \approx b_n . Moreover, (i_1, \ldots, i_m) is denoted as \mathbf{i} , and (i_1/n, \ldots, i_m/n) as \mathbf{i}/n . For any c, d \in \mathbb{R} , the expressions c \vee d = \max\{c, d\} and c \wedge d = \min\{c, d\} are employed. The notation \lfloor a \rfloor signifies the integer part of a number a . Additionally, for m < n , where m and n are positive integers, C_n^m = \frac{n!}{(n-m)!\, m!} is defined. The set I_n^m is introduced as

I_n^m := \left\{\mathbf{i} = (i_1, \ldots, i_m) : 1 \leq i_j \leq n \mbox{ and } i_j \neq i_r \mbox{ if } j \neq r\right\},

comprising all m -tuples of distinct integers between 1 and n .

Consider the stochastic processes \{(Y_{i, n}, X_{i, n})\}_{i = 1}^n , where Y_{i, n} takes values in a space \mathcal{Y} , and X_{i, n} is in an abstract space \mathcal{H} . We assume \mathcal{H} is a semi-metric vector space with a semi-metric d(\cdot, \cdot) * which, in most applications, would be a Hilbert or Banach space. We consider the semi-metric d_{\theta}(\cdot, \cdot) associated with the single index \theta \in \mathcal{H} , defined as

*A semi-metric (or pseudo-metric) d(\cdot, \cdot) is a metric allowing d(x_1, x_2) = 0 for some x_1 \neq x_2 .

d_{\theta}(u, v) := \lvert \langle \theta, u - v \rangle \rvert, \quad \mbox{for } u, v \in \mathcal{H}.
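Since curves are observed on a finite grid in practice, the projection \langle \theta, u - v \rangle can be approximated by a quadrature rule. The following minimal sketch (all curves are toy choices) also illustrates why d_\theta is only a semi-metric:

```python
import numpy as np

# Single-index semi-metric d_theta(u, v) = |<theta, u - v>| for curves on a
# common grid, with the L^2 inner product approximated by a Riemann sum.
grid = np.linspace(0.0, 1.0, 200)
dx = grid[1] - grid[0]

def d_theta(theta, u, v):
    return abs(np.sum(theta * (u - v)) * dx)

theta = np.sin(np.pi * grid)              # toy index direction
u = np.cos(2.0 * np.pi * grid)
w = u + np.sin(2.0 * np.pi * grid)        # w != u, yet <theta, u - w> = 0
print(d_theta(theta, u, w))               # ~0: a semi-metric, not a metric
```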

Consider any function \varphi(\cdot) of m variables (the U -kernel) such that \varphi(Y_1, \ldots, Y_m) is integrable. For \mathbf{x} = (x_1, \ldots, x_m) \in \mathcal{H}^m and \boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) \in \boldsymbol{\Theta}^m \subset \mathcal{H}^m , define the regression functional parameter as

\begin{equation} r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right) := \mathbb{E}\left(\varphi(Y_1, \ldots, Y_m) \mid \langle X_1, \theta_1 \rangle = \langle x_1, \theta_1 \rangle, \ldots, \langle X_m, \theta_m \rangle = \langle x_m, \theta_m \rangle\right) = : \mathbb{E}\left(\varphi(Y_{\mathbf{i}}) \mid \langle X_{\mathbf{i}}, \boldsymbol{\theta} \rangle = \langle \mathbf{x}, \boldsymbol{\theta} \rangle\right), \quad \mathbf{i} = (1, \ldots, m). \end{equation} (2.1)

    In this study, we consider the following model:

\begin{equation} \varphi(Y_{\mathbf{i}, n}) = r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n}, \mathbf{x}, \boldsymbol{\theta}\right) + \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) \varepsilon_{\mathbf{i}}, \quad \mathbf{i} = (i_1, \ldots, i_m), \; 1 \leq i_j \leq n, \end{equation} (2.2)

where \{\varepsilon_{\mathbf{i}}\}_{\mathbf{i} \in I_n^m} is a sequence of univariate independent and identically distributed random variables, independent of \{X_{i, n}\}_{i = 1}^n . We denote \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) \varepsilon_{\mathbf{i}} as \varepsilon_{\mathbf{i}, n} . Furthermore, we assume that the process is a locally stationary functional time series. In a heuristic sense, a process \{X_{i, n}\} is considered locally stationary if it displays approximately stationary behavior locally in time. The regression function r^{(m)}(\varphi, \cdot, \mathbf{x}, \boldsymbol{\theta}) is allowed to change smoothly over time, depending on a rescaled quantity i/n rather than on the specific point i (where i typically represents time in a time series framework).

We delve into the exploration of non-stationary processes characterized by dynamics evolving gradually over time, manifesting behaviors akin to stationarity at a local level. This conceptual realm has undergone comprehensive scrutiny, exemplified by references such as [57,105,115,131,141,185]. For illustration, consider a continuous function a : [0, 1] \rightarrow \mathbb{R} and a sequence of i.i.d. random variables (\varepsilon_i)_{i \in \mathbb{N}} . The stochastic process X_{i, n} = a(i/n) + \varepsilon_i , where i \in \{1, \ldots, n\} and n \in \mathbb{N} , can exhibit "almost" stationary behavior for i close to a specific point i^* (e.g., i^* in \{1, \ldots, n\} ), given that a(i^*/n) \approx a(i/n) . However, this process is not strictly weakly stationary. To capture this type of gradual change, the concept of local stationarity was introduced by [57], wherein the spectral representation of the underlying stochastic process is locally approximated. In our framework, the process \{X_{i, n}\} can be stochastically approximated by a stationary process \{X_i^{(u)}\} around each rescaled time point u , specifically for those values of i where \lvert i/n - u \rvert is small. Since our focus is on functional data, we define a functional time series as locally stationary if it can be locally approximated by a stationary functional time series. We will provide a standard definition of local stationarity.

Definition 2.1 (local stationarity). For a sequence of stochastic processes, indexed by n \in \mathbb{N} and taking values in \mathcal{H} , denoted as \{X_{i, n}\} , it is deemed locally stationary if, for all rescaled times u \in [0, 1] , there exists an associated \mathcal{H} -valued process \{X_i^{(u)}\} that is strictly stationary. This association is characterized by the inequality:

\begin{equation} d_{\theta}\left(X_{i, n}, X_i^{(u)}\right) \leq \left(\left\lvert \frac{i}{n} - u \right\rvert + \frac{1}{n}\right) U_{i, n}^{(u)} \quad \text{a.s.} \end{equation} (2.3)

This holds for all 1 \leq i \leq n , where \{U_{i, n}^{(u)}\} is a positive-valued process satisfying \mathbb{E}\left[\left(U_{i, n}^{(u)}\right)^{\rho}\right] < C for some \rho > 0 , C < \infty . These conditions are independent of u , i , and n .
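Definition 2.1 can be checked by hand in the toy example above, X_{i, n} = a(i/n) + \varepsilon_i : the stationary approximant at u shares the same innovations, and the Lipschitz constant of a serves as U_{i, n}^{(u)} . A minimal numerical check, with all concrete choices illustrative:

```python
import numpy as np

# Toy check of Definition 2.1 for X_{i,n} = a(i/n) + eps_i with smooth a:
# the stationary process X_i^{(u)} = a(u) + eps_i satisfies (2.3) with the
# deterministic bound U_{i,n}^{(u)} = Lip(a) (here |a'| <= 2*pi).
rng = np.random.default_rng(1)
n = 500
i = np.arange(1, n + 1)
eps = rng.standard_normal(n)
a = lambda u: np.sin(2.0 * np.pi * u)
X = a(i / n) + eps                        # locally stationary observations
u = 0.3
X_u = a(u) + eps                          # stationary approximant at time u
lhs = np.abs(X - X_u)                     # distance |X_{i,n} - X_i^{(u)}|
rhs = (np.abs(i / n - u) + 1.0 / n) * 2.0 * np.pi
assert np.all(lhs <= rhs)                 # inequality (2.3) holds path-wise
```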

Definition 2.1 represents a natural extension of the concept of local stationarity for real-valued time series introduced in [57]. In a more specific context, [176] and [175] elaborate on this definition, considering \mathcal{H} as the Hilbert space L^2_{\mathbb{R}}([0, 1]) of all real-valued functions that are square-integrable with respect to the Lebesgue measure on the interval [0, 1] , equipped with the inner product and L^2 -norm:

\|f\|_2 = \sqrt{\langle f, f \rangle}, \qquad \langle f, g \rangle = \int_0^1 f(t) g(t)\, dt,

where f, g \in L^2_{\mathbb{R}}([0, 1]) . These authors also provide sufficient conditions for an L^2_{\mathbb{R}}([0, 1]) -valued stochastic process \{X_{i, n}\} to satisfy (2.3) with d(f, g) = \|f - g\|_2 and \rho = 2 . Additionally, they define L_E^p(T, \mu) as the Banach space of all strongly measurable functions f : T \rightarrow E with finite norm:

\|f\|_p = \|f\|_{L_E^p(T, \mu)} = \left(\int_T \|f(\tau)\|_E^p\, d\mu(\tau)\right)^{\frac{1}{p}},

for 1 \leq p < \infty , and with finite norm

    \|f\|_{\infty} = \|f\|_{L_E^{\infty}(T, \mu)} = \inf\limits_{\mu(N) = 0} \sup\limits_{\tau \in T \backslash N}\|f(\tau)\|_E,

    for p = \infty . In this context, \mathcal H_{\mathbb{C}} = L_{\mathbb{C}}^2([0, 1]) .

    Remark 2.2. [176] generalizes the definition of local stationary processes, initially proposed by [56], to the functional setting in the frequency domain. This extension is made under the following assumptions:

    (A1) (i) \left\{\varepsilon_i\right\}_{i\in \mathbb{Z}} is a weakly stationary white noise process taking values in \mathcal H with a spectral representation given by

    \varepsilon_j = \int_{-\pi}^\pi e^{\mathrm{i} \omega j} d Z_\omega,

    where Z_\omega is a 2 \pi -periodic orthogonal increment process taking values in \mathcal H_{\mathbb{C}} ;

    (ii) the functional process X_{i, n} with i = 1, \ldots, n and n \in \mathbb{N} is given by

    X_{j, n} = \int_{-\pi}^\pi e^{\mathrm{i} \omega j} \mathcal{A}_{j, \omega}^{(n)} d Z_\omega \quad a.e. \;in \;\mathcal H

    with the transfer operator \mathcal{A}_{j, \omega}^{(n)} \in \mathcal{B}_p and an orthogonal increment process Z_\omega .

    (A2) There exists \mathcal{A}:[0, 1] \times[-\pi, \pi] \rightarrow S_p\left(H_{\mathbb{C}}\right) with \mathcal{A}_{u, \cdot} \in \mathcal{B}_p and \mathcal{A}_{u, \omega} being continuous in u such that for all n \in \mathbb{N}

    \sup\limits_{\omega, t}\left\|\mathcal{A}_{t, \omega}^{(n)}-\mathcal{A}_{\frac{t}{n}, \omega}\right\|_p = O\left(\frac{1}{n}\right) .

    They have proved in [176, Proposition 2.2] that:

    Proposition 2.3. Suppose that assumptions (A1) and (A2) hold. Then, \left\{X_{i, n}\right\} is a locally stationary process in \mathcal H .


    Theorem 2.4. Consider a white noise process \left\{\varepsilon_i\right\}_{i \in \mathbb{Z}} in L_H^2(\Omega, \mathbb{P}) where H = L^2([0, 1]) , and let \left\{X_{i, n}\right\} be a sequence of functional autoregressive processes defined as

    \sum\limits_{j = 0}^m B_{\frac{i}{n}, j}\left(X_{i-j, n}\right) = C_{\frac{i}{n}}\left(\varepsilon_i\right),

    with B_{u, j} = B_{0, j}, C_u = C_0 for u < 0 , and B_{u, j} = B_{1, j}, C_u = C_1 for u > 1 . If the process satisfies, for all u \in[0, 1] and p = 2 or p = \infty , the conditions

    (i) C_u is an invertible element of S_{\infty}(H) ;

    (ii) B_{u, j} \in S_p(\mathcal H) for j = 1, \ldots, m with \sum_{j = 1}^m\left\|B_{u, j}\right\|_l < 1 and B_{u, 0} = I_H ;

    (iii) the mappings u \mapsto B_{u, j} for j = 1, \ldots, m and u \mapsto C_u are continuous in u \in[0, 1] and differentiable on u \in(0, 1) with bounded derivatives,

    then the process \left\{X_{i, n}\right\} satisfies (A2) with

    \mathcal{A}_{\frac{i}{n}, \omega}^{(n)} = \frac{1}{\sqrt{2 \pi}}\left(\sum\limits_{j = 0}^m e^{-\mathrm{i} \omega j} B_{\frac{i}{n}, j}\right)^{-1} C_{\frac{i}{n}},

    and, consequently, is locally stationary.
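As a concrete illustration of Theorem 2.4, the following sketch simulates a discretized time-varying functional AR(1) (the case m = 1 with C_u the identity); the rank-one integral operator is an assumed toy choice whose norm stays below one, in the spirit of conditions (i)-(iii):

```python
import numpy as np

# Discretized time-varying functional AR(1), the m = 1 case of Theorem 2.4
# with C_u = identity: X_{i,n} = B_{i/n}(X_{i-1,n}) + eps_i.  The rank-one
# kernel b_u(s,t) = 0.8*(1 - 0.5*u)*phi(s)*phi(t) has L^2 operator norm
# 0.8*(1 - 0.5*u)*||phi||^2 = 0.4*(1 - 0.5*u) < 1, uniformly and smoothly in u.
rng = np.random.default_rng(2)
n, p = 300, 50
s = np.linspace(0.0, 1.0, p)
phi_s = np.sin(np.pi * s)

def B(u):
    # matrix approximation of the integral operator (division by p ~ ds)
    return 0.8 * (1.0 - 0.5 * u) * np.outer(phi_s, phi_s) / p

X = np.zeros((n, p))
for i in range(1, n):
    X[i] = B(i / n) @ X[i - 1] + rng.standard_normal(p)
```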

Handling infinite-dimensional spaces presents a notable technical hurdle due to the absence of a universal reference measure, such as the Lebesgue measure. Consequently, defining a density function for the functional variable becomes elusive. To surmount this challenge, we leverage the concept of "small-ball probability". In particular, we address the concentration of the probability measure for the functional variable within a small ball using the function \phi_{x, \theta}(\cdot) . For a fixed x \in \mathcal{H} and every r > 0 , the function \phi_{x, \theta}(r) is defined as:

    \begin{equation} \mathbb{P}\left(X \in B_\theta(x , r)\right) = : \phi_{x, \theta}(r) > 0. \end{equation} (2.4)

    Here, \mathcal{H} is equipped with the semi-metric d(\cdot, \cdot) , and B_\theta(x, r) represents a ball in \mathcal{H} with center x \in \mathcal{H} and radius r . For \mathbf x = (x_1, \ldots, x_m) \in \mathcal H^m and \boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) \in \Theta^m , we define

    \phi_{\mathbf x, \boldsymbol\theta}(r) = \prod\limits_{i = 1}^m\phi_{x_i, \theta_i}(r).

    Further elucidation and examples concerning small-ball probability can be explored in Remark 3.6.
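The small-ball function \phi_{x, \theta}(\cdot) is rarely available in closed form, but it can be approximated by Monte Carlo. A minimal sketch with toy Brownian-like paths (the index \theta and the center x = 0 are illustrative choices):

```python
import numpy as np

# Monte Carlo approximation of the small-ball probability (2.4):
# phi_{x,theta}(r) = P(d_theta(X, x) <= r), here with toy Brownian-like
# paths, index theta = sin(pi*t) and center x = 0 (all assumed choices).
rng = np.random.default_rng(3)
p = 100
grid = np.linspace(0.0, 1.0, p)
dx = grid[1] - grid[0]
theta = np.sin(np.pi * grid)
X = np.cumsum(rng.standard_normal((5000, p)), axis=1) * np.sqrt(dx)
proj = np.abs(X @ theta) * dx             # d_theta(X_j, 0) for each path
for r in (0.05, 0.1, 0.2):
    print(r, np.mean(proj <= r))          # decreases to 0 as r -> 0
```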

Statistical observations commonly exhibit a certain degree of dependence rather than complete independence. The concept of mixing serves as a quantitative measure of the proximity of a sequence of random variables to independence, facilitating the extension of traditional results applicable to independent sequences to sequences that are weakly dependent or mixing. The development of the theory of mixing conditions has emerged from the recognition that time series manifest "asymptotic independence" properties, thereby facilitating their analysis and statistical inference. Let Z_{1, n}, Z_{2, n}, \ldots be a sequence of random variables on a probability space \left(\Omega, \mathcal{F}, \mathbb{P}\right) . For an array \left\{Z_{i, n}: 1 \leq i \leq n\right\} , the \beta -mixing coefficients are defined as

    \beta(k) = \sup\limits_{i, n: 1 \leq i \leq n-k} \beta\bigg(\sigma\left(Z_{s, n}, 1 \leq s \leq i\right), \sigma\left(Z_{s, n}, i+k \leq s \leq n\right)\bigg),

where \sigma(Z) represents the \sigma -field generated by Z . The array \left\{Z_{i, n}\right\} is considered \beta -mixing if \beta(k) \rightarrow 0 as k \rightarrow \infty . It is crucial to note that \beta -mixing implies \alpha -mixing. Throughout the ensuing discussion, we assume that the sequence of random elements \{(X_{i, n}, Y_{i, n}), i = 1, \ldots, n; n \geq 1\} is absolutely regular. Remarkably, Markov chains exhibit \beta -mixing under the milder Harris recurrence condition, provided the underlying space is finite [62]. Additional rationale for favoring absolutely regular processes over strongly mixing processes is provided in the concluding remarks (Section 7).

    We endeavor to estimate the regression function as denoted in (2.1). The kernel estimator is formally defined as

    \begin{equation} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}, \end{equation} (2.5)

    where K_{1}(\cdot) and K_{2}(\cdot) denote one-dimensional kernel functions. Here, let h = h_{n} be a bandwidth with the property that h \to 0 as n \to \infty . The function \varphi: \mathcal{Y}^m \longrightarrow \mathbb{R} is symmetric and measurable, belonging to a class of functions denoted as \mathscr{F}_m . Importantly, this estimator is a conditional U -statistic utilizing the sequence of random variables \{Y_{i, n}, X_{i, n}\}_{i = 1}^n and the kernel \varphi \times K_1\times K_2 . The introduction of such statistics was pioneered by [166]. To investigate the weak convergence of the conditional empirical process and the conditional U -process within the functional data framework, we introduce some necessary notations. Consider the class

    \mathscr{F}_{m} = \{\varphi: \mathcal{Y}^{m} \rightarrow \mathbb{R}\},

which consists of real-valued symmetric measurable functions on \mathcal{Y}^{m} with a measurable envelope function:

    \begin{equation} F(\mathbf{ y}) \geq \sup\limits_{\varphi \in \mathscr{F}_{m}} \lvert\varphi(\mathbf{ y})\rvert, \; \mbox{for }\; \mathbf{ y} \in \mathcal{Y}^{m}. \end{equation} (2.6)

For kernel functions K_{1}(\cdot) and K_{2}(\cdot) , as well as a subset S_{\mathcal{H}}\subset \mathcal{H} , we define the pointwise measurable class of functions for 1\leq m \leq n and \boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) :

\mathscr{K}^{m}_{\boldsymbol{\theta}}: = \left\{ (x_1, \ldots, x_m) \mapsto \prod\limits_{i = 1}^{m}{K_{1}\left(\frac{u_i-\cdot}{h_n}\right)}K_2\left(\frac{d_{\theta_i}(x_i, \cdot)}{h_{n}}\right) , \; \; (\mathbf{x}, {\mathbf u})\in {\mathcal{H}}^{m}\times {[0, 1]^m}\right\}

and

\mathfrak{K}^{m}_{\boldsymbol\Theta}: = \bigcup\limits_{\boldsymbol{\theta}\in \boldsymbol\Theta^m}\left\{ (x_1, \ldots, x_m) \mapsto \prod\limits_{i = 1}^{m}{K_{1}\left(\frac{u_i-\cdot}{h_n}\right)}K_2\left(\frac{d_{\theta_i}(x_i, \cdot)}{h_{n}}\right), \; \; (\mathbf{x}, {\mathbf u})\in {\mathcal{H}}^{m}\times {[0, 1]^m} \right\} .

    The conditional U -process indexed by \mathscr{F}_{m}\mathfrak{K}^m_\Theta is defined by

    \begin{equation} \left\{ \mathbb{G}_n(\varphi, \mathbf u, \mathbf{x}, \boldsymbol{\theta}) : = \sqrt{n h^m \phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left(\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right)\right\}_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}. \end{equation} (2.7)

    Observing the importance of point-wise measurability in our context, it enables us to state our results conventionally, adhering to the classical definition of probability, without invoking the abstract notions of outer probability or outer expectation [178].

Remark 2.5. The same bandwidth h is used in all directions, which simplifies the analysis in cases involving product kernels. Nevertheless, the results can be readily adapted to scenarios with non-product kernels and varying bandwidths.

    Remark 2.6. Our estimator differs from the conventional conditional U -statistics not only in the nature of the sequence \{X_i\}_i but also in the inclusion of a kernel in the time direction. As a result, we attain smoothness from both the covariate direction (X_{i, n}) and the temporal dimension, allowing us to capture the characteristics of a regression model evolving over time.
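To fix ideas, the following minimal sketch implements (2.5) in the simplest case m = 1 , where it reduces to a Nadaraya-Watson ratio smoothing jointly in rescaled time and in the projected distance d_\theta(x, X_i) ; the kernels are of the type described in Assumption 2 below, while the data, index \theta and bandwidth are toy choices.

```python
import numpy as np

# Sketch of the estimator (2.5) for m = 1: a Nadaraya-Watson ratio smoothing
# jointly over rescaled time u (kernel K1, Epanechnikov) and the projected
# distance d_theta(x, X_i) (kernel K2, asymmetric triangle on [0, 1]).
def r_hat(u, x, theta, X, Y, dx, h):
    n = len(Y)
    i = np.arange(1, n + 1)
    K1 = lambda v: 0.75 * np.maximum(1.0 - v ** 2, 0.0)
    K2 = lambda v: np.maximum(1.0 - v, 0.0) * (v >= 0.0)
    d = np.abs((X - x) @ theta) * dx      # d_theta(x, X_i) via Riemann sum
    w = K1((u - i / n) / h) * K2(d / h)
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(4)
n, p = 400, 50
grid = np.linspace(0.0, 1.0, p)
dx = grid[1] - grid[0]
theta = np.sin(np.pi * grid)
X = rng.standard_normal((n, p))           # toy discretized functional covariates
proj = (X @ theta) * dx
Y = np.sin(2.0 * np.pi * np.arange(1, n + 1) / n) + proj + 0.1 * rng.standard_normal(n)
print(r_hat(0.5, X[0], theta, X, Y, dx, h=0.25))
```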

    Examining functional data through asymptotic methods involves delving into concentration properties elucidated by the concept of small-ball probability. When scrutinizing a process characterized by a set of functions, it becomes imperative to consider additional topological concepts, including metric entropy and VC-subgraph classes (referred to as "VC", inspired by Vapnik and Cervonenkis).

Definition 2.7. Let \mathcal{S}_{\mathcal{E}} be a subset of a semi-metric space \mathcal{E} . A finite set of points \{e_1, \ldots, e_N\}\subset \mathcal{E} is considered an \varepsilon -net of \mathcal{S}_{\mathcal{E}} for a given \varepsilon > 0 if:

    \mathcal{S}_{\mathcal{E}}\subseteq \bigcup\limits_{j = 1}^NB(e_j, \varepsilon).

If {N_{\varepsilon}}(\mathcal{S}_{\mathcal{E}}) is the cardinality of the smallest \varepsilon -net (i.e., the minimal number of open balls of radius \varepsilon ) in \mathcal{E} needed to cover \mathcal{S}_{\mathcal{E}} , then the Kolmogorov entropy (metric entropy) of the set \mathcal{S}_{\mathcal{E}} is defined as the quantity:

    \psi_{\mathcal{S}_{\mathcal{E}}}(\varepsilon): = \log{N_{\varepsilon}}(\mathcal{S}_{\mathcal{E}}).
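A simple illustration: for \mathcal{S}_{\mathcal{E}} = [0, 1] with the usual metric, a minimal \varepsilon -net needs on the order of 1/(2\varepsilon) centers (e.g., \varepsilon, 3\varepsilon, 5\varepsilon, \ldots ), so that

\psi_{[0, 1]}(\varepsilon) = \log N_{\varepsilon}([0, 1]) \sim \log(1/\varepsilon) \quad \mbox{as } \varepsilon \to 0,

and, by the additivity property recalled below, the product set [0, 1]^m then has entropy m\, \psi_{[0, 1]}(\varepsilon) .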

    Kolmogorov introduced the concept of metric entropy, extensively explored in various metric spaces, as indicated by its name (cf. [112]). Dudley ([70]) utilized this concept to establish sufficient conditions for the continuity of Gaussian processes, forming the foundation for significant generalizations of Donsker's theorem regarding the weak convergence of the empirical process. Consider two subsets, \mathcal{B}_\mathcal{H} and \mathcal{S}_{\mathcal{H}} , in the semi-metric space \mathcal{H} with Kolmogorov's entropy (for radius \varepsilon ) denoted as \psi_{\mathcal{B}_{\mathcal{H}}}(\varepsilon) and \psi_{\mathcal{S}_{\mathcal{H}}}(\varepsilon) , respectively. The Kolmogorov entropy for the subset \mathcal{B}_\mathcal{H}\times \mathcal{S}_\mathcal{H} of the semi-metric space \mathcal{H}^2 is given by:

    \begin{equation*} \psi_{\mathcal{B}_{\mathcal{H}} \times \mathcal{S}_{\mathcal{H}}}(\varepsilon) = \psi_{\mathcal{B}_{\mathcal{H}}}(\varepsilon) +\psi_{\mathcal{S}_{\mathcal{H}}}(\varepsilon). \end{equation*}

    Thus, m\psi_{\mathcal{S}_{\mathcal{H}}}(\varepsilon) represents the Kolmogorov entropy of the subset \mathcal{S}_{\mathcal{H}}^m in the semi-metric space \mathcal{H}^m . If d denotes the semi-metric on \mathcal{H} , a semi-metric on \mathcal{H}^m can be defined as:

    \begin{equation} d_{\mathcal{H}^m}\left(\mathbf{x}, \mathbf{z}\right) : = \frac{1}{m} d_{\theta_1}\left({x_1}, {z_1}\right)+ \cdots + \frac{1}{m} d_{\theta_m}\left({x_m}, {z_m}\right), \end{equation} (2.8)

    for \mathbf{x} = (x_1, \ldots, x_m) , \mathbf{z} = (z_1, \ldots, z_m) \in\mathcal{H}^m . The choice of the semi-metric is crucial in this type of analysis, and readers can find insightful discussions on this topic in [78, Chapters 3 and 11]. Furthermore, in this context, we must also address another topological concept: VC-subgraph classes.

    Definition 2.8. A class of subsets \mathcal{C} on a set C is called a VC-class if there exists a polynomial P(\cdot) such that, for every set of N points in C , the class \mathcal{C} picks out at most P(N) distinct subsets.

    Definition 2.9. A class of functions \mathscr{F} is called a VC-subgraph class if the graphs of the functions in \mathscr{F} form a VC-class of sets. In other words, if we define the subgraph of a real-valued function f on S as the following subset \mathcal{G}_f on S \times \mathbb{R} :

    \mathcal{G}_f = \{ (s, t): 0\leq t\leq f(s) \quad or \quad f(s)\leq t\leq 0 \}

then the class \{\mathcal{G}_f : f \in \mathscr{F}\} is a VC-class of sets on S \times \mathbb{R} . Informally, a VC-class of functions is characterized by having a polynomial covering number (the minimal number of functions needed to cover the entire class).

    A VC-class of functions \mathscr{F} with an envelope function F has the following entropy property. For a given 1\leqslant q < \infty , there exist constants a and b such that:

    \begin{equation} N(\epsilon, \mathscr{F}, \Vert\cdot\Vert_{L_q(Q)}) \leq a \left(\frac{(QF^q)^{1/q}}{\epsilon}\right)^b, \end{equation} (2.9)

for any \epsilon > 0 and each probability measure Q such that QF^q < \infty . Several references provide sufficient conditions under which (2.9) holds, such as [144, Lemma 22], [72, §4.7.], [178, Theorem 2.6.7], [114, §9.1], [65, §3.2] and [33,34,45], offering further discussions.
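As a toy check of Definition 2.8, consider the class of half-lines \{(-\infty, t] : t \in \mathbb{R}\} on the real line: on any N points it picks out at most N + 1 distinct subsets, a polynomial bound P(N) = N + 1 , so it is a VC-class. A quick enumeration (the points are arbitrary):

```python
# Toy check that the half-lines {(-inf, t] : t in R} form a VC-class
# (Definition 2.8): on N points they pick out at most P(N) = N + 1 subsets.
points = [0.3, 1.2, 2.7, 4.1, 5.5]
thresholds = [-1.0] + [p + 0.01 for p in points]   # below min, then past each point
picked = {tuple(p <= t for p in points) for t in thresholds}
print(len(picked), "distinct subsets <=", len(points) + 1)
```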

    For the reader's convenience, we have compiled the essential assumptions as follows:

    Assumption 1. [Model and distribution assumptions]

    i) The process \{X_{i, n}\} is locally stationary and satisfies that for each time point u \in [0, 1] , there exists a stationary process \{X_{i}^{(u)}\} such that

    \begin{equation*} d_{\theta_i}\left(X_{i, n}, X_{i}^{(u)}\right) \leq \left(\left\lvert \frac{i}{n} - u\right\rvert + \frac{1}{n}\right)U_{i, n}^{(u)} \mathit{\text{a.s.}}, \end{equation*}

    with \mathbb{E}[(U_{i, n}^{(u)})^{\rho}] < C for some \rho > 0 , C < \infty .

    ii) Let B(x, h) be a ball centered at x \in \mathscr{H} with radius h , defined in Section 2.4, and let c_{d} < C_{d} be positive constants. For all \mathbf{u} \in [0, 1]^m ,

    \begin{equation} 0 < c_{d}\phi^m(h_n)f_{1}(\mathbf{x})\leq \mathbb{P}\left(\left(X_{i_1}^{(u_1)}, \ldots, X_{i_m}^{(u_m)}\right) \in \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, h)\right) = :F_{u, \boldsymbol{\theta}}(h;\mathbf{x})\leq C_{d}\phi^m(h_n)f_{1}(\mathbf{x}), \end{equation} (2.10)

where \phi(0) = 0 , \phi(u) is absolutely continuous in a neighborhood of the origin, f_{1}(\mathbf{x}) is a non-negative functional in \mathbf{x} \in \mathcal{H}^m , and

    \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, h) = \prod\limits_{i = 1}^m B_{\theta_i}(x_i, h).

    iii) There exist constants C_{\phi} > 0 and \varepsilon_{0} > 0 such that for any 0 < \varepsilon < \varepsilon_{0} ,

    \begin{align} \int_{0}^{\varepsilon}\phi(u)du > C_{\phi}\varepsilon \phi(\varepsilon). \end{align} (2.11)

iv) Let \psi(h_n) \to 0 as h_n \to 0 , and let f_{2}(\mathbf{x}) be a non-negative functional in \mathbf{x}: = (x_1, \ldots, x_m) \in \mathcal{H}^m such that

    \begin{equation} \nonumber \sup\limits_{\mathbf{i}\in I_n^m}\mathbb{P}\left(\left( (X_{i_1, n}, \ldots, X_{i_m, n}), (X_{i^\prime_1, n}, \ldots, X_{i^\prime_m, n})\right) \in \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, h)\times \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, h)\right) \leq \psi^m(h_n)f_{2}(\mathbf{x}). \end{equation}

    We will also assume that the ratio \psi(h_n)/\phi^{2}(h_n) is bounded.

    Assumption 2. [Kernel assumptions]

    i) K_{1}(\cdot) is a symmetric kernel around zero, bounded, and possesses compact support, i.e., K_{1}(v) = 0 for all \mid v \mid > C_{1} for some C_{1} < \infty . Additionally,

    \int K_{1}(z)dz = 1

    and K_{1}(\cdot) is Lipschitz continuous, i.e.,

    \lvert K_{1}(v_{1}) - K_{1}(v_{2})\rvert \leq C_{2} \lvert v_{1} - v_{2}\rvert

    for some C_{2} < \infty and all v_{1}, v_{2} \in \mathbb{R} .

ii) The kernel K_{2}(\cdot) is non-negative, bounded, and has compact support in [0, 1] , such that K_{2}(0) > 0 and K_{2}(1) = 0 ; for instance, the asymmetric triangular kernel K_2(x) = (1 - x) \mathbb{1}_{(x \in [0, 1])} . Moreover, K_{2}(\cdot) is Lipschitz continuous, i.e.,

    \lvert K_{2}(v_{1}) - K_{2}(v_{2})\rvert \leq C_{2} \lvert v_{1} - v_{2}\rvert.

Moreover, K^\prime_{2}(v) = dK_{2}(v)/dv exists on [0, 1] , and for two real constants -\infty < C^\prime_{2} < C^\prime_{1} < 0 , we have:

    C^\prime_{2}\leq K^\prime_{2}(v) \leq C^\prime_{1}.

    Assumption 3. [Smoothness]

i) r^{(m)}(\mathbf{u}, \mathbf{x}) is twice continuously partially differentiable with respect to \bf u . We also assume that

    \begin{equation} \sup\limits_{\mathbf{u}_1, \mathbf{u}_2 \in [0, 1]^m}\left\lvert{r}^{(m)}(\mathbf{u}_1, \mathbf{x}, {\boldsymbol{\theta}})-{r}^{(m)}(\mathbf{u}_2, \mathbf{z}, {\boldsymbol{\theta}})\right\rvert\leq c_{m} \left(d_{\mathcal{H}^m}\left(\mathbf{x}, \mathbf{z}\right)^{\alpha} + \|\mathbf{u}_1 - \mathbf{u}_2 \|^{\alpha}\right) \end{equation} (2.12)

    for some c_{m} > 0 , \alpha > 0 and \mathbf{x} = (x_1, \ldots, x_m), \mathbf{z} = (z_1, \ldots, z_m) \in\mathcal{H}^m.

    ii) \sigma:[0, 1]\times \mathcal{H}^m \to \mathbb{R} is bounded by some constant C_{\sigma} < \infty from above and by some constant c_{\sigma} > 0 from below, that is, for all \bf u and \bf x ,

    0 < c_{\sigma}\leq\sigma(\boldsymbol{\theta}, \mathbf u, \mathbf x)\leq C_{\sigma} < \infty.

    iii) \sigma(\cdot, \cdot, \cdot) is Lipschitz continuous with respect to \bf u .

    iv) \sup_{\mathbf{u} \in [0, 1]^m}\sup_{z:{d_{\mathcal{H}^m}(\bf x, \bf z)}\leq \varepsilon}\lvert\sigma(\boldsymbol{\theta}, \mathbf u, \mathbf x)-\sigma(\boldsymbol{\theta}, \mathbf{u}, \mathbf{z})\rvert = o(1) as \varepsilon \to 0 .

    Let {\mathfrak W}_{\mathbf{i}, \varphi, n} be an array of one-dimensional random variables. In this study, this array will be equal to {\mathfrak W}_{\mathbf{i}, \varphi, n} = 1 or {\mathfrak W}_{\mathbf{i}, \varphi, n} = \varepsilon_{\mathbf{i}, n} .

    Assumption 4. [Mixing]

    i) For \zeta > 2 and C < \infty , we have

    \sup\limits_{\mathbf{x} \in \mathcal{H}^m}\mathbb{E}\lvert {\mathfrak W}_{\mathbf{i}, n}\rvert^{\zeta} \leq C,

    and

    \sup\limits_{\mathbf{x} \in \mathcal{H}^m}\mathbb{E}\left[\lvert {\mathfrak W}_{\mathbf{i}, n}\rvert^{\zeta} \mid X_{\mathbf{i}, n} = \mathbf{x} \right]\leq C .

    ii) The \beta -mixing coefficients of the array \{X_{i, n}, {\mathfrak W}_{i, n}\} satisfy \beta(k) \leq Ak^{-\gamma} for some A > 0 and \gamma > 2 . Additionally, we assume that \delta+1 < \gamma(1-{\frac{2}{\nu}}) for some \nu > 2 and \delta > 1-{\frac{2}{\nu}} , along with the condition

    \begin{align} h^{2(1\wedge \alpha)-1}\left(\phi(h_n)a_{n} + \sum\limits_{k = a_{n}}^{\infty}k^{\delta}(\beta(k))^{1-{\frac{2}{\nu}}}\right) \to 0, \end{align} (2.13)

as n \to \infty , where a_{n} = \left\lfloor(\phi(h_n))^{-(1-{\frac{2}{\nu}})/\delta}\right\rfloor , and this holds for all \alpha > 0 .

    iii) For some \zeta_{0} > 0 , as n \to \infty , we have

    \frac{(\log n) ^{\frac{-m+\gamma+1}{2}+ \zeta_0(\gamma+1)}}{n ^{\frac{-m+\gamma+1}{2} -1 -\frac{\gamma+1}{ \zeta} }h_n^{\frac{m+\gamma+1}{2}} \phi(h_n)^{\frac{-m+\gamma+1}{2}}} \to 0.

    iv) Both n h^{2m+1}_n and n h^m\phi(h_n)^m tend to infinity as n goes to infinity.

Assumption 5. [Blocking assumptions] There exists a sequence of positive integers \{v_{n}\} satisfying v_{n}\to \infty , v_{n} = o(\sqrt{nh\phi(h_n)}) and \sqrt{\frac{n} { h\phi(h_n)}}\,\beta(v_{n}) \to 0 as n \to \infty .

    Assumption 6. [Class of functions assumptions]

The classes of functions \mathfrak{K}^m_\Theta and \mathscr{F}_m are such that:

    i) The class of functions \mathscr{F}_m is bounded and its envelope function satisfies for some 0 < M < \infty:

    \begin{equation*} F(\mathbf{y})\leq M, \qquad \mathbf{y}\in \mathcal{Y}^m. \end{equation*}

    ii) The class of functions \mathscr{F}_m\mathfrak{K}^m_\Theta is supposed to be of VC-type with envelope function previously defined. Hence, there are two finite constants b and \nu such that:

    \begin{equation*} N\left(\epsilon, \mathscr{F}_m\mathfrak{K}^m_\Theta, \Vert\cdot\Vert_{L_2(Q)}\right) \leq \left(\frac{ b\Vert F\kappa^m\Vert_{L_2(Q)}}{ \epsilon}\right)^\nu \end{equation*}

for any \epsilon > 0 and each probability measure Q such that QF^2 < \infty .

    iii) The class of functions \mathscr{F}_m is unbounded and its envelope function satisfies for some \zeta > 2:

    \begin{equation*} \theta_\zeta: = \sup\limits_{\mathbf{x}\in\mathcal{S}^m_\mathscr{H}} \mathbb{E}\left( F^\zeta(\mathbf{Y})\mid \mathbf{X} = \mathbf{x}\right) < \infty, \; \; \mathcal{S}^m\subset \mathcal{H}^m. \end{equation*}

    iv) The metric entropy of the class \mathscr{FK} satisfies, for some \; 1\leq \zeta < \infty :

    \begin{align*} &\int_{0}^{\infty}{(\log N(u, \mathscr{FK}, \Vert\cdot\Vert_\zeta))^{\frac{1}{2}}du} < \infty. \end{align*}

To establish the groundwork for our analysis, we draw inspiration from seminal works such as [78,87,116,133,179]. Our assumptions play a pivotal role in shaping the properties of the random processes under consideration.

Starting with Assumption 1, we formalize the local stationarity property of X_i and introduce conditions related to the distribution behavior of the variables. Equation (2.10) governs the small-ball probability around zero, representing the standard condition for small-ball probability. This equation implies that the small-ball probability can be approximated as the product of two independent functions \phi^m(\cdot) and {f}_1(\cdot) . For instance, when m = 1 , references such as [135] (diffusion process), [25] (Gaussian measure), [122] (general Gaussian process), and [133] (strongly mixing processes) provide context. The function \phi(\cdot) can take various forms, such as \phi(\epsilon) = \epsilon^{\delta}\exp(-C/\epsilon^{a}) for Ornstein-Uhlenbeck and general diffusion processes. Further examples and discussions can be found in [81] and Remark 3.6. Assumption 1 iv) details the behavior of the joint distribution near the origin, aligning with assumptions made by [87] in the context of density estimation for functional data.

Assumption 2 encompasses the kernel assumptions commonly used in nonparametric functional estimation. Notably, the Parzen symmetric kernel is inadequate due to the positivity of the random process D_i = d\left(x, X_i\right) ; thus, K_2(\cdot) with support [0, 1] is considered. The kernel K_2(\cdot) is a type II kernel belonging to the family of continuous kernels (triangle, quadratic, etc.). Compact support on [0, 1] is assumed for the kernels to derive an expression for the asymptotic variance. The Lipschitz-type assumptions on K_2(\cdot) and \sigma(\cdot, \cdot) (Assumption 2 ii) and Assumption 3 iii)) are crucial for obtaining the convergence rate. Assumption 3 restricts the growth of r^{(m)}(\cdot) and \sigma(\cdot) and places bounds on these functions to prevent rapid growth outside a large bound. It is tailored to ensure the convergence rate and forms an integral part of the overall analysis.

Assumption 4 ii) is a standard mixing condition necessary for establishing asymptotic normality and the asymptotic negligibility of the bias, consistent with [133]. The variables {\mathfrak W}_{\mathbf{i}, n} are not necessarily bounded, and there is a tradeoff between the decay of the mixing condition and the order {\zeta} of the moment \sup_{\mathbf{x} \in \mathcal{H}^m}\mathbb{E}\lvert {\mathfrak W}_{\mathbf{i}, n}\rvert^{\zeta} \leq C . Assumption 4 iii) and iv) are technical conditions crucial for obtaining the desired results, addressing the uniform convergence rate and the bias and convergence rate of the general estimator.

Assumption 6 asserts that the class of functions satisfies certain entropy conditions. Parts ii) and iii) are interconnected, with the former stating that the class is bounded. However, in the context of proving the functional central limit theorem for conditional U -processes indexed by an unbounded class of functions, part iii) supersedes the first one. Assumption 6 ii) ensures that \mathcal{F} is VC type with characteristics b and \nu for the envelope F\kappa^m . Since F \in L^2(\mathbb P) by Assumption 6, Dudley's criterion on the sample continuity of Gaussian processes implies that the function class \mathcal{F} is \mathbb P -pre-Gaussian.

These general assumptions, inclusive of the mentioned conditions, provide adequate flexibility given the diverse components in our main results. They encapsulate and leverage the topological structure of functional variables, the probability measure within the functional space, the concept of measurability applied to the function class, and the uniformity regulated by entropy properties.

    Remark 2.10. It is worth noting that Assumption 6 iii) can be replaced by more general hypotheses regarding the moments of Y , as discussed in [65]. The alternative assumption takes the following form:

iii)^{\prime} We introduce \{\mathcal{M}(x) : x \geq 0\} as a non-negative continuous function, increasing on [0, \infty) , and such that, for some s > 2 , eventually as x \rightarrow \infty :

    \begin{equation} x^{-s}\mathcal{M}(x) \downarrow; \quad x^{-1}\mathcal{M}(x) \uparrow . \end{equation} (2.14)

    For each t \geq \mathcal{M}(0) , we define \mathcal{M}^{inv}(t) \geq 0 such that \mathcal{M}(\mathcal{M}^{inv}(t)) = t . Additionally, we assume that:

    \mathbb E (\mathcal M(\lvert F( Y)\rvert )) < \infty.

    The following choices for \mathcal{M}(\cdot) are particularly interesting:

    (i) \mathcal{M}(x) = x^{\xi} for some \xi > 2 ;

    (ii) \mathcal{M}(x) = \exp (s x) for some s > 0 .

    These alternative formulations provide broader flexibility in defining the moments of Y , accommodating various scenarios and enhancing the applicability of the analysis.

    Before expressing the asymptotic behavior of our estimator represented in (2.5), we will generalize the study to a U -statistic estimator defined by:

    \begin{equation} \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} {\mathfrak W}_{\mathbf{i}, \varphi, n}, \end{equation} (3.1)

    where {\mathfrak W}_{\mathbf{i}, \varphi, n} is an array of one-dimensional random variables. In this study, we use the results with {\mathfrak W}_{\mathbf{i}, \varphi, n} = 1 and {\mathfrak W}_{\mathbf{i}, \varphi, n} = \varepsilon_{\mathbf{i}, n} .

    Note that \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) is a classical U -statistic with a kernel depending on n . We define

\begin{align*} \xi_k &: = \frac{1}{h_n} K_{1}\left(\frac{u_k-k/n}{h_n}\right), \\ H(Z_{1}, \ldots, Z_{m}) & : = \prod\limits_{k = 1}^m \frac{1}{ \phi(h_n)} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{k, n})}{h_n}\right) {\mathfrak W}_{\mathbf{i}, \varphi, n}, \end{align*}

    thus, the U -statistic in (3.1) can be viewed as a weighted U -statistic of degree m :

    \begin{equation} \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) = \frac{(n-m)!}{n!} \sum\limits_{\mathbf{i}\in I_n^m} \xi_{i_1} \ldots \xi_{i_m} H(Z_{i_1}, \ldots, Z_{i_m}). \end{equation} (3.2)

    We can write Hoeffding's decomposition in this case as in [94]. If we do not assume symmetry for {\mathfrak W}_{\mathbf{i}, \varphi, n} or H , we must define:

    ● The expectation of H(Z_{i_1}, \ldots, Z_{i_m}) :

    \begin{equation} \vartheta(\mathbf{i}) : = \mathbb{E} H(Z_{i_1}, \ldots, Z_{i_m}) = \int {\mathfrak W}_{\mathbf{i}, \varphi, n} \prod\limits_{k = 1}^m \frac{1}{ \phi(h_n)} K_{2}\left(\frac{d_{\theta_k}(x_k, \nu_{k, n})}{h_n}\right) d\mathbb{P}_{\mathbf{i}}(z_\mathbf{i}). \end{equation} (3.3)

    ● For each \ell \in \{1, \ldots, m\} , indicating the position of the argument, construct the insertion map \pi_\ell such that:

    \begin{equation} \pi_\ell(z; z_1, \ldots, z_{m-1}): = (z_1, \ldots, z_{\ell-1}, z, z_{\ell}, \ldots, z_{m-1}). \end{equation} (3.4)

    ● Define:

    \begin{align} H^{(\ell)}(z ; z_{1}, \ldots, z_{m-1}) &: = H\{\pi_{\ell}(z ; z_{1}, \ldots, z_{m-1})\} \end{align} (3.5)
    \begin{align} \vartheta^{(\ell)}(i ; i_{1}, \ldots, i_{m-1}) &: = \vartheta\{\pi_{\ell}(i ; i_{1}, \ldots, i_{m-1})\} . \end{align} (3.6)

    Hence, the first-order term in the expansion of H(\cdot) can be written as:

    \begin{eqnarray} \widetilde{H}^{(\ell)}(z) &: = &\mathbb{E}\{H^{(\ell)}(z, Z_1, \ldots, Z_{m-1})\} \\ & = & \int {\mathfrak W}_{( {1}, \ldots, \ell-1, i , \ell, \ldots, {m-1})} \prod\limits_{\underset{k \neq i}{k = 1}}^{m-1} \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \times \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_i}(x_i, \nu_i)}{h_n}\right) \\ && \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \\ & = & \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, x)}{h_n}\right) \times w \times \int {\mathfrak W}_{( {1}, \ldots, \ell-1 , \ell, \ldots, {m-1})} \prod\limits_{\underset{k \neq i}{k = 1}}^{m-1} \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \\ && \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}), \end{eqnarray} (3.7)

    with \mathbb{P} as the underlying probability measure, and define

    \begin{equation} f^{(\ell)}_{i, i_{1}, \ldots, i_{m-1}} : = \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} (\widetilde{H}^{(\ell)}(z) - \vartheta^{(\ell)}(i ; i_{1}, \ldots, i_{m-1}) ). \end{equation} (3.8)

    Then, the first-order projection can be defined as:

    \begin{equation} \widehat {H}_{1, i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) : = \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} f^{(\ell)}_{i, i_{1}, \ldots, i_{m-1}}, \end{equation} (3.9)

    where

    I_{n-1}^{m-1}(-i) : = \left\{1 \leq i_{1} < \ldots < i_{m-1}\leq n \; \mbox{ and }\; i_j \neq i \; \mbox{for all}\; j \in \{1, \ldots, m-1\}\right\}.

    For the remainder terms, we denote by \mathbf{i} \backslash i_\ell : = (i_1, \ldots, i_{\ell-1}, i_{\ell+1}, \ldots, i_m) and for \ell \in \{1, \ldots, m\} , let

    \begin{equation} H_{2, \mathbf{i}}(\boldsymbol{z}) : = H(\boldsymbol{z}) - \sum\limits_{l = 1}^m \widetilde{H}^{(\ell)}_{\mathbf{i}\backslash i_\ell}(z_\ell) +(m-1)\vartheta(\mathbf{i}), \end{equation} (3.10)

    where

    \widetilde{H}^{(\ell)}_{\mathbf{i}\backslash i_\ell}(z_\ell) = \mathbb{E}\{H( Z_1, \ldots, Z_{\ell-1}, z_\ell, Z_{\ell}, \ldots, Z_{m-1})\},

    defined as in (3.7). This projection leads to the following remainder term:

    \begin{equation} \widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) : = \frac{(n-m)!}{n!} \sum\limits_{\mathbf{i}\in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H_{2, \mathbf{i}}(\boldsymbol{z}). \end{equation} (3.11)

    Finally, using Eqs (3.9) and (3.11), and under the conditions that:

    \begin{align} \mathbb{E}\{\widehat{H}_{1, i}(\mathbf u, X, \boldsymbol{\theta}, \varphi)\} & = 0, \end{align} (3.12)
    \begin{align} \mathbb{E}\{ H_{2, \mathbf{i}}(\boldsymbol{Z}) \mid Z_k\} & = 0 \; \mbox{a.s.}, \end{align} (3.13)

    we obtain the decomposition of [99]:

    \begin{align} &\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)- \mathbb{E}\{\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)\} \\ & = \frac{1}{n} \sum\limits_{i = 1}^n \widehat{H}_{1, i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)+ \widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) \\ & = : \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) + \widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi). \end{align} (3.14)

    For more details, the interested reader can refer to [94, Lemma 2.2].
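    As a sanity check on the decomposition (3.14), the following sketch verifies numerically, for the toy symmetric kernel H(x, y) = xy with i.i.d. Gaussian data (an illustrative example, not the estimator of this paper), that the Hájek (linear) part dominates while the remainder is of smaller order:

```python
import numpy as np

def hoeffding_demo(n=2000, mu=1.0, seed=1):
    """Toy check of (3.14) for the degree-2 kernel H(x, y) = x*y with
    Z_i ~ N(mu, 1): theta = mu^2, first projection h1(z) = mu*(z - mu),
    and U_n - theta = linear part + degenerate remainder."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(mu, 1.0, n)
    S = Z.sum()
    U = (S**2 - np.sum(Z**2)) / (n * (n - 1))   # mean of Z_i Z_j, i != j
    theta = mu**2
    linear = 2.0 * np.mean(mu * (Z - mu))       # Hajek projection part
    remainder = U - theta - linear              # degenerate part
    return linear, remainder

lin, rem = hoeffding_demo()
print(f"linear: {lin:.2e}, remainder: {rem:.2e}")
```

    For n of a few thousand, the linear part is typically an order of magnitude larger than the remainder, consistent with the O_{\mathbb P}(n^{-1/2}) and O_{\mathbb P}(n^{-1}) rates.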

    We commence by presenting the following general proposition.

    Proposition 3.1. Let \mathscr{F}_m\mathfrak{K}^m_\Theta denote a measurable VC-subgraph class of functions, adhering to Assumption 6. Suppose that Assumptions 1–4 are satisfied. In that case, the following result holds:

    \begin{align*} {\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} }\sup\limits_{\mathbf{u} \in [0, 1]^m}\left\lvert\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) - \mathbb E[\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)]\right\rvert & = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m_n\phi^m(h_n)}}\right). \end{align*}

    The proof of Proposition 3.1 is deferred to Section 4.1.

    Remark 3.2. Elaborating on Proposition 3.1, we can delve into the uniform convergence rate of the kernel estimator \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) . It is crucial to emphasize that when m = 1 and the function \varphi remains constant, the outcomes align with the pointwise convergence rate of the regression function for a strictly stationary functional time series, as discussed in [78].

    The subsequent theorem presents the uniform convergence rate of the kernel estimator (2.5).

    Theorem 3.3. Let \mathscr{F}_m\mathfrak{K}^m_\Theta be a measurable VC-subgraph class of functions complying with Assumption 6. Suppose Assumptions 1–4 are fulfilled. Then, we have:

    \begin{eqnarray} &&\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in [C_{1}h, 1-C_{1}h]^m} \left\lvert\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert \\&&\qquad \qquad\qquad \qquad = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m_n\phi^m(h_n)}} + h^{2m\wedge\alpha}_n\right). \end{eqnarray} (3.15)

    The proof of Theorem 3.3 is postponed until Section 4.1.

    Remark 3.4. It is possible to consider the setting of \Theta = \Theta_n and assume that

    \operatorname{card}\left(\Theta_n\right) = n^\alpha \quad \text { with } \quad \alpha > 0

    and

    \forall \theta \in \Theta_n, \quad\langle\theta-\theta_0, \theta-\theta_0\rangle^{1 / 2} \leq C_7 b_n ,

    where b_n tends to zero, as in [145].

    Remark 3.5. In contrast to Theorem 4.2 in [179] and akin to Theorem 3.1 in [116], our formulation excludes the bias term arising from the approximation error of X_{i, n} by X_{i}^{(u)} . Under our assumptions, this approximation error is negligibly small compared to h^{2m \wedge \alpha}_n .

    Remark 3.6. In nonparametric problems, the infinite dimensionality of the target function is typically determined by the smoothness condition, specifically in Assumptions 2 i) and 3 i). This primarily affects the bias component of the convergence rates, represented by terms like O\left(h^{2m\wedge\alpha}_n\right) in Theorem 3.3. Other terms in the convergence rates stem directly from dispersion effects and are inherently linked to the concentration properties of the probability measure of the variable X . These terms can be expressed as

    O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m_n\phi^m(h_n)}} \right),

    where small-ball probabilities are quantified and controlled by means of the function \phi(\cdot) defined in (2.4). The rate of convergence is influenced by the concentration of the measure of the process X ; less concentration leads to a slower rate of convergence. Unfortunately, explicit expressions for \mathbb{P}\left(X \in B(x, r)\right) are known for very few random variables (or processes) X , even when x = 0 . In certain functional spaces, considering x \neq 0 introduces considerable difficulties that might not be surmountable. Authors often focus on Gaussian random elements, and for a comprehensive overview of the main results on small-ball probability, refer to [122]. In many scenarios, it is convenient to assume that

    \begin{equation} \mathbb{P}\left(X \in B(x , r)\right) \sim \psi(x) \phi(r) \quad as\; r \rightarrow 0, \end{equation} (3.16)

    where, to ensure the identifiability of the decomposition, a normalizing restriction is necessary, such as

    \mathbb{E}[\psi(X)] = 1 .

    The factorization (3.16) is not a stringent assumption; it holds under appropriate hypotheses (see, for instance, [27,122]). The advantage of assuming (3.16) is two-fold. First, the function \psi(x) can be viewed as a surrogate density of the functional random element X and can be leveraged in various contexts; the interested reader can explore its potential in works like [26,83,87], where the surrogate density is estimated and used to define a notion of mode or for classification purposes. Second, the function \phi(h_n) acts as the volumetric term and can be employed to assess the complexity of the probability law of the process X (see [28]). In the special multi-(but finite)-dimensional scenario where \mathbf{X} \in \mathbb{R}^d , the relation (2.4) is satisfied under standard assumptions with \phi_x(h_n) \sim C_x h_n^d , commonly known as the curse of dimensionality (see [77,80]). Here, we would instead refer to it as the curse of infinite dimension, specifically highlighting the effects of small-ball probabilities. The inherent nature of these probability effects involving small balls becomes apparent within our infinite-dimensional framework.

    The remainder of this remark focuses on applying our approach to various continuous-time processes for which the small-ball probabilities have already been identified. For more details on the following examples, refer to [80].

    (i) Consider the space \mathcal{C}([0, 1], \mathbb{R}) equipped with the supremum norm, and its associated Cameron-Martin space

    \mathcal{F} = \mathcal{C}([0, 1], \mathbb{R})^{\mathrm{CM}}.

    Let us examine the fractional Brownian motion \zeta^{\mathrm{FBM}} with parameter \delta, 0 < \delta < 2 . Small-ball probabilities in this context have been extensively studied. According to [122, Theorems 3.1 and 4.6], we have

    \forall x_0 \in \mathcal{F}, \quad C_{x_0}^{\prime} \mathrm{e}^{-h^{-2 / \delta}} \leqslant \mathbb P\left(\left\|\zeta^{\mathrm{FBM}}-x_0\right\|_\infty \leqslant h\right) \leqslant C_{x_0} \mathrm{e}^{-h^{-2 / \delta}}.

    Notably, our crucial relation (2.4) is trivially satisfied for the fractional Brownian motion by choosing the function \phi(\cdot) in the form

    \phi_x^{\mathrm{FBM}}(h_n) \sim C_x \mathrm{e}^{-h_n^{-2 / \delta}}.

    (ii) Consider a centered Gaussian process \zeta^{\mathrm{GP}} = \left\{\zeta_t^{\mathrm{GP}}, 0 \leqslant t \leqslant 1\right\} . This process can be expressed using the Karhunen-Loève decomposition as follows:

    \zeta_t^{\mathrm{GP}} = \sum\limits_{i = 1}^{\infty} \sqrt{\lambda_i} {\mathfrak W}_i f_i(t),

    where \lambda_i are the eigenvalues of the covariance operator of \zeta^{\mathrm{GP}} , f_i are the associated orthonormal eigenfunctions, and {\mathfrak W}_i are independent standard normal real random variables. For any fixed k \in \mathbb{N}^* , let \Pi_k be the orthogonal projection onto the subspace spanned by the eigenfunctions \left\{f_1, \ldots, f_k\right\} . Define a semi-metric by

    d^2(x, y) = \int_0^1\left[\Pi_k(x-y)(t)\right]^2 d t.

    Using the Karhunen-Loève expansion, we obtain

    d^2\left(\zeta^{\mathrm{GP}}, x\right) = \sum\limits_{i = 1}^k\left(\sqrt{\lambda_i} {\mathfrak W}_i-x_i\right)^2 = \sum\limits_{i = 1}^k Z_i^2,

    where x_i = \int_0^1 x(t) f_i(t) \mathrm{d} t , and Z_i are the components of the vector Z = \left(Z_1, \ldots, Z_k\right) , exhibiting the Euclidean norm structure on \mathbb{R}^k . Due to the independence of Z_i with densities with respect to the Lebesgue measure, we have

    \mathbb P\left(d^2\left(\zeta^{\mathrm{GP}}, x\right) < h\right) \sim C_x h^k.

    (iii) Consider the space \mathcal{C}([0, 1], \mathbb{R}) of continuous real-valued functions on [0, 1] , equipped with the supremum norm denoted by \|\cdot\|_{\infty} . Let \mathbb P^{\mathrm{W}} be the Wiener measure on \mathcal{C}([0, 1], \mathbb{R}) , and define the Cameron-Martin space of \mathcal{C}([0, 1], \mathbb{R}) as

    \mathcal{F} = \mathcal{C}([0, 1], \mathbb{R})^{\mathrm{CM}}.

    Consider the Ornstein-Uhlenbeck process \zeta^{\mathrm{OU}} with \zeta_0^{\mathrm{OU}} = 0 and

    \mathrm{d} \zeta_t^{\mathrm{OU}} = \mathrm{d} {\mathfrak W}_t-\frac{1}{2} \zeta_t^{\mathrm{OU}}\, \mathrm{d} t, \quad \forall t, \; 0 < t \leqslant 1.

    The Wiener measure of small centered balls is known to be of the form [25, p. 187]:

    \mathbb P^{\mathrm{W}}\left(\|x\|_{\infty} \leqslant h\right) \sim \frac{4}{\pi} \mathrm{e}^{-\pi^2 /(8 h^2)}.

    By extending this result to any small-ball probability measure through the Cameron-Martin space characterization, we have

    \forall x_0 \in \mathcal{F}, \quad \mathbb P^{\mathrm{W}}\left(\left\|x-x_0\right\|_{\infty} \leqslant h\right) \sim C_{x_0} \mathrm{e}^{-\pi^2 /(8 h^2)}.

    Since the Ornstein-Uhlenbeck process has a probability measure absolutely continuous with respect to \mathbb P^{\mathrm{W}} , we can directly state

    \forall x_0 \in \mathcal{F}, \quad \mathbb P\left(\zeta^{\mathrm{OU}} \in \mathcal{B}\left(x_0, h\right)\right) \sim C_{x_0} \mathrm{e}^{-\pi^2 /(8 h^2)}.

    Our crucial relation (2.4) is trivially satisfied for this Ornstein-Uhlenbeck process by choosing the function \phi_x(\cdot) in the form

    \phi_x^{\mathrm{OU}}(h_n) \sim C_x \mathrm{e}^{-\pi^2 /(8 h_n^2)}.
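    The Wiener small-ball asymptotic invoked in example (iii) is easy to check by simulation. The following sketch (hypothetical names; discretizing Brownian motion on a grid, which slightly overstates the probability) compares a Monte Carlo estimate of \mathbb P(\|W\|_{\infty} \leqslant h) with the limit (4/\pi)\mathrm{e}^{-\pi^2/(8h^2)} :

```python
import numpy as np

def wiener_small_ball_mc(h, n_paths=5000, n_steps=500, seed=0):
    """Monte Carlo check of P(||W||_inf <= h) ~ (4/pi) exp(-pi^2/(8 h^2)).
    The discrete-time maximum slightly overstates the probability."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(incr, axis=1)                 # Brownian paths on a grid
    return np.mean(np.all(np.abs(W) <= h, axis=1))

h = 0.8
print(wiener_small_ball_mc(h), 4.0 / np.pi * np.exp(-np.pi**2 / (8.0 * h**2)))
```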

    In this section, we are interested in studying the weak convergence of the conditional U -processes under absolute regular observations. Observe that

    \begin{array}{l} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \\ \;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\,\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right) \\ \;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}^{B}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right), \end{array} (4.1)

    where

    \begin{equation*} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}, \end{equation*}
    \begin{equation*} \widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} {\mathfrak W}_{\mathbf{i}, \varphi, n}, \end{equation*}
    \begin{equation*} \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} r^{(m)}\left(\frac{\mathbf{i}}{n} , X_{i, n}, \boldsymbol{\theta}\right), \end{equation*}

    and where \widehat{g}^{B}(\boldsymbol{\theta}, \mathbf u, \mathbf x) : = \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\,\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) collects the bias contribution.

    Under the same assumptions as in Theorem 3.3, we will show in the next theorem that

    \operatorname{Var}\left(\widehat{g}^{B}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) = o\left(\frac{1 }{ n h^m\phi_{\mathbf x, \boldsymbol\theta}(h_n)}\right)

    and

    1/ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = O_{ \mathbb{P}}(1).

    Then, we have

    \begin{align*} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) & = \frac{\widehat{g}_{1}(\boldsymbol{\theta}, \mathbf u, \mathbf x) }{ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} + B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) + o_{ \mathbb{P}}\left(\sqrt{\frac{1}{nh^m\phi_{\mathbf x, \boldsymbol\theta}(h_n)}}\right), \end{align*}

    where

    B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \mathbb{E}[\widehat{g}^{B}(\boldsymbol{\theta}, \mathbf u, \mathbf x)]/ \mathbb{E}[ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})],

    is the "bias" term and \frac{\widehat{g}_{1}(\boldsymbol{\theta}, \mathbf u, \mathbf x)}{ \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} is the "variance" term. Let us define, for \varphi_1, \varphi_2\in \mathcal F_m

    \begin{aligned} \sigma(\varphi_1, \varphi_2) = \lim\limits_{n \rightarrow \infty}nh^m\phi_{\mathbf x, \boldsymbol\theta}^{1/m}(h_n)\, \mathbb E\Big[ & \left(\widetilde{r}_{n}^{(m)}(\varphi_1, \mathbf{u}, \mathbf{x};h_{n}) - r^{(m)}(\varphi_1, \mathbf{u}, \mathbf{x})\right)\\ &\quad\;\;\times\left(\widetilde{r}_{n}^{(m)}(\varphi_2, \mathbf{u}, \mathbf{x};h_{n}) - r^{(m)}(\varphi_2, \mathbf{u}, \mathbf{x})\right)\Big]. \end{aligned} (4.2)

    In the following, we set K_{2}(\cdot) to be the asymmetrical triangle kernel, that is, K_{2}(x) = (1-x) \mathbb 1_{(x \in [0, 1])} , to simplify the proof. The main results of this section are given in the following theorems.

    Theorem 4.1. Let \mathscr{F}_m\mathfrak{K}^m_\Theta be a measurable VC-subgraph class of functions, and assume that all the assumptions of Section 2.8 are satisfied for both cases {\mathfrak W}_{\mathbf{i}, \varphi, n} = 1 and {\mathfrak W}_{\mathbf{i}, \varphi, n} = \varepsilon_{\mathbf{i}, n} . Then, as n\to \infty , the U -process

    \sqrt{nh^m\phi_{\mathbf x, \boldsymbol\theta}^{1/m}(h_n)}\left(\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})-B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right)

    converges to a Gaussian process \mathfrak G_{n} over \mathscr{F}_m\mathfrak{K}^m_\Theta , whose sample paths are bounded and uniformly continuous with respect to the \Vert\cdot\Vert_2 -norm, with the covariance function given in (4.2).

    The proof of Theorem 4.1 is postponed until Section 4.1.

    To examine the weak convergence of our estimator using the standard procedure, involving Hoeffding decomposition, finite-dimensional convergence, and equicontinuity, we can turn to the following theorem. In the proof of this theorem, we express the conditional U -process in terms of a U -process based on a stationary sequence, illustrating its convergence to a Gaussian process. This convergence is established in the distribution sense within l^{\infty}(\mathscr{F}_m\mathfrak{K}^m_\Theta) , the space of bounded real functions on \mathscr{F}_m\mathfrak{K}^m_\Theta , as defined in [100]. For further details, refer to [6,71], or [178].

    Theorem 4.2. Assume \mathscr{F}_m\mathfrak{K}^m_\Theta is a measurable VC-subgraph class of functions, and all assumptions in Section 2.8 are satisfied. If in addition n\phi_{\mathbf x, \boldsymbol\theta}^{1/m}(h_n)h^{m+2(2m\wedge\alpha)}\rightarrow 0 as n\to \infty , then we have

    \sqrt{nh^m\phi_{\mathbf x, \boldsymbol\theta}^{1/m}(h_n)}\left(\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right)

    converges in law to a Gaussian process \left\{ \mathbb{G}(\psi):\psi\in\mathscr{F}_m\mathfrak{K}^m_\Theta\right\} in l^{\infty}(\mathscr{F}_m\mathfrak{K}^m_\Theta) that admits a version with uniformly bounded and uniformly continuous paths with respect to the \Vert\cdot\Vert_2 -norm, and whose covariance function is given in (4.2).

    The proof of Theorem 4.2 is deferred to Section 8.

    Remark 4.3. To eliminate the bias term, it is necessary to have n\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)h^{m+2(2m\wedge\alpha)}\rightarrow 0 as n \rightarrow \infty . As a consequence, the last condition, along with nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)\rightarrow \infty , holds as long as h_n = n^{-\xi} and \phi_{\mathbf x, \boldsymbol\theta}(h_n) = h_n^{mc} , where 0 < c < 1-{\frac{1}{\xi m}} and \frac{1}{m(1+c)+ 2(2m\wedge \alpha)} < \xi < \frac{1}{m(1-c)} .

    Remark 4.4. The validity of the results remains intact even when the entropy condition is replaced with a bracketing condition; in particular, it suffices that there exist constants C_0 > 0 and v_0 > 0 for which the corresponding bracketing inequality holds. In our framework, the choice of the kernel function is flexible, with minimal restrictions, as long as some mild conditions are satisfied. However, the selection of the bandwidth introduces challenges, and it is crucial for achieving a favorable rate of consistency. The bandwidth choice significantly impacts the bias-variance trade-off of the estimator. Therefore, adopting a bandwidth that adjusts based on specific criteria, available data, and location is more suitable. A discussion of this topic can be found in [44,46,132]. Establishing uniform-in-bandwidth central limit theorems in our context would be particularly interesting.

    Remark 4.5. We can consider the scenario where \Theta = \Theta_n , where \Theta_n satisfies the conditions \operatorname{card}\left(\Theta_n\right) = n^\alpha with \alpha > 0 , and for every \theta \in \Theta_n , we have

    \langle\theta-\theta_0, \theta-\theta_0\rangle^{1 / 2} \leq C_7 b_n,

    where b_n converges to zero, as discussed in [145].

    The functional directions set, \Theta_n , is constructed following a similar approach to [2,145], as outlined below:

    (i) Each direction \theta \in \Theta_n is derived from a d_n -dimensional space formed by \mathrm{B} -spline basis functions, denoted by \left\{e_1(\cdot), \ldots, e_{d_n}(\cdot)\right\} . Thus, we express directions as:

    \begin{eqnarray} \theta(\cdot) = \sum\limits_{j = 1}^{d_n} \alpha_j e_j(\cdot) \text { where }\left(\alpha_1, \ldots, \alpha_{d_n}\right) \in \mathcal{V}. \end{eqnarray} (4.3)

    (ii) The set of coefficient vectors in (4.3), denoted by \mathcal{V} , is generated through the following steps:

    Step 1. For each \left(\beta_1, \ldots, \beta_{d_n}\right) \in \mathcal{C}^{d_n} , where \mathcal{C} = \left\{c_1, \ldots, c_J\right\} \subset \mathbb{R}^J represents a set of J 'seed-coefficients', construct the initial functional direction as

    \theta_{\text {init }}(\cdot) = \sum\limits_{j = 1}^{d_n} \beta_j e_j(\cdot) .

    Step 2. For each \theta_{\text {init }} from Step 1 satisfying \theta_{\text {init }}\left(t_0\right) > 0 , where t_0 denotes a fixed value in the domain of \theta_{\text {init }}(\cdot) , compute \langle\theta_{\text {init }}, \theta_{\text {init }}\rangle and form \left(\alpha_1, \ldots, \alpha_{d_n}\right) = \left(\beta_1, \ldots, \beta_{d_n}\right) /\langle\theta_{\text {init }}, \theta_{\text {init }}\rangle^{1 / 2} .

    Step 3. Define \mathcal{V} as the collection of vectors (\alpha_1, \ldots, \alpha_{d_n}) obtained in Step 2. Consequently, the final set of permissible functional directions is represented as

    \Theta_n = \left\{\theta(\cdot) = \sum\limits_{j = 1}^{d_n} \alpha_j e_j(\cdot) ;\left(\alpha_1, \ldots, \alpha_{d_n}\right) \in \mathcal{V}\right\}.
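    The construction of \Theta_n in Steps 1–3 is straightforward to implement. The following sketch (hypothetical function names; quadratic B-splines via scipy and a trapezoidal quadrature for \langle\theta_{\text{init}}, \theta_{\text{init}}\rangle are illustrative choices) builds the coefficient set \mathcal{V} :

```python
import numpy as np
from itertools import product
from scipy.interpolate import BSpline

def bspline_basis(d_n, degree=2, grid=np.linspace(0.0, 1.0, 201)):
    """Evaluate d_n clamped B-spline basis functions on a grid over [0, 1]."""
    breaks = np.linspace(0.0, 1.0, d_n - degree + 1)
    knots = np.concatenate(([0.0] * degree, breaks, [1.0] * degree))
    basis = np.empty((d_n, grid.size))
    for j in range(d_n):
        coef = np.zeros(d_n)
        coef[j] = 1.0
        basis[j] = BSpline(knots, coef, degree)(grid)
    return grid, basis

def build_Theta_n(seeds, d_n, t0=0.5):
    """Steps 1-3: seed coefficients -> sign restriction at t0 ->
    normalization by <theta_init, theta_init>^{1/2}."""
    grid, basis = bspline_basis(d_n)
    i0 = int(np.searchsorted(grid, t0))
    coeffs = []
    for beta in product(seeds, repeat=d_n):            # Step 1
        beta = np.asarray(beta, dtype=float)
        theta_init = beta @ basis
        if theta_init[i0] <= 0.0:                      # Step 2
            continue
        sq = theta_init ** 2                           # trapezoid rule
        norm2 = float(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(grid)))
        if norm2 > 0.0:
            coeffs.append(beta / np.sqrt(norm2))       # Step 3
    return np.array(coeffs)

Theta_n = build_Theta_n(seeds=(-1.0, 0.0, 1.0), d_n=3)
print(Theta_n.shape)   # one row of alpha-coefficients per direction
```

    The cardinality of \Theta_n is at most J^{d_n} and grows quickly, so in practice the number of seed-coefficients J and the dimension d_n are kept small.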

    Although only the following examples are presented in this section, they serve as prototypes for a range of problems that can be treated in a comparable manner.

    Now, we apply the results to the discrimination problem described in Section 3 of [168], also referring to [167]. We will employ similar notation and settings. Let \varphi(\cdot) be any function taking at most finitely many values, say 1, \ldots, M . The sets

    A_{j} = \left\{(y_{1}, \ldots, y_{m}): \varphi(y_{1}, \ldots, y_{m}) = j\right\}, \; \; 1\leq j\leq M,

    then yield a partition of the feature space. Predicting the value of \varphi(Y_{1}, \ldots, Y_{m}) is tantamount to predicting the set in the partition to which (Y_{1}, \ldots, Y_{m}) belongs. For any discrimination rule g , we have

    \mathbb{P}(g(\mathbf{ X}, \boldsymbol{\theta}) = \varphi(\mathbf{ Y})) = \sum\limits_{j = 1}^{M}\int_{\{\mathbf{ x}:g(\mathbf{ x}) = j\}} \mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{ x}, \boldsymbol{\theta})d\mathbb{P}(\mathbf{ x}) \leq \int \max\limits_{1\leq j\leq M} \mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{ x}, \boldsymbol{\theta})d\mathbb{P}(\mathbf{ x}),

    where

    \mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{ x}, \boldsymbol{\theta}) = \mathbb{P}(\varphi(\mathbf{ Y}_{\mathbf i}) = j\mid \langle\mathbf{ X}_{\mathbf i}, \boldsymbol{\theta}\rangle = \langle\mathbf{ x}, \boldsymbol{\theta}\rangle), \; \; \mathbf{ x}\in\mathcal{H}^m.

    The above inequality becomes equality if

    \mathfrak G_{0}(\mathbf{ x}, \boldsymbol{\theta}) = \arg \max\limits_{1\leq j\leq M}\mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{ x}, \boldsymbol{\theta}).

    \mathfrak G_{0}(\cdot) is called the Bayes rule, and the pertaining probability of error

    \mathbf{ L}^{*} = 1-\mathbb{P}(\mathfrak G_{0}(\mathbf{ X}, \boldsymbol{\theta}) = \varphi(\mathbf{ Y})) = 1-\mathbb{E}\left\{\max\limits_{1\leq j\leq M}\mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{ X}, \boldsymbol{\theta})\right\}

    is called the Bayes risk. Each of the unknown functions \mathfrak{M}^{j} can be consistently estimated by one of the methods discussed in the preceding sections. Let, for 1\leq j\leq M ,

    \begin{eqnarray} \mathfrak{M}^{j}_{n}(\mathbf u, \mathbf{ x}, \boldsymbol{\theta}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m} \mathbb{1}\{ \varphi({ Y}_{i_{1}}, \ldots, { Y}_{i_{m}}) = j\}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_{k}, X_{i_k, n})}{h_n}\right)\right\}}. \end{eqnarray}

    Set

    \mathfrak G_{0, n}(\mathbf{ x}, \boldsymbol{\theta}) = \arg \max\limits_{1\leq j\leq M}\mathfrak{M}^{j}_{n}(\frac{\mathbf i}{n}, \mathbf{ x}, \boldsymbol{\theta}).

    Let us introduce

    \mathbf{ L}^{*}_{n} = \mathbb{P}(\mathfrak G_{0, n}(\mathbf{ X}, \boldsymbol{\theta})\neq \varphi(\mathbf{ Y})).

    Then, one can show that the discrimination rule \mathfrak G_{0, n}(\cdot) is asymptotically Bayes' risk consistent

    \mathbf{ L}^{*}_{n} \rightarrow \mathbf{ L}^{*}.

    This follows readily from the relation:

    \mid \mathbf{ L}^{*}-\mathbf{ L}^{*}_{n}\mid \leq 2 \mathbb E\left[\max _{1 \leq j \leq M}\mid\mathfrak{M}^{j}_{n}(\frac{\mathbf i}{n}, \mathbf{X}, \boldsymbol{\theta})-\mathfrak{M}^{j}(\frac{\mathbf i}{n}, \mathbf{X}, \boldsymbol{\theta})\mid\right].
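    A minimal sketch of the plug-in rule \mathfrak G_{0, n} for m = 1 follows (hypothetical names and illustrative kernels; the normalizing denominator is common to all labels, cancels in the argmax, and is therefore omitted):

```python
import numpy as np

def bayes_rule_estimate(u, x, X, labels, d, h, M):
    """Sketch of G_{0,n} for m = 1: kernel-weighted label frequencies
    M^j_n followed by an argmax over j = 1, ..., M."""
    n = len(X)
    K1 = lambda t: 0.75 * (1.0 - t**2) * (abs(t) <= 1.0)
    K2 = lambda t: (1.0 - t) * (0.0 <= t <= 1.0)
    w = np.array([K1((u - (i + 1) / n) / h) * K2(d(x, X[i]) / h)
                  for i in range(n)])
    labels = np.asarray(labels)               # labels take values 1..M
    scores = np.array([w[labels == j].sum() for j in range(1, M + 1)])
    return 1 + int(np.argmax(scores))
```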

    Metric learning, a field that has garnered significant attention in recent years, revolves around adapting the metric to the underlying data. Extensive discussions on metric learning and its applications are available in [19,54]. This concept has proven valuable in diverse domains, including computer vision, information retrieval, and bioinformatics. To demonstrate the practicality of metric learning, we delve into the metric learning problem for supervised classification outlined in [54]. Consider independent copies \left(X_{1}, Y_{1}\right), \ldots, \left(X_{n}, Y_{n}\right) of a \mathscr{H}\times \mathcal{Y} -valued random couple (X, Y) , where \mathscr{H} is a feature space, and \mathcal{Y} = \{1, \ldots, C\} (with C \geq 2 ), representing a finite set of labels. Let \mathcal{D} be a set of distance measures D: \mathscr{H} \times \mathscr{H} \rightarrow \mathbb{R}_{+} . In this context, the objective of metric learning is to find a metric under which pairs of points with the same label are close to each other, while those with different labels are far apart. The natural way to define the risk of a metric D is given by:

    \begin{equation} R(D) = \mathbb{E}\left[\phi\left(\left(1-D\left(X, X^{\prime}\right)\right) \cdot\left(2 \; \mathbb{1}_{\left\{Y = Y^{\prime}\right\}}-1\right)\right)\right], \end{equation} (5.1)

    where \phi(u) is a convex loss function upper-bounding the indicator function \mathbb{1}\{u \geq 0\} , for instance, the hinge loss \phi(u) = \max (0, 1-u) . To estimate R(D) , we consider the natural empirical estimator:

    \begin{equation} R_{n}(D) = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n} \phi\left(\left(D\left(X_{i}, X_{j}\right)-1\right) \cdot\left(2 \; \; \mathbb{1}_{\left\{Y_{i} = Y_{j}\right\}}-1\right)\right), \end{equation} (5.2)

    which is a one-sample U -statistic of degree two with a kernel given by:

    \varphi_{\mathrm{D}}\left((x, y), \left(x^{\prime}, y^{\prime}\right)\right) = \phi\left(\left(D\left(x, x^{\prime}\right)-1\right) \cdot\left(2 \; \; \mathbb{1}_{\left\{y = y^{\prime}\right\}}-1\right)\right) .

    The convergence to (5.1) of a minimizer of (5.2) has been investigated within the frameworks of algorithmic stability [107], algorithmic robustness [18], and the theory of U -processes under suitable regularization [50].
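    A direct implementation of the empirical risk (5.2) is immediate. The sketch below (hypothetical names; the distance measure D is user-supplied) uses the hinge loss:

```python
import numpy as np

def empirical_metric_risk(X, y, D, phi=lambda u: np.maximum(0.0, 1.0 - u)):
    """Sketch of the empirical risk R_n(D) in (5.2): a one-sample
    U-statistic of degree two, here with the hinge loss phi."""
    n = len(X)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            sign = 2.0 * float(y[i] == y[j]) - 1.0  # +1 same label, -1 not
            total += phi((D(X[i], X[j]) - 1.0) * sign)
    return 2.0 * total / (n * (n - 1))
```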

    To test the independence of one-dimensional random variables Y_1 and Y_2 , [110] proposed a method based on the U -statistic K_{n} with the kernel function:

    \begin{equation} \varphi\left(\left(s_{1}, t_{1}\right), \left(s_{2}, t_{2}\right)\right) = \mathbb{1}_{\left\{\left(s_{2}-s_{1}\right)\left(t_{2}-t_{1}\right) > 0\right\}}- \mathbb{1}_{\left\{\left(s_{2}-s_{1}\right)\left(t_{2}-t_{1}\right) \leqslant 0\right\}}. \end{equation} (5.3)

    Its rejection region is of the form \left\{\sqrt{n} K_{n} > \gamma\right\} . In this example, we consider a multivariate case. To test the conditional independence of \boldsymbol{\xi} and \boldsymbol{\eta} , where Y = (\boldsymbol{\xi}, \boldsymbol{\eta}) , given X , we propose a method based on the conditional U -statistic:

    \widehat{r}_{n}^{(2)}(\varphi, \mathbf u, \mathbf{x}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^2} \varphi\left({Y}_{i_1}, {Y}_{i_2}\right)\prod\limits_{k = 1}^2\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_{k}, X_{i_k, n})}{h_n}\right)\right\} }{ \sum\limits_{\mathbf{i}\in I_n^2}\prod\limits_{k = 1}^2\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_{k}, X_{i_k, n})}{h_n}\right)\right\} },

    where \mathbf{x} = \left(x_{1}, x_{2}\right) \in \mathbb{I} \subset \mathbb{R}^{2} and \varphi(\cdot) is Kendall's kernel (5.3). Suppose that \boldsymbol{\xi} and \boldsymbol{\eta} are d_{1} - and d_{2} -dimensional random vectors, respectively, with d_{1}+d_{2} = d . Furthermore, supposing that \mathrm{Y}_{1}, \ldots, \mathrm{Y}_{n} are observations of (\boldsymbol{\xi}, \boldsymbol{\eta}) , we are interested in testing:

    \begin{equation} \mathrm{H}_{0}:\boldsymbol{\xi}\; \; \mbox{and}\; \; \boldsymbol{\eta}\; \; \mbox{are conditionally independent given}\; \; X.\; \; \mbox{vs}\; \; \mathrm{H}_{a}: \mathrm{H}_{0}\; \; \mbox{is not true}. \end{equation} (5.4)

    Let \mathbf{a} = \left(\mathbf{a}_{1}, \mathbf{a}_{2}\right) \in \mathbb{R}^{d} be such that \|\mathbf{a}\| = 1 with \mathbf{a}_{1} \in \mathbb{R}^{d_{1}}, \mathbf{a}_{2} \in \mathbb{R}^{d_{2}} , and let \mathrm{F}(\cdot), \mathrm{G}(\cdot) be the distribution functions of \boldsymbol{\xi} and \boldsymbol{\eta} , respectively. Suppose that F^{\mathbf{a}_{1}}(\cdot) and G^{\mathbf{a}_{2}}(\cdot) are continuous for any unit vector \mathbf{a} = \left(\mathbf{a}_{1}, \mathbf{a}_{2}\right) , where \mathrm{F}^{\mathbf{a}_{1}}(t) = \mathbb{P}\left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi} < t\right) , \mathrm{G}^{\mathbf{a}_{2}}(t) = \mathbb{P}\left(\mathbf{a}_{2}^{\top} \boldsymbol{\eta} < t\right) , and \mathbf{a}_{i}^{\top} denotes the transpose of the vector \mathbf{a}_{i}, 1 \leqslant i \leqslant 2 . For m = 2 , let Y^{(1)} = \left(\boldsymbol{\xi}^{(1)}, \boldsymbol{\eta}^{(1)}\right) and Y^{(2)} = \left(\boldsymbol{\xi}^{(2)}, \boldsymbol{\eta}^{(2)}\right) be such that \boldsymbol{\xi}^{(i)} \in \mathbb{R}^{d_{1}} and \boldsymbol{\eta}^{(i)} \in \mathbb{R}^{d_{2}} for i = 1, 2 , and:

    \varphi^{a}\left(\mathrm{Y}^{(1)}, \mathrm{Y}^{(2)}\right) = \varphi\left(\left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi}^{(1)}, \mathbf{a}_{2}^{\top} \boldsymbol{\eta}^{(1)}\right), \left(\mathbf{a}_{1}^{\top} \boldsymbol{\xi}^{(2)}, \mathbf{a}_{2}^{\top} \boldsymbol{\eta}^{(2)}\right)\right).

    An application of Theorem 3.3 gives

    \begin{equation} \left|\widehat{r}^{(2)}_{n}(\varphi^{a}, \mathbf{u}, \mathbf{x}) -{r}^{(2)}(\varphi^{a}, \mathbf{u}, \mathbf{x})\right|\longrightarrow 0, \quad a.s. \end{equation} (5.5)
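    For concreteness, Kendall's kernel (5.3) and its projected version \varphi^{\mathbf a} can be coded as follows (a minimal sketch with hypothetical names; plugging projected_kendall into the conditional U -statistic \widehat{r}^{(2)}_n above yields the test statistic):

```python
import numpy as np

def kendall_kernel(p1, p2):
    """Kendall's kernel (5.3): +1 for concordant pairs, -1 otherwise."""
    (s1, t1), (s2, t2) = p1, p2
    return 1.0 if (s2 - s1) * (t2 - t1) > 0 else -1.0

def projected_kendall(a1, a2, Y1, Y2):
    """phi^a of two observations Y = (xi, eta): project xi on a1 and
    eta on a2, then apply Kendall's kernel to the projected pairs."""
    (xi1, eta1), (xi2, eta2) = Y1, Y2
    return kendall_kernel((a1 @ xi1, a2 @ eta1), (a1 @ xi2, a2 @ eta2))

# Example: a concordant pair yields phi^a = +1.
a1, a2 = np.array([1.0]), np.array([1.0])
Y1 = (np.array([0.1]), np.array([0.2]))
Y2 = (np.array([0.5]), np.array([0.9]))
print(projected_kendall(a1, a2, Y1, Y2))  # 1.0
```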

    Consider a triple (Y, C, {X}) of random variables defined in \mathbb{R} \times \mathbb{R} \times \mathcal{H} . Here, Y is the variable of interest, C is a censoring variable, and {X} is a concomitant variable. Throughout, we work with a sample \{(Y_{i}, C_{i}, { X}_{i})_{1\leq i\leq n}\} of independent and identically distributed replications of (Y, C, { X}) , n \geq 1 . Actually, in the right censorship model, the pairs (Y_{i}, C_{i}) , 1 \leq i \leq n , are not directly observed, and the corresponding information is given by Z_{i} : = \min\{Y_{i}, C_{i}\} and \Delta_{i} : = \mathbb{1}\{Y_{i}\leq C_{i}\} , 1 \leq i \leq n . Accordingly, the observed sample is

    \mathcal{D}_{n} = \{(Z_{i}, \Delta_{i}, { X}_{i}), i = 1, \ldots, n\}.

    For example, survival data in clinical trials or failure time data in reliability studies are often subject to such censoring. To be more specific, many statistical experiments result in incomplete samples, even under well-controlled conditions; for instance, clinical survival data for most types of disease are usually censored by other competing risks to life that result in death. In the sequel, we impose the following assumptions on the distribution of ({ X}, Y) . For -\infty < t < \infty , set

    F_{Y}(t) = \mathbb{P}(Y \leq t), \; \; G(t) = \mathbb{P}(C \leq t), \; \; \mbox{and}\; \; H(t) = \mathbb{P}(Z \leq t),

    the right-continuous distribution functions of Y , C , and Z , respectively. For any right-continuous distribution function \mathfrak L defined on \mathbb{R} , denote by

    T_{\mathfrak L} = \sup\{t \in \mathbb{R} : \mathfrak L(t) < 1\}

    the upper point of the corresponding distribution. Now consider a pointwise measurable class \mathscr{F} of real measurable functions defined on \mathbb{R} , and assume that \mathscr{F} is of VC-type. We recall the regression function of \psi(Y) evaluated at \langle{ X}, \theta\rangle = \langle{ t}, \theta\rangle , for \psi \in \mathscr{F} and t \in\mathcal H , given by

    r^{(1)}(\psi, \frac{i}{n}, { t}, \theta) = \mathbb{E}(\psi(Y_i)\mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle),

    when Y is right-censored. To estimate r^{(1)}(\psi, \cdot) , we make use of the inverse probability of censoring weighted (I.P.C.W.) estimators that have recently gained popularity in the censored data literature (see [51,111]). The key ideas of I.P.C.W. estimators are as follows. Introduce the real-valued function \Phi_{\psi}(\cdot, \cdot) defined on \mathbb{R}^{2} by

    \begin{equation} \Phi_{\psi}(y, c) = \frac{ \mathbb{1}\{y \leq c\}\psi(y\wedge c)}{1-G(y\wedge c)}. \end{equation} (5.6)

    Assuming the function G(\cdot) to be known, first note that \Phi_{\psi}(Y_{i}, C_{i}) = \Delta_{i}\psi(Z_{i})/(1 -G(Z_{i})) is observed for every 1 \leq i \leq n . Moreover, we work under Assumption (I) below:

    (I) C and (Y, \mathbf{ X}) are independent.

    We have

    \begin{eqnarray} r^{(1)}(\Phi_{\psi}, \frac{i}{n}, { t}, \theta) &: = &\mathbb{E}(\Phi_{\psi}(Y_i, C_i)\mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle) \\& = &\mathbb{E}\left\{\frac{ \mathbb{1}\{ Y_i\leq C_i\}\psi(Z_i)}{1-G(Z_i)} \mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle\right\}\\ & = &\mathbb{E}\left\{\frac{\psi(Y_i)}{1-G(Y_i)}\mathbb{E}( \mathbb{1}\{ Y_i\leq C_i\}\mid \mathbf{ X}_i, Y_i) \mid \langle{ X}_i, \theta\rangle = \langle{ t}, \theta\rangle\right\}\\& = &r^{(1)}(\psi, \frac{i}{n}, { t}, \theta). \end{eqnarray} (5.7)

    Therefore, any estimate of r^{(1)}(\Phi_{\psi}, \cdot) , which can be built on fully observed data, turns out to be an estimate of r^{(1)}(\psi, \cdot) too. Thanks to this property, most statistical procedures that provide estimates of the regression function in the uncensored case can be naturally extended to the censored case. For instance, kernel-type estimates are straightforward to construct. Set, for \mathbf{ x}\in \mathcal{I} , h\geq 0 , 1\leq j\leq n ,

    \begin{eqnarray} \overline{\omega}_{n, h, j}^{(1)}(u, x): = K_{1}\left(\frac{u-j/n}{h_n}\right)K_{2}\left(\frac{d_{\theta}(x, X_{j, n})}{h_n}\right) \Big/\sum\limits_{i = 1}^{n}K_{1}\left(\frac{u-i/n}{h_n}\right)K_{2}\left(\frac{d_{\theta}(x, X_{i, n})}{h_n}\right). \end{eqnarray} (5.8)

    In view of (5.6)–(5.8), whenever G(\cdot) is known, a kernel estimator of r^{(1)}(\psi, \cdot) is given by

    \begin{eqnarray} \breve{r}_{n}^{(1)}(\psi, u, x;h_{n}) = \sum\limits_{i = 1}^{n}\overline{\omega}_{n, h, i}^{(1)}(u, x)\frac{\Delta_{i}\psi(Z_{i})}{1-G(Z_{i})}. \end{eqnarray} (5.9)

    The function G(\cdot) is generally unknown and has to be estimated. We will denote by G^{*}_{n}(\cdot) the Kaplan-Meier estimator of the function G(\cdot) [108]. Namely, adopting the conventions

    \prod\limits_{\emptyset} = 1

    and 0^{0} = 1 and setting

    N_{n}(u) = \sum\limits_{i = 1}^{n} \mathbb{1}\{Z_{i}\geq u\},

    we have

    G_{n}^{*}(u) = 1-\prod\limits_{i:Z_{i}\leq u}\left\{\frac{N_{n}(Z_{i})-1}{N_{n}(Z_{i})}\right\}^{(1-\Delta_{i})}, \; \; \mbox{for}\; \; u \in \mathbb{R}.

    Given this notation, we will investigate the following estimator of r^{(1)}(\psi, \cdot)

    \begin{eqnarray} \breve{r}_{n}^{(1)*}(\psi, u, x;h_{n}) = \sum\limits_{i = 1}^{n}\overline{\omega}_{n, h, i}^{(1)}(u, x)\frac{\Delta_{i}\psi(Z_{i})}{1-G_{n}^{*}(Z_{i})}, \end{eqnarray} (5.10)

    refer to [111,130]. Adopting the convention 0/0 = 0 , this quantity is well defined, since G_{n}^{*}(Z_{i}) = 1 if and only if Z_{i} = Z_{(n)} and \Delta_{(n)} = 0 , where Z_{(k)} is the k th order statistic associated with the sample (Z_{1}, \ldots, Z_{n}) for k = 1, \ldots, n and \Delta_{(k)} is the \Delta_{j} corresponding to Z_{(k)} = Z_{j} .

    A right-censored version of an unconditional U -statistic with a kernel of degree m\geq 1 was introduced via the principle of a mean-preserving reweighting scheme in [60]. Reference [170] proved almost sure convergence of multi-sample U -statistics under random censorship and provided an application by considering the consistency of a new class of tests designed for testing equality in distribution. To overcome potential biases arising from right-censoring of the outcomes and the presence of confounding covariates, [53] proposed adjustments to the classical U -statistics. Reference [186] proposed a different estimation procedure for the U -statistic, using a substitution estimator of the conditional kernel given the observed data; we also refer to [45]. To the best of our knowledge, the problem of estimating conditional U -statistics in the censored setting with variable bandwidth had remained open up to the present, and it gives the primary motivation for the study of this section. A natural extension of the function defined in (5.6) is given by

    \begin{eqnarray} { \Phi}_{\psi}(y_{1}, \ldots, y_{k}, c_{1}, \ldots, c_{k}) = \frac{ \psi(y_{1}\wedge c_{1}, \ldots, y_{k}\wedge c_{k})\prod\limits_{i = 1}^{k} \mathbb{1}\{y_{i} \leq c_{i}\}}{ \prod\limits_{i = 1}^{k}\{1-G(y_{i}\wedge c_{i})\}}. \end{eqnarray} (5.11)

    From this, we have an analogous relation to (5.7) given by

    \begin{aligned} &\mathbb{E}({ \Phi}_{\psi}(Y_{1}, \ldots, Y_{k}, C_{1}, \ldots, C_{k})\mid(\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t} )\\ &\;\;\; = \mathbb{E}\left(\frac{ \prod\limits_{i = 1}^{k} \mathbb{1}\{Y_{i} \leq C_{i}\}\psi(Y_{1}\wedge C_{1}, \ldots, Y_{k}\wedge C_{k})}{ \prod\limits_{i = 1}^{k}\{1-G(Y_{i}\wedge C_{i})\}}\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{t}\right)\\ &\;\;\; = \mathbb{E}\left(\frac{ \psi(Y_{1}, \ldots, Y_{k})}{ \prod\limits_{i = 1}^{k}\{1-G(Y_{i})\}}\mathbb{E}\left(\prod\limits_{i = 1}^{k} \mathbb{1}\{Y_{i} \leq C_{i}\}\mid (Y_{1}, X_{1}), \ldots (Y_{k}, X_{ k})\right)\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right)\\&\;\;\; = \mathbb{E}\left(\psi(Y_{1}, \ldots, Y_{k})\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right). \end{aligned} (5.12)

    An analogue estimator to (1.1) in the censored case is given by

    \begin{eqnarray} \breve{r}_{n}^{(k)}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}}\cdots\Delta_{i_{k}}\psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ (1-G(Z_{i_{1}}))\cdots(1-G(Z_{i_{k}}))}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}), \end{eqnarray} (5.13)

    where, for \mathbf{i} = (i_{1}, \ldots, i_{k})\in I(k, n) ,

    \begin{equation} \overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \frac{ \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}. \end{equation} (5.14)

    The estimator that we will investigate is given by

    \begin{eqnarray} \breve{r}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}}\cdots\Delta_{i_{k}}\psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ (1-G_{n}^{*}(Z_{i_{1}}))\cdots(1-G_{n}^{*}(Z_{i_{k}}))}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}). \end{eqnarray} (5.15)

    The main result of this section is given in the following corollary.

    Corollary 5.1. Let \mathscr{F}_m\mathfrak{K}^m_\Theta be a measurable VC-subgraph class of functions complying with Assumption 6. Suppose Assumptions 1–4 are fulfilled. Then we have:

    \begin{aligned} &\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in [C_{1}h, 1-C_{1}h]^m} \left\lvert\breve{r}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert \\& \qquad \qquad\qquad \qquad = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m_n\phi^m(h_n)}} + h^{2m\wedge\alpha}_n\right). \end{aligned} (5.16)

    This last result is a direct consequence of Theorem 3.3, since the law of the iterated logarithm for G_{n}^{*}(\cdot) established in [84] ensures that

    \sup\limits_{t\leq\tau}|G_{n}^{*}(t)-G(t)| = O\left(\sqrt{\frac{\log\log n}{n}}\right)\;\;\;\;\;\text{almost surely}\;\;\text{as}\;\;\;n\rightarrow \infty.

    For more details refer to [34].
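    A minimal sketch of the Kaplan-Meier estimator G_{n}^{*} of the censoring distribution and of the resulting I.P.C.W. weights \Delta_i/(1-G_{n}^{*}(Z_i)) used in (5.10) and (5.15) is given below (hypothetical names; ties are handled only approximately, and the 0/0 = 0 convention of the text is enforced):

```python
import numpy as np

def km_censoring_survival(Z, Delta):
    """Kaplan-Meier estimator of 1 - G, the survival function of the
    censoring variable: jumps occur where Delta_i = 0."""
    order = np.argsort(Z)
    Z, Delta = np.asarray(Z)[order], np.asarray(Delta)[order]
    n = len(Z)
    at_risk = n - np.arange(n)                  # N_n(Z_(i)) for sorted data
    factors = ((at_risk - 1) / at_risk) ** (1 - Delta)
    return Z, np.cumprod(factors)               # (sorted Z, 1 - G*_n(Z_(i)))

def ipcw_weights(Z, Delta):
    """I.P.C.W. weights Delta_i / (1 - G*_n(Z_i)), with 0/0 := 0."""
    Zs, surv = km_censoring_survival(Z, Delta)
    surv_at = surv[np.searchsorted(Zs, Z)]
    Delta = np.asarray(Delta, dtype=float)
    return np.where(surv_at > 0, Delta / np.maximum(surv_at, 1e-300), 0.0)
```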

    Keeping in mind the notation of the preceding section, we now introduce a truncation variable denoted as L and assume that (L, C) is independent of Y . Let us consider a situation where we observe the random vectors (Z_i, \Delta_i) together with \epsilon_i = \mathbb 1 (L_i \leq Z_i) . In this section, we aim to define conditional U -statistics for data that are left truncated and right censored (LTRC), following ideas from [172] in the unconditional setting. To achieve this, we propose an extension of the function (5.6) for LTRC data as follows:

    \tilde{ \Phi}_{\psi}(y_{1}, \ldots, y_{k}, l_{1}, \ldots, l_{k}, c_{1}, \ldots, c_{k}) = \frac{ \psi(y_{1}\wedge c_{1}, \ldots, y_{k}\wedge c_{k})\prod\limits_{i = 1}^{k} \mathbb{1}\{y_{i} \leq c_{i}\} \mathbb{1}\{l_{i} \leq z_{i}\}}{ \prod\limits_{i = 1}^{k}\mathbb P\left( l_{i} < z_{i} < c_{i}\right)}.

    According to (5.12), we get that

    \begin{eqnarray*} &&\mathbb{E}(\tilde{ \Phi}_{\psi}(Y_{1}, \ldots, Y_{k}, L_{1}, \ldots, L_{k}, C_{1}, \ldots, C_{k})\mid(\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t} )\nonumber\\ && \qquad \qquad \qquad = \mathbb{E}\left(\psi(Y_{1}, \ldots, Y_{k})\mid (\mathbf{ X}_{1}, \ldots, \mathbf{ X}_{k}) = \mathbf{ t}\right). \end{eqnarray*}

    An analog estimator to (1.1) for LTRC data can be expressed as follows:

    \begin{eqnarray} \breve{\breve{r}}_{n}^{(k)}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}} \ \cdots\Delta_{i_{k}} \epsilon_{i_{1}} \cdots \epsilon_{i_{k}} \psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ \mathbb P\left( L_{i_1} < Z_{i_1} < C_{i_1}\right)\cdots \mathbb P\left( L_{i_k} < Z_{i_k} < C_{i_k}\right)}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}), \end{eqnarray} (5.17)

    where \overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}) is defined as in (5.14). As \mathbb P\left(L_i < Z_i < C_i\right) is not known, we need to estimate it. We introduce N_i(t) = \mathbb{1}(L_i < Z_i \leq t, \Delta_i = 1) and N_i^c(t) = \mathbb{1}\left(L_i < Z_i \leq t, \Delta_i = 0\right) as the counting processes corresponding to the variable of interest and to the censoring variable, respectively. Furthermore, let

    N(t) = \sum\limits_{i = 1}^n N_i(t)

    and

    N^c(t) = \sum\limits_{i = 1}^n N_i^c(t).

    We introduce the risk indicators as R_i(t) = \mathbb{1}\left(Z_i \geq t \geq L_i\right) and

    R(t) = \sum\limits_{i = 1}^n R_i(t).

    It is important to note that the risk set R(t) at time t contains the subjects who entered the study before t and are still under study at t . Indeed, N_i^c(t) is a local sub-martingale with respect to the appropriate filtration \mathbb{F}_t . The martingale associated with the censoring counting process with filtration \mathbb{F}_t is given by

    M_i^c(t) = N_i^c(t)-\int_0^t R_i(u) \lambda_c(u) \mathrm{d} u, \quad i = 1, 2 \ldots, n.

    Here, \lambda_c(\cdot) represents the hazard function associated with the censoring variable C under left truncation. The cumulative hazard function for the censoring variable C is defined as

    \Lambda_c(t) = \int_0^t \lambda_c(u) \mathrm{d} u.

    Denote

    M^c(t) = \sum\limits_{i = 1}^n M_i^c(t).

    Now, we define the sub-distribution function of T_1 corresponding to \Delta_1 = 1 and \epsilon_1 = 1 as

    S(x) = \mathbb P\left(T_1 \leq x, \Delta_1 \epsilon_1 = 1\right) .

    Let

    w(t) = \int_0^{\infty} \frac{h_1(x)}{\mathbb P\left(L_1 \leq x \leq C_1\right)} \mathbb{1}(x > t) \mathrm{d} S(x),

    where h_1(x) = \mathbb E\left(\psi\left(\left(T_1, \Delta_1\right), \ldots, \left(T_k, \Delta_k\right)\right) \mid\left(T_1, \Delta_1\right) = \left(x, \Delta_1\right)\right) . Also, denote \tilde z(t) = \mathbb P\left(T_1 \geq t \geq L_1\right) . Then, an estimate for the survival function of the censoring variable C under left truncation, denoted as \widehat{K}_c(\cdot) , see [174], can be formulated as follows:

    \begin{eqnarray} \widehat{K}_c(\tau) = \prod\limits_{t \leq \tau}\left(1-\frac{d N^c(t)}{\tilde Z(t)}\right). \end{eqnarray} (5.18)

    Similar to the Nelson-Aalen estimator (for instance, see [7]), the estimator for the cumulative hazard function of the censoring variable C under left truncation is represented as:

    \begin{eqnarray} \widehat{\Lambda}_c(\tau) = \int_0^\tau \frac{d N^c(t)}{\tilde Z(t)} . \end{eqnarray} (5.19)

    In both the definitions presented in (5.18) and (5.19), we make the assumption that \tilde Z(t) is non-zero with probability one. The interrelation between \widehat{K}_c(\tau) and \widehat{\Lambda}_c(\tau) can be expressed as:

    \widehat{K}_c(\tau) = \exp \left[-\widehat{\Lambda}_c(\tau)\right].
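    A minimal sketch of the product-limit estimator (5.18) of the censoring survival function under left truncation follows (hypothetical names; jumps are placed at censored observations, and the risk set at t mirrors R(t) ):

```python
import numpy as np

def Kc_hat(Z, L, Delta, tau):
    """Sketch of the product-limit estimator (5.18), evaluated at tau.
    Jumps occur at censored times (Delta = 0); the risk set at t counts
    subjects with L_i <= t <= Z_i."""
    Z, L, Delta = map(np.asarray, (Z, L, Delta))
    jump_times = np.sort(np.unique(Z[(Delta == 0) & (Z <= tau)]))
    surv = 1.0
    for t in jump_times:
        at_risk = np.sum((L <= t) & (Z >= t))      # empirical risk set
        d_Nc = np.sum((Z == t) & (Delta == 0))     # censoring jumps at t
        if at_risk > 0:
            surv *= 1.0 - d_Nc / at_risk
    return surv
```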

    Let a_K = \inf \{t: K(t) > 0\} and b_K = \sup \{t: K(t) < 1\} denote the left and right endpoints of the support of a distribution function K . For LTRC data, as in [187], F(\cdot) is identifiable if a_G \leqslant a_W and b_G \leqslant b_W . By Corollary 2.2 of [187], for b < b_W we readily infer that

    \begin{eqnarray} \sup\limits_{a_W < \tau < b}|\widehat{K}_c(\tau)-K_c(\tau)| = O(\sqrt{n^{-1}\log \log n}). \end{eqnarray} (5.20)

    From the above, the estimator (5.17) can be rewritten directly as follows:

    \begin{eqnarray} \breve{\breve{r}}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) = \sum\limits_{(i_{1}, \ldots, i_{k})\in I_n^k}\frac{ \Delta_{i_{1}} \ \cdots\Delta_{i_{k}} \epsilon_{i_{1}} \cdots \epsilon_{i_{k}} \psi({ Z}_{i_{1}}, \ldots, { Z}_{i_{k}})}{ \widehat{K}_c\left(Z_{i_1}\right) \cdots \widehat{K}_c\left(Z_{i_k}\right)}\overline{\omega}_{n, \mathbf{ i}}^{(k)}(\boldsymbol{\theta}, \mathbf u, \mathbf{x}). \end{eqnarray} (5.21)

    The last estimator is the conditional version of the one studied in [172]. Following the same reasoning as in Corollary 5.1, one can infer that, as n\rightarrow \infty,

    \begin{eqnarray} &&\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m} \sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in [C_{1}h, 1-C_{1}h]^m} \left\lvert\breve{\breve {r}}_{n}^{(k)*}(\psi, \boldsymbol{\theta}, \mathbf u, \mathbf{ x}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert \\&&\qquad \qquad\qquad \qquad = O_{\mathbb P}\left(\sqrt{\frac{\log n}{n h^m\phi^m(h_n)}} + h^{2m\wedge\alpha}\right). \end{eqnarray} (5.22)

    Various methodologies have been devised to formulate asymptotically optimal bandwidth selection rules for nonparametric kernel estimators, particularly for the Nadaraya-Watson regression estimator. Key contributions have been made by [42,92,96,150]. The proper selection of bandwidth is pivotal, whether in the standard finite-dimensional case or in the infinite-dimensional framework, to ensure robust practical performance. Currently, to the best of our knowledge, such investigations do not exist for addressing a general functional conditional U -statistic. However, an extension of the leave-one-out cross-validation procedure allows us to define, for any fixed \mathbf{j} = (j_1, \ldots, j_m)\in I_n^m :

    \begin{equation} \widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i} \in I_n^m(\mathbf j)}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i} \in I_n^m(\mathbf j)}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}, \end{equation} (6.1)

    where:

    I^m_n(\mathbf{j}): = \left\{\mathbf{i}\in I_n^m\; \mbox{and}\; \mathbf{i}\neq \mathbf{j}\right\} = I_n^m\backslash \{\mathbf{j}\}.

    Equation (6.1) denotes the leave-out- \left(\mathbf{X}_{\mathbf{j}}, \mathbf{Y}_{\mathbf{j}}\right) estimator of the functional regression and can also serve as a predictor of \varphi(\mathbf{ Y}_{{j_{1}}}, \ldots, \mathbf{ Y}_{{j_{m}}}): = \varphi(\mathbf{ Y}_{\mathbf j}) . To minimize the quadratic loss function, we introduce the following criterion. Let \mathcal{W}(\cdot) be a known non-negative weight function:

    \begin{equation} CV\left(\varphi, h_n\right): = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{j}\in I_n^m}\left(\varphi\left(\mathbf{Y}_{\mathbf{j}}\right)-\widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{X}_{\mathbf{j}}, \boldsymbol{\theta};h_{n})\right)^2\widetilde{\mathcal{W}}\left(\mathbf{X}_{\mathbf{j}}\right). \end{equation} (6.2)

    Building upon the concepts advanced by [150], a judicious approach for selecting the bandwidth is to minimize the aforementioned criterion. Therefore, we choose \widehat{h}_{n} \in [a_n, b_n] minimizing, over h \in [a_n, b_n] ,

    CV\left(\varphi, h \right).

    Following the approach proposed by [20], where bandwidths are locally determined through a data-driven method involving the minimization of a functional version of a cross-validated criterion, we can substitute (6.2) with:

    \begin{equation} CV\left(\varphi, h_n\right): = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{j}\in I_n^m}\left(\varphi\left(\mathbf{Y}_{\mathbf{j}}\right)-\widetilde{r}_{n, \mathbf j}^{(m)}(\varphi, \mathbf{u}, \mathbf{X}_{\mathbf{j}};h_{n})\right)^2\widehat{\mathcal{W}}\left(\mathbf{X}_{\mathbf{j}}, \mathbf{ x}\right), \end{equation} (6.3)

    where

    \widehat{\mathcal{W}}\left(\mathbf{ s}, \mathbf{ t }\right): = \prod\limits_{i = 1}^m\widehat{W}(\mathbf s_{i}, \mathbf t_i).

    In practice, one takes, for \mathbf{ i} \in I_n^m , the uniform global weights \widetilde{\mathcal{W}}\left(\mathbf{X}_{\mathbf{i}}\right) = 1 , and the local weights

    \widehat{W}(\mathbf{X}_{\mathbf{i}}, \mathbf{ t }) = \left\{ \begin{array}{ccl} 1 & \mbox{if}& d_{\boldsymbol{\theta}}(\mathbf{X}_{\mathbf{i}}, \mathbf{ t }) \leq h_n, \\ 0 & & \mbox{otherwise}. \end{array}\right.
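    As an illustration, a schematic implementation of the cross-validated bandwidth for m = 1 with uniform global weights is given below (hypothetical names and illustrative kernels; the semi-metric d is user-supplied):

```python
import numpy as np

def cv_criterion(h, u, X, Y, d):
    """Sketch of the criterion (6.2) for m = 1 with uniform weights:
    squared leave-one-out prediction error of the kernel estimator."""
    n = len(X)
    K1 = lambda t: 0.75 * (1.0 - t**2) * (np.abs(t) <= 1.0)
    K2 = lambda t: (1.0 - t) * ((t >= 0.0) & (t <= 1.0))
    Y = np.asarray(Y, dtype=float)
    err = 0.0
    for j in range(n):
        w = np.array([0.0 if i == j else
                      K1((u[j] - (i + 1) / n) / h) * K2(d(X[j], X[i]) / h)
                      for i in range(n)])
        if w.sum() > 0.0:                 # skip empty kernel windows
            err += (Y[j] - w @ Y / w.sum()) ** 2
    return err / n

def select_bandwidth(h_grid, u, X, Y, d):
    """Choose h_hat in [a_n, b_n] by minimizing CV over a grid."""
    scores = [cv_criterion(h, u, X, Y, d) for h in h_grid]
    return h_grid[int(np.argmin(scores))]
```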

    For conciseness, we have exclusively discussed the widely used cross-validated selected bandwidth method. However, this approach can be generalized to other bandwidth selectors, including those based on Bayesian principles [156].

    This manuscript introduces the theory of single-index U -processes tailored specifically for locally stationary variables within the functional data paradigm. The primary objective is to leverage functional locally stationary approximations to facilitate asymptotic analyses in the statistical inference of non-stationary time series.

    We underscore the significance of adopting absolutely regular ( \beta -mixing) conditions, which are independent of the entropy dimension of the class, a departure from other mixing conditions. In contrast to strong mixing, \beta -mixing offers greater flexibility, allowing decoupling and accommodating diverse examples. Reference [103] provided a comprehensive characterization of stationary Gaussian processes satisfying the \beta -mixing condition. Additionally, \beta -mixing aligns with the L_2(\mathbb{P}) -norm, which plays a crucial role here. Unlike strong mixing, which demands a polynomial rate of decay for the strong mixing coefficients contingent on the entropy dimension of the function class, \beta -mixing involves the L_1 -norm and the metric entropy function H(\cdot, T, d) . This function is defined with respect to the pseudo-metric d(s, t) = \sqrt{\operatorname{Var}(\mathbb{G}(s)-\mathbb{G}(t))} for a Gaussian process \mathbb{G}(\cdot) and satisfies the integrability condition:

    \int_{0}^{1} \sqrt{H(u, T, d)} du < +\infty.

    Consequently, we establish the rate of convergence, demonstrating that, under suitable conditions, the kernel estimator \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) constructed with bandwidth h converges to the regression operator r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) with a rate:

    O_{\mathbb{P}}\left(\sqrt{\frac{\log n}{n h^m\phi^m(h_n)}} + h^{2m\wedge\alpha}\right).

    The presented rate underscores the importance of the small-ball probability function, which reflects the concentration of the functional variables X_i . The second term is linked to the bias of the estimate, which depends on the smoothness of the operator r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) through its parameter \alpha , as indicated by the Lipschitz condition. The interplay between the concentration of the functional variables X , the small-ball probability, and the convergence rate is crucial: less dispersed variables and a higher small-ball probability yield a more efficient estimator. In the empirical process setting, the rate of convergence is established over a subset [C_1 h, 1-C_1 h]^m \times \{\mathbf{x}\}^m , and for forecasting purposes, it can be extended to a subset [1-C_1 h, 1]^m \times \{\mathbf{x}\}^m using one-sided or boundary-corrected kernels. This extension necessitates ensuring that the kernels have compact support and are Lipschitz.

    The weak convergence is established through classical procedures, involving finite-dimensional convergence and the equicontinuity of conditional U -processes. Finite-dimensional convergence is achieved through the Hoeffding decomposition, followed by approaching independence through a block decomposition strategy, ultimately leading to a central limit theorem for independent variables. The equicontinuity aspect requires meticulous attention due to the comprehensive framework considered. These results provide a solid theoretical foundation for our methodologies, extending nonparametric functional principles to a generic dependent structure, a relatively new and significant research area. It is crucial to note that, when dealing with highly dependent data, mixing, often adopted for simplicity, may not be suitable. The ergodic framework eliminates the need for frequently used strong mixing conditions, along with their variations for measuring dependence, and the more intricate probabilistic computations they entail (see [36,37,47]). An intriguing avenue for exploration is the k NN estimator:

    \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{H_{n, k}(x_k)}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{H_{n, k}(x_k)}\right)\right\}},

    where

    H_{n, k}(x_j) = \min\left\{h\in \mathbb{R}^{+}: \sum\limits_{i = 1}^{n} \mathbb{1}_{B(x_j, h)}(X_{i}) = k\right\},

    with \(B(t, h) = \left\{z \in \mathcal{H}: d(z, t) \leqslant h\right\}\) representing a ball in \(\mathcal{H}\) with center \(t \in \mathcal{H} \) and radius \(h\), and \(\mathbb{1}_{A}\) being the indicator function of the set \(A\), as detailed in [3]. These findings open avenues for various applications, such as data-driven automatic bandwidth selection and confidence band construction. We propose the intriguing notion that bootstrap methods, as outlined in [35,38], provide valuable insights when applied in the functional context, especially the functional variant of the wild bootstrap, employed for smoothing parameter selection. It is crucial to acknowledge that the theoretical underpinnings of this bandwidth selection method using the functional wild bootstrap still present unresolved challenges. Finally, change-point detection has emerged as a widely utilized technique for recognizing specific points within a data series at which a stochastic system experiences abrupt external perturbations. Such alterations can arise from several circumstances, and their identification can greatly enhance the understanding of the underlying system. Change-point analysis has been applied to numerous stochastic processes across a wide range of scientific domains. However, change-point analysis for conditional U -statistics remains an unexplored and demanding research issue.
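
    Returning to the kNN estimator displayed above, the following minimal Python sketch, written for the simplest case m = 1 and \varphi(y) = y only, computes the data-driven radius H_{n, k}(x) and the corresponding estimate for curves represented by their discretization on a grid. The kernels, the semi-metric d and the simulated data are illustrative assumptions, not prescriptions of the theory.

    import numpy as np

    def knn_bandwidth(x, X, k, d):
        # H_{n,k}(x): smallest radius h such that the ball B(x, h) contains
        # k of the covariates X_1, ..., X_n, i.e. the distance from x to
        # its k-th nearest neighbour.
        dists = np.sort(np.array([d(x, Xi) for Xi in X]))
        return dists[k - 1]

    def knn_estimator(u, x, U_grid, X, Y, k, h, d, K1, K2):
        # kNN variant of the estimator for m = 1 and phi(y) = y: the kernel
        # K2 is scaled by the random radius H_{n,k}(x) instead of a
        # deterministic bandwidth, while K1 localizes in rescaled time.
        Hk = knn_bandwidth(x, X, k, d)
        w = np.array([K1((u - ui) / h) * K2(d(x, Xi) / Hk)
                      for ui, Xi in zip(U_grid, X)])
        s = w.sum()
        return float(np.dot(w, Y) / s) if s > 0 else float("nan")

    # Toy example: hypothetical discretized curves, Epanechnikov-type kernels.
    rng = np.random.default_rng(0)
    n, p = 200, 50
    X = rng.normal(size=(n, p))              # simulated functional covariates
    Y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n)
    U_grid = np.arange(1, n + 1) / n         # rescaled times i/n
    d = lambda a, b: np.linalg.norm(a - b) / np.sqrt(p)
    K1 = lambda t: 0.75 * max(1.0 - t * t, 0.0)
    K2 = lambda t: max(1.0 - t, 0.0) if t >= 0 else 0.0
    print(knn_estimator(0.5, X[0], U_grid, X, Y, k=25, h=0.2, d=d, K1=K1, K2=K2))

    Replacing the deterministic bandwidth of K_2 by H_{n, k}(x) makes the amount of smoothing adapt to the local concentration of the X_i , which is precisely the attraction of the kNN variant.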

    In this section, we focus on proving our results, using the notation introduced earlier. We start by presenting the following lemma before delving into the proof of the main results. The proof techniques follow and extend those of [165] to the single-index setting. Additionally, we incorporate certain intricate steps from [13], as observed in [39,40].
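
    Before proceeding, we recall the shape of the Hoeffding decomposition in the simplest non-trivial case, as a guide to the computations below. For m = 2 and an i.i.d. sample Z_1, \ldots, Z_n (the locally stationary setting below only adds weights), a symmetric kernel H with \theta = \mathbb{E}[H(Z_1, Z_2)] and first projection h_1(z) = \mathbb{E}[H(z, Z_2)] - \theta decomposes as

    \frac{2}{n(n-1)}\sum\limits_{1\leq i < j\leq n} H(Z_i, Z_j) - \theta = \frac{2}{n}\sum\limits_{i = 1}^{n} h_1(Z_i) + \frac{2}{n(n-1)}\sum\limits_{1\leq i < j\leq n} h_2(Z_i, Z_j), \quad h_2(z_1, z_2) = H(z_1, z_2) - h_1(z_1) - h_1(z_2) - \theta.

    The first (linear) sum drives the rate and corresponds to \widehat{\psi}_{1, \mathbf{i}} below, while the second, degenerate sum corresponds to the negligible remainder \widehat{\psi}_{2, \mathbf{i}} .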

    Proof of Proposition 3.1. As mentioned earlier, our statistic is a weighted U -statistic, expressed as a sum of U -statistics through the Hoeffding decomposition. We will delve into the details of this decomposition in Subsection 3.1 to achieve our desired outcomes. In that subsection, we observed that:

    \begin{equation*} \widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)- \mathbb{E}\left(\widehat{\psi}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)\right) = \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi)+\widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi), \end{equation*}

    where the linear term \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) and the residual term \widehat{\psi}_{2, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) are precisely defined in (3.9) and (3.11), respectively. Our goal is to establish that the linear term governs the convergence rate of this statistic, while the remaining term converges to zero almost surely as n\rightarrow \infty . We will initiate the analysis by addressing the first term in the decomposition. Consider B = [0, 1] , with \(\alpha_{n} = \sqrt{\log n / n h^m \phi^m(h_n)}\) and \(\tau_{n} = \rho_{n} n^{1 / \zeta}\), where \(\zeta\) is a positive constant specified in Assumption 4, part ⅰ), and \(\rho_{n} = (\log n)^{\zeta_{0}}\) for some \(\zeta_{0}>0\). Define

    \begin{align} \tilde{\mathbb H}^{(\ell)}_1(z) & : = \tilde{\mathbb H}^{(\ell)}(z) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert \leq \tau_{n}\right\}}, \end{align} (8.1)
    \begin{align} \tilde{\mathbb H}^{(\ell)}_2(z) & : = \tilde{\mathbb H}^{(\ell)}(z) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n}\right\}}, \end{align} (8.2)

    and

    \begin{align} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) & = \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_1(z), \\ \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) & = \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_2(z). \end{align}

    It is evident that we have

    \begin{aligned} &\widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) - \mathbb{E} \widehat{\psi}_{1, \mathbf{i}}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi) \\ &\;\;\; = \left[ \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right]+\left[\widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]. \end{aligned} (8.3)

    First, we can see that

    \begin{aligned} & \mathbb{P} \left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) \right\rvert > \alpha_{n}\right) \\ &\;\;\;\leq \mathbb{P}\left\{ \left\{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \vartheta(\mathbf{i}) \right\rvert > \alpha_{n}\right\} \bigcap \left\{ \bigcup\limits_{i = 1}^{n}\left\{\left\lvert {\mathfrak W}_{i, n} \right\rvert > \tau_{n}\right\} \right\}\right\} \\ & \;\;\;\;\;\; + \mathbb{P}\left\{\left\{\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert > \alpha_{n}\right\} \bigcap \left\{ \bigcup\limits_{i = 1}^{n} \left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n}\right\} \right\}^{\boldsymbol{c}}\right\} \\ & \;\;\;\leq \mathbb{P}\left(\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n} \text { for some } i = 1, \ldots, n\right) + \mathbb{P}( \varnothing ) \\ & \;\;\;\leq {\tau_{n}^{-\zeta} \sum\limits_{i = 1}^{n} \mathbb{E}\left[\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m} \sup\limits_{\mathbf{u} \in B^m}\left\lvert {\mathfrak W}_{i, n}\right\rvert^{\zeta}\right] \leq n \tau_{n}^{-\zeta} = \rho_{n}^{-\zeta} \rightarrow 0 }, \end{aligned}

    where the second probability vanishes because \widehat{\psi}_{1}^{(2)} is identically zero on the event where \left\lvert {\mathfrak W}_{i, n}\right\rvert \leq \tau_{n} for all i , and the last line is Markov's inequality together with the uniform moment bound of Assumption 4, part ⅰ).

    We deduce that

    \begin{align} \mathbb{E}\left[\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert\right] & \leq \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \mathbb{E}\left(\left\lvert \tilde{\mathbb H}^{(\ell)}_2(z)\right\rvert\right), \end{align} (8.4)

    where

    \begin{aligned} &\mathbb{E}\left(\left\lvert \tilde{\mathbb H}^{(\ell)}_2(z)\right\rvert\right) = \mathbb{E}\left[\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) {\mathfrak W}_{i, n}\times \int {\mathfrak W}_{( {1}, \ldots, \ell-1 , \ell, \ldots, {m})} \right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{\underset{k \neq i}{k = 1}}^m \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert > \tau_{n}\right\}}\right] \\ & \;\;\;\lesssim \tau_{n}^{-(\zeta-1)} \mathbb{E}\left[\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) \left\lvert {\mathfrak W}_{i, n}\right\rvert ^\zeta\right] \\ & \;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ \phi(h_n)} \mathbb{E}\left[ K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right)\right] \\ &\;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ \phi(h_n)} \times \left[\frac{1}{n h}+ \phi(h_n)\right] \\ &\;\;\;\lesssim \frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}, \end{aligned} (8.5)

    where

    \begin{aligned} & \mathbb{E}\left(K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n} \right)}{h_n}\right)\right)\\ & \;\;\;= \mathbb{E}\left( K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right) +K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right) -K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right)\\ &\;\;\;\leqslant \mathbb{E}\left\vert K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right) -K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right\vert + \mathbb{E} \left\vert K_{2}\left(\frac{d_{\theta_i}\left(x_i, X_{i}^{i/n}\right)}{h_n}\right)\right\vert\\ &\;\;\;\lesssim C h^{-1} \mathbb{E}\left\vert d_{\theta_i}\left(x_i, X_{i, n}\right) - d_{\theta_i}\left(x_i, X_{i}^{i/n}\right) \right\vert + \mathbb{E}\left[ \mathbb{1}_{\left(d_{\theta_i}\left(x_i, X_{i}^{i/n}\right) \leq h\right)}\right] \quad (K_2 \text{ is Lipschitz})\\ &\;\;\;\lesssim \frac{1}{n h} \mathbb{E} \left\vert U_{i}^{(i /n)}\right\vert+F_{i / n}(h ; x_i) \quad (\text{using Assumption 1 ⅰ)})\\ &\;\;\;\lesssim \frac{1}{n h}+ \phi(h_n). \end{aligned}

    Thus, we obtain

    \begin{aligned} &\mathbb{E}\left[\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert\right] \leq \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \mathbb{E}\left(\left\lvert \tilde{\mathbb H}^{(\ell)}_2(z)\right\rvert\right) \nonumber\\ &\lesssim \underbrace{\frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}}}_{\leq C \; \; \text{uniformly in} \;\textbf{u}} \times \left[\frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}\right]\\ &\lesssim \left[\frac{\tau_{n}^{-(\zeta-1)}}{ n h \phi(h_n)} + \tau_{n}^{-(\zeta-1)}\right] \lesssim \tau_{n}^{-(\zeta-1)} = (\rho_{n} n^{1 / \zeta})^{-(\zeta-1)} \lesssim \alpha_n. \end{aligned}

    Consequently, we deduce that

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(2)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert = O_ \mathbb{P}(\alpha_n). \end{equation} (8.6)

    Next, let's consider

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\lvert \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\rvert.

    To achieve the desired result, we cover the region B^m = [0, 1]^{m} by

    \bigcup\limits_{k_1, \ldots, k_m = 1}^{N_{(\mathbf{u})}}\prod\limits_{j = 1}^m\mathbf{B}(\mathbf{u}_{k_j}, r),

    for some radius r . Hence, for each \mathbf{u} = (\mathbf{u}_1, \ldots, \mathbf{u}_m)\in [0, 1]^{m} , there exists \mathbf{l}(\mathbf{u}) = (l(\mathbf{u}_1), \ldots, l(\mathbf{u}_m)) , where \forall 1 \leq i \leq m, 1\leq l(\mathbf{u}_i)\leq N_{(\mathbf{u})} and such that

    \mathbf{u} \in \prod\limits_{i = 1}^m \mathbf{B}(\mathbf{u}_{l(\mathbf{u}_i)}, r) \; \mbox{ and }\; \vert \mathbf{u}_i - \mathbf{u}_{l(\mathbf{u}_i)}\vert\leq r , \; \mbox{ for }\; 1\leq i\leq m,

    then, for each \mathbf{u}\in [0, 1]^{m} , the closest center will be \mathbf{u}_\mathbf{l}(\mathbf{u}) , and the ball with the closest center will be defined by

    \mathcal{B}(\mathbf{u}, \mathbf{l}(\mathbf{u}), r): = \prod\limits_{j = 1}^m\mathbf{B}(\mathbf{u}_{l(\mathbf{u}_j)}, r).
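
    For concreteness, in this real setting the balls \mathbf{B}(\mathbf{u}_k, r) may simply be taken as intervals: [0, 1] is covered by the \lceil 1/(2r)\rceil intervals of radius r centered at the points (2k-1)r , k = 1, \ldots, \lceil 1/(2r)\rceil , so that

    N_{(\mathbf{u})} \leq \left\lceil \frac{1}{2r} \right\rceil \quad \mbox{ and } \quad [0, 1]^{m} \; \mbox{ is covered by at most } \; \left\lceil \frac{1}{2r} \right\rceil^{m} \mbox{ product balls}.

    This elementary count is the source of the polynomial covering factor N_{(u)}^m appearing in the bound (8.8) below.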

    In the same way, \Theta^m\times \mathcal{H}^m should be covered by

    \bigcup\limits_{\tilde k_1, \ldots, \tilde k_m = 1}^{N_{(\theta)}}\bigcup\limits_{k_1, \ldots, k_m = 1}^{N_{(x)}}\prod\limits_{j = 1}^m\mathbf{B}(\theta_{\tilde k_j}, r)\times\mathbf{B}(x_{k_j}, r),

    for some radius r . Hence, for each \mathbf{x} = (x_1, \ldots, x_m)\in \mathcal{H}^m , there exists \mathbf{l}(\mathbf{x}) = (l(x_1), \ldots, l(x_m)) , where \forall 1 \leq i \leq m, 1\leq l(x_i)\leq {N_{(x)}} and such that

    \mathbf{x} \in \prod\limits_{i = 1}^m B(x_{l(x_i)}, r) \; \mbox{ and }\; d_{\theta_i}( x_i, x_{l(x_i)})\leq r, \; \mbox{ for }\; 1\leq i\leq m,

    then, for each \mathbf{x}\in \mathcal{H}^m , the closest center will be \mathbf{x}_\mathbf{l}(\mathbf{x}) , and the ball with the closest center will be defined by

    \mathbf{B}_{\boldsymbol{\theta}}(\mathbf{x}, \mathbf{l}(\mathbf{x}), r): = \prod\limits_{i = 1}^m B_{\theta_i}(x_{l(x_i)}, r).

    We define

    K^*(\boldsymbol{\omega}, \boldsymbol{v}) = C \prod\limits_{k = 1}^m \mathbb{1}_{(\lvert\omega_k \rvert\leq 2C_1)}\prod\limits_{k = 1}^m K_2(v_k)\; \; \mbox{for}\; \; (\boldsymbol{\omega}, \boldsymbol{v}) \in \mathbb{R}^m\times\mathbb{R}^m.

    We can show that, for (u, x) \in B _{j, n} and n large enough,

    \begin{align*} &\left\lvert \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-\frac{i_k}{n}}{h_n}\right)-\prod\limits_{k = 1}^m K_{1}\left(\frac{u_{j, k}-\frac{i_k}{n}}{h_n}\right)\right\rvert K_2 \left(\frac{d_{\theta_i}(x_i, X_{i, n})}{h_n}\right) \\ &\quad \leq \alpha_{n} K^{*}\left(\frac{u_{n}-\frac{i}{n}}{h_n}, \frac{d_{\theta_i}\left(x_i, X_{i, n}\right)}{h_n}\right). \end{align*}

    Let

    \begin{aligned} &\bar{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \frac{1}{n h^m \phi(h_n)} \sum\limits_{i = 1}^{n} \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)}\prod\limits_{k = 1}^m K^{*}\left(\frac{u_k-\frac{i_k}{n}}{h_n}, \frac{d_{\theta_k}\left(x_{k}, X_{i_k, n}\right)}{h_n}\right){\mathfrak W}_{i, n} \\ &\;\;\;\;\times \int {\mathfrak W}_{( {1}, \ldots, \ell-1 , \ell, \ldots, {m})} \prod\limits_{\underset{k \neq i}{k = 1}}^m \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \mathbb{P}(d\nu_1, \ldots, d\nu_{\ell-1}, d\nu_{\ell}, \ldots, d\nu_{m-1}) \mathbb{1}_{\left\{\left\lvert {\mathfrak W}_{i, n}\right\rvert \leq \tau_{n}\right\}}. \end{aligned}

    Note that \mathbb E\left[\left\lvert \bar{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\rvert\right] \leq M < \infty for some sufficiently large M . Then, we obtain

    \begin{aligned} & \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m}\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert \\ &\;\;\;\leq \;\;\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert \\ & \;\;\;\;\;\;\;\;\;\qquad + \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\alpha_{n}\left(\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right\vert+ \mathbb{E}\left[\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right\vert\right]\right) \\ &\;\;\;\leq\;\; \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert \\ & \qquad \;\;\;\;\;\;\;\;\; + \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)- \mathbb{E}\left[\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{n}, \mathbf{x}, \boldsymbol{\theta}\right)\right]\right\vert+2 M F(\mathbf{y}) \alpha_{n} . \end{aligned} (8.7)

    Therefore

    \begin{aligned} &\mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m}\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert > 4 M \alpha_{n}\right) \\ &\leq {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^mN_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\&\qquad \qquad \qquad \qquad \qquad \mathbb{P}\left(\left\vert\widehat{\psi}^{(1)}_{1}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right]\right\vert > 4 M \alpha_{n}\right) \\ &\leq Q_{1, n}+Q_{2, n}, \end{aligned} (8.8)

    where N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} , N_{(\theta)}^m N_{(x)}^m and N_{(u)}^m denote the covering numbers related, respectively, to the class of functions \mathscr{F}_m\mathfrak{K}^m_\Theta , the balls that cover \Theta^m\times \mathcal{H}^m , and the balls that cover [0, 1]^m , and where

    \begin{eqnarray} Q_{1, n}& = & {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^mN_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\ &&\;\;\;\qquad \qquad \qquad \qquad \mathbb P\left(\left\vert\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{j}, \mathbf{x}\right)- \mathbb{E}\left[\widehat{\psi}_{1}^{(1)}\left(\mathbf{u}_{j}, \mathbf{x}\right)\right]\right\vert > M \alpha_{n}\right), \\ Q_{2, n}& = & {N_{\mathscr{F}_m\mathfrak{K}^m_\Theta} N_{(\theta)}^m N_{(x)}^m N_{(u)}^m\underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(x_{ \mathbf i(x)}, r)} \underset{1\leq i_1 < \cdots < i_m \leq m}{\max} \sup\limits_{B(\mathbf u_{ \mathbf i(\mathbf u)}, r)}} \\ &&\;\;\;\qquad \qquad \qquad \qquad \mathbb P\left(\left\vert\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{j}, \mathbf{x}\right)- \mathbb{E}\left[\bar{\psi}_{1}^{(1)}\left(\mathbf{u}_{j}, \mathbf{x}\right)\right]\right\vert > M \alpha_{n}\right). \end{eqnarray}

    Notice that Q_{1, n} and Q_{2, n} might be treated in the same way, so we restrict our attention to Q_{1, n} . Write:

    \begin{aligned} &\mathbb{P}\left(\left\vert\widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E} \widehat{\psi}_{1}^{(1)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > M \alpha_{n}\right) \\ &\;\;\; = \mathbb{P}\left[\left\vert h^m \phi^m(h_n) \sum\limits_{i = 1}^n \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_1(z)\right.\right. \\ & \;\;\;\;\;\;\;\;-\left.\left. \mathbb{E}\left( h^m \phi^m(h_n)\sum\limits_{i = 1}^n \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}} \tilde{\mathbb H}^{(\ell)}_1(z)\right)\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\right. \\ &\qquad\;\;\;\;\;\;\;\;\;\qquad\qquad\qquad \left. > M n \frac{(n-1)!}{(n-m)!} \alpha_{n} h^m \phi^m(h_n)\right] \\ & \;\;\;= \mathbb{P}\left(\left\vert \sum\limits_{i = 1}^n \Phi_{i, n}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > M n \frac{(n-1)!}{(n-m)!} \alpha_{n} h^m \phi^m(h_n)\right) . \end{aligned}

    Note that, for each fixed (\mathbf u, \mathbf x) , the array \left\{\Phi_{i, n}(\mathbf u, \mathbf x)\right\} is absolutely regular ( \beta -mixing) with coefficients \beta_{\Phi, n} satisfying \beta_{\Phi, n}(k) \leq \beta(k) . We apply Lemma A.4 with

    \varepsilon : = M n \frac{(n-1)!}{(n-m)!}h^m \phi^m(h_n) \alpha_{n},

    and b_{n} = C \tau_{n} for sufficiently large C > 0 and S_n = \alpha_n^{-1} \tau_{n}^{-1} . As in [133, Theorem 2], we can see that \sigma_{S_{n}, n}^{2} \leq C^\prime S_n h^m \phi^m(h_n) , and we obtain:

    \begin{align*} { \mathbb{P} \left(\left\vert\sum\limits_{i = 1}^{n} \Phi_{i, n}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert \geq \varepsilon \right)} &\leq4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \beta\left(S_{n}\right) \nonumber \\ &\leq 4 \exp \left( - \frac{M^2 \alpha_{n}^2 n^2 \left(\frac{(n-1)!}{(n-m)!}\right)^2 h^{2m} \phi^{2m}(h_n) }{64 C^\prime S_n h \phi(h_n) \frac{n}{S_{n}} +\frac{8}{3} M n \frac{(n-1)!}{(n-m)!} h^m \phi^m(h_n) \alpha_{n} b_n S_{n} } \right) +4 \frac{n}{S_{n}} \beta\left(S_{n}\right) \nonumber \\ &\leq 4 \exp \left(- \frac{ M \left(\sqrt{\log n / n h^m \phi^m(h_n)}\right)^2 n \left(\frac{(n-1)!}{(n-m)!}\right) }{64 C^\prime h^m \phi^m(h_n) \frac{(n-m)!}{M(n-1)! } +\frac{8}{3} C h^m \phi^m(h_n) } \right) + 4 \frac{n}{S_{n}} \beta\left(S_{n}\right)\\ &\lesssim \exp \left( - \frac{M \frac{(n-1)!}{(n-m)!} \log n }{64 \frac{(n-m)!}{(n-1)!} \frac{C^\prime}{M}+ \frac{8}{3} C } \right)+ n S_{n}^{-\gamma-1}. \end{align*}

    To get the last inequality, we must choose M > C^\prime and use the polynomial decay of the mixing coefficients, \beta(k) \lesssim k^{-\gamma} , so that 4 \frac{n}{S_{n}} \beta(S_{n}) \lesssim n S_{n}^{-\gamma-1} . Since N \leq C h^{-m}\phi(h_n) \alpha_{n}^{-m} , it follows that

    \widehat{Q}_{n} \leq O\left(R_{1n}\right)+O\left(R_{2 n}\right),

    with

    \begin{align*} &R_{1 n} = h^{-m}_n \alpha_{n}^{-m} n^{- \frac{M \frac{(n-1)!}{(n-m)!} }{64 \frac{(n-m)!}{(n-1)!} +3 C }}, \\ &R_{2 n} = h^{-m}_n \alpha_{n}^{-m} n S_{n}^{-\gamma-1}. \end{align*}

    For M sufficiently large, we can see that R_{1 n} \leq n^{-\varsigma} for some small \varsigma > 0 . Under the bandwidth conditions of Assumption 4, we further get that

    \begin{eqnarray*} R_{2 n} & = & h^{-m}_n\alpha_{n}^{-m} n S_{n}^{-\gamma-1} \\ & = & h^{-m}_n\, n\, \alpha_{n}^{\gamma+1-m}\, \tau_{n}^{\gamma+1} \\ & = & h^{-m}_n\, n \left(\sqrt{\frac{\log n }{ n h^m \phi^m(h_n)}}\right)^{\gamma+1-m}\left((\log n)^{ \zeta_0} n^{1 / \zeta}\right)^{\gamma+1}\\ & = & \frac{(\log n) ^{\frac{\gamma+1-m}{2}+ \zeta_0(\gamma+1)}}{n ^{\frac{\gamma+1-m}{2} -1 -\frac{\gamma+1}{ \zeta} }\, h_n^{m+\frac{m(\gamma+1-m)}{2}}\, \phi(h_n)^{\frac{m(\gamma+1-m)}{2}}}. \end{eqnarray*}

    Under Assumption 4 ⅱ), we establish that R_{2n} tends to zero, confirming the obtained result. Now, let's move on to the nonlinear segment of the Hoeffding decomposition. Here, the goal is to illustrate that

    \begin{equation*} \mathbb{P} \left[\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \left\vert\widehat{\psi}_{2, \mathbf{i}}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert > \lambda \right] \rightarrow 0 \; \; \text{as}\; \; n\rightarrow \infty. \end{equation*}

    The conclusive phase in establishing Proposition 3.1 involves leveraging Lemma 8.2 to demonstrate the convergence of the nonlinear term to zero.

    Proof of Theorem 3.3. Equation (4.1) in Section 4 shows that

    \begin{eqnarray} {\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})} = \frac{1}{\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right), \end{eqnarray}

    where

    \begin{eqnarray*} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}, \\ \widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} {\mathfrak W}_{\mathbf{i}, \varphi, n}, \\ \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n} , X_{\mathbf{i}, n}\right). \end{eqnarray*}

    The proof of this theorem is intricate and divided into the following four steps, where in each step our objective is to demonstrate that:

    Step 1.

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \vert\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) \vert = O_{ \mathbb{P}}\left(\sqrt{\log n / n h^m \phi(h_n)}\right).

    Step 2.

    \begin{aligned} & \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \vert \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}) \\ &\;\;\;\; - \mathbb{E}(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n})) \vert = O_{ \mathbb{P}}\left(\sqrt{\log n / n h^m \phi(h_n)}\right) . \end{aligned}

    Step 3.

    \begin{eqnarray} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert \mathbb{E}(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}))\right\vert = O(h^{2m}) +O(h^\alpha) . \end{eqnarray}

    Step 4.

    \begin{equation*} \frac{ 1}{ \inf\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta}\inf\limits_{\mathbf{x} \in \mathcal{H}^m} \inf\limits_{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\vert} = O_{ \mathbb{P}}(1). \end{equation*}
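
    Granted these four steps, the theorem follows by assembling them in the decomposition above; in outline,

    \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \underbrace{O_{\mathbb{P}}(1)}_{\text{Step 4}}\times\left(\underbrace{O_{\mathbb{P}}\left(\alpha_n\right)}_{\text{Steps 1 and 2}} + \underbrace{O(h^{2m})+O(h^{\alpha})}_{\text{Step 3}}\right) = O_{\mathbb{P}}\left(\sqrt{\frac{\log n}{n h^m \phi^m(h_n)}} + h^{2m\wedge\alpha}\right),

    with \alpha_n as in the proof of Proposition 3.1. It therefore remains to verify the four steps.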

    Step 1 is an immediate consequence of Proposition 3.1. The validity of the second step is ensured by substituting \varphi(Y_{i_1}, \ldots, Y_{i_m}) with

    \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}),

    and applying Proposition 3.1. We will now proceed with the demonstration of Step 4. Consider

    \begin{equation} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) = \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) + \bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}), \end{equation} (8.9)

    where

    \begin{eqnarray*} \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})& = &\frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}, \\ \bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})& = &\frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\\ &&\prod\limits_{k = 1}^m \left[K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]. \end{eqnarray*}

    For {\mathfrak W}\equiv 1 , the preceding proposition established that

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}\left(\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right)\right\vert = o_ \mathbb{P}(1).

    Therefore, it is evident that

    \begin{eqnarray} \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) & = & \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}( \widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})) + \mathbb{E}(\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})) \\ & = & o_ \mathbb{P}(1) + \mathbb{E}[\widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})]+ \mathbb{E}[\bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})]. \end{eqnarray} (8.10)

    Furthermore, we have

    \begin{eqnarray} { \mathbb{E}\left(\bar{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right)} & = & \mathbb{E}\left(\frac{(n-m)!}{n!h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\right. \\ &&\left.\prod\limits_{k = 1}^m\left[K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] \right) \\ & \lesssim& \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \left(\frac{m \phi^{m-1}(h_n)}{n h_n }\right) = o(1). \end{eqnarray} (8.11)

    The final bound arises from the Lipschitz continuity of K_2(\cdot) (as stipulated in Assumption 2, ⅰ)), along with Assumption 1 ⅰ) and Lemma A.2. This holds uniformly in \mathbf{u} . Furthermore,

    \begin{aligned} & \mathbb{E}\left[ \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right] = \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\\ &\;\;\; = \frac{(n-m)!}{n!h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^mK_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \int_{0}^h \prod\limits_{k = 1}^m K_2\left(\frac{y_k}{h_n}\right) dF_{i_k/n}(y_k, x_k)\\ & \;\;\;\gtrsim \frac{(n-m)!}{n! h^m\phi^m(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \phi^m(h_n)f_1(\mathbf{x}) \sim f_1(\mathbf{x}) > 0, \end{aligned}

    uniformly in \mathbf{u} . Then, we obtain

    \begin{aligned} & \frac{1}{\underset{\mathscr{F}_m\mathfrak{K}^m_\Theta}{\inf} \underset{\mathbf{x} \in \mathcal{H}^m}{\inf} \underset{\mathbf{u} \in [C_1h, 1-C_1h]^m}{\inf} \left\vert\widetilde{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) \right\vert} \\ &\;\;\; = \frac{1}{o(1) + o_ \mathbb{P}(1) + \underset{\mathscr{F}_m\mathfrak{K}^m_\Theta}{\inf} \underset{\mathbf{x} \in \mathcal{H}^m}{\inf} \underset{\mathbf{u} \in [C_1h, 1-C_1h]^m}{\inf} \mathbb{E}\left[ \widehat{r}_1(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right] } = O_ \mathbb{P}(1). \end{aligned} (8.12)

    Take K_0:[0, \infty)\rightarrow \mathbb{R} to be a Lipschitz continuous function with its support in [0, q] for some q > 1 , and ensure that K_0(x) = 1 for all x \in [0, 1] . Importantly,

    \begin{equation} \mathbb{E}\left[\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) - r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\widetilde{r}_1(\varphi, \mathbf{u}, \mathbf{x};h_{n}) \right] = \sum\limits_{i = 1}^{4} Q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x), \end{equation} (8.13)

    where Q_i can be defined as follows

    \begin{equation} Q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\}q_i(\boldsymbol{\theta}, \mathbf u, \mathbf x), \end{equation} (8.14)

    such that

    \begin{eqnarray*} q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^mK_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\left\{\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right.\\ &&\left.\left.- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\} \times \left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right], \\ q_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^m\left\{ K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}\right.\\ &&\left.\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n})-r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})\right\}\right], \\ q_3(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\left\{\prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) - \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\}\right.\\ &&\left. \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \times\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right], \\ q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right]. \end{eqnarray*}

    Observe that

    \begin{eqnarray*} Q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) &\lesssim & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right.\\ &&\left. \left. \left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert \right.\right.\\ &&\left. \left.\times\left\vert r^{(m)}(\varphi, \frac{\boldsymbol i} {n}, \boldsymbol X_{\boldsymbol i, n})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert\right]\right\}. \end{eqnarray*}

    Utilizing Assumption 3 ⅰ), the properties of r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf u, \mathbf x) allow us to establish that

    \begin{aligned}& \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\left\vert r^{(m)}\left(\varphi, \frac{\boldsymbol i}{n}, \boldsymbol X_{\boldsymbol i, n}\right)-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\vert \\ & \;\;\;\;\;\lesssim \prod\limits_{k = 1}^m K_0\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \left(d_{\mathcal{H}^m}\left( \boldsymbol X_{\boldsymbol i, n}, \mathbf{x}\right) + \|\mathbf{u} - \frac{\boldsymbol i}{n} \|\right)^{\alpha} \lesssim h^{m \wedge \alpha}. \end{aligned}

    With Assumption 2, part ⅱ) in mind, we will employ Lemma 8.1 and Eq (8.55) to verify that:

    \begin{eqnarray} Q_1(\boldsymbol{\theta}, \mathbf u, \mathbf x) &\lesssim & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\right\} \times h^{m \wedge \alpha} \times \frac{m \phi^{m-1}(h_n)}{n h_n } \\ &\lesssim& \frac{1}{n \phi(h_n) h^{m-(m \wedge \alpha)}}\; \; \; \; \text{uniformly in}\; \mathbf{u} . \end{eqnarray} (8.15)

    Likewise, it can be observed that

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} Q_2(\boldsymbol{\theta}, \mathbf u, \mathbf x) \lesssim \frac{1}{n \phi(h_n) h^{m-(m \wedge \alpha)}}, \end{equation} (8.16)

    and

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in [C_1h, 1-C_1h]^m} Q_3(\boldsymbol{\theta}, \mathbf u, \mathbf x) \lesssim \frac{1}{n \phi(h_n) h^{m-(1 \wedge \alpha)}}. \end{equation} (8.17)

    Concerning the final term, we derive

    \begin{eqnarray*} Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x) & = & \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{\prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \\ && \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\left\{r^{(m)}(\varphi, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}^{(\mathbf{i}/n)})-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right]. \end{eqnarray*}

    By utilizing Lemma A.1 along with the inequality (2.12) and considering Assumption 1, it becomes evident that

    \begin{aligned}& \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right\vert \\ &\;\;\;\leq \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \right\vert \\ & \;\;\;\;\;\;\;\mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; \left\vert\left\{r^{(m)}\left(\varphi, \frac{\mathbf i}{n}, X_{\mathbf i, n}^{(\mathbf i/n)}\right)-r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x})\right\}\right\vert \right] \\ &\;\;\;\lesssim \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \right\vert \\ & \;\;\;\;\;\;\;\mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; \left(d_{\mathcal{H}^m}\left(X_{\mathbf i, n}^{(\mathbf i/n)}, \mathbf{x}\right) + \|\mathbf{u} - \frac{\mathbf i}{n} \|\right)^{\alpha} \right] \\ &\;\;\;\lesssim \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\vert \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right. \\ & \;\;\;\;\;\;\;\left. -\int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{(u_k-v_k)}{h_n}\right) d v_k \right\vert \mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; h^{\alpha}_n\right] \\ & \;\;\;\;\;\;\;+ \frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m} \int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-v_k}{h_n}\right) d v_k \\ &\;\;\;\;\;\;\;\times \mathbb{E}\left[\left\vert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\vert\; \; h^{\alpha}_n\right] \\ &\;\;\;\lesssim O\left(\frac{1}{n^mh^{2m}_n}\right) \; \; h^{\alpha}_n + h^{\alpha}_n. \end{aligned} (8.18)

    Keeping in mind that

    n^{-m} h^{\alpha -2m}_n \lesssim h^{2m}_n\phi^m(h_n)\ll h^{2m}_n,

    we deduce that

    \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in [C_1h, 1-C_1h]^m} \left\vert Q_4(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right\vert \ll h^{2m}_n+h^{\alpha}_n.

    Certainly, within the framework of our assumptions, the approximation error is of order

    \begin{equation} O \left(\frac{1}{n^m \phi(h_n) h^{m-(1 \wedge \alpha)}_n}\right) \ll h^{2m\wedge \alpha}_n. \end{equation} (8.19)

    With this inequality, the proof is concluded.

    Proof of Theorem 4.1. The goal is to establish weak convergence, encompassing finite-dimensional convergence and asymptotic equicontinuity, for the stochastic U -process:

    \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(\varphi) = \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left(\widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf x, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf x, \boldsymbol{\theta})-B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right),

    across all the specified function classes within the framework. According to [64, Section 4.2], finite-dimensional convergence asserts that, for every finite set of functions f_1, \ldots, f_q in L_2 , the vector

    \begin{equation} \left(\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_1), \ldots, \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_q)\right) \end{equation} (8.20)

    converges to the corresponding finite-dimensional distributions of the process G_p . By leveraging the Cramér-Wold device and considering the countability of the various classes, we can reduce the weak convergence of the U -process to the weak convergence of U -statistics with the kernel f_r for all r \in \{1, \ldots, q\} . As the U -process is a linear operator, our focus narrows down to demonstrating the convergence of \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} D_{n}(f_r) towards a Gaussian distribution. Therefore, for a fixed kernel, we have:

    \begin{aligned}& \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) \\&\;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) \right) \\ &\;\;\; = \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+ \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \right), \end{aligned} (8.21)

    where

    \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}).

    We now treat each term. To this end, we calculate the variance of \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) . Take

    \Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) = \prod\limits_{k = 1}^mK_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \left[ r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n} , \mathbf{u}, X_{\mathbf{i}, n}\right)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right].

    Observe that

    \begin{eqnarray} Var(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)) & = & Var\left(\widehat{g}_2(\boldsymbol{\theta}, \mathbf u, \mathbf x)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n})\right) \\ & = & Var\left(\frac{(n-m)!}{n! h^m\phi^m(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\left\{ \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \right\} \Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ && + \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{i_1\neq \ldots\neq i_m, \\ i_1^\prime\neq \ldots\neq i_m^\prime, \\ \exists j / i_j\neq i_j^\prime}} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k^\prime/n}{h_n}\right) \\ && Cov\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x), \Delta_{\boldsymbol i^\prime, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ &: = & V_1 + V_2. \end{eqnarray} (8.22)

    Looking at V_1 , we have

    \begin{eqnarray} \vert V_1\vert & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left[ \mathbb{E} \left(\Delta_{\boldsymbol i, n}^2(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) - \left( \mathbb{E} \left(\Delta_{\boldsymbol i, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right)\right)^2\right] \\ &\leq& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E} \left(\Delta_{\boldsymbol i, n}^2(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E} \left(\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \left[r^{(m)}\left(\varphi, \frac{\mathbf{i}}{n} , \mathbf{u}, X_{\mathbf{i}, n}\right)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right]^2\right) \\ &\leq& \frac{((n-m)!)^{2} h^{2\alpha}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left\{ \mathbb{E} \left[\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] + \mathbb{E} \left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\right\}. \end{eqnarray} (8.23)

    The last inequality in (8.23) follows from the smoothness assumption on r^{(m)} in Assumption 3 ⅰ):

    \begin{eqnarray} &&\left\vert r^{(m)}\left(\varphi, \frac{\mathbf{ i}}{n} , \mathbf{u}, X_{i, n}\right)- r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})\right\vert ^2{ \lesssim h^{2\alpha}}. \end{eqnarray} (8.24)

    Combining the latter inequalities with Eqs (8.60) and (8.58) from Lemma 8.1, where

    \begin{aligned} & \mathbb{E} \left(\prod\limits_{k = 1}^m K^2_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \\ &\;\;\;\lesssim \mathbb{E} \left(\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right), \end{aligned}

    we get

    \begin{eqnarray} \vert V_1\vert &\lesssim& \frac{h^{2\alpha}_n((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left[ \frac{m \phi^{m-1}(h_n)}{n h_n }+\phi^{m}(h_n)\right] \\ &\ll& \frac{1}{n h^{m}\phi(h_n) }. \end{eqnarray} (8.25)

    A closer look at the work of [10], especially Lemma 2, allows us to handle V_2 as follows:

    \begin{eqnarray} \vert V_2\vert & = & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \\ &&\times Cov\left(\Delta_{i_{\sigma(1)}, \ldots, i_{\sigma(m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x), \Delta_{i_{\sigma(m+1)}, \ldots, i_{\sigma(2 m)}, n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) \\ &\leq & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \\ && \left\vert \mathbb{E}\left[\Delta_{i_{\sigma(1)}, \ldots, i_{\sigma(m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \Delta_{i_{\sigma(m+1)}, \ldots, i_{\sigma(2 m)} , n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right]\right\vert \\ &\leq & \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \times c M^{2}\left(1+\sum\limits_{k = 1}^{n-1} k^{m-1} \beta_{k}^{(p-2) / p}\right) \\ &\lesssim& \frac{((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\substack{1 \leq i_{1} \leq \cdots \leq i_{2 m} \leq n \\ j_{1} \geq j_{2}, \ldots, j_{m}}} \prod\limits_{k = 1}^{2m} K_{1}\left(\frac{u_k-i_k/n}{h_n}\right) \times M^{2} \ll \frac{1}{n h^{m}\phi(h_n) }, \end{eqnarray} (8.26)

    where

    M: = \sup\limits_{1 \leq i_{1} < \cdots < i_{m} < \infty} \mathbb{E}\left[\left\vert \Delta_{\boldsymbol i, n}\right\vert^{\zeta}\right]^{1 / \zeta},

    and j_{1} = i_{2}-i_{1} , j_{l} = \min \left(i_{2 l-1}-i_{2 l-2}, i_{2 l}-i_{2 l-1}\right) for 2 \leq l \leq m-1 , and j_{m} = i_{2 m}-i_{2 m-1} . If we let j_{(1)} = \max \left(j_{1}, \ldots, j_{m}\right) , we can replace the original sequence \left\{X_{1}, \ldots, X_{n}\right\} by a sequence with independent blocks \{i_{1}, i_{2}, \ldots, i_{2 m}\} that preserves the identical block distribution. It is now straightforward to demonstrate that

    \begin{equation} Var(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)) \leq \vert V_1\vert +\vert V_2\vert = o\left( \frac{1}{n h^{m}_n\phi(h_n)} \right). \end{equation} (8.27)

    This indicates the quadratic-mean convergence of \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) with the specified rate as follows:

    \begin{equation} \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) - \mathbb{E} \left(\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)\right) = o\left( \frac{1}{n h_n^m\phi(h_n)} \right) \text{in probability.} \end{equation} (8.28)

    Let's remember that

    \begin{eqnarray} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})& = & \frac{1}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}\left(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)+ \widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x) \right), \\ B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x)& = & \mathbb{E}[\widehat{G}(\boldsymbol{\theta}, \mathbf u, \mathbf x)]/ \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})], \\ \widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}) & = & \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})]+ o_ \mathbb{P}(1), \\\lim\limits_{n\rightarrow \infty } \mathbb{E}[\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})] & > &0, \end{eqnarray}

    then

    \begin{eqnarray} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta};h_{n}) - r^{(m)}(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})& = & \frac{\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)}{\widetilde{r}_1(\varphi, \mathbf{i}, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta})}+ B_{n}(\boldsymbol{\theta}, \mathbf u, \mathbf x) +o_ \mathbb{P}\left( \frac{1}{n h^{m}_n\phi(h_n)} \right). \end{eqnarray}

    In the next step, we will consider the first part of the last equation.

    \begin{aligned} & \sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} Var(\widehat{g}_1(\boldsymbol{\theta}, \mathbf u, \mathbf x)) \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) Var\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \varepsilon_{\mathbf{i}, n}\right] \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)} \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \varepsilon^2_{\mathbf{i}, n}\right] \\ &\;\; = \frac{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) \sigma^2\left(\boldsymbol{\theta}, \frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) \varepsilon^2_{\mathbf{i}}\right] \\ & \;\;\;\;\;\left(\left\{\varepsilon_{\mathbf{i}}\right\}_{\mathbf{i} \in \mathbb{Z}} \text{ is a sequence of i.i.d. r.v.'s, independent of }\left\{X_{\mathbf{i}, n}\right\}_{\mathbf{i} = 1}^n\right) \\ &\;\; = \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}})\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \\ & \;\;\;\;\;\mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{i_k/n})}{h_n}\right) \right]\left[ \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) + o(1) \right]\quad \text{(according to Assumption 3 ⅱ)--ⅳ))} \\ & \;\;\;\;\;+ \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}})\, o(\phi^m(h_n))\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{2m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \left[ \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x} \right)+ o(1) \right] \\ &\;\;\sim \frac{ \mathbb{E}(\varepsilon^2_{\mathbf{i}}) \left(\sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) + o(1)\right)\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}((n-m)!)^{2}}{(n!)^{2} h^{2m}\phi^{m}(h_n)}\sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m K_{1}^2\left(\frac{u_k-i_k/n}{h_n}\right) \\ &\;\;\sim \frac{1}{\sqrt{n h^{m}\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}}\, \mathbb{E}(\varepsilon^2_{\mathbf{i}})\, \sigma^2\left(\boldsymbol{\theta}, {\mathbf{u}}, \mathbf{x}\right) \int_{[0, 1]^m} \prod\limits_{k = 1}^m K_{1}^2( z_k)\, d z_1 \cdots d z_m.\end{aligned} (8.29)

    We next establish the weak convergence of \widehat{g}_1 . To do so, we undertake the following steps:

    (1) Truncation of the function \widehat{g}_1 is performed, given the unbounded nature of the function class.

    (2) The convergence of the remainder term resulting from truncation to zero is established.

    (3) Hoeffding's decomposition is applied to the truncated part.

    (4) The convergence to zero of the non-linear term in this decomposition is validated.

    (5) The weak convergence to the linear term is established by demonstrating finite-dimensional convergence and asymptotic equicontinuity.

    These steps closely parallel the proof strategy employed in Theorem 4.2. Consequently, the proof is concluded.

    Proof of Theorem 4.2. Keep in mind that

    \begin{equation} \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})}{ \sum\limits_{\mathbf{i}\in I_n^m}\prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}}. \end{equation} (8.30)
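
    For orientation, here is a minimal numerical sketch of the ratio estimator in (8.30). It is illustrative only: we assume m = 2 , a triangular kernel for both K_1 and K_2 , real-valued covariates with the toy semi-metric d_{\theta}(x, X) = |x - X| , and synthetic data; none of these choices is imposed by the paper.

```python
import numpy as np

# Toy sketch of (8.30) for m = 2; kernels, semi-metric and data model
# are illustrative assumptions, not the paper's setting.

def K(t):
    # Triangular kernel (1 - t) on [0, 1], as used later in Lemma 8.1.
    return np.where((t >= 0) & (t <= 1), 1.0 - t, 0.0)

rng = np.random.default_rng(0)
n, h = 200, 0.2
X = rng.uniform(0, 1, n)
Y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(n)

def r_tilde(u, x, varphi=lambda y1, y2: y1 * y2):
    # Ratio of two kernel-weighted U-statistic sums over i = (i1, i2), i1 != i2.
    num = den = 0.0
    for i1 in range(n):
        for i2 in range(n):
            if i1 == i2:
                continue
            w = (K(abs(u[0] - (i1 + 1) / n) / h) * K(abs(x[0] - X[i1]) / h)
                 * K(abs(u[1] - (i2 + 1) / n) / h) * K(abs(x[1] - X[i2]) / h))
            num += w * varphi(Y[i1], Y[i2])
            den += w
    return num / den if den > 0 else float("nan")

print(r_tilde(u=(0.5, 0.5), x=(0.3, 0.7)))
```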

    Let's define

    \begin{eqnarray*} \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y})&: = &\frac{ \prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}\varphi(Y_{\mathbf{i}, n})} { \mathbb{E}\prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}} \; \; \mbox{for}\; \; \textbf{x}\in \mathcal{H}^m, \mathbf{y}\in \mathcal{Y}^m, \\ \\ \mathcal{G} &: = & \left\{ \mathfrak G_{\varphi, \mathbf{i}} (\cdot , \cdot) : \varphi \in \mathscr{F}_m, \; \mathbf{i} = (i_1, \ldots , i_m) \right\}, \\ \mathcal{G}^{(k)}& : = &\left\{\pi_{k, m} \mathfrak G_{\varphi, \mathbf{i}} (\cdot , \cdot) : \varphi \in \mathscr{F}_m \right\}, \\ \mathfrak{U}_n(\varphi, \mathbf{i})& = & \mathfrak{U}_n^{(m)} (\mathfrak G_{\varphi, \mathbf{i}}) : = \frac{(n-m)!}{n!}\sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m \xi_{i_k}{\mathfrak G_{\varphi, \mathbf{i}}(\textbf{X}_{\mathbf{i}}, \textbf{Y}_{\mathbf{i}})}, \end{eqnarray*}

    and the U -empirical process is defined to be

    \mu_n(\varphi, \mathbf{i}): = \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n(\varphi, \mathbf{i})- \mathbb{E} (\mathfrak{U}_n(\varphi, \mathbf{i}))\right\}.

    Subsequently, we have

    \widetilde{r}_{n}^{(m)}(\varphi, \mathbf{u}, \mathbf{x}, \boldsymbol{\theta}; h_{n}) = \frac{ \mathfrak{U}_n(\varphi, \mathbf{i})}{ \mathfrak{U}_n(1, \mathbf{i})}.

    To ensure the weak convergence of our estimator, it is essential to first establish it for \mu_n(\varphi, \mathbf{i}) . As mentioned earlier, dealing with unbounded function classes necessitates truncating the function \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) . Specifically, for \lambda_n = n^{1/\zeta} , where \zeta > 2 , we have:

    \begin{eqnarray} \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y})& = & \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) \mathbb{1}_{\left\{F(\textbf{y})\leq \lambda_n \right\}}+ \mathfrak G_{\varphi, \mathbf{i}}(\textbf{x}, \textbf{y}) \mathbb{1}_{\left\{F(\textbf{y}) > \lambda_n \right\}} \\ &: = & \mathfrak G_{\varphi, \mathbf{i}}^{(T)}(\textbf{x}, \textbf{y}) + \mathfrak G_{\varphi, \mathbf{i}}^{(R)}(\textbf{x}, \textbf{y}) . \end{eqnarray}
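
    A toy sketch of this truncation step may help fix ideas: we split at \lambda_n = n^{1/\zeta} with \zeta > 2 . The envelope F and its Pareto-type tail below are assumptions chosen only for illustration.

```python
import numpy as np

# Sketch of the truncation step: split G at lambda_n = n**(1/zeta), zeta > 2.
# The envelope F and the heavy-tailed draw are illustrative assumptions.

rng = np.random.default_rng(1)
n, zeta = 10_000, 3.0
lam_n = n ** (1 / zeta)

F_vals = rng.pareto(zeta, n) + 1.0   # envelope values F(y), heavy-tailed
G_vals = np.sqrt(F_vals)             # stand-in for G_{phi,i}, dominated by F

G_T = G_vals * (F_vals <= lam_n)     # truncated part  G^(T)
G_R = G_vals * (F_vals > lam_n)      # remainder part  G^(R)

# The remainder is controlled by the envelope tail: E[F 1{F > lam_n}] -> 0.
print(lam_n, np.mean(G_R), np.mean(F_vals * (F_vals > lam_n)))
```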

    We can write the U -statistic as follows:

    \begin{eqnarray} \label{Equation: Troncature} \mu_n(\varphi, \mathbf{i}) & = & \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(T)}\right)- \mathbb{E} \left(\mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(T)}\right)\right)\right\}\\&& + \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\right)- \mathbb{E} \left(\mathfrak{U}_n^{(m)}\left(\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\right)\right)\right\} \\ &: = & \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(T)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(T)}({\varphi, \mathbf{i}})\right)\right\} \\&&+ \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\} \\ &: = &\mu_n^{(T)}(\varphi, \mathbf{i}) + \mu_n^{(R)}(\varphi, \mathbf{i}) . \end{eqnarray} (8.31)

    The first term is the truncated part and the second is the remaining one. We have to prove that:

    (1) \mu_n^{(T)}(\varphi, \mathbf{i}) converges to a Gaussian process.

    (2) The remainder part is negligible, in the sense that

    \left\Vert\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\}\right\Vert_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\overset{\mathbb{P}}{\longrightarrow} 0.

    For the initial step, we will utilize the Hoeffding decomposition, akin to the one presented in the previous Subsection 3.1, with the sole modification of replacing {\mathfrak W}_{\mathbf{i}, n} with \varphi(Y_{\mathbf{i}, n}) :

    \begin{equation} \mathfrak{U}_n^{(T)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(T)}({\varphi, \mathbf{i}})\right) : = \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) \nonumber+\mathfrak{U}_{2, n}(\varphi, \mathbf{i}), \end{equation}

    where

    \begin{eqnarray} \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) &: = & \frac{1}{n} \sum\limits_{i = 1}^n \widehat{H}_{1, i}(\mathbf{u}, \mathbf{x}, \boldsymbol{\theta}, \varphi), \end{eqnarray} (8.32)
    \begin{eqnarray} \mathfrak{U}_{2, n}(\varphi, \mathbf{i}) &: = & \frac{(n-m)!}{n!} \sum\limits_{\mathbf{i}\in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H_{2, \mathbf{i}}(\boldsymbol{z}). \end{eqnarray} (8.33)

    The convergence of \mathfrak{U}_{2, n}(\varphi, \mathbf{i}) to zero in probability has been established by Lemma 8.2. Therefore, it suffices to demonstrate the weak convergence of \mathfrak{U}_{1, n}(\varphi, \mathbf{i}) to a Gaussian process denoted as \mathbb{G}(\varphi) . To accomplish this, we will proceed with finite-dimensional convergence and equicontinuity. Finite-dimensional convergence asserts that for every finite set of functions f_1, \ldots, f_q in L_2 , where \tilde{\mathfrak{U}} represents the centered form of {\mathfrak{U}} :

    \begin{equation} \left(\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} \tilde{\mathfrak{U}}_{1, n}(f_1, \mathbf{i}), \ldots, \sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)} \tilde{\mathfrak{U}}_{1, n}(f_q, \mathbf{i})\right) \end{equation} (8.34)

    converges to the corresponding finite-dimensional distributions of the process \mathbb{G}(\varphi) . We only need to demonstrate that for every fixed collection (a_1, \ldots, a_q) \in \mathbb{R}^q , we have

    \begin{equation*} \sum\limits_{j = 1}^{q}a_j \tilde{\mathfrak{U}}_{1, n}(f_j, \mathbf{i}) \rightarrow N \Big(0, \sigma ^2\Big), \end{equation*}

    where

    \begin{equation} \sigma ^2 = \sum\limits_{j = 1}^{q}a_j^2 {\rm Var}\Big( \tilde{\mathfrak{U}}_{1, n}(f_j, \mathbf{i})\Big) + \sum\limits_{s \neq r} a_s a_r {\rm Cov}\Big(\tilde{\mathfrak{U}}_{1, n}(f_s, \mathbf{i}), \tilde{\mathfrak{U}}_{1, n}(f_r, \mathbf{i})\Big). \end{equation} (8.35)

    Take

    \Psi(\cdot) = \sum\limits_{j = 1}^{q}a_j f_{j}(\cdot).

    By linearity of \Psi(\cdot) , it suffices to show that \tilde{\mathfrak{U}}_{1, n}(\Psi, \mathbf{i}) \rightarrow \mathbb{G}(\Psi) . Let us denote

    N = \mathbb{E}\prod\limits_{k = 1}^m\left\{K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}.

    We have

    \begin{eqnarray*} \tilde{\mathfrak{U}}_{1, n}(h_n, \mathbf{i})& = & N^{-1} \times \frac{1}{n} \sum\limits_{i = 1}^n \frac{(n-m)!}{(n-1)!} \sum\limits_{I_{n-1}^{m-1}(-i)} \sum\limits_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell-1}} \xi_i \xi_{i_{\ell}}\cdots \xi_{i_{m-1}}\frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \nonumber \\ && \times \int h(y_{ {1}}, \ldots, y_{\ell-1} , Y_i, y_\ell, \ldots, y_{m-1}) \prod\limits_{\underset{k \neq i}{k = 1}}^{m-1} \frac{1}{ \phi(h_n)} K_{2} \left(\frac{d_{\theta_k}(x_k, \nu_{k})}{h_n}\right) \nonumber \\ &&\qquad \qquad \mathbb{P}(d(\nu_1, y_1), \ldots, d(\nu_{\ell-1}, y_{\ell-1}), d(\nu_{\ell}, y_\ell), \ldots, d(\nu_{m-1}, y_{m-1})), \nonumber\\ & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i). \end{eqnarray*}

    Now, we shall employ the blocking procedure for this empirical process. We intend to partition the set \{1, \ldots, n\} into 2\nu_n+1 subsets, alternating large and small blocks together with a residual block. In alignment with the notation used in Lemma 8.2, we denote the size of the large blocks by a_n and the size of the small blocks by b_n , satisfying:

    \begin{equation} \nu_n : = \left\lfloor \frac{n}{a_n+b_n} \right\rfloor, \qquad \frac{b_n}{a_n}\rightarrow 0, \qquad \frac{a_n}{n}\rightarrow 0, \qquad \frac{n}{a_n} \beta(b_n)\rightarrow 0. \end{equation} (8.36)
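
    As a quick numerical sanity check of (8.36), the following sketch uses the assumed rates a_n = n^{2/3} , b_n = n^{1/3} , and a geometric mixing coefficient \beta(k) = \rho^k ; these are illustrative choices, not imposed by the paper.

```python
# Numerical check of the block-size conditions (8.36) under assumed rates.

rho = 0.9
for n in (10**3, 10**4, 10**5, 10**6):
    a_n, b_n = int(n ** (2 / 3)), int(n ** (1 / 3))
    nu_n = n // (a_n + b_n)
    print(n, nu_n, b_n / a_n, a_n / n, (n / a_n) * rho**b_n)
# All three quantities b_n/a_n, a_n/n and (n/a_n)*beta(b_n) decrease toward 0.
```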

    In this case, we can see that:

    \begin{eqnarray} \tilde{\mathfrak{U}}_{1, n}(h, \mathbf{j}) & = & \sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(1)}_{j, n} + \sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(2)}_{j, n}+ \widehat{ \mathfrak{U}}^{(3)}_{j, n} \\ &: = &\tilde{\mathfrak{U}}^{(1)}_{1, n}+\tilde{\mathfrak{U}}^{(2)}_{1, n}+\tilde{\mathfrak{U}}^{(3)}_{1, n}, \end{eqnarray} (8.37)

    where

    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(1)}_{j, n} & = & N^{-1}\sum\limits_{i = j(a_n+b_n)+1}^{j(a_n+b_n)+a_n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), \end{eqnarray} (8.38)
    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(2)}_{j, n}& = &N^{-1}\sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), \end{eqnarray} (8.39)
    \begin{eqnarray} \widehat{ \mathfrak{U}}^{(3)}_{j, n}& = & N^{-1}\sum\limits_{i = \nu(a_n+b_n)+1}^{n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i). \end{eqnarray} (8.40)

    First, we aim to prove that

    \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2\rightarrow 0 \; \mbox{ and }\; \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(3)}_{1, n})^2\rightarrow 0

    to show that the summation over the small blocks and the summation over the last (residual) block are asymptotically negligible. Hence, we infer

    \begin{eqnarray} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2 = Var\left(\sum\limits_{j = 1}^{\nu_n-1} \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) = \sum\limits_{j = 1}^{\nu_n-1} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) + \underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) . \end{eqnarray}

    We have:

    \begin{eqnarray} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) & = & Var\left(N^{-1}\sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ & = & N^{-2}\frac{1}{ \phi^2(h_n)} \sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \xi^2_i Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\lesssim& \frac{b_n}{ \phi^2(h_n) \left(m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{\omega}{h_n}\right)d\omega\right] \\ &&\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right).\qquad \mbox{(Using Lemma 8.1 ii))} \end{eqnarray}

    Thus, we have

    \begin{eqnarray} \sum\limits_{j = 1}^{\nu_n-1} Var\left( \widehat{ \mathfrak{U}}^{(2)}_{j, n}\right) &\lesssim& \nu_n b_n \frac{1}{\phi^{2(m+1)}(h_n)} \left[\int_{[0, h]} K_1^2\left(\frac{\omega}{h_n}\right)d\omega\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\sim& \frac{n}{a_n+b_n} b_n \sim \frac{nb_n}{a_n} = o_ \mathbb{P}(n), \qquad \qquad \qquad\text{(by (8.36))} \end{eqnarray} (8.41)

    and

    \begin{eqnarray} {\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right)} & = &\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \sum\limits_{i = j(a_n+b_n)+a_n+1}^{(j+1)(a_n+b_n)} \sum\limits_{i^\prime = k(a_n+b_n)+a_n+1}^{(k+1)(a_n+b_n)}\frac{N^{-2}}{ \phi^2(h_n)} \xi_i \xi_{i^\prime} \\ &Cov&\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), K_2 \left(\frac{d_{\theta_{i^\prime}}(x_{i^\prime}, X_{i^\prime})}{h_n}\right) \tilde{\mathbb H}(Y_{i^\prime})\right) \\ & = &\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}}\sum\limits_{l_1 = 1}^{b_n}\sum\limits_{l_2 = 1}^{b_n} \frac{1}{ N^{2}\phi^2(h_n)} \xi_{\lambda_i+l_1} \xi_{\lambda_{i^\prime}+l_2} \\ &Cov&\left(K_2 \left(\frac{d_{\theta_{\lambda_i+l_1}}(x_{\lambda_i+l_1}, X_{\lambda_i+l_1})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_1}), K_2 \left(\frac{d_{\theta_{\lambda_{i^\prime}+l_2}}(x_{\lambda_{i^\prime}+l_2}, X_{\lambda_{i^\prime}+l_2})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_{i^\prime}+l_2})\right), \end{eqnarray}

    where \lambda_i = j(a_n+b_n)+a_n and \lambda_{i^\prime} = k(a_n+b_n)+a_n . For j \neq k , we have \vert \lambda_i- \lambda_{i^\prime} +l_1-l_2\vert \geq b_n , and then

    \begin{eqnarray} {\left\vert\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) \right\vert } \leq \underset{\vert j-k \vert\geq b_n }{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \frac{1}{ N^{2}\phi^2(h_n)} \xi_j \xi_{k} \left\vert Cov\left(K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j), K_2 \left(\frac{d_{\theta_k}(x_{k}, X_{k})}{h_n}\right) \tilde{\mathbb H}(Y_{k})\right)\right\vert , \end{eqnarray}

    Here, the use of Davydov's lemma (Lemma A.7) is necessary; we have

    \begin{aligned}& \left\vert Cov\left(K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j), K_2 \left(\frac{d_{\theta_k}(x_{k}, X_{k})}{h_n}\right) \tilde{\mathbb H}(Y_{k})\right)\right\vert \\ &\;\;\;\;\leq 8 \left( \mathbb{E}\left\vert K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right)\right\vert^p\right)^{1/p} \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p} \\ &\;\;\;\;\lesssim \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p}, \end{aligned}

    it follows that

    \begin{eqnarray} {\left\vert\underset{j\neq k}{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} Cov \left(\widehat{ \mathfrak{U}}^{(2)}_{j, n}, \widehat{ \mathfrak{U}}^{(2)}_{k, n} \right) \right\vert } &\lesssim & \underset{\vert j-k \vert\geq b_n }{\sum\limits_{j = 1}^{\nu_n-1} \sum\limits_{k = 1}^{\nu_n-1}} \frac{1}{ N^{2}\phi^2(h_n)} \xi_j \xi_{k}\phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p}\beta(\vert j-k\vert)^{1-2/p} \\ &\lesssim& \frac{1}{b_n^{\varrho} N^{2}\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} \sum\limits_{l = b_n+1}^{\infty} l^{\varrho}\beta(l)^{1-2/p} \\ &\lesssim& \frac{1}{b_n^{\varrho} N^{2}\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} n \varrho = o_ \mathbb{P}(n), \end{eqnarray} (8.42)

    where the last inequality also follows from (8.36) and the size of b_n . Then, (8.41) and (8.42) show us that

    \frac{1}{n} \mathbb{E}(\tilde{\mathfrak{U}}^{(2)}_{1, n})^2\rightarrow 0.

    Following the same steps, we find that

    \begin{aligned}& Var\left(\tilde{ \mathfrak{U}}^{(3)}_{1, n}\right) = Var \left( N^{-1}\sum\limits_{i = \nu(a_n+b_n)+1}^{n} \xi_i \frac{1}{ \phi(h_n)} K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ & \;\;\;= N^{-2}\sum\limits_{i = \nu(a_n+b_n)+1}^{n} \xi_i^2 \frac{1}{ \phi^2(h_n)} Var\left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{ N^{2}\phi^2(h_n)} \underset{\vert i-j \vert > 0}{\sum\limits_{i = \nu(a_n+b_n)+1}^{n} \sum\limits_{j = \nu(a_n+b_n)+1}^{n} } \xi_i \xi_{j} Cov \left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i), K_2 \left(\frac{d_{\theta_j}(x_{j}, X_j)}{h_n}\right) \tilde{\mathbb H}(Y_j)\right) \\ & \;\;\;= N^{-2}\sum\limits_{i = \nu(a_n+b_n)+1}^{n} \xi_i^2 \frac{1}{ \phi^2(h_n)} Var\left( K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\&\;\;\;\;\;\;\;\;+\frac{1}{ N^{2}\phi^2(h_n)} \underset{\vert i-j \vert > 0}{\sum\limits_{l_1 = 1}^{n- \nu(a_n+b_n)} \sum\limits_{l_2 = 1}^{n-\nu(a_n+b_n)} } \xi_i \xi_{j} \\ & \qquad \;\;\;\;\;\;\;\;\; Cov \left( K_2 \left(\frac{d_{\theta_{\lambda_i+l_1}}(x_{\lambda_i+l_1}, X_{\lambda_i+l_1})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_1}), K_2 \left(\frac{d_{\theta_{\lambda_i+l_2}}(x_{\lambda_i+l_2}, X_{\lambda_i+l_2})}{h_n}\right) \tilde{\mathbb H}(Y_{\lambda_i+l_2})\right) \\ & \qquad \qquad\qquad\qquad\qquad\qquad\qquad \qquad\qquad\qquad\qquad\qquad\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\left(\text { For } \lambda_i:=n-v\left(a_n+b_n\right)\right) \\ &\;\;\;\lesssim \frac{n-\nu(a_n+b_n)}{ \phi^2(h_n) \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{(n-\nu(a_n+b_n))^{\varrho} \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} \\&\;\;\;\;\;\;\;\;\times \sum\limits_{l = (n-\nu(a_n+b_n))+1}^{\infty} l^{\varrho}\beta(l)^{1-2/p} \\ & \qquad\qquad\qquad\qquad \qquad\qquad\qquad\qquad\qquad\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\qquad\mbox{(Using Lemma 8.1 ii) and Lemma A.7) } \\ &\;\;\;\lesssim \frac{n-\nu_n(a_n+b_n)}{ \phi^2(h_n) \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\;\;\;\;+ \frac{1}{(n-\nu(a_n+b_n))^{\varrho} \left( m \phi^{m-1}(h_n)/n h_n +\phi^{m}(h_n)\right)^2\phi^2(h_n)} \phi(h_n) \mathbb{E}(\vert \tilde{\mathbb H}(Y_j) \vert ^p)^{1/p} n \varrho. \end{aligned} (8.43)

    By (8.36), we find that

    \frac{1}{n}Var\left(\tilde{ \mathfrak{U}}^{(3)}_{1, n}\right) \rightarrow 0.

    Now, we need to establish that the summands in \tilde{ \mathfrak{U}}^{(1)}_{1, n} are asymptotically independent, allowing us to apply the Lindeberg-Feller conditions for the asymptotic normality of the finite-dimensional distributions. We can utilize Lemma A.8, where \widehat{ \mathfrak{U}}^{(1)}_{a, n} is \mathcal{F}_{i_a}^{j_a} -measurable with i_a = a(a_n +b_n)+1 and j_a = a(a_n+b_n)+a_n , giving us

    \begin{eqnarray} \left\vert \mathbb{E} \left( \exp \left( it n^{-1/2} \tilde{ \mathfrak{U}}^{(1)}_{1, n}\right)\right) - \prod\limits_{i = 0}^{\nu_n-1} \mathbb{E} \left( \exp \left( it n^{-1/2} \widehat{ \mathfrak{U}}^{(1)}_{i, n}\right)\right)\right\vert \leq 16 \nu_n \beta(b_n+1) , \end{eqnarray} (8.44)

    which tends to zero using (8.36); hence, the asymptotic independence is achieved. We can also see that

    \begin{aligned} & \frac{1}{n} Var\left(\tilde{\mathfrak{U}}^{(1)}_{1, n}\right) \lesssim \frac{\nu_n a_n}{n \phi^2(h_n) N^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \\ &\;\;\;\;\rightarrow \frac{1}{ \phi^2(h_n) N^2}\times \left[\int_{[0, h]} K_1^2\left(\frac{t}{h_n}\right)dt\right]\times Var\left(K_2 \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}(Y_i)\right) \text { (since } v_n a_n / n \rightarrow 1 \text { ) }. \\&\;\;\;\;: = \mathbb{V}(X, Y). \end{aligned} (8.45)

    Up to this point, we have addressed the final condition for finite-dimensional convergence. It is important to observe that, for sufficiently large n , the set

    \{\vert\widehat{ \mathfrak{U}}^{(1)}_{i, n}\vert > \epsilon \mathbb{V}(X, Y) \sqrt{n}\}

    is empty, thus

    \begin{equation} \frac{1}{n}\sum\limits_{i = 0}^{\nu_n-1} \mathbb{E} \left( \widehat{ \mathfrak{U}}^{(1)2}_{i, n} \mathbb{1}_{\{ \vert\widehat{ \mathfrak{U}}^{(1)}_{i, n}\vert > \epsilon \mathbb{V}(X, Y) \sqrt{n} \}}\right) \rightarrow 0. \end{equation} (8.46)

    Therefore, we conclude the proof of finite-dimensional convergence. Now, we shift our focus to asymptotic equicontinuity, aiming to demonstrate that:

    \begin{eqnarray} \lim\limits_{\delta\rightarrow 0} \lim\limits_{n \rightarrow \infty} \mathbb{P} \left\{ {\sqrt{nh^m\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}} \left\| \tilde{\mathfrak{U}}_{1, n}(h_n, \mathbf{i}) \right\|_{\mathcal{FK}(\delta, \|.\|_\zeta)} > \epsilon \right\} = 0, \end{eqnarray} (8.47)

    where

    \begin{eqnarray} \mathcal{FK}(\delta, \|.\|_\zeta)&: = & \left\{\tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})- \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) :\right. \\ && \left.\left\|\tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})- \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) \right\| < \delta, \tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i}), \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) \in \mathcal{FK} \right\}, \end{eqnarray} (8.48)

    for

    \begin{eqnarray} \tilde{\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i}) & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_{2, 1} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}_1(Y_i) - \mathbb{E}\left({\mathfrak{U}}_{1, n}^{\prime}(h_n, \mathbf{i})\right), \\ \tilde{\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i}) & = & N^{-1} \frac{1}{n} \sum\limits_{i = 1}^n \xi_i \frac{1}{ \phi(h_n)} K_{2, 2} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right) \tilde{\mathbb H}_2(Y_i) - \mathbb{E}\left({\mathfrak{U}}_{1, n}^{\prime\prime}(h_n, \mathbf{i})\right). \end{eqnarray} (8.49)

    Now, we will employ the chaining technique from [13] and follow the approach outlined in [39] for the conditional setting. The fundamental idea is to decompose a sequence ({ X}_1, \ldots, { X}_n) into 2\upsilon_n equal-sized blocks, each of length a_n , and a residual block of length n-2\upsilon_na_n , where, for 1\leqslant j\leqslant \upsilon_n :

    \begin{eqnarray*} H_j& = & \{ i : 2(j-1)a_n+1 \leqslant i \leqslant (2j-1)a_n \}, \\ T_j& = & \{ i : (2j-1)a_n+1 \leqslant i \leqslant 2ja_n \}, \\ R& = & \{ i : 2\upsilon_na_n+1 \leqslant i \leqslant n \}. \end{eqnarray*}
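
    The following sketch constructs the index sets H_j , T_j and R above (1-based indices, matching the display); it is only an illustration of the partition.

```python
def blocks(n, a_n, v_n):
    # H_j: main blocks, T_j: alternating blocks, R: residual block.
    H = [list(range(2 * (j - 1) * a_n + 1, (2 * j - 1) * a_n + 1))
         for j in range(1, v_n + 1)]
    T = [list(range((2 * j - 1) * a_n + 1, 2 * j * a_n + 1))
         for j in range(1, v_n + 1)]
    R = list(range(2 * v_n * a_n + 1, n + 1))
    return H, T, R

H, T, R = blocks(n=20, a_n=3, v_n=3)
print(H)  # [[1, 2, 3], [7, 8, 9], [13, 14, 15]]
print(T)  # [[4, 5, 6], [10, 11, 12], [16, 17, 18]]
print(R)  # [19, 20]
```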

    The values of \upsilon_n, a_n are specified below. Another essential ingredient in this proof is a sequence of independent blocks ({\zeta}_1, \ldots, {\zeta}_n) such that:

    \mathcal{L}(\zeta_1, \ldots, {\zeta}_n) = \mathcal{L}({ X}_1, \ldots, { X}_{a_n})\times \mathcal{L}({ X}_{a_n+1}, \ldots, { X}_{2a_n}) \times \cdots.

    Along the same lines as [39], the results of [73] on \beta -mixing are applied to get, for any measurable set A :

    \begin{aligned} &\left\vert\mathbb{P}\left\{ \left(\zeta_{1}, \ldots, \zeta_{a_n}, {\zeta}_{2a_n+1}, \ldots, {\zeta}_{3a_n}, \ldots, {\zeta}_{2(\upsilon_n-1)a_n+1}, \ldots, {\zeta}_{2\upsilon_na_n} \right) \in A \right\}\right. \\ & \;\;\;\;\;\;\;\left.- \mathbb{P}\left\{\left({ X}_1, \ldots, { X}_{a_n}, { X}_{2a_n+1}, \ldots, { X}_{3a_n}, \ldots, { X}_{2(\upsilon_n-1)a_n+1}, \ldots, { X}_{2\upsilon_na_n}\right) \in A\right\} \right\vert\\ &\;\;\; \leqslant 2(\upsilon_n-1)\beta_{a_n}. \end{aligned} (8.50)

    We will focus solely on the independent blocks represented as \zeta_i = (\eta_i, {\varsigma}_i) sequences, rather than dealing with the dependent variables. We will utilize a strategy akin to the one employed in Lemma 8.2 to transition from the sequence of locally stationary random variables to the stationary one:

    \begin{aligned} &\mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{i = 1}^n\left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, X_i)}{h_n}\right)\tilde{h}(Y_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(b, \Vert\cdot\Vert_\zeta)}} > \epsilon\right\} \\ \;\;\;&\leq 2\mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j} \left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(b, \Vert\cdot\Vert_\zeta)}} > \epsilon^\prime \right\} \\ &\;\;\;\;\;\;+ 2(\nu_n-1)\beta_{a_n} + o_ \mathbb{P}(1). \end{aligned} (8.51)

    We choose

    a_n = [(\log{n})^{-1}(n^{p-2}\phi^p(h_n))^{1/2(p-1)}]

    and

    \upsilon_n = \left[\frac{n}{2a_n}\right]-1.
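
    A sketch of these choices, assuming p = 10 and \phi(h_n) = h_n with h_n = n^{-1/5} (toy rates chosen only to make the numbers concrete):

```python
import numpy as np

# Illustrative computation of a_n and v_n under assumed rates.
p = 10
for n in (10**4, 10**5, 10**6):
    h_n = n ** (-1 / 5)
    phi = h_n
    a_n = max(1, int((n ** (p - 2) * phi**p) ** (1 / (2 * (p - 1))) / np.log(n)))
    v_n = n // (2 * a_n) - 1
    print(n, a_n, v_n, 2 * v_n * a_n <= n)  # the 2*v_n blocks fit in {1,...,n}
```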

    Exploiting condition (ⅴ) from Assumption 4, we deduce that (\upsilon_n-1)\beta_{a_n}\longrightarrow 0 as n\rightarrow \infty . This primarily pertains to the first term on the right-hand side of (8.51). Given the independence of the blocks, we symmetrize the sequence using a set \{\epsilon_j\}_{j\in \mathbb{N}^*} of i.i.d. Rademacher variables, where

    \mathbb{P}(\epsilon_j = 1) = \mathbb{P}(\epsilon_j = -1) = 1/2.
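
    As a sketch of the symmetrization step, one Rademacher sign is attached to each independent block; for centered independent blocks with a symmetric law, the signed and unsigned sums share the same distribution (the Gaussian block values below are purely an illustrative assumption).

```python
import numpy as np

# One Rademacher sign per independent block.
rng = np.random.default_rng(2)
v_n, a_n = 50, 20
block_vals = rng.standard_normal((v_n, a_n))   # stand-ins for the eta-blocks
eps = rng.choice([-1.0, 1.0], size=v_n)        # P(eps = 1) = P(eps = -1) = 1/2

S = block_vals.sum(axis=1)                     # per-block partial sums
print(S.sum(), (eps * S).sum())                # equal in distribution
```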

    It is noteworthy that the sequence \{\epsilon_j\}_{j\in \mathbb{N}^*} is independent of the sequence \left\{\boldsymbol{\xi}_i = ({\varsigma}_i, {\zeta}_i)\right\}_{i \in \mathbb{N}^*} . Now, our objective is to establish, for all \epsilon > 0 ,

    \begin{eqnarray} {\lim\limits_{\delta\rightarrow 0} \lim\limits_{n \rightarrow \infty} \mathbb{P}\left\{\left\Vert{(n\phi\left(h_n\right))}^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j} \left( \xi_i K_{2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\Vert_{\mathscr{FK}_{(b, \Vert\cdot\Vert_\zeta)}} > \epsilon \right\} = 0.} \end{eqnarray} (8.52)

    Define the semi-norm:

    \begin{eqnarray} \widetilde{d}_{n\phi, 2}&: = & \left((n\phi\left(h_n\right))^{-1/2}h^{m/2}_n N^{-1} \sum\limits_{j = 1}^{\nu_n}\sum\limits_{i \in H_j}\left\vert\left( \xi_i K_{2, 1} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}_1(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}^\prime}_{1, n}(h_n, \mathbf{i})\right)\right) \right.\right. \\ && \left.\left.-\left( \xi_i K_{2, 2} \left(\frac{d_{\theta_i}(x_i, \eta_i)}{h_n}\right)\tilde{h}_2(\varsigma_i)- \mathbb{E}\left({\mathfrak{U}^{\prime\prime}}_{1, n}(h_n, \mathbf{i})\right)\right)\right\vert^2\right)^{1/2}, \end{eqnarray} (8.53)

    and the covering number defined for any class of functions \mathscr{E} by :

    \widetilde{N}_{n\phi, 2}(u, \mathscr{E}) : = N_{n\phi, 2}(u, \mathscr{E}, \widetilde{d}_{n\phi, 2}).

    Building upon the preceding discussion, we can bound (8.52), with further elaboration available in [39]. Similarly, by following the methodology in [39] and referring back to [13], the independence between the blocks, coupled with Assumption 6 ⅱ) and the utilization of [89, Lemma 5.2], ensures equicontinuity, thus laying the foundation for weak convergence. Now, our task is to establish that:

    \mathbb{P}\left\{\left\Vert\mu_n^{(R)}(\varphi, \mathbf{t})\right\Vert_{\mathscr{F}_m\mathfrak{K}^m_\Theta} > \lambda\right\} \to 0 \; \; \mbox{as}\; \; n\to \infty.

    In this proof, for clarity, we present the case where m = 2 , with different sizes for a_n and b_n , where b_n denotes the size of the alternate blocks. Both a_n and b_n satisfy:

    \begin{equation} b_{n} \ll a_{n}, \quad\left(v_{n}-1\right)\left(a_{n}+b_{n}\right) < n \leqslant v_{n}\left(a_{n}+b_{n}\right), \end{equation} (8.54)

    and set, for 1 \leqslant j \leqslant v_{n}-1:

    \begin{align*} \mathrm{H}_{j}^{(\mathrm{U})} & = \left\{i:(j-1)\left(a_{n}+b_{n}\right)+1 \leqslant i \leqslant(j-1)\left(a_{n}+b_{n}\right)+a_{n}\right\}, \\ \mathrm{T}_{j}^{(\mathrm{U})} & = \left\{i:(j-1)\left(a_{n}+b_{n}\right)+a_{n}+1 \leqslant i \leqslant(j-1)\left(a_{n}+b_{n}\right)+a_{n}+b_{n}\right\}, \\ \mathrm{H}_{v_{n}}^{(\mathrm{U})} & = \left\{i:\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+1 \leqslant i \leqslant n \wedge\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+a_{n}\right\} , \\ \mathrm{T}_{v_{n}}^{(\mathrm{U})} & = \left\{i:\left(v_{n}-1\right)\left(a_{n}+b_{n}\right)+a_{n}+1 \leqslant i \leqslant n\right\}. \end{align*}

    We have:

    \begin{eqnarray*} \nonumber \mu_n^{(R)}(\varphi, \mathbf{i}) & = & \sqrt{n\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}\left\{ \mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})- \mathbb{E} \left(\mathfrak{U}_n^{(R)}(\varphi, \mathbf{i})\right)\right\} \nonumber\\ & = &\frac{\sqrt{n\phi^{1/m}_{\mathbf x, \boldsymbol\theta}(h_n)}}{n(n-1)}\sum\limits_{i_1 \neq i_2}^{n} \xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &\leqslant&\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & &+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1\neq i_2 \, i_1, i_2 \in H_p^{(U)} }\phi(h_n)\xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & &+2 \frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\\nonumber\\ &&+2 \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\nonumber\\ &&\qquad\qquad \qquad\qquad\qquad\qquad \left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\\nonumber\\ &&+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{T}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ &&+\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2} \sum\limits_{i_1, i_2 \in \mathrm{T}_{p}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, 
{Y}_j\right)\right)\right.\nonumber\\ &&\left.\qquad\qquad \qquad\qquad\qquad\qquad - \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\nonumber\\ & = : & \rm I^{\prime} + \rm II^{\prime}+ \rm III^{\prime} + \rm IV^{\prime} + \rm V^{\prime} + \rm VI^{\prime}. \end{eqnarray*}

    We will employ blocking arguments and address the resulting terms. Let's begin by examining the first term \rm I^{\prime} . We have:

    \begin{array}{r} \mathbb{P}\left\{\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p \neq q}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(X_{i_1}, X_{i_2}\right), \left(Y_{i_1}, Y_{i_2}\right)\right)\right.\right. \\ \left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(X_{i_1}, X_{i_2}\right), \left(Y_{i_1}, Y_{i_2}\right)\right)\right]\right\} \|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\} \\ \;\;\;\leqslant \mathbb{P}\left\{\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p \neq q}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right.\right. \\ \;\;\;\left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right]\right\} \|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\}+2 v_n \beta\left(b_n\right) . \end{array}

    Notice that \upsilon_n \beta_{b_n}\to 0 and recall that, for all \varphi \in \mathscr{F}_m and

    \boldsymbol{\theta}, \mathbf{x} \in \mathcal{H}^2, \mathbf{y}\in\mathcal{Y}^2: \mathbb{1}_{\left\{d_{\boldsymbol{\theta}}\left(\mathbf{x}, X_{i, n}\right)\leqslant h\right\}} F(\mathbf{y}) \geqslant {\varphi}(\mathbf{y})K_2\left(\frac{d_{\boldsymbol{\theta}}\left(\mathbf{x}, X_{i, n}\right)}{h_n}\right).

    Hence, by the symmetry of F(\cdot) :

    \begin{eqnarray} &&\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2}\right)\right)\right.\right. \left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{t}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &&\lesssim \left\vert\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{ F({\zeta}_i, {\zeta}_j) \mathbb{1}_{\left\{ F > \lambda_n\right\}}\right.\right. \left.\left.- \mathbb{E}\left[ F({\zeta}_i, {\zeta}_j) \mathbb{1}_{\left\{ F > \lambda_n\right\}}\right]\right\}\vphantom{\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p\neq q}^{\upsilon_n}\sum\limits_{i\in H_p^{(U)} }\sum\limits_{j\in H_q^{(U)} }}\right\vert. \end{eqnarray} (8.55)

    We are going to use Chebyshev's inequality, Hoeffding's trick, and Hoeffding's inequality, respectively, to obtain:

    \begin{eqnarray} &&{\mathbb{P}\left\{\left\vert\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2}\left\{ F({\zeta}_i, {\zeta}_j) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right.\right.\right.}\left.\left.\left.- \mathbb{E}\left[ F({\zeta}_i, {\zeta}_j) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right]\right\}\vphantom{\frac{1}{\sqrt{n\phi(h_n)}} \sum\limits_{p\neq q}^{\upsilon_n}\sum\limits_{i\in H_p^{(U)} }\sum\limits_{j\in H_q^{(U)} }}\right\vert > \delta\right\}\\ &&\lesssim\delta^{-2}n^{-1}\phi^{-1}(h_n) Var\left(\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} F({\zeta}_i, {\zeta}_j) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right)\\ &&\lesssim c_2 \upsilon_n \delta^{-2}n^{-1}\phi^{-1}(h_n) Var\left(\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} F({\zeta}_i, {\zeta}^{\prime}_j) \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right)\\ &&\lesssim 2c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \mathbb{E}\left[\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right]. \end{eqnarray} (8.56)

    Under Assumption 6 ⅲ), we have for each \lambda > 0:

    \begin{aligned} & c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \mathbb{E}\left[\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}}\right] \nonumber\\ & \;\;\; = c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_0^{\infty} {\mathbb{P}\left\{\left( F({\zeta}_1, {\zeta}_2) \right)^2 \mathbb{1}_{\left\{ F > \lambda_n \right\}} \geqslant t \right\} dt} \nonumber\\ & \;\;\; = c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n) \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_0^{\lambda_n} {\mathbb{P}\left\{ F > \lambda_n\right\} dt } \nonumber\\ &\;\;\;\;\;\;\;\; +c_2 \upsilon_n \delta^{-2}n^{-2}\phi^{-1}(h_n)\sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}}\phi(h_n) \xi_{i_1} \xi_{i_2} \int_{\lambda_n}^{\infty} {\mathbb{P}\left\{\left( F\right)^2 > t\right\}dt}, \end{aligned}

    which tends to 0 as n\to\infty . Terms \rm II^{\prime} , \rm V^{\prime} , and \rm VI^{\prime} will be handled similarly to the previous term. However, \rm II^{\prime} and \rm VI^{\prime} deviate from this pattern due to the variables \{{\zeta}_i, {\zeta}_j\}_{i, j \in H_p^{(U)} } (or \{{\zeta}_i, {\zeta}_j\}_{i, j \in T_p^{(U)} } for \rm VI^{\prime} ) belonging to the same blocks. Regarding term \rm IV^{\prime} , its analysis can be inferred from the study of terms \rm I^{\prime} and \rm III^{\prime} . Considering term \rm III^{\prime} , we have:

    \begin{eqnarray} &\mathbb{P}&\left\{\left\Vert\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2} \left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({X}_i, {X}_j), ({Y}_i, {Y}_j\right)\right)\right.\right.\right.\\ &&\qquad \left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({X}_{i_1}, {X}_{i_2}), ({Y}_{i_1}, {Y}_{i_2})\right)\right]\right\}\vphantom{\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i \in H_p^{(U)} }\sum\limits_{q : |q-p|\geqslant2}^{\upsilon_n}} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\} \\ && \leqslant \mathbb{P}\left\{\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2}\right)\right)\right.\right.\right.\\ &&\qquad\left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \vphantom{\frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i \in H_p^{(U)} }\sum\limits_{q : |q-p|\geqslant2}^{\upsilon_n}} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\} + \frac{\upsilon_n^2 a_n b_n \beta({a_n})}{\sqrt{n\phi(h_n)}} . \end{eqnarray} (8.57)

    We also have

    \begin{gathered}&\mathbb{P}\left\{\| \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathbf{H}_p^{(\mathrm{U})}} \sum\limits_{q:|q-p| \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathbf{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{\left(\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right.\right.\right. \\ &\left.\left.-\mathbb{E}\left[\mathfrak{G}_{\varphi, \mathbf{i}}^{(R)}\left(\left(\varsigma_{i_1}, \varsigma_{i_2}\right), \left(\zeta_{i_1}, \zeta_{i_2}\right)\right)\right]\right\} \|_{\mathscr{F}_2 \mathscr{K}^2} > \delta\right\}\\ &\;\leqslant \mathbb{P}\left\{\left\Vert \frac{1}{\sqrt{n\phi(h_n)}}\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}}\phi(h_n)\xi_{i_1} \xi_{i_2}\left\{\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(\left({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2}\right)\right)\right.\right.\right.\\ &\qquad\qquad\qquad\left.\left.\left.- \mathbb{E}\left[\mathfrak G_{\varphi, \mathbf{i}}^{(R)}\left(({\varsigma}_{i_1}, {\varsigma}_{i_2}), ({\zeta}_{i_1}, {\zeta}_{i_2})\right)\right]\right\} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} > \delta \right\}. \end{gathered}

    Since Eq (8.55) is still satisfied, the problem is reduced to

    \mathbb{P}\left\{\vert \frac{1}{\sqrt{n \phi\left(h_n\right)}} \sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathrm{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2}\left\{F\left(\zeta_i, \zeta_j\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right.\right.\\ \;\;\;\;\;\;\;\;\left.\left.-\mathbb{E}\left[F\left(\zeta_i, \zeta_j\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right]\right\} \vert > \delta\right\}\\ \;\;\;\lesssim \delta^{-2} n^{-1} \phi^{-1}\left(h_n\right) Var\left(\sum\limits_{p = 1}^{v_n} \sum\limits_{i_1 \in \mathrm{H}_p^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_n} \sum\limits_{i_2 \in \mathrm{T}_q^{(\mathrm{U})}} \phi\left(h_n\right) \xi_{i_1} \xi_{i_2} F\left(\zeta_i, \zeta_j\right) \mathbb{1}_{\left\{F > \lambda_n\right\}}\right),

    and we follow the same procedure as in (8.56). The rest has already been shown to be asymptotically negligible.

    Finally, with

    |r^{(m)}(\varphi, \boldsymbol{\theta}, \mathbf{u}, \mathbf{x}) - \mathbb{E}\left(\mathfrak{U}_n(\varphi, \mathbf{i})\right)|\rightarrow 0,

    and since

    \left(\mathfrak{U}_n(1, \mathbf{i})\right) \underset{ \mathbb{P}}{\rightarrow } 1,

    the weak convergence of our estimator is accomplished.

    Technical lemmas

    The forthcoming proof relies on the arguments delineated in [39,40,165], extended to the single index model framework.

    Lemma 8.1. Let K_2(\cdot) denote a one-dimensional kernel function satisfying Assumption 2 part i). If Assumption 1 holds, then:

    \begin{equation} i)\; \; \mathbb{E}\left\lvert\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert\lesssim \frac{m \phi^{m-1}(h_n)}{n h_n }; \end{equation} (8.58)
    \begin{equation} ii)\; \; \; \; \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2} \left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right] \lesssim \frac{m \phi^{m-1}(h_n)}{n h_n }+\phi^{m}(h_n); \end{equation} (8.59)
    \begin{equation} iii)\; \; \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}^2\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right]\sim \phi^{m}(h_n). \end{equation} (8.60)
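
    Before turning to the proof, the following is a quick Monte Carlo illustration of iii): with the triangular kernel K_2(t) = (1-t)\mathbb{1}_{[0, 1]}(t) and toy distances d \sim \mathrm{Uniform}(0, 1) (so that the small-ball function is \phi(h) = h , an illustrative assumption), \mathbb{E}[\prod_k K_2^2(d_k/h)] scales like \phi^m(h) .

```python
import numpy as np

# Monte Carlo check that E[prod_k K2^2(d_k/h)] ~ phi(h)^m with phi(h) = h.
rng = np.random.default_rng(3)
m, N = 2, 10**6

def K2(t):
    return np.where((t >= 0) & (t <= 1), 1.0 - t, 0.0)

for h in (0.4, 0.2, 0.1, 0.05):
    d = rng.uniform(0, 1, size=(N, m))          # independent toy distances
    est = np.mean(np.prod(K2(d / h) ** 2, axis=1))
    print(h, est, est / h**m)                   # ratio stabilizes near (1/3)**m
```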

    Proof of Lemma 8.1. For the first inequality i) , by assuming that the kernel function K_2(\cdot) is an asymmetrical triangle kernel, that is, K_2(x) = (1 -x) \mathbb{1}_{(x \in [0, 1])} , we have

    \begin{aligned} &\label{expectation of the difference K2} \mathbb{E}\left\lvert \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert \\ &\;\;\; = \mathbb{E} \left[\left\lvert \sum\limits_{k = 1}^m \left(K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert\right. \\ & \;\;\;\;\;\;\;\;\left. \times \left\lvert\prod\limits_{i = 1}^{k-1} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert\times \left\lvert\prod\limits_{j = k+1}^{m} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert\right] \\ & \qquad \qquad\qquad \qquad\qquad\qquad\qquad \qquad \qquad\qquad \qquad \qquad \qquad \qquad\text{(Using a telescoping argument)} \\ &\;\;\;\leq \left\{ \mathbb{E} \left[\left\lvert \sum\limits_{k = 1}^m \left( K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert\right]^3 \right\}^{1/3} \\ & \;\;\;\;\;\;\;\;\times \left\{ \mathbb{E} \left\lvert\prod\limits_{i = 1}^{k-1} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert^3\right\}^{1/3} \times \left\{ \mathbb{E}\left\lvert\prod\limits_{j = k+1}^{m} K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert^3 \right\}^{1/3} \\ &\qquad\qquad\qquad\qquad \qquad\qquad \;\;\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad\text{(By Hölder's inequality)} \\ &\;\;\;\leq \left\{ \mathbb{E}\left[\sum\limits_{k = 1}^m \left\lvert \left( K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right) \right\rvert \right]^3 \right\}^{1/3} \\ &\;\;\;\;\;\;\;\;\times\left\{\prod\limits_{i = 1}^{k-1} \left( \mathbb{E}\left\lvert K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\rvert ^{3p_i}\right)^{1/p_i} \right\}^{1/3} \\ & \;\;\;\;\;\;\;\;\times \left\{\prod\limits_{j = k+1}^{m} \mathbb{E} \left(\left\lvert K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right\rvert^{3q_j}\right)^{1/q_j} \right\}^{1/3} \qquad\text{(By Hölder's inequality)} \\ &\;\;\;\lesssim \left\{\sum\limits_{k = 1}^m \mathbb{E} \left\lvert \frac{1}{n h_n } U_{i_k, n}^{(i_k/n)} \right\rvert^3 \right\}^{1/3} \times \left\{\prod\limits_{i = 1}^{k-1} \left( \mathbb{E}\left\lvert \mathbb{1}_{\{d_{\theta_k}(x_k, X_{i_k, n})\leq h\}} \right\rvert^{3p_i}\right)^{1/p_i}\right. \\ &\;\;\;\;\;\;\;\;\left.\times \prod\limits_{j = k+1}^{m}\left( \mathbb{E}\left\lvert \mathbb{1}_{\{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})\leq h\}}\right\rvert^{3q_j}\right)^{1/q_j}\right\}^{1/3}\qquad \qquad\qquad\;\;\text{(By Assumption 1)} \\ &\;\;\;\lesssim \left\{\sum\limits_{k = 1}^m \frac{1}{n^3 h^3 } \mathbb{E} \left\lvert U_{i_k, n}^{(i_k/n)} \right\rvert^3\right\}^{1/3}\times \left\{ \prod\limits_{i = 1}^{k-1} \left(F^{3p_i}(h, x_k)\right)^{1/p_i} \prod\limits_{j = k+1}^{m}\left(F^{3q_j}_{i_k/n}(h, x_k)\right)^{1/q_j}\right\}^{1/3} \\ & \\ &\;\;\;\lesssim\frac{1}{n h_n } \left\{\sum\limits_{k = 1}^m \mathbb{E} \left\lvert U_{i_k, n}^{(i_k/n)} \right\rvert^3\right\}^{1/3}\times \left\{ \prod\limits_{i = 1}^{k-1} C_d \phi^3(h_n) f_1^3(x_k) \times \prod\limits_{j = k+1}^{m}C_d \phi(h_n)^3 f_1^3(x_k)\right\}^{1/3} \\ & \;\;\;\lesssim \frac{m\phi^{m-1}(h_n)}{n h_n }. \end{aligned} (8.61)

    For the second inequality ii) , we have:

    \begin{aligned} & \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2} \left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right] \\ &\;\;\; = \mathbb{E}\left[ \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)- \prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)+\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] . \end{aligned}

    By linearity of the expectation, inequality i) , and the bound

    \begin{equation*} \mathbb{E}\left[\prod\limits_{k = 1}^m K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \right]\lesssim \phi^{m}(h_n), \end{equation*}

    which holds by Assumption 1 part ⅳ), the proof of this inequality is complete. Now, we consider the last one. Set

    \widetilde{K}^2_2(x_k): = K_{2}^2\left({d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}\right).

    We have

    \begin{aligned} & \mathbb{E}\left[\left(\prod\limits_{k = 1}^m \widetilde{K}^2_2\left(\frac{x_k}{h_n}\right)\right)\right] = \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \widetilde{K}_{2}^{2}\left(\frac{y_k}{h_n}\right) \mathbb{P}(dy_1, \ldots, dy_k) \\ &\;\;\; = - \frac{2}{h_n} \int_{0}^h\cdots\int_{0}^h \prod\limits_{j = 1, j\neq k }^m\widetilde{K}_{2}\left(y_j\right) \\&\;\;\;\;\;\;\;\;\times\int_{0}^h \widetilde{K}_{2}\left(y_\ell\right) \widetilde{K}_{2}^{\prime}\left(y_\ell\right) \mathbb{P}(dy_1, \ldots, dy_{\ell-1}, y_\ell, dy_{\ell+1}, \ldots, y_k)dy_\ell\; \; \text{(Integration by parts)} \\& \\ &\;\;\; = \frac{(-2)^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1 }^m\widetilde{K}_{2}\left(y_k\right) \widetilde{K}_{2}^{\prime}\left(y_k\right) \mathbb{P}(y_1, \ldots, y_k)dy_1\ldots dy_k \\ &\;\;\;\sim \frac{2^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \widetilde{K}_{2}\left(y_k\right) \widetilde{K}_{2}^\prime\left(y_k\right) \phi_k(y_k) d\mathbb{P}(y_1, \ldots, y_k) \\ &\;\;\; = \frac{2^m}{h^m} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m \left(1-\frac{y_k}{h_n}\right)\phi_k(y_k) d\mathbb{P}(y_1, \ldots, y_k) \\ & \quad \qquad\qquad \qquad\qquad \qquad\text { (Using Assumption } \left.2 \text { ii) and } K_2(x)=(1-x) I(x \in[0,1])\right) \\ &\;\;\; = \frac{2^m}{h^{2m}} \int_{0}^h\cdots\int_{0}^h \left(\int_{0}^y \cdots\int_{0}^y \prod\limits_{k = 1}^m\phi_k(\epsilon_k)d(\epsilon_1, \ldots, \epsilon_k)\right) d\mathbb{P}(y_1, \ldots, y_k) \\ & \qquad \quad\qquad \qquad\qquad \qquad\text{(By an integration by parts)} \\ &\;\;\;\sim \frac{2^m}{h^{2m}} \int_{0}^h\cdots\int_{0}^h \prod\limits_{k = 1}^m y_k \phi_k(y_k) d\mathbb{P}(y_1, \ldots, y_k)\sim \frac{1}{h^{2m}}\phi^{m}(h_n) h^{2m} \\&\;\;\;\sim \phi^{m}(h_n). \end{aligned}

    The final result is established by utilizing the small-ball lower bound provided in (2.11). Consequently, inequality (8.60) follows.

    In the ensuing discussion, we will present a lemma that can be regarded as a technical result in the proof of our proposition.

    Lemma 8.2. Consider \mathscr{F}_{m}\mathfrak{K}^m_\Theta as a uniformly bounded class of measurable canonical functions, where m\geq 2 . Suppose there exist finite constants \boldsymbol{a} and \boldsymbol{b} such that the \mathscr{F}_{m}\mathfrak{K}^m_\Theta covering number satisfies:

    \begin{equation} N(\epsilon, \mathscr{F}_m\mathfrak{K}^m_\Theta, \Vert\cdot\Vert_{L_2(Q)}) \leq \boldsymbol{a}\epsilon^{-\boldsymbol{b}}, \end{equation} (8.62)

    for every \epsilon > 0 and every probability measure Q . If the mixing coefficients \beta of the locally stationary sequence \{Z_i = (X_{\mathbf{i}, n}, {\mathfrak W}_{\mathbf{i}, n})\}_{i \in \mathbb{N}^\star} satisfy:

    \begin{equation} \beta(k) k^r \rightarrow 0, \; \; \mathit{\text{as}}\; \; k \rightarrow \infty, \end{equation} (8.63)

    for some r > 1 , then:

    \begin{equation} \sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m} \mathbb{P} \left[ h^{m/2}_n \phi^{m/2}(h_n) n^{-m+1/2} \sum\limits_{\mathbf{i} \in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H(Z_{i_1}, \ldots, Z_{i_m}) \right] \rightarrow 0. \end{equation} (8.64)
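
    To make the polynomial covering-number condition (8.62) concrete, here is a toy greedy \epsilon -net computation in L_2(Q) , with Q an empirical measure, for the translation class \{x \mapsto K_2((x-t)/h) : t \in [0, 1]\} ; this class and the greedy scheme are illustrative assumptions, used only to display the N(\epsilon) \lesssim \boldsymbol{a}\epsilon^{-\boldsymbol{b}} growth pattern.

```python
import numpy as np

# Greedy epsilon-net for a translation class of triangular kernels in L2(Q).
rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 500)                     # sample points defining Q
h = 0.2

def K2(t):
    return np.where((t >= 0) & (t <= 1), 1.0 - t, 0.0)

grid = np.linspace(0, 1, 2001)
funcs = K2((x[None, :] - grid[:, None]) / h)   # one row per class member

def greedy_cover_size(F, eps):
    uncovered = np.ones(len(F), dtype=bool)
    count = 0
    while uncovered.any():
        c = np.flatnonzero(uncovered)[0]       # first uncovered function
        count += 1
        dist = np.sqrt(np.mean((F - F[c]) ** 2, axis=1))  # L2(Q) distances
        uncovered &= dist > eps
    return count

for eps in (0.2, 0.1, 0.05, 0.025):
    print(eps, greedy_cover_size(funcs, eps))  # grows roughly like 1/eps
```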

    Remark 8.3. As mentioned before, {\mathfrak W}_{\mathbf{i}, n} will be equal to 1 or \varepsilon_{\mathbf{i}, n} = \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}} . In the proof of the previous lemma, {\mathfrak W}_{\mathbf{i}, n} will be equal to \varepsilon_{\mathbf{i}, n} = \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}} , and we will use the notation \mathfrak{W}^{(u)}_{\mathbf{i}, n} to indicate \sigma\left(\mathbf{u}, \mathbf{x}\right)\varepsilon_{\mathbf{i}} .

    Proof of Lemma 8.2. The proof of this lemma relies on the blocking method, specifically drawing upon techniques introduced by [13]. The central idea involves partitioning the strictly stationary sequence (Z_1, \ldots, Z_n) into 2v_n blocks, each of length a_n , along with a residual block of length n-2v_n a_n . This approach, known as Bernstein's method and discussed in [22], facilitates the application of symmetrization and various techniques designed for i.i.d. random variables. To establish the independence between the blocks, the smaller blocks are placed between two consecutive larger blocks, and their contribution should be asymptotically negligible. Next, introduce the sequence of independent blocks \left(\eta_{1}, \ldots, \eta_{n}\right) such that:

    \mathscr{L}\left(\eta_{1}, \ldots, \eta_{n}\right) = \mathscr{L}\left(\mathrm{Z}_{1}, \ldots, \mathrm{Z}_{a_{n}}\right) \times \mathscr{L}\left(\mathrm{Z}_{a_{n}+1}, \ldots, \mathrm{Z}_{2 a_{n}}\right) \times \cdots
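
    A toy sketch of this independent-block construction follows: each \eta -block keeps the within-block joint law of the corresponding Z -block, while distinct blocks are drawn independently (an AR(1) block law is assumed purely for illustration).

```python
import numpy as np

# Independent blocks: regenerating the recursion per block severs all
# cross-block dependence while preserving each block's joint law.
rng = np.random.default_rng(5)
n, a_n = 1000, 50
v_n = n // (2 * a_n)

def ar1_block(length, rho=0.7):
    z = np.empty(length)
    z[0] = rng.standard_normal()
    for t in range(1, length):
        z[t] = rho * z[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()
    return z

eta = np.concatenate([ar1_block(a_n) for _ in range(2 * v_n)])
print(eta.shape)   # (1000,)
```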

    An application of the result of [73] implies that for any measurable set A :

    \begin{aligned} & \left\vert\mathbb{P}\left\{\left(\eta_1, \ldots, \eta_{a_n}, \eta_{2 a_n+1}, \ldots, \eta_{3 a_n}, \ldots, \eta_{2\left(v_n-1\right) a_n+1}, \ldots, \eta_{2 v_n a_n}\right) \in \mathrm{A}\right\}\right. \\ & \quad\;\;\;\;\;\;\;\;\;\;\;\left.-\mathbb{P}\left\{\left(\mathrm{Z}_1, \ldots, \mathrm{Z}_{a_n}, \mathrm{Z}_{2 a_n+1}, \ldots, \mathrm{Z}_{3 a_n}, \ldots, \mathrm{Z}_{2\left(v_n-1\right) a_n+1}, \ldots, \mathrm{Z}_{2 v_n a_n}\right) \in \mathrm{A}\right\}\right\vert \\ & \leqslant 2\left(v_n-1\right) \beta\left(a_n\right) . \end{aligned} (8.65)

    Since we are working with a locally stationary sequence (X_1, \ldots, X_n) , the sequence of independent blocks used subsequently is denoted by \{\eta_i\}_{i\in \mathbb{N}^*} . We decompose the process based on the distribution of these blocks:

    \begin{aligned} & \sum\limits_{i_1 \neq i_2}^{n}\frac{1}{h^{2}\phi^2(h_n)}\prod\limits_{k = 1}^2\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\}{\mathfrak W}_{i_1, i_2, \varphi, n} \\ &\;\;\; = \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \frac{ {\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+2 \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+2 \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{T}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;\;\;\;\;\;+\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2} \sum\limits_{i_1, i_2 \in \mathrm{T}_{p}^{(\mathrm{U})}} \frac{{\mathfrak W}_{i_1, i_2, \varphi, n}}{h^{2}\phi^2(h_n)} \prod\limits_{k = 1}^m\left\{K_{1}\left(\frac{u_k-i_k/n}{h_n}\right)K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right\} \\ &\;\;\;: = \mathrm{I}+\mathrm{II}+\mathrm{III}+\mathrm{IV}+\mathrm{V}+\mathrm{VI} . \end{aligned} (8.66)

    (I): The same type of block but not the same block: Assume that the independent blocks \{\eta_i\}_{i\in \mathbb{N}^*} are of size a_n . An application of (8.61) shows that:

    \begin{eqnarray} &&{ \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\right.}\left. \left.\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n} \right\vert > \delta\right) \\ &&\leq \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ && \qquad\qquad\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &&+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &&\qquad\qquad\qquad \qquad\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ &&+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &&\leq \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \qquad\qquad\qquad\left. \left.K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert > \delta\right)+ 2\upsilon_n \beta({b_n}) + o_ \mathbb{P}(1)+o_ \mathbb{P}(1). \end{eqnarray}

    By the fact that

    \begin{eqnarray} &&{ \mathbb{E} \left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right.}\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \left\vert \mathbb{E}\left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \\ &&\qquad \mathbb{E}\left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}}\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) \mathbb{E} \left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right)\right.\right. \\ && \left.\left.-\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] \left[ \sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) - \sigma\left(\mathbf{u}, X_{\mathbf{i}, n}\right) + \sigma\left(\mathbf{u}, X_{\mathbf{i}, n}\right)\right] \right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (\sigma\left(\mathbf{u}, \mathbf{x}\right)+o_ \mathbb{P}(1)) \\ &&\qquad\qquad \mathbb{E}\left\vert \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] \right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (\sigma\left(\mathbf{u}, \mathbf{x}\right)+o_ \mathbb{P}(1)) \left[ \frac{2 \phi(h_n)}{n h_n }\right] \\&&\qquad\qquad \mbox{(using Eq (8.58) of Lemma 8.1 with m = 2)} \\ & = & o_ \mathbb{P}(1), \end{eqnarray}

    and

    \begin{eqnarray} &&{ \mathbb{E}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right.}\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}\left\vert \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[\sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right)\varepsilon_{\mathbf{i}} - \sigma\left(\mathbf{u}, \mathbf{x}\right)\varepsilon_{\mathbf{i}}\right]\right\vert \\ && = n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) \mathbb{E}\left\vert \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[\sigma\left(\frac{\mathbf{i}}{n}, X_{\mathbf{i}, n}\right) - \sigma\left(\mathbf{u}, {\mathbf{x}}\right)\right]\right\vert \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (o_ \mathbb{P}(1)) \int_{0}^h \prod\limits_{k = 1}^2 K_2\left(\frac{y_k}{h_n}\right) dF_{i_k/n}(y_k, x_k) \\ &&\lesssim n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \mathbb{E}(\varepsilon_{\mathbf{i}}) (o_ \mathbb{P}(1))( \phi^2(h_n)) \\& = & o_ \mathbb{P}(1). \end{eqnarray} (8.67)

    We keep the choice of b_n and \upsilon_n such that

    \begin{equation} \upsilon_nb_n^r \leqslant 1, \end{equation} (8.68)

    which implies that 2\upsilon_n \beta(b_n) \to 0 as n \to \infty ; hence, only the first summand, involving the independent blocks, remains to be controlled. For this term, we turn to the work of [11] in the non-fixed kernel setting. Specifically, we define

    f_{i_{1}, \ldots, i_m} = \prod\limits_{k = 1}^m \xi_{i_k} \times H ,

    and \mathcal{F}_{i_{1}, \ldots, i_m} denotes the class of functions associated with these kernels. Subsequently, we will apply [64, Theorem 3.1.1 and Remarks 3.5.4 part 2] for decoupling and randomization. Since m = 2 in our setting, we observe that:

    \begin{aligned} & \mathbb{E} \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ & \;\;\; = \mathbb{E} \left\Vert n^{-3/2} h\phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta}) \right\Vert_{\mathcal{F}_{i_1, i_2}}\\ &\;\;\; \leq c_2 \mathbb{E} \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta})\right\Vert_{\mathcal{F}_{i_1, i_2}} \\ & \;\;\;\leqslant c_2 \mathbb{E} \int_0^{D_{nh}^{(U_1)}}{\log N\left(t, \mathcal{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(1)}\right)} dt, \; \; \; \text{(by Lemma A.5 and Proposition A.6)} \end{aligned} (8.69)

    where D_{nh}^{(U_1)} is the diameter of \mathcal{F}_{i_1, i_2} with respect to the semi-metric \widetilde{d}_{nh, 2}^{(1)} ; these are respectively defined as

    \begin{aligned} &D_{nh}^{(U_1)} : = \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }f_{i_1, i_2} (\boldsymbol{u}, \boldsymbol{\eta}) \right\vert \right\Vert_{\mathcal{F}_{i_1, i_2}} \\ &\;\;\; = \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h \phi^{-1}(h_n) \sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} } \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert \right\Vert_{\mathcal{F}_{2}\mathcal{K}^{2}}, \end{aligned}

    and

    \begin{aligned} &\widetilde{d}_{nh, 2}^{(1)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \\ & \;\;\;: = \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h{\phi^{-1}(h_n)}\sum\limits_{p\neq q}^{\upsilon_n} \epsilon_p \epsilon_q\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} } \left[\xi_{1i_1}\xi_{1i_2}\prod\limits_{k = 1}^2 K_{2, 1} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right. \right. \\ & \;\;\;\;\;\;\;\;-\left. \left.\xi_{2i_1}\xi_{2i_2} \prod\limits_{k = 1}^2 K_{2, 2} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n} \right] \right\vert .\end{aligned}

    Let us consider another semi-norm \widetilde{d}_{nh, 2}^{(2)} :

    \begin{eqnarray*} {\widetilde{d}_{ nh, 2}^{(2)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)} && = \frac{1}{n h^2\phi^2(h_n)}\left[\sum\limits_{i_1\neq i_2}^{n}\left(\xi_{1i_1}\xi_{1i_2}\prod\limits_{k = 1}^2 K_{2, 1} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right. \right. \nonumber\\ && -\left. \left.\xi_{2i_1}\xi_{2i_2} \prod\limits_{k = 1}^2 K_{2, 2} \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n} \right)^2\right]^{1/2}. \end{eqnarray*}

    One can see that

    \widetilde{d}_{nh, 2}^{(1)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \leqslant a_n n^{-1/2}h \phi(h_n) \widetilde{d}_{ nh, 2}^{(2)}\left( \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right).

    We readily infer that

    \begin{aligned} & \mathbb{E} \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p \neq q}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{i_2 \in \mathrm{H}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &\;\;\; \leqslant c_2 \mathbb{E} \int_0^{D_{nh}^{(U_1)}}{\log N\left(t a_n^{-1} n^{1/2}, \mathcal{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(2)}\right)} dt \\ & \;\;\;\leqslant c_2 a_n n^{-1/2}\mathbb{P}\left\{D_{nh}^{(U_1)}a_n^{-1} n^{1/2}\geqslant \lambda_n \right\} + c_2 a_n n^{-1/2} \int_0^{\lambda_n}{\log{t^{-1}}dt}, \end{aligned} (8.70)

    where \lambda_n \to 0 . Since

    \int_0^{\lambda_n}{\log{t^{-1}}\, dt} = \lambda_n \log{\lambda_n^{-1}} + \lambda_n = O\left(\lambda_n \log{\lambda_n^{-1}}\right),

    the sequences a_n and \lambda_n must be chosen in such a way that

    \begin{equation} a_n \lambda_n n^{-1/2} \log{\lambda_n ^{-1}} \to 0. \end{equation} (8.71)

    Utilizing the triangle inequality along with Hoeffding's trick, we readily obtain that

    \begin{aligned} & a_n n^{-1/2} \mathbb{P}\left\{D_{nh}^{(U_1)}\geqslant \lambda_na_n n^{-1/2} \right\} \\ &\;\;\;\leqslant \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p\neq q}^{\upsilon_n}\left[\sum\limits_{i_1\in H_p^{(U)} }\sum\limits_{i_2\in H_q^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ & \;\;\;\leqslant c_2 \upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{aligned} (8.72)

    where \left\{{\eta}^\prime_i\right\}_{i \in \mathbb{N}^*} are independent copies of ({\eta}_i)_{i \in \mathbb{N}^*} . By imposing:

    \begin{equation} \lambda_n ^{-2}a_n^{1-r}n^{-1/2} \to 0, \end{equation} (8.73)

    we readily infer that

    \begin{eqnarray*} {\left\Vert \upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_2 \left(\frac{d_{\theta_k}(x_k, \eta_{i_k})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2}} \leqslant O\left(\lambda_n ^{-2}a_n^{1-r}n^{-1/2}\right). \nonumber \end{eqnarray*}

    Symmetrizing the last inequality in (8.72) and subsequently applying Proposition A.6 from the Appendix yields

    \begin{eqnarray} &&\upsilon_n \lambda_n ^{-2} a_n^{-1}n^{-5/2} h \phi^{-1}(h_n) \mathbb{E}\left\Vert \sum\limits_{p = 1 }^{\upsilon_n}\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left. \left.K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ && \leqslant c_2 \mathbb{E}\left(\int_0^{D_{nh}^{(U_2)}}\left(\log{N\left(u, \mathcal{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^\prime\right)}\right)^{1/2} du\right), \end{eqnarray} (8.74)

    where

    \begin{eqnarray} D_{nh}^{(U_2)} && = \left\Vert \mathbb{E}_{\epsilon}\left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\right.\right. \\ &&\left.\left.\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left[\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2}K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2 \left(\frac{d_{\theta_2}(x_2, \eta^\prime_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2 \right\vert \right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{eqnarray}

    and for \xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)} \in \mathcal{F}_{i_1, i_2} :

    \begin{aligned} &\widetilde{d}_{nh, 2}^\prime \left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)\\ &\;\;\;: = \mathbb{E}_{\epsilon}\left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left[\left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta^{\prime}_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right)^2\right.\right.\\ &\;\;\;\;\;\;\;\; - \left. \left. \left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{2i_1}\xi_{2i_2} {K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, {\eta_{i_1}})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta^{\prime}_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right)^2\right]\right\vert. \end{aligned}

    By the fact that:

    \begin{aligned} & \mathbb{E}_{\epsilon} \left\vert\upsilon_n \lambda_n ^{-2}a_n^{-1}n^{-5/2}\phi^{-1}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\epsilon_p\left(\sum\limits_{i_1, i_2\in H_p^{(U)} }\xi_{i_1} \xi_{i_2}K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2 \left(\frac{d_{\theta_2}(x_2, \eta^{\prime}_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right\vert \\ & \;\;\;\leqslant a_n^{3/2}\lambda_n^{-2}n^{-1}\left[\upsilon_n^{-1}a_n^{-2}\phi^{-2}(h_n)\sum\limits_{p = 1 }^{\upsilon_n}\sum\limits_{i_1, i_2\in H_p^{(U)} }\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta^{\prime}_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^4\right]^{1/2}, \end{aligned}

    it suffices to impose:

    \begin{equation} a_n^{3/2}\lambda_n^{-2}n^{-1}\to 0, \end{equation} (8.75)

    we have the convergence of (8.74) to zero. Regarding the choice of a_n , b_n and \upsilon_n , it should be noted that all values satisfying (8.54), (8.68), (8.71), (8.73), and (8.75) are suitable.

    (II): The same block:

    \begin{aligned} & \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\right. \\ & \;\;\;\;\;\;\;\;\left.\left.K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1, n})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2, n})}{h_n}\right) {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \\ &\;\;\;\leq \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\right. \\ &\;\;\;\;\;\;\;\;\left.\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \right) \\ &\;\;\;+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &\;\;\;\;\;\;\;\;\left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ &\;\;\;+\mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right. \\ &\left. \;\;\;\;\;\;\;\; \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg) \\ &\;\;\;\leqslant \mathbb{P}\left( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \neq i_2; i_1, i_2 \in \mathrm{H}_{p}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)\right.\right. \\ & \;\;\;\;\;\;\;\;\left.\left. K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\vert > \delta\right)+ 2\upsilon_n \beta(b_n) . \end{aligned} (8.76)

    Similar to \rm I , we can show that both the first and second terms in the previous inequality are of order o_ \mathbb{P}(1) . Therefore, as in the preceding proof, it is enough to establish

    \begin{eqnarray} \mathbb{E}\left(\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} } \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\left.\left. K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{eqnarray}

    Notice that when we consider a uniformly bounded class of functions, we obtain uniformity in B^m \times \mathscr{F}_2\mathscr{K}^2 :

    \mathbb{E}\left(\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} } \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n} \right) = O(a_n).

    This implies that we have to prove that, for \mathbf{u} \in B^m :

    \begin{aligned} & \mathbb{E}\left(\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left[\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right. \right. \right. \\ & \qquad - \left. \left. \left. \mathbb{E}\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n} \right)\right]\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{aligned} (8.77)

    As for empirical processes, to prove (8.77), it suffices to symmetrize and show that

    \begin{eqnarray} { \mathbb{E}\left(\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \right.} \left. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \to 0 . \end{eqnarray}

    In a similar way as in (8.69), we infer that:

    \begin{eqnarray*} &&{ \mathbb{E}\left(\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\epsilon_p\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.} \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\right) \nonumber\\ &&\leqslant \mathbb{E}\left(\int_0^{D_{nh}^{(U_3)}}{\left(\log N \left(u, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(3)}\right)\right)^{1/2}du}\right), \end{eqnarray*}

    where

    \begin{eqnarray} D_{nh}^{(U_3)}& = & \left\Vert \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert\right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{eqnarray}

    and the semi-metric \widetilde{d}_{nh, 2}^{(3)} is defined by

    \begin{aligned}& \widetilde{d}_{nh, 2}^{(3)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)\\ &\;\;\; = \mathbb{E}_{\epsilon} \left\vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left(\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \right. \right. \\ &\qquad\qquad\qquad \;\;\; \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}- \left. \left. \xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right)\right\vert. \end{aligned}

    Since we are treating uniformly bounded classes of functions, we infer that

    \begin{eqnarray*} && \mathbb{E}_{\epsilon} \left\vert n^{-3/2} h\phi^{-1}(h_n)\sum\limits_{p = 1}^{\upsilon_n}\epsilon_p\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert \nonumber\\ && \leqslant a_n^{3/2}n^{-1}h\phi^{-1}(h_n)\left[\frac{1}{\upsilon_na_n^2}\sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \neq i_2; i_1, i_2 \in H_p^{(U)} }\left(\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)\right.\right. \nonumber\\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right)^2\right]^{1/2}\leqslant O\left( a_n^{3/2}n^{-1}\phi^{-1}(h_n)\right). \end{eqnarray*}

    Since a_n^{3/2}n^{-1}\phi^{-1}(h_n)\to 0 , we have D_{nh}^{(U_3)}\to 0 , and hence \rm{II}\to 0 as n \to \infty .

    (III): Different types of blocks:

    \begin{aligned} & \mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1, n})}{h_n}\right) \right.\right. \\ &\;\;\;\;\;\;\;\; \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2, n})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \\&\;\;\;\leq \mathbb{P}\left(\sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right.\right. \\ &\;\;\;\;\;\;\;\; \left.\left. \left[\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n})}{h_n}\right) -\prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right)\right] {\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta \right) \\ &\;\;\;\;\;\;\;\;+ \mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2}\right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \left[{\mathfrak W}_{i_1, i_2, \varphi, n} - \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]\right\vert > \delta \Bigg) \\ & \;\;\;\;\;\;\;\;+\mathbb{P}\Bigg( \sup\limits_{ \mathscr{F}_{m}\mathfrak{K}^m_\Theta}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup\limits_{\mathbf{u} \in B^m}\left\vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \geqslant 2}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} \right. \\ &\;\;\;\;\;\;\;\; \left. \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert > \delta \Bigg). \end{aligned}

    As mentioned earlier, we have addressed the first and second summands in the previous inequality. What remains is the last summation, where the application of (8.61) reveals that

    \begin{aligned} &\sum\limits_{p = 1}^{\upsilon_n} \mathbb{E}\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2\in T_q^{(U)}}\xi_{i_1} \xi_{i_2} \prod\limits_{k = 1}^2 K_{2}\left(\frac{d_{\theta_k}(x_k, X_{i_k, n}^{(i_k/n)})}{h_n}\right) \mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ & \;\;\; \leqslant \sum\limits_{p = 1}^{\upsilon_n} \mathbb{E}\left\Vert n^{-3/2}h\phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} \right. \\ & \;\;\;\;\;\;\;\;\left.K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} +n^{-3/2} h \phi^{-1}(h_n) \upsilon_n^2 a_n b_n \beta({a_n}) , \end{aligned}

    we have

    n^{-3/2} h \phi^{-1}(h_n) \upsilon_n^2 a_n b_n \beta({a_n})\to 0,

    using Condition (8.63) and the choice of a_n , b_n and \upsilon_n . For p = 1 and p = \upsilon_n :

    \begin{aligned} & \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\; = \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} . \end{aligned}

    For 2\leqslant p\leqslant \upsilon_n-1 , we obtain:

    \begin{aligned}& \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\geqslant2}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\;\; = \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 4}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ &\;\;\;\;\leqslant \mathbb{E}\left\Vert n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}, \end{aligned}

    therefore it suffices to treat the convergence:

    \begin{eqnarray} \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\longrightarrow 0 . \end{eqnarray} (8.78)

    Using similar arguments as in [13], we apply the standard symmetrization and obtain:

    \begin{eqnarray} &&{ \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.} \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &&\leqslant 2 \mathbb{E}\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \\ && = 2 \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)}\leqslant \gamma_n\right\}}\right\}\\ &&+ 2 \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)} > \gamma_n\right\}}\right\}\\ && = 2\rm{III}_1 + 2 \rm{III}_2, \end{eqnarray} (8.79)

    where

    \begin{eqnarray} D_{nh}^{(U_4)} & = & \left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\left[\sum\limits_{q = 3}^{\upsilon_n} \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\right. \\ && \left.\left. \left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right]^{1/2}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}. \end{eqnarray} (8.80)

    In a similar way as in (8.69), we infer that

    \begin{eqnarray} {\rm{III}_1} \leqslant c_2 \int_0^{\gamma_n}{\left(\log{N\left(t, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(4)}\right)}\right)^{1/2} dt}, \end{eqnarray} (8.81)

    where

    \begin{eqnarray} &&{\widetilde{d}_{nh, 2}^{(4)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right)} : = \mathbb{E}_{\epsilon}\left\vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\left[\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\\ && \times {K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n} - \left.\left.\xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right]\right\vert. \end{eqnarray}

    Since we have

    \begin{eqnarray} &&{ \mathbb{E}_{\epsilon}\left\vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.}\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\vert \\ &&\leqslant a_n^{-1/2}b_n h^2\phi(h_n)\left(\frac{1}{ a_n b_n \upsilon_n h^2 \phi^{4}(h_n)}\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\left[\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right.\\ &&\left.\left.K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right]^2\right)^{1/2}, \end{eqnarray}

    and considering the semi-metric

    \begin{aligned} &\widetilde{d}_{nh, 2}^{(5)}\left(\xi_{1\boldsymbol{.}} {K}_{2, 1}\mathfrak{W}^{\prime(u)}\; , \; \xi_{2\boldsymbol{.}} {K}_{2, 2}\mathfrak{W}^{\prime\prime(u)}\right) \\ &\;\;\;\;\;\;\;\;: = \left(\frac{1}{ a_n b_n \upsilon_n h^2 \phi^4(h_n)}\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\left[\xi_{1i_1}\xi_{1i_2}{K}_{2, 1} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right){K}_{2, 1}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime(u)}_{i_1, i_2, \varphi, n}\right.\right.\\ &\;\;\;\;\;\;\;\; - \left.\left.\xi_{2i_1}\xi_{2i_2}{K}_{2, 2} \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) {K}_{2, 2}\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right) \mathfrak{W}^{\prime\prime(u)}_{i_1, i_2, \varphi, n}\right]^2\right)^{1/2}. \end{aligned}

    we can bound the expression in (8.81) as follows:

    \begin{equation*} \upsilon_n^{1/2} b_n n^{-1/2}h^2\phi(h_n)\int_0^{\upsilon_n^{-1/2}b_n^{-1}n^{1/2}h^2\gamma_n}{\left(\log{N\left(t, \mathscr{F}_{i_1, i_2}, \widetilde{d}_{nh, 2}^{(5)}\right)}\right)^{1/2}}dt, \end{equation*}

    by choosing \gamma_n = n^{-\alpha} for some \alpha > (17r-26)/60r , we get the convergence to zero of the previous quantity. To bound the second term on the right-hand side of (8.79), we observe that

    \begin{eqnarray} \rm{III}_2 & = & \mathbb{E}\left\{\left\Vert \upsilon_n n^{-3/2} h \phi^{-1}(h_n)\sum\limits_{i_1 \in H_1^{(U)} }\sum\limits_{q = 3}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\epsilon_q\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) \right.\right. \\ && \left.\left. K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n} \right\Vert_{\mathscr{F}_2\mathscr{K}^2} \mathbb{1}_{\left\{D_{nh}^{(U_4)} > \gamma_n\right\}}\right\}\\ & \leqslant &a_n^{-1}b_n n^{1/2}h\phi^{-1}(h_n)\mathbb{P}\left\{ \left\Vert \upsilon_n^2 n^{-3}h^2\phi^{-2}(h_n)\sum\limits_{q = 3}^{\upsilon_n} \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} \right.\right. \right.\\ && \left.\left. K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right)K_2\left(\frac{d_{\theta_2}(x_2, {\eta_{i_2}})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\geqslant \gamma_n ^2\Bigg\}. \end{eqnarray} (8.82)

    Now, we apply the square root trick to the last expression, conditionally on H_{1}^{(U)} . Denote by \mathbb{E}_T the expectation with respect to \sigma\{{\eta}_{i_2}: i_2 \in T_q, q\geqslant 3\} . We allow the class of functions \mathscr{F}_m to be unbounded, with an envelope function satisfying, for some p > 2 :

    \begin{equation} \theta_p: = \sup\limits_{\mathbf{t}\in\mathcal{S}^m_\mathscr{H}} \mathbb{E}\left( F^p(\mathbf{Y})\vert\mathbf{X} = \mathbf{t}\right) < \infty , \end{equation} (8.83)

    for 2r/(r-1) < s < \infty (in the notation of [89, Lemma 5.2]). Set:

    \begin{eqnarray*} M_n& = & \upsilon_n^{1/2} \mathbb{E}_T \left(\sum\limits_{i_2 \in T_q^{(U)}}\sum\limits_{i_1 \in H_1^{(U)} }\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, \eta_{i_1})}{h_n}\right) K_2\left(\frac{d_{\theta_2}(x_2, \eta_{i_2})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right)^2, \end{eqnarray*}

    where

    t = \gamma_n^2a_n^{5/2}n^{1/2}h\phi^{-1}(h_n), \; \; \rho = \lambda = 2^{-4}\gamma_na_n^{5/4}n^{1/4}h^{1/2}\phi^{-1/2}(h_n),

    and

    m = \exp{\left(\gamma_n^2n h^2\phi^{-2}(h_n)b_n^{-2}\right)}.

    Since we require t > 8M_n and m \to \infty , using similar arguments as in [13, page 69], we obtain the convergence of (8.81) and (8.82) to zero.

    (IV): Adjacent blocks of different types: The aim here is to prove that:

    \begin{eqnarray} \mathbb{P}\left(\sup\limits_{\mathscr{F}_m\mathfrak{K}^m_\Theta} \sup\limits_{\boldsymbol{\theta}\in \Theta^m}\sup\limits_{\mathbf{x} \in \mathcal{H}^m}\sup _{\mathbf{u} \in B^m} \left\vert\sum\limits_{p = 1}^{v_{n}} \sum\limits_{i_1 \in \mathrm{H}_{p}^{(\mathrm{U})}} \sum\limits_{q:\vert q-p\vert \leqslant 1}^{v_{n}} \sum\limits_{i_2 \in \mathrm{T}_{q}^{(\mathrm{U})}} \xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1})}{h_n}\right) \right.\right.\left.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2})}{h_n}\right){\mathfrak W}_{i_1, i_2, \varphi, n}\right\vert > \delta\right) \rightarrow 0. \end{eqnarray}

    We have

    \begin{eqnarray*} && \left\Vert n^{-3/2} h \phi^{-1}(h_n) \sum\limits_{p = 1}^{\upsilon_n}\sum\limits_{i_1 \in H_p^{(U)} }\sum\limits_{q : \vert q-p\vert\leqslant1}^{\upsilon_n} \sum\limits_{i_2 \in T_q^{(U)}}\xi_{i_1} \xi_{i_2} K_2 \left(\frac{d_{\theta_1}(x_1, X_{i_1}^{(i_1/n)})}{h_n}\right) \right.\left.K_2\left(\frac{d_{\theta_2}(x_2, X_{i_2}^{(i_2/n)})}{h_n}\right)\mathfrak{W}^{(u)}_{i_1, i_2, \varphi, n}\right\Vert_{\mathscr{F}_2\mathscr{K}^2}\\ &&\leqslant c_2 \upsilon_n a_n b_n n^{-3/2} h \phi^{-1}(h_n) \to 0 . \end{eqnarray*}

    Hence the proof of the lemma is complete.

    The authors declare they have not used Artificial Intelligence (AI) tools in creating this article.

    The authors would like to thank the Editor-in-Chief, an Associate-Editor, and three referees for their extremely helpful remarks, which resulted in a substantial improvement of the original form of the work and a presentation that was more sharply focused.

    The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

    This appendix collects auxiliary results and examples that support the developments in the main body of the paper.

    Lemma A.1 ([165]). Let I_{h} = \left[C_{1} h, 1-C_{1} h\right] . Suppose that the kernel K_{1} satisfies Assumption 2 part (i). Then, for q = 0, 1, 2 and m > 1 :

    \begin{aligned} &\sup _{\mathbf u \in I_{h}^m} \left\lvert \frac{1}{n^m h^m} \sum\limits_{\mathbf{i}\in I_n^m}\right. \prod\limits_{k = 1}^mK_{1}\left(\frac{u_k-i_k/n}{h_n}\right)\left(\frac{u_k-i_k /n}{h_n}\right)^{q}\\ & \;\;\;\;\;\;\left. -\int_{0}^{1}\cdots \int_{0}^{1} \frac{1}{h^m} \prod\limits_{k = 1}^m \left\{ K_{1}\left(\frac{(u_k-v_k)}{h_n}\right)\left(\frac{u_k-v_k}{h_n}\right)^{q} \right\} \prod\limits_{k = 1}^m d v_k\right\rvert = O\left(\frac{1}{nh^{m+1}}\right). \end{aligned}

    Lemma A.2 ([165]). Suppose that the kernel K_{1} satisfies Assumption 2 part (i) and let g:[0, 1] \times \mathscr{H} \rightarrow \mathbb{R} , (\mathbf{u}, \mathbf{x}) \mapsto g(\mathbf{u}, \mathbf{x}) , be continuously differentiable with respect to \mathbf{u} . Then,

    \begin{eqnarray} {\sup _{\mathbf u \in I_{h}^m}\left\lvert\frac{1}{n^m h^m} \sum\limits_{\mathbf{i}\in I_n^m} \prod\limits_{k = 1}^m K_{1}\left(\frac{u_k-i_k /n}{h_n}\right) g\left(\frac{i_k}{n}, x_{k}\right)- \prod\limits_{k = 1}^m g(u_k, x_k)\right\rvert } = O\left(\frac{1}{nh^{m+1}}\right)+o(h_n). \end{eqnarray} (A.1)

    Lemma A.3 ([124]). Let \left\{Z_{i, n}\right\} be a zero-mean triangular array such that \left\lvert Z_{i, n}\right\rvert \leq b_{n} with \alpha -mixing coefficients \alpha(k) . Then, for any \varepsilon > 0 and S_{n} \leq n with \varepsilon > 4 S_{n} b_{n} ,

    \begin{equation} \mathbb P\left(\left\lvert\sum\limits_{i = 1}^{n} Z_{i, n}(u, x)\right\rvert \geq \varepsilon\right) \leq 4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \alpha\left(S_{n}\right). \end{equation} (A.2)

    Lemma A.4. Let \left\{Z_{i, n}\right\} be a zero-mean triangular array such that \left\lvert Z_{i, n}\right\rvert \leq b_{n} with \beta -mixing coefficients \beta(k) . Then, for any \varepsilon > 0 and S_{n} \leq n with \varepsilon > 4 S_{n} b_{n} ,

    \begin{equation} \mathbb P\left(\left\lvert\sum\limits_{i = 1}^{n} Z_{i, n}(u, x)\right\rvert \geq \varepsilon\right) \leq 4 \exp \left(-\frac{\varepsilon^{2}}{64 \sigma_{S_{n}, n}^{2} \frac{n}{S_{n}}+\frac{8}{3} \varepsilon b_{n} S_{n}}\right)+4 \frac{n}{S_{n}} \beta\left(S_{n}\right). \end{equation} (A.3)

    Proof of Lemma A.4. Using Lemma A.3 and the fact that, for any \sigma -algebras \mathcal{A} and \mathcal{B} , \alpha(\mathcal{A}, \mathcal{B}) \leq \beta(\mathcal{A}, \mathcal{B}) , Lemma A.4 follows.
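    As a purely numerical illustration (our own sketch; the quantities \sigma_{S_n, n}^{2} and \beta(S_n) are user-supplied inputs, not computed here), the right-hand side of (A.3) can be evaluated as follows:

        import numpy as np

        def mixing_tail_bound(eps, n, S_n, b_n, sigma2, beta_Sn):
            # Right-hand side of (A.3); sigma2 stands for sigma_{S_n, n}^2 and
            # beta_Sn for beta(S_n), both of which must be supplied by the user.
            exp_term = 4.0 * np.exp(-eps**2 / (64.0 * sigma2 * n / S_n
                                               + (8.0 / 3.0) * eps * b_n * S_n))
            return exp_term + 4.0 * (n / S_n) * beta_Sn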

    Lemma A.5 ([63]). Let X_{1}, \ldots, X_{n} be a sequence of independent random elements taking values in a Banach space (B, \|\cdot\|) with \mathbb E X_{i} = 0 for all i . Let \left\{\varepsilon_{i}\right\} be a sequence of independent Bernoulli random variables independent of \left\{X_{i}\right\} . Then, for any convex increasing function \Phi ,

    \mathbb E \Phi\left(\frac{1}{2}\left\|\sum\limits_{i = 1}^{n} X_{i} \varepsilon_{i}\right\|\right) \leq\mathbb E \Phi\left(\left\|\sum\limits_{i = 1}^{n} X_{i}\right\|\right) \leq\mathbb E \Phi\left(2\left\|\sum\limits_{i = 1}^{n} X_{i} \varepsilon_{i}\right\|\right).

    Proposition A.6 ([11]). Let \{{X}_i : i \in \mathrm{T}\} be a process satisfying, for m \geq 1 :

    \left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{p}\right)^{1/p} \leq \left(\frac{p-1}{q-1}\right)^{m/2}\left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{q}\right)^{1/q} , \quad 1 < q < p < \infty,

    and the semi-metric :

    \rho(i, j) = \left(\mathbb E\left\Vert{X}_i-{X}_j\right\Vert^{2}\right)^{1/2}.

    There exists a constant K = K(m) such that:

    \mathbb E\sup\limits_{i, j \in \mathrm{T}}\left\Vert{X}_i-{X}_j\right\Vert \leq K \int_0^{D} [\log{N(\epsilon, \mathrm{T}, \rho)}]^{m/2}d\epsilon,

    where D is the \rho -diameter of \mathrm{T} .

    Lemma A.7 ([61]). Suppose that X and Y are random variables which are \mathscr{G} and \mathscr{H} -measurable, respectively, and that \mathbb E|X|^{p} < \infty, \mathbb E|Y|^{q} < \infty , where p > 0 ,

    q > 1, p^{-1}+q^{-1} < 1.

    Then,

    \lvert \mathbb E X Y- \mathbb E X \mathbb E Y\rvert \leq 8\lVert X\rVert_{p}\lVert Y \rVert_{q}[\beta(\mathscr{G}, \mathscr{H})]^{1-p^{-1}-q^{-1}}.

    Proof of Lemma A.7. This lemma follows directly from the corresponding covariance inequality for \alpha -mixing and the fact that, for any \sigma -algebras \mathcal{A} and \mathcal{B} , \alpha(\mathcal{A}, \mathcal{B}) \leq \beta(\mathcal{A}, \mathcal{B}) .

    Lemma A.8 ([180]). Let V_{1}, \ldots, V_{L} be strongly mixing random variables measurable with respect to the \sigma -algebras \mathscr{F}_{i_{1}}^{j_{1}}, \ldots, \mathscr{F}_{i_{L}}^{j_{L}} respectively with 1 \leq i_{1} < j_{1} < i_{2} < \cdots < j_{L} \leq n, i_{l+1}-j_{l} \geq w \geq 1 and \left\lvert V_{j}\right\rvert \leq 1 for j = 1, \ldots, L . Then,

    \left\lvert \mathbb E\left(\prod\limits_{j = 1}^{L} V_{j}\right)-\prod\limits_{j = 1}^{L} \mathbb E\left(V_{j}\right)\right\rvert \leq 16(L-1) \alpha(w),

    where \alpha(w) is the strongly mixing coefficient.

    Example A.9. The set \mathscr{F} of all indicator functions \mathbb{1}_{(-\infty, t]} of cells in \mathbb{R} satisfies:

    {N}\left(\epsilon, \mathscr{F}, d_{\mathbb{P}}^{(2)}\right)\leq \frac{2}{\epsilon^{2}},

    for any probability measure \mathbb{P} and \epsilon\leq 1 . Notice that:

    \int_{0}^{1}\sqrt{\log\left(\frac{1}{\epsilon}\right)}d\epsilon\leq\int_{0}^{\infty}u^{1/2}\exp(-u)du\leq 1.
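    Indeed, the substitution \epsilon = e^{-u} makes the bound explicit:

    \int_{0}^{1}\sqrt{\log\left(\frac{1}{\epsilon}\right)}\, d\epsilon = \int_{0}^{\infty}u^{1/2}e^{-u}\, du = \Gamma\left(\frac{3}{2}\right) = \frac{\sqrt{\pi}}{2} \approx 0.886 \leq 1.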

    For more details and discussion of this example, refer to Example 2.5.4 of [178] and [114, p. 157]. The covering numbers of the class of cells (-\infty, t] in higher dimensions satisfy a similar bound, but with a higher power of (1/\epsilon) ; see Theorem 9.19 of [114].

    Example A.10. (Classes of functions that are Lipschitz in a parameter, Section 2.7.4 in [178]). Let \mathscr{F} be the class of functions x\mapsto \varphi(t, x) that are Lipschitz in the index parameter t\in T . Suppose that:

    |\varphi(t_1, x)-\varphi(t_2, x)|\leq d({t}_1, {t}_2)\kappa(x)

    for some metric d on the index set T , the function \kappa(\cdot) defined on the sample space \mathcal{X} , and all x . According to Theorem 2.7.11 of [178] and Lemma 9.18 of [114], it follows, for any norm \|\cdot\|_{{\mathscr F}} on \mathscr{F} , that:

    N(\epsilon\|F\|_{{\mathscr F}}, {\mathscr F}, \|\cdot\|_{{\mathscr F}})\leq{N}(\epsilon/2, T, d).

    Hence if (T, d) satisfies

    J(\infty, T, d) = \int_{0}^{\infty}\sqrt{\log {N}(\epsilon, T, d)} d\epsilon < \infty,

    then the conclusion holds for \mathscr{F} .

    Example A.11. Let us consider as an example the classes of functions that are smooth up to order \alpha defined as follows, see Section 2.7.1 of [178] and Section 2 of [177]. For 0 < \alpha < \infty let \lfloor \alpha \rfloor be the greatest integer strictly smaller than \alpha . For any vector k = (k_{1}, \ldots, k_{d}) of d integers define the differential operator

    D^{k_{.}}: = \frac{\partial^{k_{.}}}{\partial x_{1}^{k_{1}}\cdots \partial x_{d}^{k_{d}}}, \; \text{where}\; k_{.}: = \sum\limits_{i = 1}^{d}k_{i}.

    Then, for a function f:\mathcal{X}\rightarrow \mathbb{R} , let

    \|f\|_{\alpha}: = \max\limits_{k_{.}\leq \lfloor \alpha \rfloor}\sup\limits_{x}|D^{k}f(x)|+\max\limits_{k_{.} = \lfloor \alpha \rfloor}\sup\limits_{x \neq y}\frac{|D^{k}f(x)-D^{k}f(y)|}{\|x-y\|^{\alpha-\lfloor \alpha \rfloor}},

    where the suprema are taken over all x, y in the interior of \mathcal{X} with x \neq y . Let C_{M}^{\alpha}(\mathcal{X}) be the set of all continuous functions f: \mathcal{X}\rightarrow \mathbb{R} with

    \|f\|_{\alpha}\leq M.

    Note that for \alpha \leq 1 this class consists of bounded functions f that satisfy a Lipschitz condition. [112] computed the entropy of the classes of C_{M}^{\alpha}(\mathcal{X}) for the uniform norm. As a consequence of their results, [177] shows that there exists a constant K depending only on \alpha, d and the diameter of \mathcal{X} such that for every measure \gamma and every \epsilon > 0 ,

    \log \mathcal{N}_{[\; ]}(\epsilon M\gamma(\mathcal{X}), C_{M}^{\alpha}(\mathcal{X}), L_{2}(\gamma) )\leq K\left(\frac{1}{\epsilon}\right)^{d/\alpha},

    where {\mathcal N}_{[\; ]} is the bracketing number; refer to Definition 2.1.6 of [178], and to Theorem 2.7.1 of [178] for a variant of the last inequality. By Lemma 9.18 of [114], we have

    \log \mathcal{N}(\epsilon M\gamma(\mathcal{X}), C_{M}^{\alpha}(\mathcal{X}), L_{2}(\gamma) )\leq K\left(\frac{1}{2\epsilon}\right)^{d/\alpha}.

    In this section, we present some classical U -kernels.

    Example A.12. [99] introduced the parameter

    \triangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} D^{2}(y_1, y_2) d F(y_1, y_2),

    where D(y_1, y_2) = F(y_1, y_2)-F(y_1, \infty) F(\infty, y_2) and F(\cdot, \cdot) is the joint distribution function of (Y_1, Y_2) . The parameter \triangle has the property that \triangle = 0 if and only if Y_1 and Y_2 are independent. Following [117], an alternative expression for \triangle can be developed by introducing the functions

    \psi\left(y_{1}, y_{2}, y_{3}\right) = \left\{\begin{array}{rcl} 1 & if& y_{2} \leq y_{1} < y_{3}, \\ 0 & if & y_{1} < y_{2}, y_{3} \text { or } y_{1} \geq y_{2}, y_{3}, \\ -1 & if& y_{3} \leq y_{1} < y_{2}, \end{array}\right.

    and

    h\left(y_{1, 1}, y_{1, 2} , \ldots , y_{5, 1}, y_{5, 2}\right) = \frac{1}{4} \psi\left(y_{1, 1}, y_{2, 1}, y_{3, 1}\right) \psi\left(y_{1, 1}, y_{4, 1}, y_{5, 1}\right) \psi\left(y_{1, 2}, y_{2, 2}, y_{3, 2}\right) \psi\left(y_{1, 2}, y_{4, 2}, y_{5, 2}\right).

    We have

    \triangle = \int \ldots \int h\left(y_{1, 1}, y_{1, 2} , \ldots , y_{5, 1}, y_{5, 2}\right)d F\left(y_{1, 1}, y_{1, 2}\right) \ldots d F\left(y_{5, 1}, y_{5, 2}\right).
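    For concreteness, the function \psi translates directly into code (a small Python sketch of our own, not from [99] or [117]):

        def psi(y1, y2, y3):
            # The function psi of Example A.12: +1 if y2 <= y1 < y3,
            # -1 if y3 <= y1 < y2, and 0 otherwise.
            if y2 <= y1 < y3:
                return 1
            if y3 <= y1 < y2:
                return -1
            return 0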

    Example A.13. (Hoeffding's D ). From the symmetric kernel,

    \begin{aligned} h_{D} &\left(z_{1}, \ldots, z_{5}\right) \\ : = & \frac{1}{16} \sum\limits_{\left(i_{1}, \ldots, i_{5}\right) \in \mathcal{P}_{5}}\left[\left\{ \mathbb{1}\left(z_{i_{1}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1} \leq z_{i_{5}, 1}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{4}, 1} \leq z_{i_{5}, 1}\right)\right\}\right] \\ & \times\left[\left\{ \mathbb{1}\left(z_{i_{1}, 2} \leq z_{i_{5}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2} \leq z_{i_{5}, 2}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 2} \leq z_{i_{5}, 2}\right)- \mathbb{1}\left(z_{i_{4}, 2} \leq z_{i_{5}, 2}\right)\right\}\right], \end{aligned}

    we recover Hoeffding's D statistic, a rank-based U -statistic of order 5, which gives rise to Hoeffding's D correlation measure \mathbb{E} h_{D} .
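    For illustration only (our own sketch, not part of the original sources), the kernel h_{D} can be evaluated by brute force over the 120 permutations; averaging it over all 5-tuples of a sample costs O(n^5) , so this is practical only for small n :

        import itertools

        def h_D(z):
            # z: list of five points (x, y) in R^2; evaluates the displayed kernel.
            total = 0.0
            for i1, i2, i3, i4, i5 in itertools.permutations(range(5)):
                a = (int(z[i1][0] <= z[i5][0]) - int(z[i2][0] <= z[i5][0])) \
                    * (int(z[i3][0] <= z[i5][0]) - int(z[i4][0] <= z[i5][0]))
                b = (int(z[i1][1] <= z[i5][1]) - int(z[i2][1] <= z[i5][1])) \
                    * (int(z[i3][1] <= z[i5][1]) - int(z[i4][1] <= z[i5][1]))
                total += a * b
            return total / 16.0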

    Example A.14. (Blum-Kiefer-Rosenblatt's R ). The symmetric kernel

    \begin{aligned} h_{R}&\left(z_{1}, \ldots, z_{6}\right) & \\ : = & \frac{1}{32} \sum\limits_{\left(i_{1}, \ldots, i_{6}\right) \in \mathcal{P}_{6}}\left[\left\{ \mathbb{1}\left(z_{i_{1}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1} \leq z_{i_{5}, 1}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 1} \leq z_{i_{5}, 1}\right)- \mathbb{1}\left(z_{i_{4}, 1} \leq z_{i_{5}, 1}\right)\right\}\right] \\ & \times\left[\left\{ \mathbb{1}\left(z_{i_{1}, 2} \leq z_{i_{6}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2} \leq z_{i_{6}, 2}\right)\right\}\left\{ \mathbb{1}\left(z_{i_{3}, 2} \leq z_{i_{6}, 2}\right)- \mathbb{1}\left(z_{i_{4}, 2} \leq z_{i_{6}, 2}\right)\right\}\right] \end{aligned}

    yields Blum-Kiefer-Rosenblatt's R statistic ([24]).

    Example A.15. (Bergsma-Dassios-Yanagimoto's \tau^{*} ). [21] introduced a rank correlation statistic as a U -statistic of order 4 with the symmetric kernel

    \begin{aligned} h_{\tau^{*}}\left(z_{1}\right.&\left., \ldots, z_{4}\right) \\ : = & \frac{1}{16} \sum\limits_{\left(i_{1}, \ldots, i_{4}\right) \in \mathcal{P}_{4}}\left\{ \mathbb{1}\left(z_{i_{1}, 1}, z_{i_{3}, 1} < z_{i_{2}, 1}, z_{i_{4}, 1}\right)+ \mathbb{1}\left(z_{i_{2}, 1}, z_{i_{4}, 1} < z_{i_{1}, 1}, z_{i_{3}, 1}\right)\right.\\ &\left.- \mathbb{1}\left(z_{i_{1}, 1}, z_{i_{4}, 1} < z_{i_{2}, 1}, z_{i_{3}, 1}\right)- \mathbb{1}\left(z_{i_{2}, 1}, z_{i_{3}, 1} < z_{i_{1}, 1}, z_{i_{4}, 1}\right)\right\} \\ & \times\left\{ \mathbb{1}\left(z_{i_{1}, 2}, z_{i_{3}, 2} < z_{i_{2}, 2}, z_{i_{4}, 2}\right)+ \mathbb{1}\left(z_{i_{2}, 2}, z_{i_{4}, 2} < z_{i_{1}, 2}, z_{i_{3}, 2}\right)\right.\\ &\left.- \mathbb{1}\left(z_{i_{1}, 2}, z_{i_{4}, 2} < z_{i_{2}, 2}, z_{i_{3}, 2}\right)- \mathbb{1}\left(z_{i_{2}, 2}, z_{i_{3}, 2} < z_{i_{1}, 2}, z_{i_{4}, 2}\right)\right\} \end{aligned}

    Here, \mathbb{1}\left(y_{1}, y_{2} < y_{3}, y_{4}\right): = \mathbb{1}\left(y_{1} < y_{3}\right) \mathbb{1}\left(y_{1} < y_{4}\right) \mathbb{1}\left(y_{2} < y_{3}\right) \mathbb{1}\left(y_{2} < y_{4}\right).
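    Evaluating the displayed kernel literally gives a simple, if expensive, way to compute the \tau^{*} statistic on small samples. The Python sketch below follows the formula term by term (our naming; O(n^4) work, so it is didactic rather than practical):

```python
import itertools
import numpy as np

def _lt(p, q, r, s):
    # 1{p, q < r, s}: p and q are both smaller than both r and s.
    return float(p < r and p < s and q < r and q < s)

def _a(w1, w2, w3, w4):
    # The coordinatewise pattern of four indicators from the kernel.
    return (_lt(w1, w3, w2, w4) + _lt(w2, w4, w1, w3)
            - _lt(w1, w4, w2, w3) - _lt(w2, w3, w1, w4))

def tau_star(x, y):
    """Order-4 U-statistic with kernel h_{tau*}: for each 4-subset, sum the
    coordinatewise products over all 4! orderings with the 1/16 factor,
    then average over subsets."""
    z = list(zip(x, y))
    vals = []
    for quad in itertools.combinations(z, 4):
        s = sum(_a(p[0][0], p[1][0], p[2][0], p[3][0])
                * _a(p[0][1], p[1][1], p[2][1], p[3][1])
                for p in itertools.permutations(quad))
        vals.append(s / 16.0)
    return float(np.mean(vals))
```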

    Example A.16. (The Wilcoxon statistic). Suppose that E \subset \mathbb{R} is symmetric around zero. As an estimate of the quantity

    \int_{(x, y) \in E^2}\left\{2 \mathbb{1}_{\{x+y > 0\}}-1\right\} dF(x) dF(y),

    it is pertinent to consider the statistic

    W_n = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n}\left\{2 \cdot \mathbb{1}_{\left\{X_i+X_j > 0\right\}}-1\right\},

    which is relevant for testing whether or not the center of symmetry \mu of the distribution is located at zero.
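    A direct vectorized evaluation of W_n is immediate; the sketch below (our naming) averages the signed indicator over all pairs i < j :

```python
import numpy as np

def wilcoxon_u(x):
    """W_n = 2/(n(n-1)) * sum_{i<j} (2 * 1{X_i + X_j > 0} - 1)."""
    x = np.asarray(x, float)
    s = x[:, None] + x[None, :]            # s[i, j] = X_i + X_j
    iu = np.triu_indices(len(x), k=1)      # indices of the pairs i < j
    return float(np.mean(2.0 * (s[iu] > 0) - 1.0))
```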

    Example A.17. (The Takens estimator). Denote by \|\cdot\| the usual Euclidean norm on \mathbb{R}^d . In [29], the correlation integral

    C_F(r) = \int \mathbb{I}_{\left\{\left\|x-x^{\prime}\right\| \leq r\right\}} dF(x) dF\left(x^{\prime}\right), \quad r > 0,

    is estimated by

    C_n(r) = \frac{1}{n(n-1)} \sum\limits_{1 \leq i \neq j \leq n} \mathbb{I}_{\left\{\left\|X_i-X_j\right\| \leq r\right\}}.

    In the case where a scaling law holds for the correlation integral, i.e., when there exists \left(\alpha, r_0, c\right) \in \mathbb{R}_{+}^{* 3} such that C_F(r) = c \cdot r^{\alpha} for 0 < r \leq r_0 , the U -statistic

    T_n = \frac{1}{n(n-1)} \sum\limits_{1 \leq i \neq j \leq n} \log \left(\frac{\left\|X_i-X_j\right\|}{r_0}\right),

    is used in order to build the Takens estimator \hat{\alpha}_n = -T_n^{-1} of the correlation dimension \alpha .
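    A short sketch of the estimator (our naming; it presumes that the pairwise distances fall inside the scaling range (0, r_0] , so that the logarithms are negative and \hat{\alpha}_n is positive; in practice pairs outside that range would be discarded first):

```python
import numpy as np

def takens_alpha(X, r0):
    """Takens estimator alpha_hat = -1 / T_n from an (n, d) sample array X,
    with T_n the average pairwise log-distance log(||X_i - X_j|| / r0)."""
    X = np.asarray(X, float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    iu = np.triu_indices(len(X), k=1)      # i < j; the i != j sum is symmetric
    T = float(np.mean(np.log(d[iu] / r0)))
    return -1.0 / T
```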

    Example A.18. Let \widehat{{Y}_1{Y}_2} denote the oriented angle between {Y}_1, {Y}_2 \in T , where T is the circle of radius 1 centered at the origin in \mathbb{R}^{2} . Let

    h_{t}({Y}_1, {Y}_2) = \mathbb{1}\{\widehat{{Y}_1{Y}_2}\leq {t}\} -{ t}/\pi, \; \; \mbox{for}\; \; {t} \in[0, \pi).

    Reference [159] used this kernel to construct a U -process for testing uniformity on the circle.
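    A hedged sketch for a fixed t : we represent points of T by their angles and, as an assumption on our part, read the oriented angle \widehat{{Y}_1{Y}_2} as the geodesic angle in [0, \pi] . Under uniformity that pairwise angle is uniform on [0, \pi] , so \mathbb{E}\, h_t = 0 and the statistic fluctuates around zero.

```python
import numpy as np

def circle_uniformity_ustat(theta, t):
    """U-statistic of h_t(Y1, Y2) = 1{angle(Y1, Y2) <= t} - t/pi, with the
    circle points given by their angles theta (radians) and the angle
    between two points taken as the geodesic distance in [0, pi]."""
    theta = np.asarray(theta, float)
    diff = np.abs(theta[:, None] - theta[None, :]) % (2 * np.pi)
    ang = np.minimum(diff, 2 * np.pi - diff)   # geodesic angle in [0, pi]
    iu = np.triu_indices(len(theta), k=1)
    return float(np.mean(ang[iu] <= t) - t / np.pi)
```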

    Example A.19. For m = 3 , let

    \varphi(Y_1, Y_2, Y_3) = \mathbb{1}\{Y_1-Y_2-Y_3 > 0\}.

    We have

    r^{(3)}(\varphi, {t}, {t}, {t}) = \mathbb{P}(Y_1 > Y_2+Y_3\mid {X}_{1} = {X}_2 = { X}_{3} = { t})

    and the corresponding conditional U -statistic can be considered a conditional analog of the Hollander-Proschan test statistic ([101]). It may be used to test the hypothesis that the conditional distribution of Y_1 given {X}_1 = { t} is exponential, against the alternative that it is of the new-better-than-used type.
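    To make the conditional analog concrete, the U -statistic can be localized in the covariate with smoothing weights, in the spirit of Nadaraya-Watson estimators. The sketch below is one possible implementation only: the Gaussian smoothing kernel, the bandwidth bw and the function name are our illustrative choices, not prescribed by the text.

```python
import itertools
import numpy as np

def cond_nbu_ustat(X, Y, t, bw):
    """Weighted conditional U-statistic for phi(Y1, Y2, Y3) = 1{Y1 > Y2 + Y3},
    estimating P(Y1 > Y2 + Y3 | X1 = X2 = X3 = t). O(n^3): small n only."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    w = np.exp(-0.5 * ((X - t) / bw) ** 2)     # smoothing weights K((t - X_i)/bw)
    num = den = 0.0
    for i, j, k in itertools.permutations(range(len(X)), 3):
        wt = w[i] * w[j] * w[k]                # phi is not symmetric, so all
        num += wt * float(Y[i] > Y[j] + Y[k])  # ordered triples are visited
        den += wt
    return num / den
```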

    Example A.20. (The Gini mean difference). The Gini mean difference provides another popular measure of dispersion. It corresponds to the case where E \subset \mathbb{R} and h(x, y) = |x-y| :

    G_n = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n}\left|X_i-X_j\right|.
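    G_n is a two-line computation; the sketch below (our naming) averages \left|X_i-X_j\right| over all pairs i < j :

```python
import numpy as np

def gini_mean_difference(x):
    """G_n = 2/(n(n-1)) * sum_{i<j} |X_i - X_j|."""
    x = np.asarray(x, float)
    iu = np.triu_indices(len(x), k=1)          # pairs i < j
    return float(np.mean(np.abs(x[:, None] - x[None, :])[iu]))
```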

    Example A.21 ([113]). Let the central moments of any order m = 2, 3, \ldots be given by

    \theta_m(F) = \mathbb E\left(X_1-\mathbb E X_1\right)^m = \int(x-\mathbb E X_1)^m d F(x).

    In this case, the U -statistic has a symmetric kernel

    \begin{aligned} & h\left(x_1, \ldots, x_m\right) = \frac{1}{m !} \sum\left[x_{i_1}^m-\left(\begin{array}{c} m \\ 1 \end{array}\right) x_{i_1}^{m-1} x_{i_2}\right. \\ &\left.\quad+\left(\begin{array}{c} m \\ 2 \end{array}\right) x_{i_1}^{m-2} x_{i_2} x_{i_3}-\cdots+(-1)^{m-1}\left(\left(\begin{array}{c} m \\ m-1 \end{array}\right)-1\right) x_{i_1} x_{i_2} \ldots x_{i_m}\right], \end{aligned}

    where summation is carried out over all permutations \left(i_1, \ldots, i_m\right) of the numbers (1, \ldots, m) . In particular, if m = 3 , then

    \begin{aligned} h\left(x_1, x_2, x_3\right) = \frac{1}{3}\left(x_1^3+x_2^3+x_3^3\right) -\frac{1}{2}\left(x_1^2 x_2+x_2^2 x_1+x_1^2 x_3+x_3^2 x_1+x_2^2 x_3+x_3^2 x_2\right)+2 x_1 x_2 x_3 . \end{aligned}

    In the case of m = 2 ,

    \theta_2(F) = \mathbb E\left(X_1-\mathbb E X_1\right)^2 = \int(x-\mathbb E X_1)^2 d F(x).

    For the kernel

    h\left(x_1, x_2\right) = \frac{x_1^2+x_2^2-2 x_1 x_2}{2} = \frac{1}{2}\left(x_1-x_2\right)^2,

    the corresponding U -statistic is the sample variance

    \begin{aligned} U_n\left(h\right) & = \frac{2}{n(n-1)} \sum\limits_{1 \leq i < j \leq n} h\left(X_i, X_j\right) \\ & = \frac{1}{n-1}\left(\sum\limits_{i = 1}^n X_i^2-n \left\{\frac{1}{n}\sum\limits_{i = 1}^n X_i\right\}^2\right) = \frac{1}{n-1}\left(\sum\limits_{i = 1}^n X_i^2-n\bar X_n^2\right). \end{aligned}

    We refer also to [155].
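    The identity between this U -statistic and the unbiased sample variance is easy to confirm numerically; the snippet below (a simulated normal sample, purely illustrative) compares the pairwise-kernel average with NumPy's ddof = 1 variance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)

# U-statistic with kernel h(x1, x2) = (x1 - x2)^2 / 2, averaged over pairs i < j.
iu = np.triu_indices(len(x), k=1)
u_stat = float(np.mean(0.5 * (x[:, None] - x[None, :])[iu] ** 2))

print(u_stat, np.var(x, ddof=1))   # the two values coincide
```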



    [1] Q. Ding, S. Liu, Y. Yao, H. Liu, T. Cai, L. Han, Global, regional, and national burden of ischemic stroke, 1990–2019, Neurology, 98 (2022), E279–E290. https://doi.org/10.1212/WNL.0000000000013115 doi: 10.1212/WNL.0000000000013115
    [2] S. Ogoh, T. Tarumi, Cerebral blood flow regulation and cognitive function: A role of arterial baroreflex function, J. Physiol. Sci., 69 (2019), 813–823. https://doi.org/10.1007/s12576-019-00704-6 doi: 10.1007/s12576-019-00704-6
    [3] W. Hacke, M. Kaste, E. Bluhmki, M. Brozman, A. Dávalos, D. Guidetti, et al., Thrombolysis with alteplase 3 to 4.5 hours after acute ischemic stroke, New Engl. J. Med., 359 (2008), 1317–1329. https://doi.org/10.1056/NEJMoa0804656 doi: 10.1056/NEJMoa0804656
    [4] H. Liu, W. Hu, F. Zhang, W. Gu, J. Hong, J. Chen, et al., Efficacy and safety of rt-PA intravenous thrombolysis in patients with wake-up stroke: A meta-analysis, Medicine, 101 (2022), e28914. https://doi.org/10.1097/MD.0000000000028914
    [5] A. R. Al-Buhairi, M. M. Jan, Recombinant tissue plasminogen activator for acute ischemic stroke, Neurosci. J., 7 (2002), 7–13.
    [6] A. Nelson, G. Kelly, R. Byyny, C. Dionne, C. Preslaski, K. Kaucher, Tenecteplase utility in acute ischemic stroke patients: A clinical review of current evidence, Am. J. Emerg. Med., 37 (2019), 344–348. https://doi.org/10.1016/j.ajem.2018.11.018 doi: 10.1016/j.ajem.2018.11.018
    [7] B. C. Campbell, H. Ma, S. Curtze, G. A. Donnan, M. Kaste, Extending thrombolysis to 4.5–9 h and wake-up stroke using perfusion imaging: a systematic review and meta-analysis of individual patient data, Lancet, 394 (2019), 139–147. https://doi.org/10.1016/S0140-6736(19)31053-0 doi: 10.1016/S0140-6736(19)31053-0
    [8] A. Damiza-Detmer, I. Damiza, M. Pawełczyk, Wake-up stroke-diagnosis, management and treatment, Curr. Neurol., 20 (2020), 66–70. https://doi.org/10.15557/AN.2020.0009 doi: 10.15557/AN.2020.0009
    [9] A. Wouters, R. Lemmens, P. Dupont, V. Thijs, Wake-up stroke and stroke of unknown onset: a critical review, Front. Neurol., 5 (2014). https://doi.org/10.3389/fneur.2014.00153 doi: 10.3389/fneur.2014.00153
    [10] C. S. Anderson, T. Robinson, R. I. Lindley, H. Arima, P. M. Lavados, T. H. Lee, et al., Low-dose versus standard-dose intravenous alteplase in acute ischemic stroke, New Engl. J. Med., 374 (2016), 2313–2323. https://doi.org/10.1056/NEJMoa1515510 doi: 10.1056/NEJMoa1515510
    [11] J. Mackey, D. Kleindorfer, H. Sucharew, C. J. Moomaw, B. M. Kissela, K. Alwell, et al., Population-based study of wake-up strokes, Neurology, 76 (2011), 1662–1667. https://doi.org/10.1212/WNL.0b013e318219fb30 doi: 10.1212/WNL.0b013e318219fb30
    [12] S. Emeriau, I. Serre, O. Toubas, F. Pombourcq, C. Oppenheim, L. Pierot, Can diffusion-weighted imaging-fluid-attenuated inversion recovery mismatch (positive diffusion-weighted imaging/negative fluid-attenuated inversion recovery) at 3 tesla identify patients with stroke at < 4.5 hours?, Stroke, 44 (2013), 1647–1651. https://doi.org/10.1161/STROKEAHA.113.001001 doi: 10.1161/STROKEAHA.113.001001
    [13] D. Buck, L. C. Shaw, C. I. Price, G. A. Ford, Reperfusion therapies for wake-up stroke: systematic review, Stroke, 45 (2014), 1869–1875. https://doi.org/10.1161/STROKEAHA.114.005126 doi: 10.1161/STROKEAHA.114.005126
    [14] O. M. Rønning, Reperfusion therapy in stroke cases with unknown onset, Tidsskrift for Den norske legeforening, 136 (2016), 1333. https://doi.org/10.4045/tidsskr.16.0626 doi: 10.4045/tidsskr.16.0626
    [15] Q. Chen, T. Xia, M. Zhang, N. Xia, J. Liu, Y. Yang, Radiomics in stroke neuroimaging: techniques, applications, and challenges, Aging Dis., 12 (2021), 143–154. https://doi.org/10.14336/AD.2020.0421
    [16] K. C. Ho, W. Speier, H. Zhang, F. Scalzo, S. El-Saden, C. W. Arnold, A machine learning approach for classifying ischemic stroke onset time from imaging, IEEE Trans. Med. Imaging, 38 (2019), 1666–1676. https://doi.org/10.1109/TMI.2019.2901445 doi: 10.1109/TMI.2019.2901445
    [17] H. Lee, E. J. Lee, S. Ham, H. B. Lee, J. S. Lee, S. U. Kwon, et al., Machine learning approach to identify stroke within 4.5 hours, Stroke, 51 (2020), 860–866. https://doi.org/10.1161/STROKEAHA.119.027611 doi: 10.1161/STROKEAHA.119.027611
    [18] H. Zhu, L. Jiang, H. Zhang, L. Luo, Y. Chen, Y. Chen, An automatic machine learning approach for ischemic stroke onset time identification based on DWI and FLAIR imaging, Neuroimage-Clin., 31 (2021), 102744. https://doi.org/10.1016/j.nicl.2021.102744 doi: 10.1016/j.nicl.2021.102744
    [19] Y. Q. Zhang, A. F. Liu, F. Y. Man, Y. Y. Zhang, C. Li, Y. E. Liu, et al., MRI radiomic features-based machine learning approach to classify ischemic stroke onset time, J. Neurol., 269 (2022), 350–360.
    [20] M. Jenkinson, S. Smith, A global optimisation method for robust affine registration of brain images, Med. Image Anal., 5 (2001), 143–156. https://doi.org/10.1016/S1361-8415(01)00036-6 doi: 10.1016/S1361-8415(01)00036-6
    [21] M. Jenkinson, P. Bannister, M. Brady, S. Smith, Improved optimisation for the robust and accurate linear registration and motion correction of brain images, Neuroimage, 17 (2002), 825–841. https://doi.org/10.1006/nimg.2002.1132 doi: 10.1006/nimg.2002.1132
    [22] H. Lee, K. Jung, D. W. Kang, N. Kim, Fully automated and real-time volumetric measurement of infarct core and penumbra in diffusion-and perfusion-weighted MRI of patients with hyper-acute stroke, J. Digit. Imaging, 33 (2020), 262–272. https://doi.org/10.1007/s10278-019-00222-2 doi: 10.1007/s10278-019-00222-2
    [23] S. M. Smith, Fast robust automated brain extraction, Hum. Brain Mapp., 17 (2002), 143–155. https://doi.org/10.1002/hbm.10062 doi: 10.1002/hbm.10062
    [24] M. Jenkinson, M. Pechaud, S. Smith, BET2: MR-based estimation of brain, skull and scalp surfaces, in Eleventh Annual Meeting of the Organization for Human Brain Mapping, 17 (2005), 167.
    [25] J. J. M. Van Griethuysen, A. Fedorov, C. Parmar, A. Hosny, N. Aucoin, V. Narayan, et al., Computational radiomics system to decode the radiographic phenotype, Cancer Res., 77 (2017), e104–e107. https://doi.org/10.1158/0008-5472.CAN-17-0339 doi: 10.1158/0008-5472.CAN-17-0339
    [26] Y. Zhang, B. Zhang, F. Liang, S. Liang, Y. Zhang, P. Yan, et al., Radiomics features on non-contrast-enhanced CT scan can precisely classify AVM-related hematomas from other spontaneous intraparenchymal hematoma types, Eur. Radiol., 29 (2019), 2157–2165. https://doi.org/10.1007/s00330-018-5747-x doi: 10.1007/s00330-018-5747-x
    [27] R. Muthukrishnan, R. Rohini, LASSO: A feature selection technique in predictive modeling for machine learning, in 2016 IEEE International Conference on Advances in Computer Applications (ICACA), (2016), 18–20. https://doi.org/10.1109/ICACA.2016.7887916
    [28] X. Wu, H. Wang, F. Chen, L. Jin, J. Li, Y. Feng, et al., Rat model of reperfused partial liver infarction: characterization with multiparametric magnetic resonance imaging, microangiography, and histomorphology, Acta Radiol., 50 (2009), 276–287. https://doi.org/10.1080/02841850802647021 doi: 10.1080/02841850802647021
    [29] C. Wang, P. Miao, J. Liu, Z. Li, Y. Wei, Y. Wang, et al., Validation of cerebral blood flow connectivity as imaging prognostic biomarker on subcortical stroke, J. Neurochem., 159 (2021), 172–184. https://doi.org/10.1111/jnc.15359 doi: 10.1111/jnc.15359
    [30] D. A. Hernandez, R. P. H. Bokkers, R. V. Mirasol, M. Luby, E. C. Henning, J. G. Merino, et al., Pseudocontinuous arterial spin labeling quantifies relative cerebral blood flow in acute stroke, Stroke, 43 (2012), 753–758. https://doi.org/10.1161/STROKEAHA.111.635979 doi: 10.1161/STROKEAHA.111.635979
    [31] C. Wang, P. Miao, J. Liu, S. Wei, Y. Guo, Z. Li, et al., Cerebral blood flow features in chronic subcortical stroke: Lesion location-dependent study, Brain Res., 1706 (2019), 177–183. https://doi.org/10.1016/j.brainres.2018.11.009 doi: 10.1016/j.brainres.2018.11.009
    [32] T. Love, D. Swinney, E. Wong, R. Buxton, Perfusion imaging and stroke: A more sensitive measure of the brain bases of cognitive deficits, Aphasiology, 16 (2002), 873–883. https://doi.org/10.1080/02687030244000356 doi: 10.1080/02687030244000356
    [33] M. H. Lev, A. Z. Segal, J. Farkas, S. T. Hossain, C. Putman, G. J. Hunter, et al., Utility of perfusion-weighted CT imaging in acute middle cerebral artery stroke treated with intra-arterial thrombolysis-Prediction of final infarct volume and clinical outcome, Stroke, 32 (2001), 2021–2027. https://doi.org/10.1161/hs0901.095680 doi: 10.1161/hs0901.095680
    [34] C. Grefkes, G. R. Fink, Connectivity-based approaches in stroke and recovery of function, Lancet Neurol., 13 (2014), 206–216. https://doi.org/10.1016/S1474-4422(13)70264-3 doi: 10.1016/S1474-4422(13)70264-3
    [35] M. Giacalone, P. Rasti, N. Debs, C. Frindel, T. H. Cho, E. Grenier, Local spatio-temporal encoding of raw perfusion MRI for the prediction of final lesion in stroke, Med. Image Anal., 50 (2018), 117–126. https://doi.org/10.1016/j.media.2018.08.008 doi: 10.1016/j.media.2018.08.008
    [36] J. D. Jordan, W. J. Powers, Cerebral autoregulation and acute ischemic stroke, Am. J. Hypertens., 25 (2012), 946–950. https://doi.org/10.1038/ajh.2012.53 doi: 10.1038/ajh.2012.53
    [37] X. Yao, L. Mao, S. Lv, Z. Ren, W. Li, K. Ren, CT radiomics features as a diagnostic tool for classifying basal ganglia infarction onset time, J. Neurol. Sci., 412 (2020), 116730. https://doi.org/10.1016/j.jns.2020.116730 doi: 10.1016/j.jns.2020.116730
    [38] Z. Yi, L. Long, Y. Zeng, Z. Liu, Current advances and challenges in radiomics of brain tumors, Front. Oncol., 11 (2021). https://doi.org/10.3389/fonc.2021.732196 doi: 10.3389/fonc.2021.732196
    [39] Y. Zhang, B. Zhang, F. Liang, S. Liang, Y. Zhang, P. Yan, et al., Radiomics features on non-contrast-enhanced CT scan can precisely classify AVM-related hematomas from other spontaneous intraparenchymal hematoma types, Eur. Radiol., 29 (2019), 2157–2165. https://doi.org/10.1007/s00330-018-5747-x doi: 10.1007/s00330-018-5747-x
    [40] M. E. Mayerhoefer, A. Materka, G. Langs, I. Häggström, P. Szczypiński, P. Gibbs, et al., Introduction to radiomics, J. Nucl. Med., 61 (2020), 488–495. https://doi.org/10.2967/jnumed.118.222893 doi: 10.2967/jnumed.118.222893
    [41] M. Zhou, J. Scott, B. Chaudhury, L. Hall, D. Goldgof, K. W. Yeom, et al., Radiomics in brain tumor: Image assessment, quantitative feature descriptors, and machine-learning approaches, Am. J. Neuroradiol., 39 (2018), 208–216.
© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)