Research article

Simulating a strongly nonlinear backward stochastic partial differential equation via efficient approximation and machine learning

  • We have studied a strongly nonlinear backward stochastic partial differential equation (B-SPDE) through an approximation method and with machine learning (ML)-based Monte Carlo simulation. This equation is well-known and was previously derived from studies in finance. However, how to analyze and solve this equation has remained a problem for quite a long time. The main difficulty is due to the singularity of the B-SPDE since it is a strongly nonlinear one. Therefore, by introducing new truncation operators and integrating the machine learning technique into the platform of a convolutional neural network (CNN), we have developed an effective approximation method with a Monte Carlo simulation algorithm to tackle the well-known open problem. In doing so, the existence and uniqueness of a 2-tuple adapted strong solution to an approximation B-SPDE were proved. Meanwhile, the convergence of a newly designed simulation algorithm was established. Simulation examples and an application in finance were also provided.

    Citation: Wanyang Dai. Simulating a strongly nonlinear backward stochastic partial differential equation via efficient approximation and machine learning[J]. AIMS Mathematics, 2024, 9(7): 18688-18711. doi: 10.3934/math.2024909




    Image blur, such as motion blur, is a common disturbance in real-world photography applications. Therefore, image deblurring is of great importance for further practical vision tasks. Motion blur can be modeled as the convolution of the sharp image and the blur kernel, which is typically unknown in real-world scenarios. The image degradation can be modeled as:

    B = L ⊗ K + n, (1.1)

    where B, L, and K denote the motion-blurred image, the sharp image, and the blur kernel (point spread function), respectively, and n represents the additive white Gaussian noise with a mean of 0 and a standard deviation of σ, which is introduced during the image degradation process. The symbol ⊗ denotes the convolution operator.
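    As a concrete illustration, the degradation model (1.1) can be simulated in a few lines of NumPy/SciPy; the function name `degrade` and its defaults are illustrative, not part of the paper's code:

```python
import numpy as np
from scipy.signal import convolve2d

def degrade(L, K, sigma, rng=None):
    """Synthesize a blurred observation B = L (*) K + n  (Eq 1.1).

    L : sharp image (2-D array), K : blur kernel, sigma : noise std.
    """
    rng = np.random.default_rng(rng)
    B = convolve2d(L, K, mode="same", boundary="symm")   # L (*) K
    n = rng.normal(0.0, sigma, size=B.shape)             # additive white Gaussian noise
    return B + n
```

    Blind deblurring inverts this forward model: given only B, it must recover both L and K.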

    Blind deblurring aims to reconstruct both the blur kernel K and the sharp latent image L from a blurred input image B. However, this process is ill-posed because different combinations of L and K can produce identical outputs of B. To address this problem, it is essential to incorporate prior knowledge to avoid the local optimal solution.

    Researchers have extensively explored the optimization of blur kernels modeled with image priors in recent years[1,2,3]. Li et al. [4] utilized a deep network to formulate the image prior as a binary classifier. Levin et al. [5] employed hyper-Laplacian priors to model the latent image and derived a simple approximation method to optimize the maximum a posteriori (MAP) estimate. In pursuit of an efficient blind deblurring method, various prior terms tailored to enhance image clarity have been integrated into the MAP framework[6,7,8]. Krishnan et al. [9] utilized an L1/L2 regularization scheme to sparsely represent the gradient image; its main feature is to adapt the L1 norm regularization by using the L2 norm of the image gradient as a weight in the iterative process. However, this approach is not conducive to recovering image details in the early stages of the optimization. Meanwhile, Xu et al. [10] proposed an unnatural L0 norm sparse representation to eliminate detrimental small-amplitude structures, providing a unified framework for both uniform and non-uniform motion deblurring. Liu et al. [11] observed that the surface maps of intermediate latent images containing detrimental structures typically have a large surface area, and introduced an additional surface-aware prior based on the L0 norm to enforce sparsity on the image gradient, thereby preserving sharp edges and removing unfavorable microstructures from the intermediate latent images.

    These methods still fail when dealing with images containing many saturated pixels and large blur kernels. Therefore, recent works concentrate on image reconstruction with outliers for non-blind deblurring[12] and blind deblurring tasks[13,14,15]. Chen et al. [16] proposed to remove outliers by adopting a confidence map, and further shrank them by multiplying with the map's inverse value[17]. Zhang et al. [18] proposed an intermediate image correction method for saturated pixels that improves the quality of saturated image restoration by screening the intermediate image using Bayesian a posteriori estimation and excluding pixels that adversely affect the blur kernel estimation. Much progress has been made in blur estimation for natural images and in image reconstruction techniques, but several major problems remain in current blind deblurring algorithms. First, most current motion blur estimation methods assume a linear blurring process[19,20,21]. In practice, blurred images are often accompanied by large noise and outliers, such as saturated pixels, which linear blur models cannot effectively describe, leading to poor performance on blurred images with outlier pixels. In particular, blurred images taken in low-light environments contain large noise and many outliers. Therefore, effectively coping with the interference caused by saturated pixels has great practical value.

    Recently, deep learning methods based on Bayes theory have also been developed[22,23,24]. Kingma et al. [22] proposed the auto-encoding variational Bayes algorithm, where the encoder maps the input into a distribution within the latent space, and the decoder maps sample points from the latent space back to the input space. Zhang et al. [20] and Ren et al. [23] constructed blind deblurring networks based on MAP estimation. However, these deep learning-based methods can easily fail when the data distribution differs from the training data. For this reason, the proposed method focuses on the conventional iterative blind deblurring approach.

    This work investigates the blind deblurring optimization model for saturated pixels established under the MAP framework. By solving the intermediate image and blur kernel by alternating iterations, the blur kernel will eventually converge to the blur kernel of the observed image. In order to overcome the highly ill-posed problem of blind deblurring, the image regularity and the blur kernel regularity are usually used to constrain the model. Although the dark channel prior (DCP) has achieved excellent results, when dealing with images with larger blur kernels or saturated pixels, the results are often unsatisfactory. Therefore, we utilize the pixel screening strategy [18] to further correct the intermediate images with large blur kernels or saturated pixels. By distinguishing whether a pixel conforms to the linear degradation assumption, the proposed method reduces the influence of unfavorable structure to obtain a more accurate blur kernel.

    We use the MAP probability estimation to construct a probabilistic modeling framework between a sharp image, a blur kernel, and a blurred image. Given the blurred image, the sharp image and the blur kernel are estimated by maximizing a posterior probability based on the assumption that the sharp image L and the blur kernel K are independent of each other. According to the conditional probability formula, we obtain

    (L, K) = argmax_{L,K} P(L, K | B) = argmax_{L,K} P(B | L, K) P(L) P(K) / P(B). (2.1)

    Taking the negative logarithms on both sides of the above equation, we derive a new form that is equivalent to the original probability density function:

    −log P(L, K | B) ∝ −log P(B | L, K) − log P(L) − log P(K). (2.2)

    Assume that n is additive white Gaussian noise with a mean of 0 and a variance of σ², so that B follows a normal distribution when L and K are known. The solution of L and K is then transformed into the following minimization problem:

    (L, K) = argmin_{L,K} ‖L ⊗ K − B‖₂² + Φ(L) + Ψ(K). (2.3)

    The first term on the right-hand side is the data fitting term; the second and third terms are regularization terms that encode a priori knowledge, including statistical and distributional properties of the sharp image and the blur kernel. Blind deblurring first estimates the blur kernel and then recovers the sharp image from the blurred image.

    Motion blur is usually caused by relative motion between the camera and the subject. This motion causes pixels to shift in a specific direction and over a specific distance, thus resulting in image degradation. Assume all values in the blur kernel are greater than or equal to 0 and sum to 1, that is,

    K(z) ≥ 0,   Σ_{z∈Ω_k} K(z) = 1,

    where Ω_k denotes the blur kernel domain.
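    These two constraints are typically re-imposed after every kernel update; a minimal NumPy sketch of that projection (the function name is illustrative):

```python
import numpy as np

def project_kernel(K):
    """Enforce the blur-kernel constraints: K(z) >= 0 and sum_z K(z) = 1."""
    K = np.maximum(K, 0.0)          # non-negativity
    s = K.sum()
    return K / s if s > 0 else K    # unit sum (leave an all-zero kernel alone)
```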

    Since blur kernels are sparse, we constrain the possible blur kernels as follows:

    Ψ(K) = ‖K‖_p, (2.4)

    where ‖·‖_p denotes the L_p norm. As the L2 norm constraint focuses more on the smoothness of the blur kernel, it leads to more stable kernel estimation. Therefore, we use the L2 norm to constrain the blur kernel in this paper.

    The dark channel is a natural metric for distinguishing sharp images from blurry images[25]. He et al. [26] first proposed dark channels for image haze removal. The dark channel of image L can be defined as the minimum value of an image patch as follows:

    C_{i,j}(L) = min_{(x,y)∈N(i,j)} ( min_{c∈{r,g,b}} L_c(x,y) ), (2.5)

    where N(i,j) is the image patch centered at pixel (i,j). Experiments show that the dark channels of sharp images are sparser. A possible reason is that image blur is a weighted sum of pixel values within a local neighborhood, thereby increasing the dark channel values. Therefore, we use the L0 norm of the dark channel as the image regularization term.
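    The dark channel (2.5) can be computed as a channel-wise minimum followed by a local minimum filter; a minimal sketch using SciPy (the default patch size and the function name are illustrative):

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel C(L): per-pixel minimum over color channels, then a
    local minimum over each patch x patch neighborhood (Eq 2.5).

    img : H x W x 3 array (or H x W grayscale).
    """
    per_pixel_min = img.min(axis=2) if img.ndim == 3 else img
    return minimum_filter(per_pixel_min, size=patch, mode="nearest")
```

    On sharp natural images this map is mostly near zero; blurring averages neighboring intensities and lifts it, which is what the ‖D(L)‖₀ regularizer penalizes.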

    The deblurring model based on the DCP is to solve the following problem:

    min_{L,K} ‖L ⊗ K − B‖₂² + λ‖D(L)‖₀ + μ‖∇L‖₀ + γ‖K‖₂². (2.6)

    The first term of this formula is a fidelity term that constrains the convolution of the recovered image with the blur kernel to be as similar as possible to the observed result. The ‖∇L‖₀ term is used to preserve large image gradients, and ‖D(L)‖₀ measures the sparsity of the dark channel. Blind deconvolution methods commonly optimize L and K alternately during the iterative process; the main purpose of this alternating optimization is to progressively refine the motion blur kernel K and the latent image L.

    In this work, the following two subproblems are solved by the alternating iteration method:

    min_L ‖L ⊗ K − B‖₂² + λ‖D(L)‖₀ + μ‖∇L‖₀,     min_K ‖L ⊗ K − B‖₂² + γ‖K‖₂². (2.7)

    Specifically, in the k-th iteration, L can be solved using the fast Fourier transform. When L is given, kernel estimation in Eq (2.7) is a least-squares problem. Gradient-based kernel estimation methods have shown superiority [11], and the kernel estimation model is as follows:

    K^{k+1} = argmin_K ‖∇L^{k+1} ⊗ K − ∇B‖₂² + γ‖K‖₂². (2.8)
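    Under circular boundary conditions, this quadratic kernel subproblem has a closed-form Wiener-style solution in the Fourier domain. The sketch below works in the intensity domain for simplicity; the gradient-domain variant is solved analogously by replacing the images with their gradients. The function name and the top-left kernel anchoring are assumptions for illustration:

```python
import numpy as np

def estimate_kernel(L, B, gamma, ksize):
    """Closed-form FFT solution of  min_K ||L (*) K - B||_2^2 + gamma ||K||_2^2
    under circular convolution, with the kernel anchored at the top-left
    corner of the image grid."""
    FL, FB = np.fft.fft2(L), np.fft.fft2(B)
    FK = np.conj(FL) * FB / (np.abs(FL) ** 2 + gamma)   # ridge-regularized deconvolution
    K = np.real(np.fft.ifft2(FK))[:ksize, :ksize]       # crop the kernel taps
    K = np.maximum(K, 0.0)                              # re-apply kernel constraints
    return K / K.sum() if K.sum() > 0 else K
```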

    Normally, blind image deblurring follows the basic linear blurring assumption of Eq (1.1). However, methods based on this assumption do not yield satisfactory results when recovering images with a high number of saturated pixels. When outliers are present, the intermediate latent images estimated with traditional data fidelity terms contain significant artifacts. Even a small number of outliers severely degrades the quality of the estimated blur kernel, because these outliers do not fit the linear model.

    An effective way to identify and discard outliers during the iterative process is to assign different weights to the pixels while updating the latent image and the blur kernel. Pixels categorized as outliers are assigned a weight of zero to ensure that they do not affect subsequent iterations[18]. We introduce a variable Z to determine whether pixel (i,j) complies with the linearity assumption[12], and define the intermediate correction operator as

    P^{k+1}_{i,j} = P(Z^{k+1}_{i,j} = 1 | B_{i,j}, K^k, L^{k+1}). (2.9)

    According to the Bayes formula, we have

    P(Z^{k+1}_{ij} = 1 | B_{ij}, K^k, L^{k+1}) = P(B_{ij} | Z^{k+1}_{ij} = 1, K^k, L^{k+1}) P(Z^{k+1}_{ij} = 1 | K^k, L^{k+1}) / P(B_{ij} | K^k, L^{k+1}). (2.10)

    In this work, we assume that the noise n obeys a Gaussian distribution with a mean of 0 and a variance of σ². When Z^{k+1}_{ij} = 1, the degradation assumption holds, and we obtain

    P(B_{ij} | Z^{k+1}_{ij} = 1, K^k, L^{k+1}) = φ_{ij}, (2.11)

    where φ_{ij} ~ N((L^{k+1} ⊗ K^k)_{ij}, σ²).

    When Z^{k+1}_{ij} = 0, pixel (i,j) is considered an outlier, and its distribution is approximated by a uniform distribution as follows:

    P(B_{ij} | Z^{k+1}_{ij} = 0, K^k, L^{k+1}) = 1/(b − a), (2.12)

    where b and a correspond to the maximum and minimum values of the input image, respectively.

    Given the intermediate image L^{k+1} and kernel K^k, we use p₀ to represent the percentage of image pixels that deviate from the linear model. The probability of a pixel deviating from Eq (1.1) can then be defined as

    P(Z^{k+1}_{ij} = 0 | K^k, L^{k+1}) = p₀, (2.13)

    and we generally assume that about 0–10% of the pixels deviate. The probability of satisfying the linearity assumption of Eq (1.1), for a given intermediate blur kernel and intermediate image, is

    P(Z^{k+1}_{ij} = 1 | K^k, L^{k+1}) = 1 − p₀. (2.14)

    According to the full probability formula, we obtain

    P(B_{ij} | K^k, L^{k+1}) = Σ_{Z_{ij}=0,1} P(B_{ij} | Z^{k+1}_{ij}, K^k, L^{k+1}) P(Z^{k+1}_{ij} | K^k, L^{k+1}) = φ_{ij}(1 − p₀) + p₀/(b − a). (2.15)

    Thus, with the above definitions, the pixel screening operator P is calculated as follows:

    P^{k+1}_{i,j} = φ_{ij}(1 − p₀) / (φ_{ij}(1 − p₀) + p₀/(b − a)). (2.16)
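    Combining (2.11), (2.12), and (2.16), the screening operator can be evaluated pixel-wise as a posterior inlier probability; a minimal NumPy sketch (function and parameter names are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def screening_map(B, L, K, sigma, p0, lo, hi):
    """Pixel screening operator P of Eq (2.16): the posterior probability
    that each pixel follows the linear degradation model.

    phi is the Gaussian likelihood N((L (*) K)_ij, sigma^2) of the observed
    pixel; outliers are modeled as uniform on the intensity range [lo, hi].
    """
    pred = convolve2d(L, K, mode="same", boundary="symm")          # (L (*) K)_ij
    phi = np.exp(-(B - pred) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
    return phi * (1.0 - p0) / (phi * (1.0 - p0) + p0 / (hi - lo))
```

    Pixels that match the linear prediction get a weight near 1, while saturated or otherwise deviating pixels get a weight near 0 and are effectively excluded from the kernel update.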

    During the iterative process, after obtaining the estimated intermediate image, we alternately estimate the blur kernel. Based on the intermediate correction operator, we screen and correct the pixels of the intermediate images. Pixels with a high probability of deviation, which have a greater adverse impact on blur kernel estimation, are appropriately corrected. With the corrected intermediate image, we solve the following model to estimate the blur kernel:

    K^{k+1} = argmin_K ‖(L^{k+1} ∘ P) ⊗ K − B‖₂² + γ‖K‖₂², (2.17)

    where ∘ denotes the element-wise (Hadamard) product.

    As shown in Figure 1, this work is carried out within a multi-scale deblurring framework, where kernel estimation proceeds from coarse to fine over an image pyramid. Given the color input image, we first transform it to grayscale. We use this image to create a pyramid and resize the blur kernel with a down-sampling operation, thus obtaining a set of multi-resolution images. Starting from the smallest level, the structure of the whole image is restored and the rough blur kernel is recovered using the correction operator. As the image and kernel resolutions increase, the finer details are gradually restored.
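    The coarse-to-fine scheme can be sketched as a loop over an image pyramid, re-imposing the kernel constraints at each level; `solve_scale` below is a placeholder for one run of the alternating updates (2.7)/(2.17), and all names are illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def build_pyramid(B, levels=4, scale=0.5):
    """Image pyramid for the coarse-to-fine scheme: returns the levels
    ordered from coarsest to the input resolution."""
    pyr = [B]
    for _ in range(levels - 1):
        pyr.append(zoom(pyr[-1], scale, order=1))   # bilinear down-sampling
    return pyr[::-1]

def coarse_to_fine(B, ksize, solve_scale, levels=4):
    """Run the per-scale solver from coarse to fine.  solve_scale(B_s, K)
    stands in for the alternating updates at one resolution and must
    return the latent image and kernel (L_s, K_s) for that scale."""
    K = np.full((ksize, ksize), 1.0 / ksize**2)     # flat initial kernel
    for B_s in build_pyramid(B, levels):
        L_s, K = solve_scale(B_s, K)
        K = np.maximum(K, 0.0)
        K /= K.sum()                                # keep kernel constraints
    return L_s, K
```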

    Figure 1.  Kernel estimation from coarse to fine under the multi-scale deblurring framework.

    In order to verify the effectiveness of this method, we conduct numerical experiments on both synthetic and real-world image datasets to compare the performance of the dark channel blind deblurring method before and after the correction improvement. We set the parameters λ = 0.003, μ = 0.003, and γ = 2, and p₀ is an adjustable parameter in the range of 0.02 to 0.1. Figure 2 compares results on the Levin dataset[5] obtained by adjusting p₀ from 0.02 to 0.16. The results show that the deblurring performance depends on the choice of p₀: the more outliers are present, the larger the value of p₀ that yields better results.

    Figure 2.  Peak signal-to-noise ratio (PSNR) comparison on the Levin dataset obtained by adjusting p₀ from 0.02 to 0.16.

    The experimental hardware configuration is an Intel Core i5-10300 CPU, an NVIDIA GeForce GTX 1650 GPU, and 16.0 GB of RAM; the operating system is Windows 10 (64-bit). We use the PSNR and the structural similarity index (SSIM) as our evaluation metrics.
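    For reference, the PSNR metric reported below can be computed as follows (a minimal sketch; the paper's exact evaluation code may differ):

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with intensities in [0, peak]."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```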

    We use the Levin dataset [5] and the Köhler dataset [27] to evaluate our method. The Levin dataset is a standard benchmark consisting of 32 blurred images synthesized from 4 original images and 8 different blur kernels, each image of size 255×255. The Köhler dataset is a standard benchmark consisting of 48 blurred images synthesized from 4 original images and 12 different blur kernels, each image of size 800×800. We compare our method with DCP[25], PMP[28], LMG[21], and Sat[17] to demonstrate its effectiveness.

    In Figure 3, the left panel shows the PSNR comparison between the proposed method and state-of-the-art methods, where our method significantly improves the PSNR metric. The right panel shows the error ratio comparison with and without intermediate correction; the proposed method has the smallest error ratio. As shown in Figure 3 and Table 1, experimental results on the Levin dataset demonstrate that the deblurring algorithm proposed in this paper achieves significant performance improvements across a wide range of blur types and degrees. The improved method obtains higher PSNR and SSIM values and reaches a 100% success rate faster, proving its effectiveness in removing blur of different types and degrees.

    Figure 3.  PSNR and error ratio comparison on the Levin dataset.
    Table 1.  Comparison of averaged SSIM on the Levin dataset.
    LMG[21] PMP[28] Sat[17] DCP[25] Ours
    SSIM 0.4662 0.4753 0.2438 0.4559 0.5268


    Figure 4 shows that our method recovers the image and kernel with fewer artifacts and higher quality.

    Figure 4.  Visual comparison on the Levin dataset. The proposed method obtains the best restoration performance.

    As shown in Figure 5 and Table 2, experimental results on the Köhler dataset show that the proposed deblurring method achieves significant performance improvement, and the recovery results obtain higher PSNR and SSIM values, demonstrating its effectiveness for image quality improvement. Figure 6 shows that the deblurred image from the proposed method achieves the best restoration quality with the least ringing artifacts. The kernel restored by the proposed method is cleaner, and the image has the best visual quality.

    Figure 5.  PSNR comparison on the Köhler dataset.
    Table 2.  Comparison of averaged SSIM on the Köhler dataset.
    LMG[21] PMP[28] Sat[17] DCP[25] Ours
    SSIM 0.6148 0.6147 0.5351 0.6104 0.6256

    Figure 6.  Visual comparison on the Köhler dataset.

    As shown in Figure 7, we compare the dark channels of intermediate results with and without the intermediate correction. Without the correction strategy, our method reduces to the DCP-based method[25]. The intermediate results show that our method restores sharper edges and clearer blur kernels. The final recovered image contains more details, demonstrating that our method improves the deblurring quality for saturated images.

    Figure 7.  Visual comparison with and without intermediate correction.

    Estimating motion kernels from blurred images with saturated pixel regions is challenging in image processing. As shown in Figure 8, we present three blurry images with saturated pixels to demonstrate the performance of our method. The first column shows the blurry images, and the second and third columns show the results of the DCP[25] and of our method, respectively. The results show that, with the intermediate correction, our method not only improves the quality of the recovered images but also restores clearer blur trajectories.

    Figure 8.  Visual comparison on the real-world dataset.

    In this work, we introduce a blind deblurring method based on the DCP with an intermediate image correction strategy. In order to remove the disadvantageous effect of outliers, such as saturated pixels, we correct the intermediate image during the deblurring process. By assigning different weights to intermediate images, we improve the kernel estimation performance and thus enhance the final image restoration quality. Experimental results show that our method can significantly improve the accuracy and robustness of blur estimation when dealing with blurred images containing noise and outlier pixels.

    Min Xiao: writing—original draft; Jinkang Zhang: writing—original draft; Zijin Zhu: writing—review and editing; Meina Zhang: methodology, supervision. All authors have read and agreed to the published version of the manuscript.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work is supported by the Science Foundation of China University of Petroleum, Beijing (No. 2462023YJRC008), Foundation of National Key Laboratory of Computational Physics (No. 6142A05QN23005), Postdoctoral Fellowship Program of CPSF (Nos. GZC20231997 and 2024M752451), National Natural Science Foundation of China (No. 62372467).

    The authors have no conflicts to disclose.



  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
