Research article

WG-ICRN: Protein 8-state secondary structure prediction based on Wasserstein generative adversarial networks and residual networks with Inception modules


  • Received: 06 December 2022 Revised: 18 January 2023 Accepted: 11 February 2023 Published: 20 February 2023
  • Protein secondary structure is the basis of studying the tertiary structure of proteins, drug design and development, and the 8-state protein secondary structure can provide more adequate protein information than the 3-state structure. Therefore, this paper proposes a novel method WG-ICRN for predicting protein 8-state secondary structures. First, we use the Wasserstein generative adversarial network (WGAN) to extract protein features in the position-specific scoring matrix (PSSM). The extracted features are combined with PSSM into a new feature set of WG-data, which contains richer feature information. Then, we use the residual network (ICRN) with Inception to further extract the features in WG-data and complete the prediction. Compared with the residual network, ICRN can reduce parameter calculations and increase the width of feature extraction to obtain more feature information. We evaluated the prediction performance of the model using six datasets. The experimental results show that the WGAN has excellent feature extraction capabilities, and ICRN can further improve network performance and improve prediction accuracy. Compared with four popular models, WG-ICRN achieves better prediction performance.

    Citation: Shun Li, Lu Yuan, Yuming Ma, Yihui Liu. WG-ICRN: Protein 8-state secondary structure prediction based on Wasserstein generative adversarial networks and residual networks with Inception modules[J]. Mathematical Biosciences and Engineering, 2023, 20(5): 7721-7737. doi: 10.3934/mbe.2023333




    1. Introduction

    Portfolio selection aims at either maximizing the return or minimizing the risk. Markowitz (1952) suggested selecting the portfolio that minimizes the standard deviation at a given expected return, under the assumption that asset returns are normally distributed; that is, standard deviation is chosen as the risk measure. Markowitz's work laid the cornerstone of the modern portfolio selection framework.

    Risk measures and probability distributions are two important constituents of portfolio selection theory. The traditional Markowitz model (Markowitz, 1952) rests on the normality assumption and takes standard deviation as the risk measure.

    One disadvantage of taking standard deviation (StD) as a risk measure is that losses in extreme cases tend to be underestimated. To overcome this difficulty, the idea of Value at Risk (VaR) is widely used in practice. Artzner et al. (1999) suggest that a desirable risk measure should be "coherent"; however, VaR does not fulfill the subadditivity condition required by the definition of coherence. Yiu (2004) proposed optimal portfolio selection under a Value-at-Risk constraint. Expected Shortfall (ES), on the other hand, is a coherent and popular risk measure for portfolio selection that averages the tail uncertainties.

    It is well known that financial data cannot be described satisfactorily by the normal distribution. The normality assumption is restrictive and is generally violated due to financial market uncertainties and managers' risk aversion. As Behr and Pötter (2009) pointed out, alternatives to the multivariate normal distribution are necessary for portfolio selection. A desirable alternative model should be able to capture tail heaviness, skewness, and excess kurtosis. Various heavy-tailed distributions have been applied to portfolio selection problems. Among these, Mandelbrot (1997) concluded that daily stock returns exhibit heavy-tailed distributions; Hu and Kercheval (2010) apply the multivariate skewed t and Student t distributions to efficient frontier analysis; the generalized hyperbolic distribution is extensively studied in (Behr and Pötter, 2009; Eberlein, 2001; Hellmich and Kassberger, 2011; Hu and Kercheval, 2007; Surya and Kurniawan, 2014; Socgnia and Wilcox, 2014), with special cases including the hyperbolic distribution (Bingham and Kiesel, 2001; Eberlein and Keller, 1995), the Variance Gamma distribution (Seneta, 2004), and the Normal Inverse Gaussian distribution (Barndorff-Nielsen, 1995).

    Recently, the Asymmetric Laplace distribution has received considerable attention in the literature; see, e.g., (Ayebo and Kozubowski, 2003; Kollo and Srivastava, 2005; Kozubowski and Podgórski, 1999; Kozubowski and Podgórski, 2001; Punathumparambath, 2012). Compared to the normal distribution, the Asymmetric Laplace distribution better captures asymmetry, peakedness, and tail heaviness. Portfolio selection models have been studied extensively under the Asymmetric Laplace framework. Zhu (2007) and Kozubowski and Podgórski (2001) apply the Asymmetric Laplace distribution to financial data. Assuming that asset returns are generated from autoregressive moving average (ARMA) time series models with Asymmetric Laplace noise, Zhu (2007) establishes the asymptotic inference theory under very mild conditions and presents methods for computing conditional Value at Risk (CVaR). Zhao et al. (2015) further propose a mean-CVaR-skewness portfolio selection strategy under the Asymmetric Laplace distribution; their model can be transformed into a quadratic programming problem with explicit solutions.

    In this paper, we extend Hu's work (Hu, 2010) to the Asymmetric Laplace framework. We first derive the equivalence of the mean-VaR/ES/StD-skewness-kurtosis models and show that these models reduce to a quadratic programming problem. Zhao et al. (2015) used moment estimation for the parameters of the Asymmetric Laplace distribution, which is less efficient than maximum likelihood estimation. Exploiting the normal mean-variance mixture representation of the Asymmetric Laplace distribution, and following the Expectation-Maximization algorithm for the multivariate Laplace distribution in Arslan (2010), we derive an EM algorithm for Asymmetric Laplace distributions that outperforms the moment estimation in Zhao et al. (2015). A key advantage of the proposed EM algorithm is that it alleviates the complicated calculation of the Bessel function. This improves many existing methods for estimating Asymmetric Laplace distributions, for example, Hürlimann (2013), Kollo and Srivastava (2005) and Visk (2009). Extensive simulation studies and efficient frontier analysis confirm that our algorithm performs better than moment estimation for parameter estimation.

    The rest of the article is organized as follows. In Section 2, properties of Asymmetric Laplace distributions and coherent risk measures are summarized. In Section 3, portfolio selection under the Asymmetric Laplace framework is derived, complemented with an Expectation-Maximization (EM) algorithm for parameter estimation of Asymmetric Laplace distributions. In Section 4, simulation studies are provided to show the efficiency of the Expectation-Maximization procedure. Section 5 presents real data analysis of portfolio selection models based on Asymmetric Laplace distributions, followed by concluding remarks in Section 6.


    2. Preliminary knowledge


    2.1. Asymmetric Laplace distribution

    Kotz et al. (2001) proposed the Asymmetric Laplace distribution with density function

    f(x) = \frac{2 e^{x'\Sigma^{-1}\mu}}{(2\pi)^{n/2}|\Sigma|^{1/2}} \left( \frac{x'\Sigma^{-1}x}{2+\mu'\Sigma^{-1}\mu} \right)^{v/2} K_v\left( \sqrt{(2+\mu'\Sigma^{-1}\mu)(x'\Sigma^{-1}x)} \right), (2.1)

    denoted as X \sim AL_n(\mu, \Sigma). Here, n is the dimension of the random vector X, v = (2-n)/2, and K_v(u) is the modified Bessel function of the third kind, with the following two popular representations:

    K_v(u) = \frac{1}{2}\left( \frac{u}{2} \right)^{v} \int_0^{\infty} t^{-v-1} \exp\left\{ -t - \frac{u^2}{4t} \right\} dt, \quad u > 0, (2.2)
    K_v(u) = \frac{(u/2)^v \Gamma(1/2)}{\Gamma(v+1/2)} \int_1^{\infty} e^{-ut}(t^2-1)^{v-1/2}\, dt, \quad u > 0, \ v \geq -1/2. (2.3)

    When \mu = 0_n, we obtain the symmetric Laplace distribution SL_n(\Sigma) with density

    f(x) = \frac{2}{(2\pi)^{n/2}|\Sigma|^{1/2}} \left( \frac{x'\Sigma^{-1}x}{2} \right)^{v/2} K_v\left( \sqrt{2\, x'\Sigma^{-1}x} \right).

    When n = 1, we have \Sigma = \sigma_{11} = \sigma^2. In this case, (2.1) becomes the univariate Asymmetric Laplace distribution AL_1(\mu, \sigma) with parameters \mu and \sigma. The corresponding density function is

    f(x) = \frac{1}{\gamma} \exp\left\{ -\frac{|x|}{\sigma^2}\left[ \gamma - \mu\, \mathrm{sign}(x) \right] \right\} \quad \text{with} \quad \gamma = \sqrt{\mu^2 + 2\sigma^2}. (2.4)

    The symmetric case (μ=0) leads to the univariate Laplace distribution SL1(0,σ).

    Figure 1 displays plots of symmetric densities and AL densities. The symmetric densities comprise the standard normal distribution, the Student t distribution with 2 degrees of freedom, and the univariate symmetric Laplace distribution, denoted N(0,1), t(2), and SL_1(0,1), respectively. The Student t distribution possesses a heavier tail than the normal distribution, whereas the SL_1(0,1) distribution exhibits greater peakedness and heavier tails than the normal case. As for the AL densities, when \mu > 0 the density skews to the right, and when \mu < 0 it skews to the left.

    Figure 1. Univariate densities.

    Important results on univariate and multivariate Asymmetric Laplace distributions that will be used later are presented below.

    Proposition 2.1. (See Kotz et al., 2001)

    (1). If X = (X_1, \ldots, X_n)' follows the multivariate Asymmetric Laplace distribution, i.e., X \sim AL_n(\mu, \Sigma), where n is the number of securities, then the linear combination w'X = w_1X_1 + \cdots + w_nX_n follows the univariate Asymmetric Laplace distribution, i.e., w'X \sim AL_1(\mu, \sigma), with \mu = w'\mu, \sigma = \sqrt{w'\Sigma w}, w = (w_1, \ldots, w_n)'.

    (2). Assume that univariate random variable YAL1(μ,σ). To measure the asymmetry and peakedness of the distribution, define the skewness (Skew[Y]) and kurtosis (Kurt[Y]) as the third and fourth standardized moment of a random variable Y. Then,

    \mathrm{Skew}[Y] = \frac{E(Y-EY)^3}{[E(Y-EY)^2]^{3/2}} = \frac{2\mu^3 + 3\mu\sigma^2}{(\mu^2+\sigma^2)^{3/2}}, \quad \mathrm{Kurt}[Y] = \frac{E(Y-EY)^4}{[\mathrm{Var}(Y)]^2} = \frac{9\mu^4 + 6\sigma^4 + 18\mu^2\sigma^2}{(\mu^2+\sigma^2)^2}.

    (3). Let X = (X_1, X_2, \ldots, X_n)' \sim AL_n(\mu, \Sigma). Then the first and second order moments of X are

    E(X) = \mu \quad \text{and} \quad \mathrm{Cov}(X) = \Sigma + \mu\mu'.

    (4). The Asymmetric Laplace distribution can be represented as a mixture of a normal vector and a standard exponential variable, i.e., X \sim AL_n(\mu, \Sigma) can be represented as

    X = \mu Z + Z^{1/2} Y,

    where Y \sim N_n(0, \Sigma) and Z \sim \mathrm{Exp}(1). This indicates that we can simulate a multivariate Asymmetric Laplace random vector X \sim AL_n(\mu, \Sigma) as follows:

    1. Generate a multivariate normal vector Y \sim N_n(0, \Sigma);

    2. Generate a standard exponential variable Z \sim \mathrm{Exp}(1);

    3. Construct the Asymmetric Laplace random vector as X = \mu Z + Z^{1/2} Y.

    Figure 2 displays several realizations of bivariate Asymmetric Laplace distribution with different levels of asymmetry and peakedness.

    Figure 2. Bivariate Asymmetric Laplace data. Mean cases: (a1, a2, a3): \mu = (0, 0)'; (b1, b2, b3): \mu = (1, 1)'; (c1, c2, c3): \mu = (-1, -1)'. Covariance matrix \Sigma cases: (a1, b1, c1): \sigma_{11} = \sigma_{22} = 1, \sigma_{12} = \sigma_{21} = 0; (a2, b2, c2): \sigma_{11} = \sigma_{22} = 1, \sigma_{12} = \sigma_{21} = 0.8; (a3, b3, c3): \sigma_{11} = \sigma_{22} = 1, \sigma_{12} = \sigma_{21} = -0.8.

    2.2. Risk measures

    Since the mean and covariance matrix cannot fully characterize a non-Gaussian distribution, alternative risk measures are necessary for portfolio selection problems. Artzner et al. (1999) suggest that a desirable risk measure should satisfy certain properties; such a risk measure is said to be coherent.

    A risk measure ϕ that maps a random variable to a real number is coherent if it satisfies the following conditions:

    1). Translation invariance: \phi(l + h) = \phi(l) + h, for all random losses l and all h \in \mathbb{R};

    2). Subadditivity: \phi(l + h) \leq \phi(l) + \phi(h), for all random losses l, h;

    3). Positive homogeneity: \phi(\lambda l) = \lambda\phi(l), for all random losses l and all \lambda > 0;

    4). Monotonicity: \phi(l_1) \leq \phi(l_2), for all random losses l_1, l_2 with l_1 \leq l_2 almost surely.

    Standard deviation is not coherent in general, except in the Gaussian case. VaR is coherent when the underlying distribution is elliptical. Expected Shortfall, also called conditional value at risk (CVaR), is a coherent risk measure since it always satisfies subadditivity, monotonicity, positive homogeneity, and convexity. For any fixed \alpha \in (0, 1), \mathrm{VaR}_\alpha is the loss level exceeded with probability \alpha, i.e., the (1-\alpha)-quantile of the loss, while \mathrm{ES}_\alpha is the average of \mathrm{VaR}_\beta over \beta \in (0, \alpha). Both VaR and CVaR measure the potential maximal loss. VaR and ES can be written as

    \mathrm{VaR}_\alpha = F^{-1}(1-\alpha) \quad \text{and} \quad \mathrm{ES}_\alpha = E[L \mid L \geq \mathrm{VaR}_\alpha] = \frac{1}{\alpha}\int_0^{\alpha} \mathrm{VaR}_\beta \, d\beta,

    where F(\cdot) is the cumulative distribution function of the loss L and \mathrm{ES}_\alpha is the expected loss beyond \mathrm{VaR}_\alpha. Thus, \mathrm{VaR}_\alpha and \mathrm{ES}_\alpha are determined by

    \int_{\mathrm{VaR}_\alpha}^{\infty} f(x)\, dx = \alpha \quad \text{and} \quad \mathrm{ES}_\alpha = \frac{1}{\alpha}\int_{\mathrm{VaR}_\alpha}^{\infty} x f(x)\, dx, (2.5)

    where f(\cdot) is the density of the loss.

    Under the normality assumption, \mathrm{VaR}_\alpha and \mathrm{ES}_\alpha have the closed forms

    \mathrm{VaR}_\alpha = \mu + \sigma\Phi^{-1}(1-\alpha), \quad \mathrm{ES}_\alpha = \mu + \sigma\,\frac{\psi(\Phi^{-1}(1-\alpha))}{\alpha},

    where \psi(\cdot) is the standard normal density and \Phi^{-1}(\cdot) is the standard normal quantile function.
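    These two closed forms are easy to check numerically; a minimal sketch using only the standard library (the function name is ours):

```python
from statistics import NormalDist

def normal_var_es(mu, sigma, alpha):
    """VaR and ES of a N(mu, sigma^2) loss at exceedance level alpha:
    VaR = mu + sigma * Phi^{-1}(1-alpha), ES = mu + sigma * psi(Phi^{-1}(1-alpha)) / alpha."""
    z = NormalDist().inv_cdf(1.0 - alpha)        # standard normal quantile
    var = mu + sigma * z
    es = mu + sigma * NormalDist().pdf(z) / alpha
    return var, es
```

    For a standard normal loss at \alpha = 0.05 this gives VaR \approx 1.645 and ES \approx 2.063.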

    As shown in Hu et al. (2010), portfolios selected by minimizing standard deviation, \mathrm{VaR}_\alpha, and \mathrm{ES}_\alpha are equivalent under the elliptical distribution assumption.

    It is well documented that asset returns are not normally distributed. As an alternative to the Gaussian distribution, the Asymmetric Laplace distribution exhibits tail heaviness, skewness, and peakedness.


    3. Portfolio selection under ALD framework

    Let X = (X_1, X_2, \ldots, X_n)' \sim AL_n(\mu_P, \Sigma_P) be the return vector of n securities, and let w = (w_1, w_2, \ldots, w_n)' be the allocation weight vector. Then, the portfolio is defined as

    P(w) = w'X = \sum_{i=1}^{n} w_i X_i.

    According to Proposition 2.1 (2), P(w) \sim AL_1(\mu, \sigma) with \mu = w'\mu_P, \sigma = \sqrt{w'\Sigma_P w}.

    From Theorems 3.1–3.2 below, in order to select a portfolio under the Asymmetric Laplace distribution, it suffices to obtain the unknown parameters \mu_P and \Sigma_P. Thus, portfolio selection under the Asymmetric Laplace distribution reduces to parameter estimation for AL_n(\mu_P, \Sigma_P). Zhao et al. (2015) proposed the multi-objective portfolio selection model under the Asymmetric Laplace framework and derived a simplified model that can be reformulated as a quadratic programming problem. However, to estimate the unknown parameters, the authors adopted a moment estimation method that is less efficient than maximum likelihood. Since the Asymmetric Laplace distribution can be represented as a mixture of an exponential variable and a multivariate normal vector, we derive an Expectation-Maximization algorithm for its parameter estimation; the algorithm is discussed in Section 3.2.


    3.1. Portfolio selection theorems

    Theorem 3.1. Let X = (X_1, \ldots, X_n)' \sim AL_n(\mu_P, \Sigma_P) be an n-dimensional random vector following the multivariate Asymmetric Laplace distribution, where each component X_i, i = 1, 2, \ldots, n, represents a stock. Let w be the weight vector and P(w) = w'X = \sum_{i=1}^{n} w_i X_i be the portfolio. Then, under the Asymmetric Laplace framework, the risk measures StD, \mathrm{VaR}_\alpha, and \mathrm{ES}_\alpha at level \alpha \in (0, 1) are formulated as

    \text{Standard Deviation:} \quad \mathrm{StD}(P(w)) = \frac{\sigma}{\sqrt{2}};
    \text{Value at Risk:} \quad \mathrm{VaR}_\alpha(P(w)) = -\frac{\sigma^2}{\gamma+\mu}\,\ln\frac{\alpha\gamma(\gamma+\mu)}{\sigma^2};
    \text{Expected Shortfall:} \quad \mathrm{ES}_\alpha(P(w)) = \frac{\sigma^2}{\gamma+\mu} - \frac{\sigma^2}{\gamma+\mu}\,\ln\frac{\alpha\gamma(\gamma+\mu)}{\sigma^2}.

    Here, \mu = w'\mu_P = \mathrm{mean}(P(w)), \sigma = \sqrt{w'\Sigma_P w}, and \gamma = \sqrt{\mu^2 + 2\sigma^2}.

    Proof. Let \mu_P = (\mu_1, \ldots, \mu_n)' be the mean return vector of the securities (X_1, \ldots, X_n) and \Sigma_P = (\sigma^P_{ij})_{i,j=1}^{n} be the scale matrix. Denote the allocation vector by w = (w_1, \ldots, w_n)'. Then, the portfolio P(w) = \sum_{i=1}^{n} w_i X_i follows the univariate Asymmetric Laplace distribution:

    P(w) = \sum_{i=1}^{n} w_i X_i \sim AL_1(\mu, \sigma) \quad \text{with} \quad \mu = \sum_{i=1}^{n} \mu_i w_i, \quad \sigma = \Big( \sum_{i=1}^{n}\sum_{j=1}^{n} \sigma^P_{ij} w_i w_j \Big)^{1/2}.

    If \mu = 0, the univariate Asymmetric Laplace distribution becomes the symmetric AL_1(0, \sigma) with density

    g(x) = \frac{1}{\gamma}\exp\Big\{ -\frac{\gamma}{\sigma^2}|x| \Big\} \quad \text{with} \quad \gamma = \sqrt{2}\,\sigma.

    Thus, the standard deviation (StD) of the portfolio P(w) = w'X is

    \mathrm{StD}(P(w)) = \int_{-\infty}^{+\infty} \frac{1}{\gamma}|x| \exp\Big\{ -\frac{\gamma}{\sigma^2}|x| \Big\}\, dx = 2\int_{0}^{+\infty} \frac{x}{\gamma} \exp\Big\{ -\frac{\gamma}{\sigma^2}x \Big\}\, dx = \frac{2\sigma^4}{\gamma^3} = \frac{\sigma}{\sqrt{2}}.

    According to the definitions of \mathrm{VaR}_\alpha and \mathrm{ES}_\alpha in (2.5), applied to the loss -P(w), together with the univariate Asymmetric Laplace density (2.4), we have

    \int_{-\infty}^{-\mathrm{VaR}_\alpha} \frac{1}{\gamma}\exp\Big\{ -\frac{|x|}{\sigma^2}\big[ \gamma - \mu\,\mathrm{sgn}(x) \big] \Big\}\, dx = \alpha, \quad \text{i.e.,} \quad \frac{\sigma^2}{\gamma(\gamma+\mu)}\exp\Big\{ -\frac{\gamma+\mu}{\sigma^2}\mathrm{VaR}_\alpha \Big\} = \alpha.

    Thus, \mathrm{VaR}_\alpha and \mathrm{ES}_\alpha are

    \mathrm{VaR}_\alpha(P(w)) = -\frac{\sigma^2}{\gamma+\mu}\,\ln\frac{\alpha\gamma(\gamma+\mu)}{\sigma^2};
    \mathrm{ES}_\alpha(P(w)) = -\frac{1}{\alpha}\int_{-\infty}^{-\mathrm{VaR}_\alpha} x\,\frac{1}{\gamma}\exp\Big\{ -\frac{|x|}{\sigma^2}\big[ \gamma - \mu\,\mathrm{sgn}(x) \big] \Big\}\, dx = \mathrm{VaR}_\alpha(P(w)) + \frac{\sigma^2}{\gamma+\mu} = \frac{\sigma^2}{\gamma+\mu} - \frac{\sigma^2}{\gamma+\mu}\,\ln\frac{\alpha\gamma(\gamma+\mu)}{\sigma^2}.
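    The closed forms of Theorem 3.1 translate directly into code; a sketch (the function name is ours), with ES obtained from VaR through the mean-excess term \sigma^2/(\gamma+\mu) used in the proof:

```python
import math

def al_var_es(mu, sigma, alpha):
    """VaR and ES of P(w) ~ AL_1(mu, sigma) per Theorem 3.1."""
    gamma = math.sqrt(mu * mu + 2.0 * sigma * sigma)
    g = sigma * sigma / (gamma + mu)     # mean-excess term sigma^2/(gamma+mu)
    var = -g * math.log(alpha * gamma * (gamma + mu) / sigma ** 2)
    es = var + g                         # ES = VaR + sigma^2/(gamma+mu)
    return var, es
```

    In the symmetric case \mu = 0, \sigma = 1, \alpha = 0.05, this yields VaR \approx 1.628 and ES \approx 2.335; the ES exceeds its Gaussian counterpart because of the heavier Laplace tail.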

    Then we have the following theorem.

    Theorem 3.2. Let X \sim AL_n(\mu_P, \Sigma_P). Then, for the portfolio P(w) = w'X, the following models based on \mathrm{ES}_\alpha, \mathrm{VaR}_\alpha, and StD (as defined in Theorem 3.1)

    \min_{w} \mathrm{ES}_\alpha(P(w)) \quad \text{or} \quad \min_{w} \mathrm{VaR}_\alpha(P(w)) \quad \text{or} \quad \min_{w} \mathrm{StD}(P(w)),
    \max_{w} \mathrm{Skew}[P(w)] = \frac{2\mu^3 + 3\mu\sigma^2}{(\mu^2+\sigma^2)^{3/2}},
    \max_{w} \mathrm{Kurt}[P(w)] = \frac{9\mu^4 + 6\sigma^4 + 18\mu^2\sigma^2}{(\mu^2+\sigma^2)^2},
    \text{s.t.} \quad w'\mu_P = r_0, \quad w'\mathbf{1} = 1,

    are equivalent. Here, \mu = w'\mu_P = \mathrm{mean}[P(w)], \sigma = \sqrt{w'\Sigma_P w}, and w = (w_1, w_2, \ldots, w_n)'.

    Proof. Let g(\mu, \sigma) = \frac{\sigma^2}{\mu + \sqrt{\mu^2+2\sigma^2}}. Then, \mathrm{VaR}_\alpha[P(w)] and \mathrm{ES}_\alpha[P(w)] can be written as

    \mathrm{VaR}_\alpha[P(w)] = -g(\mu,\sigma)\ln\Big( 2\alpha + \frac{\alpha\mu}{g(\mu,\sigma)} \Big) = -g(\mu,\sigma)\Big[ \ln\alpha + \ln\Big( 2 + \frac{\mu}{g(\mu,\sigma)} \Big) \Big],
    \mathrm{ES}_\alpha[P(w)] = g(\mu,\sigma) - g(\mu,\sigma)\ln\Big( 2\alpha + \frac{\alpha\mu}{g(\mu,\sigma)} \Big) = (1-\ln\alpha)\,g(\mu,\sigma) - g(\mu,\sigma)\ln\Big( 2 + \frac{\mu}{g(\mu,\sigma)} \Big).

    Differentiating the above expressions with respect to \sigma, we have

    \frac{\partial \mathrm{VaR}_\alpha[P(w)]}{\partial\sigma} = \frac{\partial g(\mu,\sigma)}{\partial\sigma}\Bigg[ -\ln\alpha - \ln\Big( 2+\frac{\mu}{g(\mu,\sigma)} \Big) + \frac{\mu/g(\mu,\sigma)}{2+\mu/g(\mu,\sigma)} \Bigg] > 0,
    \frac{\partial \mathrm{ES}_\alpha[P(w)]}{\partial\sigma} = \frac{\partial g(\mu,\sigma)}{\partial\sigma}\Bigg[ 1 - \ln\alpha - \ln\Big( 2+\frac{\mu}{g(\mu,\sigma)} \Big) + \frac{\mu/g(\mu,\sigma)}{2+\mu/g(\mu,\sigma)} \Bigg] > 0,

    where

    \frac{\partial g(\mu,\sigma)}{\partial\sigma} = \frac{\partial}{\partial\sigma}\Bigg[ \frac{\sigma^2}{\mu+\sqrt{\mu^2+2\sigma^2}} \Bigg] = \frac{2\sigma\Big( \mu + \frac{\mu^2+\sigma^2}{\sqrt{\mu^2+2\sigma^2}} \Big)}{\big( \mu+\sqrt{\mu^2+2\sigma^2} \big)^2} > 0.

    The derivative of the skewness measure with respect to \sigma is

    \frac{\partial \mathrm{Skew}[P(w)]}{\partial\sigma} = \frac{\partial}{\partial\sigma}\Bigg[ \frac{2\mu^3+3\mu\sigma^2}{(\mu^2+\sigma^2)^{3/2}} \Bigg] = -\frac{3\mu\sigma^3}{(\mu^2+\sigma^2)^{5/2}} < 0.

    The derivative of the kurtosis measure with respect to \sigma is

    \frac{\partial \mathrm{Kurt}[P(w)]}{\partial\sigma} = \frac{\partial}{\partial\sigma}\Bigg[ \frac{9\mu^4+6\sigma^4+18\mu^2\sigma^2}{(\mu^2+\sigma^2)^2} \Bigg] = \frac{-12\mu^4\sigma^3 - 12\mu^2\sigma^5}{(\mu^2+\sigma^2)^4} < 0.

    The monotonicity of \mathrm{VaR}_\alpha[P(w)], \mathrm{ES}_\alpha[P(w)], \mathrm{Skew}[P(w)], and \mathrm{Kurt}[P(w)] in \sigma indicates that the portfolio selection problems based on these risk measures are equivalent: with \mu fixed at r_0 by the constraint, minimizing \mathrm{VaR}_\alpha[P(w)], \mathrm{ES}_\alpha[P(w)], or \mathrm{StD}[P(w)] amounts to minimizing w'\Sigma_P w.


    3.2. Parameter estimation of Asymmetric Laplace distribution

    Assume X = (X_1, X_2, \ldots, X_n)' \sim AL_n(\mu, \Sigma), and let x_1, x_2, \ldots, x_T \in \mathbb{R}^n be the T observations. We aim at fitting a multivariate Asymmetric Laplace distribution AL_n(\mu, \Sigma) with unknown parameters \mu, \Sigma.

    Hürlimann (2013), Kollo and Srivastava (2005), and Visk (2009) consider moment matching methods, which are less efficient than maximum likelihood estimation. Kotz et al. (2002) and Kotz et al. (2001) presented maximum likelihood estimators for Asymmetric Laplace distributions. However, maximum likelihood estimation requires computation of the complicated Bessel function. We therefore derive an expectation-maximization algorithm for parameter estimation of the Asymmetric Laplace distribution.


    3.2.1. Moment estimation

    As Zhao et al. (2015) pointed out, according to Proposition 2.1 (3), the Asymmetric Laplace distribution can be estimated via the moment method (Moment-AL) with

    \hat{\mu} = \bar{x} \quad \text{and} \quad \hat{\Sigma} = \widehat{\mathrm{Cov}}(X) - \hat{\mu}\hat{\mu}',

    where \bar{x} = \frac{1}{T}\sum_{t=1}^{T} x_t and \widehat{\mathrm{Cov}}(X) = \frac{1}{T}\sum_{t=1}^{T} (x_t-\bar{x})(x_t-\bar{x})'.
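    In code, the moment estimator amounts to two sample moments; a sketch using plain lists (the function name is ours):

```python
def moment_al(X):
    """Moment-AL: mu_hat = sample mean; Sigma_hat = sample covariance - mu_hat mu_hat',
    using Cov(X) = Sigma + mu mu' from Proposition 2.1 (3)."""
    T, n = len(X), len(X[0])
    mu = [sum(row[i] for row in X) / T for i in range(n)]
    cov = [[sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in X) / T
            for j in range(n)] for i in range(n)]
    Sigma = [[cov[i][j] - mu[i] * mu[j] for j in range(n)] for i in range(n)]
    return mu, Sigma
```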


    3.2.2. Maximum likelihood estimation

    Consider sample points x_1, x_2, \ldots, x_T and the Asymmetric Laplace density (2.1). Taking the logarithm of the likelihood function, the log-likelihood is

    \ell(\mu,\Sigma) = \ln L(\mu,\Sigma) = \sum_{t=1}^{T} \ln f(x_t;\mu,\Sigma) = \sum_{t=1}^{T} x_t'\Sigma^{-1}\mu + T\ln 2 - \frac{Tn}{2}\ln(2\pi) - \frac{T}{2}\ln|\Sigma| + \frac{v}{2}\sum_{t=1}^{T} \ln(x_t'\Sigma^{-1}x_t) - \frac{vT}{2}\ln(2+\mu'\Sigma^{-1}\mu) + \sum_{t=1}^{T} \ln K_v\Big\{ \sqrt{(2+\mu'\Sigma^{-1}\mu)(x_t'\Sigma^{-1}x_t)} \Big\}.

    In principle, we could directly maximize the log-likelihood \ell(\mu, \Sigma) with respect to the parameters \mu, \Sigma to obtain the maximum likelihood estimator. Unfortunately, the density involves the modified Bessel function of the third kind, (2.2)–(2.3), which is too complicated for direct numerical maximization. However, the Gaussian-exponential mixture representation of the Asymmetric Laplace distribution allows us to employ the expectation-maximization algorithm, avoiding direct maximization over the modified Bessel function.


    3.2.3. Expectation-maximization algorithm

    We now derive the Expectation-Maximization algorithm for parameter estimation of the multivariate Asymmetric Laplace distribution (mALD), following the EM derivation for the multivariate skew Laplace distribution in Arslan (2010).

    Let X = (X_1, X_2, \ldots, X_n)' be an Asymmetric Laplace random vector. Proposition 2.1 (4) suggests that X can be generated from a latent random variable Z through a multivariate Gaussian distribution with mean z\mu and covariance z\Sigma, i.e., X|Z=z \sim N_n(z\mu, z\Sigma), Z \sim \mathrm{Exp}(1), with densities

    f_{X|Z}(x|z) = \frac{1}{(2\pi)^{n/2}|z\Sigma|^{1/2}} \exp\Big\{ -\frac{1}{2}(x-z\mu)'(z\Sigma)^{-1}(x-z\mu) \Big\}, \quad f_Z(z) = e^{-z}\mathbf{1}\{z \geq 0\}.

    Thus, the joint density function of X and Z is

    f_{X,Z}(x,z) = f_{X|Z}(x|z)\, f_Z(z) = \frac{1}{(2\pi)^{n/2} z^{n/2} |\Sigma|^{1/2}} \exp\Big\{ -\frac{1}{2z}x'\Sigma^{-1}x + x'\Sigma^{-1}\mu - \frac{z}{2}\mu'\Sigma^{-1}\mu - z \Big\}\mathbf{1}\{z \geq 0\}.

    Suppose that the T observations x_1, \ldots, x_T are generated from the latent random variables z_1, z_2, \ldots, z_T, respectively. The complete data are \{(x_t, z_t)\}, t = 1, 2, \ldots, T; in the EM algorithm, x_t and z_t are the observed and missing data, respectively. The complete-data log-likelihood, up to an additive constant, can be written as

    \tilde{L}(\mu,\Sigma) = \sum_{t=1}^{T} \ln f_{X,Z}(x_t, z_t) = -\frac{T}{2}\ln|\Sigma| - \frac{1}{2}\sum_{t=1}^{T} \frac{1}{z_t}\, x_t'\Sigma^{-1}x_t + \sum_{t=1}^{T} x_t'\Sigma^{-1}\mu - \frac{1}{2}\mu'\Sigma^{-1}\mu\sum_{t=1}^{T} z_t - \sum_{t=1}^{T} z_t.

    Note that the last term of the above equation does not contain any unknown parameters and is thus negligible. The E-step then becomes

    E\big(\tilde{L}(\mu,\Sigma)\,\big|\,x_t,\hat{\mu},\hat{\Sigma}\big) \propto -\frac{T}{2}\ln|\Sigma| + \sum_{t=1}^{T} x_t'\Sigma^{-1}\mu - \frac{1}{2}\sum_{t=1}^{T} E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma})\, x_t'\Sigma^{-1}x_t - \frac{1}{2}\mu'\Sigma^{-1}\mu\sum_{t=1}^{T} E(z_t|x_t,\hat{\mu},\hat{\Sigma}),

    where E(z_t|x_t,\hat{\mu},\hat{\Sigma}) and E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma}) are the conditional expectations of z_t and z_t^{-1} given x_t and the current estimates \hat{\mu}, \hat{\Sigma}.

    To evaluate the conditional expectations E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma}) and E(z_t|x_t,\hat{\mu},\hat{\Sigma}), we need the conditional density of Z given X, f_{Z|X}. After some straightforward algebra, the conditional distribution of Z given X is a generalized inverse Gaussian distribution with density function

    f_{Z|X}(z|x,\mu,\Sigma) = \frac{f_{X,Z}(x,z)}{f_X(x)} = \Big( \frac{2+\mu'\Sigma^{-1}\mu}{x'\Sigma^{-1}x} \Big)^{v/2} \frac{z^{-n/2}\exp\big\{ -\frac{1}{2}\big[ z^{-1}x'\Sigma^{-1}x + z\,(2+\mu'\Sigma^{-1}\mu) \big] \big\}}{2K_v\big( \sqrt{(2+\mu'\Sigma^{-1}\mu)(x'\Sigma^{-1}x)} \big)}, \quad z > 0. (3.1)

    Lemma 3.1. (GIG; Stacy, 1962) A random variable X follows the generalized inverse Gaussian distribution, denoted X \sim N^{-}(\lambda, \chi, \psi), if its density function can be represented as

    f(x) = \frac{\chi^{-\lambda}(\sqrt{\chi\psi})^{\lambda}}{2K_\lambda(\sqrt{\chi\psi})}\, x^{\lambda-1}\exp\Big\{ -\frac{1}{2}(\chi x^{-1}+\psi x) \Big\}, \quad x > 0,

    where K_\lambda denotes the modified Bessel function of the third kind, and the parameters satisfy

    \chi > 0,\ \psi \geq 0 \ \text{if}\ \lambda < 0; \quad \chi > 0,\ \psi > 0 \ \text{if}\ \lambda = 0; \quad \chi \geq 0,\ \psi > 0 \ \text{if}\ \lambda > 0.

    After some algebraic manipulations, it is easy to show that Z|X follows the generalized inverse Gaussian distribution:

    Z|X \sim N^{-}\Big( \frac{2-n}{2},\ x'\Sigma^{-1}x,\ 2+\mu'\Sigma^{-1}\mu \Big).

    If \chi > 0 and \psi > 0, the moments can be calculated through the following formulas:

    E(X^{\alpha}) = \Big( \frac{\chi}{\psi} \Big)^{\alpha/2} \frac{K_{\lambda+\alpha}(\sqrt{\chi\psi})}{K_{\lambda}(\sqrt{\chi\psi})}, \quad \alpha \in \mathbb{R}; \qquad E(\ln X) = \frac{dE(X^{\alpha})}{d\alpha}\Big|_{\alpha=0}.

    Denote \chi = x'\Sigma^{-1}x and \psi = 2+\mu'\Sigma^{-1}\mu. Then, Z|X \sim N^{-}\big( \frac{2-n}{2}, \chi, \psi \big). From the conditional density (3.1), we can obtain the conditional expectations via the above moment formulas:

    a_t = E(z_t|x_t,\hat{\mu},\hat{\Sigma}) = \sqrt{\frac{\chi_t}{\psi}}\; \frac{K_{\frac{n}{2}-2}(\sqrt{\chi_t\psi})}{K_{\frac{n}{2}-1}(\sqrt{\chi_t\psi})}, \quad t = 1, 2, \ldots, T;
    b_t = E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma}) = \sqrt{\frac{\psi}{\chi_t}}\; \frac{K_{\frac{n}{2}}(\sqrt{\chi_t\psi})}{K_{\frac{n}{2}-1}(\sqrt{\chi_t\psi})}, \quad t = 1, 2, \ldots, T,

    where \chi_t = x_t'\Sigma^{-1}x_t. The ratio R_\lambda(x) = K_{\lambda+1}(x)/K_{\lambda}(x) is strictly decreasing in x with \lim_{x\to\infty} R_\lambda(x) = 1 and \lim_{x\to 0^{+}} R_\lambda(x) = \infty. Thus, a_t > 0 and b_t > 0 for t = 1, 2, \ldots, T.

    Finally, replacing the conditional expectations E(z_t|x_t,\hat{\mu},\hat{\Sigma}) and E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma}) by a_t and b_t, respectively, the objective function becomes

    Q(\mu,\Sigma|x_t,\hat{\mu},\hat{\Sigma}) = -\frac{T}{2}\ln|\Sigma| + \sum_{t=1}^{T} x_t'\Sigma^{-1}\mu - \frac{1}{2}\sum_{t=1}^{T} b_t\, x_t'\Sigma^{-1}x_t - \frac{1}{2}\mu'\Sigma^{-1}\mu\sum_{t=1}^{T} a_t. (3.2)

    Denote S = \Sigma^{-1}. The objective function (3.2) becomes

    Q(\mu,S|x_t,\hat{\mu},\hat{S}) = \frac{T}{2}\ln|S| + \sum_{t=1}^{T} x_t'S\mu - \frac{1}{2}\sum_{t=1}^{T} b_t\, x_t'Sx_t - \frac{1}{2}\mu'S\mu\sum_{t=1}^{T} a_t. (3.3)

    Taking derivatives of the objective function (3.3) with respect to \mu and S, we obtain

    \frac{\partial Q}{\partial \mu} = \sum_{t=1}^{T} x_t'S - \Big( \sum_{t=1}^{T} a_t \Big)\mu'S = 0, \quad \frac{\partial Q}{\partial S} = \frac{T}{2}S^{-1} - \frac{1}{2}\sum_{t=1}^{T} b_t\, x_t x_t' + \sum_{t=1}^{T} x_t\mu' - \frac{1}{2}\Big( \sum_{t=1}^{T} a_t \Big)\mu\mu' = 0.

    Substituting S by \Sigma^{-1} and setting these derivatives to zero yields

    \sum_{t=1}^{T} x_t'\Sigma^{-1} - \Big( \sum_{t=1}^{T} a_t \Big)\mu'\Sigma^{-1} = 0, \quad \frac{T}{2}\Sigma - \frac{1}{2}\sum_{t=1}^{T} b_t\, x_t x_t' + \sum_{t=1}^{T} x_t\mu' - \frac{1}{2}\Big( \sum_{t=1}^{T} a_t \Big)\mu\mu' = 0.

    Thus, maximization of the objective function Q(\mu,\Sigma|x_t,\hat{\mu},\hat{\Sigma}) is achieved by the following iterative updating formulas:

    \hat{\mu} = \frac{\bar{x}}{\bar{a}}, \quad \hat{\Sigma} = \overline{b_t x_t x_t'} - \frac{\bar{x}\bar{x}'}{\bar{a}},

    where \bar{a}, \overline{b_t x_t x_t'}, and \bar{x} denote the averages of \{a_t\}_{t=1}^{T}, \{b_t x_t x_t'\}_{t=1}^{T}, and \{x_t\}_{t=1}^{T}, respectively. In what follows, we present the iteratively reweighted Expectation-Maximization algorithm for parameter estimation of the Asymmetric Laplace distribution.

    Algorithm 1 Iterative reweighting algorithm
    1. Set the iteration number k = 1 and select initial estimates \mu^{(0)}, \Sigma^{(0)}.
    2. (E-step) At the k-th iteration, with current estimates \mu^{(k)}, \Sigma^{(k)}, define the corresponding log-likelihood as
                            l^{(k)} = \log\prod_{t=1}^{T} f(x_t|\mu^{(k)},\Sigma^{(k)}), \quad k = 1, 2, \ldots.
    With the notations \chi_t = x_t'(\Sigma^{(k)})^{-1}x_t and \psi = 2+\mu^{(k)\prime}(\Sigma^{(k)})^{-1}\mu^{(k)}, compute the iterative weights
                            a_t = E(z_t|x_t,\hat{\mu},\hat{\Sigma}) = \sqrt{\frac{\chi_t}{\psi}}\; \frac{K_{2-\frac{n}{2}}(\sqrt{\chi_t\psi})}{K_{1-\frac{n}{2}}(\sqrt{\chi_t\psi})}, \quad t = 1, 2, \ldots, T;
                            b_t = E(z_t^{-1}|x_t,\hat{\mu},\hat{\Sigma}) = \sqrt{\frac{\psi}{\chi_t}}\; \frac{K_{\frac{n}{2}}(\sqrt{\chi_t\psi})}{K_{1-\frac{n}{2}}(\sqrt{\chi_t\psi})}, \quad t = 1, 2, \ldots, T.
    3. (M-step) Employ the following formulas to obtain the new estimates \mu^{(k+1)}, \Sigma^{(k+1)} at the (k+1)-th iteration:
                            \mu^{(k+1)} = \frac{\bar{x}}{\bar{a}}, \quad \Sigma^{(k+1)} = \overline{b_t x_t x_t'} - \frac{\bar{x}\bar{x}'}{\bar{a}}. (3.4)
    The log-likelihood at the (k+1)-th iteration becomes
                            l^{(k+1)} = \log\prod_{t=1}^{T} f(x_t|\mu^{(k+1)},\Sigma^{(k+1)}).
    4. Repeat these iteration steps until convergence, with criterion l^{(k+1)} - l^{(k)} < \varepsilon, where \varepsilon > 0 is a small number controlling the convergence precision; for convenience, we take \varepsilon = 10^{-16}.

    4. Simulation studies

    To evaluate the performance of the portfolio selection models and parameter estimation methods of Section 3, we generate 100 datasets from the Gaussian distribution and the Asymmetric Laplace distribution, respectively. Each dataset consists of T = 200 observations with the following parameter settings: Case (1): n = 3, \mu = (0.03, 0.06, 0.09)'; Case (2): n = 5, \mu = (0.01, 0.02, 0.06, 0.08, 0.09)'; Case (3): n = 10, \mu = (0.01, 0.02, 0.03, \ldots, 0.10)'. For each case, we set \Sigma = \mathrm{diag}(\mu/10). All simulation studies are carried out on a PC with an Intel Core i7 3.6 GHz processor on the R platform.

    Each dataset is fitted under both the multivariate Gaussian and the Asymmetric Laplace distribution; the ALD model (EM-AL) is estimated using the EM algorithm described in Section 3.2. We evaluate the estimation performance using the bias measure \mathrm{Bias} = \|\hat{\mu}-\mu\|_1 + \|\hat{\Sigma}-\Sigma\|_1. The mean log-likelihood and mean bias over the simulated datasets are reported in Table 1.

    Table 1. Model fitting results of Gaussian data and Asymmetric Laplace data using the Gaussian model and the EM-AL model.

    Gaussian Data
                 Log-Likelihood               Bias
                 Gauss        EM-AL           Gauss     Moment-AL   EM-AL
    Case (1)     709.6159     641.2472        0.0153    0.0451      0.0183
    Case (2)     1363.0231    1260.3676       0.0280    0.0886      0.0319
    Case (3)     2593.1151    2432.2192       0.0672    0.3309      0.0766

    Asymmetric Laplace Data
                 Log-Likelihood               Bias
                 Gauss        EM-AL           Gauss     Moment-AL   EM-AL
    Case (1)     619.1798     735.0847        0.0542    0.0252      0.0226
    Case (2)     1250.7110    1463.9149       0.0870    0.0376      0.0302
    Case (3)     2432.3401    2919.9878       0.3750    0.1304      0.0832

    Table 1 indicates that when the model is correctly specified, the estimation performance is best in terms of bias: data generated from the Gaussian distribution are best fitted by the Gaussian model, and likewise for the Asymmetric Laplace distribution. Moreover, for Gaussian data the fitted log-likelihood of the Gaussian model exceeds that of the Asymmetric Laplace model, and the reverse holds for Asymmetric Laplace data.

    Figure 3 shows that for simulated Gaussian data, since the Gaussian model fits better, the estimated efficient frontiers are closer to those of the Gaussian model; Figure 4 indicates that for simulated Asymmetric Laplace data, the estimated efficient frontiers are nearly identical to the true Asymmetric Laplace frontiers. Figures 3–4 suggest a practical workflow: first fit the data with both the Gaussian and Asymmetric Laplace distributions, use the fitted log-likelihood to select the distribution, and then evaluate performance with the corresponding efficient frontier analysis.

    Figure 3. Efficient frontiers of simulated Gaussian data of case (1)–(3) using Gaussian and EM-AL model.
    Figure 4. Efficient frontiers of simulated Asymmetric Laplace data of case (1)–(3) using Gaussian and EM-AL model.

    5. Real data analysis

    In this section, we apply the proposed methodology to two real financial datasets, the Hang Seng Index and the Nasdaq Index. Both datasets are downloaded from Bloomberg, with daily data ranging from January 4, 2011 to December 29, 2017. The variable of interest is the log return scaled by the annualization factor \sqrt{252}, formulated as

    \text{LogRet}\, (t) = \sqrt{252} \Big\{ \log\big(\text{price}[t+1]\big)-\log\big(\text{price}[t]\big) \Big\}, \quad t = 1, 2, \cdots, 1721.
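    In code, this transformation is one line per observation (the helper name is ours):

```python
import math

def annualized_log_returns(prices):
    """LogRet(t) = sqrt(252) * (log(price[t+1]) - log(price[t])),
    following the definition above; returns len(prices)-1 values."""
    k = math.sqrt(252.0)
    return [k * (math.log(p1) - math.log(p0)) for p0, p1 in zip(prices, prices[1:])]
```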

    These two datasets are analyzed through efficient frontier analysis under the ALD framework on the \texttt{R} platform. Theorems 3.1–3.2 indicate that portfolio selection models under the ALD framework reduce to the following quadratic programming problem:

    \min\limits_{{\boldsymbol{w}}} \sigma^2 = {\boldsymbol{w}}' {\bf{\Sigma}} {\boldsymbol{w}} \quad \text{ s.t. } \quad {\boldsymbol{w}}'{\boldsymbol{\mu}} = r_0\, , {\boldsymbol{w}}'{\boldsymbol{1}} = 1.

    with explicit solution (see Lai and Xing, 2008) as follows:

    \hat{{\boldsymbol{w}}} = \frac{D-r_0B}{AD-B^2}{\bf{\Sigma}}^{-1}{\boldsymbol{1}} + \frac{r_0A-B}{AD-B^2}{\bf{\Sigma}}^{-1}{\boldsymbol{\mu}}. (5.1)

    Here, A = {\bf{1}}'{\bf{\Sigma}}^{-1}{\bf{1}}, B = {\bf{1}}'{\bf{\Sigma}}^{-1}{\boldsymbol{\mu}} and D = {\boldsymbol{\mu}}'{\bf{\Sigma}}^{-1}{\boldsymbol{\mu}}.
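    A direct transcription of (5.1), replacing the explicit inverse \Sigma^{-1} with linear solves via a small Gaussian-elimination helper (all function names are ours):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bv] for row, bv in zip(A, b)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def min_variance_weights(mu, Sigma, r0):
    """Explicit solution (5.1): w = (D - r0 B)/(AD - B^2) Sigma^{-1} 1
                                  + (r0 A - B)/(AD - B^2) Sigma^{-1} mu."""
    n = len(mu)
    s1 = solve(Sigma, [1.0] * n)                  # Sigma^{-1} 1
    sm = solve(Sigma, list(mu))                   # Sigma^{-1} mu
    A = sum(s1)                                   # 1' Sigma^{-1} 1
    B = sum(sm)                                   # 1' Sigma^{-1} mu
    D = sum(m * v for m, v in zip(mu, sm))        # mu' Sigma^{-1} mu
    det = A * D - B * B
    return [(D - r0 * B) / det * s1[i] + (r0 * A - B) / det * sm[i]
            for i in range(n)]
```

    By construction, the returned weights satisfy w'\mathbf{1} = 1 and w'\mu = r_0; plugging in the EM estimates \hat{\mu}, \hat{\Sigma} and sweeping r_0 traces out the efficient frontier.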


    5.1. Example 1: Hang Seng index

    In the first example, we construct a portfolio from the Hang Seng constituents HK1, HK175, HK2007, HK2318, HK4, HK6 and HK66. Summary descriptive statistics are reported in Table 2. All stock returns exhibit pronounced skewness and kurtosis, and their medians are close to zero. The log-likelihood of the Asymmetric Laplace distribution is larger than that of the Gaussian distribution, indicating that the Asymmetric Laplace distribution is a better fit.

    Table 2. Hang Seng data statistics.
    Descriptive Statistics
    StD Mean Median Skewness Kurtosis Jarq.Test Jarq.Prob
    HK1 19.7819 0.2370 0.0000 3.9601 71.8163 375244.7434 0.0000
    HK175 3.7101 0.2183 0.0000 1.1510 28.6192 59264.8528 0.0000
    HK2007 2.1968 0.1115 0.0000 0.5928 16.5623 19825.4392 0.0000
    HK2318 12.6007 0.3431 0.0000 0.6174 6.2371 2908.6947 0.0000
    HK4 5.8690 0.0409 0.0000 0.4894 7.6362 4263.7750 0.0000
    HK6 13.0712 0.1517 0.0000 -1.1430 20.3112 30037.3385 0.0000
    HK66 5.8708 0.1545 0.0000 -1.7941 19.9392 29510.3684 0.0000
    Gaussian EM-AL
    Log-likelihood -39707.45 -37865.83
    Parameter Estimation
    HK1 HK175 HK2007 HK2318 HK4 HK6 HK66
    \mu 0.2370 0.2183 0.1115 0.3431 0.0409 0.1517 0.1545
    \Sigma HK1 HK175 HK2007 HK2318 HK4 HK6 HK66
    HK1 406.4426 12.8089 11.4547 130.0934 68.3857 94.4069 60.3087
    HK175 12.8089 7.3656 1.6330 12.0356 4.7467 4.8725 3.6070
    HK2007 11.4547 1.6330 3.8506 9.9261 4.6830 3.9055 2.4395
    HK2318 130.0934 12.0356 9.9261 174.0423 42.1879 48.2877 35.0499
    HK4 68.3857 4.7467 4.6830 42.1879 44.6336 26.5314 17.6232
    HK6 94.4069 4.8725 3.9055 48.2877 26.5314 205.1471 32.5570
    HK66 60.3087 3.6070 2.4395 35.0499 17.6232 32.5570 41.6153

    We then fit the data to the Asymmetric Laplace distribution via the EM algorithm described in Section 3.2. Parameter estimation results are displayed in Table 2. Using these estimates, we construct portfolios under the Asymmetric Laplace framework at different levels of expected return. Consider the increasing target expected returns

    r_0 = 0.040\, , 0.0745\, , 0.1081\, , 0.1417\, , 0.1753\, , 0.2088\, , 0.2424\, , 0.2760\, , 0.3096\, , 0.3431\, .
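The ten targets above appear to be equally spaced between the smallest and largest estimated means in Table 2 (0.0409 for HK4 and 0.3431 for HK2318). Assuming so, the grid can be regenerated as below; the published values seem to be rounded from unrounded means, so agreement is only to within about 1e-4:

```python
import numpy as np

# Ten equally spaced targets between the smallest and largest
# estimated means in Table 2 (HK4: 0.0409, HK2318: 0.3431).
r0_grid = np.linspace(0.0409, 0.3431, 10)
print(r0_grid)   # agrees with the published grid to within 1e-4
```

The same construction (min to max of \mu in ten steps) reproduces the Nasdaq target grid of Section 5.2 as well.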

    Portfolio selection results are summarized in Table 3, including kurtosis, skewness, Sharpe ratio, and VaR and ES at the \alpha = 0.01, 0.05, 0.10 levels.

    Table 3. Efficient frontier results of Hang Seng data.
    r \mu \sigma Skew Kurt Sharpe VaR_{0.01} ES_{0.01} VaR_{0.05} ES_{0.05} VaR_{0.10} ES_{0.10}
    1 0.0409 0.0409 2.5461 0.0482 6.0016 0.0161 6.9429 8.7229 4.0782 5.8582 2.8444 4.6244
    2 0.0745 0.0745 2.0565 0.1086 6.0079 0.0362 5.5080 6.9254 3.2268 4.6442 2.2444 3.6618
    3 0.1081 0.1081 1.7299 0.1869 6.0233 0.0625 4.5256 5.6960 2.6420 3.8124 1.8308 3.0011
    4 0.1417 0.1417 1.6650 0.2537 6.0430 0.0851 4.2684 5.3771 2.4841 3.5928 1.7156 2.8243
    5 0.1753 0.1753 1.8891 0.2763 6.0510 0.0928 4.8095 6.0606 2.7960 4.0471 1.9288 3.1799
    6 0.2088 0.2088 2.3199 0.2682 6.0480 0.0900 5.9208 7.4601 3.4434 4.9827 2.3764 3.9157
    7 0.2424 0.2424 2.8656 0.2523 6.0425 0.0846 7.3493 9.2579 4.2774 6.1861 2.9544 4.8631
    8 0.2760 0.2760 3.4724 0.2372 6.0375 0.0795 8.9467 11.2679 5.2108 7.5321 3.6018 5.9231
    9 0.3096 0.3096 4.1134 0.2247 6.0337 0.0753 10.6386 13.3966 6.1999 8.9578 4.2882 7.0462
    10 0.3431 0.3431 4.7749 0.2147 6.0307 0.0719 12.3871 15.5962 7.2222 10.4313 4.9978 8.2069

    The efficient frontier tendencies are displayed in Figure 5. Conservative investors can choose larger \alpha levels, while aggressive investors may select smaller \alpha levels. Figure 6 depicts the kurtosis, skewness and Sharpe ratio tendencies of the portfolio selection models. The Sharpe ratio, skewness and kurtosis increase quickly and then decrease slowly as the target expected return increases.

    Figure 5. ALD Efficient Frontier of Hang Seng Data.
    Figure 6. Skewness, Kurtosis and Sharpe Ratio Tendency of Hang Seng Data.

    5.2. Example 2: Nasdaq Index

    In the second example, we consider six Nasdaq constituents: CTRP, MNST, NFLX, NTES, NVDA and TTWO; descriptive statistics are reported in Table 4. All indexes exhibit significant skewness and kurtosis, and the Jarque-Bera test results indicate that the dataset deviates significantly from normality. We fit the log-return data to the Gaussian and Asymmetric Laplace distributions. Since the Asymmetric Laplace model achieves a higher log-likelihood than the Gaussian model, we choose the EM-AL model for data fitting. Parameter estimation results are displayed in Table 4.

    Table 4. Nasdaq data statistics.
    Descriptive Statistics
    StD Mean Median Skewness Kurtosis Jarq.Test Jarq.Prob
    CTRP 12.5853 0.2100 0.0000 1.5907 16.5100 20786.4126 0.0000
    MNST 10.0679 0.4904 0.1587 1.9632 25.7988 50065.6119 0.0000
    NFLX 32.9311 1.5015 -0.0079 1.0314 20.0251 29796.6074 0.0000
    NTES 58.0910 2.7817 1.2700 1.2457 26.0418 50315.0307 0.0000
    NVDA 25.2360 1.6024 0.3175 1.7413 40.3981 120864.0386 0.0000
    TTWO 12.2826 0.8786 0.1587 2.1392 52.3926 203127.9368 0.0000
    Gaussian EM-AL
    Log-Likelihood -46400.25 -42798.03
    Parameter Estimation
    CTRP MNST NFLX NTES NVDA TTWO
    \mu 0.2100 0.4904 1.5015 2.7817 1.6024 0.8786
    \Sigma CTRP MNST NFLX NTES NVDA TTWO
    CTRP 187.1236 25.4281 129.9104 246.6268 46.1539 40.0369
    MNST 25.4281 116.7107 51.2572 82.0634 26.2438 23.5405
    NFLX 129.9104 51.2572 900.1261 327.6980 122.5361 83.8650
    NTES 246.6268 82.0634 327.6980 2380.8894 213.1414 121.1493
    NVDA 46.1539 26.2438 122.5361 213.1414 259.7019 53.8530
    TTWO 40.0369 23.5405 83.8650 121.1493 53.8530 111.0020

    Then we consider increasing target expected returns

    r_0 = 0.2100, 0.4958, 0.7815, 1.0673, 1.3530, 1.6388, 1.9245, 2.2102, 2.4960, 2.7817.

    Skewness, kurtosis, Sharpe ratio, VaR and ES results are summarized in Table 5. Figure 7 displays the efficient frontiers at confidence levels \alpha = 0.01, 0.05, 0.10. These results show that the risk measures are larger at smaller \alpha levels. Figure 8 displays the skewness, kurtosis and Sharpe ratio tendencies. The optimal portfolios are obtained from (5.1), together with the corresponding VaR, ES, skewness, kurtosis and Sharpe ratio.

    Table 5. Efficient frontier analysis of Nasdaq data.
    r \mu \sigma Skew Kurt Sharpe VaR_{0.01} ES_{0.01} VaR_{0.05} ES_{0.05} VaR_{0.10} ES_{0.10}
    1 0.2100 0.2100 8.5237 0.0739 6.0036 0.0246 23.0672 28.9903 13.5344 19.4575 9.4288 15.3519
    2 0.4958 0.4958 7.5957 0.1951 6.0254 0.0653 19.8220 24.9509 11.5675 16.6963 8.0125 13.1413
    3 0.7815 0.7815 7.8219 0.2973 6.0590 0.0999 19.7857 24.9397 11.4907 16.6447 7.9183 13.0723
    4 1.0673 1.0673 9.1167 0.3472 6.0806 0.1171 22.7065 28.6414 13.1547 19.0896 9.0409 14.9758
    5 1.3530 1.3530 11.1127 0.3608 6.0870 0.1218 27.5608 34.7713 15.9560 23.1665 10.9581 18.1686
    6 1.6388 1.6388 13.5025 0.3597 6.0865 0.1214 33.4993 42.2627 19.3952 28.1586 13.3208 22.0843
    7 1.9245 1.9245 16.1117 0.3541 6.0838 0.1194 40.0422 50.5132 23.1898 33.6608 15.9318 26.4028
    8 2.2102 2.2102 18.8494 0.3478 6.0808 0.1173 46.9392 59.2083 27.1927 39.4619 18.6883 30.9575
    9 2.4960 2.4960 21.6671 0.3418 6.0781 0.1152 54.0563 68.1800 31.3251 45.4488 21.5353 35.6590
    10 2.7817 2.7817 24.5371 0.3365 6.0757 0.1134 61.3178 77.3329 35.5424 51.5576 24.4416 40.4567
    Figure 7. ALD efficient frontier of Nasdaq data.
    Figure 8. Skewness, kurtosis and Sharpe Ratio tendency of Nasdaq data.

    Table 5 and Figure 7 suggest that as r_0 increases, all ES measures (ES_{0.01}, ES_{0.05}, ES_{0.10}) increase, indicating that higher return comes at the cost of higher risk. Interestingly, under the ALD assumption, the Sharpe ratio and skewness first increase and then decrease as r_0 grows. As \alpha increases, the VaR and ES measures decrease. Thus, conservative investors can choose larger \alpha levels, while aggressive investors would select smaller \alpha levels.


    6. Conclusion and prospects

    In this paper, we derive several equivalent portfolio selection models under the ALD framework; these models can be transformed into a quadratic programming problem with an explicit solution. An Expectation-Maximization algorithm for parameter estimation of the Asymmetric Laplace distribution is derived and outperforms moment estimation.

    There are several advantages of Asymmetric Laplace distribution based models. First, the equivalence of risk measures such as VaR, ES and StD with the maximization of skewness and minimization of kurtosis simplifies the portfolio selection models significantly. Second, the confidence levels of these models offer investors various portfolio selection choices: conservative investors can choose larger \alpha levels and aggressive investors can select smaller \alpha levels. Finally, the ALD model is able to explain the skewness and kurtosis observed in financial data. Therefore, the Asymmetric Laplace distribution can be widely applied to real financial datasets.

    We may further extend the Asymmetric Laplace based portfolio selection model to mixtures of Asymmetric Laplace distributions. Another direction is to combine clustering techniques (see Dias et al., 2015; Iorio et al., 2018) with the Asymmetric Laplace distribution for portfolio selection in time series models.


    Acknowledgments

    Chi Tim Ng's work is supported by the 2016 Chonnam National University Research Program grant (No. 2016–2762).


    Conflict of interest

    The authors declare no conflict of interest.




    © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
