Review

Affective algorithmic composition of music: A systematic review

  • Affective music composition systems are known to trigger emotions in humans. However, the design of such systems to stimulate users' emotions continues to be a challenge because studies that aggregate existing literature in the domain to help advance research and knowledge are limited. This study presents a systematic literature review on affective algorithmic composition systems. Eighteen primary studies were selected from the IEEE Xplore, ACM Digital Library, SpringerLink, PubMed, ScienceDirect, and Google Scholar databases following a systematic review protocol. The findings revealed a lack of a unified definition that encapsulates the various types of affective algorithmic composition systems; accordingly, such a definition is provided. The findings also show that most affective algorithmic composition systems are designed for games to provide background music. The generative composition method was the most used compositional approach. Overall, there was relatively little research in the domain. Possible reasons for these trends are the lack of a common definition for affective music composition systems and the lack of detailed documentation of the design, implementation and evaluation of existing systems.

    Citation: Abigail Wiafe, Pasi Fränti. Affective algorithmic composition of music: A systematic review[J]. Applied Computing and Intelligence, 2023, 3(1): 27-43. doi: 10.3934/aci.2023003




    The generalized linear model (GLM), a generalization of the linear model with wide applications in many research areas, was proposed by Nelder and Wedderburn [1] in 1972 for discrete dependent variables, which cannot be handled by the ordinary linear regression model. The GLM allows the response variable to follow non-normal distributions, including the binomial, Poisson, gamma, and inverse Gaussian distributions, whose means are linked to the predictors through a link function.

    Nowadays, with the rapid development of science and technology, massive data is ubiquitous in many fields, including medicine, industry, and economics. Extracting effective information from massive data is the core challenge of big data analysis. However, the limited computing power of computers means that analyzing the full data consumes a great deal of computing time. To deal with this challenge, parallel computing and distributed computing are commonly used, and subsampling techniques have emerged as a result, i.e., a small number of representative samples are extracted from the massive data. Imberg et al. [2] proposed a theory of optimal design in the context of general data subsampling problems. It includes and extends most existing methods, works out optimality conditions, offers algorithms for finding optimal subsampling scheme designs, and introduces a new class of invariant linear optimality criteria. Chao et al. [3] presented an optimal subsampling approach for modal regression with big data. The estimators are obtained by means of a two-step algorithm based on the modal expectation maximization when the bandwidth is not related to the subsample size.

    There has been a great deal of research on subsampling algorithms of specific models. Wang et al. [4] devised a rapid subsampling algorithm to approximate the maximum likelihood estimators in the context of logistic regression. Based on the previous study, Wang [5] presented an enhanced estimation method for logistic regression, which has a higher estimation efficiency. In the case that data are usually distributed in multiple distributed sites for storage, Zuo et al. [6] developed a distributed subsampling procedure to effectively approximate the maximum likelihood estimators of logistic regression. Ai et al. [7] focused on the optimal subsampling method under the A-optimality criteria based on the method developed by Wang [4] for generalized linear models to quickly approximate maximum likelihood estimators from massive data. Yao and Wang [8] examined optimal subsampling methods for various models, including logistic regression models, softmax regression models, generalized linear models, quantile regression models, and quasi-likelihood estimators. Yu et al. [9] proposed an efficient subsampling procedure for online data streams with a multinomial logistic model. Yu et al. [10] studied the subsampling technique for the Akaike information criterion (AIC) and the smoothed AIC model-averaging framework for generalized linear models. Yu et al. [11] reviewed several subsampling methods for massive datasets from the viewpoint of statistical design.

    To the best of our knowledge, all the existing methods above assume that the covariates are fully observable. However, in practice, this assumption is not realistic, and covariates may be inaccurately observed owing to measurement errors, which will lead to biases in the estimators of the regression coefficients. This means that we may incorrectly determine some unimportant variables as significant, which in turn affects model selection and interpretation. Therefore, it is necessary to take measurement errors into account. Liang et al. [12], Li and Xue [13], and Liang and Li [14] investigated partial linear measurement error models. Stefanski [15] and Nakamura [16] obtained the corrected score functions of the GLM, such as linear regression, gamma regression, inverse gamma regression, and Poisson regression. Yang et al. [17] proposed an empirical likelihood method based on the moment identity of the corrected score function to perform statistical inference for a class of generalized linear measurement error models. Fuller [18] estimated the errors-in-variables model using the maximum likelihood method and studied statistical inference. Hu and Cui [19] proposed a corrected error variance method to accurately estimate the error variance, which can effectively reduce the influence of measurement error and spurious correlation at the same time. Carroll et al. [20] summarized the measurement errors in linear regression and described some simple and universally applicable measurement error analysis methods. Yi et al. [21] presented the regression calibration method, one of the first statistical methods introduced to address measurement errors in the covariates, and gave an overview of the conditional score and corrected score approaches for measurement error correction. For measurement errors arising in various practical settings, extensive research has been carried out and a variety of methods have been proposed; see [22,23,24,25]. Recently, a class of variable selection procedures has been developed for measurement error models; see [26,27]. More recently, Ju et al. [28] studied the optimal subsampling algorithm and the random perturbation subsampling algorithm for big data linear models with measurement errors. The aim of this paper is to estimate the parameters using a subsampling algorithm for a class of generalized linear measurement error models in massive data analysis.

    In this paper, we study a class of the GLM with measurement errors, such as logistic regression models and Poisson regression models. We combine the corrected score function method with subsampling techniques to investigate subsampling algorithms. The consistency and asymptotic normality of the estimators obtained in the general subsampling algorithm are derived. We optimize the subsampling probabilities based on the design of A-optimality and L-optimality criteria and incorporate a truncation method in the optimal subsampling probabilities to obtain the optimal estimators. In addition, we develop an adaptive two-step algorithm and obtain the consistency and asymptotic normality of the final subsampling estimators. Finally, the effectiveness of the proposed method is demonstrated through numerical simulations and real data analysis.

    The remainder of this paper is organized as follows: Section 2 introduces the corrected score function under different distributions and derives the general subsampling algorithm and the adaptive two-step algorithm. Sections 3 and 4 verify the effectiveness of the proposed method by generating simulated experimental data and two real data sets, respectively. Section 5 provides conclusions.

    In the GLM, it is assumed that the conditional distribution of the response variable belongs to the exponential family

    $f(y;\theta)=\exp\left\{\frac{\theta y-b(\theta)}{a(\phi)}+c(y,\phi)\right\},$

    where $a(\cdot)$, $b(\cdot)$, $c(\cdot,\cdot)$ are known functions, $\theta$ is called the natural parameter, and $\phi$ is called the dispersion parameter.

    Let $\{(X_i,Y_i)\}_{i=1}^N$ be independent and identically distributed random samples, $\mu_i=E(Y_i\mid X_i)$, $V(\mu_i)=\mathrm{Var}(Y_i\mid X_i)$, where the covariate $X_i\in\mathbb{R}^p$, the response variable $Y_i\in\mathbb{R}$, and $V(\cdot)$ is a known variance function. The conditional expectation of $Y_i$ given $X_i$ satisfies

    $g(\mu_i)=X_i^T\beta,$  (2.1)

    where $g(\cdot)$ is the canonical link function, and $\beta=(\beta_1,\ldots,\beta_p)^T$ is a $p$-dimensional unknown regression parameter vector.

    In practice, covariates are not always accurately observed, and there are measurement errors that cannot be ignored. Let $W_i$ be the error-prone observation of the covariate $X_i$. Assume that the additive measurement error model is

    $W_i=X_i+U_i,$  (2.2)

    where $U_i\sim N_p(0,\Sigma_u)$ and is independent of $(X_i,Y_i)$. Combining (2.1) and (2.2) yields a generalized linear model with measurement errors.

    Define the log-likelihood function as $\ell(\beta)=\sum_{i=1}^N\log f(Y_i;\beta)$. If $X_i$ is observable, the score function for $\beta$ in (2.1) is

    $\sum_{i=1}^N\eta_i(\beta;X_i,Y_i)=\frac{\partial\ell(\beta)}{\partial\beta}=\sum_{i=1}^N\frac{Y_i-\mu_i}{V(\mu_i)}\frac{\partial\mu_i}{\partial\beta},$

    and it satisfies $E[\eta_i(\beta;X_i,Y_i)\mid X_i]=0$. However, when $X_i$ is measured with error, directly replacing $X_i$ with $W_i$ in $\eta_i(\beta;X_i,Y_i)$ causes a bias, i.e., $E[\eta_i(\beta;W_i,Y_i)\mid X_i]=0$ will not always hold, hence a correction is needed. Following the idea of [16], we define an unbiased score function $\eta_i(\Sigma_u,\beta;W_i,Y_i)$ for $\beta$ satisfying $E[\eta_i(\Sigma_u,\beta;W_i,Y_i)\mid X_i]=0$. The maximum likelihood estimator $\hat\beta_{MLE}$ of $\beta$ is the solution of the estimating equation

    $Q(\beta):=\sum_{i=1}^N\eta_i(\Sigma_u,\beta;W_i,Y_i)=0.$  (2.3)

    Based on the following moment identities associated with the error model (2.2),

    $E(W_i\mid X_i)=X_i,$
    $E(W_iW_i^T\mid X_i)=X_iX_i^T+\Sigma_u,$
    $E[\exp(W_i^T\beta)\mid X_i]=\exp\left(X_i^T\beta+\tfrac12\beta^T\Sigma_u\beta\right),$
    $E[W_i\exp(W_i^T\beta)\mid X_i]=(X_i+\Sigma_u\beta)\exp\left(X_i^T\beta+\tfrac12\beta^T\Sigma_u\beta\right),$
    $E[W_i\exp(-W_i^T\beta)\mid X_i]=(X_i-\Sigma_u\beta)\exp\left(-X_i^T\beta+\tfrac12\beta^T\Sigma_u\beta\right),$
    $E[W_i\exp(-2W_i^T\beta)\mid X_i]=(X_i-2\Sigma_u\beta)\exp\left(-2X_i^T\beta+2\beta^T\Sigma_u\beta\right),$

    then we can construct the unbiased score function for binary logistic measurement error regression models and Poisson measurement error regression models, which are widely used in practice.

    (1) Binary logistic measurement error regression models.

    We consider the logistic measurement error regression model

    $\begin{cases}P(Y_i=1\mid X_i)=\dfrac{1}{1+\exp(-X_i^T\beta)},\\ W_i=X_i+U_i,\end{cases}$

    with mean $\mu_i=[1+\exp(-X_i^T\beta)]^{-1}$ and variance $\mathrm{Var}(Y_i\mid X_i)=\mu_i(1-\mu_i)$. Following Huang and Wang [29], the corrected score function is

    $\eta_i(\Sigma_u,\beta;W_i,Y_i)=W_iY_i+(W_i+\Sigma_u\beta)\exp\left(-W_i^T\beta-\tfrac12\beta^T\Sigma_u\beta\right)Y_i-W_i,$

    and its first-order derivative is

    $\Omega_i(\Sigma_u,\beta;W_i,Y_i)=\frac{\partial\eta_i(\Sigma_u,\beta;W_i,Y_i)}{\partial\beta^T}=\left[\Sigma_u-(W_i+\Sigma_u\beta)(W_i+\Sigma_u\beta)^T\right]\exp\left(-W_i^T\beta-\tfrac12\beta^T\Sigma_u\beta\right)Y_i.$
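    For illustration, the corrected score and its derivative for the logistic case can be coded directly. The minimal R sketch below assumes the reconstructed expressions above; the function names eta_logit and Omega_logit are illustrative rather than taken from the paper.

```r
# Corrected score eta_i and its derivative Omega_i for one observation (w, y)
# in the logistic measurement error model; Sigma_u is the error covariance.
eta_logit <- function(beta, w, y, Sigma_u) {
  sb <- as.vector(Sigma_u %*% beta)                 # Sigma_u %*% beta
  e  <- exp(-sum(w * beta) - 0.5 * sum(beta * sb))  # exp(-w'b - b'Sigma_u b / 2)
  w * y + (w + sb) * e * y - w
}

Omega_logit <- function(beta, w, y, Sigma_u) {
  sb <- as.vector(Sigma_u %*% beta)
  e  <- exp(-sum(w * beta) - 0.5 * sum(beta * sb))
  (Sigma_u - tcrossprod(w + sb)) * e * y            # [Sigma_u - a a^T] e y, a = w + Sigma_u b
}
```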

    (2) Poisson measurement error regression models.

    Let $Y_i$ follow the Poisson distribution with mean $\mu_i$ and $\mathrm{Var}(Y_i\mid X_i)=\mu_i$. Consider the log-linear measurement error model

    $\begin{cases}\log(\mu_i)=X_i^T\beta,\\ W_i=X_i+U_i,\end{cases}$

    then we have the corrected score function

    $\eta_i(\Sigma_u,\beta;W_i,Y_i)=W_iY_i-(W_i-\Sigma_u\beta)\exp\left(W_i^T\beta-\tfrac12\beta^T\Sigma_u\beta\right),$

    and its first-order derivative is

    $\Omega_i(\Sigma_u,\beta;W_i,Y_i)=\frac{\partial\eta_i(\Sigma_u,\beta;W_i,Y_i)}{\partial\beta^T}=\left[\Sigma_u-(W_i-\Sigma_u\beta)(W_i-\Sigma_u\beta)^T\right]\exp\left(W_i^T\beta-\tfrac12\beta^T\Sigma_u\beta\right).$
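    A matching R sketch for the Poisson case follows the same pattern; eta_pois and Omega_pois are again illustrative names based on the formulas above.

```r
# Corrected score and derivative for the Poisson measurement error model.
eta_pois <- function(beta, w, y, Sigma_u) {
  sb <- as.vector(Sigma_u %*% beta)
  e  <- exp(sum(w * beta) - 0.5 * sum(beta * sb))   # exp(w'b - b'Sigma_u b / 2)
  w * y - (w - sb) * e
}

Omega_pois <- function(beta, w, y, Sigma_u) {
  sb <- as.vector(Sigma_u %*% beta)
  e  <- exp(sum(w * beta) - 0.5 * sum(beta * sb))
  (Sigma_u - tcrossprod(w - sb)) * e
}
```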

    Assume that $\pi_i$ is the probability of sampling the $i$-th data point $(W_i,Y_i)$, $i=1,\ldots,N$. Let $S$ be the set of subsamples $(\tilde W_i,\tilde Y_i)$ with corresponding sampling probabilities $\tilde\pi_i$, i.e., $S=\{(\tilde W_i,\tilde Y_i,\tilde\pi_i)\}$ with subsample size $r$. The general subsampling algorithm is shown in Algorithm 1.

    Algorithm 1 General subsampling algorithm.
    Step 1. Given the subsampling probabilities $\pi_i$, $i=1,\ldots,N$, of all data points.
    Step 2. Perform repeated sampling with replacement $r$ times to form the subsample set $S=\{(\tilde W_i,\tilde Y_i,\tilde\pi_i)\}$, where $\tilde W_i$, $\tilde Y_i$ and $\tilde\pi_i$ represent the covariate, response variable and subsampling probability in the subsample, respectively.
    Step 3. Based on the subsample set $S$, solve the weighted estimating equation $\tilde Q(\beta)=0$ to obtain $\tilde\beta$, where
    $\tilde Q(\beta):=\frac{1}{r}\sum_{i=1}^r\frac{1}{\tilde\pi_i}\tilde\eta_i(\Sigma_u,\beta;\tilde W_i,\tilde Y_i)=0,$              (2.4)
    and $\tilde\eta_i(\Sigma_u,\beta;\tilde W_i,\tilde Y_i)$ is the unbiased score function of the $i$-th sample point in the subsample, whose first-order derivative is $\tilde\Omega_i(\Sigma_u,\beta;\tilde W_i,\tilde Y_i)$.
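    As a rough illustration of Algorithm 1, the weighted estimating equation (2.4) can be solved by Newton-Raphson on a subsample drawn with replacement. The sketch below is not the authors' implementation; subsample_est is an illustrative name, and eta_fun/Omega_fun stand for per-observation corrected score functions such as eta_logit/Omega_logit above.

```r
# General subsampling estimator (Algorithm 1, sketch): draw r indices with
# probabilities pi, then solve (1/r) sum eta_i / pi_i = 0 by Newton-Raphson.
subsample_est <- function(W, Y, pi, r, Sigma_u, eta_fun, Omega_fun,
                          beta0 = rep(0, ncol(W)), tol = 1e-8, maxit = 100) {
  idx <- sample(nrow(W), r, replace = TRUE, prob = pi)
  Ws  <- W[idx, , drop = FALSE]; Ys <- Y[idx]; ps <- pi[idx]
  beta <- beta0
  for (it in seq_len(maxit)) {
    Q <- 0; J <- 0
    for (i in seq_len(r)) {              # weighted score and its Jacobian
      Q <- Q + eta_fun(beta, Ws[i, ], Ys[i], Sigma_u) / ps[i]
      J <- J + Omega_fun(beta, Ws[i, ], Ys[i], Sigma_u) / ps[i]
    }
    step <- solve(J, Q)                  # Newton update: beta <- beta - J^{-1} Q
    beta <- beta - as.vector(step)
    if (sqrt(sum(step^2)) < tol) break
  }
  beta
}
```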

    To obtain the consistency and asymptotic normality of $\tilde\beta$, the following assumptions are needed. For simplicity, denote $\eta_i(\Sigma_u,\beta;W_i,Y_i)$ and $\Omega_i(\Sigma_u,\beta;W_i,Y_i)$ by $\eta_i(\Sigma_u,\beta)$ and $\Omega_i(\Sigma_u,\beta)$.

    A1: It is assumed that $W_i^T\beta$ lies almost surely in the interior of a closed set $K\subset\Theta$, where $\Theta$ is the natural parameter space.

    A2: The regression parameters lie in the ball $\Lambda=\{\beta\in\mathbb{R}^p:\|\beta\|_1\le B\}$; $\beta_t$ and $\hat\beta_{MLE}$ denote the true parameter and the maximum likelihood estimator, which are interior points of $\Lambda$, $B$ is a constant, and $\|\cdot\|_1$ denotes the 1-norm.

    A3: As $N\to\infty$, the observed information matrix $M_X:=\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})$ is a positive definite matrix in probability.

    A4: Assume that for all $\beta\in\Lambda$, $\frac{1}{N}\sum_{i=1}^N\|\eta_i(\Sigma_u,\beta)\|^4=O_P(1)$, where $\|\cdot\|$ denotes the Euclidean norm.

    A5: Suppose that the full-sample covariates have finite sixth-order moments, i.e., $E\|W_1\|^6<\infty$.

    A6: For any $\delta\ge0$, we assume that

    $\frac{1}{N^{2+\delta}}\sum_{i=1}^N\frac{\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|^{2+\delta}}{\pi_i^{1+\delta}}=O_P(1),\qquad \frac{1}{N^{2+\delta}}\sum_{i=1}^N\frac{|\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})|^{2+\delta}}{\pi_i^{1+\delta}}=O_P(1),$

    where $\Omega_i^{(j_1j_2)}$ represents the element in the $j_1$-th row and $j_2$-th column of the matrix $\Omega_i$.

    A7: Assume that $\eta_i(\Sigma_u,\beta)$ and $\Omega_i(\Sigma_u,\beta)$ are $m(W_i)$-Lipschitz continuous: for any $\beta_1,\beta_2\in\Lambda$, there exist functions $m_1(W_i)$ and $m_2(W_i)$ such that $\|\eta_i(\Sigma_u,\beta_1)-\eta_i(\Sigma_u,\beta_2)\|\le m_1(W_i)\|\beta_1-\beta_2\|$ and $\|\Omega_i(\Sigma_u,\beta_1)-\Omega_i(\Sigma_u,\beta_2)\|_S\le m_2(W_i)\|\beta_1-\beta_2\|$, where $\|A\|_S$ denotes the spectral norm of matrix $A$. Further assume that $E\{m_1(W_i)\}<\infty$ and $E\{m_2(W_i)\}<\infty$.

    Assumptions A1 and A2 are also used in Clémençon et al. [30]. The set $\Lambda$ in Assumption A2 is also known as the admissible set and is a prerequisite for consistent estimation of the GLM with full data [31]. Assumption A3 imposes a condition on the covariates to ensure that the MLE based on the full dataset is consistent. In order to obtain the Bahadur representation of the subsampling estimators, Assumptions A4 and A5 are required. Assumption A6 is a moment condition on the subsampling probabilities and is also required for the Lindeberg-Feller central limit theorem. Assumption A7 adds a smoothness restriction, which can be found in [32].

    The following theorems show the consistency and asymptotic normality of the subsampling estimators.

    Theorem 2.1. If Assumptions A1–A7 hold, then as $r\to\infty$ and $N\to\infty$, $\tilde\beta$ converges to $\hat\beta_{MLE}$ in conditional probability given $\mathcal{F}_N$, and the convergence rate is $r^{-1/2}$. That is, for all $\varepsilon>0$, there exist constants $\Delta_\varepsilon$ and $r_\varepsilon$ such that

    $P\left(\|\tilde\beta-\hat\beta_{MLE}\|\ge r^{-1/2}\Delta_\varepsilon\mid\mathcal{F}_N\right)<\varepsilon,$  (2.5)

    for all $r>r_\varepsilon$.

    Theorem 2.2. If Assumptions A1–A7 hold, then as $r\to\infty$ and $N\to\infty$, conditional on $\mathcal{F}_N$, the estimator $\tilde\beta$ obtained from Algorithm 1 satisfies

    $V^{-1/2}(\tilde\beta-\hat\beta_{MLE})\xrightarrow{d}N_p(0,I),$  (2.6)

    where $V=M_X^{-1}V_CM_X^{-1}=O_P(r^{-1})$, and

    $V_C=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\hat\beta_{MLE})\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\pi_i}.$

    Remark 1. In order to obtain the standard error of the corresponding estimator, we estimate the variance-covariance matrix of $\tilde\beta$ by

    $\hat V=\hat M_X^{-1}\hat V_C\hat M_X^{-1},$

    where

    $\hat M_X=\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i},$
    $\hat V_C=\frac{1}{N^2r^2}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})\tilde\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^2}.$
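    In practice, the sandwich form in Remark 1 can be evaluated on the selected subsample at whatever plug-in estimate is available. The following R sketch (subsample_var is an illustrative name; beta_hat plays the role of the plug-in estimate) assembles the two matrices and returns the estimated variance-covariance matrix.

```r
# Plug-in sandwich variance estimator of Remark 1 evaluated at beta_hat,
# using the selected subsample (Ws, Ys) with probabilities ps and full size N.
subsample_var <- function(beta_hat, Ws, Ys, ps, N, Sigma_u, eta_fun, Omega_fun) {
  r <- length(Ys); p <- length(beta_hat)
  M <- matrix(0, p, p); Vc <- matrix(0, p, p)
  for (i in seq_len(r)) {
    eta <- eta_fun(beta_hat, Ws[i, ], Ys[i], Sigma_u)
    M   <- M + Omega_fun(beta_hat, Ws[i, ], Ys[i], Sigma_u) / ps[i]
    Vc  <- Vc + tcrossprod(eta) / ps[i]^2
  }
  M  <- M / (N * r)                      # estimate of M_X
  Vc <- Vc / (N^2 * r^2)                 # estimate of V_C
  Minv <- solve(M)
  Minv %*% Vc %*% Minv                   # sandwich estimate M^{-1} V_C M^{-1}
}
```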

    Based on the A-optimality criterion in the language of optimal design, the optimal subsampling probabilities are obtained by minimizing the asymptotic mean squared error of $\tilde\beta$ in Theorem 2.2.

    However, $\Sigma_u$ is usually unknown in practice. Therefore, we need to estimate the covariance matrix $\Sigma_u$ as suggested by [12]. A consistent, unbiased moment estimator of $\Sigma_u$ is

    $\hat\Sigma_u=\frac{\sum_{i=1}^N\sum_{j=1}^{m_i}(W_{ij}-\bar W_i)(W_{ij}-\bar W_i)^T}{\sum_{i=1}^N(m_i-1)},$

    where $\bar W_i$ is the sample mean of the replicates, and $m_i$ is the number of repeated measurements of the $i$-th individual.
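    When replicate measurements are available, this moment estimator is straightforward to compute. The R sketch below assumes the replicates for subject i are stored as the rows of W_list[[i]], an illustrative data layout not prescribed by the paper.

```r
# Moment estimator of Sigma_u from replicate measurements: W_list[[i]] is an
# m_i x p matrix whose rows are the repeated measurements of subject i.
estimate_Sigma_u <- function(W_list) {
  p <- ncol(W_list[[1]])
  num <- matrix(0, p, p); den <- 0
  for (Wi in W_list) {
    wbar <- colMeans(Wi)
    num  <- num + crossprod(sweep(Wi, 2, wbar))   # sum_j (W_ij - Wbar_i)(W_ij - Wbar_i)^T
    den  <- den + nrow(Wi) - 1
  }
  num / den
}
```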

    Theorem 2.3. Define $g_i^{mV}=\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\|$, $i=1,\ldots,N$. The subsampling strategy is mV-optimal if the subsampling probability is chosen as

    $\pi_i^{mV}=\frac{g_i^{mV}}{\sum_{j=1}^Ng_j^{mV}},$  (2.7)

    which is obtained by minimizing $\mathrm{tr}(V)$.

    Theorem 2.4. Define $g_i^{mVc}=\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|$, $i=1,\ldots,N$. The subsampling strategy is mVc-optimal if the subsampling probability is chosen as

    $\pi_i^{mVc}=\frac{g_i^{mVc}}{\sum_{j=1}^Ng_j^{mVc}},$  (2.8)

    which is obtained by minimizing $\mathrm{tr}(V_C)$.
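    Both sets of probabilities can be computed in one pass over the data once a pilot estimate of beta is available. The sketch below mirrors Theorems 2.3 and 2.4; opt_probs is an illustrative name, and beta_pilot stands for the plug-in estimator.

```r
# Subsampling probabilities proportional to ||M_X^{-1} eta_i|| (mV) or
# ||eta_i|| (mVc), evaluated at a pilot estimate beta_pilot.
opt_probs <- function(beta_pilot, W, Y, Sigma_u, eta_fun, Omega_fun,
                      type = c("mV", "mVc")) {
  type <- match.arg(type)
  N <- nrow(W)
  etas <- t(sapply(seq_len(N), function(i)
    eta_fun(beta_pilot, W[i, ], Y[i], Sigma_u)))        # N x p matrix of scores
  if (type == "mV") {
    M <- Reduce(`+`, lapply(seq_len(N), function(i)
      Omega_fun(beta_pilot, W[i, ], Y[i], Sigma_u))) / N
    g <- sqrt(rowSums((etas %*% t(solve(M)))^2))        # ||M^{-1} eta_i||
  } else {
    g <- sqrt(rowSums(etas^2))                          # ||eta_i||
  }
  g / sum(g)
}
```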

    Remark 2. $M_X$ and $V_C$ are non-negative definite matrices and $V=M_X^{-1}V_CM_X^{-1}$, so $\mathrm{tr}(V)=\mathrm{tr}(M_X^{-1}V_CM_X^{-1})\le\sigma_{\max}(M_X^{-2})\,\mathrm{tr}(V_C)$, where $\sigma_{\max}(A)$ represents the maximum eigenvalue of a square matrix $A$. Since $\sigma_{\max}(M_X^{-2})$ does not depend on $\pi$, minimizing $\mathrm{tr}(V_C)$ means minimizing an upper bound of $\mathrm{tr}(V)$. In fact, for two given subsampling probability vectors $\pi_1$ and $\pi_2$, $V(\pi_1)\le V(\pi_2)$ if and only if $V_C(\pi_1)\le V_C(\pi_2)$. Therefore, minimizing $\mathrm{tr}(V_C)$ saves considerable computational time compared to minimizing $\mathrm{tr}(V)$, but $\mathrm{tr}(V_C)$ does not take the structural information of the data into account.

    The optimal subsampling probabilities are defined as $\{\pi_i^{op}\}_{i=1}^N=\{\pi_i^{mV}\}_{i=1}^N$ or $\{\pi_i^{mVc}\}_{i=1}^N$. However, because $\pi_i^{op}$ depends on $\hat\beta_{MLE}$, it cannot be used directly in applications. To calculate $\pi_i^{op}$, it is necessary to use a prior estimator $\tilde\beta_0$, which is obtained from a prior subsample of size $r_0$.

    We know that $\pi_i^{op}$ is proportional to $\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|$. However, in practice there may be some data points with $\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|=0$, which will never be included in a subsample, and some data points with $\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|\approx0$ also have very small probabilities of being sampled. If these special data points are excluded, some sample information will be missed; but if they are included, the variance of the subsampling estimator may increase.

    To avoid Eq (2.4) being inflated by these special data points, this paper adopts a truncation method, setting a threshold $\omega$ for $\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|$, that is, replacing $\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|$ with $\max\{\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|,\omega\}$, where $\omega$ is a very small positive number, for example, $10^{-4}$. In applications, the choice and design of the truncation weight function, which is a commonly used technique, are crucial to improving the robustness of the model and optimizing performance.

    We replace $\hat\beta_{MLE}$ in the matrix $V$ with $\tilde\beta_0$ and denote the result by $\tilde V$; then $\mathrm{tr}(\tilde V)\le\mathrm{tr}(\tilde V_\omega)\le\mathrm{tr}(\tilde V)+\frac{\omega^2}{N^2r}\sum_{i=1}^N\frac{\|M_X^{-1}\|^2}{\pi_i^{op}}$. Therefore, when $\omega$ is sufficiently small, $\mathrm{tr}(\tilde V_\omega)$ approaches $\mathrm{tr}(\tilde V)$. The threshold $\omega$ is set to make the subsample estimators more robust without sacrificing too much estimation efficiency. The matrix $\tilde M_X=\frac{1}{Nr_0}\sum_{i=1}^{r_0}\Omega_i(\Sigma_u,\tilde\beta_0)$ based on the prior subsample can be used to approximate $M_X$. The two-step algorithm is presented in Algorithm 2.

    Algorithm 2 Optimal subsampling algorithm.
    Step 1. Extract a prior subsample set $S_{r_0}$ of size $r_0$ from the full data using the uniform subsampling probabilities $\pi^{UNIF}=\{\pi_i:=\frac{1}{N}\}_{i=1}^N$. Use Algorithm 1 to obtain a prior estimator $\tilde\beta_0$, and replace $\hat\beta_{MLE}$ with $\tilde\beta_0$ in Eqs (2.7) and (2.8) to get the optimal subsampling probabilities $\{\pi_i^{opt}\}_{i=1}^N$.
    Step 2. Use the optimal subsampling probabilities $\{\pi_i^{opt}\}_{i=1}^N$ computed in Step 1 to extract a subsample of size $r$ with replacement. Following the steps of Algorithm 1, combine the subsamples from Step 1 and solve the estimating Eq (2.4) to obtain the estimator $\check\beta$ based on a subsample of total size $r_0+r$.
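    A compact sketch of the two-step procedure, reusing the illustrative subsample_est and opt_probs functions above, is given below. The truncation is applied to the normalized probabilities for simplicity, and the combination of the two subsamples is approximated by refitting on r0 + r draws, so this is a simplification of Algorithm 2 rather than a faithful implementation.

```r
# Two-step subsampling (sketch of Algorithm 2): uniform pilot subsample,
# truncated optimal probabilities at the pilot estimate, second-stage fit.
two_step <- function(W, Y, Sigma_u, r0, r, eta_fun, Omega_fun,
                     type = "mVc", omega = 1e-4) {
  N <- nrow(W)
  # Step 1: uniform pilot subsample of size r0 and pilot estimate.
  beta0 <- subsample_est(W, Y, rep(1 / N, N), r0, Sigma_u, eta_fun, Omega_fun)
  # Optimal probabilities at the pilot estimate, truncated below (one simple
  # convention for the threshold omega) and renormalized.
  pi_opt <- opt_probs(beta0, W, Y, Sigma_u, eta_fun, Omega_fun, type)
  pi_opt <- pmax(pi_opt, omega / N)
  pi_opt <- pi_opt / sum(pi_opt)
  # Step 2: subsample of size r0 + r with pi_opt, warm-started at beta0.
  subsample_est(W, Y, pi_opt, r0 + r, Sigma_u, eta_fun, Omega_fun, beta0 = beta0)
}
```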

    Remark 3. In Algorithm 2, $\tilde\beta_0$ in Step 1 satisfies

    $Q_0^{\tilde\beta_0}(\beta)=\frac{1}{r_0}\sum_{i=1}^{r_0}\frac{\tilde\eta_i(\Sigma_u,\beta)}{\pi_i^{UNIF}}=0$

    with the prior subsample set $S_{r_0}$, and

    $M_X^{\tilde\beta_0}=\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\tilde\beta_0)}{\pi_i^{UNIF}}.$

    In Step 2, the subsampling probabilities are $\{\pi_i^{opt}\}_{i=1}^N=\{\pi_i^{mVt}\}_{i=1}^N$ or $\{\pi_i^{mVct}\}_{i=1}^N$. Let

    $g_i^{mVt}=\begin{cases}\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\|,&\text{if }\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|>\omega,\\ \omega\|M_X^{-1}\|,&\text{if }\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|\le\omega,\end{cases}\qquad i=1,\ldots,N,$
    $g_i^{mVct}=\max\{\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|,\omega\},$

    then

    $\pi_i^{mVt}=\frac{g_i^{mVt}}{\sum_{j=1}^Ng_j^{mVt}}\quad\text{and}\quad\pi_i^{mVct}=\frac{g_i^{mVct}}{\sum_{j=1}^Ng_j^{mVct}}.$

    The subsample set is $S_{r_0}\cup\{(\tilde W_i,\tilde Y_i,\tilde\pi_i^{opt}):i=1,\ldots,r\}$ with subsample size $r+r_0$, and $\check\beta$ is the solution of the corresponding estimating equation

    $Q_{\tilde\beta_0}^{twostep}(\beta)=\frac{1}{r+r_0}\sum_{i=1}^{r+r_0}\frac{\tilde\eta_i(\Sigma_u,\beta)}{\tilde\pi_i^{opt}}=\frac{r}{r+r_0}Q_{\tilde\beta_0}(\beta)+\frac{r_0}{r+r_0}Q_0^{\tilde\beta_0}(\beta)=0,$

    where

    $Q_{\tilde\beta_0}(\beta)=\frac{1}{r}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\beta)}{\tilde\pi_i^{opt}}.$

    Theorem 2.5. If Assumptions A1–A7 hold, then as $r_0r^{-1}\to0$, $r_0,r\to\infty$ and $N\to\infty$, and if $\tilde\beta_0$ exists, the estimator $\check\beta$ obtained from Algorithm 2 converges to $\hat\beta_{MLE}$ in conditional probability given $\mathcal{F}_N$, and its convergence rate is $r^{-1/2}$. For all $\varepsilon>0$, there exist finite $\Delta_\varepsilon$ and $r_\varepsilon$ such that

    $P\left(\|\check\beta-\hat\beta_{MLE}\|\ge r^{-1/2}\Delta_\varepsilon\mid\mathcal{F}_N\right)<\varepsilon,$  (2.9)

    for all $r>r_\varepsilon$.

    Theorem 2.6. If Assumptions A1–A7 hold, then as $r_0r^{-1}\to0$, $r_0,r\to\infty$ and $N\to\infty$, conditional on $\mathcal{F}_N$, the estimator $\check\beta$ obtained from Algorithm 2 satisfies

    $V_{opt}^{-1/2}(\check\beta-\hat\beta_{MLE})\xrightarrow{d}N_p(0,I),$  (2.10)

    where $V_{opt}=M_X^{-1}V_C^{opt}M_X^{-1}=O_P(r^{-1})$, and

    $V_C^{opt}=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\hat\beta_{MLE})\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\pi_i^{opt}}.$

    Remark 4. We estimate the variance-covariance matrix of $\check\beta$ by

    $\hat V_{opt}=\hat M_X^{-1}\hat V_C^{opt}\hat M_X^{-1},$

    where

    $\hat M_X=\frac{1}{N(r_0+r)}\left[\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{UNIF}}+\sum_{i=1}^{r}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}\right],$
    $\hat V_C^{opt}=\frac{1}{N^2(r_0+r)^2}\left[\sum_{i=1}^{r_0}\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})\tilde\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{(\tilde\pi_i^{UNIF})^2}+\sum_{i=1}^{r}\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})\tilde\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{(\tilde\pi_i^{opt})^2}\right].$

    In this section, we perform numerical simulations using synthetic data to evaluate the finite sample performance of the proposed method in Algorithm 2 (denoted as mV and mVc). For a fair comparison, we also give the results of the uniform subsampling method and set the size to be the same as that of Algorithm 2. The estimators of the above three subsampling methods, uniform—the uniform subsampling, mV—the mV probability subsampling, and mVc—the mVc probability subsampling, are compared with MLE—the maximum likelihood estimators for full data. In addition, we conduct simulation experiments using two models: the logistic regression model and the Poisson regression model.

    Set the sample size $N=100000$ and the true value $\beta_t=(0.5,0.6,0.5)^T$, and generate the covariate $X_i\sim N_3(0,\Sigma)$, where $\Sigma=0.5I+0.5\mathbf{1}\mathbf{1}^T$ and $I$ is the identity matrix. The response $Y_i$ follows a binomial distribution with $P(Y_i=1\mid X_i)=(1+\exp(-X_i^T\beta_t))^{-1}$. We consider the following three cases to generate the measurement error term $U_i$.

    ● Case 1: $U_i\sim N_3(0,0.4^2I)$;

    ● Case 2: $U_i\sim N_3(0,0.5^2I)$;

    ● Case 3: $U_i\sim N_3(0,0.6^2I)$.

    The subsample size in Step 1 of Algorithm 2 is selected as $r_0=400$. The subsample size $r$ is set to 500, 1000, 1500, 2000, 2500, and 5000. In order to verify that $\check\beta$ asymptotically approaches $\beta_t$, we repeat the simulation $K=1000$ times and calculate $\mathrm{MSE}=\frac{1}{K}\sum_{k=1}^K\|\check\beta^{(k)}-\beta_t\|^2$, where $\check\beta^{(k)}$ is the parameter estimator from the subsample generated in the $k$-th repetition.
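    Under stated assumptions about the data-generating code, and reusing the illustrative two_step, eta_logit and Omega_logit functions sketched earlier, the logistic simulation for Case 1 can be reproduced along the following lines; MASS::mvrnorm supplies the multivariate normal draws, and N or K can be reduced for a quick check.

```r
library(MASS)

N <- 1e5; beta_t <- c(0.5, 0.6, 0.5)
Sigma_x <- 0.5 * diag(3) + 0.5                    # 0.5 I + 0.5 * 1 1^T
X <- mvrnorm(N, rep(0, 3), Sigma_x)
Y <- rbinom(N, 1, 1 / (1 + exp(-X %*% beta_t)))   # logistic response
Sigma_u <- 0.4^2 * diag(3)                        # Case 1 measurement error
W <- X + mvrnorm(N, rep(0, 3), Sigma_u)           # error-prone covariates

K <- 1000; r0 <- 400; r <- 1000
err <- replicate(K, {
  b <- two_step(W, Y, Sigma_u, r0, r, eta_logit, Omega_logit, type = "mV")
  sum((b - beta_t)^2)
})
mse <- mean(err)                                  # MSE over K repetitions
```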

    The simulation results are shown in Figure 1, from which it can be seen that both mV and mVc always have smaller MSEs than uniform subsampling. The MSEs of all subsampling methods decrease as $r$ increases, which confirms the theoretical consistency of the subsampling methods. As the variance of the error term increases, the MSEs of uniform, mV, and mVc also increase. The mV method is better than mVc because the subsampling probabilities of mV take the structural information of the data into account. A comparison between the corrected and uncorrected methods shows that the MSEs of the corrected methods are much smaller than those of the uncorrected methods, and the difference between them increases as the error variance increases.

    Figure 1.  MSEs for $\check\beta$ with different second-step subsample sizes $r$ and $r_0=400$. The colorful icons and lines represent the corrected subsampling methods. The gray icons and lines represent the uncorrected subsampling methods.

    Now, we evaluate the statistical inference performance of the optimal subsampling method for different r and variances of Ui. The parameter β1 is taken as an example, and a 95% confidence interval is constructed. Table 1 reports the empirical coverage probabilities and average lengths of three subsampling methods. It is evident that both mV and mVc have similar performance and consistently outperform the uniform subsampling method. As r increases, the length of the confidence interval uniformly decreases.

    Table 1.  Empirical coverage probabilities and average lengths of confidence intervals for β1 in the logistic regression models with different r and r0=500.
    uniform mV mVc
    Case r Coverage Length Coverage Length Coverage Length
    Case 1 500 0.958 0.565 0.932 0.331 0.942 0.457
    1000 0.952 0.453 0.925 0.248 0.954 0.333
    1500 0.960 0.387 0.920 0.206 0.964 0.274
    2000 0.932 0.345 0.907 0.180 0.954 0.237
    2500 0.938 0.313 0.910 0.160 0.956 0.211
    5000 0.964 0.302 0.908 0.148 0.937 0.202
    Case 2 500 0.956 0.634 0.946 0.602 0.962 0.613
    1000 0.946 0.621 0.934 0.586 0.946 0.593
    1500 0.927 0.597 0.954 0.551 0.962 0.561
    2000 0.943 0.543 0.956 0.524 0.921 0.518
    2500 0.970 0.475 0.958 0.453 0.944 0.462
    5000 0.963 0.438 0.932 0.417 0.947 0.441
    Case 3 500 0.958 0.706 0.956 0.432 0.968 0.550
    1000 0.946 0.561 0.972 0.399 0.970 0.409
    1500 0.944 0.479 0.968 0.321 0.960 0.329
    2000 0.936 0.425 0.964 0.265 0.958 0.281
    2500 0.926 0.389 0.966 0.249 0.954 0.250
    5000 0.915 0.356 0.947 0.220 0.942 0.236


    Let $\beta_t=(0.5,0.6,0.5)^T$ and the covariate $X_i\sim N_3(0,\Sigma)$, where $\Sigma=0.3I+0.5\mathbf{1}\mathbf{1}^T$ and $I$ is the identity matrix. We consider the following three cases to generate the measurement error term $U_i$.

    ● Case 1: $U_i\sim N_3(0,0.3^2I)$;

    ● Case 2: $U_i\sim N_3(0,0.4^2I)$;

    ● Case 3: $U_i\sim N_3(0,0.5^2I)$.

    We also generate a sample of $N=100000$ observations from the Poisson($\mu_i$) distribution, where $\mu_i=\exp(X_i^T\beta_t)$, and summarize the MSEs over $K=1000$ simulations in Figure 2. The other settings are the same as those in the logistic regression example.

    Figure 2.  MSEs for $\check\beta$ with different second-step subsample sizes $r$ and $r_0=400$. The colorful icons and lines represent the corrected subsampling methods. The gray icons and lines represent the uncorrected subsampling methods.

    In Figure 2, it can be seen that the MSEs of both the mV and mVc methods are smaller than those of the uniform subsampling, with the mV method being the optimal. In addition, the corrected method is obviously effective, which is consistent with Figure 1. Table 2 reports the empirical coverage probabilities and average lengths of 95% confidence interval of the parameter β3 for three subsampling methods. The conclusions of Table 2 are consistent with those of Table 1, but the average lengths of the intervals for Poisson regression are significantly longer than those for logistic regression.

    Table 2.  Empirical coverage probabilities and average lengths of confidence intervals for β3 in the Poisson regression models with different r and r0=500.
    uniform mV mVc
    Case r Coverage Length Coverage Length Coverage Length
    Case 1 500 0.962 0.441 0.962 0.383 0.958 0.399
    1000 0.944 0.352 0.964 0.291 0.964 0.304
    1500 0.932 0.302 0.964 0.241 0.966 0.255
    2000 0.952 0.268 0.930 0.210 0.944 0.223
    2500 0.946 0.244 0.958 0.188 0.974 0.201
    5000 0.952 0.234 0.961 0.173 0.943 0.185
    Case 2 500 0.938 0.127 0.936 0.108 0.948 0.109
    1000 0.936 0.102 0.946 0.082 0.934 0.082
    1500 0.942 0.087 0.934 0.069 0.936 0.068
    2000 0.952 0.078 0.956 0.060 0.952 0.059
    2500 0.946 0.071 0.932 0.053 0.944 0.053
    5000 0.935 0.068 0.965 0.045 0.971 0.047
    Case 3 500 0.940 0.185 0.936 0.153 0.953 0.156
    1000 0.950 0.148 0.954 0.113 0.958 0.118
    1500 0.932 0.127 0.950 0.094 0.958 0.099
    2000 0.946 0.113 0.952 0.082 0.960 0.086
    2500 0.942 0.103 0.932 0.073 0.950 0.077
    5000 0.937 0.096 0.956 0.065 0.964 0.061


    In order to explore the influence of the subsample sizes allocated to the two steps of the algorithm, we calculate the MSEs for different proportions of $r_0$ while keeping the total subsample size constant. Setting the total subsample size $r_0+r=3000$, the result is shown in Figure 3. It can be seen that the accuracy of the two-step algorithm initially improves as $r_0$ increases. However, when $r_0$ increases beyond a certain point, the accuracy of the algorithm begins to decrease. There are two reasons: (1) if $r_0$ is too small, the estimator in the first step will be biased, and it is difficult to ensure accuracy; (2) if $r_0$ is too large, the performances of mV and mVc become similar to that of uniform subsampling. When $r_0/(r_0+r)$ is around 0.25, the two-step algorithm performs best.

    Figure 3.  MSEs vs proportions of the first step subsample with fixed total subsample size for logistic and Poisson models with Case 1.

    We use the Sys.time() function in R to calculate the running time of the three subsampling methods and of the full-data estimation. We conduct 1000 repetitions, set $r_0=200$, and consider different $r$ values in Case 1. The results are shown in Tables 3 and 4. It is easy to see that the uniform subsampling algorithm requires the least computation time, because there is no need to calculate the subsampling probabilities. In addition, the mV method takes longer than the mVc method, which is consistent with the theoretical analysis in Section 2.
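    The timing itself is a simple wrapper around any of the fits; a sketch using the illustrative two_step function and the simulated W, Y, Sigma_u from the earlier sketch (with fewer repetitions for a quick check) is:

```r
# Wall-clock time of repeated two-step fits, as in Tables 3 and 4.
t0 <- Sys.time()
for (k in 1:100) {                                # the paper uses 1000 repetitions
  two_step(W, Y, Sigma_u, r0 = 200, r = 500, eta_logit, Omega_logit, type = "mV")
}
Sys.time() - t0
```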

    Table 3.  Computing time (in seconds) for logistic regression with Case 1 for different r and fixed r0=200.
    r
    Method 300 500 800 1200 1600 2000
    uniform 0.2993 0.3337 0.4985 0.5632 0.8547 0.5083
    mV 3.5461 3.6485 3.8623 4.1256 4.4325 5.2365
    mVc 3.2852 3.3658 3.5463 3.8562 4.0235 4.4235
    Full 45.9075

    Table 4.  Computing time (in seconds) for Poisson regression with Case 1 for different r and fixed r0=200.
    r
    Method 300 500 800 1200 1600 2000
    uniform 0.4213 0.4868 0.5327 0.5932 0.7147 0.8883
    mV 4.6723 4.8963 5.2369 5.6524 6.0128 6.3567
    mVc 4.3521 4.6329 4.9658 5.2156 5.7652 5.9635
    Full 51.2603


    In this section, we apply the proposed method to analyze the 1994 global census data, which covers 42 countries, from the Machine Learning Database [33]. There are 5 covariates in the data: x1 represents age; x2 represents the population weight value, which is assigned by the Population Division of the Census Bureau and is related to socioeconomic characteristics; x3 represents the highest level of education attained since primary school; x4 represents capital loss, that is, the loss of income from bad investments, measured as the difference between the lower selling price and the higher purchase price of an individual's investment; x5 represents weekly working hours. If an individual's annual income exceeds 50,000 dollars, it is coded as yi=1, and yi=0 otherwise.

    To verify the effectiveness of the proposed method, we add the measurement errors to the covariates x2, x4 and x5 in this dataset, and the covariance matrix of the measurement error is

    $\Sigma_u=\mathrm{diag}(0,\,0.04,\,0,\,0.04,\,0.04).$

    We split the full dataset into a training set of 32561 observations and a test set of 16281 observations in a 2:1 ratio. We apply the proposed method to the training set and evaluate the classification performance on the test set. We calculate $\mathrm{LEMSE}=\log\left(\frac{1}{K}\sum_{k=1}^K\|\check\beta^{(k)}-\hat\beta_{MLE}\|^2\right)$ based on $K=1000$ bootstrap subsample estimators with $r=500,1000,1500,2200,2500$ and $r_0=500$. The corrected MLE estimators for the training set are $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},0}=-1.6121$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},1}=1.1992$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},2}=0.0103$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},3}=0.9142$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},4}=0.2617$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},5}=0.8694$.
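    The LEMSE criterion itself is simple to compute from the repeated subsample estimates. In the sketch below, beta_subs is a K x p matrix of subsample estimators and beta_mle_err is the full-data corrected estimate; both are illustrative names.

```r
# Log of the average squared distance between subsample estimators and the
# full-data corrected MLE.
lemse <- function(beta_subs, beta_mle_err) {
  log(mean(apply(beta_subs, 1, function(b) sum((b - beta_mle_err)^2))))
}
```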

    Table 5 shows the average estimators and the corresponding standard errors based on the proposed method ($r_0=500$, $r=2000$). It can be seen that the estimators from the three subsampling methods are close to the estimators from the full data. In general, the mV and mVc subsampling methods produce smaller standard errors.

    Table 5.  Average estimators based on subsamples with measurement error and subsample size r=2000. The numbers in parentheses are the standard errors of the average estimators.
    uniform mV mVc
    Intercept -1.6084(0.069) -1.5998(0.055) -1.3122(0.052)
    ˇβerr1 1.2879(0.205) 1.1880(0.103) 1.2038(0.097)
    ˇβerr2 0.0105(0.106) 0.0104(0.059) 0.0111(0.046)
    ˇβerr3 1.0033(0.201) 0.9217(0.067) 0.9199(0.054)
    ˇβerr4 0.2636(0.094) 0.2698(0.054) 0.2555(0.063)
    ˇβerr5 0.9469(0.229) 0.8741(0.083) 0.8628(0.076)


    All subsampling methods show that each variable has a positive impact on income, with age, highest education level, and weekly working hours having significant impacts on income. Interestingly, capital loss has a significant positive impact on income, because low-income people rarely invest. However, the population weight value has the smallest impact on income; the likely reason is that it reflects the overall distribution characteristics among groups rather than the specific economic performance of individuals. Income is a highly volatile variable, and the income gap between different groups may be large. Even under the same socioeconomic characteristics, the income distribution may have a large variance. This high variability weakens the overall impact of the population weight on income.

    Fixing $r_0=500$, Figure 4(a) shows the LEMSEs calculated for the subsample with measurement errors. We can see that the LEMSEs of the corrected methods are much smaller than those of the uncorrected methods. As $r$ increases, the LEMSEs decrease. The estimators of the subsampling methods are consistent, and the mV method is the best. Figure 4(b) shows the proportion of responses in the test set that are correctly classified for different subsample sizes. The mV method performs slightly better than mVc. It can also be seen that the prediction accuracy of the corrected subsampling methods is slightly higher than that of the corresponding uncorrected methods.

    Figure 4.  LEMSEs and model prediction accuracy (proportion of correctly classified responses) for the subsample with measurement errors. The colorful icons and lines represent the corrected subsampling methods. The gray icons and lines represent the uncorrected subsampling methods.

    This subsection applies the corrected subsampling method to the credit card fraud detection dataset from Kaggle*, where the dependent variable is whether an individual has committed credit card fraud. There are 284,807 records in the dataset, with a total of 492 fraud cases. Since the data involve sensitive information, the covariates have all been processed by principal component analysis, yielding 28 principal components. Amount represents the transaction amount; class is the dependent variable, where 1 represents fraud and 0 means normal. The first four principal components and the transaction amount are selected as independent variables.

    *https://www.kaggle.com/datasets/creepycrap/creditcard-fraud-dataset

    To verify the effectiveness of the proposed method, we add measurement errors to the covariates with error covariance matrix $\Sigma_u=0.16I$. We split the dataset into a training set and a test set in a 3:1 ratio and summarize the LEMSEs over $K=1000$ simulations with $r=500,1000,1500,2200,2500,5000$ and $r_0=500$.
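    As an illustration of this preprocessing step, the error contamination and the 3:1 split could be coded as follows; X_cc and y_cc are assumed names for the covariate matrix of the five selected variables and the fraud indicator, not objects defined in the paper.

```r
# Add N(0, 0.16 I) measurement error to the covariates and split 3:1.
p    <- ncol(X_cc)
W_cc <- X_cc + matrix(rnorm(nrow(X_cc) * p, sd = 0.4), ncol = p)  # 0.4^2 = 0.16
train   <- sample(nrow(X_cc), size = round(0.75 * nrow(X_cc)))
W_train <- W_cc[train, ];  y_train <- y_cc[train]
W_test  <- W_cc[-train, ]; y_test  <- y_cc[-train]
```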

    The MLE estimators for the training set are $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},0}=-8.8016$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},1}=-0.6070$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},2}=0.0737$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},3}=-0.9056$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},4}=1.4553$, $\hat\beta^{\mathrm{err}}_{\mathrm{MLE},5}=-0.1329$. Table 6 shows the average estimators and the corresponding standard errors ($r_0=500$, $r=2000$). It can be seen that the estimators from the three subsampling methods are close to the estimators from the full data. In general, the mV and mVc subsampling methods produce smaller standard errors. From Figure 5, we obtain results similar to those in Figure 4.

    Table 6.  Average estimators based on subsamples with measurement error and subsample size r=2000. The numbers in parentheses are the standard errors of the average estimators.
    uniform mV mVc
    Intercept -8.7934(0.0678) -8.8105(0.0562) -8.8135(0.0543)
    ˇβerr1 -0.6123(0.341) -0.6047(0.142) -0.6035(0.105)
    ˇβerr2 0.0712(0.125) 0.0730(0.064) 0.0798(0.088)
    ˇβerr3 -0.9321(0.245) -0.9087(0.067) -0.9123(0.057)
    ˇβerr4 1.4618(0.198) 1.4580(0.054) 1.4603(0.075)
    ˇβerr5 -0.1435(0.531) -0.1347(0.242) -0.1408(0.225)

    Figure 5.  LEMSEs and model prediction accuracy (proportion of correctly classified responses) for the subsample with measurement errors. The colorful icons and lines represent the corrected subsampling methods. The gray icons and lines represent the uncorrected subsampling methods.

    In this paper, we not only combine the corrected score method with the subsampling technique, but also theoretically derive the consistency and asymptotic normality of the subsampling estimators. In addition, an adaptive two-step algorithm is developed based on optimal subsampling probabilities using the A-optimality and L-optimality criteria and the truncation method. The theoretical results of the proposed method are tested with simulated data and two real datasets, and the experimental results demonstrate the effectiveness and good performance of the proposed method.

    This paper only assumes that the covariates are affected by measurement error. However, in practical applications, the response variables may also be influenced by measurement errors. The optimal subsampling probabilities are obtained by minimizing $\mathrm{tr}(V)$ or $\mathrm{tr}(V_C)$ using the design ideas of the A-optimality and L-optimality criteria. In the future, other optimality criteria for subsampling can be considered to develop more efficient algorithms.

    Ruiyuan Chang: Furnished the algorithms and numerical results presented in the manuscript and composed the original draft of the manuscript; Xiuli Wang: Rendered explicit guidance regarding the proof of the theorem and refined the language of the entire manuscript; Mingqiu Wang: Rendered explicit guidance regarding the proof of theorems and the writing of codes and refined the language of the entire manuscript. All authors have read and consented to the published version of the manuscript.

    The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This research was supported by the National Natural Science Foundation of China (12271294) and the Natural Science Foundation of Shandong Province (ZR2024MA089).

    The authors declare no conflict of interest.

    The proofs of the following lemmas and theorems are primarily based on Wang et al. [5], Ai et al. [7] and Yu et al. [34].

    Lemma 1. If Assumptions A1–A4 and A6 hold, then as $r\to\infty$ and $N\to\infty$, conditional on $\mathcal{F}_N$, we have

    $\tilde M_X-M_X=O_{P|\mathcal{F}_N}(r^{-1/2}),$  (A.1)
    $\frac{1}{N}\tilde Q(\hat\beta_{MLE})-\frac{1}{N}Q(\hat\beta_{MLE})=O_{P|\mathcal{F}_N}(r^{-1/2}),$  (A.2)
    $\frac{1}{N}V_C^{-1/2}\tilde Q(\hat\beta_{MLE})\xrightarrow{d}N_p(0,I),$  (A.3)

    where

    $\tilde M_X=\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i},$

    and

    $V_C=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\hat\beta_{MLE})\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\pi_i}.$

    Proof.

    $E(\tilde M_X\mid\mathcal{F}_N)=E\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}\,\Big|\,\mathcal{F}_N\right)=\frac{1}{Nr}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{\Omega_j(\Sigma_u,\hat\beta_{MLE})}{\pi_j}=\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})=M_X.$

    By Assumption A6, we have

    $E\left[\left(\tilde M_X^{j_1j_2}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]=E\left[\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}-\frac{1}{N}\sum_{i=1}^N\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})\right)^2\Big|\,\mathcal{F}_N\right]=\frac{1}{r}\sum_{i=1}^N\pi_i\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i}-M_X^{j_1j_2}\right)^2=\frac{1}{r}\sum_{i=1}^N\pi_i\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i}\right)^2-\frac{1}{r}\left(M_X^{j_1j_2}\right)^2\le\frac{1}{r}\sum_{i=1}^N\pi_i\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i}\right)^2=O_P(r^{-1}).$

    It follows from Chebyshev's inequality that (A.1) holds.

    $E\left(\frac{1}{N}\tilde Q(\hat\beta_{MLE})\Big|\,\mathcal{F}_N\right)=E\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}\Big|\,\mathcal{F}_N\right)=\frac{1}{Nr}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{\eta_j(\Sigma_u,\hat\beta_{MLE})}{\pi_j}=\frac{1}{N}\sum_{i=1}^N\eta_i(\Sigma_u,\hat\beta_{MLE})=0.$

    By Assumption A4, we have

    $\mathrm{Var}\left(\frac{1}{N}\tilde Q(\hat\beta_{MLE})\Big|\,\mathcal{F}_N\right)=\frac{1}{N^2r^2}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{\eta_j(\Sigma_u,\hat\beta_{MLE})\eta_j^T(\Sigma_u,\hat\beta_{MLE})}{\pi_j^2}=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\hat\beta_{MLE})\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\pi_i}=O_P(r^{-1}).$

    Now (A.2) follows from Markov's inequality.

    Let $\gamma_i=(N\tilde\pi_i)^{-1}\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})$, so that $N^{-1}\tilde Q(\hat\beta_{MLE})=r^{-1}\sum_{i=1}^r\gamma_i$. Based on Assumption A6, for all $\varepsilon>0$, we have

    $\sum_{i=1}^rE\left\{r^{-1}\|\gamma_i\|^2I(\|\gamma_i\|>r^{1/2}\varepsilon)\Big|\,\mathcal{F}_N\right\}=\frac{1}{r}\sum_{i=1}^rE\left\{\|\gamma_i\|^2I(\|\gamma_i\|>r^{1/2}\varepsilon)\Big|\,\mathcal{F}_N\right\}\le\frac{1}{r^{3/2}\varepsilon}\sum_{i=1}^rE\left\{\|\gamma_i\|^3\Big|\,\mathcal{F}_N\right\}=\frac{1}{r^{1/2}\varepsilon}\,\frac{1}{N^3}\sum_{i=1}^N\frac{\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|^3}{\pi_i^2}=O_P(r^{-1/2})=o_P(1).$

    This shows that the Lindeberg-Feller conditions are satisfied in probability. Therefore (A.3) holds.

    Lemma 2. If Assumptions A1–A7 hold, then as $r\to\infty$ and $N\to\infty$, conditional on $\mathcal{F}_N$, for all $\|s_r\|\to0$, we have

    $\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})=o_{P|\mathcal{F}_N}(1).$  (A.4)

    Proof. The left-hand side of (A.4) can be written as

    $\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i}-\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}+\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}).$

    Let

    $\tau_1:=\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i}-\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i},$

    then by Assumption A7, we have

    $E(\|\tau_1\|_S\mid\mathcal{F}_N)\le E\left\{\frac{1}{Nr}\sum_{i=1}^r\frac{1}{\tilde\pi_i}\left\|\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)-\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})\right\|_S\Big|\,\mathcal{F}_N\right\}=\frac{1}{Nr}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{1}{\pi_j}\left\|\Omega_j(\Sigma_u,\hat\beta_{MLE}+s_r)-\Omega_j(\Sigma_u,\hat\beta_{MLE})\right\|_S\le\frac{1}{N}\sum_{i=1}^Nm_2(W_i)\|s_r\|=o_P(1).$

    It follows from Markov's inequality that $\tau_1=o_{P|\mathcal{F}_N}(1)$.

    Let

    $\tau_2:=\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}),$

    then

    $E\left\{\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i}\Big|\,\mathcal{F}_N\right\}=\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}).$

    From the proof of Lemma 1, it follows that

    $E\left[\left(\tilde M_X^{j_1j_2}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]=O_P(r^{-1})=o_P(1).$

    Therefore $\tau_2=o_{P|\mathcal{F}_N}(1)$, and (A.4) holds.

    Next, we will prove Theorems 2.1 and 2.2.

    Proof of Theorem 2.1. $\tilde\beta$ is the solution of $\tilde Q(\beta)=\frac{1}{r}\sum_{i=1}^r\frac{1}{\tilde\pi_i}\tilde\eta_i(\Sigma_u,\beta)=0$, and

    $E\left(\frac{1}{N}\tilde Q(\beta)\Big|\,\mathcal{F}_N\right)=\frac{1}{Nr}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{\eta_j(\Sigma_u,\beta)}{\pi_j}=\frac{1}{N}\sum_{i=1}^N\eta_i(\Sigma_u,\beta)=\frac{1}{N}Q(\beta).$

    By Assumption A6, we have

    $\mathrm{Var}\left(\frac{1}{N}\tilde Q(\beta)\Big|\,\mathcal{F}_N\right)=\frac{1}{N^2r^2}\sum_{i=1}^r\sum_{j=1}^N\pi_j\frac{\eta_j(\Sigma_u,\beta)\eta_j^T(\Sigma_u,\beta)}{\pi_j^2}=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\beta)\eta_i^T(\Sigma_u,\beta)}{\pi_i}=O_P(r^{-1}).$

    Therefore, as $r\to\infty$, $N^{-1}\tilde Q(\beta)-N^{-1}Q(\beta)\to0$ for all $\beta\in\Lambda$ in conditional probability given $\mathcal{F}_N$. Thus, from Theorem 5.9 in [32], we have $\|\tilde\beta-\hat\beta_{MLE}\|=o_{P|\mathcal{F}_N}(1)$. By Taylor expansion,

    $\frac{1}{N}\tilde Q(\tilde\beta)=0=\frac{1}{N}\tilde Q(\hat\beta_{MLE})+\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i}(\tilde\beta-\hat\beta_{MLE}).$

    By Lemma 2, it follows that

    $\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})=o_{P|\mathcal{F}_N}(1),$

    then

    $0=\frac{1}{N}\tilde Q(\hat\beta_{MLE})+\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})(\tilde\beta-\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1)(\tilde\beta-\hat\beta_{MLE}).$

    That is,

    $\frac{1}{N}\tilde Q(\hat\beta_{MLE})+M_X(\tilde\beta-\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(\|\tilde\beta-\hat\beta_{MLE}\|)=0,$

    so we have

    $\tilde\beta-\hat\beta_{MLE}=-M_X^{-1}\left\{\frac{1}{N}\tilde Q(\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(\|\tilde\beta-\hat\beta_{MLE}\|)\right\}=-M_X^{-1}V_C^{1/2}V_C^{-1/2}\frac{1}{N}\tilde Q(\hat\beta_{MLE})+M_X^{-1}o_{P|\mathcal{F}_N}(\|\tilde\beta-\hat\beta_{MLE}\|)=O_{P|\mathcal{F}_N}(r^{-1/2})+o_{P|\mathcal{F}_N}(\|\tilde\beta-\hat\beta_{MLE}\|).$  (A.5)

    By Lemma 1 and Assumption A3, $M_X^{-1}=O_{P|\mathcal{F}_N}(1)$, and hence $\|\tilde\beta-\hat\beta_{MLE}\|=O_{P|\mathcal{F}_N}(r^{-1/2})$.

    Proof of Theorem 2.2. By Lemma 1 and (A.5), as $r\to\infty$, conditional on $\mathcal{F}_N$, it holds that

    $V^{-1/2}(\tilde\beta-\hat\beta_{MLE})=-V^{-1/2}M_X^{-1}V_C^{1/2}\,V_C^{-1/2}\frac{1}{N}\tilde Q(\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1).$

    By Lemma 1 and Slutsky's theorem, it follows that

    $V^{-1/2}(\tilde\beta-\hat\beta_{MLE})\xrightarrow{d}N_p(0,I).$

    Proof of Theorem 2.3. To minimize the asymptotic variance $\mathrm{tr}(V)$ of $\tilde\beta$, the optimization problem is

    $\begin{cases}\min\ \mathrm{tr}(V)=\min\ \dfrac{1}{N^2r}\sum_{i=1}^N\left[\dfrac{1}{\pi_i}\left\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\right\|^2\right],\\ \text{s.t.}\ \sum_{i=1}^N\pi_i=1,\quad 0\le\pi_i\le1,\quad i=1,\ldots,N.\end{cases}$  (A.6)

    Define $g_i^{mV}=\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\|$, $i=1,\ldots,N$. It follows from the Cauchy-Schwarz inequality that

    $\mathrm{tr}(V)=\frac{1}{N^2r}\sum_{i=1}^N\left[\frac{1}{\pi_i}\left\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\right\|^2\right]=\frac{1}{N^2r}\left(\sum_{i=1}^N\pi_i\right)\left\{\sum_{i=1}^N\left[\frac{1}{\pi_i}\left\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\right\|^2\right]\right\}\ge\frac{1}{N^2r}\left[\sum_{i=1}^N\left\|M_X^{-1}\eta_i(\Sigma_u,\hat\beta_{MLE})\right\|\right]^2=\frac{1}{N^2r}\left[\sum_{i=1}^Ng_i^{mV}\right]^2.$

    Equality holds if and only if $\pi_i\propto g_i^{mV}$; therefore

    $\pi_i^{mV}=\frac{g_i^{mV}}{\sum_{j=1}^Ng_j^{mV}}$

    is the optimal solution.

    The proof of Theorem 2.4 is similar to Theorem 2.3.

    Lemma 3. If Assumptions A1–A4 and A6 hold, then as $r_0\to\infty$, $r\to\infty$ and $N\to\infty$, conditional on $\mathcal{F}_N$, we have

    $M_X^{\tilde\beta_0}-M_X=O_{P|\mathcal{F}_N}(r^{-1/2}),$  (A.7)
    $M_X^0-M_X=O_{P|\mathcal{F}_N}(r_0^{-1/2}),$  (A.8)
    $\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})=O_{P|\mathcal{F}_N}(r^{-1/2}),$  (A.9)
    $\frac{1}{N}Q_0^{\tilde\beta_0}(\hat\beta_{MLE})=O_{P|\mathcal{F}_N}(r_0^{-1/2}),$  (A.10)
    $\frac{1}{N}\left(V_C^{opt}\right)^{-1/2}Q_{\tilde\beta_0}(\hat\beta_{MLE})\xrightarrow{d}N_p(0,I),$  (A.11)

    where

    $M_X^{\tilde\beta_0}=\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}},\qquad M_X^0=\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{UNIF}}.$

    Proof.

    $E(M_X^{\tilde\beta_0}\mid\mathcal{F}_N)=E_{\tilde\beta_0}\left[E\left(M_X^{\tilde\beta_0}\mid\mathcal{F}_N,\tilde\beta_0\right)\right]=E_{\tilde\beta_0}\left[E\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right]=E_{\tilde\beta_0}\left[M_X\right]=M_X.$

    By Assumption A6, we have

    $E\left[\left(M_X^{\tilde\beta_0,j_1j_2}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]=E_{\tilde\beta_0}\left\{E\left[\left(M_X^{\tilde\beta_0,j_1j_2}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N,\tilde\beta_0\right]\right\}=E_{\tilde\beta_0}\left[\frac{1}{r}\sum_{i=1}^N\pi_i^{opt}\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i^{opt}}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]=E_{\tilde\beta_0}\left[\frac{1}{r}\sum_{i=1}^N\pi_i^{opt}\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i^{opt}}\right)^2-\frac{1}{r}\left(M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]\le E_{\tilde\beta_0}\left[\frac{1}{r}\sum_{i=1}^N\pi_i^{opt}\left(\frac{\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})}{N\pi_i^{opt}}\right)^2\Big|\,\mathcal{F}_N\right]=\frac{1}{r}\sum_{i=1}^N\frac{\left(\Omega_i^{(j_1j_2)}(\Sigma_u,\hat\beta_{MLE})\right)^2}{N^2\pi_i^{opt}}=O_P(r^{-1}).$

    It follows from Chebyshev's inequality that (A.7) holds. Similarly, (A.8) also holds.

    $E\left(\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})\Big|\,\mathcal{F}_N\right)=E_{\tilde\beta_0}\left[E\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right]=\frac{1}{N}\sum_{i=1}^N\eta_i(\Sigma_u,\hat\beta_{MLE})=0.$

    By Assumption A6, we have

    $\mathrm{Var}\left(\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})\Big|\,\mathcal{F}_N\right)=E_{\tilde\beta_0}\left\{\mathrm{Var}\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right\}=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\hat\beta_{MLE})\eta_i^T(\Sigma_u,\hat\beta_{MLE})}{\pi_i^{opt}}=O_P(r^{-1}).$

    Therefore, (A.9) and (A.10) follow from Markov's inequality.

    Let

    $\gamma_{i,\tilde\beta_0}=\frac{\tilde\eta_i(\Sigma_u,\hat\beta_{MLE})}{N\tilde\pi_i^{opt}},$

    so that $N^{-1}Q_{\tilde\beta_0}(\hat\beta_{MLE})=r^{-1}\sum_{i=1}^r\gamma_{i,\tilde\beta_0}$. For all $\varepsilon>0$,

    $\sum_{i=1}^rE_{\tilde\beta_0}\left\{E\left[r^{-1}\|\gamma_{i,\tilde\beta_0}\|^2I\left(\|\gamma_{i,\tilde\beta_0}\|>r^{1/2}\varepsilon\right)\Big|\,\mathcal{F}_N,\tilde\beta_0\right]\right\}\le\frac{1}{r^{3/2}\varepsilon}\sum_{i=1}^rE_{\tilde\beta_0}\left[E\left(\|\gamma_{i,\tilde\beta_0}\|^3\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right]=\frac{1}{r^{1/2}\varepsilon}\,\frac{1}{N^3}\sum_{i=1}^N\frac{\|\eta_i(\Sigma_u,\hat\beta_{MLE})\|^3}{\left(\pi_i^{opt}\right)^2}=O_P(r^{-1/2})=o_P(1).$

    This shows that the Lindeberg-Feller conditions are satisfied in probability. Therefore (A.11) holds.

    Lemma 4. If Assumptions A1–A7 hold, then as $r_0\to\infty$, $r\to\infty$ and $N\to\infty$, for all $\|s_{r_0}\|\to0$ and $\|s_r\|\to0$, conditional on $\mathcal{F}_N$, we have

    $\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})}{\tilde\pi_i^{opt}}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})=o_{P|\mathcal{F}_N}(1),$  (A.12)
    $\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i^{opt}}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})=o_{P|\mathcal{F}_N}(1).$  (A.13)

    Proof. The left-hand side of (A.12) can be written as

    $\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})}{\tilde\pi_i^{opt}}-\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}+\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}).$

    Let

    $\tau_{01}:=\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})}{\tilde\pi_i^{opt}}-\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}},$

    then by Assumption A7, we have

    $E(\|\tau_{01}\|_S\mid\mathcal{F}_N)\le E_{\tilde\beta_0}\left\{E\left[\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{1}{\tilde\pi_i^{opt}}\left\|\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})-\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})\right\|_S\Big|\,\mathcal{F}_N,\tilde\beta_0\right]\right\}=\frac{1}{N}\sum_{i=1}^N\left\|\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})-\Omega_i(\Sigma_u,\hat\beta_{MLE})\right\|_S\le\frac{1}{N}\sum_{i=1}^Nm_2(W_i)\|s_{r_0}\|=o_P(1).$

    It follows from Markov's inequality that $\tau_{01}=o_{P|\mathcal{F}_N}(1)$.

    Let

    $\tau_{02}:=\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}-\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}),$

    then

    $E_{\tilde\beta_0}\left\{E\left[\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE})}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right]\right\}=\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE}).$

    By the proof of Lemma 3, it follows that

    $E\left[\left(M_X^{\tilde\beta_0,j_1j_2}-M_X^{j_1j_2}\right)^2\Big|\,\mathcal{F}_N\right]=O_P(r_0^{-1})=o_P(1),$

    and hence $\tau_{02}=o_{P|\mathcal{F}_N}(1)$. Therefore (A.12) holds. Similarly, (A.13) is also true.

    Next, we will prove Theorems 2.5 and 2.6.

    Proof of Theorem 2.5.

    $E\left(\frac{1}{N}Q_{\tilde\beta_0}(\beta)\Big|\,\mathcal{F}_N\right)=E_{\tilde\beta_0}\left[E\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\beta)}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right]=\frac{1}{N}\sum_{i=1}^N\eta_i(\Sigma_u,\beta)=\frac{1}{N}Q(\beta).$

    By Assumption A6, we have

    $\mathrm{Var}\left(\frac{1}{N}Q_{\tilde\beta_0}(\beta)\Big|\,\mathcal{F}_N\right)=E_{\tilde\beta_0}\left\{\mathrm{Var}\left(\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\eta_i(\Sigma_u,\beta)}{\tilde\pi_i^{opt}}\Big|\,\mathcal{F}_N,\tilde\beta_0\right)\right\}=\frac{1}{N^2r}\sum_{i=1}^N\frac{\eta_i(\Sigma_u,\beta)\eta_i^T(\Sigma_u,\beta)}{\pi_i^{opt}}=O_P(r^{-1}).$

    Hence, as $r\to\infty$, $N^{-1}Q_{\tilde\beta_0}(\beta)-N^{-1}Q(\beta)\to0$ for all $\beta\in\Lambda$ in conditional probability given $\mathcal{F}_N$.

    $\check\beta$ is the solution of $Q_{\tilde\beta_0}^{twostep}(\beta)=0$, so we have

    $0=\frac{1}{N}Q_{\tilde\beta_0}^{twostep}(\check\beta)=\frac{r}{r+r_0}\,\frac{1}{N}Q_{\tilde\beta_0}(\check\beta)+\frac{r_0}{r+r_0}\,\frac{1}{N}Q_0^{\tilde\beta_0}(\check\beta).$  (A.14)

    By Lemma 4, we have

    $\frac{1}{Nr_0}\sum_{i=1}^{r_0}\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_{r_0})}{\tilde\pi_i^{opt}}=\frac{1}{N}\sum_{i=1}^N\Omega_i(\Sigma_u,\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1)=M_X+o_{P|\mathcal{F}_N}(1),$

    and

    $\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i^{opt}}=M_X+o_{P|\mathcal{F}_N}(1).$

    By Taylor expansion, we have

    $\frac{1}{N}Q_{\tilde\beta_0}(\check\beta)=\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})+\frac{1}{Nr}\sum_{i=1}^r\frac{\tilde\Omega_i(\Sigma_u,\hat\beta_{MLE}+s_r)}{\tilde\pi_i^{opt}}(\check\beta-\hat\beta_{MLE})=\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})+M_X(\check\beta-\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1)(\check\beta-\hat\beta_{MLE}).$  (A.15)

    Similarly,

    $\frac{1}{N}Q_0^{\tilde\beta_0}(\check\beta)=\frac{1}{N}Q_0^{\tilde\beta_0}(\hat\beta_{MLE})+M_X(\check\beta-\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1)(\check\beta-\hat\beta_{MLE}).$  (A.16)

    As $r_0r^{-1}\to0$, we have $\frac{r_0}{r+r_0}\to0$ and $\frac{r}{r+r_0}\to1$. Combining this with (A.14)–(A.16), we have

    $\check\beta-\hat\beta_{MLE}=-M_X^{-1}\left[\frac{r}{r+r_0}\,\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})+\frac{r_0}{r+r_0}\,\frac{1}{N}Q_0^{\tilde\beta_0}(\hat\beta_{MLE})\right]+o_{P|\mathcal{F}_N}(\|\check\beta-\hat\beta_{MLE}\|)=O_{P|\mathcal{F}_N}(r^{-1/2})+o_{P|\mathcal{F}_N}(\|\check\beta-\hat\beta_{MLE}\|),$  (A.17)

    where the last equality follows from (A.9), (A.10) and $r_0r^{-1}\to0$, which implies that $\|\check\beta-\hat\beta_{MLE}\|=O_{P|\mathcal{F}_N}(r^{-1/2})$.

    Proof of Theorem 2.6. By Lemma 3 and (A.17), we have

    $V_{opt}^{-1/2}(\check\beta-\hat\beta_{MLE})=-V_{opt}^{-1/2}M_X^{-1}\left[\frac{r}{r+r_0}\,\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})+\frac{r_0}{r+r_0}\,\frac{1}{N}Q_0^{\tilde\beta_0}(\hat\beta_{MLE})\right]+o_{P|\mathcal{F}_N}(1).$

    By Assumption A4 and $r_0r^{-1}\to0$, the term involving $Q_0^{\tilde\beta_0}(\hat\beta_{MLE})$ is $o_{P|\mathcal{F}_N}(1)$ and $\frac{r}{r+r_0}\to1$, and therefore

    $V_{opt}^{-1/2}(\check\beta-\hat\beta_{MLE})=-V_{opt}^{-1/2}M_X^{-1}\left(V_C^{opt}\right)^{1/2}\left(V_C^{opt}\right)^{-1/2}\frac{1}{N}Q_{\tilde\beta_0}(\hat\beta_{MLE})+o_{P|\mathcal{F}_N}(1).$

    Therefore, by (A.11) in Lemma 3 and Slutsky's theorem,

    $V_{opt}^{-1/2}(\check\beta-\hat\beta_{MLE})\xrightarrow{d}N_p(0,I).$


