Research article

Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization


  • Received: 09 April 2021 Accepted: 18 May 2021 Published: 01 June 2021
  • Distributed learning over data from sensor-based networks has been adopted to collaboratively train models on sensitive data without privacy leakage. We present a distributed learning framework that integrates secure multi-party computation and differential privacy. In our differential privacy method, we explore the potential of output perturbation and gradient perturbation and build on the cutting-edge methods of both techniques in the distributed learning domain. In our proposed multi-scheme output perturbation algorithm (MS-OP), data owners combine their local classifiers within a secure multi-party computation and later inject an appropriate amount of statistical noise into the model before it is revealed. In our adaptive iterative gradient perturbation (MS-GP) method, data providers collaboratively train a global model; during each iteration, the data owners aggregate their locally trained models within the secure multi-party domain. Since the conversion of existing algorithms into differentially private ones is often naive, we improve on the method by meticulously calibrating the privacy budget for each iteration. As the parameters of the model approach their optimal values, the gradients decrease and therefore require accurate measurement. We therefore add a fundamental line-search capability that enables our MS-GP algorithm to decide exactly when a more accurate measurement of the gradient is indispensable. Validation of our models on three (3) real-world datasets shows that our algorithms hold a clear advantage over existing cutting-edge privacy-preserving approaches in the distributed setting.

    Citation: Kwabena Owusu-Agyemang, Zhen Qin, Appiah Benjamin, Hu Xiong, Zhiguang Qin. Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization[J]. Mathematical Biosciences and Engineering, 2021, 18(4): 4772-4796. doi: 10.3934/mbe.2021243




    With the proliferation of mobile sensor networks [1], mobile devices (i.e., smartphones, tablets, personal digital assistants, laptop computers, smart watches, e-readers) are widely preferred by users, as they gradually occupy an imperative position in sensor-based activity recognition [2], coupled with the rapid growth of their per capita holding. Privacy preservation [3] is a fundamental challenge for the massive structured and unstructured [4] datasets generated by these numerous smart applications, which rely on data aggregation and combined learning across diverse nodes. Existing approaches address these privacy concerns in different ways. Two predominant approaches are secure Multiparty Computation (MPC) and Differential Privacy (DP) [5].

    MPC is a preferred option for optimizing a computational function over jointly distributed data resources with the application of cryptographic primitives (i.e., oblivious transfer, secret sharing, and homomorphic encryption) while keeping these data private. MPC under different adversarial models has been studied in numerous computing domains (e.g., [6]). One of the initial private data mining methods in this domain was proposed by Liu et al. [7], and it was followed by a plethora of works considering various applications or implementations of adversarial models. Most of the existing MPC primitives that help meet these security requirements are based on advances in fully homomorphic encryption (FHE) [8]. FHE enables data providers to encrypt individual datasets with a public key and outsource a computation to the cloud server. The cloud server computes on the encrypted datasets and thereby generates encrypted intermediate results. In the absence of the secret key, the cloud server basically serves as a computation platform that is unable to access any of the individual records. Current emphasis has been on achieving realistic and efficient distributed machine learning (DML) with the application of MPC primitives [9], and in some domains these approaches have been shown to scale to learning tasks with hundreds of millions of records [10]. Notwithstanding the considerable benefits of MPC, it still faces an unavoidable obstacle: residual security and privacy risks. The cloud server may be semi-honest or even malicious in specific cases, meaning it may be motivated to infer individuals' confidential information for profit or curiosity during the training of the machine learning models. This is becoming a non-trivial privacy concern for outsourced secure multiparty computation.

    Differential privacy comes in handy here: it injects noise into the intermediate results to prevent inference of sensitive information about any specific individual record. However, careful calibration of the statistical noise is required to balance privacy and model usability, and achieving a good trade-off between the privacy and utility of a DP algorithm remains a problem. Concretely, prevailing approaches that apply differential privacy to classifiers only protect the training data during the learning process; they do not focus on mitigating the privacy risk of black-box inference attacks on the resultant machine learning model. In the architecture of differential privacy, Empirical Risk Minimization (ERM) plays a crucial role as it encompasses a myriad of tasks in machine learning. Once one knows how to implement ERM privately (DP-ERM), it becomes easy to obtain differentially private algorithms for a wide range of machine learning problems, essentially regression and classification. The initial representative work in this line of study was conducted by [11]. DP-ERM should produce indistinguishable intermediate results when there is a small modification in the input datasets. Specifically, earlier studies on DP-ERM guarantee differentially private optimization through three methods: objective perturbation, output perturbation (OP), and gradient perturbation (GP).

    Existing approaches have demonstrated that DP-ERM can be applied to preserve privacy in the centralized domain, where an individual entity possesses the entire dataset. However, a more prevalent problem arises when data is owned by multiple organizations in the distributed domain. Consider the real-world application of this paradigm in medical research, where multiple hospitals may wish to collaboratively train a model on sensor-based activity recognition medical records generated from a large number of patient wearable mobile devices, without exposing their individual data instances to other parties. This powerful method was integrated with distributed machine learning in the pioneering study of Pathak et al. [12], which securely aggregates locally trained classifiers and applies output perturbation to establish differential privacy. Since the noise allocated to the model is inversely proportional to the smallest dataset among the p parties, Jayaraman et al. [13] later improved the noise scale by a factor of p by adopting secure model aggregation of locally trained classifiers [14] and then executing naive aggregation of the locally trained models while providing a meaningful privacy guarantee. Their proposed output perturbation algorithm improves Pathak et al.'s [12] protocol by a factor of p by injecting statistical noise within the secure MPC domain, with a required noise scale that is inversely proportional to the volume of the whole dataset.

    Despite the recurrent progress in such markets, when several hospitals deploy mobile sensor networks in a pervasive healthcare monitoring setup to accumulate health datasets through mobile devices and develop innovative diagnostic classifiers from these patient records, several pressing issues remain:

    (a) Existing works [12,13] provide model parameters without any uncertainty estimates, making it extremely difficult to add capabilities for quantifying uncertainty in the model coefficients.

    (b) When the data is extremely large, with each individual data provider holding only one training instance, the classifier aggregation method for generating a global model from multiple locally trained classifiers via parameter averaging may lead to information leakage. Moreover, parameter averaging is only applicable to collections of the same model type and tends to generate a less accurate global aggregated classifier than training in the centralized domain.

    (c) The accuracy of the model relies heavily on the pre-specified number of iterations T. If T is too small, the learning procedure stops well short of the optimum; the larger T is, the smaller the privacy budget $\epsilon_t$ for each iteration. Large volumes of statistical noise therefore have to be injected at every gradient iteration, which swamps the signal contributed by the gradient in the iterative training procedure.

    (d) In the initial stage of ERM optimization, gradients are expected to be large; this enables the learning algorithm to find adequate parameter updates even when the gradient is not computed accurately. Nonetheless, the gradient begins to decrease and requires accurate measurement as the current parameter $w_t$ approaches the optimal values, so that the optimization algorithm can continue minimizing or approximately minimizing the loss function f. This implies that, for a fixed total privacy cost, it is important to apply an adaptive per-iteration privacy budget allocation rather than a fixed one.

    The basic pipeline for securely training distributed machine learning models in this scenario is illustrated in Figure 1, where the secure multi-party machine learning framework is composed of p + 1 parties: p data providers $\{1, 2, \dots, p\}$ and a cloud server. In this architecture, the assumption is that each data provider i, who holds an individual private dataset $D_i$ of medical records and a training model $\theta$ generated from that dataset, is willing to improve the accuracy of the globally trained model without leaking any sensitive information about the locally owned confidential dataset.

    Figure 1.  Secure distributed machine learning and description pipeline.

    In this paper, we argue that it is of great significance to introduce differentially private distributed machine learning algorithms that thrash out the fundamental problems outlined above for the current protocols [13,14]. We apply both output perturbation and gradient perturbation, with the statistical noise injected inside a secure multi-party computation domain. The inherent strategy is to demonstrate that our output perturbation algorithm securely aggregates the locally trained models, which may be encrypted with distinct encryption primitives or even with distinct keys, to achieve $\epsilon$-differential privacy by adding statistical Laplace noise to the aggregated classifier parameters. In our gradient perturbation scheme, individual data owners jointly execute an adaptive iterative gradient-based protocol in which they securely aggregate the local gradients at each iteration using a distinct share $\epsilon_t$ of the total privacy budget $\epsilon$, thereby providing $(\epsilon, \delta)$-differential privacy through an adaptive gradient descent approach based on zero-concentrated DP [15] (zCDP).

    This protocol offers an $(\epsilon,\delta)$-differential privacy guarantee in the honest-but-curious security threat environment through the underlying cryptosystem and privacy-preserving primitives. In this domain, adversaries have only a negligible chance of leaking sensitive data instances since the individual models are globally encrypted. It provides a more practicable and efficient privacy-preserving primitive in the distributed machine learning domain.

    The principal contributions of this work are threefold:

    ● To achieve higher computational efficiency and maintain higher accuracy, irrespective of how the dataset is partitioned, when accurate machine learning models are privately trained in the distributed setting, we derive noise bounds for each of our two protocols: the multi-scheme output perturbation (MS-OP) method and the multi-scheme gradient perturbation (MS-GP) method.

    ● For our gradient perturbation method, we present a private gradient descent protocol based on zCDP [15], which is more robust than $(\epsilon,\delta)$-differential privacy and attains the minimum known bound on the privacy budget. In this algorithm, the step size and privacy budget for each iteration are dynamically decided at run-time based on the value of the injected statistical noise obtained for the current gradient iteration. Additionally, the noise is generated within the secure multiparty protocol, which enables us to inject a single copy of statistical noise, in contrast to existing approaches that aggregate noise from individual data owners.

    ● Our algorithm is validated on real human activity datasets against other recently proposed empirical risk minimization algorithms for distributed learning, while varying the number of data owners and the sizes of the local datasets. We empirically show the effectiveness of the proposed protocol over a wide range of privacy levels. Our simulations indicate the practicality and feasibility of our model, which is very close to the non-private classifier with respect to classification accuracy and generalization error.

    In Section 2, we provide an outline of the related work. Section 3 presents some preliminaries and the definition of multi-party differential privacy in the distributed domain. The proposed architecture is given in Section 4. Experimental results for the proposed architecture are presented in Section 5, and Section 6 provides a comparative evaluation of our algorithm.

    Distributed machine learning (DML) [16], as a decentralized machine learning paradigm, enables distributed training on large-scale datasets across edge devices, including electronic meters, sensors, and smartphones, where no individual node is capable of extracting an intelligent decision from a massive dataset within an appreciable period of time. The DML technique has earned a remarkable reputation in numerous pragmatic areas such as visual object detection, health records, big data analysis, control policies, and medical text extraction. Regrettably, as the number of distributed data owners increases, guaranteeing the security of the datasets from the individual data owners becomes extremely difficult. This lack of security increases the threat that adversaries attack the dataset and subsequently manipulate the intermediate training results, thereby affecting data integrity, which is a key component in training machine learning models. Adversarial data attacks [17] are one of the distinctive ways of corrupting machine learning models by contaminating data during the training phase. Consider a scenario where newly generated datasets are expected to be updated periodically by the data owners to improve the models: the adversary gains more chances of poisoning the dataset, posing a severe threat to distributed machine learning models. This kind of threat is considered one of the most important emerging security threats against machine learning models. Since adversarial attacks have the potential to mislead diverse machine learning methods, a widely applicable DML protection mechanism urgently needs to be investigated. Numerous works have been conducted on privacy-preserving DML, most of them strongly stimulated by privacy-preserving machine learning and data mining. The existing literature on privacy-preserving DML basically falls into two major categories: cryptography-based techniques and perturbation-based methods.

    The cryptography-based methods typically incorporate cryptographic tools to preserve the privacy of the datasets. Secure multiparty computation may potentially address these security threats in the DML system. Although acquiring information by aggregating datasets from multiple parties is a critical task for machine learning, in a real-world business setting the prevention of privacy leakage while carrying out this task is a crucial requirement for secure multiparty computation. The secure multi-party computation problem refers to the collaborative execution of a statistical function by multiple data owners. After the computation, individual data owners obtain accurate results, and no one can learn more than what can be inferred from the public intermediate results. Secure multiparty algorithms are basically cryptographic methods that apply standard cryptographic techniques to perform these distributed machine learning tasks. The data owners aim not to expose any knowledge about the original data except what can be inferred from the output [7] of the distributed machine learning task. Over the past few years, secure multiparty computation has been widely applied to achieve privacy preservation in distributed machine learning since it is more effective and efficient than other privacy-preserving algorithms. Secure MPC is capable of supporting floating-point and fixed-point arithmetic operations [18], and these arithmetic functions can be executed with controlled linear complexity [19]. Owing to these benefits, exploring the potential of secure multiparty computation for privacy preservation in distributed machine learning has gained considerable attention over the past few years. The initial proposal for secure two-party (2PC) computation was introduced by Yao [20] in 1982. Subsequently, Goldreich et al. [21] generalized and extended the 2PC method to the secure multi-party computation (MPC) problem. Secure MPC later gained substantial research attention aimed at finding practical ways of exploring its potential. Gentry [8] introduced fully homomorphic encryption using ideal lattices, which provided a powerful building block for secure MPC. This was followed by numerous researchers proposing various secure MPC implementations, including low-cost multi-party computation for a dishonest majority [22] and semi-homomorphic encryption with actively secure two-party computation [25]. These algorithms can largely be categorized into two major methods: secret sharing and homomorphic encryption. Gentry et al. [8] achieved the initial scheme with multiplicative and additive homomorphism, which requires a long time to execute complex circuits because of the unavoidable noise-elimination step. In contrast, the secret sharing primitive [24] can evaluate an unlimited number of multiplications and additions at the cost of additional data exchange. In a typical example in the two-party domain, Bansal et al. [26] applied secure scalar products and secret sharing to prevent privacy leakage during the training process, an approach that is not trivial to extend to secure multi-party models. Yuan et al. [27] proposed a privacy-preserving back-propagation algorithm for secure multi-party deep learning over arbitrarily partitioned datasets based on BGN homomorphic encryption [28]. Nevertheless, this primitive requires all data owners to be online and to collaborate interactively to decrypt the ciphertexts of the intermediate parameters during each iteration. In [29], Hesamifard et al. proposed a privacy-preserving machine learning algorithm that uses encrypted data to make encrypted predictions in the cloud. Since fully homomorphic encryption (FHE) [8] incurs high computational complexity, they proposed a confidential binary classification-based method to find a trade-off between the degree of the polynomial approximation and the secure performance of the training model. Li et al. [30], with the aid of multi-key fully homomorphic encryption (MK-FHE) [31], also proposed a privacy-preserving multi-party deep learning primitive: the individual data owners encrypt their datasets with different public keys and outsource them to the cloud server, which then trains the deep learning classifiers with the application of MK-FHE. Based on the relevant literature above, it is obvious that most cryptography-based algorithms use fully homomorphic or multi-key fully homomorphic encryption primitives to encrypt the entire dataset before outsourcing it to the cloud server.

    With the addition of noise to the raw dataset, perturbation-based methods are able to protect the privacy of the dataset. Agrawal et al. [32] propose an algorithm that injects elaborately designed noise into the training data while preserving the statistical properties needed to train a Naive Bayes classifier. With the gradual explosion of digitized datasets, Fong et al. [33] offered a privacy-preserving learning algorithm that transforms the original dataset into groups of unreal datasets while preserving the accuracy of the learning model. Furthermore, this algorithm ensures that the original data samples cannot be reconstructed without the whole group of constructed datasets. DP is a strong gold standard for guaranteeing privacy preservation for algorithms on aggregated datasets and is widely applied in privacy-preserving DML to ensure that the influence of any single data owner's record is insignificant. DP has been utilized in many existing applications by large-scale organizations such as the US Census Bureau [34]. A typical distributed machine learning strategy applied in this domain is Empirical Risk Minimization (ERM), where the average error of the trained model over the aggregated datasets is minimized. There have been numerous proposals to advance privacy-preserving algorithms for ERM problems using variants of differential privacy; this paradigm is known as Differentially Private Empirical Risk Minimization (DP-ERM) [13]. DP-ERM reduces the empirical risk while providing assurance that the intermediate results of the learning model are differentially private with respect to the aggregated training dataset. This privacy-preserving guarantee ensures strong protection against potential adversaries, such as inference attacks. To guarantee privacy preservation in this domain, it is essential to introduce randomness into the training protocol. Based on the time of noise injection, there are mainly three ways to introduce randomness: objective perturbation, output perturbation, and gradient perturbation. This work introduces differentially private DML algorithms based on both gradient perturbation and output perturbation, with the noise injected within the secure multi-party computation domain.

    In this section, we introduce the notations (Table 1) in our proposed protocol and the necessary background for our analyses, including empirical risk minimization, zero-concentrated differential privacy, and basic assumptions in secure multi-party computation.

    Table 1.  Notations in our proposed architecture.
    Notation: Description
    D: Database of n records
    (x_i, y_i): The i-th record in database D
    ϑ: The parameter vector of the model
    ϑ*: The optimal model parameter
    L(ϑ): The loss function on database D
    ϵ: The privacy budget
    R_j: Average relevance
    Δ: Sensitivity
    Lap(·): Laplace distribution
    g(x_i): Gradients
    g̃(x_i): The noisy gradients
    α: Relevance ratio


    Given a dataset $D = \{d_1, \dots, d_n\}$ from a data universe $\mathcal{X}$ and a closed convex set $C \subseteq \mathbb{R}^p$, DP-ERM is to find $x_{priv} \in C$ so as to minimize the empirical risk, i.e.,

    $F_r(x, D) = \frac{1}{n}\sum_{i=1}^{n} f(x, z_i) + r(x)$,  (3.1)

    with the guarantee of being differentially private, where f is the loss function and r is some simple (non)smooth convex function called the regularizer. When the inputs are drawn i.i.d. from an unknown underlying distribution $\mathcal{P}$ on $\mathcal{X}$, we also consider the population risk $\mathbb{E}_{z\sim\mathcal{P}}[f(x, z)]$. If the loss function is convex, the utility of the algorithm is measured by the expected excess empirical risk, i.e.,

    $\mathbb{E}_{\mathcal{A}}[F_r(x_{priv}, D)] - \min_{x \in C} F_r(x, D)$,  (3.2)

    or the expected excess population risk (i.e., generalization error), i.e.,

    $\mathbb{E}_{z\sim\mathcal{P},\,\mathcal{A}}[f(x_{priv}, z)] - \min_{x \in C}\mathbb{E}_{z\sim\mathcal{P}}[f(x, z)]$,  (3.3)

    where the expectation over $\mathcal{A}$ is taken over all the randomness of the algorithm.
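    To make the ERM objective of Eq (3.1) and the excess risk of Eq (3.2) concrete, here is a minimal sketch (ours, not part of the protocol) that evaluates a regularized empirical risk for a logistic loss; the function names and the use of NumPy are illustrative assumptions.

        import numpy as np

        def logistic_loss(x, z):
            """Per-example loss f(x, z) for z = (features, label) with label in {-1, +1}."""
            features, label = z
            return np.log1p(np.exp(-label * features @ x))

        def empirical_risk(x, data, reg=0.001):
            """F_r(x, D) = (1/n) * sum_i f(x, z_i) + r(x), with r(x) = (reg/2) * ||x||^2."""
            n = len(data)
            return sum(logistic_loss(x, z) for z in data) / n + 0.5 * reg * np.dot(x, x)

        def excess_empirical_risk(x_priv, x_star, data, reg=0.001):
            """One-sample estimate of the expected excess empirical risk of Eq (3.2)."""
            return empirical_risk(x_priv, data, reg) - empirical_risk(x_star, data, reg)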

    Let $D = \{d_1, d_2, \dots, d_n\}$ represent n data points, each drawn from some domain $\mathcal{D}$. Two databases D and D′ with $|D| = |D'|$ are neighboring if they differ in exactly one data point, i.e., D′ is obtained by replacing one observation of D. The concept of DP was introduced by Dwork [5] and is defined as follows:

    Definition 1. ($(\epsilon,\delta)$-DP [5]). A randomized mechanism M satisfies $(\epsilon,\delta)$-DP if for every event $S \subseteq \mathrm{range}(M)$ and for all neighboring $D, D' \in \mathcal{D}^n$,

    $\Pr[M(D) \in S] \le \exp(\epsilon)\Pr[M(D') \in S] + \delta$  (3.4)

    When $\delta = 0$, M achieves pure DP, which provides stronger privacy protection than approximate DP with $\delta > 0$. We can add noise sampled from the Laplace and Gaussian distributions to achieve $\epsilon$-DP and $(\epsilon,\delta)$-DP, respectively, where the noise is proportional to the norm sensitivity of M, given as $\Delta M = \|M(D) - M(D')\|$.
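    As a small, hedged illustration of the Laplace mechanism mentioned above (ours; the function name and NumPy usage are assumptions), noise of scale $\Delta_1(q)/\epsilon$ is added to the query output to obtain $\epsilon$-DP:

        import numpy as np

        def laplace_mechanism(query_value, l1_sensitivity, epsilon, rng=None):
            """Achieve epsilon-DP by adding Lap(Delta_1(q) / epsilon) noise to q(D)."""
            rng = np.random.default_rng() if rng is None else rng
            scale = l1_sensitivity / epsilon
            return query_value + rng.laplace(0.0, scale, size=np.shape(query_value))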

    Definition 2. ($\ell_1$ and $\ell_2$ norm sensitivity). Let $q: \mathcal{D}^n \to \mathbb{R}^d$ be a query function. The $\ell_1$ (resp. $\ell_2$) sensitivity of q, denoted by $\Delta_1(q)$ (resp. $\Delta_2(q)$), is defined as follows:

    $\Delta_1(q) = \max_{D \sim D'}\|q(D) - q(D')\|_1, \qquad \Delta_2(q) = \max_{D \sim D'}\|q(D) - q(D')\|_2$  (3.5)

    The $\ell_1$ and $\ell_2$ sensitivities represent the maximum change in the output value of q (over all possible neighboring databases in $\mathcal{D}^n$) when one individual's data is changed.

    Theorem 1. Let $\epsilon \in (0,1)$ be arbitrary and q be a query function with $\ell_2$ sensitivity $\Delta_2(q)$. The Gaussian mechanism [35], which returns $q(D) + N(0, \sigma^2)$ with

    $\sigma \ge \frac{\Delta_2(q)}{\epsilon}\sqrt{2\ln(1.25/\delta)}$  (3.6)

    is $(\epsilon,\delta)$-DP.
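    The following sketch (ours) instantiates the Gaussian mechanism of Theorem 1 with σ chosen according to Eq (3.6); the mean query and its sensitivity bound are illustrative assumptions.

        import numpy as np

        def gaussian_mechanism(query_value, l2_sensitivity, epsilon, delta, rng=None):
            """Return q(D) + N(0, sigma^2 I) with sigma per Eq (3.6); assumes epsilon in (0, 1)."""
            rng = np.random.default_rng() if rng is None else rng
            sigma = (l2_sensitivity / epsilon) * np.sqrt(2.0 * np.log(1.25 / delta))
            return query_value + rng.normal(0.0, sigma, size=np.shape(query_value))

        # Example: a mean over records in the unit ball has L2 sensitivity at most 2/n
        # for replacement neighbors.
        records = np.random.default_rng(0).normal(size=(1000, 5))
        records /= np.maximum(1.0, np.linalg.norm(records, axis=1, keepdims=True))
        noisy_mean = gaussian_mechanism(records.mean(axis=0),
                                        l2_sensitivity=2.0 / len(records),
                                        epsilon=0.5, delta=1e-3)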

    A critical property of DP is that its privacy guarantee degrades gracefully under composition. The most basic composition result shows that the privacy loss grows linearly under k-fold composition [36]: if we sequentially apply an $(\epsilon,\delta)$-differentially private algorithm k times on the same data, the resulting process is $(k\epsilon, k\delta)$-differentially private. Dwork et al. [37] provided a boosting method to construct an improved privacy-preserving synopsis of the queries with advanced composition, under which the privacy loss grows sub-linearly, at a rate of $\sqrt{k}$.

    Theorem 2. For all $\epsilon, \delta, \delta' \ge 0$, the class of $(\epsilon,\delta)$-differentially private mechanisms satisfies $(\epsilon', k\delta + \delta')$-differential privacy under k-fold adaptive composition, where $\epsilon' = \sqrt{2k\ln(1/\delta')}\,\epsilon + k\epsilon(e^\epsilon - 1)$, as stated in the advanced composition theorem [37].
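    As a quick numerical illustration of Theorem 2 (our sketch; the parameter values are arbitrary), the advanced composition bound can be compared with basic composition as follows:

        import math

        def basic_composition(epsilon, delta, k):
            """Basic composition: k-fold use of an (eps, delta)-DP mechanism is (k*eps, k*delta)-DP."""
            return k * epsilon, k * delta

        def advanced_composition(epsilon, delta, k, delta_prime):
            """Theorem 2: (eps', k*delta + delta') with eps' = sqrt(2k ln(1/delta'))*eps + k*eps*(e^eps - 1)."""
            eps_prime = (math.sqrt(2 * k * math.log(1 / delta_prime)) * epsilon
                         + k * epsilon * (math.exp(epsilon) - 1))
            return eps_prime, k * delta + delta_prime

        print(basic_composition(0.1, 1e-6, 100))           # approx (10.0, 1e-4): linear growth
        print(advanced_composition(0.1, 1e-6, 100, 1e-6))  # approx (6.3, 1.0e-4): sub-linear in k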

    While differential privacy is suitable for algorithms such as output perturbation, it is not the preferred choice for gradient perturbation, which essentially requires repeated sampling of statistical noise during the iterative learning process. Bun and Steinke [15] provided zero-concentrated DP (zCDP), which has a tight composition bound and is a preferable option for gradient perturbation. We define ρ-zCDP by introducing the privacy loss random variable used in the definition of zCDP.

    Definition 3. For an output $o \in \mathrm{range}(M)$, the privacy loss random variable Z of the mechanism M is defined as follows:

    $Z = \log\frac{\Pr[M(D) = o]}{\Pr[M(D') = o]}$  (3.7)

    ρ-zCDP imposes a bound on the moment generating function of the privacy loss Z and requires it to be tightly concentrated around zero mean, so that it is unlikely that D can be distinguished from D′. Formally, it is required that:

    $e^{(\alpha-1)D_\alpha(M(D)\,\|\,M(D'))} = \mathbb{E}\left[e^{(\alpha-1)Z}\right] \le e^{(\alpha-1)\alpha\rho}, \quad \forall\alpha \in (1, \infty)$  (3.8)

    where $D_\alpha(M(D)\,\|\,M(D'))$ is the α-Rényi divergence between M(D) and M(D′). In this work, we apply the following zCDP composition results.

    Lemma 1. [15] If two mechanisms satisfy $\rho_1$-zCDP and $\rho_2$-zCDP, respectively, then their composition satisfies $(\rho_1 + \rho_2)$-zCDP.

    Lemma 2. [15] The Gaussian mechanism, which returns $q(D) + N(0, \sigma^2)$, satisfies $\left(\Delta_2(q)^2/(2\sigma^2)\right)$-zCDP.

    Lemma 3. [15] If M satisfies $\epsilon$-DP, then M satisfies $\left(\frac{1}{2}\epsilon^2\right)$-zCDP. Furthermore, if M satisfies ρ-zCDP, then M is $\left(\rho + 2\sqrt{\rho\log(1/\delta)}, \delta\right)$-DP for any $\delta > 0$.
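    The conversions stated in Lemmas 2 and 3 can be computed directly; the sketch below (ours, with illustrative parameter values) packages them as helpers.

        import math

        def zcdp_of_gaussian(l2_sensitivity, sigma):
            """Lemma 2: Gaussian mechanism with variance sigma^2 is rho-zCDP, rho = Delta_2^2 / (2*sigma^2)."""
            return l2_sensitivity ** 2 / (2.0 * sigma ** 2)

        def zcdp_of_pure_dp(epsilon):
            """Lemma 3 (first part): epsilon-DP implies (epsilon^2 / 2)-zCDP."""
            return 0.5 * epsilon ** 2

        def dp_of_zcdp(rho, delta):
            """Lemma 3 (second part): rho-zCDP implies (rho + 2*sqrt(rho*log(1/delta)), delta)-DP."""
            return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta)), delta

        rho = zcdp_of_gaussian(l2_sensitivity=1.0, sigma=4.0)   # rho = 1/32
        print(dp_of_zcdp(rho, delta=1e-5))                      # approx (1.23, 1e-5)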

    In our security threat model, we consider honest-but-curious (semi-honest) data providers who wish to collaboratively train a model without exposing their individual inputs to other data owners. Data providers do not collude to tamper with the collaborative functionality or inject garbage input data, but they may attempt to passively infer the inputs of other data providers, depending on the implementation of the algorithm. We apply generic secure multiparty computation primitives to securely aggregate locally trained classifiers and their gradients. K. Owusu-Agyemang et al. [38] demonstrated the secure aggregation of multiple individual datasets and locally trained models. This method enables our model to offer secure, verifiable computation delegation for the aggregated locally trained models, thereby also helping the data owners to update their locally trained models with higher accuracy. In this paper, we use this method to achieve our secure multi-party model aggregation. For circumstances with a high risk of collusion, numerous secure multi-party protocols can be used to protect an individual honest data provider even if all other data owners are malicious. The focus of this paper is not to improve or evaluate the execution of secure multiparty computation, since our proposed algorithm can be implemented using well-known secure multiparty computation algorithms.

    We analyze our OP and GP methods with theoretical analyses of DP and the generalization error bound. The following ERM objective is considered:

    $R_D(\theta) = \frac{1}{n}\sum_{i=1}^{n}\ell(\theta, x_i, y_i) + \lambda N(\theta)$  (3.9)

    where $\ell(\theta, \cdot)$ is a convex loss function that is G-Lipschitz and L-smooth over $\theta \in \mathbb{R}^d$, and $N(\cdot)$ is the regularization term. We consider $R_D(\cdot)$ to be λ-strongly convex. Each individual data instance $(x_i, y_i) \in D$ lies in a unit ball. For an individual j with dataset $D_j$ of size $n_j$, we denote its data instances as $(x_i^{(j)}, y_i^{(j)})$.

    Given $\varphi = \{w_1, w_2, \dots, w_n\} \subseteq \mathbb{R}^n$ and a function $f: \mathbb{R}^n \to \mathbb{R}$ that implicitly depends on D, suppose we want to pick the point $w_i \in \varphi$ with the highest $f(w_i, D)$. An $\epsilon$-DP algorithm called the Report Noisy Max mechanism [36] injects independent statistical noise drawn from $\mathrm{Lap}(\Delta_1(f)/\epsilon)$ into each $f(w_i)$, for $i \in [n]$, and returns the index $i^*$ of the maximum noisy value, i.e.,

    $i^* = \arg\max_{j \in [n]}\{f(w_j) + \mathrm{Lap}(\Delta_1(f)/\epsilon)\}$

    with Lap(λ) denoting a Laplace distribution with zero mean and scale parameter λ, and [n] denoting the set $\{1, 2, 3, \dots, n\}$. Although Report Noisy Max is $\epsilon$-DP for arrays of any length and was originally designed to operate under pure $\epsilon$-DP, it has been affected by inaccurate privacy proofs in many previous proposals. To prove this stronger guarantee, it is essential to apply a more sophisticated coupling strategy [39]. To make [39] work with ρ-zCDP, the conversion result of Lemma 3 is applied; hence the $\epsilon$-differentially private protocol satisfies $\left(\frac{\epsilon^2}{2}\right)$-zCDP. Consequently, whenever we wish to use zCDP and assign a share ρ of the privacy budget to the coupling strategy, we invoke it with $\epsilon = \sqrt{2\rho}$.

    Algorithm 1: Coupling Strategy $(\varphi, \Delta_1(f), \epsilon)$
    1 Input: $\varphi$: a set of data entities, $\Delta_1(f)$: sensitivity of f, $\epsilon$: privacy budget for pure $\epsilon$-DP
    2 $\hat{\varphi} = \{\hat{\nu}_i = \nu_i + \mathrm{Lap}(\Delta_1(f)/\epsilon) : \nu_i \in \varphi, i \in [|\varphi|]\}$;
    3 Return: $\arg\max_{j \in [|\varphi|]}\hat{\nu}_j$
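    A minimal sketch of Algorithm 1 (ours, assuming NumPy): Laplace noise of scale $\Delta_1(f)/\epsilon$ is added to each candidate's score and the index of the noisy maximum is returned; the zCDP wrapper invokes it with $\epsilon = \sqrt{2\rho}$.

        import numpy as np

        def report_noisy_max(scores, l1_sensitivity, epsilon, rng=None):
            """Algorithm 1: add Lap(Delta_1 / epsilon) noise to each score, return the argmax index."""
            rng = np.random.default_rng() if rng is None else rng
            noisy = np.asarray(scores, dtype=float) + rng.laplace(
                0.0, l1_sensitivity / epsilon, size=len(scores))
            return int(np.argmax(noisy))

        def coupling_strategy(scores, l1_sensitivity, rho, rng=None):
            """Spend a rho share of a zCDP budget by invoking Report Noisy Max with epsilon = sqrt(2*rho)."""
            return report_noisy_max(scores, l1_sensitivity, np.sqrt(2.0 * rho), rng)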

    An algorithm for recycling gradient estimates that were not useful for a parameter update is applied in our main protocol. At iteration r we use a $\rho_r$-zCDP privacy budget for the estimation of the gradient. If $\Delta_2(\nabla\ell)$ is the $L_2$ sensitivity of the gradient of $\ell$, the measurement is $G_r = \nabla\ell(w_r) + N\!\left(0, \frac{\Delta_2(\nabla\ell)^2}{2\rho_r}\right)$. A larger share of the privacy budget $\rho_{r+1} > \rho_r$ is triggered if the accuracy is not sufficient, and another independent measurement is taken using the additional $\rho_{r+1} - \rho_r$ budget: $G'_r = \nabla\ell(w_r) + N\!\left(0, \frac{\Delta_2(\nabla\ell)^2}{2(\rho_{r+1} - \rho_r)}\right)$. The two measurements are then combined into the budget-weighted average $\hat{G}_r = \frac{\rho_r G_r + (\rho_{r+1} - \rho_r)G'_r}{\rho_r + (\rho_{r+1} - \rho_r)}$.

    The combined estimate is unbiased, $\mathbb{E}[\hat{G}_r] = \nabla\ell(w_r)$, with variance

    $\mathrm{Var}(\hat{G}_r) = \left(\rho_r^2\,\frac{\Delta_2(\nabla\ell)^2}{2\rho_r} + (\rho_{r+1} - \rho_r)^2\,\frac{\Delta_2(\nabla\ell)^2}{2(\rho_{r+1} - \rho_r)}\right)\Big/\rho_{r+1}^2 = \frac{\Delta_2(\nabla\ell)^2}{2\rho_{r+1}}$
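    The budget-weighted recycling of gradient estimates described above can be sketched as follows (ours; the true gradient, the sensitivity bound, and the RNG handling are illustrative assumptions):

        import numpy as np

        def noisy_gradient(grad, l2_sensitivity, rho, rng):
            """One zCDP measurement: grad + N(0, Delta_2^2 / (2*rho) I)."""
            sigma = l2_sensitivity / np.sqrt(2.0 * rho)
            return grad + rng.normal(0.0, sigma, size=grad.shape)

        def refine_gradient(grad, g_old, rho_old, rho_new, l2_sensitivity, rng):
            """Escalate the budget from rho_old to rho_new: take a fresh measurement with the
            extra (rho_new - rho_old) budget and combine, giving variance Delta_2^2 / (2*rho_new)."""
            g_extra = noisy_gradient(grad, l2_sensitivity, rho_new - rho_old, rng)
            return (rho_old * g_old + (rho_new - rho_old) * g_extra) / rho_new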

    In our proposed distributed machine learning protocol, we extend the differential privacy bound of [40] for the output perturbation algorithm, where adequate noise is injected to preserve the privacy of individual data owners and of the final output throughout the learning process. Our model aggregation with the output perturbation model is represented in Figure 2.

    Figure 2.  Workflow of our multi-scheme output perturbation method.

    Theorem 3. For our aggregated model estimator $\hat{\theta} = \frac{1}{m}\sum_{j=1}^{m}\hat{\theta}^{(j)}$, where $\hat{\theta}^{(j)} = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta)$, and the optimal model estimator $\theta^*$ learned from the centralized dataset, such that the data reside in a unit ball and $\ell(\cdot)$ is G-Lipschitz, we obtain:

    $\|\hat{\theta} - \theta^*\| \le \frac{G(m-1)}{n_{(1)}\lambda}$  (4.1)

    Proof. For an individual data owner Pj, the local model estimator is given as:

    $\hat{\theta}^{(j)} = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta) = \arg\min_\theta G(\theta)$

    Therefore the centralized model estimator is stated as:

    $\theta^* = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \sum_{l \ne j}\frac{1}{n_l}\sum_{i=1}^{n_l}\ell(\theta, x_i^{(l)}, y_i^{(l)}) + \lambda N(\theta) = \arg\min_\theta G(\theta) + g(\theta)$

    The values of g1 and G2 are represented as follows:

    $g_1 = \max_\theta\|\nabla g(\theta)\| = \max_\theta\left\|\sum_{l \ne j}\frac{1}{n_l}\sum_{i=1}^{n_l}\nabla\ell(\theta, x_i^{(l)}, y_i^{(l)})\right\| \le G\sum_{l \ne j}\frac{1}{n_l}$

    $G_2 = \min_v\min_\theta v^\top\nabla^2 G(\theta)\,v = \min_v\min_\theta v^\top\left(\frac{1}{n_j}\sum_{i=1}^{n_j}\nabla^2\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda I\right)v \ge \lambda$

    Applying Lemma 1 of Chaudhuri and Monteleoni [40], we obtain $\|\hat{\theta}^{(j)} - \theta^*\| \le \frac{G}{\lambda}\sum_{l \ne j}\frac{1}{n_l}$. Applying the triangle inequality, we have:

    $\|\hat{\theta} - \theta^*\| \le \frac{1}{m}\sum_j\|\hat{\theta}^{(j)} - \theta^*\| \le \frac{G}{m\lambda}\sum_j\sum_{l \ne j}\frac{1}{n_l} = \frac{G(m-1)}{m\lambda}\sum_j\frac{1}{n_j} \le \frac{G(m-1)}{n_{(1)}\lambda}$

    Minimizing the local objective function $R_{D_j}(\theta) = \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i, y_i) + \lambda N(\theta)$, we obtain $\hat{\theta}^{(j)}$ as the corresponding local model estimator.

    $\hat{\theta}_{priv} = \frac{1}{k}\sum_{j=1}^{k}\hat{\theta}^{(j)} + \eta$  (4.2)

    is the perturbed aggregate model estimator, where η is the Laplace noise injected into the aggregated model estimator to obtain DP. We adopt the secure model aggregation of K. Owusu-Agyemang et al. [38] for our secure MPC model. The theorem below provides a bound on the quantity of noise required to attain DP.

    Theorem 4. If $\hat{\theta}_{priv} = \frac{1}{k}\sum_{j=1}^{k}\hat{\theta}^{(j)} + \eta$ is the perturbed aggregate model estimator, where $\hat{\theta}^{(j)} = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta)$, the data reside in a unit ball, and $\ell(\cdot)$ is G-Lipschitz, then $\hat{\theta}_{priv}$ is $\epsilon$-differentially private if $\eta \sim \mathrm{Lap}\!\left(\frac{2G}{k\,n_{(1)}\lambda\epsilon}\right)$,

    where $n_{(1)}$ represents the size of the smallest dataset among the k data owners, λ is the regularization parameter, and $\epsilon$ is the DP budget.

    Proof. Since the neighboring datasets differ in one record belonging to a single data owner j, we have, formally:

    $\frac{\Pr(\hat{\theta}\mid D)}{\Pr(\hat{\theta}\mid D')} = \frac{\Pr\!\left(\frac{1}{k}\sum_{i \ne j}\hat{\theta}^{(i)} + \frac{1}{k}\hat{\theta}^{(j)} + \eta \mid D\right)}{\Pr\!\left(\frac{1}{k}\sum_{i \ne j}\hat{\theta}^{(i)} + \frac{1}{k}\hat{\theta}'^{(j)} + \eta \mid D'\right)} = \frac{\exp\!\left[\frac{k\,n_{(1)}\epsilon\lambda}{2G}\left\|\frac{\hat{\theta}^{(j)}}{k}\right\|\right]}{\exp\!\left[\frac{k\,n_{(1)}\epsilon\lambda}{2G}\left\|\frac{\hat{\theta}'^{(j)}}{k}\right\|\right]} \le \exp\!\left[\frac{n_{(1)}\epsilon\lambda}{2G}\left\|\hat{\theta}^{(j)} - \hat{\theta}'^{(j)}\right\|\right] \le \exp\!\left[\frac{n_{(1)}\epsilon\lambda}{2G}\cdot\frac{2G}{n_j\lambda}\right] \le \exp(\epsilon)$

    Bounds on the excess empirical risk and the true risk are derived similarly to [12] and [13], but our bounds are tighter than both since we require far less DP noise.
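    A sketch of the MS-OP aggregation step of Eq (4.2) with the noise scale of Theorem 4 is shown below (ours); in the actual protocol the averaging and the sampling of η happen inside the secure MPC, which is abstracted away here.

        import numpy as np

        def ms_op_aggregate(local_models, n_min, lam, epsilon, lipschitz_G=1.0, rng=None):
            """Average the k local ERM estimators and add per-coordinate Laplace noise
            Lap(2G / (k * n_min * lam * epsilon)), as in Theorem 4."""
            rng = np.random.default_rng() if rng is None else rng
            models = np.asarray(local_models, dtype=float)
            k = models.shape[0]
            theta_avg = models.mean(axis=0)
            scale = 2.0 * lipschitz_G / (k * n_min * lam * epsilon)
            return theta_avg + rng.laplace(0.0, scale, size=theta_avg.shape)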

    Theorem 5. If $\hat{\theta}_{priv} = \frac{1}{m}\sum_{j=1}^{m}\hat{\theta}^{(j)} + \eta$ is a perturbed aggregated model estimator, where $\hat{\theta}^{(j)} = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta)$, and $\theta^*$ is an optimal model estimator learned from the centralized data, such that the data reside in a unit ball and $\ell(\cdot)$ is G-Lipschitz and L-smooth, then the bound on the excess empirical risk is given as:

    $R(\hat{\theta}_{priv}) \le R(\theta^*) + \frac{A_1 G^2(\lambda + L)}{n_{(1)}^2\lambda^2}\left(m^2 + \frac{d^2\log^2(d/\delta)}{m^2\epsilon^2} + \frac{d\log(d/\delta)}{\epsilon}\right)$

    where A1 is an absolute constant.

    Proof. Applying a Taylor expansion, we obtain:

    $R(\hat{\theta}_{priv}) = R(\theta^*) + (\hat{\theta}_{priv} - \theta^*)^\top\nabla R(\theta^*) + \frac{1}{2}(\hat{\theta}_{priv} - \theta^*)^\top\nabla^2 R(\tilde{\theta})(\hat{\theta}_{priv} - \theta^*)$

    where $\tilde{\theta} = \alpha\hat{\theta}_{priv} + (1 - \alpha)\theta^*$ for some $\alpha \in [0, 1]$. By definition, $\nabla R(\theta^*) = 0$; thus we get

    $R(\hat{\theta}_{priv}) - R(\theta^*) \le \frac{1}{2}\|\hat{\theta}_{priv} - \theta^*\|_2^2\,\|\nabla^2 R(\tilde{\theta})\|$

    Since $\ell(\cdot)$ is L-smooth, we have $\|\nabla^2 R(\tilde{\theta})\| \le \lambda + L$; therefore

    $R(\hat{\theta}_{priv}) \le R(\theta^*) + \frac{\lambda + L}{2}\|\hat{\theta} - \theta^* + \eta\|^2 \le R(\theta^*) + \frac{\lambda + L}{2}\left[\|\hat{\theta} - \theta^*\|^2 + \|\eta\|^2 + 2(\hat{\theta} - \theta^*)^\top\eta\right] \le R(\theta^*) + \frac{\lambda + L}{2}\left[\|\hat{\theta} - \theta^*\|^2 + \|\eta\|^2 + 2\|\hat{\theta} - \theta^*\|\,\|\eta\|\right]$

    Theorem 6. If $\hat{\theta}_{priv} = \frac{1}{k}\sum_{j=1}^{k}\hat{\theta}^{(j)} + \eta$ is a perturbed aggregated model estimator, where $\hat{\theta}^{(j)} = \arg\min_\theta \frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta)$, and $\theta^*$ is an optimal model estimator trained on the centralized data, such that the data reside in a unit ball and $\ell(\cdot)$ is G-Lipschitz and L-smooth, then the following bound on the excess true risk holds with probability at least $1 - \gamma$:

    $\mathbb{E}[\tilde{R}(\hat{\theta}_{priv})] - \min_\theta\tilde{R}(\theta) \le \frac{C_1 G^2(\lambda + L)}{n_{(1)}^2\lambda^2}\left(m^2 + \frac{d^2\log^2(d/\delta)}{m^2\epsilon^2} + \frac{d\log(d/\delta)}{\epsilon}\right) + \frac{C_2 G^2\log(1/\gamma)}{\lambda n}$

    where n is the size of the centralized dataset, $\tilde{R}(\theta) = \mathbb{E}_{x,y}[\ell(\theta, x, y)] + \lambda N(\theta)$, $C_1$ and $C_2$ are absolute constants, and the expectation is taken with respect to the noise η.

    We now consider private empirical risk minimization with a per-iteration privacy budget for k entities, each possessing an individual dataset $D_j$ of $n_j$ independent observations:

    $R_D(\theta) = \min_\theta\frac{1}{m}\sum_{j=1}^{m}\frac{1}{n_j}\sum_{i=1}^{n_j}\ell(\theta, x_i^{(j)}, y_i^{(j)}) + \lambda N(\theta)$  (4.3)

    Data owners can collaboratively train a differentially private classifier by adopting a per-iteration privacy budget and adding noise to the aggregated gradient inside secure MPC, so that each iteration makes progress towards an optimal solution. It is important to note that the regularization term in Eq (4.3) has no privacy implications, since it is independent of the datasets. Our adaptive iterative gradient perturbation method is represented in Figure 3.

    Figure 3.  Gradient perturbation workflow.

    Theorem 7. Let $\theta^T$ be a distributed model estimator attained by minimizing $R_D(\theta)$ through T iterations of our gradient descent method, executed collaboratively by k participants, each possessing a dataset $D_j$ of size $n_j$ whose data instances reside in a unit ball, where $\ell(\theta)$ is G-Lipschitz and L-smooth, the learning rate is $\frac{1}{L}$, and the gradients are perturbed with statistical noise using the Gaussian mechanism with variance $\sigma^2$, i.e., $z \sim N(0, \sigma^2 I)$. Then $\theta^T$ is $(\epsilon_{tot}, \delta_{tot})$-differentially private if:

    $\sigma^2 = \frac{8G^2\,T\log(1/\delta)}{m^2 n_{(1)}^2\epsilon^2}$  (4.4)

    where $n_{(1)}$ is the smallest dataset size among the k individual participants.

    Proof. The noisy gradient at step t is

    $\hat{G}_t = \nabla J(w, D) + N(0, \sigma^2 I_p) = \frac{1}{m}\sum_{j=1}^{m}\frac{1}{n_j}\sum_{i=1}^{n_j}\nabla\ell(w, x_i^{(j)}, y_i^{(j)}) + N(0, \sigma^2 I_p)$

    The magnitude of the Gaussian noise $\sigma^2$ depends on the largest influence an individual can have on $g_t$, which is measured by $\Delta_2(g)$. To bound $\Delta_2(g)$, we apply the gradient clipping method of [41]: we compute the gradient $\nabla\ell(w, x_i^{(j)}, y_i^{(j)})$ for $i = 1, 2, \dots, n$, clip each gradient in $L_2$-norm by dividing it by $\max\!\left(1, \frac{\|\nabla\ell(w, x_i^{(j)}, y_i^{(j)})\|_2}{A_g}\right)$, compute the sum, add Gaussian noise with variance $\frac{A_g^2}{2\rho_{gm}}$, and eventually normalize the result to a unit norm. This ensures that the $L_2$-sensitivity of the gradient is bounded by $A_g$, so that $\rho_{gm}$-zCDP is satisfied by Lemma 2 (a sketch of this clipping step is given below). In the secure MPC learning domain, since the algorithm cannot rely on a guarantee that holds only in expectation (i.e., $\mathbb{E}[\nabla\ell(w_t; d_{i_t}) \mid w_t] = \nabla f(w_t)$), it is important to spend the per-iteration budget efficiently. We therefore deploy the privacy budget by testing whether a given noisy gradient estimate $\hat{g}_t$ is a descent direction, using a portion $\rho_{nmax}$ of the privacy budget. Our algorithm then constructs $\Omega = \{f(w_t - \alpha\hat{g}_t) : \alpha \in \Phi\}$, where Ω represents the objective values evaluated at $\{w_t - \alpha\hat{g}_t\}$ and Φ is the set of pre-defined step sizes, and decides which step size yields the minimum objective value using the coupling strategy. We bound the sensitivity of f by applying the gradient clipping concept to the objective function: with a fixed clipping threshold $A_b$, we compute $\ell(w_t; x_i, y_i)$, clip the values greater than $A_b$, and take the summation of the clipped values.
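    The clipping step described above can be sketched as follows (ours; the array layout is an assumption, the final unit-norm normalization follows the description in the text, and the helper name is hypothetical):

        import numpy as np

        def clipped_noisy_gradient(per_example_grads, clip_A, rho, rng=None):
            """Clip each per-example gradient to L2 norm <= clip_A, sum, add Gaussian noise with
            variance clip_A^2 / (2*rho) (Lemma 2), then normalize the result to a unit norm."""
            rng = np.random.default_rng() if rng is None else rng
            grads = np.asarray(per_example_grads, dtype=float)
            norms = np.linalg.norm(grads, axis=1, keepdims=True)
            clipped = grads / np.maximum(1.0, norms / clip_A)
            noisy = clipped.sum(axis=0) + rng.normal(
                0.0, clip_A / np.sqrt(2.0 * rho), size=grads.shape[1])
            return noisy / max(np.linalg.norm(noisy), 1e-12)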

    If the coupling strategy judges $\hat{g}_t$ to be a bad direction, our protocol escalates the privacy budget for the noisy gradient estimate $\rho_{gn}$ by a factor of $(1 + \gamma)$. We then utilize the gradient averaging method described above: a fresh measurement is taken with the additional budget (the difference between the new budget $\rho_{gn}$ and the budget already spent) and combined with the previous estimate to obtain the current gradient. The coupling strategy is applied to verify the new direction again, and this is repeated until it returns a non-zero index, i.e., a descent direction is attained.

    Our proposed iterative gradient perturbation-based scheme composes the coupling strategy, invoked with $\epsilon = \sqrt{2\rho}$ so that it satisfies $\left(\frac{\epsilon^2}{2}\right)$-zCDP, with the Gaussian mechanism, which directly provides zCDP. Algorithm 2 demonstrates the details of each step of our proposed gradient perturbation-based algorithm; a rough sketch of its main loop follows.
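    Since Algorithm 2 itself is not reproduced here, the following rough sketch (ours, under several simplifying assumptions: a single-party view, pre-clipped gradients with known sensitivity bounds, and hypothetical budget-splitting constants) outlines its main loop: measure a noisy gradient, test candidate step sizes with the coupling strategy, escalate the gradient budget when no candidate improves the objective, and stop when the total zCDP budget ρ is spent.

        import numpy as np

        def ms_gp_sketch(objective, grad_fn, w0, rho_total, step_sizes, grad_sens, obj_sens,
                         rho_g0=1e-3, rho_sel=1e-3, gamma=0.5, seed=0):
            """Hypothetical outline of the adaptive iterative gradient perturbation loop.
            grad_fn(w) returns an (already clipped and averaged) NumPy gradient vector; grad_sens
            and obj_sens are assumed sensitivity bounds for the gradient and the objective scores."""
            rng = np.random.default_rng(seed)

            def noisy(g, rho):
                return g + rng.normal(0.0, grad_sens / np.sqrt(2.0 * rho), size=g.shape)

            w, rho_left = np.array(w0, dtype=float), rho_total
            while rho_left > rho_g0 + rho_sel:
                rho_g = rho_g0
                g_hat = noisy(grad_fn(w), rho_g)
                rho_left -= rho_g
                while True:
                    # Candidate objective values; index 0 encodes "no candidate improves the objective".
                    scores = [-objective(w)] + [-objective(w - a * g_hat) for a in step_sizes]
                    noisy_scores = np.asarray(scores) + rng.laplace(
                        0.0, obj_sens / np.sqrt(2.0 * rho_sel), size=len(scores))
                    rho_left -= rho_sel
                    idx = int(np.argmax(noisy_scores))
                    if idx > 0:                       # a descent direction was found
                        w = w - step_sizes[idx - 1] * g_hat
                        break
                    extra = rho_g * gamma             # escalate the gradient budget by (1 + gamma)
                    if rho_left < extra + rho_sel:    # not enough budget left to refine and re-test
                        break
                    g_hat = (rho_g * g_hat + extra * noisy(grad_fn(w), extra)) / (rho_g + extra)
                    rho_left -= extra
                    rho_g += extra
            return w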

    Theorem 8. We apply the conversion tools provided by Lemmas 2 and 3 so that we can compare against other algorithms that use approximate $(\epsilon,\delta)$-DP. Given a total privacy budget of $(\epsilon_T, \delta_T)$-DP and applying Lemma 3, we obtain the following inequality for ρ:

    $\epsilon_T \ge \rho + 2\sqrt{\rho\log(1/\delta_T)}$  (4.5)

    With the overall zCDP privacy budget ρ, our protocol dynamically calculates and subtracts the quantity of privacy budget needed whenever it requires access to the datasets at run-time. This guarantees that the complete execution of our protocol satisfies ρ-zCDP. SGD methods with constant step sizes, in general, cannot guarantee convergence to the optimum, no matter how strongly convex the objective function is. For this reason, the variance in the private gradient estimates must be managed. To assure convergence, stochastic optimization methods typically require the step sizes to satisfy $\sum_t\alpha_t^2 < \infty$ and $\sum_t\alpha_t = \infty$ [42]. This allows the variance of the updates to diminish progressively as the iterates approach the optimum. Consequently, we monitor the step sizes chosen by the coupling strategy and adaptively manage the range of step sizes in Φ. We initialize Φ with n evenly spaced points between 0 and $\alpha_{max}$. An update $\alpha_{max} = (1 + \eta)\max(\alpha_t, \alpha_{t-1}, \dots, \alpha_{t-\tau+1})$ is performed every τ iterations, with $\alpha_t$ denoting the step size chosen at iteration t. Our algorithm is empirically observed to adaptively alter the range of step sizes depending on how close the current iterate is to the optimum. The validity of our proposed scheme ultimately rests on the ρ-zCDP composition, which accounts for the privacy cost of each primitive.

    Theorem 9. Algorithm 2 satisfies ρ-zCDP and $(\epsilon_T, \delta_T)$-DP.

    Proof. Suppose only $(\epsilon_T, \delta_T)$ is supplied and no ρ-zCDP privacy budget is given to our algorithm. Line 2 then derives the corresponding value of ρ such that ρ-zCDP also satisfies the weaker $(\epsilon_T, \delta_T)$ guarantee. The operations that consume privacy budget are Line 7, which measures the noisy gradient; Line 12, which performs the coupling strategy for noisy max; and Line 18, which averages the gradient. Furthermore, Line 3 guarantees that the remaining privacy budget is always above 0, while Line 14 ensures that the privacy budget is never exceeded. This is critical to our algorithm, since the weights are the output that is visible outside the protocol. Our algorithm therefore satisfies ρ-zCDP, as every operation spends only its allocated share of the privacy budget.

    We validate the performance of our proposed algorithm on regression and classification tasks. Following the method of Wang et al. [43], we pre-process the datasets. For the classification task, we apply a regularized logistic regression classifier to three (3) real-world datasets: Adult [44], CICIDS2018 [45], and CICMalDroid2020 [46]. Table 2 summarizes the datasets explored in this investigation.

    Table 2.  Summary of the datasets.
    Dataset Size (n) Dimension
    Adult [44] 48,842 124
    CICIDS2018 [45] 16,000,000 125
    CICMalDroid2020 [46] 17,341 470


    For the CICIDS2018 dataset, we predict whether a connection is a denial-of-service (DoS) attack or not. We randomly sample 70,000 records and split them into a training set of 50,000 records and a test set of 20,000 records, with $x_i \in \mathbb{R}^{p+1}$, $y_i \in \{-1, +1\}$, and regularization coefficient $\lambda > 0$. Throughout our simulations we set the coefficient λ = 0.001, learning rate η = 1, failure probability δ = 0.001, privacy budget ϵ = 0.05, G-Lipschitz constant G = 1, and total number of iterations T = 1000 for gradient descent. We compare our proposed output perturbation and gradient perturbation-based algorithms with other benchmarks with respect to the optimality gap and the relative accuracy loss. In the case of regression, the relative accuracy loss is defined as the mean squared error (MSE) between $\hat{\theta}$ and $\theta^*$, i.e., the difference in accuracy over the test data.

    We compare our proposed multi-party computation output perturbation-based model aggregation (MS-OP) with Pathak et al. [12], denoted PAT. Further cutting-edge differential privacy benchmarks are obtained by applying the objective perturbation and output perturbation techniques of Wang et al. [43] to each locally trained model estimator $\hat{\theta}^{(j)}$ to obtain a differentially private local model estimator; the classifiers are then aggregated to obtain a differentially private aggregated classifier $\hat{\theta}_{priv}$ with confidence intervals for the model parameters. For the experiments on our proposed adaptive iterative gradient perturbation-based learning algorithm, we adopt as a benchmark the aggregation of locally perturbed gradients [23], improving the noise bound by applying zCDP and using the coupling strategy [39] to verify the privacy budget. Our multi-party output perturbation-based and adaptive iterative gradient perturbation-based frameworks are denoted MS-OP and MS-GP, respectively (the latter implementing Algorithm 2). In all simulations we vary the number of data owners p from 100 to 1000, and additionally consider up to 50,000 data owners each possessing only one data instance.

    Result on the Adult Dataset: The Adult [44] dataset contains demographic information on approximately 47,000 individuals; the task is to predict whether a data owner's annual income exceeds the $50,000 threshold. During pre-processing we obtained 104 features per record and removed records with missing values, yielding 45,222 records, of which 30,000 form the training set and the remainder are used for testing. In our simulation on the Adult dataset, our proposed MS-OP and MS-GP methods outperform the benchmarks with respect to both accuracy loss and optimality gap, as shown in Figure 4.

    Figure 4.  Relative accuracy loss and optimality gap comparison on Adult (p = 1000). All models have a privacy budget of ϵ = 0.05 for each iteration.

    Result on the CICMalDroid2020 Dataset: As the number of data owners p grows and the size of each local dataset decreases, the relative accuracy of all models except MS-GP begins to decrease, as shown in Figure 5. The performance of the benchmark algorithms degrades mainly because of the large amount of statistical noise injected into the classifier. The performance of MS-OP also deteriorates as the local dataset shrinks, owing to the information loss from partitioning the dataset, which has been one of the challenges of aggregating locally trained classifiers.

    Figure 5.  Relative accuracy loss and optimality gap comparison on CICMalDroid2020 (p = 1000). All models have a privacy budget of ϵ = 0.05 for each iteration.

    Result on the CICIDS2018 Dataset: As the number of data owners p increases and the size of each local dataset correspondingly shrinks, the performance of all methods except MS-GP decreases greatly (Figure 6). Although the performance of MS-OP is also reduced as the local dataset shrinks, it continues to outperform the existing model aggregation benchmarks. We observe that at p = 1000 the performance of W-LObjP is weaker than that of W-LOP, owing to the perturbation of the objective function in W-LObjP. It is also important to note that the utility of PAT is greatly affected by the huge amount of statistical noise injected into the model, causing its plot to fall out of range for p = 1000 (Figure 6).

    Figure 6.  Relative accuracy loss and optimality gap comparison on CICIDS2018 (p = 1000). All models have a privacy budget of ϵ = 0.05 for each iteration.

    Table 3 summarizes the amount of noise each algorithm requires to preserve differential privacy. As shown in Table 3, MS-GP adds the lowest amount of statistical noise to the model. Although W-LOP injects noise in a similar range to our proposed MS-GP, it applies the statistical noise in a profoundly different way: the existing algorithms either inject the sampled statistical noise into the optimal non-private model through output perturbation or gradient perturbation to minimize the objective function J(θ), or optimize an entirely different objective function $R'(\theta) = R(\theta) + \mathrm{Lap}\!\left(\frac{2G}{p_{(1)}\epsilon}\right)$, which explains the increase in the optimality gap as the smallest dataset size $p_{(1)}$ decreases.

    Table 3.  Noise magnitudes compared with diverse multi-party DP algorithms (L = Laplace, N = Gaussian).
    Algorithm: PAT | W-LOP | W-LObjP | W-LGP | MS-OP | MS-GP
    Noise scale: $L\!\left(\frac{2G}{p_{(1)}\lambda\epsilon}\right)$ | $L\!\left(\frac{2G}{m\,p_{(1)}\lambda\epsilon}\right)$ | $L\!\left(\frac{2G}{p_{(1)}\epsilon}\right)$ | $N\!\left(\frac{2TG}{m\,p_{(1)}\epsilon}\right)$ | $L\!\left(\frac{2G}{m\,p_{(1)}\lambda\epsilon}\right)$ | $N\!\left(\frac{2TG}{m\,p_{(1)}\epsilon}\right)$
    Generated noise magnitude (m = 100, $p_{(1)}$ = 500, λ = 0.01, ϵ = 0.5, G = 1, T = 100): 769 × 10⁻³ | 80.0 × 10⁻³ | 7.98 × 10⁻³ | 5.66 × 10⁻³ | 7.98 × 10⁻³ | 0.46 × 10⁻³
    Standard deviation over 1000 samples of the generated noise: 1149 × 10⁻³ | 111 × 10⁻³ | 10.99 × 10⁻³ | 4.98 × 10⁻³ | 11.90 × 10⁻³ | 0.591 × 10⁻³


    Variations in Parameters for the Simulation: It is important to note that all of the real-world datasets used in the experiments contain a variety of numerical and categorical features. We apply some of the most common pre-processing algorithms in machine learning, such as converting all categorical features into collections of binary variables by creating one binary variable for each distinct class, and re-scaling all numerical features into the range [0, 1] so that all features have the same scale. To meet the specification, we normalize each observation to a unit norm (i.e., $\|x_i\|_2 = 1$ for $i = 1, 2, \dots, n$). We compare our approach to other standard methods on real-world datasets to demonstrate its efficacy, using the regularized logistic regression objective

    $\min_w\frac{1}{n}\sum_{i=1}^{n}\log\!\left(1 + \exp(-y_i w^\top x_i)\right) + \frac{\lambda}{2}\|w\|_2^2$

    Specifically, we consider (regularized) logistic regression on the three (3) real-world datasets. We validate the minimization error $\mathbb{E}\,F(w_{priv}, S) - F(\hat{w}, S)$ and the running time of our algorithms for different $\epsilon \in \{0.05, 0.5, 0.1\}$ and δ = 0.001 (see Table 4 for details).
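    The reported minimization error can be computed with the short sketch below (ours); `w_priv` would be the output of MS-GP or of the baseline of [13], and `w_hat` the non-private minimizer, both of which we treat as given here.

        import numpy as np

        def logreg_objective(w, X, y, lam=0.001):
            """Regularized logistic regression objective F(w, S) used in the experiments."""
            margins = -y * (X @ w)
            return float(np.mean(np.log1p(np.exp(margins))) + 0.5 * lam * np.dot(w, w))

        def minimization_error(w_priv, w_hat, X, y, lam=0.001):
            """F(w_priv, S) - F(w_hat, S); average over repeated private runs to estimate the expectation."""
            return logreg_objective(w_priv, X, y, lam) - logreg_objective(w_hat, X, y, lam)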

    Table 4.  Simulation results: minimization error and runtime for MS-GP (ϵ,δ) versus [13] (ϵ,δ), with δ = 0.001 throughout.
    Dataset | ϵ | Error: MS-GP | Error: [13] | Runtime: MS-GP | Runtime: [13]
    Adult | 0.05 | 0.0912 | 0.3692 | 10.896 | 202.12
    Adult | 0.5 | 0.0212 | 0.6082 | 68.504 | 253.72
    Adult | 0.1 | 0.0429 | 0.6227 | 22.814 | 255.12
    CICMalDroid2020 | 0.05 | 0.1714 | 1.3545 | 10.214 | 532.12
    CICMalDroid2020 | 0.5 | 0.2231 | 1.4585 | 36.796 | 519.33
    CICMalDroid2020 | 0.1 | 0.3983 | 2.2552 | 12.613 | 518.67
    CICIDS2018 | 0.05 | 0.0092 | 0.3039 | 11.036 | 185.32
    CICIDS2018 | 0.5 | 0.0101 | 0.4162 | 13.893 | 190.87
    CICIDS2018 | 0.1 | 0.0292 | 0.4202 | 4.977 | 190.32


    In this work, we have shown that the amount of statistical noise required in a distributed learning domain can be appreciably reduced when the noise is generated and injected into the model within a secure multi-party computation. Our proposed multi-scheme output perturbation (MS-OP) and gradient perturbation (MS-GP) algorithms achieve ϵ-DP and (ϵ,δ)-differential privacy, respectively. Since the aggregation of the locally trained models requires only a single secure aggregation, the approach remains very efficient. Our MS-GP method also improves the optimization procedure in this multi-party differential privacy setting by adaptively determining the privacy budget at each iteration depending on the utility of the privacy-preserving statistics. In future work, we will explore the potential of our framework in other domains.

    This work was supported in part by the National Natural Science Foundation of China (No.61672135), the Frontier Science and Technology Innovation Projects of National Key R & D Program (No.2019QY1405), the Sichuan Science and Technology Innovation Platform and Talent Plan (No.20JCQN0256), and the fundamental Research Funds for the Central Universities (No.2672018ZYGX2018J057).

All authors declare that there are no conflicts of interest in this paper.



    [1] X. Chen, L. Yu, T. Wang, A. Liu, X. Wu, B. Zhang, et al., Artificial intelligence-empowered path selection: A survey of ant colony optimization for static and mobile sensor networks, IEEE Access, 8 (2020), 71497–71511.
    [2] M. A. R. Ahad, A. D. Antar, M. Ahmed, IoT Sensor-Based Activity Recognition - Human Activity Recognition, Intelligent Systems Reference Library, Springer, 173 (2021).
    [3] J. Lin, G. Srivastava, Y. Zhang, Y. Djenouri, M. Aloqaily, Privacy-preserving multiobjective sanitization model in 6G IoT environments, IEEE Int. Things J., 8 (2021), 5340–5349. doi: 10.1109/JIOT.2020.3032896
    [4] C. Iwendi, S. A. Moqurrab, A. Anjum, S. Khan, S. Mohan, G. Srivastava, N-sanitization: A semantic privacy-preserving framework for unstructured medical datasets, Comput. Commun., 161 (2020), 160–171. doi: 10.1016/j.comcom.2020.07.032
    [5] C. Dwork, F. McSherry, K. Nissim, A. D. Smith, Calibrating noise to sensitivity in private data analysis, J. Priv. Confidentiality, 7 (2016), 17–51.
    [6] J. Du, F. Bian, A privacy-preserving and efficient k-nearest neighbor query and classification scheme based on k-dimensional tree for outsourced data, IEEE Access, 8 (2020), 69333–69345.
    [7] J. Liu, Y. Tian, Y. Zhou, Y. Xiao, N. Ansari, Privacy preserving distributed data mining based on secure multi-party computation, Comput. Commun., 153 (2020), 208–216. doi: 10.1016/j.comcom.2020.02.014
    [8] C. Gentry, Fully homomorphic encryption using ideal lattices, in Proceedings of the 41st Annual ACM Symposium on Theory of Computing, ACM, (2009), 169–178.
    [9] H. K. Bhuyan, N. K. Kamila, Privacy preserving sub-feature selection in distributed data mining, Appl. Soft Comput., 36 (2015), 552–569. doi: 10.1016/j.asoc.2015.06.060
    [10] A. Gascón, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, et al., Privacy-preserving distributed linear regression on high-dimensional data, PoPETs, 2017 (2017), 345–364.
    [11] K. Chaudhuri, C. Monteleoni, A. D. Sarwate, Differentially private empirical risk minimization, J. Mach. Learn. Res., 12 (2011), 1069–1109.
    [12] M. A. Pathak, S. Rane, B. Raj, Multiparty differential privacy via aggregation of locally trained classifiers, in NIPS, (2010), 1876–1884.
[13] B. Jayaraman, L. Wang, D. Evans, Q. Gu, Distributed learning without distress: Privacy-preserving empirical risk minimization, Adv. Neural Inf. Proc. Syst., (2018), 6346–6357.
    [14] L. Tian, B. Jayaraman, Q. Gu, D. Evans, Aggregating private sparse learning models using multi-party computation, in NIPS Workshop on Private Multi-Party Machine Learning, 2016.
    [15] M. Bun, T. Steinke, Concentrated differential privacy: Simplifications, extensions, and lower bounds, in Theory of Cryptography Conference, Springer, Berlin, Heidelberg, (2016), 635–658.
    [16] Y. Chen, Y. Mao, H. Liang, S. Yu, Y. Wei, S. Leng, Data poison detection schemes for distributed machine learning, IEEE Access, 8 (2020), 7442–7454.
    [17] E. Alsuwat, H. Alsuwat, M. Valtorta, C. Farkas, Adversarial data poisoning attacks against the PC learning algorithm, Int. J. Gen. Syst., 49 (2020), 3–31. doi: 10.1080/03081079.2019.1630401
    [18] M. Aliasgari, M. Blanton, F. Bayatbabolghani, Secure computation of hidden markov models and secure floating-point arithmetic in the malicious model, Int. J. Inf. Sec., 16 (2017), 577–601. doi: 10.1007/s10207-016-0350-0
    [19] O. Catrina, A. Saxena, Secure computation with fixed-point numbers, in International Conference on Financial Cryptography and Data Security, Springer, Berlin, Heidelberg, (2010), 35–50.
    [20] A. C. Yao, Protocols for secure computations, in 23rd annual symposium on foundations of computer science (sfcs 1982), IEEE, (1982), 160–164.
    [21] O. Goldreich, Secure multi-party computation, Manuscr. Prelim. Version, 78 (1998).
    [22] I. Damgård, C. Orlandi, Multiparty computation for dishonest majority: From passive to active security at low cost, in Annual cryptology conference, Springer, Berlin, Heidelberg, (2010), 558–576.
    [23] R. Shokri, V. Shmatikov, Privacy-preserving deep learning, in Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, (2015), 1310–1321.
[24] R. Bendlin, I. Damgård, C. Orlandi, S. Zakarias, Semi-homomorphic encryption and multiparty computation, in Advances in Cryptology - EUROCRYPT 2011, Springer, Berlin, Heidelberg, (2011).
    [25] J. B. Nielsen, P. S. Nordholt, C. Orlandi, S. S. Burra, A new approach to practical active-secure two-party computation, in Annual Cryptology Conference, Springer, Berlin, Heidelberg, (2012), 681–700.
    [26] A. Bansal, T. Chen, S. Zhong, Privacy preserving back-propagation neural network learning over arbitrarily partitioned data, Neural Comput. Appl., 20 (2011), 143–150. doi: 10.1007/s00521-010-0346-z
    [27] J. Yuan, S. Yu, Privacy preserving back-propagation neural network learning made practical with cloud computing, IEEE Trans. Parallel Distrib. Syst., 25 (2014), 212–221.
    [28] W. Zhang, A BGN-type multiuser homomorphic encryption scheme, in 2015 International Conference on Intelligent Networking and Collaborative Systems, IEEE, (2015), 268–271.
    [29] E. Hesamifard, H. Takabi, M. Ghasemi, C. Jones, Privacy-preserving machine learning in cloud, in Proceedings of the 2017 on cloud computing security workshop, (2017), 39–43.
    [30] P. Li, J. Li, Z. Huang, T. Li, C. Gao, S. Yiu, et al., Multi-key privacy-preserving deep learning in cloud computing, Future Gener. Comput. Syst., 74 (2017), 76–85. doi: 10.1016/j.future.2017.02.006
    [31] P. Mukherjee, D. Wichs, Two round multiparty computation via multi-key FHE, in Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, Berlin, Heidelberg, (2016), 735–763.
    [32] R. Agrawal, R. Srikant, Privacy-preserving data mining, in Proceedings of the 2000 ACM SIGMOD international conference on Management of data, (2000), 439–450.
    [33] P. K. Fong, J. H. Weber-Jahnke, Privacy preserving decision tree learning using unrealized data sets, IEEE Trans. Knowl. Data Eng., 24 (2012), 353–364.
    [34] J. M. Abowd, The US census bureau adopts differential privacy, in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2018), 2867–2867.
    [35] F. Liu, Generalized gaussian mechanism for differential privacy, IEEE Trans. Knowl. Data Eng., 31 (2019), 747–756.
    [36] C. Dwork, A. Roth, The algorithmic foundations of differential privacy, Found. Trends Theor. Comput. Sci., 9 (2014), 211–407.
    [37] C. Dwork, G. N. Rothblum, S. P. Vadhan, Boosting and differential privacy, in 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, IEEE, (2010), 51–60.
    [38] O. Kwabena, Z. Qin, T. Zhuang, Z. Qin, Mscryptonet: Multi-scheme privacy-preserving deep learning in cloud computing, IEEE Access, 7 (2019), 29344–29354.
    [39] A. Albarghouthi, J. Hsu, Synthesizing coupling proofs of differential privacy, Proc. ACM Program. Lang., 2 (2017), 1–30.
    [40] K. Chaudhuri, C. Monteleoni, Privacy-preserving logistic regression, in NIPS, 8 (2008), 289–296.
    [41] J. Zhang, T. He, S. Sra, A. Jadbabaie, Why gradient clipping accelerates training: A theoretical justification for adaptivity, preprint, arXiv: 1905.11881.
    [42] S. Alipour, F. Mirzaee, An iterative algorithm for solving two dimensional nonlinear stochastic integral equations: A combined successive approximations method with bilinear spline interpolation, Appl. Math. Comput., 371 (2020), 124947. doi: 10.1016/j.amc.2019.124947
[43] Y. Wang, D. Kifer, J. Lee, Differentially private confidence intervals for empirical risk minimization, J. Priv. Confidentiality, 9 (2019).
    [44] M. Lichman, UCI machine learning repository, 2013. Available from: http://archive.ics.uci.edu/ml.
    [45] I. Sharafaldin, A. H. Lashkari, A. A. Ghorbani, Toward generating a new intrusion detection dataset and intrusion traffic characterization, in ICISSp, (2018), 108–116.
    [46] S. Mahdavifar, A. F. A. Kadir, R. Fatemi, D. Alhadidi, A. A. Ghorbani, Dynamic android malware category classification using semi-supervised deep learning, in 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), IEEE, (2020), 515–522.
© 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).
