Research article

Flexible functional data smoothing and optimization using beta spline

  • Functional data analysis (FDA) is a method used to analyze data represented in its functional form. The method is particularly useful for exploring both curve and longitudinal data in both exploratory and inferential contexts, with minimal constraints on the parameters. In FDA, the choice of basis function is crucial for the smoothing process. However, traditional basis functions lack flexibility, limiting the ability to modify the shape of curves and accurately represent abnormal details in modern and complex datasets. This study introduced a novel and flexible data smoothing technique for interpreting functional data, employing the beta spline introduced by Barsky in 1981. The beta spline offers flexibility due to the inclusion of two shape parameters. The proposed methodology integrated the roughness penalty approach and generalized cross-validation (GCV) to identify the optimal curve that best fitted the data, ensuring appropriate parameters were considered for transforming data into a functional form. The effectiveness of the approach was assessed by analyzing the GCV color grid chart to determine the optimal curve. In contrast to existing methodologies, the proposed method enhanced flexibility by incorporating the beta spline into the smoothing procedure. This approach was anticipated to effectively handle various forms of time series data, offering improved interpretability and accuracy in data analysis, including forecasting.

    Citation: Wan Anis Farhah Wan Amir, Md Yushalify Misro, Mohd Hafiz Mohd. Flexible functional data smoothing and optimization using beta spline[J]. AIMS Mathematics, 2024, 9(9): 23158-23181. doi: 10.3934/math.20241126




    In recent times, the Internet of Things (IoT) has grown exponentially with the spread of smart devices. IoT devices allow access from anywhere, such as homes, vehicles, and offices, to simplify day-to-day tasks [1], and are utilized in smart cities, care services, health, smart homes, smart grids, vehicular networks, and other industries. They also have distinctive features, namely lower energy consumption, lighter protocols, and compact size, which make them well suited to these settings [2]. The extended deployment of smart devices in advertising, along with declining trust in device identification, has made the IoT increasingly versatile [3]. Malicious applications and attacks, such as ransomware and other malware families, constantly pose crucial security problems and can result in catastrophic losses to the web, data centers, mobile applications, and computer systems across several businesses and industries [4]. Ransomware is mainly developed to block victims from accessing system databases by using a robust encryption method that only the attackers can decrypt [5].

    Removing the ransomware causes the targeted victim to lose data permanently; therefore, victims are compelled to comply with the attacker's demand [6]. Attackers transform traditional ransomware into new ransomware families through modern technology, which makes it more challenging to reverse a ransomware infection [7]. Ransomware is a varied and sophisticated threat affecting users around the world: it prevents users from accessing their data or system by encrypting the user files or locking the system screen until a ransom is paid [8]. Based on attack strategy, there are two types of ransomware: locker ransomware and crypto-ransomware. Crypto-ransomware prevents access to data or files, whereas locker ransomware denies access to the device or computer itself [9].

    Conventional ransomware detection methods, such as data-centric, event-based, and statistical approaches, are not suitable for combating these attacks. Thus, achieving a high level of security and protection against malware attacks by adopting innovative technology has gained immense attention from researchers [10]. Due to their fixed architecture, classical machine learning (ML) techniques are unable to distinguish complicated cyberattacks amid ever-growing cyber threats and attackers' resources and capabilities. The objective is to secure the device against different attacks by using advanced technologies that are capable of detecting the attacks accurately and in less time [11]. In this context, deep learning (DL) reveals whether cyber data is an attack or legitimate by identifying slight changes or differences. DL can therefore quickly identify anomalies and facilitate an in-depth analysis of network data [12]. As a result, a DL-driven detection technique is cost-effective, adaptive, and highly scalable without exhausting resource-constrained devices, which is a breakthrough in cybersecurity [13].

    Alohali et al. [14] developed a sine cosine algorithm with a DL-based ransomware detection and classification (SCADL-RWDC) algorithm for the IoT platform. This algorithm employs an SCA-based feature selection (SCA-FS) system to increase the recognition accuracy. The method also implements the hybrid grey wolf optimizer (HGWO) with the GRU technique to classify ransomware. The authors of [15] introduced a new method to counter crypto-ransomware by identifying block cipher techniques in the IoT environment. This method extracts features from the opcodes of binary files for the 8-bit Alf and Vegard's RISC (AVR) microcontroller.

    The authors of [16] developed a static analysis model based on N-gram opcodes with a DL algorithm. At first, the method splits the N-gram sequence into numerous patches and feeds every patch to a self-attention-based convolutional neural network (SA-CNN). Next, the outputs of the SA-CNNs are combined and fed into a bi-directional SA network to produce the ransomware classification result. In [17], an IoT-based intrusion detection and classification system based on a CNN (IoT-IDCS-CNN) was presented. The performance evaluation used parallel processing on a compute unified device architecture (CUDA)-based Nvidia graphical processing unit (GPU) and a high-speed i9-core-based Intel CPU.

    In [18], an optimum graph-CNN-enabled ransomware detection (OGCNN-RWD) method was developed for cybersecurity in the IoT infrastructure. This study uses the learnable enthusiasm to teach learning-based optimizer (LETLBO) technique for feature subset selection. The GCNN architecture is employed to classify ransomware, and its hyperparameters are selected by the harmony search algorithm (HSA). In [19], the main objective is to examine a lightweight DL method that increases the detection rate at a reduced computation cost, to support real-time malware monitoring on resource-limited IoT devices. RNN, LSTM, and bi-directional LSTM architectures were employed under a vanilla configuration and trained with conventional malware datasets.

    Basnet et al. [20] proposed a DL-based ransomware identification technique for SCADA-controlled electric vehicle charging stations (EVCS), with evaluation studies of 3 DL techniques: LSTM-RNN, 1D-CNN, and DNN. The study found that a ransomware-driven distributed denial-of-service (DDoS) attack tends to change the state of charge (SOC) profile by exceeding the SOC control threshold. In [21], several approaches to evaluating malware samples were examined. Three malware identification algorithms based on visualization methods (i.e., a clustering technique, a probabilistic method, and a DL algorithm) were developed. Afterwards, a measure based on the risk of the samples was developed for use in the evaluation.

    In the domain of IoT cybersecurity, researchers such as Alohali et al. [14] have proposed innovative approaches, such as the SCADL-RWDC algorithm integrating the sine cosine algorithm and deep learning, while Basnet et al. [20] focused on DL-based ransomware identification in SCADA-controlled electric vehicle charging stations. These studies collectively offer a variety of methodologies, from SCA-FS to OGCNN-RWD, contributing to the advancement of ransomware detection and overall cybersecurity in IoT environments.

    The presented article develops an EBSAEDL-RD approach in IoT security. In order to achieve this, the EBSAEDL-RD approach utilizes min-max normalization to scale input data effectively and incorporates the EBSA method for optimal feature selection. Ransomware classification is performed using the bidirectional gated recurrent unit (BiGRU) method, with the sparrow search algorithm (SSA) employed for fine-tuning hyperparameters. Extensive experiments employing the EBSAEDL-RD approach are conducted on a benchmark dataset.

    In this study, we design a new EBSAEDL-RD algorithm in IoT security. The purpose of the EBSAEDL-RD technique is to recognize and classify the ransomware to achieve security in the IoT platform. To achieve this, the EBSAEDL-RD technique contains different types of processes, namely min-max normalization, EBSA-based feature selection, BiGRU classification, and SSA-based hyperparameter tuning. Figure 1 illustrates the working flow of the EBSAEDL-RD technique.

    Figure 1.  Workflow of EBSAEDL-RD technique.

    Initially, the EBSAEDL-RD method applies min-max normalization, which serves as a preprocessing stage for ransomware detection in IoT security [22]. This method rescales each numerical feature to a fixed range, here between 0 and 1, by mapping a value x to (x − min) / (max − min), where min and max are the smallest and largest values of that feature. In IoT security, where recognizing ransomware threats is of great significance, normalizing the input data ensures that features with dissimilar scales do not disproportionately affect the performance of ML algorithms. By transforming the data to a consistent scale, min-max normalization allows diverse features to be used effectively in detecting patterns indicative of ransomware attacks. This improves the accuracy and robustness of the prediction techniques and contributes to the overall efficiency of ransomware detection systems in protecting the IoT environment from possible security attacks.
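
    The scaling step described above can be sketched as follows. This is a minimal illustration, assuming NumPy and a feature matrix with one sample per row; the function name `min_max_normalize` and the handling of constant columns are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def min_max_normalize(X, eps=1e-12):
    """Scale each feature column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Constant columns have zero span; divide by 1 so they map to 0
    # instead of triggering a division by zero (an assumption of this sketch).
    safe_span = np.where(span > eps, span, 1.0)
    return (X - col_min) / safe_span

# Three samples, three features on very different scales.
X = np.array([[1.0, 200.0, 5.0],
              [2.0, 400.0, 5.0],
              [3.0, 800.0, 5.0]])
X_scaled = min_max_normalize(X)
print(X_scaled)
```

    In practice the minimum and maximum would be computed on the training split only and then reused for the test split, so that no information leaks from the test data.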

    The EBSAEDL-RD technique makes use of the EBSA technique to select an optimum set of features. The Ebola optimization search algorithm (EOSA), a recent meta-heuristic technique introduced by Oyelade and Ezugwu [23], draws inspiration from the propagation model of the Ebola virus disease. The steps of the EOSA technique are as follows:

    1) Initialize all scalar and vector quantities, i.e., the parameters and the individuals. The sets of individuals are Infected (I), Susceptible (S), Vaccinated (V), Dead (D), Recovered (R), Hospitalized (H), and Quarantine (Q), each with its initial value.

    2) Randomly generate the index case (I) from the susceptible individuals.

    3) Set the index case as the global and local optimum and compute its fitness value.

    4) While the iteration count is not exhausted and infected individuals exist,

    a. For every susceptible individual, generate and update its location depending on its movement. Note that the farther an infected individual moves, the more infections it causes; hence a short displacement corresponds to exploitation and a long one to exploration.

    i. Generate newly infected individuals (nI) based on (a).

    ii. Add the new cases to I.

    b. Calculate the number of individuals to add to H, D, R, B, V, and Q using the corresponding rates, based on the size of I.

    c. Update S and I based on nI.

    d. Pick the current best from I and compare it with the global best.

    e. If the termination criteria are not met, return to step 4.

    5) Return the global best and all solutions.

    The mathematical model is as follows: the updates of the Funeral (F), Exposed (E), S, I, H, V, R, Q, and D quantities are governed by a system of differential equations, which give the rates of change of these quantities with respect to time t:

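    $$\frac{\partial S(t)}{\partial t} = \pi - \left(\beta_1 I + \beta_3 D + \beta_4 R + \beta_2 (PE)\eta\right) S - \left(\tau S + \Gamma I\right) \tag{1}$$
    $$\frac{\partial I(t)}{\partial t} = \left(\beta_1 I + \beta_3 D + \beta_4 R + \beta_2 (PE)\lambda\right) S - (\Gamma + \gamma) I - (\tau) S \tag{2}$$
    $$\frac{\partial H(t)}{\partial t} = \alpha I - (\gamma + \varpi) H \tag{3}$$
    $$\frac{\partial R(t)}{\partial t} = \gamma I - \Gamma R \tag{4}$$
    $$\frac{\partial V(t)}{\partial t} = \gamma I - (\mu + \vartheta) V \tag{5}$$
    $$\frac{\partial D(t)}{\partial t} = (\tau S + \Gamma I) - \delta D \tag{6}$$
    $$\frac{\partial Q(t)}{\partial t} = \left(\pi I - (\gamma R + \Gamma D)\right) - \xi Q \tag{7}$$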

    In the EBSA approach, the fitness function (FF) is designed to balance the number of features chosen in every solution (to be minimized) against the classifier performance attained (to be maximized). Eq (8) shows the FF used to evaluate a solution.

    $$Fitness = \alpha \gamma_R(D) + \beta \frac{|R|}{|C|} \tag{8}$$

    In Eq (8), $\alpha$ and $\beta$ weight the importance of classifier quality and subset length, respectively, with $\alpha \in [0,1]$ and $\beta = 1 - \alpha$. $\gamma_R(D)$ indicates the classifier error rate, $|R|$ stands for the cardinality of the selected subset, and $|C|$ refers to the total number of features in the dataset.
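
    As a small worked example of Eq (8), the sketch below evaluates the fitness of one candidate feature subset. The function name and the default weight `alpha=0.9` are illustrative assumptions; the paper does not state the value of α it used.

```python
def feature_selection_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Eq (8): alpha * error_rate + beta * |R| / |C|, with beta = 1 - alpha.

    error_rate is the classifier error as a fraction in [0, 1];
    alpha = 0.9 is a hypothetical default, not a value from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)

# A subset of 10 out of 50 features with a 2% error rate:
# 0.9 * 0.02 + 0.1 * (10 / 50) = 0.018 + 0.020 = 0.038
fitness = feature_selection_fitness(error_rate=0.02, n_selected=10, n_total=50)
print(round(fitness, 3))
```

    Lower values are better: a subset is rewarded both for a lower error rate and for using fewer features.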

    In this phase, the classification of ransomware takes place using the BiGRU model. BiGRU is an RNN that has been effectively utilized for time-series sequence problems because its bi-directional learning scheme improves the learning of temporal patterns from sequence data [24]. Each BiGRU block comprises a cell that stores information together with update and reset gates, which help address the vanishing gradient problem. BiGRU contains 2 GRU units, one processing the sequence forward and one backward, each with reset and update gates. The reset gate determines how the new input is integrated with the preceding memory, and the update gate determines how much of the preceding memory to retain. The input sequence is fed into the forward and backward networks in time order and in reverse time order, respectively, and the two are linked to one output layer. Because its gates retain information over long spans in both the backward and forward directions, BiGRU can yield better solutions than purely feedforward networks. The bi-directional structure allows the model to exploit both past and future context from the sequence. The BiGRU is formulated as:

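    $$h_t = \left[\overrightarrow{h_t}, \overleftarrow{h_t}\right] \tag{9}$$

    where $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ are the outputs of the forward and backward blocks, respectively.

    The final output layer at time $t$ is:

    $$y_t = \sigma\left(W_y h_t + b_y\right) \tag{10}$$

    where $\sigma$ stands for the activation function, $W_y$ denotes the weight matrix, and $b_y$ represents the bias vector.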

    Every GRU block is composed of 4 modules: the reset gate $r_t$ with weights and biases $W_r, U_r, b_r$; the input vector $x_t$ with its corresponding weights and biases; the output vector $h_t$ with weights and biases $W_h, U_h, b_h$; and the update gate $z_t$ with weights and biases $W_z, U_z, b_z$. The gating units are defined as follows:

    Initially, for $t=0$, the output vector is $h_0 = 0$.

    $$z_t = \sigma_g\left(W_z x_t + U_z h_{t-1} + b_z\right) \tag{11}$$
    $$r_t = \sigma_g\left(W_r x_t + U_r h_{t-1} + b_r\right) \tag{12}$$
    $$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \phi_h\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right) \tag{13}$$

    where $W$, $U$, and $b$ denote the parameter matrices and vectors, $\odot$ indicates the Hadamard product, and $\sigma_g$ and $\phi_h$ are the activation functions, with $\sigma_g$ the sigmoid function and $\phi_h$ the hyperbolic tangent. Figure 2 shows the architecture of the BiGRU model.

    Figure 2.  BiGRU architecture.
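
    To make the recurrences in Eqs (9) and (11)–(13) concrete, the following NumPy sketch runs one GRU over a sequence in each direction and concatenates the two hidden states per time step. It is a minimal illustration under assumed sizes and random weights; the helper names (`gru_step`, `init_params`, `bigru`) are hypothetical, and nothing here reproduces the trained model of the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU update following Eqs (11)-(13)."""
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])   # update gate
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])   # reset gate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r_t * h_prev) + p["bh"])
    return z_t * h_prev + (1.0 - z_t) * h_cand

def init_params(n_in, n_hidden, rng):
    """Small random weights and zero biases (illustrative only)."""
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_in))
        p["U" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p["b" + gate] = np.zeros(n_hidden)
    return p

def bigru(xs, p_fwd, p_bwd, n_hidden):
    """Eq (9): concatenate forward and backward hidden states per step."""
    T = len(xs)
    h_f = np.zeros(n_hidden)          # h_0 = 0 for the forward pass
    h_b = np.zeros(n_hidden)          # and for the backward pass
    fwd, bwd = [], [None] * T
    for t in range(T):                # left to right
        h_f = gru_step(xs[t], h_f, p_fwd)
        fwd.append(h_f)
    for t in reversed(range(T)):      # right to left
        h_b = gru_step(xs[t], h_b, p_bwd)
        bwd[t] = h_b
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
n_in, n_hidden, T = 3, 4, 5
xs = rng.normal(size=(T, n_in))
H = bigru(xs, init_params(n_in, n_hidden, rng), init_params(n_in, n_hidden, rng), n_hidden)
print(H.shape)
```

    Each row of `H` holds the forward state followed by the backward state for one time step, so its width is twice the hidden size.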

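    Initially, within the BiGRU cells, the forward output $(\overrightarrow{F_t})$ is computed and then combined with the backward output $(\overleftarrow{B_t})$. Four modes are available for combining the two outputs: multiplication, concatenation (default), average, and summation. Here, the concatenation mode is used, defined as:

    $$O_t^{1} = \mathrm{concat}\left((\overrightarrow{F_t}), (\overleftarrow{B_t})\right)$$

    such that

    $$(\overrightarrow{F_t}) = \left(\overrightarrow{h_1}, \overrightarrow{h_2}, \overrightarrow{h_3}, \ldots, \overrightarrow{h_t}\right)$$

    and

    $$(\overleftarrow{B_t}) = \left(\overleftarrow{h_t}, \overleftarrow{h_{t+1}}, \overleftarrow{h_{t+2}}, \overleftarrow{h_{t+3}}, \ldots, \overleftarrow{h_n}\right) \tag{14}$$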

    Then, a fully connected (FC) layer with its own weights and bias processes the BiGRU output. Afterwards, a softmax layer produces the class predictions from the FC layer output. A weighted classification layer computes the weighted cross-entropy loss between the predicted scores and the training targets, which helps address the class imbalance problem. The loss is defined as:

    $$\mathcal{L}(p,t) = -\left(1 - (p_t)^{\gamma}\right) \log_2(p_t)\, \theta_i \tag{15}$$

    where $p_t$ denotes the estimated probability of each class, $\gamma \ge 0$ refers to the discount factor parameter, which is tuned for better evaluation, and $\theta_i$ refers to the weight of each class.

    Finally, the SSA is applied for optimal hyperparameter selection of the BiGRU model. The SSA imitates the foraging and anti-predatory behavior of sparrows [25]. In the SSA model, individuals are separated into producers, which have large energy reserves; joiners, which find food by following the producers; and vigilantes, which are responsible for raising the alarm. The roles of producer and joiner are not fixed: any individual that finds a better food source becomes a producer, while the others become joiners. Because the ratio of producers to joiners is constant within a flock, during foraging the producers are responsible for searching for regions with plentiful food and providing directions to the joiners, who continually follow the producers with the best food. As soon as vigilantes detect a predator, they raise an alarm by chirping, and once the alarm value reaches a given threshold, the producers lead the joiners away to a safe region. Sparrows at the edge of the flock move quickly to the safe area, while sparrows in the middle move randomly in order to get closer to other sparrows. Assume that the total number of sparrows is $m$, $j$ denotes the spatial dimension, the ratio of producers to joiners is between 7:1 and 3:1, and $W_s$ denotes the safety threshold of the alarm signal. Then $S_{i,j} = (S_{1,j}, S_{2,j}, \ldots, S_{m,j})$ describes the locations of the sparrows, with $S_{i,j}$ the location of the $i$th sparrow. The locations of producers, joiners, and vigilantes are updated according to Eqs (16)–(18). In Eq (16), $R_2 \ge W_s$ signifies that the vigilantes have detected a predator and all sparrows must quickly fly to safe places, whereas $R_2 < W_s$ means that the producer continues its search over a wider region. In Eq (17), if $i > m/2$, then the $i$th joiner, which has an inferior fitness value, is most probably hungry. In Eq (18), $f_i = f_b$ indicates that the sparrow is in the middle of the flock, and $f_i > f_b$ that it is at the edge.

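    $$S_{i,j}^{k+1} = \begin{cases} S_{i,j}^{k} + Q \cdot L, & R_2 \ge W_s \\ S_{i,j}^{k} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{\max}}\right), & R_2 < W_s \end{cases} \tag{16}$$
    $$S_{i,j}^{k+1} = \begin{cases} Q \cdot \exp\left(\dfrac{S_{worst}^{k} - S_{i,j}^{k}}{i^{2}}\right), & i > \dfrac{m}{2} \\ S_{b}^{k+1} + \left|S_{i,j}^{k} - S_{b}^{k+1}\right| \cdot A^{+} \cdot L, & i \le \dfrac{m}{2} \end{cases} \tag{17}$$
    $$S_{i,j}^{k+1} = \begin{cases} S_{best}^{k} + \beta \left|S_{i,j}^{k} - S_{best}^{k}\right|, & f_i > f_b \\ S_{i,j}^{k} + \kappa \left(\dfrac{\left|S_{i,j}^{k} - S_{worst}^{k}\right|}{(f_i - f_w) + \epsilon}\right), & f_i = f_b \end{cases} \tag{18}$$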

    where $S_{i,j}^{k+1}$ signifies the location of the $j$th dimension of the $i$th sparrow at the $(k+1)$th iteration, $S_{i,j}^{k}$ represents the location of the $j$th dimension of the $i$th sparrow at the $k$th iteration, $S_{b}^{k+1}$ denotes the location of the best producer at the $(k+1)$th iteration, $L$ is a matrix in which every element is 1, $S_{best}^{k}$ is the global best location at the $k$th iteration, $S_{worst}^{k}$ is the global worst location at the $k$th iteration, $Q$ denotes a random number that follows the normal distribution, $R_2$ is the value of the alarm signal for all sparrows, $W_s$ signifies the safety threshold of the alarm signal, which is set to 0.8, $f_i$ and $f_b$ are the current and global best fitness values, respectively, $f_w$ is the worst fitness value, $iter_{\max}$ refers to the maximum number of iterations, $\alpha$ and $\kappa$ signify random numbers in $[0,1]$, $\epsilon$ is a small constant that avoids division by zero, and $\beta$ denotes the control parameter.
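
    The producer and vigilante updates of Eqs (16) and (18) can be sketched in NumPy as below. This is an illustrative fragment under stated assumptions rather than the authors' implementation: the function names are hypothetical, β is drawn from a standard normal distribution (the text only calls it a control parameter), and the joiner update of Eq (17) is left out because the matrix $A^{+}$ is not defined in the text.

```python
import numpy as np

def update_producer(S_i, i, iter_max, W_s=0.8, rng=None):
    """Eq (16). S_i is one sparrow's position vector, i its 1-based index."""
    rng = np.random.default_rng() if rng is None else rng
    R2 = rng.random()                          # alarm value in [0, 1)
    if R2 < W_s:
        # No predator detected: keep searching, shrinking the position.
        alpha = rng.uniform(1e-12, 1.0)        # lower bound avoids division by zero
        return S_i * np.exp(-i / (alpha * iter_max))
    # Alarm raised: shift every dimension by the same normal random step (Q * L).
    Q = rng.standard_normal()
    return S_i + Q * np.ones_like(S_i)

def update_vigilante(S_i, S_best, S_worst, f_i, f_b, f_w, rng=None, eps=1e-8):
    """Eq (18). beta ~ N(0, 1) is an assumption of this sketch."""
    rng = np.random.default_rng() if rng is None else rng
    if f_i > f_b:
        # Sparrow at the edge of the flock: move toward the best location.
        beta = rng.standard_normal()
        return S_best + beta * np.abs(S_i - S_best)
    # Sparrow in the middle of the flock (f_i == f_b): random move.
    kappa = rng.random()
    return S_i + kappa * np.abs(S_i - S_worst) / ((f_i - f_w) + eps)

rng = np.random.default_rng(0)
S = np.array([1.0, -2.0, 3.0])
print(update_producer(S, i=1, iter_max=100, rng=rng))
print(update_vigilante(S, S_best=np.zeros(3), S_worst=np.full(3, 5.0),
                       f_i=0.3, f_b=0.1, f_w=0.9, rng=rng))
```

    In a hyperparameter search, each position vector would encode one candidate BiGRU configuration, and the fitness values would come from evaluating the classifier with that configuration.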

    The SSA method uses an FF to achieve higher classification performance. The FF returns a positive value that characterizes the quality of a candidate solution, with smaller values indicating better solutions. Here, minimizing the classifier error rate is taken as the FF:

    $$fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{No. of misclassified samples}}{\text{Total No. of samples}} \times 100 \tag{19}$$
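
    Eq (19) amounts to a percentage of wrong predictions, as the short sketch below shows. The function name and the toy labels are illustrative assumptions.

```python
def classifier_error_rate(y_true, y_pred):
    """Eq (19): misclassified samples / total samples * 100."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    misclassified = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return misclassified / len(y_true) * 100

# Toy labels: 0 = goodware, 1 = ransomware; one of four predictions is wrong.
error = classifier_error_rate([0, 0, 1, 1], [0, 1, 1, 1])
print(error)
```

    A hyperparameter search would call such a function once per candidate configuration, after training and evaluating the classifier with that configuration, and keep the candidate with the lowest value.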

    The ransomware detection outcomes of the EBSAEDL-RD method are tested using a dataset [26] encompassing 840 samples as defined by Table 1.

    Table 1.  Details of dataset.
    Classes No. of Instances
    Goodware 420
    Ransomware 420
    Total Instances 840


    Figure 3 shows the confusion matrices achieved by the EBSAEDL-RD algorithm for epochs from 500 to 3000. The experimental values indicate that the EBSAEDL-RD algorithm efficiently recognizes the goodware and ransomware samples across the two classes.

    Figure 3.  Confusion matrices of the EBSAEDL-RD model (a–f) epochs 500–3000.

    Table 2 and Figure 4 report the ransomware detection results of the EBSAEDL-RD technique under distinct epochs. The outcomes indicate that the EBSAEDL-RD method detects goodware and ransomware effectively. At 500 epochs, the EBSAEDL-RD method attains an average accuracy of 98.69%, sensitivity of 98.69%, specificity of 98.69%, F-score of 98.69%, and MCC of 97.39%. At 1000 epochs, it achieves an average accuracy of 99.88%, sensitivity of 99.88%, specificity of 99.88%, F-score of 99.88%, and MCC of 99.76%. At both 1500 and 2000 epochs, it reaches an average accuracy of 99.52%, sensitivity of 99.52%, specificity of 99.52%, F-score of 99.52%, and MCC of 99.05%. At 2500 epochs, it achieves an average accuracy of 99.17%, sensitivity of 99.17%, specificity of 99.17%, F-score of 99.17%, and MCC of 98.34%. Lastly, at 3000 epochs, it obtains an average accuracy of 99.05%, sensitivity of 99.05%, specificity of 99.05%, F-score of 99.05%, and MCC of 98.10%.

    Table 2.  Ransomware detection of the EBSAEDL-RD system under different epochs.
    Classes  Accuracy (%)  Sensitivity (%)  Specificity (%)  F-score (%)  MCC (%)
    Epoch - 500
    Goodware 99.52 99.52 97.86 98.70 97.39
    Ransomware 97.86 97.86 99.52 98.68 97.39
    Average 98.69 98.69 98.69 98.69 97.39
    Epoch - 1000
    Goodware 99.76 99.76 100.00 99.88 99.76
    Ransomware 100.00 100.00 99.76 99.88 99.76
    Average 99.88 99.88 99.88 99.88 99.76
    Epoch - 1500
    Goodware 99.52 99.52 99.52 99.52 99.05
    Ransomware 99.52 99.52 99.52 99.52 99.05
    Average 99.52 99.52 99.52 99.52 99.05
    Epoch - 2000
    Goodware 99.76 99.76 99.29 99.52 99.05
    Ransomware 99.29 99.29 99.76 99.52 99.05
    Average 99.52 99.52 99.52 99.52 99.05
    Epoch - 2500
    Goodware 99.52 99.52 98.81 99.17 98.34
    Ransomware 98.81 98.81 99.52 99.16 98.34
    Average 99.17 99.17 99.17 99.17 98.34
    Epoch - 3000
    Goodware 99.52 99.52 98.57 99.05 98.10
    Ransomware 98.57 98.57 99.52 99.04 98.10
    Average 99.05 99.05 99.05 99.05 98.10

    Figure 4.  Average outcome of EBSAEDL-RD system under various epochs.
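
    The measures in Table 2 can all be derived from a binary confusion matrix. The sketch below shows the standard definitions. The counts used in the example are hypothetical: they are chosen to be consistent with the epoch-1000 Goodware row of Table 2 (420 samples per class) and are not read from Figure 3.

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Standard measures computed from one class's confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # recall of the positive class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "mcc": mcc,
    }

# Hypothetical counts with goodware as the positive class:
# 419 of 420 goodware and all 420 ransomware samples classified correctly.
metrics = binary_metrics(tp=419, fn=1, fp=0, tn=420)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

    With these counts, the sensitivity, specificity, F-score, and MCC round to 99.76%, 100.00%, 99.88%, and 99.76%, matching the Goodware row, and the overall accuracy of 99.88% matches the average row for that epoch.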

    The accuracy curves for training (TR) and testing (TS) depicted in Figure 5 for the EBSAEDL-RD approach under epochs 500–3000 offer valuable insight into its behavior. Specifically, both TR and TS accuracy improve consistently as the number of epochs increases, demonstrating the model's ability to learn and recognize patterns in both TR and TS data. The rising trend in TS accuracy underlines that the model fits the TR dataset while still making accurate predictions on unseen data, indicating robust generalization.

    Figure 5.  Accuracy curve of the EBSAEDL-RD method (a–f) epochs 500–3000.

    Figure 6 offers a broad overview of the TR and TS loss of the EBSAEDL-RD system for epochs 500–3000. The TR loss steadily decreases as the model adjusts its weights to reduce classification errors on both datasets. The loss curves illustrate how well the model fits the TR data and show that it captures patterns successfully in both datasets. The continuous refinement of parameters in the EBSAEDL-RD approach is noticeable and serves to reduce discrepancies between the predictions and the actual TR labels.

    Figure 6.  Loss curve of the EBSAEDL-RD system (a–f) epochs 500–3000.

    Concerning the PR curve presented in Figure 7, the findings affirm that the EBSAEDL-RD methodology at epoch 1000 consistently achieves high PR values for each class. These results underscore the model's capacity to discriminate between the classes correctly.

    Figure 7.  PR curve of the EBSAEDL-RD algorithm under epoch 1000.

    Additionally, in Figure 8, we present the ROC curves generated by the EBSAEDL-RD algorithm at epoch 1000, demonstrating its proficiency in distinguishing between class labels. These curves provide valuable insight into how the tradeoff between TPR and FPR varies across classification thresholds. The results underscore the model's accurate classification of the two class labels.

    Figure 8.  ROC curve of the EBSAEDL-RD technique under epoch 1000.

    Table 3 presents the comparative results of the EBSAEDL-RD technique [18]. Figure 9 compares the EBSAEDL-RD technique with other methods in terms of accuracy. The outcomes show that the EBSAEDL-RD method achieves the highest accuracy of 99.88%, whereas the OGCNN-RWD, DWOML, Bagging, AdaBoostM1, ROF, DT, and RF systems offer lower accuracy values of 99.67%, 99.12%, 98.53%, 96.19%, 95.87%, 97.71%, and 98.86%, respectively.

    Table 3.  Comparison analysis of the EBSAEDL-RD method with other techniques.
    Methods  Accuracy (%)  Sensitivity (%)  Specificity (%)
    EBSAEDL-RD 99.88 99.88 99.88
    OGCNN-RWD 99.67 99.68 99.68
    DWOML 99.12 99.49 99.24
    Bagging 98.53 93.74 96.14
    AdaBoostM1 96.19 94.56 94.67
    Rotation Forest (ROF) 95.87 96.81 97.44
    Decision Tree (DT) 97.71 97.87 98.20
    Random Forest (RF) 98.86 98.82 98.32

    Figure 9.  Accuracy of the EBSAEDL-RD method compared with other systems.

    Figure 10 compares the EBSAEDL-RD algorithm with other methods with respect to sensitivity and specificity. The outcomes show that the EBSAEDL-RD methodology obtains superior sensitivity and specificity values. The EBSAEDL-RD method offers a higher sensitivity of 99.88%, whereas the OGCNN-RWD, DWOML, Bagging, AdaBoostM1, ROF, DT, and RF algorithms attain lower sensitivity values of 99.68%, 99.49%, 93.74%, 94.56%, 96.81%, 97.87%, and 98.82%, respectively. Similarly, the EBSAEDL-RD system offers a higher specificity of 99.88%, whereas the OGCNN-RWD, DWOML, Bagging, AdaBoostM1, ROF, DT, and RF systems reach lower specificity values of 99.68%, 99.24%, 96.14%, 94.67%, 97.44%, 98.20%, and 98.32%, respectively. Accordingly, the EBSAEDL-RD system delivers enhanced ransomware detection.

    Figure 10.  Sensitivity and specificity of the EBSAEDL-RD technique compared with other approaches.

    In this study, we designed a new EBSAEDL-RD method for IoT security. The purpose of the EBSAEDL-RD technique is to recognize and classify ransomware to achieve security in the IoT platform. To this end, the EBSAEDL-RD technique comprises several processes, namely min-max normalization, EBSA-based feature selection, BiGRU classification, and SSA-based hyperparameter tuning. Initially, the EBSAEDL-RD technique employs min-max normalization to scale the input data into a useful format. Then, it uses the EBSA method to select an optimum set of features. Next, the classification of ransomware takes place using the BiGRU model. Finally, the SSA is applied for optimum hyperparameter selection of the BiGRU model. Wide-ranging experiments with the EBSAEDL-RD approach were performed on benchmark data. The obtained results show that the EBSAEDL-RD method achieves better performance than the other models on IoT security.

    The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA for funding this research work through the project number "NBU-FFR-2024-170-01".



    [1] Y. Xu, Functional Data Analysis, London: Springer, 2023. https://doi.org/10.1007/978-1-4471-7503-2_4
    [2] P, Hall, M, Hosseini-Nasab, On Properties of Functional Principal Components Analysis, J. R. Stat. Soc. Ser. B: Stat. Methodol., 68 (2006), 109–126. https://doi.org/10.1111/j.1467-9868.2005.00535.x doi: 10.1111/j.1467-9868.2005.00535.x
    [3] W. Seo, Functional principal component analysis for cointegrated functional time series, J. Time Ser. Anal., 45 (2023), 320–330. https://doi.org/10.1111/jtsa.12707 doi: 10.1111/jtsa.12707
    [4] O. A. Montesinos López, A. Montesinos López, J. Crossa, Multivariate Statistical Machine Learning Methods for Genomic Prediction, Cham: Springer, 2022. https://doi.org/10.1007/978-3-030-89010-0
    [5] H. Hullait, D. S. Leslie, N. G. Pavlidis, S. King, Robust Function-on-Function Regression, Technometrics, 63 (2020), 396–409. https://doi.org/10.1080/00401706.2020.1802350 doi: 10.1080/00401706.2020.1802350
    [6] J. O. Razo-De-Anda, L. L. Romero-Castro, F. Venegas-Martínez, Contagion Patterns Classification in Stock Indices: A Functional Clustering Analysis Using Decision Trees, Mathematics, 11 (2023), 2961. https://doi.org/10.3390/math11132961 doi: 10.3390/math11132961
    [7] F. Centofanti, A. Lepore, B. Palumbo, Sparse and smooth functional data clustering, Stat. Pap., 65 (2024), 795–825. https://doi.org/10.1007/s00362-023-01408-1 doi: 10.1007/s00362-023-01408-1
    [8] J. A. Arias-López, C. Cadarso-Suárez, P. Aguiar-Fernánde, Computational Issues in the Application of Functional Data Analysis to Imaging Data, Lect. Notes Comput. Sci., 42 (2021), 630–638. https://doi.org/10.1007/978-3-030-86960-1_46 doi: 10.1007/978-3-030-86960-1_46
    [9] C. Tang, T. Wang, P. Zhang, Functional data analysis: An application to COVID‐19 data in the United States in 2020, Quant. Bio., 10 (2022), 172–187. https://doi.org/10.15302/J-QB-022-0300 doi: 10.15302/J-QB-022-0300
    [10] C. Zhang, H. Lin, L. Liu, J. Liu, Y. Li, Functional Data Analysis with Covariate-Dependent Mean and Covariance Structures, Biometrics, 79 (2023), 2232–2245. https://doi.org/10.1111/biom.13744 doi: 10.1111/biom.13744
    [11] I. Shah, P. Mubassir, S. Ali, O. Albalawi, A functional autoregressive approach for modeling and forecasting short-term air temperature, Front. Environ. Sci., 12 (2024), 1411237. https://doi.org/10.3389/fenvs.2024.1411237 doi: 10.3389/fenvs.2024.1411237
    [12] V. Villani, E. Romano, J. Mateu, Climate model selection via conformal clustering of spatial functional data, Environ. Ecol. Stat., 31 (2024), 365–385. https://doi.org/10.1007/s10651-024-00616-8 doi: 10.1007/s10651-024-00616-8
    [13] A. Palummo, E. Arnone, L. Formaggia, L. M. Sangalli, Functional principal component analysis for incomplete space-time data, Environ. Ecol. Stat., 31 (2024), 555–582. https://doi.org/10.1007/s10651-024-00598-7 doi: 10.1007/s10651-024-00598-7
    [14] J. O. Ramsay, B. W. Silverman, Functional Data Analysis, 2 Eds., New York: Springer, 2005. https://doi.org/10.1007/b98888
    [15] M. A. Hael, Unveiling air pollution patterns in Yemen: A spatial-temporal functional data analysis, Environ. Sci. Pollut. Res., 30 (2023), 50067–50095. https://doi.org/10.1007/s11356-023-25790-3 doi: 10.1007/s11356-023-25790-3
    [16] M. Gong, R. O'Donnell, C. Miller, M. Scott, S. Simis, S. Groom, et. al, Adaptive smoothing to identify spatial structure in global lake ecological processes using satellite remote sensing data, Spat. Stat., 50 (2022), 100615. https://doi.org/10.1016/j.spasta.2022.100615 doi: 10.1016/j.spasta.2022.100615
    [17] R. Raturi, Large Data Analysis via Interpolation of Functions: Interpolating Polynomials vs Artificial Neural Networks, Amer. J. Intell. Syst., 8 (2018), 6–11. https://doi.org/10.5923/j.ajis.20180801.02 doi: 10.5923/j.ajis.20180801.02
    [18] N. A. Mazelan, J. Suhaila, Exploring rainfall variabilities using statistical functional data analysis, IOP Conf. Ser.: Earth Environ. Sci., 1167 (2023), 012007. https://doi.org/10.1088/1755-1315/1167/1/012007 doi: 10.1088/1755-1315/1167/1/012007
    [19] C. Sözen, Y. Öner, The investigation of temperature data in Turkey's Black Sea Region using functional data analysis, J. Appl. Stat., 49 (2021), 2403–2415. https://doi.org/10.1080/02664763.2021.1896683 doi: 10.1080/02664763.2021.1896683
    [20] J. Baz, J. Davis, L. Han, C. Stracke, The value of smoothing, J. Portfolio Manag., 48 (2022), 73–85. https://doi.org/10.3905/jpm.2022.1.399 doi: 10.3905/jpm.2022.1.399
    [21] A. Falini, F. Mazzia, C. Tamborrino, Spline based Hermite quasi-interpolation for univariate time series, Discrete Cont. Dyn. Syst. - S, 15 (2022), 3667–3688. https://doi.org/10.3934/dcdss.2022039 doi: 10.3934/dcdss.2022039
    [22] L. Brugnano, D. Giordano, F. Iavernaro, G. Rubino, An entropy-based approach for a robust least squares spline approximation, J. Comput. Appl. Math., 443 (2024), 115773. https://doi.org/10.1016/j.cam.2024.115773 doi: 10.1016/j.cam.2024.115773
    [23] M. Spreafico, F. Ieva, M. Fiocco, Modelling time-varying covariates effect on survival via functional data analysis: Application to the MRC BO06 trial in osteosarcoma, Stat. Methods Appl., 32 (2023), 271–298. https://doi.org/10.1007/s10260-022-00647-0 doi: 10.1007/s10260-022-00647-0
    [24] A. Rahman, D. Jiang, Regional and temporal patterns of influenza: Application of functional data analysis, Infect. Dis. Modell., 6 (2021), 1061–1072. https://doi.org/10.1016/j.idm.2021.08.006 doi: 10.1016/j.idm.2021.08.006
    [25] M. Rangata, S. Das, M. Ali, Analysing Maximum Monthly Temperatures in South Africa for 45 years Using Functional Data Analysis, Adv. Decis. Sci., 24 (2020), 1–27.
    [26] U. Beyaztas, S. Q. Salih, K.-W. Chau, N. Al-Ansari, Z. M. Yaseen, Construction of functional data analysis modeling strategy for global solar radiation prediction: Application of cross-station paradigm, Eng. Appl. Comput. Fluid Mech., 13 (2019), 1165–1181. http://doi.org/10.1080/19942060.2019.1676314 doi: 10.1080/19942060.2019.1676314
    [27] S. Curceac, C. Ternynck, T. B. Ouarda, F. Chebana, S. D. Niang, Short-term air temperature forecasting using Nonparametric Functional Data Analysis and SARMA models, Environ. Modell. Software, 111 (2019), 394–408. http://doi.org/10.1016/j.envsoft.2018.09.017 doi: 10.1016/j.envsoft.2018.09.017
    [28] M. Ammad, M. Y. Misro, A. Ramli, A novel generalized trigonometric Bézier curve: Properties, continuity conditions and applications to the curve modeling, J. Amer. Math. Soc., 194 (2022), 744–763. http://doi.org/10.1016/j.matcom.2021.12.011 doi: 10.1016/j.matcom.2021.12.011
    [29] S. A. A. A. Said Mad Zain, M. Y. Misro, K. T. Miura, Generalized Fractional Bézier Curve with Shape Parameters, Mathematics, 9 (2021), 2141. https://doi.org/10.3390/math9172141 doi: 10.3390/math9172141
    [30] B. A. Barsky, The Beta-Spline: A Local Representation based on Shape Parameters and Fundamental Geometric Measures, PhD thesis, The University of Utah, 1981.
    [31] B. A. Barsky, Rational Beta-splines for representing curves and surfaces, IEEE Comput. Graph. Appl., 13 (1993), 24–32. http://doi.org/10.1109/38.252550 doi: 10.1109/38.252550
    [32] N. A. Hadi, A. Ibrahim, F. Yahya, J. M. Ali, A Comparative Study on Cubic Bezier and Beta-Spline Curves, Mathematika, 29 (2013), 55–64.
    [33] B. Sambhunath, C. L. Brian, Bézier and Splines in Image Processing and Machine Vision, London: Springer, 2008. https://doi.org/10.1007/978-1-84628-957-6
    [34] N. A. Hadi, N. S. M. Kamal, H. Nordin, Computational Method for Digital Khat Calligraphy Using Beta-Spline Curve Fitting, ASM Sc. J., 13 (2020). https://doi.org/10.32802/asmscj.2020.sm26(5.8) doi: 10.32802/asmscj.2020.sm26(5.8)
    [35] S. A. Suliman, N. A. Hadi, Optimizing the Shape Parameters of Beta-Spline Using Particle Swarm Optimization, Int. J. Eng. Technol., 7 (2018), 93–97. http://doi.org/10.14419/ijet.v7i4.33.23492 doi: 10.14419/ijet.v7i4.33.23492
    [36] M. S. A. Halim, N. A. Hadi, H. Sulaiman, S. Abd Halim, An algorithm for beta-spline surface reconstruction from multi slice CT scan images using MATLAB pmode, 2017 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE), 2017, 1–6. http://doi.org/10.1109/ISCAIE.2017.8074939 doi: 10.1109/ISCAIE.2017.8074939
    [37] B. A. Barsky, J. C. Beatty, Local Control of Bias and Tension in Beta-splines, ACM Trans. Graph., 2 (1983), 109–134. http://doi.org/10.1145/357318.357321 doi: 10.1145/357318.357321
    [38] B. A. Barsky, Computer Graphics and Geometric Modeling Using Beta-splines, Berlin, Heidelberg: Springer, 1988. https://doi.org/10.1007/978-3-642-72292-9
    [39] B. A. Barsky, J. C. Beatty, Varying the Betas in Beta-splines, Technical Report UCB/CSD-83-112, EECS Department, University of California, Berkeley, 1982. Available from: https://digicoll.lib.berkeley.edu/record/137388/files/CSD-83-112.pdf.
    [40] E. Holtanová, T. Mendlik, J. Koláček, I. Horová, J. Mikšovský, Similarities within a multi-model ensemble: functional data analysis framework, Geosci. Model Dev., 12 (2019), 735–747. http://doi.org/10.5194/gmd-12-735-2019 doi: 10.5194/gmd-12-735-2019
    [41] D. A. Shah, E. D. De Wolf, P. A. Paul, L. V. Madden, Functional Data Analysis of Weather Variables Linked to Fusarium Head Blight Epidemics in the United States, Phytopathology®, 109 (2019), 96–110. http://doi.org/10.1094/PHYTO-11-17-0386-R doi: 10.1094/PHYTO-11-17-0386-R
    [42] B. Guo, H. Wu, L. Pei, X. Zhu, D. Zhang, Y. Wang, et al., Study on the spatiotemporal dynamic of ground-level ozone concentrations on multiple scales across China during the blue sky protection campaign, Environ. Int., 170 (2022), 107606. http://doi.org/10.1016/j.envint.2022.107606 doi: 10.1016/j.envint.2022.107606
    [43] P. Craven, G. Wahba, Smoothing noisy data with spline functions, Numer. Math., 31 (1978), 377–403. http://doi.org/10.1007/BF01404567 doi: 10.1007/BF01404567
    [44] M. Gubian, F. Torreira, L. Boves, Using Functional Data Analysis for investigating multidimensional dynamic phonetic contrasts, J. Phonetics, 49 (2015), 16–40. http://doi.org/10.1016/j.wocn.2014.10.001 doi: 10.1016/j.wocn.2014.10.001
    [45] L. Tavi, T. Kinnunen, R. González Hautamäki, Improving speaker de-identification with functional data analysis of f0 trajectories, Speech Commun, , 140 (2022), 1–10. http://doi.org/10.1016/j.specom.2022.03.010 doi: 10.1016/j.specom.2022.03.010
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)