Research article Special Issues

Fractional view evaluation system of Schrödinger-KdV equation by a comparative analysis

  • Received: 15 June 2022 Revised: 23 August 2022 Accepted: 30 August 2022 Published: 08 September 2022
  • MSC : 34A34, 35A20, 35A22, 44A10, 33B15

  • The time-fractional coupled Schrödinger-KdV equation is an interesting mathematical model because of its wide and significant applications in mathematics and the applied sciences. A fractional coupled Schrödinger-KdV equation in the sense of the Caputo derivative is investigated in this article. Namely, we provide a comparative study of the considered model using the Adomian decomposition method and the homotopy perturbation method with the Shehu transform. Approximate solutions obtained using the Adomian decomposition and homotopy perturbation methods were numerically evaluated and presented in graphs and tables. These solutions were then compared to the exact solutions, demonstrating the simplicity, effectiveness, and good accuracy of the applied methods. Numerical problems are provided to demonstrate the accuracy and efficiency of the suggested techniques.

    Citation: Rasool Shah, Abd-Allah Hyder, Naveed Iqbal, Thongchai Botmart. Fractional view evaluation system of Schrödinger-KdV equation by a comparative analysis[J]. AIMS Mathematics, 2022, 7(11): 19846-19864. doi: 10.3934/math.20221087




    Clustering represents a versatile conceptual and algorithmic framework employed in diverse domains such as pattern recognition, image segmentation, data mining and genetic disease detection, among others [1,2,3]. The fundamental objective of clustering is to categorize data points into meaningful clusters based on their similarity characteristics [4], maximizing similarity within clusters while minimizing it between distinct clusters [5,6]. Over time, numerous clustering methods have emerged, such as the k-means, AP and SC algorithms [7,8,9]. However, the performance of most conventional algorithms is constrained when dealing with datasets that exhibit arbitrary shapes and densities [10,11,12].

    In 2014, the density peaks clustering (DPC) algorithm was introduced in the journal Science, presenting two key features for identifying cluster centers [13]. First, cluster centers exhibit higher local density compared to their neighboring points. Second, cluster centers are positioned at relatively large distances from each other. By capitalizing on these traits, the DPC algorithm effectively identifies cluster centers through the construction of a decision graph. In addition, DPC does not require iterative processes or excessive input parameters [14]. As a simple yet highly efficient density-based clustering technique, DPC has played an eminent role in diverse domains, including data mining, community exploration, genetic disease investigation, biology and other related areas [15,16,17,18,19].

    However, the DPC algorithm has a limitation in that it may inaccurately estimate the number of clusters when selecting cluster centers based on the decision graph [20]. In addition, determining the input parameter dc that yields satisfactory clustering performance requires prior knowledge [21]. In recent years, several approaches have been proposed to address these limitations. Xu et al. [22] utilized a linear fitting method based on the distribution of parameters to select all potential centers. Chen et al. [23] employed a linear regression model and residual analysis to automatically determine the cluster centers. Liu et al. [24] introduced the ADPC-KNN algorithm, which selects initial cluster centers and then aggregates density-reachable sub-clusters. Masud et al. [25] presented the I-nice algorithm, inspired by human observation of mountains during field exploration, to automatically detect the number of clusters and select their centers. d'Errico et al. [26] proposed that density peaks can be automatically identified using a point-adaptive k-nearest neighbor density estimator. Despite the theoretical and practical advantages of the algorithms mentioned above, they either introduce new parameters in order to obtain the exact number of clusters, or they are complex and not scalable. Consequently, the challenge of automatically obtaining the optimal number of clusters and a suitable parameter persists.

    To address the challenges faced by the DPC algorithm, we present an automatic density peaks clustering (ADPC) algorithm. First, based on the Silhouette Coefficient [27], a clustering index named the density-distance cluster (DDC) index is defined. Then, ADPC uses the DDC index to identify the correct number of clusters and select a suitable parameter automatically. The new features of the ADPC algorithm are (i) a novel DDC index is proposed that builds on the characteristics of the DPC algorithm while simultaneously fulfilling the general definition of clustering, so the index is specifically suited to DPC, and (ii) a suitable parameter dc is selected according to the optimal DDC value. Thus, ADPC detects the number of clusters automatically without any additional parameters.

    To summarize, the major contributions of our work are

    ● The proposed novel clustering validity index leverages both density and distance characteristics of cluster centers. Notably, this index is not only suitable for the DPC algorithm but also aligns harmoniously with the broader definition of clustering.

    ● An improved automatic density peaks clustering algorithm is proposed based on the DDC index, which automatically selects a suitable cut-off distance and determines the optimal number of clusters without the need for additional parameters.

    ● The experimental results validate the effectiveness of the ADPC algorithm in automatically determining the optimal number of clusters and cut-off distance.

    The remainder of this paper is structured as follows: Section 2 presents the fundamental principles of the DPC algorithm. Section 3 introduces the DDC index and details the enhanced ADPC algorithm. In Section 4, experiments are designed to demonstrate the efficiency of the ADPC algorithm, and a comparison with the DPC, AP and DBSCAN algorithms is conducted on diverse datasets. The paper concludes with a summary of the key findings and an outline of potential challenges for future research.

    This section introduces the major ideas of the Silhouette Coefficient and the DPC algorithm.

    Clustering algorithms are evaluated based on two significant factors: within-cluster similarity and between-cluster dissimilarity. The Silhouette Coefficient is a well-known clustering validity index that reflects the compactness and separation of clusters [27,28]. Its value ranges over [-1, 1], and the larger the index value, the better the clustering result.

    Assume that the dataset $X=\{x_1,x_2,\ldots,x_n\}$ has been divided into clusters $C=\{C_1,C_2,\ldots,C_k\}$ by a clustering algorithm. To calculate the total Silhouette Coefficient of the clustering result, we first calculate the Silhouette Coefficient of each sample point separately. First, the average distance between a point $x_i$ and the other points in the same cluster is calculated as

    $a(i)=\frac{1}{|C_i|-1}\sum_{x_j\in C_i,\, j\neq i} d(x_i,x_j),$ (2.1)

    where d(xi,xj) represents the Euclidean distance between xi and xj, and a(i) measures how well xi fits the cluster to which it is assigned. Second, calculate the minimum average distance from xi to the other clusters:

    $b(i)=\min_{m\neq i}\frac{1}{|C_m|}\sum_{x_m\in C_m} d(x_i,x_m),$ (2.2)

    where b(i) represents the dissimilarity between clusters. Third, the Silhouette Coefficient of xi is defined as

    $s(i)=\frac{b(i)-a(i)}{\max\{a(i),b(i)\}}.$ (2.3)

    It can be seen that the closer s(i) approaches 1, the better the compactness and separation of the clustering results. Finally, average the Silhouette Coefficient of all points to obtain the total Silhouette Coefficient of the clustering result

    $SC=\frac{1}{n}\sum_{i=1}^{n}s(i).$ (2.4)

    To calculate the Silhouette Coefficient of each sample point, O(|Ci|-1) and O(n-|Ci|) time is required to obtain a(i) and b(i), respectively. Each iteration must traverse all data points to obtain SC, so the distance between every pair of data points is calculated. Therefore, the total time complexity of the Silhouette Coefficient is O(n^2).
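    For concreteness, a minimal Python sketch of the computation in Eqs. (2.1)-(2.4) is given below. It is illustrative only: the function name and the use of NumPy/SciPy are our own choices and are not part of the original work.

    import numpy as np
    from scipy.spatial.distance import cdist

    def silhouette_coefficient(X, labels):
        # Plain O(n^2) evaluation of Eqs. (2.1)-(2.4).
        X, labels = np.asarray(X, dtype=float), np.asarray(labels)
        D = cdist(X, X)                        # pairwise Euclidean distances d(x_i, x_j)
        clusters = np.unique(labels)
        s = np.zeros(len(X))
        for i in range(len(X)):
            same = (labels == labels[i])
            if same.sum() == 1:                # singleton cluster: s(i) is left at 0
                continue
            a_i = D[i, same].sum() / (same.sum() - 1)           # Eq. (2.1)
            b_i = min(D[i, labels == m].mean()                  # Eq. (2.2)
                      for m in clusters if m != labels[i])
            s[i] = (b_i - a_i) / max(a_i, b_i)                  # Eq. (2.3)
        return s.mean()                                         # Eq. (2.4): total SC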

    The Silhouette Coefficient thus reflects the compactness of datasets within clusters and the separation between clusters. However, its computational complexity is high. Therefore, we design a new clustering index, DDC, based on the concept of the Silhouette Coefficient for DPC, which not only matches the density and distance characteristics of DPC but also conforms to the definition of clustering.

    The DPC algorithm is introduced as an efficient method for identifying cluster centers and creating arbitrary clusters [29]. It is a straightforward approach with substantial potential, leading to significant interest from the research community [30,31,32]. The algorithm is built upon two fundamental assumptions:

    Assumption 1. Cluster centers exhibit a higher density compared to their neighboring data points. The local density ρi is calculated for each data point as

    $\rho_i=\sum_{j}\chi(d_{ij}-d_c),\qquad \chi(x)=\begin{cases}1, & x<0,\\ 0, & x\geq 0,\end{cases}$ (2.5)

    where dij denotes the dissimilarity between objects xi and xj, computed using the Euclidean distance. The cut-off distance dc is a parameter of the DPC algorithm. It is defined as the value at the 2% position of the pairwise distances sorted in ascending order, and it serves as a threshold to determine the neighborhood of each data point. In addition, ρi can be obtained using a Gaussian kernel function when dealing with a small dataset:

    $\rho_i=\sum_{j}\exp\left(-\frac{d_{ij}^{2}}{d_c^{2}}\right).$ (2.6)

    Assumption 2. The distance between any two cluster centers is relatively large. For each data point, the algorithm calculates the minimum distance to any point of higher density:

    $\delta_i=\min_{j:\,\rho_j>\rho_i}(d_{ij}).$ (2.7)

    For the data point with the highest density, the distance is instead taken as

    $\delta_i=\max_{j}(d_{ij}).$ (2.8)

    Once the local density ρ and the minimum distance to higher-density neighbors δ are calculated for all data points, the DPC algorithm constructs a decision graph. Figure 1(a) visually represents this decision graph, where ρ is plotted on the abscissa (x-axis) and δ on the ordinate (y-axis). The decision graph provides valuable insights into the distribution of data points and helps in identifying cluster centers. Alternatively, the DPC algorithm can select the cluster centers based on Figure 1(b), where the decision graph is generated using the parameter γ, computed as follows:

    $\gamma_i=\rho_i\delta_i.$ (2.9)
    Figure 1.  Different methods of drawing the decision graph; (a) Decision graph based on ρ and δ; (b) Decision graph based on γ.

    Subsequently, in Figure 1(a) or Figure 1(b), data points with higher values of ρ and δ, or with larger values of γ, can be identified as cluster centers. These points are distinguishable as they are more separated from the remaining data points. In the figures, the colored objects represent the identified cluster centers. Finally, the DPC algorithm assigns each non-center point to its nearest neighbor with higher density [33]. This step ensures that each data point is allocated to the cluster represented by its nearest higher-density neighbor, thus completing the clustering process.
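    The steps above translate into a compact procedure. The sketch below is a minimal illustration of the cut-off-kernel variant (Eqs. (2.5), (2.7)-(2.9)) together with the nearest-higher-density assignment rule; the function name and data layout are our own, and it assumes the densest point ends up among the selected centers (it has the largest δ, so this normally holds).

    import numpy as np
    from scipy.spatial.distance import cdist

    def dpc(X, dc, k):
        # Minimal sketch of DPC: density (2.5), delta (2.7)/(2.8), gamma (2.9),
        # selection of k centers, and nearest-higher-density assignment.
        X = np.asarray(X, dtype=float)
        D = cdist(X, X)
        n = len(X)
        rho = (D < dc).sum(axis=1) - 1            # Eq. (2.5); Eq. (2.6) could be used instead
        order = np.argsort(-rho)                  # points sorted by decreasing density
        delta = np.zeros(n)
        nearest_higher = np.zeros(n, dtype=int)
        top = order[0]
        delta[top] = D[top].max()                 # Eq. (2.8): the densest point
        nearest_higher[top] = top
        for pos in range(1, n):
            i = order[pos]
            higher = order[:pos]                  # points of higher (or equal) density
            j = higher[np.argmin(D[i, higher])]
            delta[i] = D[i, j]                    # Eq. (2.7)
            nearest_higher[i] = j
        gamma = rho * delta                       # Eq. (2.9)
        centers = np.argsort(-gamma)[:k]          # k points with the largest gamma
        labels = np.full(n, -1)
        labels[centers] = np.arange(k)
        for i in order:                           # descending density, so the nearest
            if labels[i] == -1:                   # higher-density neighbor is already labeled
                labels[i] = labels[nearest_higher[i]]
        return rho, delta, gamma, labels, centers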

    It is worth noting that the DPC algorithm's key advantage lies in its decision graph, which plays a crucial role in identifying cluster centers based on the parameters ρ and δ, or relying solely on γ [34]. However, one limitation of the decision graph is that the cluster centers it identifies may not always be distinctly separate from the remaining data points, making it challenging to precisely define the boundaries of larger clusters [35]. Consequently, correctly determining cluster centers solely based on a decision graph can be a non-trivial task [36]. Additionally, the performance of the DPC algorithm can be influenced by the selection of the parameter dc. Imperfect choices of dc may fail to highlight the characteristics of cluster centers, leading to suboptimal clustering results [37]. The appropriate selection of dc is crucial for achieving accurate and meaningful cluster assignments.

    For example, Figure 2(a)–(f) show different decision graphs generated by the DPC algorithm on the Aggregation dataset, which consists of 7 clusters [13]. In Figure 2(a)–(c), the decision graphs are constructed using ρ and δ with dc=1, dc=2 and dc=4, respectively. Figure 2(d)–(f) demonstrate decision graphs drawn using γ with dc=1, dc=2 and dc=4, respectively. Figure 2(a) and (d) illustrate the challenge of accurately identifying the actual number of cluster centers, regardless of whether the decision graph is based on ρ and δ or on γ. The larger values of ρ, δ or γ can lead to ambiguity in determining the cluster centers. As observed in Figure 2(b) and (e), when dc=2, according to the DPC algorithm, 10 data points exhibit both larger ρ and δ values or have larger γ values. Consequently, misclassification can occur, resulting in the division of the Aggregation dataset into 10 clusters instead of the correct 7 clusters. Figure 2(c) and (f) demonstrate that by setting dc=4, seven cluster centers are identified, accurately capturing the underlying seven clusters in the Aggregation dataset. This emphasizes the need for an appropriate selection of the parameter dc in order to obtain satisfactory clustering results with the DPC algorithm. In summary, Figure 2(a)–(f) provide further evidence of the limitations of the DPC algorithm, as it struggles to automatically select the optimal number of cluster centers. Additionally, the selection of dc plays a critical role in achieving desirable clustering outcomes.

    Figure 2.  Decision graphs of DPC on Aggregation; (a) Decision graph based on ρ and δ with dc=1; (b) Decision graph based on ρ and δ with dc=2; (c) Decision graph based on ρ and δ with dc=4; (d) Decision graph based on γ with dc=1; (e) Decision graph based on γ with dc=2; (f) Decision graph based on γ with dc=4.

    To address the limitations mentioned above, this paper introduces an improved ADPC algorithm based on the DDC index. To obtain DDC values iteratively, the DPC algorithm is executed multiple times using different values of the parameter dc and varying the number of cluster centers. Finally, satisfactory clustering results can be achieved by identifying the optimal DDC index, along with selecting a suitable value for the parameter dc and determining the optimal number of clusters.

    Assessing within-cluster similarity and between-cluster dissimilarity is essential when evaluating clustering algorithms [38]. In the case of the DPC algorithm, the optimal partitioning of data should maximize within-cluster similarity while minimizing between-cluster similarity [39]. Furthermore, DPC is grounded on density and distance assumptions, wherein cluster centers exhibit higher local density and are relatively distant from each other. Consequently, to effectively capture within-cluster compactness and between-cluster separation while adhering to the characteristics of higher local density and larger distance, a novel clustering validity index, the density-distance cluster (DDC) index, is introduced.

    Let $X=\{x_1,x_2,\ldots,x_n\}$ represent the set of data samples. Assuming that the n samples are clustered into k clusters, we denote the cluster center of the ith cluster as ui. In the subsequent definitions, d(xi,xj) denotes the dissimilarity between xi and xj, computed using the Euclidean distance.

    Definition 1. We take the average similarity between each data sample of the ith cluster and its cluster center ui as the within-cluster similarity a(i):

    $a(i)=\frac{1}{|C_i|}\sum_{j=1}^{|C_i|}d(x_j,u_i),$ (3.1)

    where |Ci| represents the size of the ith cluster. The smaller a(i) is, the higher the local density is.

    Definition 2. The between-cluster similarity, denoted as b(i), is defined as the minimum similarity between the cluster center ui of the ith cluster and each cluster center um of other clusters. It can be calculated as:

    $b(i)=\min_{1\leq m\leq k,\, m\neq i} d(u_i,u_m).$ (3.2)

    The larger b(i) is, the larger the relative distance between cluster centers is.

    Definition 3. To maximize the within-cluster similarity while minimizing the between-cluster similarity, we define the DDC(i) of the ith cluster as follows:

    $DDC(i)=\frac{b(i)-a(i)}{\max(b(i),a(i))}.$ (3.3)

    The denominator is set to ensure that the DDC(i) ranges from -1 to 1.

    Definition 4. We define the average DDC(i) of the k clusters obtained as the DDC of this data partition:

    $DDC_k=\frac{1}{k}\sum_{i=1}^{k}DDC(i).$ (3.4)

    Definition 5. The number of clusters corresponding to the maximum average DDC value is determined as the optimal number of clusters, denoted as kbest:

    $k_{best}=\arg\max_{3\leq k<n}\{DDC_k\}.$ (3.5)
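    Definitions 1-4 map directly onto a short routine. The sketch below is illustrative only: the function name is our own, and it assumes the convention that the points with label i belong to the cluster whose center index is centers[i] (as produced by the dpc() sketch given earlier); Eq. (3.5) is handled by the driver loop shown after Algorithm 1.

    import numpy as np
    from scipy.spatial.distance import cdist

    def ddc_index(X, labels, centers):
        # Sketch of Eqs. (3.1)-(3.4): average density-distance cluster index.
        X, labels = np.asarray(X, dtype=float), np.asarray(labels)
        U = X[centers]                               # coordinates of the k cluster centers u_i
        center_dist = cdist(U, U)                    # d(u_i, u_m)
        k = len(centers)
        ddc = np.zeros(k)
        for i in range(k):
            members = X[labels == i]
            a_i = cdist(members, U[i:i + 1]).mean()  # Eq. (3.1): mean distance to own center
            b_i = np.delete(center_dist[i], i).min() # Eq. (3.2): nearest other center
            ddc[i] = (b_i - a_i) / max(b_i, a_i)     # Eq. (3.3)
        return ddc.mean()                            # Eq. (3.4): DDC_k of this partition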

    The DDC index is illustrated in Figure 3, which depicts the clustering of all data samples into three clusters: A, B and C. The cluster centers corresponding to these clusters are denoted as a, b and c, respectively. In addition, let's assume that there are eight data points in cluster A. Consequently, we can assess the within-cluster similarity of cluster A by computing the average distance between each data sample and the cluster center a.

    $a(A)=\frac{d(a,a_1)+d(a,a_2)+d(a,a_3)+d(a,a_4)+d(a,a_5)+d(a,a_6)+d(a,a_7)+d(a,a_8)}{8}.$ (3.6)
    Figure 3.  Illustration of DDC index.

    The between-cluster similarity of cluster A can be measured by determining the minimum similarity between the cluster center a and the other cluster centers b and c:

    $b(A)=\min(d(a,b),\,d(a,c)).$ (3.7)

    Then, we calculate DDC(A) for cluster A according to (3.3). In the same way, we obtain DDC(B) and DDC(C). The DDC of this partition is the average of DDC(A), DDC(B) and DDC(C) according to (3.4):

    $DDC_3=\frac{DDC(A)+DDC(B)+DDC(C)}{3}.$ (3.8)

    The DDC index is designed to maximize the within-cluster similarity while minimizing the between-cluster similarity by incorporating both the local density and the distance characteristics of DPC. In Figure 3, a(i) represents intra-class point aggregation, indicating how closely the points within a cluster are grouped together. A smaller value of a(i) signifies higher similarity among the points within the cluster, indicating a higher local density around that cluster center. This observation aligns with Assumption 1 of the DPC algorithm. On the other hand, b(i) represents inter-class distinctions, indicating the dissimilarity between objects in different clusters. A higher value of b(i) implies greater distinctiveness between the clusters, indicating that the cluster center is farther away from other centers. This aligns with the distance assumption of the DPC algorithm. In summary, the DDC index not only fulfills the objectives of the clustering algorithm by optimizing similarity and dissimilarity measures but also aligns with the two fundamental assumptions of the DPC algorithm. It can be considered a novel clustering index suitable for the DPC algorithm.

    Using the DDC index as a foundation, we propose the Automatic Density Peaks Clustering (ADPC) algorithm to enhance the DPC algorithm. Algorithm 1 summarizes the steps involved in the ADPC algorithm.

    Algorithm 1: ADPC algorithm
    Input: Dataset $X=\{x_1,x_2,\ldots,x_n\}$;
    Output: The optimal number of clusters, the optimal dc and the clustering result.
    Step 1: For k=3 to k=n:
    a: Select one unvisited value from dc ∈ {1, 2, 4};
    b: Use (2.5) to calculate ρi for each data sample;
    c: Use (2.7) and (2.8) to calculate δi for each data sample;
    d: Use (2.9) to calculate γi for each data sample;
    e: Sort γ in decreasing order;
    f: Take the points corresponding to the first k γ values as the cluster centers;
    g: Partition the dataset into k clusters;
    h: Use (3.3) to calculate the DDC value of each single cluster;
    i: Use (3.4) to calculate the average DDC value of the k clusters as the DDC of this clustering partition;
    j: If some value of dc has not yet been visited, go to a;
    Step 2: Use (3.5) to obtain the optimal number of clusters;
    Step 3: Output the suitable dc and the clustering result.

    The ADPC algorithm comprises three major processes. First, it takes the number of clusters k as an input parameter, allowing k to range from 3 to n, where n represents the number of data samples. Based on experience and the literature [13,30], DPC generally performs satisfactorily when dc=1, dc=2 or dc=4. To avoid increasing the algorithm's complexity, ADPC restricts dc to these same values used in DPC. The algorithm performs DPC iteratively with different combinations of k and dc, obtaining multiple clustering results. Second, after each clustering result is obtained from the DPC process, the DDC index is calculated for it. This index quantifies the within-cluster similarity and between-cluster dissimilarity, considering the density and distance characteristics, and allows for an objective evaluation of the quality of each clustering result. Finally, based on the calculated DDC values, the clustering result with the best DDC value is selected. This indicates the clustering result that achieves satisfactory performance, with an optimal number of clusters and a suitable dc value. By considering the DDC index, ADPC aims to find the clustering configuration that maximizes within-cluster similarity while minimizing between-cluster similarity.

    Through these three processes, the ADPC algorithm iteratively explores different combinations of k and dc, calculates the DDC index for each clustering result and selects the clustering configuration with the best DDC value. This approach helps in achieving satisfactory clustering performance with an optimal number of clusters and an appropriate dc value.
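    A minimal driver corresponding to Algorithm 1 is sketched below, reusing the dpc() and ddc_index() sketches given earlier. The cap k_max is an illustrative practical bound introduced here for the sketch; as stated in the text, the algorithm lets k grow from 3 toward n.

    import numpy as np

    def adpc(X, k_max, dc_values=(1, 2, 4)):
        # Sketch of the ADPC loop: run DPC for every (k, dc) pair (Step 1),
        # score each partition with the DDC index (Steps 1h-1i), and keep
        # the partition with the largest DDC value (Steps 2-3, Eq. (3.5)).
        best_score, best_result = -np.inf, None
        for k in range(3, k_max + 1):
            for dc in dc_values:
                rho, delta, gamma, labels, centers = dpc(X, dc, k)
                score = ddc_index(X, labels, centers)
                if score > best_score:
                    best_score = score
                    best_result = (k, dc, labels)
        return best_result, best_score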

    The primary principle of the ADPC algorithm is to iteratively search for the best DDC value. The DDC index is specifically designed to be applicable to the DPC algorithm, considering the local density and distance characteristics. The best DDC value represents the optimal performance of the clustering algorithm. This optimal performance corresponds to the ideal number of clusters and parameter values.

    Assume that |Ci| is the size of the ith cluster and k stands for the number of clusters. In the ADPC algorithm, the time complexity is determined by both the DDC and DPC processes. The major time-consuming tasks in DPC include constructing the similarity matrix and calculating density and distance; each operation has a time complexity of O(n^2), where n is the number of data samples. Therefore, the total time complexity of DPC is O(n^2). As for the DDC process, the time complexity is determined by both a(i) and b(i): it takes O(|Ci|) to compute a(i) and O(k-1) to compute b(i). Since k and |Ci| are typically much smaller than n (|Ci| ≪ n, k ≪ n), the time complexity of DDC is approximately O(k(|Ci|+k)), ensuring that the algorithm maintains efficiency.

    The ADPC algorithm achieves the optimal number of clusters using the novel DDC index and suitable dc automatically, surpassing the performance of DPC. Additionally, the ADPC algorithm provides satisfactory clustering results without the need for any parameters, while maintaining the same level of efficiency as DPC.

    The effectiveness of the proposed ADPC algorithm was demonstrated through extensive experimentation on a diverse set of datasets, including eight synthetic datasets, eight real-world datasets and the well-known Olivetti face dataset [40]. Table 1 provides a description of these datasets, including the number of clusters and the scale, which vary from small to large. To compare the performance of ADPC with DPC, we also applied a widely used AP algorithm [41] and the DBSCAN algorithm [42] on the UCI datasets and the Olivetti face dataset. These two algorithms, AP and DBSCAN, do not require prior determination of cluster centers, thus serving as benchmarks to validate the superiority of the proposed ADPC algorithm. Furthermore, we provide a discussion on the DDC index to demonstrate its applicability to the DPC algorithm.

    Table 1.  Characteristics of different datasets.
    Datasets Samples Attributes Clusters
    Aggregation 788 2 7
    D31 3100 2 31
    S 1765 2 5
    Twenty 1000 2 20
    Square 1000 2 4
    S1 5000 2 15
    A3 7500 2 50
    S3 5000 2 15
    Iris 150 4 3
    Seeds 210 7 3
    Waveform 5000 21 3
    Vertebral 310 6 3
    Soybean 47 35 4
    X8D5K 1000 8 5
    Leuk 72 39 3
    Wine 178 13 3
    Olivetti face 100 92×112 10


    The experiments were conducted on a desktop computer with a 3.10 GHz Intel Core i5 processor and 4 GB of RAM, running the macOS 10.14.6 operating system. The experiments were executed using MATLAB 2015 as the programming environment.

    Table 1 presents the different characteristics of the eight synthetic datasets used in the experiments. These datasets consist of clusters with various shapes and densities, allowing for a comprehensive evaluation of clustering algorithm performance. To showcase the effectiveness of our ADPC algorithm in achieving satisfactory clustering results with the optimal number of clusters and a suitable parameter, Table 2 presents the number of clusters achieved by different algorithms. Additionally, the value of dc obtained by ADPC is also included in Table 2.

    Table 2.  Number of clusters on synthetic datasets obtained by different algorithms.
    Datasets ADPC/dc DPC AP
    Aggregation 7/4 10 17
    D31 31/1 31 31
    S 5/4 6 27
    Twenty 20/4 20 20
    Square 4/4 4 20
    S1 15/1 15 24
    A3 50/1 32 50
    S3 15/1 15 53


    Given that the datasets are two-dimensional, visualizing the clustering results of the ADPC algorithm, along with the compared DPC, AP and DBSCAN algorithms, using colored plots would offer a more straightforward interpretation. This visualization method allows for a clearer understanding of the performance of each algorithm. Additionally, we have considered the references [13,41,42] to determine the parameters of the DPC, AP and DBSCAN algorithms. Through careful selection, we have chosen the optimal parameters for these algorithms to ensure fair and accurate comparisons in our experiments.

    As shown in Table 2, ADPC effectively determines the optimal number of clusters for the eight datasets. Additionally, ADPC can determine a suitable value for dc, rather than relying on the default parameter dc=2 used in DPC. If DPC is applied with dc=2 as described in reference [13], it fails to produce a reasonable number of clusters for the Aggregation, S and A3 datasets, as indicated by the decision graph. This means that DPC, relying on visual identification of cluster centers based on a decision graph, suffers from a significant limitation. Furthermore, the AP and DBSCAN algorithms only determine the optimal number of clusters for the Twenty dataset. Moreover, these algorithms require parameter adjustments and are sensitive to the chosen parameters [41,42]. As a result, ADPC outperforms DPC, AP and DBSCAN in terms of clustering results, as demonstrated in Figures 4–11.

    Figure 4.  Clustering results on Aggregation.
    Figure 5.  Clustering results on D31.
    Figure 6.  Clustering results on S.
    Figure 7.  Clustering results on Twenty.
    Figure 8.  Clustering results on Square.
    Figure 9.  Clustering results on S1.
    Figure 10.  Clustering results on A3.
    Figure 11.  Clustering results on S3.

    Figures 4–11 display the clustering performance of the ADPC, DPC, AP and DBSCAN algorithms on each of the synthetic datasets. In these figures, each color represents a distinct cluster. It is evident that the ADPC algorithm consistently achieves satisfactory clustering results across all eight synthetic datasets. On the other hand, the DPC algorithm fails to produce reasonable clustering results on the Aggregation, S and A3 datasets. Similarly, the AP and DBSCAN algorithms only exhibit good clustering performance on the Twenty dataset. These findings further validate the effectiveness of the ADPC algorithm, which utilizes the DDC index, as it consistently outperforms DPC, AP and DBSCAN in terms of clustering accuracy and robustness across diverse synthetic datasets.

    In this subsection, we present the results of applying the introduced ADPC algorithm on eight real-world datasets to demonstrate its superiority. Table 3 provides a comparison of the number of clusters obtained by the ADPC, DPC, AP and DBSCAN algorithms on these datasets ('-' means the algorithm either identifies only one cluster or fails to find any clusters).

    Table 3.  The number of clusters on the UCI datasets obtained through different algorithms.
    Datasets ADPC DPC AP DBSCAN
    Iris 3 2 6 2
    Seeds 3 3 11 -
    Waveform 3 2 139 -
    Vertebral 3 2 21 -
    Soybean 4 4 5 -
    X8D5K 5 5 14 2
    Leuk 3 2 6 3
    Wine 3 7 8 -


    Table 3 shows that the AP and DBSCAN algorithms are unable to determine the optimal number of clusters for the eight UCI datasets. Additionally, these algorithms require parameter adjustments and exhibit sensitivity to the chosen parameters. Likewise, DPC fails to obtain a reasonable number of clusters on the UCI datasets, with the exception of the Soybean, X8D5K and Seeds datasets. To gain deeper insights into the challenges faced by DPC in accurately identifying the number of clusters on these datasets, refer to Figure 12, which showcases the decision graphs produced by DPC for the same datasets. The decision graphs vividly demonstrate the difficulties and limitations encountered by DPC in cluster determination for these specific datasets.

    Figure 12.  Decision graphs of DPC on UCI datasets.

    We can see from Figure 12(b), (e), (g) and (h) that the cluster centers in these decision graphs are not clearly separated from the surrounding points. This lack of clear separation makes it difficult to accurately identify the correct number of cluster centers. Similarly, in Figure 12(a), (c) and (d), the correct number of cluster centers may be challenging to determine due to the use of dc=2 as described in reference [13]. The reliance of DPC on human-based selection represents a significant limitation. In contrast, the ADPC algorithm can automatically determine the optimal number of clusters on these eight UCI datasets and establish correct clusters without the need for any parameters. Consequently, the ADPC algorithm outperforms DPC, AP and DBSCAN in terms of clustering results, offering a more effective and reliable approach.

    To provide further evidence of the superiority of the introduced ADPC algorithm, we present a detailed comparison with the DPC algorithm in Table 4. The table presents the clustering results of ADPC and DPC on all eight UCI datasets, measured in terms of Accuracy (ACC) [43], Adjusted Mutual Information (AMI) [44] and Adjusted Rand Index (ARI) [45], together with the value of dc used and the number of clusters (NC) obtained.

    Table 4.  Clustering results of ADPC and DPC on UCI datasets.
    Datasets ADPC (ACC AMI ARI dc NC) DPC (ACC AMI ARI dc NC)
    Iris 0.9067 0.7960 0.7592 2 3 0.6667 1 0.5681 2 2
    Seeds 0.8952 0.6741 0.7170 1 3 0.8857 0.6926 0.7027 2 3
    Waveform 0.5794 0.3482 0.2962 1 3 0.582 0.4978 0.2422 2 2
    Vertebral 0.6387 0.2618 0.2850 4 3 0.4806 0.0054 -0.0022 2 2
    Soybean 0.8936 0.8061 0.7251 1 4 0.8936 0.8061 0.7251 2 4
    X8D5K 1 1 1 1 5 1 1 1 2 5
    Leuk 0.9583 0.8573 0.8809 1 3 0.7083 0.0781 0.5413 2 2
    Wine 0.7921 0.5534 0.5054 1 3 0.5393 0.3168 0.2802 2 7


    Table 4 indicates that ADPC generally outperforms DPC on most datasets, demonstrating superior results. ADPC has the ability to automatically adjust the value of dc to achieve optimal clustering outcomes. The only exception is observed on the Waveform dataset, where DPC performs slightly better than ADPC. This can be attributed to the fact that Waveform comprises three clusters, each occupying approximately 33% of the data. Additionally, ADPC yields results consistent with DPC on the Soybean and X8D5K datasets. This consistency arises from the fact that DPC produces the same outcomes regardless of whether dc=1 or dc=2. These findings further validate that our ADPC algorithm, based on the DDC index, can effectively determine the optimal number of clusters and adjust dc without the need for additional parameters.

    To further evaluate the performance of ADPC, the ADPC algorithm is tested on the famous Olivetti Face Database. The database consists of 40 subjects, each having 10 different images. For this evaluation, we utilized the first 100 images, dividing them into 10 clusters. To measure the similarity between each pair of images, we utilized the Structural Similarity Index (SSIM) [46], defined by formula (4.1)

    $SSIM(x,y)=\frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)}{(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)},$ (4.1)

    where c1 and c2 are constants introduced to maintain numerical stability, μx and μy represent the averages of x and y, σx² and σy² represent the variances of x and y, and σxy represents the covariance of x and y. Images were allocated to a cluster only if their distance was less than dc. Figure 13 depicts the cluster allocation results obtained through the ADPC algorithm on the Olivetti Face dataset, while Figure 14 shows the clustering outcomes achieved by the DPC algorithm on the same dataset. In both figures, images belonging to the same cluster are represented with the same color, and images displayed in gray indicate incorrect classifications.
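    For reference, the global form of Eq. (4.1) can be evaluated directly as sketched below. This is illustrative only: the constants c1 and c2 are placeholder values (the paper does not report which constants were used), and σx², σy² are taken as the image variances following the usual SSIM convention.

    import numpy as np

    def ssim_global(x, y, c1=1e-4, c2=9e-4):
        # Direct evaluation of Eq. (4.1) on two equal-sized grayscale images,
        # using global image statistics; c1 and c2 are illustrative constants.
        x = np.asarray(x, dtype=float).ravel()
        y = np.asarray(y, dtype=float).ravel()
        mu_x, mu_y = x.mean(), y.mean()
        var_x, var_y = x.var(), y.var()               # sigma_x^2, sigma_y^2
        cov_xy = ((x - mu_x) * (y - mu_y)).mean()     # sigma_xy
        return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
               ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))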

    Figure 13.  ADPC clustering on Olivetti Face Database (the first 100 images).
    Figure 14.  DPC clustering on Olivetti Face Database (the first 100 images).

    In accordance with reference [13], the optimal number of clusters for DPC on the Olivetti Face dataset is determined to be 7. Remarkably, ADPC has also successfully identified 7 subjects out of 10, aligning with the results achieved by the DPC algorithm. However, upon examining Figure 15, we observe that when relying on ρ and δ or γ=ρδ to draw decision graphs, the DPC algorithm fails to accurately select cluster centers. In contrast, although ADPC's accuracy may not be as high as that of DPC, it compensates for the limitations of DPC by automatically identifying clusters. Consequently, the ADPC algorithm improves upon the challenge faced by DPC in establishing clusters automatically. Through this comparison, it becomes evident that ADPC offers a valuable enhancement to DPC, enabling the automatic identification of clusters and addressing the shortcomings of DPC in this particular context.

    Figure 15.  DPC clustering on Olivetti Face Database (the first 100 images); (a) Decision graph based on ρ and δ; (b) Decision graph based on γ.

    In this section, we discuss the applicability of the DDC index to DPC. First, to verify the advantage of using the density and distance characteristics of the cluster centers, we plotted the clustering accuracy of DPC guided by the DDC index and by the Silhouette Coefficient, which rests on similar assumptions, on the two-dimensional datasets in Table 1, as shown in Figures 16 and 17. These visualizations provide a more direct performance comparison.

    Figure 16.  DPC clustering results based on the DDC index.
    Figure 17.  DPC clustering results based on the Silhouette Coefficient.

    From Figures 16 and 17, it is evident that the DDC index is more suitable for the DPC algorithm than the Silhouette Coefficient and achieves better clustering accuracy. This is because the DDC index not only considers within-cluster similarity and between-cluster dissimilarity, but also effectively utilizes the density and distance characteristics of cluster centers, which is more in line with the cluster center hypothesis of DPC. The Silhouette Coefficient cannot guide the DPC algorithm in finding the optimal number of clusters on Aggregation, S and S1. To further demonstrate the effectiveness of the DDC index, we present a comparison of the clustering time between the two indices in Figure 18. Both are run 10 times and the average results are reported.

    Figure 18.  Clustering time comparison.

    We can see from Figure 18 that the execution efficiency of DDC is higher because it does not need to calculate the similarity between all data points, and its complexity O(k(|Ci|+k)) is much smaller than the complexity O(n2) of the Silhouette Coefficient. Therefore, the DDC index is more suitable for DPC regarding clustering accuracy and efficiency, and its design is reasonable.

    We introduce a novel clustering validity index called the DDC index, specifically designed for the DPC algorithm. Building upon this index, we propose a new algorithm called ADPC. The ADPC algorithm aims to achieve desirable clustering results by determining the optimal number of clusters and identifying a suitable parameter. ADPC runs DPC iteratively over different numbers of clusters and parameter values, calculating the DDC value of each resulting partition. The DDC value serves as an indicator of the quality of the clustering result, with larger DDC values indicating better clustering outcomes.

    ADPC successfully addresses two significant limitations of DPC. First, it resolves the issue of inaccurate visual identification of cluster centers on the decision graph. Second, it tackles the problem of selecting an unsuitable parameter dc, which can negatively impact the clustering results. The DDC index, which constitutes the foundation of ADPC, consists of two essential evaluation factors for clustering algorithms: the within-cluster parameter and the between-cluster parameter. By incorporating these factors, the DDC index not only aligns with the assumptions of DPC but also remains consistent with the objectives of clustering algorithms. Experimental evaluations conducted on both synthetic and real-world datasets demonstrate that ADPC outperforms conventional DPC. ADPC not only automatically selects the optimal number of clusters but also determines a suitable value for the parameter dc. Notably, ADPC achieves this without the need for additional parameters.

    Indeed, while the ADPC and DPC algorithms have shown promising results in clustering tasks, they may still encounter challenges when applied to manifold datasets, such as S4-D and S5-B in the paper describing the original DP algorithm. These challenges arise due to the inherent complexities and intricacies present in such datasets. In order to improve the clustering performance on complex datasets, further exploration and research are necessary.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work is supported by the National Natural Science Foundation of China (No. 62206296) and the Key Research and Development Projects in Xuzhou (No. KC22142).

    All authors declare no conflicts of interest in this paper.



    [1] V. Prajapati, R. Meher, A robust analytical approach to the generalized Burgers-Fisher equation with fractional derivatives including singular and non-singular kernels, J. Ocean Eng. Sci., 2022. https://doi.org/10.1016/j.joes.2022.06.035 doi: 10.1016/j.joes.2022.06.035
    [2] L. Verma, R. Meher, Z. Avazzadeh, O. Nikan, Solution for generalized fuzzy fractional Kortewege-de Varies equation using a robust fuzzy double parametric approach, J. Ocean Eng. Sci., 2022. https://doi.org/10.1016/j.joes.2022.04.026 doi: 10.1016/j.joes.2022.04.026
    [3] M. Alqhtani, K. Saad, W. Weera, W. Hamanah, Analysis of the fractional-order local Poisson equation in fractal porous media, Symmetry, 14 (2022), 1323. https://doi.org/10.3390/sym14071323 doi: 10.3390/sym14071323
    [4] M. Al-Sawalha, R. Agarwal, R. Shah, O. Ababneh, W. Weera, A reliable way to deal with fractional-order equations that describe the unsteady flow of a polytropic gas, Mathematics, 10 (2022), 2293. https://doi.org/10.3390/math10132293 doi: 10.3390/math10132293
    [5] S. Mukhtar, S. Noor, The numerical investigation of a fractional-order multi-dimensional model of Navier-Stokes equation via novel techniques, Symmetry, 14 (2022), 1102. https://doi.org/10.3390/sym14061102 doi: 10.3390/sym14061102
    [6] N. Shah, Y. Hamed, K. Abualnaja, J. Chung, A. Khan, A comparative analysis of fractional-order Kaup-Kupershmidt equation within different operators, Symmetry, 14 (2022), 986. https://doi.org/10.3390/sym14050986 doi: 10.3390/sym14050986
    [7] N. Shah, H. Alyousef, S. El-Tantawy, J. Chung, Analytical investigation of fractional-order Korteweg-De-Vries-Type equations under Atangana-Baleanu-Caputo operator: Modeling nonlinear waves in a plasma and fluid, Symmetry, 14 (2022), 739. https://doi.org/10.3390/sym14040739 doi: 10.3390/sym14040739
    [8] N. Aljahdaly, A. Akgul, R. Shah, I. Mahariq, J. Kafle, A comparative analysis of the fractional-order coupled Korteweg-De Vries equations with the Mittag-Leffler law, J. Math., 2022 (2022), 1–30. https://doi.org/10.1155/2022/8876149 doi: 10.1155/2022/8876149
    [9] H. Ismael, H. Bulut, H. Baskonus, W-shaped surfaces to the nematic liquid crystals with three nonlinearity laws, Soft Comput., 25 (2020), 4513–4524. https://doi.org/10.1007/s00500-020-05459-6 doi: 10.1007/s00500-020-05459-6
    [10] M. Yavuz, T. Sulaiman, A. Yusuf, T. Abdeljawad, The Schrodinger-KdV equation of fractional order with Mittag-Leffler nonsingular kernel, Alex. Eng. J., 60 (2021), 2715–2724. https://doi.org/10.1016/j.aej.2021.01.009 doi: 10.1016/j.aej.2021.01.009
    [11] K. Safare, V. Betageri, D. Prakasha, P. Veeresha, S. Kumar, A mathematical analysis of ongoing outbreak COVID-19 in India through nonsingular derivative, Numer. Meth. Part. D. E., 37 (2020), 1282–1298. https://doi.org/10.1002/num.22579 doi: 10.1002/num.22579
    [12] H. Ismael, H. Baskonus, H. Bulut, Abundant novel solutions of the conformable Lakshmanan-Porsezian-Daniel model, Discrete Cont. Dyn.-S., 14 (2021), 2311. https://doi.org/10.3934/dcdss.2020398. doi: 10.3934/dcdss.2020398
    [13] H. Yasmin, N. Iqbal, A comparative study of the fractional coupled Burgers and Hirota-Satsuma KdV equations via analytical techniques, Symmetry, 14 (2022), 1364. https://doi.org/10.3390/sym14071364 doi: 10.3390/sym14071364
    [14] R. Alyusof, S. Alyusof, N. Iqbal, M. Arefin, Novel evaluation of the fractional acoustic wave model with the exponential-decay kernel, Complexity, 2022 (2022), 1–14. https://doi.org/10.1155/2022/9712388 doi: 10.1155/2022/9712388
    [15] M. Yavuz, T. Sulaiman, A. Yusuf, T. Abdeljawad, The Schrodinger-KdV equation of fractional order with Mittag-Leffler nonsingular kernel, Alex. Eng. J., 60 (2021), 2715–2724. https://doi.org/10.1016/j.aej.2021.01.009 doi: 10.1016/j.aej.2021.01.009
    [16] H. Triki, A. Biswas, Dark solitons for a generalized nonlinear Schrodinger equation with parabolic law and dual-power law nonlinearities, Math. Meth. Appl. Sci., 34 (2011), https://doi.org/10.1002/mma.1414 doi: 10.1002/mma.1414
    [17] L. Zhang, J. Si, New soliton and periodic solutions of (1+2)-dimensional nonlinear Schrodinger equation with dual-power law nonlinearity, Commun. Nonlinear Sci. Numer. Simul., 15 (2010), 2747–2754. https://doi.org/10.1016/j.cnsns.2009.10.028 doi: 10.1016/j.cnsns.2009.10.028
    [18] Q. Xu, S. Songhe, Numerical analysis of two local conservative methods for two-dimensional nonlinear Schrodinger equation, SCIENTIA SINICA Math., 48 (2017), 345. https://doi.org/10.1360/scm-2016-0308 doi: 10.1360/scm-2016-0308
    [19] J. C. Bronski, L. D. Carr, B. Deconinck, J. N. Kutz, Bose-Einstein condensates in standing waves: The cubic nonlinear Schrödinger equation with a periodic potential, Phys. Rev. Lett., 86 (2001), 1402–1405. https://doi.org/10.1103/PhysRevLett.86.1402 doi: 10.1103/PhysRevLett.86.1402
    [20] A. Trombettoni, A. Smerzi, Discrete solitons and breathers with dilute Bose-Einstein condensates, Phys. Rev. Lett., 86 (2001), 2353–2356. https://doi.org/10.1103/physrevlett.86.2353 doi: 10.1103/physrevlett.86.2353
    [21] A. Biswas, Quasi-stationary non-Kerr law optical solitons, Opt. Fiber Technol., 9 (2003), 224–259. https://doi.org/10.1016/s1068-5200(03)00044-0 doi: 10.1016/s1068-5200(03)00044-0
    [22] M. Eslami, M. Mirzazadeh, Topological 1-soliton solution of nonlinear Schrödinger equation with dual-power law nonlinearity in nonlinear optical fibers, Eur. Phys. J. Plus, 128 (2013). https://doi.org/10.1140/epjp/i2013-13140-y doi: 10.1140/epjp/i2013-13140-y
    [23] A. Hyder, The influence of the differential conformable operators through modern exact solutions of the double Schrödinger-Boussinesq system, Phys. Scripta, 2021. https://doi.org/10.1088/1402-4896/ac169f doi: 10.1088/1402-4896/ac169f
    [24] G. Adomian, R. Rach, Inversion of nonlinear stochastic operators, J. Math. Anal. Appl., 91 (1983), 39–46. https://doi.org/10.1016/0022-247x(83)90090-2 doi: 10.1016/0022-247x(83)90090-2
    [25] S. Alinhac, M. Baouendi, A counterexample to strong uniqueness for partial differential equations of Schrodinger's type, Commun. Part. Diff. Eq., 19 (1994), 1727–1733. https://doi.org/10.1080/03605309408821069 doi: 10.1080/03605309408821069
    [26] J. H. He, Homotopy perturbation technique, Comput. Meth. Appl. Mech. Eng., 178 (1999), 257–262. https://doi.org/10.1016/s0045-7825(99)00018-3 doi: 10.1016/s0045-7825(99)00018-3
    [27] J. H. He, A coupling method of homotopy technique and perturbation technique for nonlinear problems, Int. J. Non-Linear Mech., 35 (2000), 743. https://doi.org/10.1016/s0020-7462(98)00085-7 doi: 10.1016/s0020-7462(98)00085-7
    [28] D. D. Ganji, M. Rafei, Solitary wave solutions for a generalized Hirota Satsuma coupled KdV equation by homotopy perturbation method, Phys. Lett. A, 356 (2006), 131–137. https://doi.org/10.1016/j.physleta.2006.03.039 doi: 10.1016/j.physleta.2006.03.039
    [29] A. M. Siddiqui, R. Mahmood, Q. K. Ghori, Homotopy perturbation method for thin film flow of a fourth grade fluid down a vertical cylinder, Phys. Lett. A, 352 (2006), 404–410. https://doi.org/10.1016/j.physleta.2005.12.033 doi: 10.1016/j.physleta.2005.12.033
    [30] Y. Qin, A. Khan, I. Ali, M. Al Qurashi, H. Khan, R. Shah, D. Baleanu, An efficient analytical approach for the solution of certain fractional-order dynamical systems, Energies, 13 (2020), 2725. https://doi.org/10.3390/en13112725 doi: 10.3390/en13112725
    [31] M. Alaoui, R. Fayyaz, A. Khan, R. Shah, M. Abdo, Analytical investigation of Noyes-Field model for time-fractional Belousov-Zhabotinsky reaction, Complexity, 2021 (2021), 1–21. https://doi.org/10.1155/2021/3248376 doi: 10.1155/2021/3248376
    [32] J. H. He, Y. O. El-Dib, Homotopy perturbation method for Fangzhu oscillator, J. Math. Chem., 58 (2020), 2245–2253. https://doi.org/10.1007/s10910-020-01167-6 doi: 10.1007/s10910-020-01167-6
    [33] J. Honggang, Z. Yanmin, Conformable double Laplace-Sumudu transform decomposition method for fractional partial differential equations, Complexity, 2022 (2022), 1–8. https://doi.org/10.1155/2022/7602254 doi: 10.1155/2022/7602254
    [34] L. Akinyemi, O. S. Iyiola, Exact and approximate solutions of time-fractional models arising from physics via Shehu transform, Math. Meth. Appl. Sci., 43 (2020), 7442–7464. https://doi.org/10.1002/mma.6484 doi: 10.1002/mma.6484
    [35] L. Ali, R. Shah, W. Weera, Fractional view analysis of Cahn-Allen equations by new iterative transform method, Fractal Fract., 6 (2022), 293. https://doi.org/10.3390/fractalfract6060293 doi: 10.3390/fractalfract6060293
    [36] S. Maitama, W. Zhao, Homotopy perturbation Shehu transform method for solving fractional models arising in applied sciences, J. Appl. Math. Comput. Mech., 20 (2021), 71–82. https://doi.org/10.17512/jamcm.2021.1.07 doi: 10.17512/jamcm.2021.1.07
    [37] M. Liaqat, A. Khan, M. Alam, M. Pandit, S. Etemad, S. Rezapour, Approximate and Closed-Form solutions of Newell-Whitehead-Segel eEquations via modified conformable Shehu transform decomposition method, Math. Probl. Eng., 2022 (2022), 1–14. https://doi.org/10.1155/2022/6752455 doi: 10.1155/2022/6752455
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
