This study aimed to assess the prevalence of herbal therapy use and beliefs toward this type of therapy among patients with diabetes. It also aimed to identify significant predictors of these beliefs and the factors that increase the likelihood of using herbal therapy. A descriptive cross-sectional design was used with a convenience sample of 310 patients with diabetes. Sixty-seven participants (21.6%) used herbal therapy. The mean beliefs score was 3.72 (possible range: 0–12). Linear regression showed that beliefs were significantly predicted by self-care, attending workshops, education level, and number of complications. Logistic regression showed that lower self-care and stronger beliefs increased the likelihood of using herbal therapy. Informing patients through individualized diabetes education influences their beliefs and promotes self-care. Such an education program should mainly target patients with low self-care, lower educational level, and more complications.
Citation: Besher Gharaibeh, Loai Tawalbeh. Beliefs and Practices of Patients with Diabetes toward the Use of Herbal Therapy[J]. AIMS Public Health, 2017, 4(6): 650-664. doi: 10.3934/publichealth.2017.6.650
1.
Introduction
Enhancers are non-coding DNA fragments that are responsible for regulating gene expression in both transcription and translation and in the production of RNA and proteins [1]. Unlike promoters, which are proximal elements of a gene, enhancers are distal elements that can be located up to 20 kb upstream or downstream of a gene, or even on a different chromosome [2]. Such locational variation makes the identification of enhancers challenging. Moreover, genetic variation in enhancers has been shown to be associated with many human illnesses, such as cancers [3,4], disorders [4,5] and inflammatory bowel disease [6]. Genome-wide studies of histone modifications have shown that enhancers are a large group of functional elements with many different subgroups, such as strong and weak enhancers, and poised and inactive enhancers [7]. Because enhancers of different subgroups have different biological activities, understanding enhancers and their subgroups is an important task, especially the identification of enhancers and their strength.
Due to the importance of enhancers in genomics and disease, the identification of enhancers and their strength has become a popular topic in biological research. Pioneering work carried out purely with experimental techniques includes chromatin immunoprecipitation followed by deep sequencing [8,9,10], DNase I hypersensitivity [11] and genome-wide mapping of histone modifications [12,13,14,15,16]. However, experimental methods are expensive, time-consuming and of low accuracy. Therefore, several computational methods have been developed to rapidly identify enhancers and their strength in genomes. In 2016, Liu et al. [2] developed a two-layer predictor, iEnhancer-2L, the first computational model for identifying not only enhancers but also their strength, using pseudo k-tuple nucleotide composition. In the same year, Jia et al. [17] proposed the EnhancerPred model, fusing bi-profile Bayes and pseudo-nucleotide composition as multiple features with a two-step wrapper for feature selection, to distinguish enhancers from non-enhancers and to determine enhancers' strength. In 2018, Liu et al. [18] established the iEnhancer-EL model for identifying enhancers and their strength with an ensemble learning approach. In 2019, Nguyen et al. [19] put forward the iEnhancer-ECNN model to identify enhancers and their strength using ensembles of convolutional neural networks. In the same year, Tan et al. [20] used an ensemble of deep recurrent neural networks to identify enhancers via dinucleotide physicochemical properties. Le et al. [21] developed the iEnhancer-5Step model to identify enhancers and their strength using hidden information of DNA sequences via Chou's 5-step rule and word embedding. In 2021, Basith et al. [22] proposed the Enhancer-IF model, an integrative machine learning (ML)-based framework for identifying cell-specific enhancers. In the same year, Cai et al.
[23] established the iEnhancer-XG model, using XGBoost as the base classifier and k-spectrum profile, mismatch k-tuple, subsequence profile, position-specific scoring matrix and pseudo dinucleotide composition as feature extraction methods. Le et al. [24] used a transformer architecture based on BERT and a 2D convolutional neural network to identify DNA enhancers. Lim et al. [25] proposed the iEnhancer-RF model to identify enhancers and their strength by enhanced feature representation using random forest. However, the stability of these models still needs to be improved, especially for distinguishing strong enhancers from weak enhancers.
In this study, we focus on developing a novel model named iEnhancer-MFGBDT to identify enhancers and their strength. Its first layer identifies whether a DNA sequence sample is an enhancer or not, while its second layer classifies an identified enhancer as strong or weak. We fuse k-mer and reverse complementary k-mer nucleotide composition based on the DNA sequence, together with second-order moving average, normalized Moreau-Broto auto-cross correlation and Moran auto-cross correlation based on the dinucleotide physical structural property matrix, as multiple features, obtaining a 902-dimensional feature vector for each enhancer sequence. Then, the gradient boosting decision tree (GBDT) algorithm is adopted both as the feature selection strategy and as the classifier. The accuracies for enhancers and their strength on the benchmark dataset with 10-fold cross-validation are 78.67% and 66.04%, respectively, and on the independent dataset 77.50% and 68.50%, respectively. The experimental results indicate that our model improves the accuracy of identifying enhancers and their strength, and is a useful supplementary tool.
2.
Materials and methods
2.1. Datasets
To facilitate comparison, in this study we adopt the benchmark dataset S constructed by Liu et al. [2], who obtained 2968 enhancer and non-enhancer sequences of 200 bp each; the dataset can be formulated as
S = S+ ∪ S−,  S+ = S+strong ∪ S+weak,
(1)
where S+ contains 1484 enhancer sequences, S− contains 1484 non-enhancer sequences, S+strong contains 742 strong enhancer sequences, and S+weak contains 742 weak enhancer sequences; none of the enhancer DNA sequences has a pairwise sequence similarity of more than 80% with any other.
2.2. Feature extraction
Suppose that a DNA enhancer sequence D with L nucleic acid residues is expressed by
D = B1B2B3⋯BL,
(2)
where Bi denotes the i-th nucleic acid residue of the DNA sequence at sequence position i. In this study, 902 features are extracted by fusing k-mer nucleotide composition, reverse complementary k-mer, second-order moving average, Moreau-Broto auto-cross correlation, and Moran auto-cross correlation based on the dinucleotide property matrix.
2.2.1. K-mer nucleotide composition
K-mer nucleotide composition is a basic feature extraction approach widely used in different fields of bioinformatics [26,27,28,29]. For an enhancer sequence with L nucleotides, the k-mer nucleotide compositions involve all possible subsequences of length k of the enhancer sequence. We slide along the enhancer sequence with a step size of one nucleotide using a sliding window. When a subsequence of the enhancer sequence matches the i-th k-mer, the occurrence number of that k-mer is denoted by ni. The occurrence frequency fi of the i-th k-mer can be expressed by
fi = ni / (L − k + 1).
(3)
For each k, we can obtain 4^k k-mer features. Here we let k = 1, 2, 3; finally, each enhancer sequence yields a 4^1 + 4^2 + 4^3 = 84-dimensional k-mer feature vector.
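As a concrete illustration, the k-mer frequency vector of Eq. (3) can be computed with the short Python sketch below; the function name and interface are ours, not from the paper.

```python
from itertools import product

def kmer_features(seq, ks=(1, 2, 3)):
    """Frequency of every k-mer, Eq. (3): f_i = n_i / (L - k + 1)."""
    feats = []
    for k in ks:
        kmers = ["".join(p) for p in product("ACGT", repeat=k)]
        windows = len(seq) - k + 1          # number of length-k subsequences
        counts = {km: 0 for km in kmers}
        for i in range(windows):            # slide with step size 1
            sub = seq[i:i + k]
            if sub in counts:
                counts[sub] += 1
        feats.extend(counts[km] / windows for km in kmers)
    return feats

vec = kmer_features("ACGTACGT")
print(len(vec))  # 4 + 16 + 64 = 84 features
```

For k = 1 the frequencies of A, C, G and T sum to 1, matching the normalization in Eq. (3).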
2.2.2. Reverse complementary k-mer
The reverse complementary k-mer, abbreviated RevKmer, is a variant of the basic k-mer in which the k-mers are not expected to be strand-specific, so reverse complements are collapsed into a single feature. For example, when k = 2 there are 16 basic k-mers in total, but after collapsing reverse complements only 10 distinct dinucleotides (AA, AC, AG, AT, CA, CC, CG, GA, GC and TA) are retained; in other words, we obtain 10 reverse complementary 2-mer features. Letting k = 1, 2, 3, a total of 2 + 10 + 32 = 44 RevKmer features are extracted, which can be calculated with the web server Pse-in-One 2.0 [30].
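The collapsing of reverse complements can be sketched as follows (helper names are ours); the counts reproduce the 2 + 10 + 32 = 44 dimensions stated above.

```python
from itertools import product

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(kmer):
    """Reverse complement of a k-mer."""
    return kmer.translate(COMP)[::-1]

def revkmer_keys(k):
    """Collapse each k-mer with its reverse complement into one canonical key."""
    keys = set()
    for p in product("ACGT", repeat=k):
        km = "".join(p)
        keys.add(min(km, revcomp(km)))  # lexicographically smaller one represents the pair
    return sorted(keys)

# 2 keys for k=1, 10 for k=2, 32 for k=3 -> 44 RevKmer features in total
print([len(revkmer_keys(k)) for k in (1, 2, 3)])  # [2, 10, 32]
```

Note that for k = 2 the four palindromic dinucleotides (AT, TA, CG, GC) are their own reverse complements, which is why 16 dimers collapse to 10 rather than 8.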
2.2.3. Second-order moving average based on dinucleotide property matrix
As has been reported, DNA physicochemical properties play a crucial role in gene expression regulation and genome analysis, and are also closely correlated with functional non-coding elements [31,32,33]. In this study, six dinucleotide physical structural properties are adopted, including three local translational parameters (shift, slide and rise) and three local angular parameters (twist, tilt and roll) [34]. The values of the six DNA dinucleotide physical structural properties are shown in Table 1. Each physical structural property is normalized to reduce bias and noise by the following formula
(P − Pmin) / (Pmax − Pmin),
(4)
Table 1.
The original values of the six physical structural properties for the 16 dinucleotides in DNA.
where P is the original value of the property, Pmin and Pmax are the minimum and the maximum property values, respectively.
A DNA sequence is a polymer of the four nucleotides A, C, G and T. Any combination of two nucleotides is called a dinucleotide; hence, there are 4 × 4 = 16 basic dinucleotides in total. First, each dinucleotide in a DNA sequence is replaced by the value of a physical structural property. Then, each DNA sequence in the datasets can be converted into a matrix P = (pi,j)(L−1)×6, named the dinucleotide property matrix (DPM), where L represents the number of nucleic acid residues in the DNA sequence, and pi,j represents the value of the i-th dinucleotide for the j-th physical structural property.
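Constructing the DPM can be sketched as below. The property values here are illustrative placeholders only, not the actual Table 1 values; in practice each dinucleotide would map to its six normalized structural parameters.

```python
from itertools import product

# Placeholder normalized property values (NOT the real Table 1 values):
# each dinucleotide gets six illustrative numbers in [0, 1] for demonstration.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]
PROPS = {d: [((i + j) % 16) / 15 for j in range(6)] for i, d in enumerate(DINUCS)}

def dpm(seq):
    """Convert a DNA sequence of length L into an (L-1) x 6 dinucleotide property matrix."""
    return [PROPS[seq[i:i + 2]] for i in range(len(seq) - 1)]

M = dpm("ACGTAC")
print(len(M), len(M[0]))  # (L-1) rows, 6 columns -> 5 6
```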
The second-order moving average (SOMA) algorithm was proposed by Alessio et al. [35]; it is defined by fusing the ideas of the moving average and the second-order difference, and mainly investigates the long-range correlation properties of a stochastic time series.
Let a discrete stochastic time series be y(i),i=1,2,⋯,L, where L is the size of the stochastic series y(i). The algorithm of the SOMA is described as follows
Step 1. Calculate the moving average ỹn(i) of the time series y(i) as
ỹn(i) = (1/n) ∑_{k=0}^{n−1} y(i − k),
(5)
where n is the moving average window. When n → 1, ỹn(i) → y(i).
Step 2. For a given moving average window n, 2 ⩽ n < L, the second-order difference between y(i) and ỹn(i) is defined by
σ²MA = (1/(L − n)) ∑_{i=n}^{L} [y(i) − ỹn(i)]²,
(6)
where σ²MA gives a systematic analysis of the properties of y(i) with respect to ỹn(i); hence σ²MA is called the second-order moving average.
A dinucleotide property matrix contains six columns, each of which is considered a time series; in other words, a dinucleotide property matrix contains six time series. Hence, each enhancer DNA sequence is represented by 6 SOMA features for a given moving average window n. Letting n = 2, 3, ⋯, 10, we construct a 6 × 9 = 54-dimensional SOMA-DPM feature vector for each enhancer sequence.
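A minimal Python sketch of Eqs. (5)–(6) for one time series and one window (the function name is ours):

```python
def soma(series, n):
    """Second-order moving average sigma^2_MA, Eqs. (5)-(6), for window n."""
    L = len(series)
    total = 0.0
    for i in range(n - 1, L):  # 0-based positions where the backward window fits
        ma = sum(series[i - k] for k in range(n)) / n  # Eq. (5): backward moving average
        total += (series[i] - ma) ** 2
    return total / (L - n)  # Eq. (6)

print(soma([0, 1, 0, 1], 2))  # 0.375
```

Applying this to each of the six DPM columns for n = 2, ..., 10 yields the 54 SOMA-DPM features described above; a constant series gives σ²MA = 0, as expected.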
2.2.4. Moreau-Broto auto-cross correlation based on dinucleotide property matrix
Normalized Moreau-Broto auto-cross correlation (NMBACC) [36] based on the dinucleotide property matrix, which extracts global sequence information, can be described by
NMBACC(s, t, λ) = (1/(L − 1 − λ)) ∑_{i=1}^{L−1−λ} Pi,s Pi+λ,t,
(7)
where λ is the lag of the auto-cross correlation along the columns of the dinucleotide property matrix, Pi,s represents the value at the i-th row of the s-th column (s-th property index), and Pi+λ,t represents the value at the (i+λ)-th row of the t-th column (t-th property index). When s = t, NMBACC(s, s, λ) represents the auto-correlation of the same property; when s ≠ t, NMBACC(s, t, λ) represents the cross-correlation between different properties. Here we let λ = 1, 2, 3, ⋯, 10; finally, each enhancer sequence yields a 6 × 6 × 10 = 360-dimensional NMBACC-DPM feature vector.
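Assuming the standard normalized Moreau-Broto form (the mean of the products Pi,s·Pi+λ,t over the overlapping rows — a sketch, since the exact normalization is not reproduced here), the NMBACC features for one lag can be written as:

```python
def nmbacc(P, lam):
    """Normalized Moreau-Broto auto/cross correlations over a DPM for one lag.
    P has (L-1) rows and 6 columns; returns one value per (s, t) property pair,
    assuming NMBACC(s, t, lam) = mean_i(P[i][s] * P[i + lam][t])."""
    n = len(P)         # number of rows, i.e. L - 1
    cols = len(P[0])
    feats = []
    for s in range(cols):
        for t in range(cols):
            acc = sum(P[i][s] * P[i + lam][t] for i in range(n - lam)) / (n - lam)
            feats.append(acc)
    return feats

print(nmbacc([[1, 2], [3, 4], [5, 6]], 1))  # [9.0, 11.0, 13.0, 16.0]
```

Collecting these values for λ = 1, ..., 10 over the 6 × 6 property pairs gives the 360 NMBACC-DPM features.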
2.2.5. Moran auto-cross correlation based on dinucleotide property matrix
Moran auto-cross correlation (MACC) [37] based on the dinucleotide property matrix, which extracts global sequence information, can be described by
MACC(s, t, λ) = [(1/(L − 1 − λ)) ∑_{i=1}^{L−1−λ} (pi,s − p̄s)(pi+λ,t − p̄t)] / [(1/(L − 1)) ∑_{i=1}^{L−1} (pi,s − p̄s)²],
(8)
where λ is the lag along the columns of the dinucleotide property matrix, pi,s and pi,t represent the values at the i-th row of the s-th and t-th columns, respectively, pi+λ,t represents the value at the (i+λ)-th row of the t-th column, and p̄s and p̄t are the average values of the s-th and t-th columns, respectively. When s = t, MACC(s, s, λ) represents the auto-correlation of the same property; when s ≠ t, MACC(s, t, λ) represents the cross-correlation between different properties. Here we let λ = 1, 2, 3, ⋯, 10; finally, each enhancer sequence yields a 6 × 6 × 10 = 360-dimensional MACC-DPM feature vector.
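Similarly, a Moran-style auto-cross correlation for one lag can be sketched as follows, assuming the classical mean-centred Moran normalization (a reconstruction; the paper's exact variant may differ):

```python
def macc(P, lam):
    """Moran auto/cross correlations over a DPM for one lag, assuming the
    classical form: mean-centred covariance at lag lam, divided by the
    variance of the s-th column."""
    n, cols = len(P), len(P[0])
    means = [sum(P[i][s] for i in range(n)) / n for s in range(cols)]
    feats = []
    for s in range(cols):
        for t in range(cols):
            num = sum((P[i][s] - means[s]) * (P[i + lam][t] - means[t])
                      for i in range(n - lam)) / (n - lam)
            den = sum((P[i][s] - means[s]) ** 2 for i in range(n)) / n
            feats.append(num / den if den else 0.0)
    return feats

print(macc([[1], [2], [1], [2]], 1))  # perfectly anti-correlated at lag 1 -> [-1.0]
```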
2.3. Gradient boosting decision tree
The gradient boosting decision tree (GBDT) is a boosting algorithm that uses decision trees as base learners; it was proposed by Friedman in 2001 [38,39]. At each iteration it builds a decision tree that reduces the residual of the current model along the gradient direction, then linearly combines that tree with the current model to obtain a new model. GBDT repeats this iteration until the number of decision trees reaches the specified value, yielding the final strong learner. GBDT is commonly used for regression, classification and feature selection. Its advantages include: (a) it flexibly handles various types of data, both continuous and discrete; (b) it has powerful predictive and generalization ability; (c) it has good interpretability and robustness, can automatically discover high-order relationships between features, and does not require data normalization or other preprocessing.
The GBDT classification algorithm process is as follows
Input: training dataset D={(x1,y1),(x2,y2),⋯,(xm,ym)}. Suppose that the maximum iteration number is T, the loss function is L(y,f(x)), and m is the number of samples.
(1) Initialize the weak classifier as follows
f0(x) = argmin_c ∑_{i=1}^{m} L(yi, c),
(9)
where c is the constant value that minimizes the loss function; that is, f0(x) is a tree with only one root node.
(2) For t=1 to T
a. For i = 1 to m, calculate the negative gradient as follows
rti = −[∂L(yi, f(xi)) / ∂f(xi)]_{f(x) = ft−1(x)} = yi / (1 + exp(yi ft−1(xi))),
(10)
where the loss function is L(y, f(x)) = log(1 + exp(−y f(x))), y ∈ {−1, 1}.
b. Use (xi, rti), i = 1, 2, ⋯, m to fit a CART regression tree, obtaining the t-th regression tree with leaf node regions Rtj, j = 1, 2, ⋯, J, where J is the number of leaf nodes of tree t.
c. For leaf node area j=1,2,⋯,J, calculate the best residual fitting value as follows
ctj = argmin_c ∑_{xi∈Rtj} log[1 + exp(−yi(ft−1(xi) + c))].
(11)
As the above equation is difficult to optimize, ctj is generally replaced by the approximate value as
ctj = (∑_{xi∈Rtj} rti) / (∑_{xi∈Rtj} |rti|(1 − |rti|)).
(12)
d. Update the strong classifier by
ft(x) = ft−1(x) + ∑_{j=1}^{J} ctj I(x ∈ Rtj).
(13)
(3) Get the final strong classifier f(x) by
f(x) = fT(x) = ∑_{t=1}^{T} ∑_{j=1}^{J} ctj I(x ∈ Rtj).
(14)
Output: fT(x).
GBDT can be used not only for classification but also for feature selection, by computing the Gini importance of each feature. Features are ranked in descending order of Gini importance, and the top k features can be selected as needed. In this study, we adopt GBDT for both feature selection and classification.
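The two roles of GBDT described above (Gini-importance-based feature selection followed by classification) can be sketched with scikit-learn. The data here is synthetic stand-in data, and GradientBoostingClassifier's impurity-based feature_importances_ is assumed as the Gini importance ranking; this is an illustrative pipeline, not the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score

# Stand-in data in place of the real 902-dimensional feature vectors.
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

# Step 1: rank features by Gini importance and keep those above the mean ("mean" threshold).
selector = SelectFromModel(GradientBoostingClassifier(random_state=0), threshold="mean")
X_sel = selector.fit_transform(X, y)
print(X_sel.shape[1], "features kept out of", X.shape[1])

# Step 2: 10-fold cross-validation with a GBDT classifier on the selected features.
scores = cross_val_score(GradientBoostingClassifier(random_state=0), X_sel, y, cv=10)
print("mean accuracy: %.3f" % scores.mean())
```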
2.4. Cross-validation and performance assessment
To save computational time, 10-fold cross-validation is carried out for each feature set to evaluate the identification performance in this study. The dataset is randomly divided into ten subsets of approximately equal size, so the ratio of the testing set to the training set is 1:9. Each subset is in turn taken as the test set while the remaining nine subsets are used to train the GBDT classifier, and the average of the ten validation results is used for performance evaluation. The k-fold cross-validation approach improves the reliability of evaluation because all of the original data are used and each subset is tested exactly once.
To make an objective and comprehensive evaluation, we employ several performance measures [40,41,42,43], including sensitivity (Sn), specificity (Sp), accuracy (Acc) and the Matthews correlation coefficient (MCC). The MCC value ranges from −1 to 1, while the other three measures range from 0 to 1. They can be formulated as
Sn = 1 − N+−/N+,
Sp = 1 − N−+/N−,
Acc = 1 − (N+− + N−+)/(N+ + N−),
MCC = [1 − (N+−/N+ + N−+/N−)] / √[(1 + (N−+ − N+−)/N+)(1 + (N+− − N−+)/N−)],
where N+ represents the total number of true enhancer sequences investigated, N+− represents the number of true enhancer sequences incorrectly identified as non-enhancer sequences, N− represents the total number of non-enhancer sequences investigated, and N−+ represents the number of non-enhancer sequences incorrectly identified as enhancer sequences.
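Using the counts defined above, the four measures can be computed directly (function and argument names are ours); this MCC formulation is algebraically equivalent to the usual confusion-matrix form.

```python
import math

def metrics(n_pos, n_pos_wrong, n_neg, n_neg_wrong):
    """Sn, Sp, Acc and MCC from the counts defined above:
    n_pos_wrong = true enhancers called non-enhancers (N+-),
    n_neg_wrong = non-enhancers called enhancers (N-+)."""
    sn = 1 - n_pos_wrong / n_pos
    sp = 1 - n_neg_wrong / n_neg
    acc = 1 - (n_pos_wrong + n_neg_wrong) / (n_pos + n_neg)
    mcc = (1 - (n_pos_wrong / n_pos + n_neg_wrong / n_neg)) / math.sqrt(
        (1 + (n_neg_wrong - n_pos_wrong) / n_pos)
        * (1 + (n_pos_wrong - n_neg_wrong) / n_neg))
    return sn, sp, acc, mcc

sn, sp, acc, mcc = metrics(100, 20, 100, 10)
print(round(sn, 2), round(sp, 2), round(acc, 2), round(mcc, 4))  # 0.8 0.9 0.85 0.7035
```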
We also employ the receiver operating characteristic (ROC) curve [44] and the area under the ROC curve (AUC) [45] to evaluate our model. The ROC curve plots the true positive rate (sensitivity) as a function of the false positive rate (1 − specificity) over all possible thresholds. The closer the ROC curve is to the upper left corner, the better the identification performance; in other words, the closer the AUC is to 1, the better the identification system.
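A minimal sketch of computing the ROC curve and AUC with scikit-learn, using made-up labels and scores for illustration:

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0]                     # 1 = enhancer, 0 = non-enhancer
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1]   # classifier probabilities

# fpr/tpr trace the ROC curve over all thresholds; AUC summarizes it in one number.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC = %.3f" % roc_auc_score(y_true, y_score))  # AUC = 0.750
```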
3.
Results and discussion
3.1. Identification performance on the benchmark dataset
Identifying enhancers is a binary classification problem that can be divided into two layers: the first layer identifies whether a DNA sequence is an enhancer or not, while the second layer classifies an identified enhancer sequence as strong or weak. In this study, a novel model, iEnhancer-MFGBDT, is proposed using multiple features and a gradient boosting decision tree. Firstly, 902 features are extracted for each enhancer sequence for both layers, comprising 84 k-mer features, 44 RevKmer features, 54 SOMA-DPM features, 360 NMBACC-DPM features and 360 MACC-DPM features. Next, 156 features for the first layer and 263 features for the second layer are selected from the 902 features with the GBDT algorithm via the Gini index. Finally, the GBDT classifier is adopted to perform classification using 10-fold cross-validation. Figure 1 shows the workflow of the iEnhancer-MFGBDT model.
Figure 1.
The flowchart of the iEnhancer-MFGBDT model.
Identification results of our iEnhancer-MFGBDT model with 10-fold cross-validation on the benchmark dataset are shown in Table 2. From Table 2, we can see that the accuracy reaches 78.67% and 66.04% for the first and second layers, respectively. Meanwhile, the values of Sn, Sp and MCC reach 77.54%, 79.78% and 0.5735 for the first layer, and 70.56%, 61.63% and 0.3232 for the second layer. The AUC indicates the probability that the model ranks a randomly selected positive sample higher than a randomly selected negative sample; in fact, the AUC can measure the overall performance of a given identification system. The ROC curves for both the first and second layers are plotted in Figure 2. The AUC values on the benchmark dataset are 0.8615 and 0.7187 for the first and second layers, respectively. Obviously, the second layer is more difficult to identify than the first, owing to the positional variation and free scattering of enhancers.
Table 2.
The identification performance of iEnhancer-MFGBDT with 10-fold cross validation on the benchmark dataset.
3.2. Feature group analysis
In this study, we adopt five different approaches to extract features from the benchmark dataset, referred to as the K-mer, RevKmer, SOMA-DPM, NMBACC-DPM and MACC-DPM feature groups. To gauge the importance of each single feature group, we calculate the performance of each group separately, as shown in Table 3. The accuracy of every single feature group is lower than that of the multiple features after GBDT feature selection (MGBDT) for both layers; therefore, the fusion of multiple features is necessary. From Table 3, we can see that for the first layer the best-performing group is K-mer, followed successively by RevKmer, NMBACC-DPM and SOMA-DPM, with MACC-DPM the lowest; for the second layer the best-performing group is RevKmer, followed successively by K-mer, SOMA-DPM and MACC-DPM, with NMBACC-DPM the lowest. Among these five feature groups, K-mer and RevKmer are based on the DNA sequence, while SOMA-DPM, NMBACC-DPM and MACC-DPM are based on the physical structural properties of DNA dinucleotides. Evidently, the sequence-based feature groups outperform the property-based ones.
Table 3.
Feature group analysis of iEnhancer-MFGBDT with 10-fold cross validation on the benchmark dataset.
3.3. Comparison with and without feature selection
We construct 902 features by fusing multiple feature groups; such a high dimension can degrade predictive performance, burden the computation and introduce information redundancy. Feature selection helps the classification system achieve better predictive performance at a lower computational cost by removing redundant features; hence, finding a suitable dimension-reduction method is very important. In GBDT, features are ranked in descending order of Gini importance; here, we use "mean" as the threshold and "gini" as the criterion for feature selection. Figure 3 shows the accuracy comparison between our model with and without feature selection. The accuracies are clearly improved for both layers on the benchmark dataset, which shows that the GBDT feature selection method has a great effect on improving accuracy: the accuracy is improved by 1.35% and 5.87% for the first and second layers, respectively. These experimental results show that GBDT feature selection is very effective for the benchmark dataset.
Figure 3.
Identification accuracy comparison with and without feature selection on the benchmark dataset.
To demonstrate the superiority of the GBDT classifier, support vector machine (SVM), extra trees (ET), random forest (RF) and Bagging classifiers were tested successively on the features selected by GBDT, using 10-fold cross-validation. As shown in Figure 4, the identification accuracies of SVM, ET, RF and Bagging reach 75.64%, 77.02%, 77.15% and 76.75% for the first layer, and 60.04%, 62.47%, 65.02% and 64.75% for the second layer, respectively, while the accuracy of GBDT reaches 78.67% and 66.04% for the first and second layers, respectively. Thus, the accuracies of SVM, ET, RF and Bagging are all lower than those obtained by GBDT for both layers. The results show that GBDT is more powerful on our benchmark dataset than the other classifiers.
Figure 4.
Identification accuracy comparison with different classifiers.
To avoid experimental errors, it is persuasive to use an independent dataset to evaluate our model objectively. We adopt the independent dataset also constructed by Liu et al. [2], which contains 400 sequences of 200 bp: 100 strong enhancer sequences, 100 weak enhancer sequences and 200 non-enhancer sequences, with pairwise sequence similarity of no more than 80%. The results obtained by the proposed model using 10-fold cross-validation on the independent dataset are given in Table 4. For the first layer, the Acc, Sn, Sp, MCC and AUC reach 77.50%, 76.79%, 79.55%, 0.5607 and 0.8589, respectively. For the second layer, they reach 68.50%, 72.55%, 66.81%, 0.3862 and 0.7524, respectively. These values further illustrate the effectiveness of our model.
Table 4.
The identification performance of iEnhancer-MFGBDT with 10-fold cross validation on the independent dataset.
The proposed iEnhancer-MFGBDT model is compared with eight state-of-the-art models: iEnhancer-2L [2], EnhancerPred [17], iEnhancer-EL [18], iEnhancer-ECNN [19], Tan et al. [20], iEnhancer-XG [23], BERT-2D CNNs [24] and iEnhancer-RF [25]. The values of Acc, Sn, Sp and MCC are listed in Tables 5 and 6.
Table 5.
The comparison with other methods in identifying enhancers and their strength on the benchmark dataset.
For the benchmark dataset, the iEnhancer-2L, EnhancerPred, iEnhancer-EL, Tan et al., iEnhancer-XG and iEnhancer-RF models are adopted for comparison for both layers; their values of Acc, Sn, Sp and MCC are listed in Table 5. Among these models, the accuracy of our model is lower than that of iEnhancer-XG for both layers, but the stability of our model is higher. The accuracy of our model is 1.78%, 5.49%, 0.64%, 3.84% and 2.49% higher than that of the iEnhancer-2L, EnhancerPred, iEnhancer-EL, Tan et al. and iEnhancer-RF models for the first layer, respectively, and 4.11%, 3.98%, 1.01%, 7.08% and 3.51% higher for the second layer, respectively. As shown in Table 5, in terms of Sn, Sp and MCC our model delivers the best overall performance and is the most stable.
For the independent dataset, the iEnhancer-2L, EnhancerPred, iEnhancer-EL, iEnhancer-ECNN, Tan et al., iEnhancer-XG and BERT-2D CNNs models are adopted for comparison for the first layer; their values of Acc, Sn, Sp and MCC are listed in Table 6, and our accuracy is improved by 0.6%–4.5%. For the second layer, the iEnhancer-2L, EnhancerPred, iEnhancer-EL, iEnhancer-ECNN, Tan et al. and iEnhancer-XG models are adopted for comparison, and our accuracy is improved by 0.01%–13.5%. The test results again show that iEnhancer-MFGBDT performs best on the independent dataset; our model achieves remarkably better results than the existing models and makes a considerable improvement in performance.
4.
Conclusions
In this study, an effective computational tool called iEnhancer-MFGBDT has been developed for the identification of DNA enhancers and their strength. The model is established by fusing multiple features with GBDT and evaluated by 10-fold cross-validation. Compared with existing models, it obtains satisfactory accuracies for the first and second layers on both the benchmark and independent datasets. It is anticipated that iEnhancer-MFGBDT will become a very useful high-throughput tool for enhancer research or, at the least, play an important complementary role to the existing models. As pointed out by Chou and Shen [46], user-friendly and publicly accessible web servers represent the future direction for developing practically useful computational tools and have an increasing impact on medical science [47]. In the future, we will make great efforts to establish a web server for the iEnhancer-MFGBDT model to facilitate communication among colleagues in bioinformatics.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 12101480), the Natural Science Basic Research Program of Shaanxi (Nos. 2021JM-115 and 2021JM-444), and the Fundamental Research Funds for the Central Universities (No. JB210715).
Conflict of interest
The authors declare no conflict of interest.