
Hyperspectral Image (HSI) refers to the simultaneous imaging of a target area in dozens to hundreds of continuous spectral bands [1,2,3,4]. It effectively integrates the spatial and spectral information of the imaging scene, providing strong target detection ability and good material identification ability [5,6,7,8]. It is widely used in agriculture and forestry, geological exploration, marine exploration, environmental monitoring and other fields [9,10,11,12]. However, HSI is characterized by high data dimensionality, large information redundancy and high correlation between bands, which brings great difficulties to its processing and classification [13,14,15,16,17,18,19,20,21,22,23]. Therefore, how to reduce the redundant information of the data, extract the features of hyperspectral images effectively and realize their accurate classification are hot and difficult issues in current hyperspectral image processing and classification research.
Sample labeling of hyperspectral images often requires expert knowledge and experience, so the cost of sample labeling is high [24,25,26]. When labeled samples are limited, semi-supervised learning can exploit the useful information in the unlabeled samples for model training and reduce the labeling cost [27,28]. In the field of machine learning, semi-supervised learning acquires knowledge and experience from a small number of labeled samples and mines usable information from a large number of unlabeled samples, which can improve the classification accuracy [29,30,31]. Therefore, many scholars have studied semi-supervised learning for remote sensing images. Camps-Valls et al. [32] proposed a graph-based hyperspectral image classification method that constructs the graph structure by a graph method, integrates contextual information through composite kernels and uses the Nyström method to speed up classification. Yang et al. [33] proposed a semi-supervised band selection technique for hyperspectral image classification: a metric learning method is used to measure the features of hyperspectral images, and a semi-supervised learning method selects a subset of effective bands from the original bands. Tan et al. [34] proposed a hyperspectral image classification method based on segmentation integration and a semi-supervised support vector machine, in which the spatial information of the labeled samples is extracted by a segmentation algorithm to filter the samples, and classification is carried out with semi-supervised learning. Samiappan et al. [35] combined active learning and co-training for semi-supervised classification of hyperspectral images: an initial classification model is trained on the labeled samples, heuristic active learning is performed on the unlabeled samples, the labeled samples combined with the original data are divided into views, and the unlabeled samples with high heuristic values are added to the training samples. Zhang et al. [36] used a semi-supervised classification method based on hierarchical segmentation and active learning to extract spatial information from hyperspectral images; the training set is updated iteratively using the information of a large number of unlabeled samples to complete the classification. In addition, other methods have also been proposed in recent years [37,38,39,40,41,42,43,44,45].
In hyperspectral images, each pixel corresponds to a spectral curve that reflects its inherent physical, chemical and optical properties. The main idea of hyperspectral image classification is to use the feature information of different pixels to label the pixels belonging to different ground objects and obtain the corresponding classification maps [46,47]. Therefore, many scholars have studied hyperspectral image classification. Melgani et al. [48] proposed a hyperspectral image classification method based on support vector machines (SVM), in which a kernel function is introduced to solve the nonlinearly separable problem and avoid the curse of dimensionality. Ratle et al. [49] introduced neural networks into hyperspectral image classification; in the training phase, the loss function is optimized to avoid local optima. Chen et al. [50] constructed a hyperspectral image classification model based on sparse representation and compared the classification results with common machine learning methods. To address the shortcomings of sparse representation in dealing with nonlinear problems, Chen et al. [51] proposed a kernel sparse representation technique. In addition, Cui et al. [52] proposed a multiscale sparse representation algorithm for robust hyperspectral image classification, in which automatic and adaptive weight allocation schemes based on the spectral angle ratio are incorporated into a multi-classifier framework to fuse sparse representation information at all scales. Tang et al. [53] proposed two manifold-based sparse representation algorithms to solve the instability problem of sparse algorithms; regularization and local invariance techniques are used, and two manifold-based regularization terms are merged into the l1-based objective function. Wang et al. [54] applied the neighborhood-cutting technique to sparse representation and combined it with a joint spatial-spectral sparse representation classification method. Wang and Celik [55] improved the classification accuracy of hyperspectral images by combining context information in the sparse coefficient domain. Hu et al. [56] proposed two weighted kernel joint sparse representation methods, which determine the weights by calculating the kernel similarity between adjacent pixels. Xue et al. [57] presented two novel sparse graph regularization methods, SGR and SGR with total variation. Yang et al. [58] studied the effect of the p-norm distance metric on the minimum distance technique and proposed a supervised-learning p-norm distance metric to optimize the value of p. Zhang et al. [59] proposed a multi-scale dense network for HSI classification that makes full use of different scale information in the network structure and combines scale information throughout the network. Liu et al. [60] proposed a class-wise adversarial adaptation method in conjunction with the class-wise probability MMD as a class-wise distribution adaptation network. Wang et al. [61] proposed graph-based semi-supervised learning with weighted features for HSI classification. In addition, other new methods have also been proposed by many researchers [62,63,64,65,66,67,68,69,70,71].
To sum up, hyperspectral images contain rich spectral and spatial information about earth surface features, which increases the difficulty of processing and analysis. In addition, the training samples of actual hyperspectral images are limited and sample labeling is costly. In this paper, the local binary pattern, sparse representation and multiple logistic regression model are used. A new hyperspectral image feature extraction method based on the local binary pattern is proposed to obtain the texture features of hyperspectral images and enrich the sample information. A sample selection strategy based on active learning is designed to determine the unlabeled samples to be considered. On this basis, a new sample labeling method based on neighborhood information and priority classifier discrimination is studied in depth to expand the training samples, and a novel classification method based on texture features and semi-supervised learning is studied to improve the classification accuracy of remote sensing images.
The main contributions of this paper are described as follows.
(ⅰ) A novel classification method of hyperspectral remote sensing images based on texture features and semi-supervised learning is proposed by using the local binary pattern, sparse representation and multiple logistic regression.
(ⅱ) The local binary pattern is used to effectively extract the features of spatial texture information of remote sensing images and enrich the feature information of samples.
(ⅲ) A multiple logistic regression model is used to optimally select unlabeled samples, which are then labeled by using neighborhood information and priority classifier discrimination to achieve pseudo-labeling of unlabeled samples.
(ⅳ) A novel classification method of hyperspectral remote sensing images based on semi-supervised learning is proposed to effectively achieve accurate classification of hyperspectral images.
In this section, the local binary pattern (LBP) and sparse representation are introduced in order to clearly describe the basic theory behind these methods.
LBP is a feature extraction method that can extract the spatial texture information of images. Texture, which is widely used in image processing and image analysis, represents the slow or periodic change of the surface structure of an object [72]. LBP is also widely used in feature extraction of hyperspectral images due to its simple structure and easy calculation. Given the center pixel gc(xc, yc), the neighborhood pixels gp are described as follows.
$g_p = \left( x_c + R\cos\dfrac{2\pi p}{P},\; y_c - R\sin\dfrac{2\pi p}{P} \right)$  (1)
where gp (p = 0, 1, ..., P−1) represents the coordinates of the P pixels uniformly distributed on the circle with gc as the center and R as the radius. The quantized texture feature of one region is shown in Figure 1.
$LBP_{P,R}(g_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p}$  (2)
$s(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}$  (3)
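As a concrete illustration of Eqs (1)–(3), the following is a minimal sketch (not the implementation used in this paper) of the basic circular LBP operator for a single band; for simplicity, the sampling coordinates are rounded to the nearest pixel rather than bilinearly interpolated, and border pixels are skipped.

```python
import numpy as np

def lbp_band(band, P=8, R=1.0):
    """Basic circular LBP (Eqs (1)-(3)) for one image band (simplified sketch)."""
    H, W = band.shape
    codes = np.zeros((H, W), dtype=np.int32)
    # Sampling offsets on the circle of radius R around the center pixel (Eq (1))
    angles = 2.0 * np.pi * np.arange(P) / P
    dx = np.rint(R * np.cos(angles)).astype(int)
    dy = np.rint(-R * np.sin(angles)).astype(int)
    r = int(np.ceil(R))
    for y in range(r, H - r):
        for x in range(r, W - r):
            gc = band[y, x]
            code = 0
            for p in range(P):
                gp = band[y + dy[p], x + dx[p]]
                code += (1 if gp - gc > 0 else 0) << p   # s(g_p - g_c) * 2^p
            codes[y, x] = code
    return codes
```

In practice, an equivalent and much faster operator is available as `skimage.feature.local_binary_pattern`, and the per-pixel codes are usually summarized as local histograms before being used as texture features.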
Sparse representation means that a signal can be approximately represented by a linear combination of the atoms in a dictionary. Let X = [X_1, X_2, ..., X_C] be the dictionary built from HSI training pixels, where each pixel lies in R^D and D is the number of image bands. Here, X_i = [x_{i1}, x_{i2}, ..., x_{iN_c}] ∈ R^{D×N_c}, where N_c represents the number of samples in class i.
Then a test sample y of class i can be approximated as follows.
$y \approx x_{i1}\alpha_1 + x_{i2}\alpha_2 + \cdots + x_{iN_c}\alpha_{N_c} = [x_{i1}, x_{i2}, \ldots, x_{iN_c}][\alpha_1, \alpha_2, \ldots, \alpha_{N_c}]^{T} = X_i\alpha_i$  (4)
where X_i represents the sparse sub-dictionary of the samples in class i, and α_i represents the sparse vector of the test sample y, which contains only a few non-zero values.
In order to obtain the sparsest vector α_i, the following optimization problem is solved.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_0, \quad \text{s.t.}\; y = A\alpha_i$  (5)
where ‖·‖_0 is the l0 norm, which counts the number of non-zero atoms in the vector and is also known as the sparsity, and A is the sparse dictionary. Solving Eq (5) directly is an NP-hard problem. Under certain conditions, the l0 minimization problem can be relaxed and approximated by the following l1 minimization problem.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_1, \quad \text{s.t.}\; y = A\alpha_i$  (6)
Furthermore, the solution can be converted to the following expression.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_0, \quad \text{s.t.}\; \|A\alpha_i - y\|_2^2 < \varepsilon$  (7)
where ε represents the reconstruction error tolerance. The orthogonal matching pursuit (OMP) algorithm is used to solve expression (7). After the sparse coefficients are calculated, the reconstruction residual of the test sample y for each class can be computed.
$r_i(y) = \|y - A\tilde{\alpha}_i\|_2$  (8)
where i ∈ {1, 2, ..., C}.
Finally, the reconstruction residuals of all sub-dictionaries are compared, and y is assigned to the class with the minimum residual.
$\text{class}(y) = \arg\min_{i = 1, 2, \ldots, C} r_i(y)$  (9)
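The decision rule in Eqs (4)–(9) can be sketched as follows. This is an illustrative simplification (not the authors' code): it solves a separate OMP problem per class sub-dictionary instead of one joint problem over the full dictionary A, and the sparsity level `n_nonzero_coefs` is an assumed parameter.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_classify(y, class_dicts, n_nonzero_coefs=10):
    """Sparse-representation classification (Eqs (4)-(9)).

    class_dicts: list of per-class sub-dictionaries X_i of shape (D, N_c),
                 whose columns are (normalized) training pixels of class i.
    Returns the index (into class_dicts) of the class with the smallest
    reconstruction residual.
    """
    residuals = []
    for X_i in class_dicts:
        omp = OrthogonalMatchingPursuit(
            n_nonzero_coefs=min(n_nonzero_coefs, X_i.shape[1]),
            fit_intercept=False)
        omp.fit(X_i, y)                                      # approximate Eq (7)
        alpha_i = omp.coef_                                   # sparse coefficients
        residuals.append(np.linalg.norm(y - X_i @ alpha_i))  # residual, Eq (8)
    return int(np.argmin(residuals))                          # decision rule, Eq (9)
```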
In hyperspectral remote sensing image classification, the attribute characteristics of each ground object follow a specific distribution. If the number of training samples is not large enough to depict that distribution, the subsequent classification results will suffer. In practical applications there are difficulties such as few labeled samples and manual labeling that is hard, time-consuming and labor-intensive. Therefore, a new sample labeling method based on neighborhood information and priority classifier discrimination is proposed to obtain learned pseudo-labeled samples, in order to expand the labeled samples and improve the classification accuracy of the model.
Before the samples are labeled, sample selection is required, because labeling all unlabeled samples directly would cost a large amount of computation time. Moreover, due to the small number of initially labeled samples and the limited available information, it is difficult to label some samples with sufficient accuracy, and mislabeled samples obviously harm the classification accuracy of the model. The main objective of the sample selection strategy is to select the unlabeled samples that carry the largest amount of information; after labeling, these samples form a valuable training set that effectively promotes the improvement of the classification results. Therefore, a sample selection method based on multiple logistic regression is proposed to realize the selection of samples.
The class-probability matrix p(y_i = k | x_i) of each sample produced by the multiple logistic regression model contains a large amount of information that can be mined. The multiple logistic regression classifier is modeled as a discriminative Bayesian decision model. According to the generalized linear model, it can be written as follows.
$P(y; \delta) = b(y)\exp\left(\delta^{T} T(y) - a(\delta)\right)$  (10)
The specific expression of multiple logistic regression is described as follow.
$p(y_i = k \mid x_i, \eta) = \dfrac{\exp\left(\eta_k g(x_i)\right)}{\sum_{k'=1}^{N} \exp\left(\eta_{k'} g(x_i)\right)}$  (11)
where g(x) = [g_1(x), g_2(x), …, g_f(x)]^T is the feature vector of the input and η = [η_1^T, η_2^T, …, η_k^T]^T represents the regression parameter vector of the classifier. It is worth noting that the feature vector is often constructed by introducing a kernel, which not only improves the separability but also yields a better classifier from the training samples. Generally, the kernel function is the radial basis function (RBF), described as follows.
$K(x_m, x_n) = \exp\left(-\dfrac{\|x_m - x_n\|^2}{2p^2}\right)$  (12)
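Since the paper does not spell out how the feature vector g(x) in Eq (11) is built, the following sketch assumes the common choice of RBF-kernel similarities (Eq (12)) to a set of labeled anchor pixels, and evaluates the class-probability matrix with a softmax; the function names and the width parameter p are illustrative.

```python
import numpy as np

def rbf_features(X, anchors, p=1.0):
    """g(x): RBF-kernel similarities of each pixel to the anchor pixels, Eq (12)."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * p ** 2))

def mlr_probabilities(G, eta):
    """p(y_i = k | x_i, eta) via a softmax over classes, Eq (11).

    G:   (n_samples, n_features) kernel feature matrix.
    eta: (n_classes, n_features) regression parameters.
    """
    scores = G @ eta.T
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)
```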
After the feature vector is determined, only the regression parameter η of the model remains to be determined, and then the probability matrix p(y_i = k | x_i) of each unlabeled sample belonging to each class can be computed. The amount of information of a sample can be measured by Breaking Ties (BT) or Least Confidence (LC). In this paper, BT is selected to determine the amount of information.
BT measures the similarity between the two most likely classes by comparing the difference between the maximum class probability and the second-largest class probability. The smaller the difference, the greater the similarity between the two classes, the greater the uncertainty and the greater the amount of information. S_i denotes this similarity score, which is described as follows.
$S_i = \max_k p(y_i = k \mid x_i) - \operatorname{secondmax}_k\, p(y_i = k \mid x_i)$  (13)
The values S_i are finally sorted in ascending order. p(y_i = k | x_i) is the probability matrix of each sample obtained by the multiple logistic regression classifier; max_k p(y_i = k | x_i) is the largest value in the probability matrix and secondmax_k p(y_i = k | x_i) is the second largest value, where k indexes the classes (regression parameters).
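Given the class-probability matrix of the unlabeled pool (for example, from the multiple logistic regression classifier above, or from scikit-learn's `LogisticRegression` via `predict_proba`), the BT criterion in Eq (13) reduces to a few lines; the selection size of 200 matches the experiments below but is otherwise a free parameter.

```python
import numpy as np

def breaking_ties_select(proba, n_select=200):
    """Select the most informative unlabeled samples by Breaking Ties (Eq (13)).

    proba: (n_unlabeled, n_classes) class-probability matrix p(y_i = k | x_i).
    Returns the indices of the n_select samples with the smallest BT score.
    """
    part = np.sort(proba, axis=1)       # per-row probabilities in ascending order
    s = part[:, -1] - part[:, -2]       # max minus second max, Eq (13)
    return np.argsort(s)[:n_select]     # smallest S_i = most informative
```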
The features of hyperspectral remote sensing images are spatially correlated: the closer two ground objects are, the stronger the correlation. In sample labeling research, the spatial neighborhood information of training samples is widely used. However, because the central pixel is unknown and the available information is limited, the neighborhood information of unlabeled samples is used relatively rarely. Generally, the label of each pixel of a hyperspectral image is consistent with the label of at least one pixel in its neighborhood. This property can be used to label the unlabeled samples: the label information of the training samples around an unlabeled sample can be used to discriminate its label. The labeling discrimination method based on neighborhood information is centered on the sample to be labeled; the unlabeled sample is examined within a block window, and all labels occurring in the window are recorded as its neighborhood information. The labeled samples are used to train a classifier, which then classifies the unlabeled samples. If the label predicted by the classifier appears in the neighborhood information of the unlabeled sample, it is accepted as the sample label; otherwise, the sample is returned to the unlabeled set. One important question is whether the unlabeled samples that satisfy the neighborhood condition can be reliably labeled by a single classifier. Some studies use multiple classifiers to discriminate jointly and achieve good classification results. However, it remains unclear how to determine the label when the labels predicted by the classifiers are inconsistent but all of them appear in the neighborhood information of the unlabeled sample.
Therefore, a sample labeling method based on neighborhood information and priority classifier discrimination is proposed in this paper. For unlabeled samples with neighborhood information, the classifier with the highest priority is used first for prediction. If the predicted label appears in the neighborhood information, the label is accepted. Otherwise, the classifier with the next lower priority is used for prediction, and this is repeated until the label can be determined or the sample labeling ends. The sample labeling process based on neighborhood information and priority classifier discrimination is shown in Figure 2.
The sample labeling method is an iterative process. Although it cannot guarantee that there are enough training samples around every unlabeled sample at the initial stage, it can guarantee that some unlabeled samples have sufficient neighborhood information. Those samples are labeled and added to the training set, which grows with each iteration. Unlabeled samples whose neighborhood training samples are initially insufficient may satisfy the labeling condition in a later iteration. This labeling-with-replacement scheme ensures the sample labeling accuracy to a certain extent and improves the performance of the classifier.
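The labeling rule described above can be summarized by the following sketch, an illustrative reading of Figure 2 with hypothetical helper names: classifiers are queried in priority order, and a prediction is accepted only if it also occurs among the labels of already-labeled pixels inside the block window around the candidate (label 0 is assumed here to mean "unlabeled").

```python
import numpy as np

def pseudo_label(candidates, coords, label_map, classifiers, window):
    """Label candidates by neighborhood information + priority classifiers.

    candidates:  (n, d) feature vectors of the selected unlabeled samples.
    coords:      (n, 2) row/col positions of those samples in the image.
    label_map:   2-D array with the currently known labels (0 = unlabeled).
    classifiers: list of fitted classifiers, ordered by priority (highest first).
    window:      side length of the block window (e.g., 7, 20 or 25).
    Returns (indices, labels) of the samples that could be pseudo-labeled.
    """
    half = window // 2
    H, W = label_map.shape
    kept_idx, kept_lab = [], []
    for i, (r, c) in enumerate(coords):
        r0, r1 = max(r - half, 0), min(r + half + 1, H)
        c0, c1 = max(c - half, 0), min(c + half + 1, W)
        neighborhood = set(np.unique(label_map[r0:r1, c0:c1])) - {0}
        if not neighborhood:
            continue                              # no labeled neighbors yet
        for clf in classifiers:                   # highest priority first
            pred = int(clf.predict(candidates[i:i + 1])[0])
            if pred in neighborhood:              # accept only consistent labels
                kept_idx.append(i)
                kept_lab.append(pred)
                break                             # lower priorities not consulted
    return kept_idx, kept_lab
```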
Hyperspectral remote sensing images consist of hundreds of continuous spectral bands and contain rich spectral and spatial information about earth surface features. Some objects that cannot be identified by conventional remote sensing means can be identified in hyperspectral images. However, the abundant data increase the processing and analysis difficulties, and there are problems of few labeled samples, difficult and time-consuming manual labeling, and so on. In order to improve the classification accuracy of hyperspectral remote sensing images, a new classification method based on texture features and semi-supervised learning is proposed in this paper. First, aiming at the problems of high correlation between bands, information redundancy, high dimensionality and complex processing, LBP is employed to process the hyperspectral images, and the texture features of the images are effectively extracted to enrich the feature information. Second, to solve the problem of limited labeled samples, a new sample labeling method based on neighborhood information and priority classifier discrimination is proposed, together with a sample selection strategy designed to pick informative samples from the large number of unlabeled samples. Finally, sparse representation and multiple logistic regression are applied to achieve a new classification method that effectively realizes accurate classification of hyperspectral images.
The model of hyperspectral image classification method based on texture features and semi-supervised learning is shown in Figure 3.
The detailed implementation steps of hyperspectral image classification method based on texture features and semi-supervised learning are described as follows.
Step 1. Normalize the hyperspectral images. Principal component analysis is used to perform dimensionality reduction.
Step 2. Calculate the LBP texture map of each principal component. Histogram statistics are performed according to the symmetric rotation-invariant equivalence mode, and the D-dimensional feature data of each layer are used to obtain the feature information of each pixel while retaining the pixel coordinates.
Step 3. The original hyperspectral images are performed by linear discriminant analysis to obtain a projection matrix.
Step 4. The hyperspectral images are mapped into a low-dimensional space suitable for Euclidean distance analysis by using the projection matrix.
Step 5. Training and test sample sets are randomly selected in a certain proportion from each class of the low-dimensional images.
Step 6. Calculate the Euclidean distance $L_k = \|y_i - A_k\|_2$ between each test sample and all training samples, and keep the K training samples (and their coordinates) closest to each test sample as its sparse dictionary.
Step 7. The extracted feature information is loaded into the test samples and the corresponding sparse dictionaries according to the retained coordinates to obtain the test sample set and its sparse dictionary.
Step 8. The sparse representation problem is solved: orthogonal matching pursuit (OMP) is used to calculate the sparse coefficients, and the classifications of the test samples are obtained.
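Step 6 can be sketched as follows (hypothetical names; the LDA projection of Steps 3–4 is assumed to be available). The dictionary size K is not reported in the paper and is chosen here only for illustration.

```python
import numpy as np

def local_dictionary_indices(test_proj, train_proj, K=40):
    """Step 6: for each projected test pixel y_i, compute L_k = ||y_i - A_k||_2
    to every projected training pixel A_k and keep the K nearest training pixels."""
    d = np.linalg.norm(test_proj[:, None, :] - train_proj[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, :K]     # (n_test, K) training-pixel indices
```

The LBP features gathered at these indices, grouped by class, are then passed to the SRC/OMP rule sketched earlier to complete Steps 7 and 8.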
1) Indian Pines data
The Indian Pines images, covering a site in northwestern Indiana, were collected by the AVIRIS sensor. The images consist of 145 × 145 pixels and 224 spectral reflection bands in the wavelength range 0.4–2.5 μm, including 16 types of ground features. The false color map and real ground object distribution are shown in Figure 4.
2) Salinas Scene data
The AVIRIS spectrometer collected the images of the Salinas Valley in California, USA, with a size of 512 × 217 pixels and a total of 224 bands. After the bands covering the water absorption region are removed, 204 bands are used, including 16 types of ground features. The false color map and the real ground object distribution are shown in Figure 5.
3) Pavia University data
The Pavia University images were taken over the campus of the University of Pavia, Italy, by the ROSIS spectrometer. They are 610 × 340 pixels in size and originally contain 115 bands; after the bands covering the water absorption region are removed, 103 bands remain, containing a total of 9 types of ground features. The false color map and the real ground object distribution are shown in Figure 6.
In the experiment, 10% of each type of ground object of the three kinds of data is randomly selected as the training samples, and the rest is the test samples.
The confusion matrix (CM) is usually used in the classification evaluation of hyperspectral images. A confusion matrix is generally defined as follows.
$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}$  (14)
where n denotes the number of categories and p_ij represents the number of samples belonging to class i that are assigned to class j. The sum of each row denotes the true number of samples in that category, and the sum of each column represents the number of samples assigned to that category.
Based on the confusion matrix, three classification indexes can be obtained, which are Overall Accuracy (OA), Average Accuracy (AA) and Kappa coefficient.
$OA = \dfrac{\sum_{i=1}^{n} p_{ii}}{N}$  (15)
where N represents the total number of samples and p_ii represents the number of correctly classified samples of class i. OA is the probability that the classification result of a randomly chosen sample agrees with its true label.
$CA_i = \dfrac{p_{ii}}{N_i}$  (16)
where N_i represents the total number of samples of class i, and CA_i represents the probability that class i is correctly classified. AA is the average of CA_i over all classes.
$Kappa = \dfrac{N\sum_{i=1}^{n} p_{ii} - \sum_{i=1}^{n}\left(\sum_{j=1}^{n} p_{ij} \sum_{j=1}^{n} p_{ji}\right)}{N^2 - \sum_{i=1}^{n}\left(\sum_{j=1}^{n} p_{ij} \sum_{j=1}^{n} p_{ji}\right)}$  (17)
The Kappa coefficient comprehensively considers the number of correctly classified objects and the error of misclassified objects on the confusion matrix.
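The three indexes can be computed directly from integer label vectors with a short routine such as the following sketch, using the convention of Eq (14) that rows of P index the true classes and columns index the assigned classes.

```python
import numpy as np

def evaluate(y_true, y_pred, n_classes):
    """Compute OA, AA (mean of the per-class CA_i) and the Kappa coefficient."""
    P = np.zeros((n_classes, n_classes), dtype=np.int64)    # confusion matrix, Eq (14)
    for t, p in zip(y_true, y_pred):
        P[t, p] += 1
    N = P.sum()
    oa = np.trace(P) / N                                    # Eq (15)
    ca = np.diag(P) / P.sum(axis=1)                         # per-class accuracy, Eq (16)
    aa = ca.mean()
    pe = (P.sum(axis=1) * P.sum(axis=0)).sum()              # chance-agreement term
    kappa = (N * np.trace(P) - pe) / (N ** 2 - pe)          # Eq (17)
    return oa, aa, kappa
```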
In the experiments, a large number of alternative parameter values are tested, with some classical values from the literature used as starting points; these values are adjusted experimentally until the most reasonable parameter values are determined. The selected parameter values yield the best results among those tested, so that the effectiveness of the method can be verified accurately and efficiently.
The quality of sample selection directly affects the efficiency of the experiment and the performance of the classifier. In order to select the best samples, the results of Information Entropy (IE), Minimum Error (ME), Breaking Ties (BT) and Least Confidence (LC) on the three kinds of hyperspectral images are compared. The experiment starts with 10 initial labeled samples per class, and all remaining samples are test samples. Two hundred unlabeled samples are selected by each of the four sample selection methods in every iteration. The labeled samples are used to expand the training set, train the classifier and classify the test samples. The quality of the selected samples is judged by the classification results after each iteration. The classification accuracy (%) of the different sample selection methods on the different data sets over ten iterations is shown in Table 1.
Data | Selection method | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | IE | 78.39 | 79.10 | 79.99 | 80.71 | 82.11 | 83.47 | 84.10 | 85.51 | 86.48 | 87.23 |
ME | 80.33 | 86.83 | 91.08 | 92.92 | 95.28 | 97.02 | 98.00 | 98.45 | 98.90 | 99.13 | |
BT | 91.47 | 95.47 | 98.22 | 98.36 | 98.64 | 98.59 | 98.66 | 98.71 | 99.34 | 99.29 | |
LC | 84.74 | 88.95 | 91.63 | 94.33 | 95.27 | 96.46 | 98.15 | 98.55 | 98.62 | 98.66 | |
Pavia University | IE | 69.87 | 71.31 | 71.75 | 71.74 | 72.04 | 72.40 | 73.25 | 73.36 | 73.76 | 74.45 |
ME | 73.01 | 75.23 | 78.74 | 82.78 | 89.14 | 92.82 | 95.14 | 96.10 | 97.13 | 97.82 | |
BT | 87.54 | 92.63 | 94.51 | 95.24 | 95.84 | 96.10 | 96.39 | 96.50 | 96.69 | 96.71 | |
LC | 74.91 | 76.33 | 80.76 | 84.45 | 87.23 | 89.89 | 90.27 | 90.67 | 90.56 | 91.47 | |
Salinas Scene | IE | 84.16 | 84.60 | 84.65 | 85.02 | 85.09 | 85.25 | 85.50 | 85.65 | 85.91 | 85.98 |
ME | 85.14 | 89.29 | 92.81 | 94.31 | 96.48 | 97.41 | 98.04 | 98.20 | 98.66 | 98.88 | |
BT | 95.30 | 96.98 | 98.26 | 98.71 | 98.95 | 98.90 | 99.03 | 99.21 | 99.25 | 99.24 | |
LC | 88.60 | 91.19 | 92.84 | 93.58 | 93.80 | 95.74 | 97.56 | 97.72 | 98.56 | 98.86 |
From Table 1, it can be seen that the classification accuracies of ME, BT and LC improve greatly on the three data sets as the number of iterations increases. The results of the BT method are significantly better than those of the other methods, and its accuracy improves within the first few iterations, which indicates that the BT method selects samples that yield a greater classification improvement. Therefore, the BT method is chosen as the sample selection method in this paper.
In the labeling process, not all samples are labeled correctly: the more samples are screened, the more samples may be mislabeled, which makes the training set noisier and affects the generalization ability of the classifier. If the number of screened samples is too small, the labeled samples will not improve the classification accuracy of the classifier, and the labeling efficiency is reduced. The classification accuracy (%) under different screening quantities over ten iterations is shown in Table 2.
Data | Quantity | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 200 | 77.30 | 77.57 | 78.32 | 77.97 | 78.59 | 78.97 | 78.96 | 79.28 | 79.51 | 79.23 |
400 | 77.54 | 78.60 | 79.80 | 80.80 | 82.13 | 82.20 | 83.03 | 83.77 | 83.83 | 83.85 | |
600 | 77.52 | 79.54 | 79.37 | 79.27 | 80.08 | 81.99 | 83.89 | 83.60 | 84.30 | 84.48 | |
800 | 77.75 | 79.84 | 80.87 | 80.22 | 82.11 | 83.91 | 84.41 | 84.57 | 85.12 | 85.94 | |
1000 | 77.85 | 80.49 | 80.28 | 82.74 | 81.35 | 82.60 | 84.18 | 85.88 | 86.77 | 87.84 | |
1200 | 77.85 | 79.95 | 79.79 | 80.38 | 81.59 | 84.41 | 84.72 | 85.74 | 87.48 | 88.85 | |
1400 | 78.18 | 80.20 | 80.09 | 83.96 | 85.00 | 85.78 | 87.93 | 89.90 | 91.07 | 91.20 | |
1600 | 78.55 | 80.56 | 80.59 | 84.12 | 86.90 | 87.82 | 89.02 | 90.87 | 91.49 | 91.83 | |
1800 | 78.34 | 79.76 | 79.23 | 82.64 | 85.39 | 87.32 | 88.81 | 89.97 | 90.69 | 91.46 | |
2000 | 78.01 | 79.16 | 80.24 | 82.55 | 86.02 | 87.48 | 88.66 | 89.63 | 90.02 | 90.46 | |
Pavia University | 200 | 68.75 | 73.93 | 76.73 | 78.18 | 79.23 | 80.92 | 81.85 | 82.40 | 82.74 | 83.60 |
400 | 66.41 | 73.11 | 75.45 | 78.20 | 81.18 | 82.13 | 82.57 | 83.39 | 84.07 | 83.98 | |
600 | 68.88 | 76.35 | 78.22 | 80.73 | 82.59 | 83.29 | 83.68 | 84.57 | 84.95 | 84.98 | |
800 | 69.89 | 77.50 | 80.31 | 81.91 | 83.22 | 84.85 | 84.99 | 84.79 | 85.16 | 85.31 | |
1000 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
1200 | 70.24 | 75.18 | 80.32 | 83.13 | 84.14 | 85.04 | 85.30 | 85.55 | 85.04 | 84.90 | |
1400 | 70.34 | 76.23 | 80.57 | 82.46 | 83.92 | 84.71 | 85.59 | 85.77 | 85.87 | 86.47 | |
1600 | 70.40 | 75.87 | 80.64 | 83.06 | 83.93 | 84.69 | 85.28 | 86.02 | 86.19 | 85.83 | |
1800 | 69.77 | 76.12 | 80.18 | 82.99 | 85.19 | 85.04 | 84.89 | 85.26 | 85.68 | 85.68 | |
2000 | 69.71 | 75.90 | 82.29 | 83.40 | 84.41 | 85.23 | 85.59 | 85.77 | 85.86 | 85.87 | |
Salinas Scene | 200 | 85.09 | 87.26 | 89.04 | 89.35 | 89.24 | 89.22 | 88.85 | 88.47 | 88.14 | 88.03 |
400 | 84.94 | 88.34 | 89.88 | 89.26 | 89.12 | 88.88 | 88.26 | 87.68 | 87.40 | 87.25 | |
600 | 85.69 | 90.85 | 90.80 | 90.42 | 89.71 | 89.04 | 88.63 | 87.84 | 87.12 | 86.60 | |
800 | 85.36 | 89.17 | 88.87 | 87.96 | 87.46 | 86.85 | 86.42 | 85.69 | 85.13 | 84.58 | |
1000 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
1200 | 85.04 | 89.66 | 90.08 | 88.98 | 87.67 | 87.14 | 86.13 | 85.30 | 85.09 | 84.85 | |
1400 | 85.07 | 88.46 | 89.39 | 88.63 | 88.10 | 88.06 | 87.34 | 86.81 | 86.69 | 86.58 | |
1600 | 85.47 | 89.54 | 89.88 | 88.87 | 87.68 | 86.21 | 85.00 | 84.34 | 83.89 | 83.70 | |
1800 | 85.50 | 90.16 | 90.15 | 89.45 | 88.58 | 87.71 | 86.79 | 86.52 | 86.31 | 85.93 | |
2000 | 85.48 | 89.40 | 89.38 | 88.60 | 87.73 | 85.98 | 85.67 | 85.12 | 85.39 | 85.34 |
It can be seen from Table 2 that the screening quantity behaves differently on the different data sets. The Indian Pines data set reaches its highest classification accuracy after 10 iterations, the Pavia University data set after 8–10 iterations and the Salinas Scene data set after 2–4 iterations. The screening quantity with the highest accuracy is taken as the experimental parameter: 1600 for Indian Pines, 1400 for Pavia University and 600 for Salinas Scene.
The size of the block window determines the neighborhood information of the samples, which directly affects the accuracy of the pseudo-labeling method. Because the data sets differ in size, the optimal block window size is also determined through a large number of experiments. The classification accuracy (%) under different block window sizes over ten iterations is shown in Table 3.
Data | Block window size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 3 | 77.62 | 77.86 | 78.35 | 78.65 | 78.91 | 79.14 | 79.09 | 79.12 | 79.56 | 79.56 |
4 | 77.73 | 78.24 | 78.65 | 78.75 | 78.70 | 78.66 | 79.14 | 79.18 | 79.70 | 79.74 | |
5 | 78.41 | 79.61 | 79.41 | 81.28 | 81.11 | 81.73 | 82.67 | 82.84 | 84.18 | 84.68 | |
6 | 77.85 | 80.14 | 80.41 | 80.91 | 80.90 | 83.54 | 84.91 | 85.09 | 86.42 | 88.30 | |
7 | 78.86 | 80.58 | 82.58 | 84.90 | 86.55 | 86.99 | 88.47 | 88.87 | 89.58 | 90.98 | |
8 | 78.33 | 78.95 | 81.00 | 84.24 | 85.44 | 86.37 | 87.60 | 88.13 | 88.57 | 89.19 | |
9 | 79.61 | 81.00 | 83.39 | 85.51 | 85.71 | 86.34 | 86.78 | 86.86 | 87.07 | 87.86 | |
10 | 78.61 | 79.32 | 83.45 | 85.45 | 85.59 | 85.69 | 86.12 | 87.00 | 87.17 | 87.32 | |
Pavia University | 5 | 68.39 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 |
10 | 68.20 | 67.67 | 69.05 | 69.06 | 68.30 | 68.13 | 68.06 | 67.98 | 67.89 | 67.62 | |
15 | 70.91 | 70.59 | 70.96 | 70.50 | 71.89 | 73.54 | 73.80 | 74.58 | 75.25 | 75.32 | |
20 | 71.47 | 71.63 | 73.60 | 76.12 | 75.99 | 75.61 | 77.18 | 77.51 | 77.81 | 78.14 | |
25 | 71.25 | 74.57 | 78.52 | 79.15 | 81.95 | 82.32 | 83.66 | 84.88 | 85.45 | 85.51 | |
30 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
35 | 71.48 | 77.43 | 80.15 | 81.75 | 83.18 | 83.62 | 84.11 | 83.65 | 83.23 | 83.17 | |
40 | 73.40 | 77.50 | 80.36 | 81.92 | 83.15 | 82.68 | 82.23 | 82.38 | 82.18 | 82.04 | |
Salinas Scene | 5 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 |
10 | 83.61 | 84.16 | 84.95 | 85.03 | 84.77 | 84.68 | 84.59 | 84.82 | 84.61 | 85.48 | |
15 | 83.86 | 83.17 | 85.51 | 87.04 | 87.70 | 88.44 | 89.77 | 89.60 | 91.01 | 91.09 | |
20 | 84.16 | 86.84 | 89.41 | 90.56 | 90.22 | 90.69 | 91.02 | 91.13 | 90.87 | 90.68 | |
25 | 84.11 | 88.22 | 89.46 | 89.68 | 89.35 | 88.77 | 87.86 | 87.49 | 86.71 | 86.78 | |
30 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
35 | 86.17 | 88.80 | 88.67 | 87.67 | 86.63 | 85.84 | 85.26 | 84.87 | 83.95 | 83.06 | |
40 | 86.86 | 89.25 | 89.02 | 87.78 | 86.37 | 84.74 | 83.60 | 82.83 | 81.69 | 80.92 |
As can be seen from Table 3, the classification accuracy changes differently for the different data sets. Compared with the other two data sets, Indian Pines is the smallest, so its block window side lengths range from 3 to 10. With the increase of the number of iterations, the classification accuracy gradually increases, and the optimal accuracy is obtained when the side length is 7. When the side length of the block window for the Pavia and Salinas data sets is too small, the classification accuracy does not improve with the iterations, which indicates that the neighborhood information cannot distinguish the categories in that case.
As the side length of the block window increases, the optimal classification accuracy is reached in fewer iterations, but the optimal accuracy itself decreases. The larger the block window, the more noise is introduced, which harms the sample labeling accuracy. Therefore, the block window size of the Indian Pines data set is set to 7 × 7, and the block window sizes of the Pavia and Salinas data sets are set to 25 × 25 and 20 × 20, respectively.
In fact, the determination of the pseudo-labels of the samples mainly depends on the discriminating classifiers. K-Nearest Neighbor (KNN), the Sparse Representation-based Classifier (SRC), the Neighborhood Rough Set (NRS) and multiple logistic regression (MLR) are employed to determine the pseudo-labels. The results of single classifiers and different classifier combinations on the different data sets are shown in Tables 4–6.
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 314 | 673 | 1139 | 1630 | 2178 | 2680 | 3324 | 4063 | 4820 | 5474 |
OA (%) | 78.69 | 79.93 | 81.20 | 82.05 | 82.49 | 84.38 | 85.70 | 86.52 | 87.33 | 87.77 | |
SRC | NUM | 311 | 648 | 1068 | 1525 | 2032 | 2606 | 3248 | 3899 | 4668 | 5522 |
OA (%) | 78.64 | 80.42 | 80.10 | 81.56 | 82.87 | 84.39 | 85.83 | 87.04 | 87.92 | 88.19 | |
NRS | NUM | 315 | 673 | 1116 | 1597 | 2136 | 2697 | 3265 | 3815 | 4437 | 5016 |
OA (%) | 78.82 | 81.02 | 80.96 | 83.71 | 84.94 | 85.57 | 86.87 | 88.01 | 89.25 | 89.42 | |
MLR | NUM | 133 | 295 | 580 | 936 | 1296 | 1790 | 2396 | 3077 | 3858 | 4648 |
OA (%) | 77.55 | 79.31 | 82.21 | 83.91 | 83.99 | 85.75 | 87.36 | 87.68 | 88.27 | 88.41 | |
KNN + SRC | NUM | 317 | 706 | 1120 | 1702 | 2294 | 2967 | 3728 | 4494 | 5233 | 5950 |
OA (%) | 78.71 | 80.96 | 81.99 | 83.97 | 84.96 | 85.82 | 86.99 | 87.75 | 87.93 | 88.10 | |
KNN + NRS | NUM | 317 | 707 | 1198 | 1691 | 2308 | 2954 | 3625 | 4450 | 5374 | 6305 |
OA (%) | 78.78 | 80.27 | 80.73 | 82.51 | 83.82 | 85.45 | 87.56 | 89.06 | 89.95 | 90.19 | |
KNN + MLR | NUM | 318 | 712 | 1206 | 1794 | 2555 | 3398 | 4292 | 5072 | 5783 | 6678 |
OA (%) | 78.79 | 80.56 | 82.57 | 86.08 | 87.10 | 87.87 | 88.50 | 88.88 | 89.18 | 89.31 | |
SRC + KNN | NUM | 317 | 706 | 1134 | 1673 | 2223 | 2875 | 3658 | 4317 | 5011 | 5748 |
OA (%) | 78.71 | 81.09 | 81.51 | 83.17 | 84.27 | 85.84 | 86.94 | 87.54 | 88.26 | 88.64 | |
SRC + NRS | NUM | 318 | 730 | 1205 | 1778 | 2372 | 3091 | 3969 | 4813 | 5787 | 6641 |
OA (%) | 78.81 | 80.79 | 82.37 | 85.03 | 86.81 | 88.45 | 88.93 | 89.86 | 90.49 | 90.78 | |
SRC + MLR | NUM | 315 | 708 | 1153 | 1744 | 2457 | 3322 | 4231 | 5042 | 5800 | 6735 |
OA (%) | 78.80 | 80.92 | 82.16 | 85.16 | 86.24 | 87.61 | 88.39 | 88.71 | 89.07 | 89.19 | |
NRS + KNN | NUM | 317 | 707 | 1202 | 1712 | 2385 | 3021 | 3700 | 4538 | 5399 | 6333 |
OA (%) | 78.74 | 80.67 | 81.33 | 83.36 | 84.46 | 86.35 | 87.93 | 89.23 | 90.03 | 90.43 | |
NRS + SRC | NUM | 318 | 734 | 1207 | 1728 | 2378 | 3061 | 3959 | 4799 | 5691 | 6739 |
OA (%) | 78.77 | 81.01 | 82.56 | 84.52 | 86.37 | 87.43 | 88.95 | 90.11 | 90.61 | 90.90 | |
NRS + MLR | NUM | 318 | 694 | 1148 | 1690 | 2446 | 3194 | 4102 | 4950 | 5768 | 6792 |
OA (%) | 78.91 | 81.39 | 81.62 | 85.51 | 87.21 | 89.63 | 90.28 | 90.92 | 91.28 | 91.88 | |
MLR + KNN | NUM | 318 | 677 | 1166 | 1689 | 2429 | 3246 | 4018 | 4737 | 5677 | 6672 |
OA (%) | 78.30 | 81.13 | 81.85 | 85.94 | 87.22 | 88.28 | 88.77 | 88.98 | 89.20 | 89.29 | |
MLR + SRC | NUM | 315 | 704 | 1258 | 1741 | 2439 | 3286 | 4097 | 4867 | 5775 | 6760 |
OA (%) | 78.31 | 81.81 | 82.55 | 85.09 | 86.86 | 88.30 | 89.01 | 89.44 | 90.21 | 90.71 | |
MLR + NRS | NUM | 318 | 683 | 1154 | 1701 | 2458 | 3301 | 4219 | 5057 | 5997 | 6889 |
OA (%) | 78.46 | 81.49 | 82.11 | 86.14 | 87.86 | 89.70 | 90.68 | 91.58 | 92.15 | 92.42 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 236 | 444 | 661 | 965 | 1210 | 1466 | 1812 | 2254 | 2762 | 3251 |
OA (%) | 71.19 | 73.36 | 74.70 | 75.69 | 76.10 | 76.16 | 77.28 | 78.48 | 77.60 | 77.69 | |
SRC | NUM | 225 | 477 | 733 | 1080 | 1492 | 2047 | 2793 | 3654 | 4605 | 5631 |
OA (%) | 72.21 | 72.19 | 76.97 | 79.12 | 80.08 | 81.77 | 83.09 | 84.07 | 84.95 | 85.27 | |
NRS | NUM | 225 | 438 | 763 | 1007 | 1256 | 1487 | 1726 | 1893 | 1991 | 2179 |
OA (%) | 71.22 | 73.74 | 75.14 | 76.59 | 76.15 | 76.47 | 76.41 | 76.20 | 75.68 | 75.40 | |
MLR | NUM | 100 | 244 | 396 | 605 | 829 | 1035 | 1304 | 1587 | 1872 | 2183 |
OA (%) | 72.36 | 76.37 | 77.18 | 79.04 | 79.40 | 81.16 | 82.21 | 82.18 | 83.36 | 85.85 | |
KNN + SRC | NUM | 248 | 473 | 787 | 1118 | 1509 | 1996 | 2552 | 3111 | 3975 | 4837 |
OA (%) | 71.48 | 72.82 | 74.61 | 76.59 | 78.38 | 80.08 | 80.82 | 81.62 | 82.40 | 83.99 | |
KNN + NRS | NUM | 240 | 491 | 775 | 1067 | 1382 | 1850 | 2403 | 3040 | 3794 | 4415 |
OA (%) | 71.11 | 73.83 | 77.44 | 78.83 | 79.50 | 79.58 | 79.39 | 80.11 | 80.74 | 81.48 | |
KNN + MLR | NUM | 244 | 515 | 822 | 1176 | 1616 | 2162 | 2905 | 3698 | 4745 | 5682 |
OA (%) | 71.37 | 75.03 | 77.84 | 78.08 | 79.73 | 79.59 | 81.30 | 83.88 | 84.90 | 85.08 | |
SRC + KNN | NUM | 248 | 476 | 795 | 1215 | 1725 | 2298 | 3055 | 3967 | 4977 | 5913 |
OA (%) | 71.38 | 72.54 | 75.26 | 78.09 | 79.19 | 80.21 | 81.88 | 83.31 | 83.35 | 83.51 | |
SRC + NRS | NUM | 242 | 511 | 867 | 1385 | 1889 | 2513 | 3201 | 3988 | 4910 | 5794 |
OA (%) | 71.40 | 74.12 | 77.56 | 80.80 | 82.19 | 82.55 | 83.02 | 83.80 | 84.34 | 85.09 | |
SRC + MLR | NUM | 236 | 507 | 841 | 1289 | 1731 | 2261 | 3016 | 3928 | 4939 | 6015 |
OA (%) | 71.52 | 73.62 | 76.47 | 78.45 | 79.51 | 81.54 | 83.10 | 84.37 | 84.68 | 84.66 | |
NRS + KNN | NUM | 240 | 486 | 803 | 1119 | 1541 | 1992 | 2557 | 3190 | 3939 | 4611 |
OA (%) | 71.05 | 74.37 | 76.57 | 77.77 | 78.06 | 77.21 | 76.75 | 77.35 | 78.88 | 79.28 | |
NRS + SRC | NUM | 242 | 501 | 803 | 1296 | 1792 | 2433 | 3089 | 3839 | 4776 | 5798 |
OA (%) | 71.44 | 74.50 | 76.72 | 79.93 | 81.44 | 81.90 | 82.85 | 83.81 | 85.00 | 85.48 | |
NRS + MLR | NUM | 234 | 517 | 828 | 1237 | 1796 | 2446 | 3354 | 4220 | 5061 | 5857 |
OA (%) | 71.49 | 75.47 | 77.89 | 80.86 | 83.68 | 84.43 | 84.93 | 85.12 | 85.57 | 85.46 | |
MLR + KNN | NUM | 244 | 486 | 746 | 1170 | 1658 | 2300 | 3205 | 4208 | 5336 | 6514 |
OA (%) | 71.47 | 75.30 | 76.67 | 78.76 | 80.35 | 82.97 | 85.05 | 86.05 | 86.88 | 87.02 | |
MLR + SRC | NUM | 236 | 504 | 903 | 1310 | 1708 | 2207 | 3009 | 4024 | 5098 | 6234 |
OA (%) | 71.71 | 74.01 | 79.30 | 79.81 | 80.40 | 83.03 | 86.27 | 86.93 | 87.97 | 88.53 | |
MLR + NRS | NUM | 234 | 524 | 787 | 1205 | 1738 | 2374 | 3116 | 3953 | 4754 | 5586 |
OA (%) | 71.63 | 75.94 | 79.83 | 80.66 | 82.75 | 84.64 | 85.98 | 86.19 | 86.37 | 86.87 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 133 | 251 | 443 | 684 | 963 | 1304 | 1672 | 2047 | 2425 | 2857 |
OA (%) | 83.38 | 86.46 | 86.46 | 86.65 | 87.26 | 87.31 | 87.79 | 87.92 | 87.66 | 87.15 | |
SRC | NUM | 144 | 275 | 441 | 666 | 937 | 1255 | 1606 | 1968 | 2391 | 2811 |
OA (%) | 83.10 | 83.56 | 85.11 | 85.83 | 85.81 | 86.63 | 86.50 | 86.64 | 86.95 | 86.96 | |
NRS | NUM | 148 | 271 | 423 | 668 | 952 | 1263 | 1569 | 1940 | 2351 | 2799 |
OA (%) | 84.04 | 86.06 | 87.62 | 87.63 | 87.10 | 87.48 | 88.44 | 88.43 | 88.32 | 87.92 | |
MLR | NUM | 102 | 177 | 302 | 451 | 621 | 867 | 1176 | 1518 | 1848 | 2217 |
OA (%) | 82.88 | 85.41 | 87.11 | 88.36 | 88.73 | 90.68 | 91.52 | 92.20 | 91.70 | 92.02 | |
KNN + SRC | NUM | 146 | 294 | 479 | 694 | 976 | 1330 | 1691 | 2106 | 2520 | 2985 |
OA (%) | 83.60 | 84.65 | 85.38 | 86.76 | 87.39 | 87.39 | 88.00 | 88.11 | 87.53 | 87.23 | |
KNN + NRS | NUM | 150 | 297 | 500 | 761 | 1108 | 1466 | 1891 | 2354 | 2834 | 3316 |
OA (%) | 83.53 | 86.43 | 87.14 | 87.49 | 87.64 | 87.37 | 88.06 | 88.03 | 87.95 | 88.03 | |
KNN + MLR | NUM | 143 | 285 | 508 | 768 | 1132 | 1546 | 2026 | 2526 | 3049 | 3590 |
OA (%) | 83.38 | 85.50 | 86.34 | 88.35 | 88.80 | 90.07 | 90.24 | 90.08 | 89.86 | 89.79 | |
SRC + KNN | NUM | 146 | 287 | 472 | 686 | 957 | 1281 | 1673 | 2061 | 2485 | 2892 |
OA (%) | 83.25 | 84.70 | 85.79 | 85.88 | 86.70 | 86.92 | 86.89 | 86.93 | 86.85 | 86.90 | |
SRC + NRS | NUM | 150 | 280 | 493 | 755 | 1017 | 1328 | 1708 | 2114 | 2539 | 2982 |
OA (%) | 83.07 | 85.55 | 86.28 | 85.47 | 86.66 | 86.97 | 87.52 | 87.05 | 87.19 | 87.16 | |
SRC + MLR | NUM | 150 | 271 | 483 | 720 | 1108 | 1499 | 1958 | 2451 | 2969 | 3487 |
OA (%) | 83.15 | 84.27 | 87.25 | 87.88 | 88.88 | 89.29 | 89.36 | 89.85 | 89.70 | 89.85 | |
NRS + KNN | NUM | 150 | 298 | 519 | 814 | 1148 | 1556 | 2037 | 2538 | 3027 | 3522 |
OA (%) | 84.01 | 87.12 | 87.79 | 87.30 | 87.54 | 88.38 | 88.09 | 88.25 | 87.95 | 87.88 | |
NRS + SRC | NUM | 150 | 284 | 488 | 803 | 1158 | 1539 | 1988 | 2482 | 2955 | 3423 |
OA (%) | 83.91 | 87.23 | 86.98 | 87.73 | 88.30 | 88.97 | 89.57 | 89.57 | 89.60 | 89.43 | |
NRS + MLR | NUM | 153 | 293 | 509 | 762 | 1104 | 1509 | 1940 | 2441 | 2934 | 3452 |
OA (%) | 83.87 | 85.82 | 88.25 | 89.93 | 90.40 | 90.51 | 90.75 | 90.48 | 90.12 | 89.61 | |
MLR + KNN | NUM | 143 | 299 | 521 | 825 | 1187 | 1602 | 2046 | 2514 | 2993 | 3407 |
OA (%) | 82.80 | 85.42 | 87.67 | 88.79 | 89.68 | 90.06 | 90.54 | 90.92 | 91.27 | 91.32 | |
MLR + SRC | NUM | 150 | 292 | 521 | 799 | 1123 | 1537 | 2007 | 2448 | 2929 | 3367 |
OA (%) | 82.91 | 85.45 | 88.89 | 90.12 | 90.21 | 90.93 | 90.99 | 91.83 | 92.08 | 92.64 | |
MLR + NRS | NUM | 153 | 315 | 564 | 841 | 1197 | 1605 | 2064 | 2561 | 3060 | 3500 |
OA (%) | 82.81 | 84.87 | 88.16 | 89.34 | 89.69 | 90.03 | 90.42 | 90.52 | 91.19 | 91.43 |
As can be seen in Table 4 (Indian Pines), the classification accuracy increases gradually with the number of iterations. Among the single classifiers, SRC labels the largest number of samples after 10 iterations, but its classification effect is not the best; the single classifier with the best classification effect is NRS. The numbers of labeled samples and the classification accuracies of the combined classifiers are mostly better than those of the single classifiers, and the results differ for combinations with different priorities. Combinations containing NRS achieve more than 90% accuracy after 10 iterations, combinations containing MLR label more than 6600 samples after 10 iterations, and the best combination after 10 iterations is MLR + NRS.
As can be seen in Table 5 (Pavia University), compared with the other single classifiers, SRC labels the largest number of samples after 10 iterations, but MLR obtains the best classification results. After 10 iterations, the two-classifier combinations label more samples than the single classifiers. When KNN, SRC and NRS are used as the first-priority classifier, the classification results after 10 iterations are not as good as those of MLR, whereas combinations with MLR as the first-priority classifier achieve better classification than single MLR after 10 iterations.
For the Salinas Scene data (Table 6), the number of samples labeled by MLR after 10 iterations is the smallest, but the classification accuracy of its labeled samples is the highest. After 10 iterations, the numbers of samples labeled by the two-classifier combinations are also higher than those of the single classifiers. From the perspective of the classification performance of the labeled samples, MLR + SRC achieves higher classification accuracy than MLR alone, which indicates that adding a second classifier improves the classification accuracy.
From the three experiments, it can be seen that the method that labels the largest number of samples does not necessarily achieve the best classification results. Since the labeled samples are meant to improve the classification accuracy of the classifier, the classification results obtained with the labeled samples after 10 iterations are taken as the evaluation criterion. The Indian Pines data set therefore uses the classifier combination MLR + NRS, and the Pavia University and Salinas Scene data sets use MLR + SRC.
Based on the analysis, the settings of the related parameters are shown in Table 7.
Data set | Indian Pines | Pavia University | Salinas Scene |
Selection policy | BT | BT | BT |
Number of selections | 1600 | 1400 | 600 |
Window size | 7 × 7 | 25 × 25 | 20 × 20 |
Combination of classifiers | MLR + NRS | MLR + SRC | MLR + SRC |
Number of labeled samples | 6889 | 6234 | 3367 |
First, LBP is used to extract the spatial texture features of the hyperspectral remote sensing images. Second, the sample labeling method based on neighborhood information and priority classifier discrimination is used to obtain the learned pseudo-labeled samples. The SRC classifier is trained with the expanded labeled samples, and the test samples are predicted. The obtained classification results are compared with those of the SRC classifier trained only on the initial samples; the classification results of the models trained with the different training sets are shown in Table 8.
Data | Index | Initial samples | Labeled samples
Indian Pines | AA | 67.93% | 84.70% |
OA | 77.38% | 92.42% | |
KAPPA | 0.746 | 0.914 | |
Pavia University | AA | 60.53% | 81.87% |
OA | 69.00% | 88.53% | |
KAPPA | 0.609 | 0.848 | |
Salinas Scene | AA | 82.59% | 87.76% |
OA | 84.00% | 92.64% | |
KAPPA | 0.823 | 0.918 |
It can be seen from Table 8 that, for the Indian Pines data set, AA, OA and Kappa are 84.70%, 92.42% and 0.914, respectively. For the Pavia University data set, AA, OA and Kappa are 81.87%, 88.53% and 0.848, respectively. For the Salinas Scene data set, AA, OA and Kappa are 87.76%, 92.64% and 0.918, respectively. Therefore, the proposed classification method obtains higher classification accuracy.
The classification visualizations of the proposed classification method for the initial samples and labeled samples are shown in Figures 7 and 8.
Comparing the experimental results, it can be found that the classification results of the classifier trained with the expanded samples are better than those of the classifier trained with the initial samples on all three data sets. Moreover, the classification visualizations show that the maps obtained by the classifier with the labeled samples are smoother and have fewer discrete points, which indicates that labeling the samples improves the generalization ability of the classifier.
To address the processing and analysis difficulties of hyperspectral images, a new sample labeling method based on neighborhood information and priority classifier discrimination is developed, and a new classification method of hyperspectral remote sensing images based on texture features and semi-supervised learning is implemented by introducing LBP, sparse representation and multiple logistic regression. LBP is employed to extract the texture features of the hyperspectral remote sensing images. The multiple logistic regression model is used to select the unlabeled samples with the largest amount of information, and these samples are labeled with neighborhood information and priority classifier discrimination to obtain learned pseudo-labeled samples, which solves the problem of limited labeled samples of hyperspectral images. The Indian Pines, Salinas Scene and Pavia University data sets are selected here. The experimental results show that the block window of the Indian Pines data set is 7 × 7, and the block windows of Pavia University and Salinas Scene are 25 × 25 and 20 × 20, respectively; the classifier combinations with MLR obtain better classification results. The obtained classification maps are smoother and have fewer discrete points, which indicates that the generalization ability of the classifier is improved by labeling the samples. For the Indian Pines data set, AA, OA and Kappa are 84.70%, 92.42% and 0.914, respectively. For the Pavia University data set, AA, OA and Kappa are 81.87%, 88.53% and 0.848, respectively. For the Salinas Scene data set, AA, OA and Kappa are 87.76%, 92.64% and 0.918, respectively. Therefore, the proposed classification method obtains higher classification accuracy compared with the other methods.
However, the proposed classification method requires more computing time, so future work should focus on reducing its time complexity.
This research was funded by the Sichuan Science and Technology Program, grant numbers 2021YFS0407, 2022YFS0593 and 2023YFG0028; the Sichuan Provincial Transfer Payment Program, China, under Grant R21ZYZF0006; the A Ba Achievements Transformation Program under Grants R21CGZH0001, R22CGZH0006 and R22CGZH0007; the Chengdu Science and Technology Planning Project, grant number 2021-YF05-00933-SN; and the Research Foundation for Civil Aviation University of China, grant number 2020KYQD123.
The authors declare no conflict of interest.
Data | Selection method | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | IE | 78.39 | 79.10 | 79.99 | 80.71 | 82.11 | 83.47 | 84.10 | 85.51 | 86.48 | 87.23 |
ME | 80.33 | 86.83 | 91.08 | 92.92 | 95.28 | 97.02 | 98.00 | 98.45 | 98.90 | 99.13 | |
BT | 91.47 | 95.47 | 98.22 | 98.36 | 98.64 | 98.59 | 98.66 | 98.71 | 99.34 | 99.29 | |
LC | 84.74 | 88.95 | 91.63 | 94.33 | 95.27 | 96.46 | 98.15 | 98.55 | 98.62 | 98.66 | |
Pavia University | IE | 69.87 | 71.31 | 71.75 | 71.74 | 72.04 | 72.40 | 73.25 | 73.36 | 73.76 | 74.45 |
ME | 73.01 | 75.23 | 78.74 | 82.78 | 89.14 | 92.82 | 95.14 | 96.10 | 97.13 | 97.82 | |
BT | 87.54 | 92.63 | 94.51 | 95.24 | 95.84 | 96.10 | 96.39 | 96.50 | 96.69 | 96.71 | |
LC | 74.91 | 76.33 | 80.76 | 84.45 | 87.23 | 89.89 | 90.27 | 90.67 | 90.56 | 91.47 | |
Salinas Scene | IE | 84.16 | 84.60 | 84.65 | 85.02 | 85.09 | 85.25 | 85.50 | 85.65 | 85.91 | 85.98 |
ME | 85.14 | 89.29 | 92.81 | 94.31 | 96.48 | 97.41 | 98.04 | 98.20 | 98.66 | 98.88 | |
BT | 95.30 | 96.98 | 98.26 | 98.71 | 98.95 | 98.90 | 99.03 | 99.21 | 99.25 | 99.24 | |
LC | 88.60 | 91.19 | 92.84 | 93.58 | 93.80 | 95.74 | 97.56 | 97.72 | 98.56 | 98.86 |
Data | Quantity | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 200 | 77.30 | 77.57 | 78.32 | 77.97 | 78.59 | 78.97 | 78.96 | 79.28 | 79.51 | 79.23 |
400 | 77.54 | 78.60 | 79.80 | 80.80 | 82.13 | 82.20 | 83.03 | 83.77 | 83.83 | 83.85 | |
600 | 77.52 | 79.54 | 79.37 | 79.27 | 80.08 | 81.99 | 83.89 | 83.60 | 84.30 | 84.48 | |
800 | 77.75 | 79.84 | 80.87 | 80.22 | 82.11 | 83.91 | 84.41 | 84.57 | 85.12 | 85.94 | |
1000 | 77.85 | 80.49 | 80.28 | 82.74 | 81.35 | 82.60 | 84.18 | 85.88 | 86.77 | 87.84 | |
1200 | 77.85 | 79.95 | 79.79 | 80.38 | 81.59 | 84.41 | 84.72 | 85.74 | 87.48 | 88.85 | |
1400 | 78.18 | 80.20 | 80.09 | 83.96 | 85.00 | 85.78 | 87.93 | 89.90 | 91.07 | 91.20 | |
1600 | 78.55 | 80.56 | 80.59 | 84.12 | 86.90 | 87.82 | 89.02 | 90.87 | 91.49 | 91.83 | |
1800 | 78.34 | 79.76 | 79.23 | 82.64 | 85.39 | 87.32 | 88.81 | 89.97 | 90.69 | 91.46 | |
2000 | 78.01 | 79.16 | 80.24 | 82.55 | 86.02 | 87.48 | 88.66 | 89.63 | 90.02 | 90.46 | |
Pavia University | 200 | 68.75 | 73.93 | 76.73 | 78.18 | 79.23 | 80.92 | 81.85 | 82.40 | 82.74 | 83.60 |
400 | 66.41 | 73.11 | 75.45 | 78.20 | 81.18 | 82.13 | 82.57 | 83.39 | 84.07 | 83.98 | |
600 | 68.88 | 76.35 | 78.22 | 80.73 | 82.59 | 83.29 | 83.68 | 84.57 | 84.95 | 84.98 | |
800 | 69.89 | 77.50 | 80.31 | 81.91 | 83.22 | 84.85 | 84.99 | 84.79 | 85.16 | 85.31 | |
1000 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
1200 | 70.24 | 75.18 | 80.32 | 83.13 | 84.14 | 85.04 | 85.30 | 85.55 | 85.04 | 84.90 | |
1400 | 70.34 | 76.23 | 80.57 | 82.46 | 83.92 | 84.71 | 85.59 | 85.77 | 85.87 | 86.47 | |
1600 | 70.40 | 75.87 | 80.64 | 83.06 | 83.93 | 84.69 | 85.28 | 86.02 | 86.19 | 85.83 | |
1800 | 69.77 | 76.12 | 80.18 | 82.99 | 85.19 | 85.04 | 84.89 | 85.26 | 85.68 | 85.68 | |
2000 | 69.71 | 75.90 | 82.29 | 83.40 | 84.41 | 85.23 | 85.59 | 85.77 | 85.86 | 85.87 | |
Salinas Scene | 200 | 85.09 | 87.26 | 89.04 | 89.35 | 89.24 | 89.22 | 88.85 | 88.47 | 88.14 | 88.03 |
400 | 84.94 | 88.34 | 89.88 | 89.26 | 89.12 | 88.88 | 88.26 | 87.68 | 87.40 | 87.25 | |
600 | 85.69 | 90.85 | 90.80 | 90.42 | 89.71 | 89.04 | 88.63 | 87.84 | 87.12 | 86.60 | |
800 | 85.36 | 89.17 | 88.87 | 87.96 | 87.46 | 86.85 | 86.42 | 85.69 | 85.13 | 84.58 | |
1000 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
1200 | 85.04 | 89.66 | 90.08 | 88.98 | 87.67 | 87.14 | 86.13 | 85.30 | 85.09 | 84.85 | |
1400 | 85.07 | 88.46 | 89.39 | 88.63 | 88.10 | 88.06 | 87.34 | 86.81 | 86.69 | 86.58 | |
1600 | 85.47 | 89.54 | 89.88 | 88.87 | 87.68 | 86.21 | 85.00 | 84.34 | 83.89 | 83.70 | |
1800 | 85.50 | 90.16 | 90.15 | 89.45 | 88.58 | 87.71 | 86.79 | 86.52 | 86.31 | 85.93 | |
2000 | 85.48 | 89.40 | 89.38 | 88.60 | 87.73 | 85.98 | 85.67 | 85.12 | 85.39 | 85.34 |
Data | Block window size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 3 | 77.62 | 77.86 | 78.35 | 78.65 | 78.91 | 79.14 | 79.09 | 79.12 | 79.56 | 79.56 |
4 | 77.73 | 78.24 | 78.65 | 78.75 | 78.70 | 78.66 | 79.14 | 79.18 | 79.70 | 79.74 | |
5 | 78.41 | 79.61 | 79.41 | 81.28 | 81.11 | 81.73 | 82.67 | 82.84 | 84.18 | 84.68 | |
6 | 77.85 | 80.14 | 80.41 | 80.91 | 80.90 | 83.54 | 84.91 | 85.09 | 86.42 | 88.30 | |
7 | 78.86 | 80.58 | 82.58 | 84.90 | 86.55 | 86.99 | 88.47 | 88.87 | 89.58 | 90.98 | |
8 | 78.33 | 78.95 | 81.00 | 84.24 | 85.44 | 86.37 | 87.60 | 88.13 | 88.57 | 89.19 | |
9 | 79.61 | 81.00 | 83.39 | 85.51 | 85.71 | 86.34 | 86.78 | 86.86 | 87.07 | 87.86 | |
10 | 78.61 | 79.32 | 83.45 | 85.45 | 85.59 | 85.69 | 86.12 | 87.00 | 87.17 | 87.32 | |
Pavia University | 5 | 68.39 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 |
10 | 68.20 | 67.67 | 69.05 | 69.06 | 68.30 | 68.13 | 68.06 | 67.98 | 67.89 | 67.62 | |
15 | 70.91 | 70.59 | 70.96 | 70.50 | 71.89 | 73.54 | 73.80 | 74.58 | 75.25 | 75.32 | |
20 | 71.47 | 71.63 | 73.60 | 76.12 | 75.99 | 75.61 | 77.18 | 77.51 | 77.81 | 78.14 | |
25 | 71.25 | 74.57 | 78.52 | 79.15 | 81.95 | 82.32 | 83.66 | 84.88 | 85.45 | 85.51 | |
30 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
35 | 71.48 | 77.43 | 80.15 | 81.75 | 83.18 | 83.62 | 84.11 | 83.65 | 83.23 | 83.17 | |
40 | 73.40 | 77.50 | 80.36 | 81.92 | 83.15 | 82.68 | 82.23 | 82.38 | 82.18 | 82.04 | |
Salinas Scene | 5 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 |
10 | 83.61 | 84.16 | 84.95 | 85.03 | 84.77 | 84.68 | 84.59 | 84.82 | 84.61 | 85.48 | |
15 | 83.86 | 83.17 | 85.51 | 87.04 | 87.70 | 88.44 | 89.77 | 89.60 | 91.01 | 91.09 | |
20 | 84.16 | 86.84 | 89.41 | 90.56 | 90.22 | 90.69 | 91.02 | 91.13 | 90.87 | 90.68 | |
25 | 84.11 | 88.22 | 89.46 | 89.68 | 89.35 | 88.77 | 87.86 | 87.49 | 86.71 | 86.78 | |
30 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
35 | 86.17 | 88.80 | 88.67 | 87.67 | 86.63 | 85.84 | 85.26 | 84.87 | 83.95 | 83.06 | |
40 | 86.86 | 89.25 | 89.02 | 87.78 | 86.37 | 84.74 | 83.60 | 82.83 | 81.69 | 80.92 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 314 | 673 | 1139 | 1630 | 2178 | 2680 | 3324 | 4063 | 4820 | 5474 |
OA (%) | 78.69 | 79.93 | 81.20 | 82.05 | 82.49 | 84.38 | 85.70 | 86.52 | 87.33 | 87.77 | |
SRC | NUM | 311 | 648 | 1068 | 1525 | 2032 | 2606 | 3248 | 3899 | 4668 | 5522 |
OA (%) | 78.64 | 80.42 | 80.10 | 81.56 | 82.87 | 84.39 | 85.83 | 87.04 | 87.92 | 88.19 | |
NRS | NUM | 315 | 673 | 1116 | 1597 | 2136 | 2697 | 3265 | 3815 | 4437 | 5016 |
OA (%) | 78.82 | 81.02 | 80.96 | 83.71 | 84.94 | 85.57 | 86.87 | 88.01 | 89.25 | 89.42 | |
MLR | NUM | 133 | 295 | 580 | 936 | 1296 | 1790 | 2396 | 3077 | 3858 | 4648 |
OA (%) | 77.55 | 79.31 | 82.21 | 83.91 | 83.99 | 85.75 | 87.36 | 87.68 | 88.27 | 88.41 | |
KNN + SRC | NUM | 317 | 706 | 1120 | 1702 | 2294 | 2967 | 3728 | 4494 | 5233 | 5950 |
OA (%) | 78.71 | 80.96 | 81.99 | 83.97 | 84.96 | 85.82 | 86.99 | 87.75 | 87.93 | 88.10 | |
KNN + NRS | NUM | 317 | 707 | 1198 | 1691 | 2308 | 2954 | 3625 | 4450 | 5374 | 6305 |
OA (%) | 78.78 | 80.27 | 80.73 | 82.51 | 83.82 | 85.45 | 87.56 | 89.06 | 89.95 | 90.19 | |
KNN + MLR | NUM | 318 | 712 | 1206 | 1794 | 2555 | 3398 | 4292 | 5072 | 5783 | 6678 |
OA (%) | 78.79 | 80.56 | 82.57 | 86.08 | 87.10 | 87.87 | 88.50 | 88.88 | 89.18 | 89.31 | |
SRC + KNN | NUM | 317 | 706 | 1134 | 1673 | 2223 | 2875 | 3658 | 4317 | 5011 | 5748 |
OA (%) | 78.71 | 81.09 | 81.51 | 83.17 | 84.27 | 85.84 | 86.94 | 87.54 | 88.26 | 88.64 | |
SRC + NRS | NUM | 318 | 730 | 1205 | 1778 | 2372 | 3091 | 3969 | 4813 | 5787 | 6641 |
OA (%) | 78.81 | 80.79 | 82.37 | 85.03 | 86.81 | 88.45 | 88.93 | 89.86 | 90.49 | 90.78 | |
SRC + MLR | NUM | 315 | 708 | 1153 | 1744 | 2457 | 3322 | 4231 | 5042 | 5800 | 6735 |
OA (%) | 78.80 | 80.92 | 82.16 | 85.16 | 86.24 | 87.61 | 88.39 | 88.71 | 89.07 | 89.19 | |
NRS + KNN | NUM | 317 | 707 | 1202 | 1712 | 2385 | 3021 | 3700 | 4538 | 5399 | 6333 |
OA (%) | 78.74 | 80.67 | 81.33 | 83.36 | 84.46 | 86.35 | 87.93 | 89.23 | 90.03 | 90.43 | |
NRS + SRC | NUM | 318 | 734 | 1207 | 1728 | 2378 | 3061 | 3959 | 4799 | 5691 | 6739 |
OA (%) | 78.77 | 81.01 | 82.56 | 84.52 | 86.37 | 87.43 | 88.95 | 90.11 | 90.61 | 90.90 | |
NRS + MLR | NUM | 318 | 694 | 1148 | 1690 | 2446 | 3194 | 4102 | 4950 | 5768 | 6792 |
OA (%) | 78.91 | 81.39 | 81.62 | 85.51 | 87.21 | 89.63 | 90.28 | 90.92 | 91.28 | 91.88 | |
MLR + KNN | NUM | 318 | 677 | 1166 | 1689 | 2429 | 3246 | 4018 | 4737 | 5677 | 6672 |
OA (%) | 78.30 | 81.13 | 81.85 | 85.94 | 87.22 | 88.28 | 88.77 | 88.98 | 89.20 | 89.29 | |
MLR + SRC | NUM | 315 | 704 | 1258 | 1741 | 2439 | 3286 | 4097 | 4867 | 5775 | 6760 |
OA (%) | 78.31 | 81.81 | 82.55 | 85.09 | 86.86 | 88.30 | 89.01 | 89.44 | 90.21 | 90.71 | |
MLR + NRS | NUM | 318 | 683 | 1154 | 1701 | 2458 | 3301 | 4219 | 5057 | 5997 | 6889 |
OA (%) | 78.46 | 81.49 | 82.11 | 86.14 | 87.86 | 89.70 | 90.68 | 91.58 | 92.15 | 92.42 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 236 | 444 | 661 | 965 | 1210 | 1466 | 1812 | 2254 | 2762 | 3251 |
OA (%) | 71.19 | 73.36 | 74.70 | 75.69 | 76.10 | 76.16 | 77.28 | 78.48 | 77.60 | 77.69 | |
SRC | NUM | 225 | 477 | 733 | 1080 | 1492 | 2047 | 2793 | 3654 | 4605 | 5631 |
OA (%) | 72.21 | 72.19 | 76.97 | 79.12 | 80.08 | 81.77 | 83.09 | 84.07 | 84.95 | 85.27 | |
NRS | NUM | 225 | 438 | 763 | 1007 | 1256 | 1487 | 1726 | 1893 | 1991 | 2179 |
OA (%) | 71.22 | 73.74 | 75.14 | 76.59 | 76.15 | 76.47 | 76.41 | 76.20 | 75.68 | 75.40 | |
MLR | NUM | 100 | 244 | 396 | 605 | 829 | 1035 | 1304 | 1587 | 1872 | 2183 |
OA (%) | 72.36 | 76.37 | 77.18 | 79.04 | 79.40 | 81.16 | 82.21 | 82.18 | 83.36 | 85.85 | |
KNN + SRC | NUM | 248 | 473 | 787 | 1118 | 1509 | 1996 | 2552 | 3111 | 3975 | 4837 |
OA (%) | 71.48 | 72.82 | 74.61 | 76.59 | 78.38 | 80.08 | 80.82 | 81.62 | 82.40 | 83.99 | |
KNN + NRS | NUM | 240 | 491 | 775 | 1067 | 1382 | 1850 | 2403 | 3040 | 3794 | 4415 |
OA (%) | 71.11 | 73.83 | 77.44 | 78.83 | 79.50 | 79.58 | 79.39 | 80.11 | 80.74 | 81.48 | |
KNN + MLR | NUM | 244 | 515 | 822 | 1176 | 1616 | 2162 | 2905 | 3698 | 4745 | 5682 |
OA (%) | 71.37 | 75.03 | 77.84 | 78.08 | 79.73 | 79.59 | 81.30 | 83.88 | 84.90 | 85.08 | |
SRC + KNN | NUM | 248 | 476 | 795 | 1215 | 1725 | 2298 | 3055 | 3967 | 4977 | 5913 |
OA (%) | 71.38 | 72.54 | 75.26 | 78.09 | 79.19 | 80.21 | 81.88 | 83.31 | 83.35 | 83.51 | |
SRC + NRS | NUM | 242 | 511 | 867 | 1385 | 1889 | 2513 | 3201 | 3988 | 4910 | 5794 |
OA (%) | 71.40 | 74.12 | 77.56 | 80.80 | 82.19 | 82.55 | 83.02 | 83.80 | 84.34 | 85.09 | |
SRC + MLR | NUM | 236 | 507 | 841 | 1289 | 1731 | 2261 | 3016 | 3928 | 4939 | 6015 |
OA (%) | 71.52 | 73.62 | 76.47 | 78.45 | 79.51 | 81.54 | 83.10 | 84.37 | 84.68 | 84.66 | |
NRS + KNN | NUM | 240 | 486 | 803 | 1119 | 1541 | 1992 | 2557 | 3190 | 3939 | 4611 |
OA (%) | 71.05 | 74.37 | 76.57 | 77.77 | 78.06 | 77.21 | 76.75 | 77.35 | 78.88 | 79.28 | |
NRS + SRC | NUM | 242 | 501 | 803 | 1296 | 1792 | 2433 | 3089 | 3839 | 4776 | 5798 |
OA (%) | 71.44 | 74.50 | 76.72 | 79.93 | 81.44 | 81.90 | 82.85 | 83.81 | 85.00 | 85.48 | |
NRS + MLR | NUM | 234 | 517 | 828 | 1237 | 1796 | 2446 | 3354 | 4220 | 5061 | 5857 |
OA (%) | 71.49 | 75.47 | 77.89 | 80.86 | 83.68 | 84.43 | 84.93 | 85.12 | 85.57 | 85.46 | |
MLR + KNN | NUM | 244 | 486 | 746 | 1170 | 1658 | 2300 | 3205 | 4208 | 5336 | 6514 |
OA (%) | 71.47 | 75.30 | 76.67 | 78.76 | 80.35 | 82.97 | 85.05 | 86.05 | 86.88 | 87.02 | |
MLR + SRC | NUM | 236 | 504 | 903 | 1310 | 1708 | 2207 | 3009 | 4024 | 5098 | 6234 |
OA (%) | 71.71 | 74.01 | 79.30 | 79.81 | 80.40 | 83.03 | 86.27 | 86.93 | 87.97 | 88.53 | |
MLR + NRS | NUM | 234 | 524 | 787 | 1205 | 1738 | 2374 | 3116 | 3953 | 4754 | 5586 |
OA (%) | 71.63 | 75.94 | 79.83 | 80.66 | 82.75 | 84.64 | 85.98 | 86.19 | 86.37 | 86.87 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 133 | 251 | 443 | 684 | 963 | 1304 | 1672 | 2047 | 2425 | 2857 |
OA (%) | 83.38 | 86.46 | 86.46 | 86.65 | 87.26 | 87.31 | 87.79 | 87.92 | 87.66 | 87.15 | |
SRC | NUM | 144 | 275 | 441 | 666 | 937 | 1255 | 1606 | 1968 | 2391 | 2811 |
OA (%) | 83.10 | 83.56 | 85.11 | 85.83 | 85.81 | 86.63 | 86.50 | 86.64 | 86.95 | 86.96 | |
NRS | NUM | 148 | 271 | 423 | 668 | 952 | 1263 | 1569 | 1940 | 2351 | 2799 |
OA (%) | 84.04 | 86.06 | 87.62 | 87.63 | 87.10 | 87.48 | 88.44 | 88.43 | 88.32 | 87.92 | |
MLR | NUM | 102 | 177 | 302 | 451 | 621 | 867 | 1176 | 1518 | 1848 | 2217 |
OA (%) | 82.88 | 85.41 | 87.11 | 88.36 | 88.73 | 90.68 | 91.52 | 92.20 | 91.70 | 92.02 | |
KNN + SRC | NUM | 146 | 294 | 479 | 694 | 976 | 1330 | 1691 | 2106 | 2520 | 2985 |
OA (%) | 83.60 | 84.65 | 85.38 | 86.76 | 87.39 | 87.39 | 88.00 | 88.11 | 87.53 | 87.23 | |
KNN + NRS | NUM | 150 | 297 | 500 | 761 | 1108 | 1466 | 1891 | 2354 | 2834 | 3316 |
OA (%) | 83.53 | 86.43 | 87.14 | 87.49 | 87.64 | 87.37 | 88.06 | 88.03 | 87.95 | 88.03 | |
KNN + MLR | NUM | 143 | 285 | 508 | 768 | 1132 | 1546 | 2026 | 2526 | 3049 | 3590 |
OA (%) | 83.38 | 85.50 | 86.34 | 88.35 | 88.80 | 90.07 | 90.24 | 90.08 | 89.86 | 89.79 | |
SRC + KNN | NUM | 146 | 287 | 472 | 686 | 957 | 1281 | 1673 | 2061 | 2485 | 2892 |
OA (%) | 83.25 | 84.70 | 85.79 | 85.88 | 86.70 | 86.92 | 86.89 | 86.93 | 86.85 | 86.90 | |
SRC + NRS | NUM | 150 | 280 | 493 | 755 | 1017 | 1328 | 1708 | 2114 | 2539 | 2982 |
OA (%) | 83.07 | 85.55 | 86.28 | 85.47 | 86.66 | 86.97 | 87.52 | 87.05 | 87.19 | 87.16 | |
SRC + MLR | NUM | 150 | 271 | 483 | 720 | 1108 | 1499 | 1958 | 2451 | 2969 | 3487 |
OA (%) | 83.15 | 84.27 | 87.25 | 87.88 | 88.88 | 89.29 | 89.36 | 89.85 | 89.70 | 89.85 | |
NRS + KNN | NUM | 150 | 298 | 519 | 814 | 1148 | 1556 | 2037 | 2538 | 3027 | 3522 |
OA (%) | 84.01 | 87.12 | 87.79 | 87.30 | 87.54 | 88.38 | 88.09 | 88.25 | 87.95 | 87.88 | |
NRS + SRC | NUM | 150 | 284 | 488 | 803 | 1158 | 1539 | 1988 | 2482 | 2955 | 3423 |
OA (%) | 83.91 | 87.23 | 86.98 | 87.73 | 88.30 | 88.97 | 89.57 | 89.57 | 89.60 | 89.43 | |
NRS + MLR | NUM | 153 | 293 | 509 | 762 | 1104 | 1509 | 1940 | 2441 | 2934 | 3452 |
OA (%) | 83.87 | 85.82 | 88.25 | 89.93 | 90.40 | 90.51 | 90.75 | 90.48 | 90.12 | 89.61 | |
MLR + KNN | NUM | 143 | 299 | 521 | 825 | 1187 | 1602 | 2046 | 2514 | 2993 | 3407 |
OA (%) | 82.80 | 85.42 | 87.67 | 88.79 | 89.68 | 90.06 | 90.54 | 90.92 | 91.27 | 91.32 | |
MLR + SRC | NUM | 150 | 292 | 521 | 799 | 1123 | 1537 | 2007 | 2448 | 2929 | 3367 |
OA (%) | 82.91 | 85.45 | 88.89 | 90.12 | 90.21 | 90.93 | 90.99 | 91.83 | 92.08 | 92.64 | |
MLR + NRS | NUM | 153 | 315 | 564 | 841 | 1197 | 1605 | 2064 | 2561 | 3060 | 3500 |
OA (%) | 82.81 | 84.87 | 88.16 | 89.34 | 89.69 | 90.03 | 90.42 | 90.52 | 91.19 | 91.43 |
Data set | Indian Pines | Pavia University | Salinas Scene |
Selection policy | BT | BT | BT |
Number of selections | 1600 | 1400 | 600 |
Window size | 7 × 7 | 25 × 25 | 20 × 20 |
Combination of classifiers | MLR + NRS | MLR + SRC | MLR + SRC |
Number of labeled samples | 6889 | 6234 | 3367 |
Data set | Index | Initial samples | Labeling samples
Indian Pines | AA | 67.93% | 84.70%
Indian Pines | OA | 77.38% | 92.42%
Indian Pines | KAPPA | 0.746 | 0.914
Pavia University | AA | 60.53% | 81.87%
Pavia University | OA | 69.00% | 88.53%
Pavia University | KAPPA | 0.609 | 0.848
Salinas Scene | AA | 82.59% | 87.76%
Salinas Scene | OA | 84.00% | 92.64%
Salinas Scene | KAPPA | 0.823 | 0.918
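For reference, the AA, OA and KAPPA values reported in the table above can be computed from predicted and true labels as in the following minimal sketch (using scikit-learn; the variable names are illustrative).

```python
# Minimal sketch: overall accuracy (OA), average accuracy (AA, mean per-class
# accuracy) and Cohen's kappa from true and predicted labels.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

def classification_scores(y_true, y_pred):
    oa = accuracy_score(y_true, y_pred)          # overall accuracy
    cm = confusion_matrix(y_true, y_pred)
    per_class = np.diag(cm) / cm.sum(axis=1)     # per-class (producer's) accuracy
    aa = per_class.mean()                        # average accuracy
    kappa = cohen_kappa_score(y_true, y_pred)
    return oa, aa, kappa
```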