
Hyperspectral Image (HSI) refers to the simultaneous imaging of a target area in dozens to hundreds of continuous spectral bands [1,2,3,4]. It effectively integrates the spatial and spectral information of the imaging scene, providing strong target detection ability and good material identification ability [5,6,7,8]. It is widely used in agriculture and forestry, geological exploration, marine exploration, environmental monitoring and other fields [9,10,11,12]. However, HSI is characterized by high data dimensionality, large information redundancy and high correlation between bands, which brings great difficulties to its processing and classification [13,14,15,16,17,18,19,20,21,22,23]. Therefore, how to reduce the redundant information of the data, extract the features of hyperspectral images effectively and realize their accurate classification are hot and difficult issues in current hyperspectral image processing and classification research.
Sample labeling of hyperspectral images often requires expert knowledge and experience, so the cost of sample labeling is high [24,25,26]. When labeled samples are limited, semi-supervised learning can exploit the useful information in the unlabeled samples for model training and reduce the labeling cost [27,28]. In the field of machine learning, semi-supervised learning acquires knowledge and experience from a small number of labeled samples and mines usable information from a large number of unlabeled samples, which can improve the classification accuracy [29,30,31]. Therefore, many scholars have studied semi-supervised learning for remote sensing images. Camps-Valls et al. [32] proposed a graph-based hyperspectral image classification method that constructs the graph structure by a graph method, integrates contextual information through composite kernels and uses the Nyström method to speed up classification. Yang et al. [33] proposed a semi-supervised band selection technique for hyperspectral image classification: a metric learning method is used to measure the features of hyperspectral images, and a semi-supervised learning method selects a subset of effective bands from the original bands. Tan et al. [34] proposed a hyperspectral image classification method based on segmentation integration and a semi-supervised support vector machine, in which the spatial information of the labeled samples is extracted by a segmentation algorithm to filter the samples, and classification is carried out with semi-supervised learning. Samiappan et al. [35] combined active learning and co-training for semi-supervised classification of hyperspectral images: an initial classification model is trained on the labeled samples, heuristic active learning is performed on the unlabeled samples, the labeled samples combined with the original data are divided into views, and the unlabeled samples with high heuristic values are added to the training samples. Zhang et al. [36] used a semi-supervised classification method based on hierarchical segmentation and active learning to extract spatial information from hyperspectral images; the training set is updated iteratively using the information of a large number of unlabeled samples to complete the classification. In addition, other methods have also been proposed in recent years [37,38,39,40,41,42,43,44,45].
In hyperspectral images, each pixel corresponds to a spectral curve that reflects its inherent physical, chemical and optical properties. The main idea of hyperspectral image classification is to use the feature information of different pixels to label the pixels belonging to different ground objects and obtain the corresponding classification maps [46,47]. Therefore, many scholars have studied hyperspectral image classification. Melgani et al. [48] proposed a hyperspectral image classification method based on support vector machines (SVM), in which a kernel function is introduced to solve the nonlinearly separable problem and avoid the curse of dimensionality. Ratle et al. [49] introduced neural networks into hyperspectral image classification; in the training phase, the loss function is optimized to avoid local optima. Chen et al. [50] constructed a hyperspectral image classification model based on sparse representation and compared the classification results with common machine learning methods. To address the shortcomings of sparse representation in dealing with nonlinear problems, Chen et al. [51] proposed a kernel sparse representation technique. In addition, Cui et al. [52] proposed a multiscale sparse representation algorithm for robust hyperspectral image classification, in which automatic and adaptive weight allocation schemes based on the spectral angle ratio are incorporated into a multi-classifier framework to fuse sparse representation information at all scales. Tang et al. [53] proposed two manifold-based sparse representation algorithms to solve the instability problem of sparse algorithms; regularization and local invariance techniques are used, and two manifold-based regularization terms are merged into the l1-based objective function. Wang et al. [54] applied the neighborhood-cutting technique to sparse representation and combined it with a joint spatial-spectral sparse representation classification method. Wang and Celik [55] improved the classification accuracy of hyperspectral images by combining context information in the sparse coefficient domain. Hu et al. [56] proposed two weighted kernel joint sparse representation methods, which determine the weights by calculating the kernel similarity between adjacent pixels. Xue et al. [57] presented two novel sparse graph regularization methods, SGR and SGR with total variation. Yang et al. [58] studied the effect of the p-norm distance metric on the minimum distance technique and proposed a supervised-learning p-norm distance metric to optimize the value of p. Zhang et al. [59] proposed a multi-scale dense network for HSI classification that makes full use of different scale information in the network structure and combines scale information throughout the network. Liu et al. [60] proposed a class-wise adversarial adaptation method in conjunction with the class-wise probability MMD as a class-wise distribution adaptation network. Wang et al. [61] proposed graph-based semi-supervised learning with weighted features for HSI classification. In addition, other new methods have also been proposed by many researchers [62,63,64,65,66,67,68,69,70,71].
To sum up, hyperspectral images contain rich spectral and spatial information about earth surface features, which increases the difficulty of processing and analysis. In addition, the training samples of actual hyperspectral images are limited and sample labeling is costly. In this paper, the local binary pattern, sparse representation and multiple logistic regression model are used. A new hyperspectral image feature extraction method based on the local binary pattern is proposed to obtain the texture features of hyperspectral images and enrich the sample information. A sample selection strategy based on active learning is designed to determine the unlabeled samples to be considered. On this basis, a new sample labeling method based on neighborhood information and priority classifier discrimination is studied in depth to expand the training samples, and a novel classification method based on texture features and semi-supervised learning is studied to improve the classification accuracy of remote sensing images.
The main contributions of this paper are described as follows.
(ⅰ) A novel classification method of hyperspectral remote sensing images based on texture features and semi-supervised learning is proposed by using the local binary pattern, sparse representation and multiple logistic regression.
(ⅱ) The local binary pattern is used to effectively extract the features of spatial texture information of remote sensing images and enrich the feature information of samples.
(ⅲ) A multiple logistic regression model is used to optimally select unlabeled samples, which are then labeled by using neighborhood information and priority classifier discrimination to achieve pseudo-labeling of unlabeled samples.
(ⅳ) A novel classification method of hyperspectral remote sensing images based on semi-supervised learning is proposed to effectively achieve accurate classification of hyperspectral images.
In this section, the local binary pattern (LBP) and sparse representation are introduced in order to clearly describe the basic theory behind these methods.
LBP is a feature extraction method that can extract the spatial texture information of images. Texture, which is widely used in image processing and image analysis, represents the slow or periodic change of the surface structure of an object [72]. LBP is also widely used in feature extraction of hyperspectral images due to its simple structure and easy calculation. Given the center pixel gc(xc, yc), the neighborhood pixels gp are described as follows.
$g_p = \left( x_c + R\cos\dfrac{2\pi p}{P},\; y_c - R\sin\dfrac{2\pi p}{P} \right)$  (1)
where gp (p = 0, 1, ..., P−1) represents the coordinates of the P pixels uniformly distributed on the circle with gc as the center and R as the radius. The quantized texture feature of one region is shown in Figure 1.
$LBP_{P,R}(g_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p}$  (2)
$s(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}$  (3)
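As a concrete illustration of Eqs (1)–(3), the following is a minimal sketch (not the implementation used in this paper) of the basic circular LBP operator for a single band; for simplicity, the sampling coordinates are rounded to the nearest pixel rather than bilinearly interpolated, and border pixels are skipped.

```python
import numpy as np

def lbp_band(band, P=8, R=1.0):
    """Basic circular LBP (Eqs (1)-(3)) for one image band (simplified sketch)."""
    H, W = band.shape
    codes = np.zeros((H, W), dtype=np.int32)
    # Sampling offsets on the circle of radius R around the center pixel (Eq (1))
    angles = 2.0 * np.pi * np.arange(P) / P
    dx = np.rint(R * np.cos(angles)).astype(int)
    dy = np.rint(-R * np.sin(angles)).astype(int)
    r = int(np.ceil(R))
    for y in range(r, H - r):
        for x in range(r, W - r):
            gc = band[y, x]
            code = 0
            for p in range(P):
                gp = band[y + dy[p], x + dx[p]]
                code += (1 if gp - gc > 0 else 0) << p   # s(g_p - g_c) * 2^p
            codes[y, x] = code
    return codes
```

In practice, an equivalent and much faster operator is available as `skimage.feature.local_binary_pattern`, and the per-pixel codes are usually summarized as local histograms before being used as texture features.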
Sparse representation means that a signal can be approximately represented by a linear combination of the atoms in a dictionary. Let X = [X_1, X_2, ..., X_C] be the dictionary built from HSI training pixels, where each pixel lies in R^D and D is the number of image bands. Here, X_i = [x_{i1}, x_{i2}, ..., x_{iN_c}] ∈ R^{D×N_c}, where N_c represents the number of samples in class i.
Then a test sample y of class i can be approximated as follows.
$y \approx x_{i1}\alpha_1 + x_{i2}\alpha_2 + \cdots + x_{iN_c}\alpha_{N_c} = [x_{i1}, x_{i2}, \ldots, x_{iN_c}][\alpha_1, \alpha_2, \ldots, \alpha_{N_c}]^{T} = X_i\alpha_i$  (4)
where X_i represents the sparse sub-dictionary of the samples in class i, and α_i represents the sparse vector of the test sample y, which contains only a few non-zero values.
In order to obtain the sparsest vector α_i, the following optimization problem is solved.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_0, \quad \text{s.t.}\; y = A\alpha_i$  (5)
where ‖·‖_0 is the l0 norm, which counts the number of non-zero atoms in the vector and is also known as the sparsity, and A is the sparse dictionary. Solving Eq (5) directly is an NP-hard problem. Under certain conditions, the l0 minimization problem can be relaxed and approximated by the following l1 minimization problem.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_1, \quad \text{s.t.}\; y = A\alpha_i$  (6)
Furthermore, the solution can be converted to the following expression.
$\tilde{\alpha} = \arg\min \|\alpha_i\|_0, \quad \text{s.t.}\; \|A\alpha_i - y\|_2^2 < \varepsilon$  (7)
where ε represents the reconstruction error tolerance. The orthogonal matching pursuit (OMP) algorithm is used to solve expression (7). After the sparse coefficients are calculated, the reconstruction residual of the test sample y for each class can be computed.
$r_i(y) = \|y - A\tilde{\alpha}_i\|_2$  (8)
where i ∈ {1, 2, ..., C}.
Finally, the reconstruction residuals of all sub-dictionaries are compared, and y is assigned to the class with the minimum residual.
$\text{class}(y) = \arg\min_{i = 1, 2, \ldots, C} r_i(y)$  (9)
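The decision rule in Eqs (4)–(9) can be sketched as follows. This is an illustrative simplification (not the authors' code): it solves a separate OMP problem per class sub-dictionary instead of one joint problem over the full dictionary A, and the sparsity level `n_nonzero_coefs` is an assumed parameter.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_classify(y, class_dicts, n_nonzero_coefs=10):
    """Sparse-representation classification (Eqs (4)-(9)).

    class_dicts: list of per-class sub-dictionaries X_i of shape (D, N_c),
                 whose columns are (normalized) training pixels of class i.
    Returns the index (into class_dicts) of the class with the smallest
    reconstruction residual.
    """
    residuals = []
    for X_i in class_dicts:
        omp = OrthogonalMatchingPursuit(
            n_nonzero_coefs=min(n_nonzero_coefs, X_i.shape[1]),
            fit_intercept=False)
        omp.fit(X_i, y)                                      # approximate Eq (7)
        alpha_i = omp.coef_                                   # sparse coefficients
        residuals.append(np.linalg.norm(y - X_i @ alpha_i))  # residual, Eq (8)
    return int(np.argmin(residuals))                          # decision rule, Eq (9)
```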
In hyperspectral remote sensing image classification, the attribute characteristics of each ground object follow a specific distribution. If the number of training samples is not large enough to depict that distribution, the subsequent classification results will suffer. In practical applications there are difficulties such as few labeled samples and manual labeling that is hard, time-consuming and labor-intensive. Therefore, a new sample labeling method based on neighborhood information and priority classifier discrimination is proposed to obtain learned pseudo-labeled samples, in order to expand the labeled samples and improve the classification accuracy of the model.
Before the samples are labeled, sample selection is required, because labeling all unlabeled samples directly would cost a large amount of computation time. Moreover, due to the small number of initially labeled samples and the limited available information, it is difficult to label some samples with sufficient accuracy, and mislabeled samples obviously harm the classification accuracy of the model. The main objective of the sample selection strategy is to select the unlabeled samples that carry the largest amount of information; after labeling, these samples form a valuable training set that effectively promotes the improvement of the classification results. Therefore, a sample selection method based on multiple logistic regression is proposed to realize the selection of samples.
The class-probability matrix p(y_i = k | x_i) of each sample produced by the multiple logistic regression model contains a large amount of information that can be mined. The multiple logistic regression classifier is modeled as a discriminative Bayesian decision model. According to the generalized linear model, it can be written as follows.
$P(y; \delta) = b(y)\exp\left(\delta^{T} T(y) - a(\delta)\right)$  (10)
The specific expression of multiple logistic regression is described as follow.
$p(y_i = k \mid x_i, \eta) = \dfrac{\exp\left(\eta_k g(x_i)\right)}{\sum_{k'=1}^{N} \exp\left(\eta_{k'} g(x_i)\right)}$  (11)
where g(x) = [g_1(x), g_2(x), …, g_f(x)]^T is the feature vector of the input and η = [η_1^T, η_2^T, …, η_k^T]^T represents the regression parameter vector of the classifier. It is worth noting that the feature vector is often constructed by introducing a kernel, which not only improves the separability but also yields a better classifier from the training samples. Generally, the kernel function is the radial basis function (RBF), described as follows.
$K(x_m, x_n) = \exp\left(-\dfrac{\|x_m - x_n\|^2}{2p^2}\right)$  (12)
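Since the paper does not spell out how the feature vector g(x) in Eq (11) is built, the following sketch assumes the common choice of RBF-kernel similarities (Eq (12)) to a set of labeled anchor pixels, and evaluates the class-probability matrix with a softmax; the function names and the width parameter p are illustrative.

```python
import numpy as np

def rbf_features(X, anchors, p=1.0):
    """g(x): RBF-kernel similarities of each pixel to the anchor pixels, Eq (12)."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * p ** 2))

def mlr_probabilities(G, eta):
    """p(y_i = k | x_i, eta) via a softmax over classes, Eq (11).

    G:   (n_samples, n_features) kernel feature matrix.
    eta: (n_classes, n_features) regression parameters.
    """
    scores = G @ eta.T
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)
```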
After the feature vector is determined, only the regression parameter η of the model remains to be determined, and then the probability matrix p(y_i = k | x_i) of each unlabeled sample belonging to each class can be computed. The amount of information of a sample can be measured by Breaking Ties (BT) or Least Confidence (LC). In this paper, BT is selected to determine the amount of information.
BT measures the similarity between the two most likely classes by comparing the difference between the maximum class probability and the second-largest class probability. The smaller the difference, the greater the similarity between the two classes, the greater the uncertainty and the greater the amount of information. S_i denotes this similarity score, which is described as follows.
$S_i = \max_k p(y_i = k \mid x_i) - \operatorname{secondmax}_k\, p(y_i = k \mid x_i)$  (13)
The values S_i are finally sorted in ascending order. p(y_i = k | x_i) is the probability matrix of each sample obtained by the multiple logistic regression classifier; max_k p(y_i = k | x_i) is the largest value in the probability matrix and secondmax_k p(y_i = k | x_i) is the second largest value, where k indexes the classes (regression parameters).
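Given the class-probability matrix of the unlabeled pool (for example, from the multiple logistic regression classifier above, or from scikit-learn's `LogisticRegression` via `predict_proba`), the BT criterion in Eq (13) reduces to a few lines; the selection size of 200 matches the experiments below but is otherwise a free parameter.

```python
import numpy as np

def breaking_ties_select(proba, n_select=200):
    """Select the most informative unlabeled samples by Breaking Ties (Eq (13)).

    proba: (n_unlabeled, n_classes) class-probability matrix p(y_i = k | x_i).
    Returns the indices of the n_select samples with the smallest BT score.
    """
    part = np.sort(proba, axis=1)       # per-row probabilities in ascending order
    s = part[:, -1] - part[:, -2]       # max minus second max, Eq (13)
    return np.argsort(s)[:n_select]     # smallest S_i = most informative
```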
The features of hyperspectral remote sensing images are spatially correlated: the closer two ground objects are, the stronger the correlation. In sample labeling research, the spatial neighborhood information of training samples is widely used. However, because the central pixel is unknown and the available information is limited, the neighborhood information of unlabeled samples is used relatively rarely. Generally, the label of each pixel of a hyperspectral image is consistent with the label of at least one pixel in its neighborhood. This property can be used to label the unlabeled samples: the label information of the training samples around an unlabeled sample can be used to discriminate its label. The labeling discrimination method based on neighborhood information is centered on the sample to be labeled; the unlabeled sample is examined within a block window, and all labels occurring in the window are recorded as its neighborhood information. The labeled samples are used to train a classifier, which then classifies the unlabeled samples. If the label predicted by the classifier appears in the neighborhood information of the unlabeled sample, it is accepted as the sample label; otherwise, the sample is returned to the unlabeled set. One important question is whether the unlabeled samples that satisfy the neighborhood condition can be reliably labeled by a single classifier. Some studies use multiple classifiers to discriminate jointly and achieve good classification results. However, it remains unclear how to determine the label when the labels predicted by the classifiers are inconsistent but all of them appear in the neighborhood information of the unlabeled sample.
Therefore, a sample labeling method based on neighborhood information and priority classifier discrimination is proposed in this paper. For unlabeled samples with neighborhood information, the classifier with the highest priority is used first for prediction. If the predicted label appears in the neighborhood information, the label is accepted. Otherwise, the classifier with the next lower priority is used for prediction, and this is repeated until the label can be determined or the sample labeling ends. The sample labeling process based on neighborhood information and priority classifier discrimination is shown in Figure 2.
The sample labeling method is an iterative process. Although it cannot guarantee that there are enough training samples around every unlabeled sample at the initial stage, it can guarantee that some unlabeled samples have sufficient neighborhood information. Those samples are labeled and added to the training set, which grows with each iteration. Unlabeled samples whose neighborhood training samples are initially insufficient may satisfy the labeling condition in a later iteration. This labeling-with-replacement scheme ensures the sample labeling accuracy to a certain extent and improves the performance of the classifier.
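The labeling rule described above can be summarized by the following sketch, an illustrative reading of Figure 2 with hypothetical helper names: classifiers are queried in priority order, and a prediction is accepted only if it also occurs among the labels of already-labeled pixels inside the block window around the candidate (label 0 is assumed here to mean "unlabeled").

```python
import numpy as np

def pseudo_label(candidates, coords, label_map, classifiers, window):
    """Label candidates by neighborhood information + priority classifiers.

    candidates:  (n, d) feature vectors of the selected unlabeled samples.
    coords:      (n, 2) row/col positions of those samples in the image.
    label_map:   2-D array with the currently known labels (0 = unlabeled).
    classifiers: list of fitted classifiers, ordered by priority (highest first).
    window:      side length of the block window (e.g., 7, 20 or 25).
    Returns (indices, labels) of the samples that could be pseudo-labeled.
    """
    half = window // 2
    H, W = label_map.shape
    kept_idx, kept_lab = [], []
    for i, (r, c) in enumerate(coords):
        r0, r1 = max(r - half, 0), min(r + half + 1, H)
        c0, c1 = max(c - half, 0), min(c + half + 1, W)
        neighborhood = set(np.unique(label_map[r0:r1, c0:c1])) - {0}
        if not neighborhood:
            continue                              # no labeled neighbors yet
        for clf in classifiers:                   # highest priority first
            pred = int(clf.predict(candidates[i:i + 1])[0])
            if pred in neighborhood:              # accept only consistent labels
                kept_idx.append(i)
                kept_lab.append(pred)
                break                             # lower priorities not consulted
    return kept_idx, kept_lab
```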
Hyperspectral remote sensing images consist of hundreds of continuous spectral bands and contain rich spectral and spatial information about earth surface features. Some objects that cannot be identified by conventional remote sensing means can be identified in hyperspectral images. However, the abundant data increase the processing and analysis difficulties, and there are problems of few labeled samples, difficult and time-consuming manual labeling, and so on. In order to improve the classification accuracy of hyperspectral remote sensing images, a new classification method based on texture features and semi-supervised learning is proposed in this paper. First, aiming at the problems of high correlation between bands, information redundancy, high dimensionality and complex processing, LBP is employed to process the hyperspectral images, and the texture features of the images are effectively extracted to enrich the feature information. Second, to solve the problem of limited labeled samples, a new sample labeling method based on neighborhood information and priority classifier discrimination is proposed, together with a sample selection strategy designed to pick informative samples from the large number of unlabeled samples. Finally, sparse representation and multiple logistic regression are applied to achieve a new classification method that effectively realizes accurate classification of hyperspectral images.
The model of hyperspectral image classification method based on texture features and semi-supervised learning is shown in Figure 3.
The detailed implementation steps of hyperspectral image classification method based on texture features and semi-supervised learning are described as follows.
Step 1. Normalize the hyperspectral images. Principal component analysis is used to perform dimensionality reduction.
Step 2. Calculate the LBP texture map of each principal component. Histogram statistics are performed according to the symmetric rotation-invariant equivalence mode, and the D-dimensional feature data of each layer are used to obtain the feature information of each pixel while retaining the pixel coordinates.
Step 3. The original hyperspectral images are performed by linear discriminant analysis to obtain a projection matrix.
Step 4. The hyperspectral images are mapped into a low-dimensional space suitable for Euclidean distance analysis by using the projection matrix.
Step 5. Training and test sample sets are randomly selected in a certain proportion from each class of the low-dimensional images.
Step 6. Calculate the Euclidean distance $L_k = \|y_i - A_k\|_2$ between each test sample and all training samples, and keep the K training samples (and their coordinates) closest to each test sample as its sparse dictionary.
Step 7. The extracted feature information is loaded into the test samples and the corresponding sparse dictionaries according to the retained coordinates to obtain the test sample set and its sparse dictionary.
Step 8. The sparse representation problem is solved: orthogonal matching pursuit (OMP) is used to calculate the sparse coefficients, and the classifications of the test samples are obtained.
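Step 6 can be sketched as follows (hypothetical names; the LDA projection of Steps 3–4 is assumed to be available). The dictionary size K is not reported in the paper and is chosen here only for illustration.

```python
import numpy as np

def local_dictionary_indices(test_proj, train_proj, K=40):
    """Step 6: for each projected test pixel y_i, compute L_k = ||y_i - A_k||_2
    to every projected training pixel A_k and keep the K nearest training pixels."""
    d = np.linalg.norm(test_proj[:, None, :] - train_proj[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, :K]     # (n_test, K) training-pixel indices
```

The LBP features gathered at these indices, grouped by class, are then passed to the SRC/OMP rule sketched earlier to complete Steps 7 and 8.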
1) Indian Pines data
The Indian Pines images, covering a site in northwestern Indiana, were collected by the AVIRIS sensor. The images consist of 145 × 145 pixels and 224 spectral reflection bands in the wavelength range 0.4–2.5 μm, including 16 types of ground features. The false color map and real ground object distribution are shown in Figure 4.
2) Salinas Scene data
The AVIRIS spectrometer collected the images of the Salinas Valley in California, USA, with a size of 512 × 217 pixels and a total of 224 bands. After the bands covering the water absorption region are removed, 204 bands are used, including 16 types of ground features. The false color map and the real ground object distribution are shown in Figure 5.
3) Pavia University data
The Pavia University images were taken over the campus of the University of Pavia, Italy, by the ROSIS spectrometer. They are 610 × 340 pixels in size and originally contain 115 bands; after the bands covering the water absorption region are removed, 103 bands remain, containing a total of 9 types of ground features. The false color map and the real ground object distribution are shown in Figure 6.
In the experiment, 10% of each type of ground object of the three kinds of data is randomly selected as the training samples, and the rest is the test samples.
The confusion matrix (CM) is usually used in the classification evaluation of hyperspectral images. A confusion matrix is generally defined as follows.
$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}$  (14)
where n denotes the number of categories and p_ij represents the number of samples belonging to class i that are assigned to class j. The sum of each row denotes the true number of samples in that category, and the sum of each column represents the number of samples assigned to that category.
Based on the confusion matrix, three classification indexes can be obtained, which are Overall Accuracy (OA), Average Accuracy (AA) and Kappa coefficient.
$OA = \dfrac{\sum_{i=1}^{n} p_{ii}}{N}$  (15)
where N represents the total number of samples and p_ii represents the number of correctly classified samples of class i. OA is the probability that the classification result of a randomly chosen sample agrees with its true label.
$CA_i = \dfrac{p_{ii}}{N_i}$  (16)
where N_i represents the total number of samples of class i, and CA_i represents the probability that class i is correctly classified. AA is the average of CA_i over all classes.
$Kappa = \dfrac{N\sum_{i=1}^{n} p_{ii} - \sum_{i=1}^{n}\left(\sum_{j=1}^{n} p_{ij} \sum_{j=1}^{n} p_{ji}\right)}{N^2 - \sum_{i=1}^{n}\left(\sum_{j=1}^{n} p_{ij} \sum_{j=1}^{n} p_{ji}\right)}$  (17)
The Kappa coefficient comprehensively considers the number of correctly classified objects and the error of misclassified objects on the confusion matrix.
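The three indexes can be computed directly from integer label vectors with a short routine such as the following sketch, using the convention of Eq (14) that rows of P index the true classes and columns index the assigned classes.

```python
import numpy as np

def evaluate(y_true, y_pred, n_classes):
    """Compute OA, AA (mean of the per-class CA_i) and the Kappa coefficient."""
    P = np.zeros((n_classes, n_classes), dtype=np.int64)    # confusion matrix, Eq (14)
    for t, p in zip(y_true, y_pred):
        P[t, p] += 1
    N = P.sum()
    oa = np.trace(P) / N                                    # Eq (15)
    ca = np.diag(P) / P.sum(axis=1)                         # per-class accuracy, Eq (16)
    aa = ca.mean()
    pe = (P.sum(axis=1) * P.sum(axis=0)).sum()              # chance-agreement term
    kappa = (N * np.trace(P) - pe) / (N ** 2 - pe)          # Eq (17)
    return oa, aa, kappa
```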
In the experiments, a large number of alternative parameter values are tested, with some classical values from the literature used as starting points; these values are adjusted experimentally until the most reasonable parameter values are determined. The selected parameter values yield the best results among those tested, so that the effectiveness of the method can be verified accurately and efficiently.
The quality of sample selection directly affects the efficiency of the experiment and the performance of the classifier. In order to select the best samples, the results of Information Entropy (IE), Minimum Error (ME), Breaking Ties (BT) and Least Confidence (LC) on the three kinds of hyperspectral images are compared. The experiment starts with 10 initial labeled samples per class, and all remaining samples are test samples. Two hundred unlabeled samples are selected by each of the four sample selection methods in every iteration. The labeled samples are used to expand the training set, train the classifier and classify the test samples. The quality of the selected samples is judged by the classification results after each iteration. The classification accuracy (%) of the different sample selection methods on the different data sets over ten iterations is shown in Table 1.
Data | Selection method | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | IE | 78.39 | 79.10 | 79.99 | 80.71 | 82.11 | 83.47 | 84.10 | 85.51 | 86.48 | 87.23 |
ME | 80.33 | 86.83 | 91.08 | 92.92 | 95.28 | 97.02 | 98.00 | 98.45 | 98.90 | 99.13 | |
BT | 91.47 | 95.47 | 98.22 | 98.36 | 98.64 | 98.59 | 98.66 | 98.71 | 99.34 | 99.29 | |
LC | 84.74 | 88.95 | 91.63 | 94.33 | 95.27 | 96.46 | 98.15 | 98.55 | 98.62 | 98.66 | |
Pavia University | IE | 69.87 | 71.31 | 71.75 | 71.74 | 72.04 | 72.40 | 73.25 | 73.36 | 73.76 | 74.45 |
ME | 73.01 | 75.23 | 78.74 | 82.78 | 89.14 | 92.82 | 95.14 | 96.10 | 97.13 | 97.82 | |
BT | 87.54 | 92.63 | 94.51 | 95.24 | 95.84 | 96.10 | 96.39 | 96.50 | 96.69 | 96.71 | |
LC | 74.91 | 76.33 | 80.76 | 84.45 | 87.23 | 89.89 | 90.27 | 90.67 | 90.56 | 91.47 | |
Salinas Scene | IE | 84.16 | 84.60 | 84.65 | 85.02 | 85.09 | 85.25 | 85.50 | 85.65 | 85.91 | 85.98 |
ME | 85.14 | 89.29 | 92.81 | 94.31 | 96.48 | 97.41 | 98.04 | 98.20 | 98.66 | 98.88 | |
BT | 95.30 | 96.98 | 98.26 | 98.71 | 98.95 | 98.90 | 99.03 | 99.21 | 99.25 | 99.24 | |
LC | 88.60 | 91.19 | 92.84 | 93.58 | 93.80 | 95.74 | 97.56 | 97.72 | 98.56 | 98.86 |
From Table 1, it can be seen that the classification accuracies of ME, BT and LC improve greatly on the three data sets as the number of iterations increases. The results of the BT method are significantly better than those of the other methods, and its accuracy improves within the first few iterations, which indicates that the BT method selects samples that yield a greater classification improvement. Therefore, the BT method is chosen as the sample selection method in this paper.
In the labeling process, not all samples are labeled correctly: the more samples are screened, the more samples may be mislabeled, which makes the training set noisier and affects the generalization ability of the classifier. If the number of screened samples is too small, the labeled samples will not improve the classification accuracy of the classifier, and the labeling efficiency is reduced. The classification accuracy (%) under different screening quantities over ten iterations is shown in Table 2.
Data | Quantity | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 200 | 77.30 | 77.57 | 78.32 | 77.97 | 78.59 | 78.97 | 78.96 | 79.28 | 79.51 | 79.23 |
400 | 77.54 | 78.60 | 79.80 | 80.80 | 82.13 | 82.20 | 83.03 | 83.77 | 83.83 | 83.85 | |
600 | 77.52 | 79.54 | 79.37 | 79.27 | 80.08 | 81.99 | 83.89 | 83.60 | 84.30 | 84.48 | |
800 | 77.75 | 79.84 | 80.87 | 80.22 | 82.11 | 83.91 | 84.41 | 84.57 | 85.12 | 85.94 | |
1000 | 77.85 | 80.49 | 80.28 | 82.74 | 81.35 | 82.60 | 84.18 | 85.88 | 86.77 | 87.84 | |
1200 | 77.85 | 79.95 | 79.79 | 80.38 | 81.59 | 84.41 | 84.72 | 85.74 | 87.48 | 88.85 | |
1400 | 78.18 | 80.20 | 80.09 | 83.96 | 85.00 | 85.78 | 87.93 | 89.90 | 91.07 | 91.20 | |
1600 | 78.55 | 80.56 | 80.59 | 84.12 | 86.90 | 87.82 | 89.02 | 90.87 | 91.49 | 91.83 | |
1800 | 78.34 | 79.76 | 79.23 | 82.64 | 85.39 | 87.32 | 88.81 | 89.97 | 90.69 | 91.46 | |
2000 | 78.01 | 79.16 | 80.24 | 82.55 | 86.02 | 87.48 | 88.66 | 89.63 | 90.02 | 90.46 | |
Pavia University | 200 | 68.75 | 73.93 | 76.73 | 78.18 | 79.23 | 80.92 | 81.85 | 82.40 | 82.74 | 83.60 |
400 | 66.41 | 73.11 | 75.45 | 78.20 | 81.18 | 82.13 | 82.57 | 83.39 | 84.07 | 83.98 | |
600 | 68.88 | 76.35 | 78.22 | 80.73 | 82.59 | 83.29 | 83.68 | 84.57 | 84.95 | 84.98 | |
800 | 69.89 | 77.50 | 80.31 | 81.91 | 83.22 | 84.85 | 84.99 | 84.79 | 85.16 | 85.31 | |
1000 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
1200 | 70.24 | 75.18 | 80.32 | 83.13 | 84.14 | 85.04 | 85.30 | 85.55 | 85.04 | 84.90 | |
1400 | 70.34 | 76.23 | 80.57 | 82.46 | 83.92 | 84.71 | 85.59 | 85.77 | 85.87 | 86.47 | |
1600 | 70.40 | 75.87 | 80.64 | 83.06 | 83.93 | 84.69 | 85.28 | 86.02 | 86.19 | 85.83 | |
1800 | 69.77 | 76.12 | 80.18 | 82.99 | 85.19 | 85.04 | 84.89 | 85.26 | 85.68 | 85.68 | |
2000 | 69.71 | 75.90 | 82.29 | 83.40 | 84.41 | 85.23 | 85.59 | 85.77 | 85.86 | 85.87 | |
Salinas Scene | 200 | 85.09 | 87.26 | 89.04 | 89.35 | 89.24 | 89.22 | 88.85 | 88.47 | 88.14 | 88.03 |
400 | 84.94 | 88.34 | 89.88 | 89.26 | 89.12 | 88.88 | 88.26 | 87.68 | 87.40 | 87.25 | |
600 | 85.69 | 90.85 | 90.80 | 90.42 | 89.71 | 89.04 | 88.63 | 87.84 | 87.12 | 86.60 | |
800 | 85.36 | 89.17 | 88.87 | 87.96 | 87.46 | 86.85 | 86.42 | 85.69 | 85.13 | 84.58 | |
1000 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
1200 | 85.04 | 89.66 | 90.08 | 88.98 | 87.67 | 87.14 | 86.13 | 85.30 | 85.09 | 84.85 | |
1400 | 85.07 | 88.46 | 89.39 | 88.63 | 88.10 | 88.06 | 87.34 | 86.81 | 86.69 | 86.58 | |
1600 | 85.47 | 89.54 | 89.88 | 88.87 | 87.68 | 86.21 | 85.00 | 84.34 | 83.89 | 83.70 | |
1800 | 85.50 | 90.16 | 90.15 | 89.45 | 88.58 | 87.71 | 86.79 | 86.52 | 86.31 | 85.93 | |
2000 | 85.48 | 89.40 | 89.38 | 88.60 | 87.73 | 85.98 | 85.67 | 85.12 | 85.39 | 85.34 |
It can be seen from Table 2 that the screening quantity behaves differently on the different data sets. The Indian Pines data set reaches its highest classification accuracy after 10 iterations, the Pavia University data set after 8–10 iterations and the Salinas Scene data set after 2–4 iterations. The screening quantity with the highest accuracy is taken as the experimental parameter: 1600 for Indian Pines, 1400 for Pavia University and 600 for Salinas Scene.
The size of the block window determines the neighborhood information of the samples, which directly affects the accuracy of the pseudo-labeling method. Because the data sets differ in size, the optimal block window size is also determined through a large number of experiments. The classification accuracy (%) under different block window sizes over ten iterations is shown in Table 3.
Data | Block window size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 3 | 77.62 | 77.86 | 78.35 | 78.65 | 78.91 | 79.14 | 79.09 | 79.12 | 79.56 | 79.56 |
4 | 77.73 | 78.24 | 78.65 | 78.75 | 78.70 | 78.66 | 79.14 | 79.18 | 79.70 | 79.74 | |
5 | 78.41 | 79.61 | 79.41 | 81.28 | 81.11 | 81.73 | 82.67 | 82.84 | 84.18 | 84.68 | |
6 | 77.85 | 80.14 | 80.41 | 80.91 | 80.90 | 83.54 | 84.91 | 85.09 | 86.42 | 88.30 | |
7 | 78.86 | 80.58 | 82.58 | 84.90 | 86.55 | 86.99 | 88.47 | 88.87 | 89.58 | 90.98 | |
8 | 78.33 | 78.95 | 81.00 | 84.24 | 85.44 | 86.37 | 87.60 | 88.13 | 88.57 | 89.19 | |
9 | 79.61 | 81.00 | 83.39 | 85.51 | 85.71 | 86.34 | 86.78 | 86.86 | 87.07 | 87.86 | |
10 | 78.61 | 79.32 | 83.45 | 85.45 | 85.59 | 85.69 | 86.12 | 87.00 | 87.17 | 87.32 | |
Pavia University | 5 | 68.39 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 |
10 | 68.20 | 67.67 | 69.05 | 69.06 | 68.30 | 68.13 | 68.06 | 67.98 | 67.89 | 67.62 | |
15 | 70.91 | 70.59 | 70.96 | 70.50 | 71.89 | 73.54 | 73.80 | 74.58 | 75.25 | 75.32 | |
20 | 71.47 | 71.63 | 73.60 | 76.12 | 75.99 | 75.61 | 77.18 | 77.51 | 77.81 | 78.14 | |
25 | 71.25 | 74.57 | 78.52 | 79.15 | 81.95 | 82.32 | 83.66 | 84.88 | 85.45 | 85.51 | |
30 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
35 | 71.48 | 77.43 | 80.15 | 81.75 | 83.18 | 83.62 | 84.11 | 83.65 | 83.23 | 83.17 | |
40 | 73.40 | 77.50 | 80.36 | 81.92 | 83.15 | 82.68 | 82.23 | 82.38 | 82.18 | 82.04 | |
Salinas Scene | 5 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 |
10 | 83.61 | 84.16 | 84.95 | 85.03 | 84.77 | 84.68 | 84.59 | 84.82 | 84.61 | 85.48 | |
15 | 83.86 | 83.17 | 85.51 | 87.04 | 87.70 | 88.44 | 89.77 | 89.60 | 91.01 | 91.09 | |
20 | 84.16 | 86.84 | 89.41 | 90.56 | 90.22 | 90.69 | 91.02 | 91.13 | 90.87 | 90.68 | |
25 | 84.11 | 88.22 | 89.46 | 89.68 | 89.35 | 88.77 | 87.86 | 87.49 | 86.71 | 86.78 | |
30 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
35 | 86.17 | 88.80 | 88.67 | 87.67 | 86.63 | 85.84 | 85.26 | 84.87 | 83.95 | 83.06 | |
40 | 86.86 | 89.25 | 89.02 | 87.78 | 86.37 | 84.74 | 83.60 | 82.83 | 81.69 | 80.92 |
As can be seen from Table 3, the classification accuracy changes differently for the different data sets. Compared with the other two data sets, Indian Pines is the smallest, so its block window side lengths range from 3 to 10. With the increase of the number of iterations, the classification accuracy gradually increases, and the optimal accuracy is obtained when the side length is 7. When the side length of the block window for the Pavia and Salinas data sets is too small, the classification accuracy does not improve with the iterations, which indicates that the neighborhood information cannot distinguish the categories in that case.
As the side length of the block window increases, the optimal classification accuracy is reached in fewer iterations, but the optimal accuracy itself decreases. The larger the block window, the more noise is introduced, which harms the sample labeling accuracy. Therefore, the block window size of the Indian Pines data set is set to 7 × 7, and the block window sizes of the Pavia and Salinas data sets are set to 25 × 25 and 20 × 20, respectively.
In fact, the determination of the pseudo-labels of the samples mainly depends on the discriminating classifiers. K-Nearest Neighbor (KNN), the Sparse Representation-based Classifier (SRC), the Neighborhood Rough Set (NRS) and multiple logistic regression (MLR) are employed to determine the pseudo-labels. The results of single classifiers and different classifier combinations on the different data sets are shown in Tables 4–6.
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 314 | 673 | 1139 | 1630 | 2178 | 2680 | 3324 | 4063 | 4820 | 5474 |
OA (%) | 78.69 | 79.93 | 81.20 | 82.05 | 82.49 | 84.38 | 85.70 | 86.52 | 87.33 | 87.77 | |
SRC | NUM | 311 | 648 | 1068 | 1525 | 2032 | 2606 | 3248 | 3899 | 4668 | 5522 |
OA (%) | 78.64 | 80.42 | 80.10 | 81.56 | 82.87 | 84.39 | 85.83 | 87.04 | 87.92 | 88.19 | |
NRS | NUM | 315 | 673 | 1116 | 1597 | 2136 | 2697 | 3265 | 3815 | 4437 | 5016 |
OA (%) | 78.82 | 81.02 | 80.96 | 83.71 | 84.94 | 85.57 | 86.87 | 88.01 | 89.25 | 89.42 | |
MLR | NUM | 133 | 295 | 580 | 936 | 1296 | 1790 | 2396 | 3077 | 3858 | 4648 |
OA (%) | 77.55 | 79.31 | 82.21 | 83.91 | 83.99 | 85.75 | 87.36 | 87.68 | 88.27 | 88.41 | |
KNN + SRC | NUM | 317 | 706 | 1120 | 1702 | 2294 | 2967 | 3728 | 4494 | 5233 | 5950 |
OA (%) | 78.71 | 80.96 | 81.99 | 83.97 | 84.96 | 85.82 | 86.99 | 87.75 | 87.93 | 88.10 | |
KNN + NRS | NUM | 317 | 707 | 1198 | 1691 | 2308 | 2954 | 3625 | 4450 | 5374 | 6305 |
OA (%) | 78.78 | 80.27 | 80.73 | 82.51 | 83.82 | 85.45 | 87.56 | 89.06 | 89.95 | 90.19 | |
KNN + MLR | NUM | 318 | 712 | 1206 | 1794 | 2555 | 3398 | 4292 | 5072 | 5783 | 6678 |
OA (%) | 78.79 | 80.56 | 82.57 | 86.08 | 87.10 | 87.87 | 88.50 | 88.88 | 89.18 | 89.31 | |
SRC + KNN | NUM | 317 | 706 | 1134 | 1673 | 2223 | 2875 | 3658 | 4317 | 5011 | 5748 |
OA (%) | 78.71 | 81.09 | 81.51 | 83.17 | 84.27 | 85.84 | 86.94 | 87.54 | 88.26 | 88.64 | |
SRC + NRS | NUM | 318 | 730 | 1205 | 1778 | 2372 | 3091 | 3969 | 4813 | 5787 | 6641 |
OA (%) | 78.81 | 80.79 | 82.37 | 85.03 | 86.81 | 88.45 | 88.93 | 89.86 | 90.49 | 90.78 | |
SRC + MLR | NUM | 315 | 708 | 1153 | 1744 | 2457 | 3322 | 4231 | 5042 | 5800 | 6735 |
OA (%) | 78.80 | 80.92 | 82.16 | 85.16 | 86.24 | 87.61 | 88.39 | 88.71 | 89.07 | 89.19 | |
NRS + KNN | NUM | 317 | 707 | 1202 | 1712 | 2385 | 3021 | 3700 | 4538 | 5399 | 6333 |
OA (%) | 78.74 | 80.67 | 81.33 | 83.36 | 84.46 | 86.35 | 87.93 | 89.23 | 90.03 | 90.43 | |
NRS + SRC | NUM | 318 | 734 | 1207 | 1728 | 2378 | 3061 | 3959 | 4799 | 5691 | 6739 |
OA (%) | 78.77 | 81.01 | 82.56 | 84.52 | 86.37 | 87.43 | 88.95 | 90.11 | 90.61 | 90.90 | |
NRS + MLR | NUM | 318 | 694 | 1148 | 1690 | 2446 | 3194 | 4102 | 4950 | 5768 | 6792 |
OA (%) | 78.91 | 81.39 | 81.62 | 85.51 | 87.21 | 89.63 | 90.28 | 90.92 | 91.28 | 91.88 | |
MLR + KNN | NUM | 318 | 677 | 1166 | 1689 | 2429 | 3246 | 4018 | 4737 | 5677 | 6672 |
OA (%) | 78.30 | 81.13 | 81.85 | 85.94 | 87.22 | 88.28 | 88.77 | 88.98 | 89.20 | 89.29 | |
MLR + SRC | NUM | 315 | 704 | 1258 | 1741 | 2439 | 3286 | 4097 | 4867 | 5775 | 6760 |
OA (%) | 78.31 | 81.81 | 82.55 | 85.09 | 86.86 | 88.30 | 89.01 | 89.44 | 90.21 | 90.71 | |
MLR + NRS | NUM | 318 | 683 | 1154 | 1701 | 2458 | 3301 | 4219 | 5057 | 5997 | 6889 |
OA (%) | 78.46 | 81.49 | 82.11 | 86.14 | 87.86 | 89.70 | 90.68 | 91.58 | 92.15 | 92.42 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 236 | 444 | 661 | 965 | 1210 | 1466 | 1812 | 2254 | 2762 | 3251 |
OA (%) | 71.19 | 73.36 | 74.70 | 75.69 | 76.10 | 76.16 | 77.28 | 78.48 | 77.60 | 77.69 | |
SRC | NUM | 225 | 477 | 733 | 1080 | 1492 | 2047 | 2793 | 3654 | 4605 | 5631 |
OA (%) | 72.21 | 72.19 | 76.97 | 79.12 | 80.08 | 81.77 | 83.09 | 84.07 | 84.95 | 85.27 | |
NRS | NUM | 225 | 438 | 763 | 1007 | 1256 | 1487 | 1726 | 1893 | 1991 | 2179 |
OA (%) | 71.22 | 73.74 | 75.14 | 76.59 | 76.15 | 76.47 | 76.41 | 76.20 | 75.68 | 75.40 | |
MLR | NUM | 100 | 244 | 396 | 605 | 829 | 1035 | 1304 | 1587 | 1872 | 2183 |
OA (%) | 72.36 | 76.37 | 77.18 | 79.04 | 79.40 | 81.16 | 82.21 | 82.18 | 83.36 | 85.85 | |
KNN + SRC | NUM | 248 | 473 | 787 | 1118 | 1509 | 1996 | 2552 | 3111 | 3975 | 4837 |
OA (%) | 71.48 | 72.82 | 74.61 | 76.59 | 78.38 | 80.08 | 80.82 | 81.62 | 82.40 | 83.99 | |
KNN + NRS | NUM | 240 | 491 | 775 | 1067 | 1382 | 1850 | 2403 | 3040 | 3794 | 4415 |
OA (%) | 71.11 | 73.83 | 77.44 | 78.83 | 79.50 | 79.58 | 79.39 | 80.11 | 80.74 | 81.48 | |
KNN + MLR | NUM | 244 | 515 | 822 | 1176 | 1616 | 2162 | 2905 | 3698 | 4745 | 5682 |
OA (%) | 71.37 | 75.03 | 77.84 | 78.08 | 79.73 | 79.59 | 81.30 | 83.88 | 84.90 | 85.08 | |
SRC + KNN | NUM | 248 | 476 | 795 | 1215 | 1725 | 2298 | 3055 | 3967 | 4977 | 5913 |
OA (%) | 71.38 | 72.54 | 75.26 | 78.09 | 79.19 | 80.21 | 81.88 | 83.31 | 83.35 | 83.51 | |
SRC + NRS | NUM | 242 | 511 | 867 | 1385 | 1889 | 2513 | 3201 | 3988 | 4910 | 5794 |
OA (%) | 71.40 | 74.12 | 77.56 | 80.80 | 82.19 | 82.55 | 83.02 | 83.80 | 84.34 | 85.09 | |
SRC + MLR | NUM | 236 | 507 | 841 | 1289 | 1731 | 2261 | 3016 | 3928 | 4939 | 6015 |
OA (%) | 71.52 | 73.62 | 76.47 | 78.45 | 79.51 | 81.54 | 83.10 | 84.37 | 84.68 | 84.66 | |
NRS + KNN | NUM | 240 | 486 | 803 | 1119 | 1541 | 1992 | 2557 | 3190 | 3939 | 4611 |
OA (%) | 71.05 | 74.37 | 76.57 | 77.77 | 78.06 | 77.21 | 76.75 | 77.35 | 78.88 | 79.28 | |
NRS + SRC | NUM | 242 | 501 | 803 | 1296 | 1792 | 2433 | 3089 | 3839 | 4776 | 5798 |
OA (%) | 71.44 | 74.50 | 76.72 | 79.93 | 81.44 | 81.90 | 82.85 | 83.81 | 85.00 | 85.48 | |
NRS + MLR | NUM | 234 | 517 | 828 | 1237 | 1796 | 2446 | 3354 | 4220 | 5061 | 5857 |
OA (%) | 71.49 | 75.47 | 77.89 | 80.86 | 83.68 | 84.43 | 84.93 | 85.12 | 85.57 | 85.46 | |
MLR + KNN | NUM | 244 | 486 | 746 | 1170 | 1658 | 2300 | 3205 | 4208 | 5336 | 6514 |
OA (%) | 71.47 | 75.30 | 76.67 | 78.76 | 80.35 | 82.97 | 85.05 | 86.05 | 86.88 | 87.02 | |
MLR + SRC | NUM | 236 | 504 | 903 | 1310 | 1708 | 2207 | 3009 | 4024 | 5098 | 6234 |
OA (%) | 71.71 | 74.01 | 79.30 | 79.81 | 80.40 | 83.03 | 86.27 | 86.93 | 87.97 | 88.53 | |
MLR + NRS | NUM | 234 | 524 | 787 | 1205 | 1738 | 2374 | 3116 | 3953 | 4754 | 5586 |
OA (%) | 71.63 | 75.94 | 79.83 | 80.66 | 82.75 | 84.64 | 85.98 | 86.19 | 86.37 | 86.87 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 133 | 251 | 443 | 684 | 963 | 1304 | 1672 | 2047 | 2425 | 2857 |
OA (%) | 83.38 | 86.46 | 86.46 | 86.65 | 87.26 | 87.31 | 87.79 | 87.92 | 87.66 | 87.15 | |
SRC | NUM | 144 | 275 | 441 | 666 | 937 | 1255 | 1606 | 1968 | 2391 | 2811 |
OA (%) | 83.10 | 83.56 | 85.11 | 85.83 | 85.81 | 86.63 | 86.50 | 86.64 | 86.95 | 86.96 | |
NRS | NUM | 148 | 271 | 423 | 668 | 952 | 1263 | 1569 | 1940 | 2351 | 2799 |
OA (%) | 84.04 | 86.06 | 87.62 | 87.63 | 87.10 | 87.48 | 88.44 | 88.43 | 88.32 | 87.92 | |
MLR | NUM | 102 | 177 | 302 | 451 | 621 | 867 | 1176 | 1518 | 1848 | 2217 |
OA (%) | 82.88 | 85.41 | 87.11 | 88.36 | 88.73 | 90.68 | 91.52 | 92.20 | 91.70 | 92.02 | |
KNN + SRC | NUM | 146 | 294 | 479 | 694 | 976 | 1330 | 1691 | 2106 | 2520 | 2985 |
OA (%) | 83.60 | 84.65 | 85.38 | 86.76 | 87.39 | 87.39 | 88.00 | 88.11 | 87.53 | 87.23 | |
KNN + NRS | NUM | 150 | 297 | 500 | 761 | 1108 | 1466 | 1891 | 2354 | 2834 | 3316 |
OA (%) | 83.53 | 86.43 | 87.14 | 87.49 | 87.64 | 87.37 | 88.06 | 88.03 | 87.95 | 88.03 | |
KNN + MLR | NUM | 143 | 285 | 508 | 768 | 1132 | 1546 | 2026 | 2526 | 3049 | 3590 |
OA (%) | 83.38 | 85.50 | 86.34 | 88.35 | 88.80 | 90.07 | 90.24 | 90.08 | 89.86 | 89.79 | |
SRC + KNN | NUM | 146 | 287 | 472 | 686 | 957 | 1281 | 1673 | 2061 | 2485 | 2892 |
OA (%) | 83.25 | 84.70 | 85.79 | 85.88 | 86.70 | 86.92 | 86.89 | 86.93 | 86.85 | 86.90 | |
SRC + NRS | NUM | 150 | 280 | 493 | 755 | 1017 | 1328 | 1708 | 2114 | 2539 | 2982 |
OA (%) | 83.07 | 85.55 | 86.28 | 85.47 | 86.66 | 86.97 | 87.52 | 87.05 | 87.19 | 87.16 | |
SRC + MLR | NUM | 150 | 271 | 483 | 720 | 1108 | 1499 | 1958 | 2451 | 2969 | 3487 |
OA (%) | 83.15 | 84.27 | 87.25 | 87.88 | 88.88 | 89.29 | 89.36 | 89.85 | 89.70 | 89.85 | |
NRS + KNN | NUM | 150 | 298 | 519 | 814 | 1148 | 1556 | 2037 | 2538 | 3027 | 3522 |
OA (%) | 84.01 | 87.12 | 87.79 | 87.30 | 87.54 | 88.38 | 88.09 | 88.25 | 87.95 | 87.88 | |
NRS + SRC | NUM | 150 | 284 | 488 | 803 | 1158 | 1539 | 1988 | 2482 | 2955 | 3423 |
OA (%) | 83.91 | 87.23 | 86.98 | 87.73 | 88.30 | 88.97 | 89.57 | 89.57 | 89.60 | 89.43 | |
NRS + MLR | NUM | 153 | 293 | 509 | 762 | 1104 | 1509 | 1940 | 2441 | 2934 | 3452 |
OA (%) | 83.87 | 85.82 | 88.25 | 89.93 | 90.40 | 90.51 | 90.75 | 90.48 | 90.12 | 89.61 | |
MLR + KNN | NUM | 143 | 299 | 521 | 825 | 1187 | 1602 | 2046 | 2514 | 2993 | 3407 |
OA (%) | 82.80 | 85.42 | 87.67 | 88.79 | 89.68 | 90.06 | 90.54 | 90.92 | 91.27 | 91.32 | |
MLR + SRC | NUM | 150 | 292 | 521 | 799 | 1123 | 1537 | 2007 | 2448 | 2929 | 3367 |
OA (%) | 82.91 | 85.45 | 88.89 | 90.12 | 90.21 | 90.93 | 90.99 | 91.83 | 92.08 | 92.64 | |
MLR + NRS | NUM | 153 | 315 | 564 | 841 | 1197 | 1605 | 2064 | 2561 | 3060 | 3500 |
OA (%) | 82.81 | 84.87 | 88.16 | 89.34 | 89.69 | 90.03 | 90.42 | 90.52 | 91.19 | 91.43 |
As can be seen in Table 4 (Indian Pines), the classification accuracy increases gradually with the number of iterations. Among the single classifiers, SRC labels the largest number of samples after 10 iterations, but its classification effect is not the best; the single classifier with the best classification effect is NRS. The numbers of labeled samples and the classification accuracies of the combined classifiers are mostly better than those of the single classifiers, and the results differ for combinations with different priorities. Combinations containing NRS achieve more than 90% accuracy after 10 iterations, combinations containing MLR label more than 6600 samples after 10 iterations, and the best combination after 10 iterations is MLR + NRS.
As can be seen in Table 5 (Pavia University), compared with the other single classifiers, SRC labels the largest number of samples after 10 iterations, but MLR obtains the best classification results. After 10 iterations, the two-classifier combinations label more samples than the single classifiers. When KNN, SRC and NRS are used as the first-priority classifier, the classification results after 10 iterations are not as good as those of MLR, whereas combinations with MLR as the first-priority classifier achieve better classification than single MLR after 10 iterations.
For the Salinas Scene data (Table 6), the number of samples labeled by MLR after 10 iterations is the smallest, but the classification accuracy of its labeled samples is the highest. After 10 iterations, the numbers of samples labeled by the two-classifier combinations are also higher than those of the single classifiers. From the perspective of the classification performance of the labeled samples, MLR + SRC achieves higher classification accuracy than MLR alone, which indicates that adding a second classifier improves the classification accuracy.
From the three experiments, it can be seen that the method that labels the largest number of samples does not necessarily achieve the best classification results. Since the labeled samples are meant to improve the classification accuracy of the classifier, the classification results obtained with the labeled samples after 10 iterations are taken as the evaluation criterion. The Indian Pines data set therefore uses the classifier combination MLR + NRS, and the Pavia University and Salinas Scene data sets use MLR + SRC.
Based on the analysis, the settings of the related parameters are shown in Table 7.
Data set | Indian Pines | Pavia University | Salinas Scene |
Selection policy | BT | BT | BT |
Number of selections | 1600 | 1400 | 600 |
Window size | 7 × 7 | 25 × 25 | 20 × 20 |
Combination of classifiers | MLR + NRS | MLR + SRC | MLR + SRC |
Number of labeled samples | 6889 | 6234 | 3367 |
First, LBP is used to extract the spatial texture features of the hyperspectral remote sensing images. Second, the sample labeling method based on neighborhood information and priority classifier discrimination is used to obtain the learned pseudo-labeled samples. The SRC classifier is trained with the expanded labeled samples, and the test samples are predicted. The obtained classification results are compared with those of the SRC classifier trained only on the initial samples; the classification results of the models trained with the different training sets are shown in Table 8.
Data | Index | Initial samples | Labeled samples
Indian Pines | AA | 67.93% | 84.70% |
OA | 77.38% | 92.42% | |
KAPPA | 0.746 | 0.914 | |
Pavia University | AA | 60.53% | 81.87% |
OA | 69.00% | 88.53% | |
KAPPA | 0.609 | 0.848 | |
Salinas Scene | AA | 82.59% | 87.76% |
OA | 84.00% | 92.64% | |
KAPPA | 0.823 | 0.918 |
It can be seen from Table 8 that, for the Indian Pines data set, AA, OA and Kappa are 84.70%, 92.42% and 0.914, respectively. For the Pavia University data set, AA, OA and Kappa are 81.87%, 88.53% and 0.848, respectively. For the Salinas Scene data set, AA, OA and Kappa are 87.76%, 92.64% and 0.918, respectively. Therefore, the proposed classification method obtains higher classification accuracy.
The classification visualizations of the proposed classification method for the initial samples and labeled samples are shown in Figures 7 and 8.
Comparing the experimental results, it can be found that the classification results of the classifier trained with the expanded samples are better than those of the classifier trained with the initial samples on all three data sets. Moreover, the classification visualizations show that the maps obtained by the classifier with the labeled samples are smoother and have fewer discrete points, which indicates that labeling the samples improves the generalization ability of the classifier.
To address the processing and analysis difficulties of hyperspectral images, a new sample labeling method based on neighborhood information and priority classifier discrimination is developed, and a new classification method of hyperspectral remote sensing images based on texture features and semi-supervised learning is implemented by introducing LBP, sparse representation and multiple logistic regression. LBP is employed to extract the texture features of the hyperspectral remote sensing images. The multiple logistic regression model is used to select the unlabeled samples with the largest amount of information, and these samples are labeled with neighborhood information and priority classifier discrimination to obtain learned pseudo-labeled samples, which solves the problem of limited labeled samples of hyperspectral images. The Indian Pines, Salinas Scene and Pavia University data sets are selected here. The experimental results show that the block window of the Indian Pines data set is 7 × 7, and the block windows of Pavia University and Salinas Scene are 25 × 25 and 20 × 20, respectively; the classifier combinations with MLR obtain better classification results. The obtained classification maps are smoother and have fewer discrete points, which indicates that the generalization ability of the classifier is improved by labeling the samples. For the Indian Pines data set, AA, OA and Kappa are 84.70%, 92.42% and 0.914, respectively. For the Pavia University data set, AA, OA and Kappa are 81.87%, 88.53% and 0.848, respectively. For the Salinas Scene data set, AA, OA and Kappa are 87.76%, 92.64% and 0.918, respectively. Therefore, the proposed classification method obtains higher classification accuracy compared with the other methods.
However, the proposed classification method requires more computing time, so future work should focus on reducing its time complexity.
This research was funded by the Sichuan Science and Technology Program, grant numbers 2021YFS0407, 2022YFS0593 and 2023YFG0028; the Sichuan Provincial Transfer Payment Program, China, under Grant R21ZYZF0006; the A Ba Achievements Transformation Program under Grants R21CGZH0001, R22CGZH0006 and R22CGZH0007; the Chengdu Science and Technology Planning Project, grant number 2021-YF05-00933-SN; and the Research Foundation for Civil Aviation University of China, grant number 2020KYQD123.
The authors declare no conflict of interest.
Data | Selection method | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | IE | 78.39 | 79.10 | 79.99 | 80.71 | 82.11 | 83.47 | 84.10 | 85.51 | 86.48 | 87.23 |
ME | 80.33 | 86.83 | 91.08 | 92.92 | 95.28 | 97.02 | 98.00 | 98.45 | 98.90 | 99.13 | |
BT | 91.47 | 95.47 | 98.22 | 98.36 | 98.64 | 98.59 | 98.66 | 98.71 | 99.34 | 99.29 | |
LC | 84.74 | 88.95 | 91.63 | 94.33 | 95.27 | 96.46 | 98.15 | 98.55 | 98.62 | 98.66 | |
Pavia University | IE | 69.87 | 71.31 | 71.75 | 71.74 | 72.04 | 72.40 | 73.25 | 73.36 | 73.76 | 74.45 |
ME | 73.01 | 75.23 | 78.74 | 82.78 | 89.14 | 92.82 | 95.14 | 96.10 | 97.13 | 97.82 | |
BT | 87.54 | 92.63 | 94.51 | 95.24 | 95.84 | 96.10 | 96.39 | 96.50 | 96.69 | 96.71 | |
LC | 74.91 | 76.33 | 80.76 | 84.45 | 87.23 | 89.89 | 90.27 | 90.67 | 90.56 | 91.47 | |
Salinas Scene | IE | 84.16 | 84.60 | 84.65 | 85.02 | 85.09 | 85.25 | 85.50 | 85.65 | 85.91 | 85.98 |
ME | 85.14 | 89.29 | 92.81 | 94.31 | 96.48 | 97.41 | 98.04 | 98.20 | 98.66 | 98.88 | |
BT | 95.30 | 96.98 | 98.26 | 98.71 | 98.95 | 98.90 | 99.03 | 99.21 | 99.25 | 99.24 | |
LC | 88.60 | 91.19 | 92.84 | 93.58 | 93.80 | 95.74 | 97.56 | 97.72 | 98.56 | 98.86 |
Data | Quantity | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 200 | 77.30 | 77.57 | 78.32 | 77.97 | 78.59 | 78.97 | 78.96 | 79.28 | 79.51 | 79.23 |
400 | 77.54 | 78.60 | 79.80 | 80.80 | 82.13 | 82.20 | 83.03 | 83.77 | 83.83 | 83.85 | |
600 | 77.52 | 79.54 | 79.37 | 79.27 | 80.08 | 81.99 | 83.89 | 83.60 | 84.30 | 84.48 | |
800 | 77.75 | 79.84 | 80.87 | 80.22 | 82.11 | 83.91 | 84.41 | 84.57 | 85.12 | 85.94 | |
1000 | 77.85 | 80.49 | 80.28 | 82.74 | 81.35 | 82.60 | 84.18 | 85.88 | 86.77 | 87.84 | |
1200 | 77.85 | 79.95 | 79.79 | 80.38 | 81.59 | 84.41 | 84.72 | 85.74 | 87.48 | 88.85 | |
1400 | 78.18 | 80.20 | 80.09 | 83.96 | 85.00 | 85.78 | 87.93 | 89.90 | 91.07 | 91.20 | |
1600 | 78.55 | 80.56 | 80.59 | 84.12 | 86.90 | 87.82 | 89.02 | 90.87 | 91.49 | 91.83 | |
1800 | 78.34 | 79.76 | 79.23 | 82.64 | 85.39 | 87.32 | 88.81 | 89.97 | 90.69 | 91.46 | |
2000 | 78.01 | 79.16 | 80.24 | 82.55 | 86.02 | 87.48 | 88.66 | 89.63 | 90.02 | 90.46 | |
Pavia University | 200 | 68.75 | 73.93 | 76.73 | 78.18 | 79.23 | 80.92 | 81.85 | 82.40 | 82.74 | 83.60 |
400 | 66.41 | 73.11 | 75.45 | 78.20 | 81.18 | 82.13 | 82.57 | 83.39 | 84.07 | 83.98 | |
600 | 68.88 | 76.35 | 78.22 | 80.73 | 82.59 | 83.29 | 83.68 | 84.57 | 84.95 | 84.98 | |
800 | 69.89 | 77.50 | 80.31 | 81.91 | 83.22 | 84.85 | 84.99 | 84.79 | 85.16 | 85.31 | |
1000 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
1200 | 70.24 | 75.18 | 80.32 | 83.13 | 84.14 | 85.04 | 85.30 | 85.55 | 85.04 | 84.90 | |
1400 | 70.34 | 76.23 | 80.57 | 82.46 | 83.92 | 84.71 | 85.59 | 85.77 | 85.87 | 86.47 | |
1600 | 70.40 | 75.87 | 80.64 | 83.06 | 83.93 | 84.69 | 85.28 | 86.02 | 86.19 | 85.83 | |
1800 | 69.77 | 76.12 | 80.18 | 82.99 | 85.19 | 85.04 | 84.89 | 85.26 | 85.68 | 85.68 | |
2000 | 69.71 | 75.90 | 82.29 | 83.40 | 84.41 | 85.23 | 85.59 | 85.77 | 85.86 | 85.87 | |
Salinas Scene | 200 | 85.09 | 87.26 | 89.04 | 89.35 | 89.24 | 89.22 | 88.85 | 88.47 | 88.14 | 88.03 |
400 | 84.94 | 88.34 | 89.88 | 89.26 | 89.12 | 88.88 | 88.26 | 87.68 | 87.40 | 87.25 | |
600 | 85.69 | 90.85 | 90.80 | 90.42 | 89.71 | 89.04 | 88.63 | 87.84 | 87.12 | 86.60 | |
800 | 85.36 | 89.17 | 88.87 | 87.96 | 87.46 | 86.85 | 86.42 | 85.69 | 85.13 | 84.58 | |
1000 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
1200 | 85.04 | 89.66 | 90.08 | 88.98 | 87.67 | 87.14 | 86.13 | 85.30 | 85.09 | 84.85 | |
1400 | 85.07 | 88.46 | 89.39 | 88.63 | 88.10 | 88.06 | 87.34 | 86.81 | 86.69 | 86.58 | |
1600 | 85.47 | 89.54 | 89.88 | 88.87 | 87.68 | 86.21 | 85.00 | 84.34 | 83.89 | 83.70 | |
1800 | 85.50 | 90.16 | 90.15 | 89.45 | 88.58 | 87.71 | 86.79 | 86.52 | 86.31 | 85.93 | |
2000 | 85.48 | 89.40 | 89.38 | 88.60 | 87.73 | 85.98 | 85.67 | 85.12 | 85.39 | 85.34 |
Data | Block window size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Indian Pines | 3 | 77.62 | 77.86 | 78.35 | 78.65 | 78.91 | 79.14 | 79.09 | 79.12 | 79.56 | 79.56 |
4 | 77.73 | 78.24 | 78.65 | 78.75 | 78.70 | 78.66 | 79.14 | 79.18 | 79.70 | 79.74 | |
5 | 78.41 | 79.61 | 79.41 | 81.28 | 81.11 | 81.73 | 82.67 | 82.84 | 84.18 | 84.68 | |
6 | 77.85 | 80.14 | 80.41 | 80.91 | 80.90 | 83.54 | 84.91 | 85.09 | 86.42 | 88.30 | |
7 | 78.86 | 80.58 | 82.58 | 84.90 | 86.55 | 86.99 | 88.47 | 88.87 | 89.58 | 90.98 | |
8 | 78.33 | 78.95 | 81.00 | 84.24 | 85.44 | 86.37 | 87.60 | 88.13 | 88.57 | 89.19 | |
9 | 79.61 | 81.00 | 83.39 | 85.51 | 85.71 | 86.34 | 86.78 | 86.86 | 87.07 | 87.86 | |
10 | 78.61 | 79.32 | 83.45 | 85.45 | 85.59 | 85.69 | 86.12 | 87.00 | 87.17 | 87.32 | |
Pavia University | 5 | 68.39 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 | 68.38 |
10 | 68.20 | 67.67 | 69.05 | 69.06 | 68.30 | 68.13 | 68.06 | 67.98 | 67.89 | 67.62 | |
15 | 70.91 | 70.59 | 70.96 | 70.50 | 71.89 | 73.54 | 73.80 | 74.58 | 75.25 | 75.32 | |
20 | 71.47 | 71.63 | 73.60 | 76.12 | 75.99 | 75.61 | 77.18 | 77.51 | 77.81 | 78.14 | |
25 | 71.25 | 74.57 | 78.52 | 79.15 | 81.95 | 82.32 | 83.66 | 84.88 | 85.45 | 85.51 | |
30 | 70.28 | 76.35 | 79.92 | 82.68 | 83.83 | 84.48 | 84.84 | 85.21 | 85.37 | 85.04 | |
35 | 71.48 | 77.43 | 80.15 | 81.75 | 83.18 | 83.62 | 84.11 | 83.65 | 83.23 | 83.17 | |
40 | 73.40 | 77.50 | 80.36 | 81.92 | 83.15 | 82.68 | 82.23 | 82.38 | 82.18 | 82.04 | |
Salinas Scene | 5 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 | 83.94 |
10 | 83.61 | 84.16 | 84.95 | 85.03 | 84.77 | 84.68 | 84.59 | 84.82 | 84.61 | 85.48 | |
15 | 83.86 | 83.17 | 85.51 | 87.04 | 87.70 | 88.44 | 89.77 | 89.60 | 91.01 | 91.09 | |
20 | 84.16 | 86.84 | 89.41 | 90.56 | 90.22 | 90.69 | 91.02 | 91.13 | 90.87 | 90.68 | |
25 | 84.11 | 88.22 | 89.46 | 89.68 | 89.35 | 88.77 | 87.86 | 87.49 | 86.71 | 86.78 | |
30 | 85.35 | 88.93 | 89.88 | 89.04 | 88.34 | 87.58 | 87.03 | 85.49 | 85.08 | 85.08 | |
35 | 86.17 | 88.80 | 88.67 | 87.67 | 86.63 | 85.84 | 85.26 | 84.87 | 83.95 | 83.06 | |
40 | 86.86 | 89.25 | 89.02 | 87.78 | 86.37 | 84.74 | 83.60 | 82.83 | 81.69 | 80.92 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 314 | 673 | 1139 | 1630 | 2178 | 2680 | 3324 | 4063 | 4820 | 5474 |
OA (%) | 78.69 | 79.93 | 81.20 | 82.05 | 82.49 | 84.38 | 85.70 | 86.52 | 87.33 | 87.77 | |
SRC | NUM | 311 | 648 | 1068 | 1525 | 2032 | 2606 | 3248 | 3899 | 4668 | 5522 |
OA (%) | 78.64 | 80.42 | 80.10 | 81.56 | 82.87 | 84.39 | 85.83 | 87.04 | 87.92 | 88.19 | |
NRS | NUM | 315 | 673 | 1116 | 1597 | 2136 | 2697 | 3265 | 3815 | 4437 | 5016 |
OA (%) | 78.82 | 81.02 | 80.96 | 83.71 | 84.94 | 85.57 | 86.87 | 88.01 | 89.25 | 89.42 | |
MLR | NUM | 133 | 295 | 580 | 936 | 1296 | 1790 | 2396 | 3077 | 3858 | 4648 |
OA (%) | 77.55 | 79.31 | 82.21 | 83.91 | 83.99 | 85.75 | 87.36 | 87.68 | 88.27 | 88.41 | |
KNN + SRC | NUM | 317 | 706 | 1120 | 1702 | 2294 | 2967 | 3728 | 4494 | 5233 | 5950 |
OA (%) | 78.71 | 80.96 | 81.99 | 83.97 | 84.96 | 85.82 | 86.99 | 87.75 | 87.93 | 88.10 | |
KNN + NRS | NUM | 317 | 707 | 1198 | 1691 | 2308 | 2954 | 3625 | 4450 | 5374 | 6305 |
OA (%) | 78.78 | 80.27 | 80.73 | 82.51 | 83.82 | 85.45 | 87.56 | 89.06 | 89.95 | 90.19 | |
KNN + MLR | NUM | 318 | 712 | 1206 | 1794 | 2555 | 3398 | 4292 | 5072 | 5783 | 6678 |
OA (%) | 78.79 | 80.56 | 82.57 | 86.08 | 87.10 | 87.87 | 88.50 | 88.88 | 89.18 | 89.31 | |
SRC + KNN | NUM | 317 | 706 | 1134 | 1673 | 2223 | 2875 | 3658 | 4317 | 5011 | 5748 |
OA (%) | 78.71 | 81.09 | 81.51 | 83.17 | 84.27 | 85.84 | 86.94 | 87.54 | 88.26 | 88.64 | |
SRC + NRS | NUM | 318 | 730 | 1205 | 1778 | 2372 | 3091 | 3969 | 4813 | 5787 | 6641 |
OA (%) | 78.81 | 80.79 | 82.37 | 85.03 | 86.81 | 88.45 | 88.93 | 89.86 | 90.49 | 90.78 | |
SRC + MLR | NUM | 315 | 708 | 1153 | 1744 | 2457 | 3322 | 4231 | 5042 | 5800 | 6735 |
OA (%) | 78.80 | 80.92 | 82.16 | 85.16 | 86.24 | 87.61 | 88.39 | 88.71 | 89.07 | 89.19 | |
NRS + KNN | NUM | 317 | 707 | 1202 | 1712 | 2385 | 3021 | 3700 | 4538 | 5399 | 6333 |
OA (%) | 78.74 | 80.67 | 81.33 | 83.36 | 84.46 | 86.35 | 87.93 | 89.23 | 90.03 | 90.43 | |
NRS + SRC | NUM | 318 | 734 | 1207 | 1728 | 2378 | 3061 | 3959 | 4799 | 5691 | 6739 |
OA (%) | 78.77 | 81.01 | 82.56 | 84.52 | 86.37 | 87.43 | 88.95 | 90.11 | 90.61 | 90.90 | |
NRS + MLR | NUM | 318 | 694 | 1148 | 1690 | 2446 | 3194 | 4102 | 4950 | 5768 | 6792 |
OA (%) | 78.91 | 81.39 | 81.62 | 85.51 | 87.21 | 89.63 | 90.28 | 90.92 | 91.28 | 91.88 | |
MLR + KNN | NUM | 318 | 677 | 1166 | 1689 | 2429 | 3246 | 4018 | 4737 | 5677 | 6672 |
OA (%) | 78.30 | 81.13 | 81.85 | 85.94 | 87.22 | 88.28 | 88.77 | 88.98 | 89.20 | 89.29 | |
MLR + SRC | NUM | 315 | 704 | 1258 | 1741 | 2439 | 3286 | 4097 | 4867 | 5775 | 6760 |
OA (%) | 78.31 | 81.81 | 82.55 | 85.09 | 86.86 | 88.30 | 89.01 | 89.44 | 90.21 | 90.71 | |
MLR + NRS | NUM | 318 | 683 | 1154 | 1701 | 2458 | 3301 | 4219 | 5057 | 5997 | 6889 |
OA (%) | 78.46 | 81.49 | 82.11 | 86.14 | 87.86 | 89.70 | 90.68 | 91.58 | 92.15 | 92.42 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 236 | 444 | 661 | 965 | 1210 | 1466 | 1812 | 2254 | 2762 | 3251 |
OA (%) | 71.19 | 73.36 | 74.70 | 75.69 | 76.10 | 76.16 | 77.28 | 78.48 | 77.60 | 77.69 | |
SRC | NUM | 225 | 477 | 733 | 1080 | 1492 | 2047 | 2793 | 3654 | 4605 | 5631 |
OA (%) | 72.21 | 72.19 | 76.97 | 79.12 | 80.08 | 81.77 | 83.09 | 84.07 | 84.95 | 85.27 | |
NRS | NUM | 225 | 438 | 763 | 1007 | 1256 | 1487 | 1726 | 1893 | 1991 | 2179 |
OA (%) | 71.22 | 73.74 | 75.14 | 76.59 | 76.15 | 76.47 | 76.41 | 76.20 | 75.68 | 75.40 | |
MLR | NUM | 100 | 244 | 396 | 605 | 829 | 1035 | 1304 | 1587 | 1872 | 2183 |
OA (%) | 72.36 | 76.37 | 77.18 | 79.04 | 79.40 | 81.16 | 82.21 | 82.18 | 83.36 | 85.85 | |
KNN + SRC | NUM | 248 | 473 | 787 | 1118 | 1509 | 1996 | 2552 | 3111 | 3975 | 4837 |
OA (%) | 71.48 | 72.82 | 74.61 | 76.59 | 78.38 | 80.08 | 80.82 | 81.62 | 82.40 | 83.99 | |
KNN + NRS | NUM | 240 | 491 | 775 | 1067 | 1382 | 1850 | 2403 | 3040 | 3794 | 4415 |
OA (%) | 71.11 | 73.83 | 77.44 | 78.83 | 79.50 | 79.58 | 79.39 | 80.11 | 80.74 | 81.48 | |
KNN + MLR | NUM | 244 | 515 | 822 | 1176 | 1616 | 2162 | 2905 | 3698 | 4745 | 5682 |
OA (%) | 71.37 | 75.03 | 77.84 | 78.08 | 79.73 | 79.59 | 81.30 | 83.88 | 84.90 | 85.08 | |
SRC + KNN | NUM | 248 | 476 | 795 | 1215 | 1725 | 2298 | 3055 | 3967 | 4977 | 5913 |
OA (%) | 71.38 | 72.54 | 75.26 | 78.09 | 79.19 | 80.21 | 81.88 | 83.31 | 83.35 | 83.51 | |
SRC + NRS | NUM | 242 | 511 | 867 | 1385 | 1889 | 2513 | 3201 | 3988 | 4910 | 5794 |
OA (%) | 71.40 | 74.12 | 77.56 | 80.80 | 82.19 | 82.55 | 83.02 | 83.80 | 84.34 | 85.09 | |
SRC + MLR | NUM | 236 | 507 | 841 | 1289 | 1731 | 2261 | 3016 | 3928 | 4939 | 6015 |
OA (%) | 71.52 | 73.62 | 76.47 | 78.45 | 79.51 | 81.54 | 83.10 | 84.37 | 84.68 | 84.66 | |
NRS + KNN | NUM | 240 | 486 | 803 | 1119 | 1541 | 1992 | 2557 | 3190 | 3939 | 4611 |
OA (%) | 71.05 | 74.37 | 76.57 | 77.77 | 78.06 | 77.21 | 76.75 | 77.35 | 78.88 | 79.28 | |
NRS + SRC | NUM | 242 | 501 | 803 | 1296 | 1792 | 2433 | 3089 | 3839 | 4776 | 5798 |
OA (%) | 71.44 | 74.50 | 76.72 | 79.93 | 81.44 | 81.90 | 82.85 | 83.81 | 85.00 | 85.48 | |
NRS + MLR | NUM | 234 | 517 | 828 | 1237 | 1796 | 2446 | 3354 | 4220 | 5061 | 5857 |
OA (%) | 71.49 | 75.47 | 77.89 | 80.86 | 83.68 | 84.43 | 84.93 | 85.12 | 85.57 | 85.46 | |
MLR + KNN | NUM | 244 | 486 | 746 | 1170 | 1658 | 2300 | 3205 | 4208 | 5336 | 6514 |
OA (%) | 71.47 | 75.30 | 76.67 | 78.76 | 80.35 | 82.97 | 85.05 | 86.05 | 86.88 | 87.02 | |
MLR + SRC | NUM | 236 | 504 | 903 | 1310 | 1708 | 2207 | 3009 | 4024 | 5098 | 6234 |
OA (%) | 71.71 | 74.01 | 79.30 | 79.81 | 80.40 | 83.03 | 86.27 | 86.93 | 87.97 | 88.53 | |
MLR + NRS | NUM | 234 | 524 | 787 | 1205 | 1738 | 2374 | 3116 | 3953 | 4754 | 5586 |
OA (%) | 71.63 | 75.94 | 79.83 | 80.66 | 82.75 | 84.64 | 85.98 | 86.19 | 86.37 | 86.87 |
Classifier | Index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
KNN | NUM | 133 | 251 | 443 | 684 | 963 | 1304 | 1672 | 2047 | 2425 | 2857 |
OA (%) | 83.38 | 86.46 | 86.46 | 86.65 | 87.26 | 87.31 | 87.79 | 87.92 | 87.66 | 87.15 | |
SRC | NUM | 144 | 275 | 441 | 666 | 937 | 1255 | 1606 | 1968 | 2391 | 2811 |
OA (%) | 83.10 | 83.56 | 85.11 | 85.83 | 85.81 | 86.63 | 86.50 | 86.64 | 86.95 | 86.96 | |
NRS | NUM | 148 | 271 | 423 | 668 | 952 | 1263 | 1569 | 1940 | 2351 | 2799 |
OA (%) | 84.04 | 86.06 | 87.62 | 87.63 | 87.10 | 87.48 | 88.44 | 88.43 | 88.32 | 87.92 | |
MLR | NUM | 102 | 177 | 302 | 451 | 621 | 867 | 1176 | 1518 | 1848 | 2217 |
OA (%) | 82.88 | 85.41 | 87.11 | 88.36 | 88.73 | 90.68 | 91.52 | 92.20 | 91.70 | 92.02 | |
KNN + SRC | NUM | 146 | 294 | 479 | 694 | 976 | 1330 | 1691 | 2106 | 2520 | 2985 |
OA (%) | 83.60 | 84.65 | 85.38 | 86.76 | 87.39 | 87.39 | 88.00 | 88.11 | 87.53 | 87.23 | |
KNN + NRS | NUM | 150 | 297 | 500 | 761 | 1108 | 1466 | 1891 | 2354 | 2834 | 3316 |
OA (%) | 83.53 | 86.43 | 87.14 | 87.49 | 87.64 | 87.37 | 88.06 | 88.03 | 87.95 | 88.03 | |
KNN + MLR | NUM | 143 | 285 | 508 | 768 | 1132 | 1546 | 2026 | 2526 | 3049 | 3590 |
OA (%) | 83.38 | 85.50 | 86.34 | 88.35 | 88.80 | 90.07 | 90.24 | 90.08 | 89.86 | 89.79 | |
SRC + KNN | NUM | 146 | 287 | 472 | 686 | 957 | 1281 | 1673 | 2061 | 2485 | 2892 |
OA (%) | 83.25 | 84.70 | 85.79 | 85.88 | 86.70 | 86.92 | 86.89 | 86.93 | 86.85 | 86.90 | |
SRC + NRS | NUM | 150 | 280 | 493 | 755 | 1017 | 1328 | 1708 | 2114 | 2539 | 2982 |
OA (%) | 83.07 | 85.55 | 86.28 | 85.47 | 86.66 | 86.97 | 87.52 | 87.05 | 87.19 | 87.16 | |
SRC + MLR | NUM | 150 | 271 | 483 | 720 | 1108 | 1499 | 1958 | 2451 | 2969 | 3487 |
OA (%) | 83.15 | 84.27 | 87.25 | 87.88 | 88.88 | 89.29 | 89.36 | 89.85 | 89.70 | 89.85 | |
NRS + KNN | NUM | 150 | 298 | 519 | 814 | 1148 | 1556 | 2037 | 2538 | 3027 | 3522 |
OA (%) | 84.01 | 87.12 | 87.79 | 87.30 | 87.54 | 88.38 | 88.09 | 88.25 | 87.95 | 87.88 | |
NRS + SRC | NUM | 150 | 284 | 488 | 803 | 1158 | 1539 | 1988 | 2482 | 2955 | 3423 |
OA (%) | 83.91 | 87.23 | 86.98 | 87.73 | 88.30 | 88.97 | 89.57 | 89.57 | 89.60 | 89.43 | |
NRS + MLR | NUM | 153 | 293 | 509 | 762 | 1104 | 1509 | 1940 | 2441 | 2934 | 3452 |
OA (%) | 83.87 | 85.82 | 88.25 | 89.93 | 90.40 | 90.51 | 90.75 | 90.48 | 90.12 | 89.61 | |
MLR + KNN | NUM | 143 | 299 | 521 | 825 | 1187 | 1602 | 2046 | 2514 | 2993 | 3407 |
OA (%) | 82.80 | 85.42 | 87.67 | 88.79 | 89.68 | 90.06 | 90.54 | 90.92 | 91.27 | 91.32 | |
MLR + SRC | NUM | 150 | 292 | 521 | 799 | 1123 | 1537 | 2007 | 2448 | 2929 | 3367 |
OA (%) | 82.91 | 85.45 | 88.89 | 90.12 | 90.21 | 90.93 | 90.99 | 91.83 | 92.08 | 92.64 | |
MLR + NRS | NUM | 153 | 315 | 564 | 841 | 1197 | 1605 | 2064 | 2561 | 3060 | 3500 |
OA (%) | 82.81 | 84.87 | 88.16 | 89.34 | 89.69 | 90.03 | 90.42 | 90.52 | 91.19 | 91.43 |
Data set | Indian Pines | Pavia University | Salinas Scene |
Selection policy | BT | BT | BT |
Number of selections | 1600 | 1400 | 600 |
Window size | 7 × 7 | 25 × 25 | 20 × 20 |
Combination of classifiers | MLR + NRS | MLR + SRC | MLR + SRC |
Number of labeled samples | 6889 | 6234 | 3367 |
Data set | Index | Initial samples | Labeling samples
Indian Pines | AA | 67.93% | 84.70%
Indian Pines | OA | 77.38% | 92.42%
Indian Pines | KAPPA | 0.746 | 0.914
Pavia University | AA | 60.53% | 81.87%
Pavia University | OA | 69.00% | 88.53%
Pavia University | KAPPA | 0.609 | 0.848
Salinas Scene | AA | 82.59% | 87.76%
Salinas Scene | OA | 84.00% | 92.64%
Salinas Scene | KAPPA | 0.823 | 0.918
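For reference, the AA, OA and KAPPA values reported in the table above can be computed from predicted and true labels as in the following minimal sketch (using scikit-learn; the variable names are illustrative).

```python
# Minimal sketch: overall accuracy (OA), average accuracy (AA, mean per-class
# accuracy) and Cohen's kappa from true and predicted labels.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

def classification_scores(y_true, y_pred):
    oa = accuracy_score(y_true, y_pred)          # overall accuracy
    cm = confusion_matrix(y_true, y_pred)
    per_class = np.diag(cm) / cm.sum(axis=1)     # per-class (producer's) accuracy
    aa = per_class.mean()                        # average accuracy
    kappa = cohen_kappa_score(y_true, y_pred)
    return oa, aa, kappa
```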