
More than half of the world's population includes rice in their main meals, which makes it one of the most important foods to cultivate and store. To increase rice production, the crop must be protected from various types of diseases [1]. Disease is perhaps the single most damaging factor in rice cultivation, causing significant losses in production and in the agricultural economy [2]. Diseases also pose a significant threat to food security. Monitoring the healthy, sustainable growth of rice is therefore a central concern of present-day agriculture. Various studies have indicated that rice plant diseases can be detected at an early stage by analyzing the leaf, stem, grain, etc., and at this stage the crop can still be saved by applying the necessary fertilizer. The traditional method of identifying rice plant diseases relies on human perception, which varies from person to person and depends on the observer's skill. It also requires constant inspection by specialists, which may be prohibitively costly on large farms [3]. Rice leaf diseases threaten the sustainable production of rice and affect numerous farmers worldwide. Early diagnosis and appropriate treatment of rice leaf infections are crucial for healthy rice growth and for guaranteeing an adequate supply and food security for a rapidly expanding population. Hence, machine-driven disease analysis frameworks can relieve the limitations of traditional leaf disease detection procedures, which are tedious, error-prone, and costly. As a result, machine learning-based detection of rice leaf diseases has become extremely popular [4].
Deep learning has become the most popular technique in agriculture for predicting diseases, testing soil, and making various other predictions, such as weather forecasting and fertilizer analysis. These models have also become very important for predicting the two most harmful categories of rice plant disease, i.e., fungal and bacterial diseases, which are the main causes of crop loss [5]. The traditional approaches to detecting plant disease are visual inspection, consulting an expert, or taking samples to a research lab for analysis, but all of these are time-consuming and not useful for early detection [6]. In addition, diagnosis by the naked eye requires expertise and careful observation, and the time needed for analysis varies from person to person. These issues can be addressed by an image-based AI approach to the prediction and classification of plant diseases [7]. In particular, in some countries, farmers may need to travel significant distances to reach agricultural specialists, which is tedious, costly, delays the diagnosis, and cannot be scaled over a wide area [8,9,10].
In this study, we detected three types of rice plant disease: bacterial leaf blight, brown spot, and leaf smut. Bacterial leaf blight, also called blight of rice, is a dangerous bacterial disease and among the most destructive diseases of cultivated rice. Affected leaves show yellow-white or bright yellow stripes along the middle or the edges and may become deformed or destroyed.
Brown spot is a fungal disease that affects the color of the leaves. It is among the most damaging diseases and can affect all the leaves as well as the areas where the seed grows. Its symptoms are dark brown, irregular spots on both the upper and lower leaf surfaces, which appear yellow or light brown when the leaf is held up to a backlight.
Leaf smut can cover the entire leaf but is, to some degree, a minor disease of rice. It produces slightly raised, angular, dark spots on both sides of the leaves. Leaf spots may be caused by fungi, bacteria, and viruses, as well as by environmental factors such as unfavorable natural conditions, toxicity, and herbicide injury.
In this work, we used deep transfer learning for the detection of rice plant diseases. In machine learning, the reuse of a pre-trained model on a new problem is known as transfer learning: the model applies knowledge gained from an earlier task to improve predictions on a related one. For instance, knowledge acquired while training a classifier to recognize drinks can be reused when training a classifier to predict whether a picture contains wine. In other words, the knowledge of an already trained ML model is transferred to a different but closely related problem. There are various popular pre-trained models (for example, VGG16, Inception, and ResNet50) that are useful for overcoming a lack of training samples; they have already been trained on large image collections and can recognize a wide variety of features. These models commonly have complex designs that are vital when distinguishing among hundreds or thousands of classes [11]. However, the complexity that gives them predictive capacity over a wide variety of objects can be a hindrance for simpler tasks, as the pre-trained model can overfit the data.
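To make this concrete, the following is a minimal sketch of the transfer-learning setup just described, assuming a TensorFlow/Keras implementation; the 120*120 input size and the three disease classes follow the dataset described later, while the head layers are illustrative assumptions rather than the exact configuration used in the experiments.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Pre-trained VGG16 backbone, reused as a frozen feature extractor.
base = tf.keras.applications.VGG16(
    weights="imagenet",        # knowledge learned on ImageNet is reused here
    include_top=False,         # drop the original 1000-class classifier head
    input_shape=(120, 120, 3))
base.trainable = False         # freeze the pre-trained convolutional weights

# Only this small head is trained on the rice leaf images (layer sizes are illustrative).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),  # bacterial leaf blight, brown spot, leaf smut
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```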
Segmentation is one of the most important pre-processing techniques in plant disease classification. The hue, saturation, and intensity (HSI) model and the luminance, a, and b chrominance (Lab) model perform well for segmentation [12]. Some pre-trained deep learning models have shown outstanding performance in classifying rice plant diseases from image datasets [13], whereas hand-crafted CNN models may also perform well in some specific cases [14]. Transfer learning is another recent concept used by many researchers to enhance classification accuracy with a lightweight model, such as Es-MbNet [15].
The contributions of this study are as follows:
1) We designed a 17-layer lightweight model to detect plant disease with low computational cost.
2) Various data augmentation techniques were applied to the benchmark dataset to increase the training performance of the model.
3) The performance of the designed model was compared with other existing models on the basis of various parameters.
Chen et al. [13] used a deep learning approach for disease detection, since deep learning has shown outstanding performance in image processing and classification problems. They used a UCI Repository dataset. To combine the advantages of both, a DenseNet model pre-trained on ImageNet and Inception modules were selected for the network. This achieved an average prediction accuracy of no less than 94.07% on the public dataset; even when multiple diseases were considered, the average accuracy reached 98.63% for class prediction on rice disease images. Sony [16] used a convolutional neural network implemented in the R language to detect rice diseases from pictures of infected leaves. The disease images, gathered from the UCI Machine Learning Repository, covered three diseases: bacterial leaf blight, brown spot, and leaf smut. By enhancing the training images, they achieved better results, with an accuracy of 86.6%. Al-Amin et al. [7] constructed a deep CNN model to predict four common rice diseases, namely, bacterial leaf blight, brown spot, leaf smut, and blast. Using a dataset of more than 900 pictures of diseased and healthy leaves from the UCI Machine Learning Repository and following 10-fold cross-validation, they achieved a highest accuracy of 97.40%. Their CNN-based model thus provided 97.40% accurate results for different rice leaf diseases. Wadhawan et al. [17] explored different techniques to recognize crop diseases in rice plants using conventional image processing techniques and neural networks. They deployed an ML model that scanned a picture, determined whether or not the leaf was infected, and estimated the area of infection. The accuracy of their identifications was around 85-86%. Ahmed et al. [18] presented a rice leaf disease detection framework using machine learning approaches. Three of the most common rice plant diseases, namely, leaf smut, bacterial leaf blight, and brown spot, were identified in this work. Decision tree regression, after 10-fold cross-validation, achieved an accuracy of more than 97% when applied to a test dataset collected from the UCI Machine Learning Repository [19]. Phadikar et al. [20] developed an automated system to classify two different rice diseases, leaf brown spot and leaf blast, based on the morphological changes the diseases cause in the plants. The radial distribution of hue from the center to the boundary of the spot images was used as a feature to classify the diseases with Bayes and SVM classifiers. Their identification scheme achieved an accuracy of 79.5%.
Azim et al. [21] proposed a model to detect three rice leaf diseases—bacterial leaf blight, brown spot, and leaf smut. In their work, the image backgrounds were removed with a saturation threshold, while disease-affected areas were segmented using a hue threshold. Distinct color, shape, and texture features were extracted from the affected regions. Their model achieved 86.58% accuracy on the rice leaf disease dataset from UCI [19]. Islam et al. [22] proposed a technique that applies threshold-based segmentation to precisely separate the disease-affected areas on rice leaves. Three CNN models—VGG16, ResNet50, and DenseNet121—were used to classify the diseases and to determine the best model for this kind of image classification problem. DenseNet121 proved to be the best for this type of task, achieving a classification accuracy of 91.67%. Patidar et al. [23] proposed a model for the detection and classification of diseased rice leaves using a residual neural network. Three infections, viz. leaf smut, bacterial leaf blight, and brown spot, were distinguished by their framework, based on a dataset of 120 pictures (40 for every disease class). Their work achieved an accuracy of 95.83%. The study by Teja et al. [24] was centered on the same three rice leaf diseases, i.e., brown spot, leaf smut, and bacterial leaf blight. The dataset for their experiments was acquired from the UCI ML repository, which contains pictures of the three leaf infections. Among the various strategies examined, the best solution was a combination of an optimization procedure applied to InceptionV3 and transfer learning, with an accuracy of 99.33%. Shrivastava et al. [25] used a pre-trained deep convolutional neural network (CNN) as a feature extractor and a support vector machine (SVM) as a classifier for rice plant disease classification. Their proposed model classified rice diseases with an accuracy of 91.37% for an 80:20% training-testing ratio. Pushpa et al. [26] used the AlexNet model to develop a framework for detecting and identifying crop plant leaf diseases based on deep learning models. Their approach was achieved by tuning the model parameters, and the CNN classifier-based features provided good accuracy in plant disease identification; the proposed model achieved an accuracy of 96.76%. Recently, deep learning models have been applied in various fields to detect targets in real-time images [27]. Combining multi-scale feature selection [28] with different convolutional neural network models is playing an important role in various new studies [29,30,31]. These approaches are also very suitable for introducing automated machines and robots into different fields to save human time and effort [32,33]. Some researchers have also used these techniques in other fields, such as human face and handwriting recognition [34,35].
In the literature review of earlier works, we observed that for the detection of rice plant disease, many complex models have been used, which is a very time-consuming process. However, we did not find significant work that has been done with the grayscale dataset. So, our focus was on building a model that is less complex and has a lower computation cost while detecting diseases.
Below is a pictorial representation of the workflow for the proposed methodology.
There are only a few public datasets available for plant disease detection. The most popular plant disease dataset has more than 50,000 images across 38 classes, but it contains the leaves of various plants, and as our work was restricted to the diseases of rice plants only, we used another public dataset from the UCI Repository [13,24]. The UCI Repository is an international general-purpose database for testing machine learning algorithms (https://archive.ics.uci.edu/ml/datasets/rice+leaf+diseases). It comprises 120 images divided into three classes—bacterial leaf blight, brown spot, and leaf smut—with 40 images in each category.
Since this dataset is small for performing both training and validation, we performed data augmentation to increase our dataset further to 420 images by performing rotation and zooming operations. All the images were resized to a fixed dimension of 120 × 120.
The rotation operation was performed by varying the angle of rotation from 15 degrees to 30 degrees by increasing the angle by 15 degrees after each iteration, and the zooming operation was performed by scaling the images by a factor of 1.5 in the x and y directions.
We manually performed all of the augmentations of the dataset, and as a result, we gathered 420 images across the three classes. Once we obtained the dataset, we converted it into a NumPy array, which stores every pixel value of each image. At the same time, we labelled the images as 0, 1, or 2.
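A minimal sketch of this preparation step is shown below, assuming the images are organized in one folder per class. The folder layout, the zoom implementation, and the pixel normalization are illustrative assumptions; the exact manual augmentation schedule that yielded 420 images is not fully specified here.

```python
import os
import numpy as np
from PIL import Image

CLASSES = ["bacterial_leaf_blight", "brown_spot", "leaf_smut"]   # labels 0, 1, 2
SIZE = (120, 120)

def zoom(img, factor=1.5):
    # Crop the central 1/factor region and resize back, i.e., a 1.5x zoom-in.
    w, h = img.size
    cw, ch = int(w / factor), int(h / factor)
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch)).resize(SIZE)

images, labels = [], []
for label, cls in enumerate(CLASSES):
    folder = os.path.join("rice_leaf_diseases", cls)              # hypothetical path
    for fname in os.listdir(folder):
        img = Image.open(os.path.join(folder, fname)).convert("RGB").resize(SIZE)
        # Original image plus rotated (15 and 30 degrees) and zoomed variants.
        for variant in (img, img.rotate(15), img.rotate(30), zoom(img)):
            images.append(np.asarray(variant, dtype=np.float32) / 255.0)
            labels.append(label)

X = np.stack(images)          # shape: (num_images, 120, 120, 3)
y = np.array(labels)          # integer class labels 0, 1, or 2
```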
To evaluate the performance of deep learning models [34], comparative experiments were performed. We tested a total of 10 pre-trained models with different training-testing ratios to select the best model. Thereafter, we varied different parameters, such as pooling, epochs, etc., to determine the best configuration for the selected model. Below, the models used in the methodology are described.
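The comparison procedure can be sketched roughly as follows, assuming Keras application backbones, the `X`/`y` arrays prepared above, and a common lightweight head. The backbone list, split ratios, and epoch count in this sketch are illustrative, not the full set of ten models and configurations reported in the tables.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split

BACKBONES = {
    "VGG19": tf.keras.applications.VGG19,
    "InceptionV3": tf.keras.applications.InceptionV3,
    "MobileNet": tf.keras.applications.MobileNet,
    "ResNet50": tf.keras.applications.ResNet50,
}

def build(backbone_fn):
    # Frozen pre-trained backbone plus a small trainable 3-class head.
    base = backbone_fn(weights="imagenet", include_top=False, input_shape=(120, 120, 3))
    base.trainable = False
    return models.Sequential([base,
                              layers.GlobalAveragePooling2D(),
                              layers.Dense(3, activation="softmax")])

results = {}
for name, fn in BACKBONES.items():
    for test_size in (0.1, 0.2):                       # e.g., 90:10 and 80:20 splits
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size,
                                                  stratify=y, random_state=42)
        model = build(fn)
        model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
        model.fit(X_tr, y_tr, epochs=15, validation_data=(X_te, y_te), verbose=0)
        _, acc = model.evaluate(X_te, y_te, verbose=0)
        results[(name, test_size)] = acc               # compare validation accuracies
```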
Following the advances in image processing made by AlexNet, newer ConvNets were made deeper and denser, such as the VGG and DenseNet families. As these models became deeper, their computational costs also increased. Hence, to balance these costs, InceptionNet was created, with different versions—V1, V2, and V3. InceptionNet is built from Inception modules, which combine the convolutional layer, pooling layer, dense layer, dropout, and fully connected layer in different configurations. The performance of InceptionNet on our dataset with variation in the number of epochs is shown in Table 2.
As the name suggests, the DenseNet201 model is very densely connected compared to other models. DenseNet201 is made up of dense blocks built from ConvNets, in which each layer receives input from all preceding layers, for a total depth of 201 layers. The performance of DenseNet201 on our dataset with variation in the number of epochs is shown in Table 2.
MobileNet is basically used for the processing of mobile applications, and due to its light weight, it is becoming widely used by data scientists. It was the first image processing model created by TensorFlow. For convolution, MobileNet used two operations, Pointwise and Depthwise. The performance of MobileNet on our dataset with variation in the number of epochs is shown in Table 2.
Visual Geometric Group 19 (VGG19) is a CNN-based model purpose-made for image classification. VGG19 is a 19-layer architecture. It comprises different blocks, including a convolutional layer, a pooling layer, and a dense layer, and each block is followed by a pooling layer that has a total count of 4. VGG19 performed exceptionally well in terms of accuracy and losses during the fitting of the model, as the yield accuracy increased within a few epochs compared to the other models. The performance of VGG19 on our dataset with variation in the number of epochs is shown in Table 2.
XceptionNet was created by Google. The architecture of XceptionNet is 71 layers deep. Specifically, this model used the modified version of convolution in a depthwise format. While performing the depthwise operation of XceptionNet, no activation occurred in any of the intermediate layers. The performance of XceptionNet on our dataset with variation in the number of epochs is shown in Table 2.
Visual Geometric Group 16 (VGG16) is a prior version of VGG19. In this study, VGG16 was used as a CNN-based model purposely created for image classification. VGG16 is a 16-layer architecture. It comprises different blocks of a convolutional layer, a pooling layer, and a dense layer, and each block is followed by a pooling layer that has a total count of 4. VGG16 performed exceptionally well in terms of accuracy and losses during the fitting of the model, as the yield accuracy increased within a few epochs as compared to the other models. For some datasets, VGG16 performed better than VGG19, and vice versa. The performance of VGG16 on our dataset with variation in the number of epochs is shown in Table 2.
EfficientNet was not created by scientists; it was created with the help of artificial intelligence. EfficientNet is a convolutional neural network design and scaling method that uniformly scales all depth, breadth, and resolution dimensions using a compound coefficient. This EfficientNet randomly modified the network's breadth, depth, and resolution, and the EfficientNet scaling technique consistently scaled the network's breadth, depth, and resolution with a set of specified scaling coefficients. The performance of EfficientNet on our dataset with variation in the number of epochs is shown in Table 2.
NASNet Mobile (Neural Architecture Search Network) was also developed by the Google Brain team. It was created for image processing and was trained on millions of images. It uses reinforcement learning to search for the best CNN architecture, i.e., to find the best parameters for the CNN layers and attain the most accurate prediction results. The performance of NASNet on our dataset with variation in the number of epochs is shown in Table 2.
Residual Network 50 (ResNet50) is a type of neural network developed by researchers from Microsoft in the year 2015. This ResNet50 was a heavily dense model compared to InceptionNet and MobileNet, as this model contained five sub-blocks and each of these blocks was divided again into two other sub-blocks known as the identity block and convolution block. Both of these blocks had three convolution layers. The performance of ResNet50 on our dataset with variation in the number of epochs is shown in Table 2.
As the name suggests, Inception-ResNet is the fusion of two neural networks—InceptionNet and the Residual Network (ResNet). This network can predict thousands of different classes and has a 164-layer architecture, which makes the model heavy and dense [24]. The performance of Inception-ResNet on our dataset with variation in the number of epochs is shown in Table 2.
In some cases, to achieve transfer learning, we used the concept of model stacking: the output of one deep learning model is passed to the input layer of another. Before stacking, we ensured that the first model produced a high-level feature representation that could serve as the input for the second model for further processing. The second model was then trained on this input to learn a specific task.
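A minimal sketch of this stacking idea, assuming Keras; the backbone choice and the classifier layer sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# First model: a frozen pre-trained backbone used purely as a feature extractor.
feature_extractor = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, pooling="avg", input_shape=(120, 120, 3))
feature_extractor.trainable = False

# Second model: a small classifier trained on the high-level features produced above.
classifier = models.Sequential([
    tf.keras.Input(shape=feature_extractor.output_shape[1:]),  # (1024,) for MobileNet
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

# Stack them: the output of the first model feeds the input layer of the second.
inputs = tf.keras.Input(shape=(120, 120, 3))
stacked = models.Model(inputs, classifier(feature_extractor(inputs)))
stacked.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
```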
Input Layer: An RGB image with a size of 120*120*3 is input to the model.
• Block 1: The first block starts with a convolutional layer (Conv1-1) with 32 kernels of size 3*3, which produces feature maps of size 118*118 pixels from the rice leaf image. The activation used is ReLU. After this, there is Conv1-2, which is also a convolutional layer with 32 kernels of size 3*3; the activation used here is also ReLU. In the last layer of this block, the feature maps are pooled down using a max pooling layer with a size of 2*2.
• Block 2: The second block starts with Conv2-1, which takes its input from the first block with an input size of 58*58 and an increased number of kernels, 64, each of size 3*3. The activation used is the ReLU function. This layer is connected to Conv2-2, which has the same parameters, and then to Conv2-3, which is also a convolutional layer. Using repeated convolutional layers helped the model achieve higher accuracy. Conv2-3 is connected to the second max pooling layer, with a pooling size of 2*2, which again halves the size of the feature maps. At the end of the block there is a dropout layer, whose sole purpose is to prevent overfitting of the model during training, with a dropout rate of 0.5.
• Block 3: The third block takes input from Block 2 with an input size of 26*26. The first layer in this block is Conv3-1, a convolutional layer with 128 kernels of the same filter size, 3*3, and ReLU activation. This layer is connected to Conv3-2 with the same parameters, and Conv3-2 is connected to the last convolutional layer of this block, Conv3-3, again with the same parameters. In the last layer, the feature maps are pooled down by a max pooling layer with a size of 2*2.
• Block 4: The fourth block is the last block that uses convolutional layers. It takes input from the third block with a reduced size of 10*10. The first layer is Conv4-1, with 256 kernels of size 3*3 and ReLU activation. This layer is connected to the last convolutional layer of the model, Conv4-2, which also has 256 kernels of size 3*3 (see Table 1) and ReLU activation. Lastly, the feature map size is again halved by a max pooling layer with a size of 2*2.
• Block 5: This block contains a single layer, a flattening layer, which converts the multidimensional feature maps into a one-dimensional array. Without this layer, the following dense layers could not perform classification on a 1D input.
• Block 6: This block contains the first dense layer of the model, named Dense-1, which takes input from the flattening layer. A dense layer is added because it connects every output of the preceding layer to each of its nodes and combines them into a single output representation. After this layer, we added a dropout layer with a dropout rate of 0.5, which reduces overfitting during training, potentially increasing the accuracy.
• Block 7: This block is similar to Block 6, as it also contains only a dense layer and a dropout layer. Its sole purpose is to increase the accuracy of the model, because the denser the network, the more noise-suppressed features can be extracted.
• Block 8: This is the last block of our proposed architecture. It has two layers: a dense layer and a softmax layer. The dense layer is the third and last one, with three output nodes because we need to classify only three classes of rice plant disease. Finally, the softmax function is used to assign the images to the classes (softmax is used because this is a multiclass classification problem).
To better explain the architecture, it is shown in Figure 10, and Table 1 represents the trainable parameters we used.
Layer (type) | Output Shape | Param # |
Conv2d 118 (Conv2D) | (None, 118, 118, 32) | 896 |
Conv2d 119 (Conv2D) | (None, 116, 116, 32) | 9248 |
Max pooling2d 56 (MaxPooling2D) | (None, 58, 58, 32) | 0 |
Conv2d 120 (Conv2D) | (None, 56, 56, 64) | 18,496 |
Conv2d 121 (Conv2D) | (None, 54, 54, 64) | 36,928 |
Conv2d 122 (Conv2D) | (None, 52, 52, 64) | 36,928 |
Max pooling2d 57 (MaxPooling2D) | (None, 26, 26, 64) | 0 |
Dropout 28 (Dropout) | (None, 26, 26, 64) | 0 |
Conv2d 123 (Conv2D) | (None, 24, 24, 128) | 73,856 |
Conv2d 124 (Conv2D) | (None, 22, 22, 128) | 147,584 |
Conv2d 125 (Conv2D) | (None, 20, 20, 128) | 147,584 |
Max pooling2d 58 (MaxPooling2D) | (None, 10, 10, 128) | 0 |
Conv2d 126 (Conv2D) | (None, 8, 8, 256) | 295,168 |
Conv2d 127 (Conv2D) | (None, 6, 6, 256) | 590,080 |
Max pooling2d 59 (MaxPooling2D) | (None, 3, 3, 256) | 0 |
Flatten 14 (Flatten) | (None, 2304) | 0 |
Dense 42 (Dense) | (None, 256) | 590,080 |
Dropout 29 (Dropout) | (None, 256) | 0 |
Dense 43 (Dense) | (None, 256) | 65,792 |
Dropout 30 (Dropout) | (None, 256) | 0 |
Dense 44 (Dense) | (None, 3) | 771 |
Activation 14 (Activation) | (None, 3) | 0 |
Notes: Total parameters: 2,013,411; trainable parameters: 2,013,411; non-trainable parameters: 0 |
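For readers who want to reproduce the architecture, the following is a sketch of LW17 reconstructed from the block descriptions and Table 1, assuming TensorFlow/Keras; the ReLU activations on the hidden dense layers are an assumption, while the layer counts and output shapes follow Table 1.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

lw17 = models.Sequential([
    tf.keras.Input(shape=(120, 120, 3)),
    # Block 1
    layers.Conv2D(32, (3, 3), activation="relu"),     # -> 118x118x32
    layers.Conv2D(32, (3, 3), activation="relu"),     # -> 116x116x32
    layers.MaxPooling2D((2, 2)),                      # -> 58x58x32
    # Block 2
    layers.Conv2D(64, (3, 3), activation="relu"),     # -> 56x56x64
    layers.Conv2D(64, (3, 3), activation="relu"),     # -> 54x54x64
    layers.Conv2D(64, (3, 3), activation="relu"),     # -> 52x52x64
    layers.MaxPooling2D((2, 2)),                      # -> 26x26x64
    layers.Dropout(0.5),
    # Block 3
    layers.Conv2D(128, (3, 3), activation="relu"),    # -> 24x24x128
    layers.Conv2D(128, (3, 3), activation="relu"),    # -> 22x22x128
    layers.Conv2D(128, (3, 3), activation="relu"),    # -> 20x20x128
    layers.MaxPooling2D((2, 2)),                      # -> 10x10x128
    # Block 4
    layers.Conv2D(256, (3, 3), activation="relu"),    # -> 8x8x256
    layers.Conv2D(256, (3, 3), activation="relu"),    # -> 6x6x256
    layers.MaxPooling2D((2, 2)),                      # -> 3x3x256
    # Blocks 5-8: flatten, two dense+dropout blocks, and the 3-class softmax output
    layers.Flatten(),                                 # -> 2304
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3),
    layers.Activation("softmax"),
])
lw17.summary()   # total parameters come to 2,013,411, as in Table 1
```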
Loss Function: In this model, we used sparse categorical cross-entropy as the loss function, since this loss function is suited to classifying more than one class of images with integer labels. The loss function is useful because, while the model is being fitted, it signifies how well the model is fitting the dataset during training: the lower the loss value, the more accurate the model's predictions.
$$ CE = -\sum_{i=1}^{C} t_i \log\left(f(s)_i\right) \qquad (3.1) $$
Here, CE is the cross-entropy, C is the total number of output neurons, t is the target vector, and f(s) is the output of the softmax function. Equation (3.1) shows the calculation of categorical cross-entropy.
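As a quick sanity check of Eq (3.1), the following snippet (assuming TensorFlow) compares Keras' sparse categorical cross-entropy with the hand-computed value for one sample.

```python
import numpy as np
import tensorflow as tf

f_s = np.array([[0.1, 0.2, 0.7]], dtype=np.float32)   # softmax output over C = 3 classes
y_true = np.array([2])                                 # integer label, as used by the sparse variant

loss = tf.keras.losses.SparseCategoricalCrossentropy()(y_true, f_s)
print(float(loss), -np.log(0.7))                       # both are approximately 0.3567
```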
Optimizer: The second key parameter when fitting the model is the optimizer. In this model, we used Adam as the optimizer, as it is faster and more efficient than some other optimizers, such as Adadelta, RMSprop, and SGD. The optimizer's role is to update the weights of each node during training so as to minimize the loss function. Adam combines two gradient descent techniques, RMSprop and momentum. This is how every block of the Light Weight 17 (LW17) model performs its function whenever an image is passed through the input layer. Going further into the methodology, the training and validation of the RGB image dataset with the LW17 model are shown in Figure 10 below.
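Putting the pieces together, below is a sketch of how the LW17 model defined above can be compiled and fitted under the settings reported in this paper (Adam with a learning rate of 0.001, sparse categorical cross-entropy, and a 9:1 training-to-validation split); the batch size and random seed are assumptions, and `X`/`y` are the arrays prepared earlier.

```python
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import Adam

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.1,
                                                  stratify=y, random_state=42)
lw17.compile(optimizer=Adam(learning_rate=0.001),
             loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
history = lw17.fit(X_train, y_train,
                   validation_data=(X_val, y_val),
                   epochs=15, batch_size=32)           # batch size is an assumption
```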
Beyond 17 layers, we also tried increasing the number of layers to 20 and 21, but we did not obtain any appreciable improvement in accuracy, so we stopped at 17 layers.
All of the experiments were performed on a Dell Precision Tower 5810 with an Intel(R) Xeon E5-2630 v4 CPU @ 2.20 GHz, 64 GB of RAM, and an x64 architecture at the National Institute of Technology, Raipur. We used Python 3.11 with a Jupyter Notebook, and the Scikit-learn, TensorFlow, Keras, Matplotlib, Pandas, and other libraries were installed to perform the deep learning experiments.
We performed these experiments iteratively 10 times to obtain the average accuracy of each model. The accuracy of each model is shown in Tables 2 and 3.
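A sketch of this repetition protocol is shown below; `build_model` is a placeholder for any compiled-model constructor (for example, one wrapping the LW17 definition above), and the splits are the ones prepared earlier.

```python
import numpy as np

def average_accuracy(build_model, X_tr, y_tr, X_val, y_val, runs=10, epochs=15):
    # Train and evaluate the model `runs` times with fresh weights and average the scores.
    scores = []
    for _ in range(runs):
        model = build_model()                       # fresh, compiled model for every run
        model.fit(X_tr, y_tr, epochs=epochs, verbose=0)
        _, acc = model.evaluate(X_val, y_val, verbose=0)
        scores.append(acc)
    return float(np.mean(scores))
```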
Model | Number of Epochs | Training Accuracy | Validation Accuracy | Validation Loss |
VGG-19 | 5 | 90.06% | 85.21% | 1.1936
10 | 92.63% | 87.43% | 1.1302 | |
15 | 94.47% | 89.78% | 1.2301 | |
INCEPTION V3 | 5 | 61.54% | 51.28% | 8.9556
10 | 72.12% | 58.97% | 7.1420 | |
15 | 74.47% | 59.10% | 9.2561 | |
MOBILENET | 5 | 85.00% | 50.18% | 1.1704 |
10 | 92.06% | 52.21% | 1.2332 | |
15 | 93.24% | 52.63% | 1.3489 | |
DENSENET-201 | 5 | 79.17% | 79.10% | 0.6808
10 | 89.74% | 79.49% | 0.6377 | |
15 | 90.47% | 80.12% | 0.5672 | |
XCEPTION | 5 | 66.53% | 53.17% | 3.6400 |
10 | 69.50% | 59.52% | 4.3366 | |
15 | 70.47% | 60.12% | 4.5672 | |
EFFICIENT NET | 5 | 91.60% | 90.18% | 0.2238 |
10 | 94.05% | 94.00% | 0.2682 |
15 | 94.47% | 94.10% | 0.2561 |
INCEPTION RESNET V2 | 5 | 37.20% | 33.33% | 5.9619 |
10 | 48.81% | 45.24% | 3.7276 |
15 | 49.10% | 45.71% | 3.2682 |
NASNET MOBILE | 5 | 67.26% | 55.95% | 0.9370 |
10 | 70.24% | 54.76% | 1.2071 | |
15 | 71.36% | 55.71% | 1.1820 | |
RESNET 50 | 5 | 93.15% | 83.33% | 0.3279 |
10 | 94.35% | 92.86% | 0.3517 | |
15 | 94.85% | 93.71% | 0.3178 | |
VGG-16 | 5 | 92.26% | 79.76% | 0.8327 |
10 | 92.56% | 84.52% | 0.6207 | |
15 | 93.05% | 85.54% | 0.4178 | |
LW17 CNN | 5 | 91.52% | 87.50% | 0.1711 |
10 | 90.81% | 90.62% | 0.1605 |
15 | 94.93% | 87.50% | 0.2139
In Table 2, it can be observed that the ResNet-50 model obtained the highest accuracy, but the LW17 model had one-third of the layers of the ResNet-50 model and accuracy close to that model. Therefore, it can be a better option in terms of the complexity and cost of building a model.
MODEL | Training Accuracy | Validation Accuracy | Validation Loss | Execution Time (in Sec) |
VGG19 | 90.50% | 89.78% | 1.2 | 218 |
InceptionV3 | 56.70% | 50.30% | 19 | 560 |
MobileNet | 81.50% | 77.80% | 1.9 | 488 |
DenseNet201 | 77.70% | 75.20% | 0.72 | 540 |
XceptionNet | 56.70% | 52.20% | 5.21 | 346 |
EfficientNet | 89.31% | 94.10% | 0.27 | 246 |
InceptionResNet | 49.10% | 42.00% | 46 | 389 |
NASNetMobile | 70.24% | 53.41% | 1.22 | 458 |
ResNet50 | 94.85% | 93.71% | 0.37 | 548 |
VGG16 | 92.44% | 85.54% | 0.68 | 110 |
LW17 | 94.35% | 93.75% | 0.18 | 116 |
Based on Table 3, we can say that our model performed better than the pre-trained models. Our model (LW17) gave the best validation accuracy among all the compared models, i.e., 93.75%. When we compared the execution time, all other models (except VGG16) took more time than our model, as they have more layers in their architectures.
When the LW17 model was tested with (approximately) 10% random testing data, the confusion matrix was obtained, as shown in Figure 11.
The precision, recall, F1-score (Eqs (4.1)-(4.3)), and execution time were also calculated to check the robustness and complexity of the proposed model. The results were calculated for each class and are shown in Table 4. The average accuracy achieved on the testing dataset was 94.44%, which is greater than that of the approach used by Chen et al. [13], who achieved an average accuracy of 94.07% on the same public dataset. Patidar et al. [23] reported a higher accuracy of 95.83%, but they used the ResNet model for their experiments, which has more layers than our proposed model.
$$ \text{Precision} = \frac{TP}{TP + FP} \qquad (4.1) $$
$$ \text{Recall} = \frac{TP}{TP + FN} \qquad (4.2) $$
$$ \text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4.3) $$
Here, TP, FP, and FN denote the numbers of true positives, false positives, and false negatives for each class.
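These per-class metrics can be reproduced from the model predictions roughly as follows, assuming scikit-learn, the fitted `lw17` model from the sketches above, and hypothetical `X_test`/`y_test` arrays holding the roughly 10% held-out split.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

y_prob = lw17.predict(X_test)                 # X_test / y_test: the held-out test split
y_pred = np.argmax(y_prob, axis=1)            # predicted class indices 0, 1, or 2

print(confusion_matrix(y_test, y_pred))       # counterpart of the matrix in Figure 11
print(classification_report(
    y_test, y_pred,
    target_names=["Bacterial Leaf Blight", "Brown Spot", "Leaf Smut"],
    digits=3))                                # per-class precision, recall, F1 as in Table 4
```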
Class | Disease | Precision | Recall | F1-Score | Accuracy (%) |
1 | Bacterial Leaf Blight (BLB) | 1.000 | 0.8571 | 0.923 | 100 |
2 | Brown Spot | 0.833 | 1.00 | 0.9088 | 83.33 |
3 | Leaf Smut | 1.00 | 0.933 | 0.965 | 100 |
- | Average | 0.944 | 0.930 | 0.932 | 94.44 |
Table 5 contains a comparison of the LW17 model with other existing models. The comparison shows a significant improvement in classification accuracy when using LW17. In earlier work on the UCI dataset that applied CNN models [16,17], Bayes and SVM classifiers [20,25], a boosted decision tree [21], or convolutional models [22], the training and testing accuracy was limited to at most 91.67%, as the standard UCI dataset has a very small number of images, so those systems may not have been trained sufficiently with that amount of data. Our system was trained with three times more images than the above-mentioned models, so its performance reached 93.75%. At the same time, when comparing our model with other standard techniques, such as transfer learning [36], an enhanced neural network model [38], MobileNetV2 [41], MobileNet with SE [40], attention-embedded lightweight networks [37,39], and recent meta-heuristic deep neural network (DNN) models in which features are optimized by a butterfly optimizer [42,43] or PCA [44], the performance of our system remained limited. Still, we succeeded in providing a model with fewer layers than these models and whose accuracy is close to that of these standard models.
Author | Accuracy (%) |
A. Sony [16] | 86.6 |
Radhika Wadhawan [17] | 85 |
S. Phadikar et al. [20] | 79.5 |
Azim, M. A. et al. [21] | 86.58 |
Anam Islam et al. [22] | 91.67 |
Vimal Shrivastava et al. [25] | 91.37 |
Chen, J., et al., 2020 [36] | 99.11 |
Chen, J. et al., 2021 [37] | 98.50 |
Chen, J., et al., 2021 [38] | 93.75 |
Chen, J., et al., 2021 [39] | 99.14 |
Chen, J., et al., 2021 [40] | 99.78 |
Chen, J., et al., 2021 [41] | 99.67 |
Ruth J. A. et.al., 2022 [42] | 99.00 |
Uma, R., & Meenakshi, A., 2021 [43] | 98.42 |
Gadekallu, T. R. et al., 2021 [44] | 94.00 |
LW17 (Proposed Work) | 93.75 |
Rice is an important food globally, and its production may be increased by reducing the effects of possible diseases. We observed that several computer-aided models exist, and new advancements appear every day. In this study, we created our own CNN model from scratch to detect rice plant disease more precisely than pre-trained models such as VGG19, InceptionV3, MobileNet, Xception, and DenseNet201. After trying different possible variations, we observed that our LW17 model performed exceptionally well compared to other deep neural network models, with an accuracy of 93.75%. We obtained this result using max pooling, the Adam optimizer, a learning rate of 0.001, and a training-to-validation ratio of 9:1. We obtained the best accuracy for our model with the 17-layer architecture, which has roughly a third of the layers of the ResNet50 model, whose highest accuracy was 93.85%. As the performance of the model was tested on a small dataset, in the future, we will test our model on other, larger datasets to check its robustness. In addition, the performance of the proposed model may be compared on further parameters in future work.
All authors declare that they have no conflict of interest.