
Hormone receptor negative (HR-) breast cancer subtypes are etiologically distinct from the more common, less aggressive, and more treatable form of estrogen receptor positive (ER+) breast cancer. Numerous population-based studies have found that, in the United States, Black women are 2 to 3 times more likely to develop HR- breast cancer than White women. Much of the existing research on racial disparities in breast cancer subtype has focused on identifying predisposing genetic factors associated with African ancestry. This approach fails to acknowledge that racial stratification shapes a wide range of environmental and social exposures over the life course. Human stress genomics considers the role of individual stress perceptions on gene expression. Yet, structurally rooted biopsychosocial processes that may be activated by the social patterning of stressors in a historically unequal society, whether perceived by individual Black women or not, could also impact cellular physiology and gene expression patterns relevant to HR- breast cancer etiology. Using the weathering hypothesis as our conceptual framework, we develop a structural perspective for examining racial disparities in breast cancer subtypes, integrating important findings from the stress biology, breast cancer epidemiology, and health disparities literatures. After integrating key findings from these largely independent literatures, we develop a theoretically and empirically guided framework for assessing potential multilevel factors relevant to the development of HR- breast cancer disproportionately among Black women in the US. We hypothesize that a dynamic interplay among socially patterned psychosocial stressors, physiological and behavioral responses, and genomic pathways contributes to the increased risk of HR- breast cancer among Black women.
This work provides a basis for exploring potential alternative pathways linking the lived experience of race to the risk of HR- breast cancer, and suggests new avenues for research and public health action.
Citation: Erin Linnenbringer, Sarah Gehlert, Arline T. Geronimus. Black-White Disparities in Breast Cancer Subtype: The Intersection of Socially Patterned Stress and Genetic Expression[J]. AIMS Public Health, 2017, 4(5): 526-556. doi: 10.3934/publichealth.2017.5.526
An emotion is a powerful, intuitive response that arises from a person's circumstances, moods, and relationships with others. Emotion recognition is therefore a crucial topic in human-machine interaction, as well as in analyzing and regulating feelings. An individual's emotional state can be deciphered through a variety of signals, including physiological measurements, body gestures, speech, and facial expressions. Physiological indicators include variations in heart rate (measurable by an electrocardiogram (ECG)), temperature, skin conductance, muscular tension, and brain waves [1,2,3,4,5].
Emotion detection, a fundamental aspect of affective computing, involves using advanced technologies, specifically deep learning, to recognize and interpret human emotions from various inputs such as text, speech, facial expressions, and other multimodal sources. The ability to understand human emotions automatically has gained increasing significance due to its wide range of applications, including sentiment analysis, mental health assessment, human-computer interaction, and personalized services [6,7,8]. Deep learning, with its capability to automatically learn intricate patterns and features from large and complex datasets, has emerged as a powerful tool for enhancing the accuracy and robustness of emotion detection systems [9,10,11,12,13].
Deep learning is a key component in identifying a variety of emotions from acoustic speech information. The detection and classification of emotions represent a vital aspect of human-computer interaction, sentiment analysis, and affective computing. Existing frameworks and techniques have made substantial progress in this domain, yet they are not without their challenges. One of the primary challenges in emotion classification is the availability of high-quality and diverse datasets [14,15,16]. Many existing deep learning models for emotion detection are regarded as "black boxes", making it difficult to understand how and why they make certain predictions. Emotion classification often requires substantial computational resources, especially when dealing with large-scale datasets or complex deep learning models [17,18,19,20]. These resource requirements can limit the widespread adoption of emotion recognition systems, particularly in resource-constrained environments or on edge devices. Optimizing existing frameworks for efficiency while maintaining accuracy is an ongoing challenge. Addressing these challenges is crucial for creating more accurate and adaptable emotion recognition systems that can be applied across various domains and cultures. In this paper, a novel optimization-based extended deep learning strategy with optimal feature selection is proposed to accurately identify facial emotions while addressing these existing problems. The major contributions of the proposed method are as follows.
● To design an innovative FER model, extended walrus-based deep learning with Botox feature selection network (EWDL-BFSN) for effectually predicting facial emotions.
● To apply a new meta-heuristic optimization algorithm, the improved Botox optimization algorithm (IBoA), to select optimal features for minimizing computational complexity.
● To design an effective enhanced optimization-based kernel residual 50 (EK-ResNet50) network for predicting facial emotions with maximum accuracy rate by minimizing the errors.
● To offer a nature-inspired metaheuristic walrus optimization algorithm (WOA) for tuning the network parameters of the EK-ResNet50 model and to lessen the false positive (FP) rates.
● To estimate the performance of the EWDL-BFSN model by relating it with state-of-the-art methods using two different publicly available datasets.
The remainder of this paper is structured as follows: Section 2 reviews relevant studies addressing the prediction of facial emotions. Section 3 describes the EWDL-BFSN model. Section 4 presents the results and discussion. Section 5 concludes the paper and outlines future directions.
Alzahrani [21] introduced a bioinspired image processing-enabled facial emotion recognition (BIPFER-EOHDL) model that incorporates an equilibrium optimizer (EO) and hybrid deep learning. The median filtering (MF) was used to pre-process the input image. Next, the EfficientNetB7 model was used to extract features. Meanwhile, the EO technique was used to select the hyperparameters of the EfficientNetB7 model. Lastly, the categorization of facial emotions was accomplished through a multi-head attention bi-directional long short-term memory (MA-BLSTM) model. Even with improved performances, this model would require minimizing the false rates.
Tao and Duan [22] suggested a hierarchical attention network with progressive feature fusion for classifying facial expressions. First, a diverse feature extraction module based on multiple feature aggregation blocks was used to exploit both high-level and low-level features, as well as gradient features, in order to aggregate diverse complementary features. Second, a hierarchical attention module (HAM) was constructed to gradually boost discriminative characteristics from significant regions of the face images and suppress task-irrelevant features from distracting facial regions to efficiently fuse the dissimilar features. According to extensive trials, it achieved strong performance; however, the accuracy was not satisfactory.
Alamgir and Alam [23] offered a hybrid deep belief rain optimization (HDBRO) model for FER. The first step used in the HDBRO model was pre-processing the images obtained from the dataset in order to remove noise. Then, the primary geometric and appearance-based features were retrieved from the pre-processed image. From the retrieved feature set, the most significant features were chosen and classified into seven distinct emotions using DBRO. The HDBRO method's overall performance attained 97% accuracy values, superior to those of the other categorization models. However, a better optimal parameter tuning algorithm was required to improve the classification performance.
Kumar et al. [24] recommended an improved FER method for recognizing facial emotions using neural networks and optimal descriptor selection. At first, the Viola-Jones method was used to extract the face from the input image. Then, Gabor filtering effectively removed the noise, and a modified version of the SIFT approach called affine-scale-invariant feature transform (ASIFT) was used to extract the facial components as features. Subsequently, the neural network used the retrieved descriptors for classification. The findings showed that this system was capable of classifying seven different emotions accurately. The drawbacks of this method include limited texture and edge enhancement, lower accuracy, and higher complexity.
Kumari and Bhatia [25] introduced an enhanced FER technique based on deep learning. At first, the collected dataset was subjected to a combined trilateral filter in order to eliminate noise. Then, the filtered images underwent contrast-limited adaptive histogram equalization (CLAHE) to improve image visibility. Lastly, a deep convolutional neural network (CNN) was trained for classification. The cost function of the deep CNN was also optimized by means of the Nadam optimizer. This model outperformed the competing models in terms of precision, recall, accuracy, and other metrics, according to comparative analysis. However, it had limitations such as overfitting issues, vanishing gradient problems, lower accuracy, and higher false rates.
Although several deep learning-based FER models have made great progress, they still present several problems and restrictions. For training, emotion detection models mainly rely on labeled datasets, and it is difficult to find large and diverse labeled datasets for emotion identification. Besides, accurate emotion labeling is difficult to achieve, and different annotators can interpret emotions differently. As a result, the training data can contain discrepancies that affect the model's performance. Deep learning models, particularly intricate ones like deep neural networks, are frequently referred to as "black boxes" due to their complexity. Building trust and comprehending the model's decision-making process is crucial, especially in sensitive areas like mental health; however, it is difficult to understand the inner workings of these models and to provide reasons for the predictions they make. Deep learning algorithms also suffer from overfitting, especially if the training data is sparse or unbalanced. In emotion detection tasks, striking a balance between accurately fitting the training data and generalizing to new data remains difficult. Therefore, this paper proposes a novel optimization-driven enhanced deep learning model that focuses on overcoming the limitations of existing methods.
In this section, a novel EWDL-BFSN model is proposed for accurately predicting facial emotions. The EWDL-BFSN model includes different phases such as data collection, pre-processing, feature extraction, feature selection, and facial emotion classification. Initially, the data collection stage gathers the input images from the publicly available dataset. The raw input image typically needs to be pre-processed to enhance the visual representation using a gradient wavelet anisotropic filter (GWAF). Next, a feature extraction method based on SqueezeNet is utilized to extract the features representing facial emotions. Then, the feature selection process is performed using IBoA to obtain optimized features with lower sizes. Lastly, the selected features are fed to EK-ResNet50 to achieve enhanced classification. The block diagram of the EWDL-BFSN model is presented in Figure 1.
The pre-processing is the initial stage of the proposed EWDL-BFSN model and is employed to improve the visual representation of the acquired input image. In the EWDL-BFSN model, a gradient wavelet anisotropic filter (GWAF) is applied to pre-process the input image acquired from the datasets. Anisotropic diffusion is a widely known method for alleviating noise in images while preserving edges. However, when the noise level is high, the anisotropic diffusion approach is ineffective, which is considered its primary drawback. Preserving information structures in facial images, such as object boundaries and edges, is a major challenge in image processing but is essential for good results. The wavelet transform has attained great importance owing to its capability to perform multi-scale analysis and to detect both low- and high-frequency components. Therefore, wavelet transformation is employed in diverse image processing applications, including speech recognition, computer graphics, multi-fractal analysis, cell membrane recognition in biology, fingerprint verification, and smoothing and image de-noising, among others. A wavelet transform-based anisotropic diffusion method is therefore proposed for filtering facial images, combining the benefits of the wavelet transform while addressing the drawbacks of anisotropic diffusion. Here, no smoothing method, not even Gaussian smoothing, is applied:
$$\omega^{n,p}_{2^{k}L}(b,c,u+1)=\omega^{n,p}_{2^{k}L}(b,c,u)+\frac{\kappa}{|\mu(b,c)|}\sum_{(q,r)\in\mu(b,c)}g\!\left(\left|\nabla\omega^{n,p}_{2^{k}L}(b,c,u)\right|,\varsigma\right)\left|\nabla\omega^{n,p}_{2^{k}L}(b,c,u)\right|, \tag{1}$$
where $\omega^{n,p}_{2^{k}L}(b,c,u)$ represents the wavelet coefficient at iteration (time) $u$ and position $(b,c)$, $n=1,2$ indexes the horizontal and vertical directions, $\nabla$ denotes the gradient in the DWT domain, $g(\cdot)$ is the edge-stopping (diffusion) function, $\kappa\le 0.25$ is the stability constant, and $\mu(b,c)$ describes the neighborhood of the central pixel. The threshold employed to tune the gradient magnitude is specified as $\varsigma$, which is computed as
$$\varsigma=\sqrt{5}\,\varsigma_{e}, \tag{2}$$
where $\varsigma_{e}=1.4285\,\upsilon\!\left(\left|\nabla l-\upsilon\!\left(\left|\nabla l\right|\right)\right|\right)$, with $\upsilon(\cdot)$ denoting a robust (median-type) estimator over the gradient image $\nabla l$. With the Gaussian smoothing component removed from the filtering equation, the procedure is more efficient. The gradient of $\omega^{n,p}_{2^{k}L}(b,c,u)$ is computed in four different directions as
$$\nabla_{N}\omega^{n,p}_{2^{k}L}(b,c,u)=\omega^{n,p}_{2^{k}L}(b-1,c,u)-\omega^{n,p}_{2^{k}L}(b,c,u), \tag{3}$$
$$\nabla_{S}\omega^{n,p}_{2^{k}L}(b,c,u)=\omega^{n,p}_{2^{k}L}(b+1,c,u)-\omega^{n,p}_{2^{k}L}(b,c,u), \tag{4}$$
$$\nabla_{E}\omega^{n,p}_{2^{k}L}(b,c,u)=\omega^{n,p}_{2^{k}L}(b,c+1,u)-\omega^{n,p}_{2^{k}L}(b,c,u), \tag{5}$$
$$\nabla_{W}\omega^{n,p}_{2^{k}L}(b,c,u)=\omega^{n,p}_{2^{k}L}(b,c-1,u)-\omega^{n,p}_{2^{k}L}(b,c,u). \tag{6}$$
For noise reduction, the anisotropic diffusion is executed iteratively on each scale of the wavelet transform components $\omega^{1,p}_{2^{k}L}$ and $\omega^{2,p}_{2^{k}L}$. Wavelets are used here because they can break down complex information into simpler forms at different locations and scales. After pre-processing, the pre-processed images are fed to the SqueezeNet model for feature extraction.
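The paper gives no reference implementation of the GWAF filter, so the following sketch is illustrative only. It assumes a one-level Haar transform, an exponential edge-stopping function $g$, and periodic boundary handling, none of which are specified above; it shows how Eqs (1) and (3)-(6) combine wavelet decomposition with anisotropic diffusion of the detail subbands:

```python
import numpy as np

def haar_dwt2(img):
    # One-level 2-D Haar decomposition into an approximation band (LL)
    # and horizontal/vertical/diagonal detail bands (LH, HL, HH).
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return ((a + b + c + d) / 4, (a + b - c - d) / 4,
            (a - b + c - d) / 4, (a - b - c + d) / 4)

def haar_idwt2(LL, LH, HL, HH):
    # Exact inverse of haar_dwt2 (perfect reconstruction).
    out = np.empty((2 * LL.shape[0], 2 * LL.shape[1]))
    out[0::2, 0::2] = LL + LH + HL + HH
    out[0::2, 1::2] = LL + LH - HL - HH
    out[1::2, 0::2] = LL - LH + HL - HH
    out[1::2, 1::2] = LL - LH - HL + HH
    return out

def diffuse_subband(w, iters=10, kappa=0.25, sigma=0.1):
    # Eqs (1), (3)-(6): anisotropic diffusion on one wavelet subband using
    # the four neighbour differences and an exponential edge-stopping g.
    w = w.astype(float).copy()
    for _ in range(iters):
        grads = [np.roll(w, s, axis=ax) - w for ax in (0, 1) for s in (1, -1)]
        flux = sum(np.exp(-(g / sigma) ** 2) * g for g in grads)
        w += (kappa / 4.0) * flux   # kappa <= 0.25 keeps the scheme stable
    return w

def gwaf(img):
    # Denoise the detail bands only; the approximation keeps face structure.
    LL, LH, HL, HH = haar_dwt2(img)
    return haar_idwt2(LL, diffuse_subband(LH),
                      diffuse_subband(HL), diffuse_subband(HH))
```

Diffusing only the detail bands preserves the coarse facial structure held in the approximation band, which is the motivation for combining the two methods.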
For emotion recognition, feature extraction encompasses a number of crucial steps and methods to precisely recognize and classify human emotions. Feature extraction involves identifying and isolating particular attributes or characteristics of the face. To guarantee the consistency and accuracy of the input, the process usually starts with image pre-processing, followed by the detection and localization of the face within the image. The detection of significant facial landmarks, comprising the nose, mouth, and eyes, is essential for understanding the spatial arrangement of facial features. Geometric-based techniques examine the angles and distances between landmarks, whereas appearance-based methods capture fine variations in the texture of the facial skin and muscles. Deep learning models, mainly convolutional neural networks (CNNs), and other advanced techniques automatically recognize and extract features from raw pixel data. To appropriately classify the emotions, these obtained features are subsequently fed into classifiers after minimizing their dimensionality.
The choice of the initial feature extraction layer is significant for achieving both speed and precision in FER, and it normally entails a trade-off between accuracy and speed. SqueezeNet [26] is designed with the objective of decreasing both the model's size and its number of parameters [27]. SqueezeNet reduces the model size to about 4.8 MB while preserving recognition accuracy by compressing the parameters to around 1/50 of AlexNet's. SqueezeNet uses a convolution-splitting method to transform a standard 3 × 3 convolution into a fire unit by replacing part of the 3 × 3 convolution kernels with 1 × 1 kernels. Each module has a rectified linear unit (ReLU) activation, and the fire module encompasses two layers, namely a squeeze layer and an expand layer, to increase the depth of the network. There are 1 × 1 convolution kernels in the squeeze layer, and both 1 × 1 and 3 × 3 convolution kernels in the expand layer. The 3 × 3 convolution kernels guarantee the precision of the network, while the 1 × 1 convolution kernels minimize the weight parameters. SqueezeNet is selected as the FER feature extraction network to reduce the feature extraction time and accelerate detection and recognition, owing to its excellent precision and compact size. The architecture of SqueezeNet for extracting features is given in Figure 2.
To give the network a specific depth, the first 17 layers of the SqueezeNet FER feature extraction network are used, and the last convolution and average pooling layers are eliminated. First, the pre-processed input image goes through Conv1 and Max1, Fire2−Fire4 and Max4, then Fire5−Fire8 and Max8, and finally through Fire9. A set of convolutions is employed to extract the image's FER feature map.
The SqueezeNet feature extraction network comprises one convolutional layer, eight fire modules, and three max-pooling layers with a stride of 2. Each fire module has the same structure, consisting of a squeeze layer and an expand layer, and contributes a depth of 2 to the network. The 1 × 1 and 3 × 3 convolutional output feature maps are concatenated along the channel dimension of the expand layer to form the fire module's output channels. The numbers of convolution kernels in the squeeze and expand layers satisfy the following expression:
$$Y<Z_{1}+Z_{2}, \tag{7}$$
where $Y$ indicates the number of 1 × 1 convolution kernels in the squeeze layer, $Z_{1}$ specifies the number of 1 × 1 convolution kernels in the expand layer, and $Z_{2}$ denotes the number of 3 × 3 convolution kernels in the expand layer.
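The parameter savings of the fire module can be verified by counting weights directly. The helpers below are illustrative; the fire2 configuration used in the usage note (96 input channels, 16 squeeze kernels, 64 + 64 expand kernels) follows the original SqueezeNet design:

```python
def fire_params(c_in, s1, e1, e3):
    # Weight count of a fire module (biases ignored): a squeeze layer of
    # s1 1x1 kernels, then an expand layer of e1 1x1 and e3 3x3 kernels.
    return c_in * s1 + s1 * e1 + s1 * 9 * e3

def conv3x3_params(c_in, c_out):
    # A plain 3x3 convolution producing the same output width, for comparison.
    return c_in * 9 * c_out
```

For fire2, `fire_params(96, 16, 64, 64)` gives 11,776 weights versus 110,592 for a plain 3 × 3 convolution with 128 output channels, roughly a 9× saving, and the Eq (7) constraint holds since 16 < 64 + 64.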
The SqueezeNet input size for feature extraction is set to 300 × 300 × 3, and each 3 × 3 max-pooling layer with a stride of 2 halves the feature map size. Lastly, Fire9 is utilized to obtain the 19 × 19 × 512 feature map. It has been demonstrated in [28] that large feature maps are favorable for detecting smaller objects, whereas smaller feature maps are convenient for detecting large objects. To improve the detection of facial emotions, the proposed method uses a 10 × 10 × 1024 feature map as input, obtained by passing the 19 × 19 × 512 feature map through a 3 × 3 convolutional layer with 1024 channels and a stride of 2. Moreover, the fire modules contain more 1 × 1 than 3 × 3 convolution kernels, and minimal information loss occurs when the network dimension is reduced using 1 × 1 kernels. As a result, the feature extraction speed of SqueezeNet can be increased while retaining more facial emotion information. The experiments demonstrate the strong performance of SqueezeNet for FER feature extraction. In addition, the SqueezeNet model works well for tasks requiring high real-time performance because of its speed. After feature extraction, the extracted features are given as input to the feature selection algorithm.
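The 19 × 19 → 10 × 10 reduction above can be checked with the standard convolution output-size formula (a padding of 1 is assumed here, as the padding is not stated in the text):

```python
def conv_out(n, kernel, stride, pad=0):
    # Spatial output size of a convolution or pooling layer:
    # floor((n + 2*pad - kernel) / stride) + 1.
    return (n + 2 * pad - kernel) // stride + 1
```

With `conv_out(19, 3, 2, 1)` the 3 × 3, stride-2 convolution indeed maps a 19 × 19 map to 10 × 10.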
The process of discovering the optimal subset of features is termed feature selection. It is vital for building high-performance models and minimizing computational complexity. In the proposed EWDL-BFSN model, feature selection is performed using the improved Botox optimization algorithm (IBoA). IBoA combines the Botox optimization algorithm (BoA) [29] with a dynamic weight factor to improve feature selection performance. BoA is a general metaheuristic optimization algorithm inspired by the mechanism of Botox treatment. The aim of BoA is to solve optimization problems through a human-based strategy. The BoA is designed and mathematically modeled by taking hints from Botox treatments, in which defects are targeted and treated to enhance beauty. Moreover, it has a great capability to strike a balance between exploration and exploitation. Each individual seeking Botox treatment is reflected as a BoA member. BoA models the way a doctor injects Botox into specific facial muscles to diminish wrinkles and enhance beauty. Correspondingly, the BoA strategy involves picking decision variables and adding a specific value, like Botox, to improve the candidate solution.
Similar to many optimization techniques, BoA may find it difficult to escape local optima. In some cases, BoA fails to find the actual global optimum and converges on a suboptimal solution. For complex problems, strategies to prevent trapping in local optima and to improve exploration may be required. Dynamic weights are incorporated into BoA to prevent it from becoming stuck in local optima, which are poor solutions mistaken for the best. With dynamic weights, BoA can concentrate on areas with the greatest decrease in wrinkles (exploitation) based on the outcomes observed so far. Moreover, BoA can strike a balance between exploiting the promising areas identified so far and searching the space for potentially superior solutions (avoiding local optima) through the inclusion of dynamic weights. This can greatly enhance the algorithm's chance of finding the true global optimum. For feature selection, the population members of BoA represent candidate feature subsets. Each member contributes decision variable values according to its location in the problem-solving space, characterized as a vector. The below equation delivers the population matrix assembled from these vectors:
$$Z=\begin{bmatrix}\vec{Z}_{1}\\ \vdots\\ \vec{Z}_{k}\\ \vdots\\ \vec{Z}_{Q}\end{bmatrix}_{Q\times p}=\begin{bmatrix}z_{1,1}&\cdots&z_{1,f}&\cdots&z_{1,p}\\ \vdots& &\vdots& &\vdots\\ z_{k,1}&\cdots&z_{k,f}&\cdots&z_{k,p}\\ \vdots& &\vdots& &\vdots\\ z_{Q,1}&\cdots&z_{Q,f}&\cdots&z_{Q,p}\end{bmatrix}_{Q\times p}. \tag{8}$$
Eq (9) is used to randomly assign the position of each BoA member during initialization:
$$z_{k,f}=LowBo_{f}+t_{k,f}\cdot\left(UppBo_{f}-LowBo_{f}\right),\quad k=1,2,\ldots,Q,\;f=1,2,\ldots,p, \tag{9}$$
where $Z$ designates the population matrix of BoA, $Q$ characterizes the number of population members, $\vec{Z}_{k}$ defines the $k$th member of BoA (candidate solution), $p$ specifies the number of decision variables, $t_{k,f}$ symbolizes a random number from the interval [0, 1], and $UppBo_{f}$ and $LowBo_{f}$ define the upper and lower bounds of the $f$th decision variable, respectively.
For each individual, the fitness function of the problem can be calculated. The fitness function is employed to rate the quality of each candidate solution (feature subset). The objective of IBoA is to find the subset of features in the search space that has the minimum subset size and the lowest classification error rate. For feature selection, the fitness function used in the IBoA is conveyed as follows:
$$I(fit)_{k}=(1-\delta)\times S+\delta\times\frac{P}{Dime}, \tag{10}$$
where $S$ describes the classification error rate (so that minimizing the fitness minimizes both the error and the subset size), $\delta$ exemplifies the weight parameter set to 0.01, $P$ outlines the size of the selected feature subset, and $Dime$ denotes the total feature dimension. The below expression delivers the fitness function values arranged as a vector:
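As a minimal sketch, Eq (10) can be written as a plain function. Here the first term is taken to be the classification error rate, consistent with the stated goal of a lower error under minimization, and all argument names are illustrative:

```python
def iboa_fitness(error_rate, n_selected, n_total, delta=0.01):
    # Eq (10): weighted sum of classification error and subset-size ratio;
    # lower is better, and delta = 0.01 makes the error term dominate.
    return (1 - delta) * error_rate + delta * (n_selected / n_total)
```

With `delta = 0.01`, two subsets with equal error are ranked by size, so the smaller subset always wins the tie.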
$$\vec{H}=\begin{bmatrix}H_{1}\\ \vdots\\ H_{k}\\ \vdots\\ H_{Q}\end{bmatrix}_{Q\times 1}=\begin{bmatrix}H(\vec{Z}_{1})\\ \vdots\\ H(\vec{Z}_{k})\\ \vdots\\ H(\vec{Z}_{Q})\end{bmatrix}_{Q\times 1}, \tag{11}$$
where $\vec{H}$ characterizes the vector of evaluated fitness values, and $H_{k}$ indicates the fitness value evaluated for the $k$th BoA member.
According to the BoA design, the number of facial muscles that require Botox injections diminishes as the algorithm iterates. Consequently, the following equation is used to calculate the number of decision variables (i.e., chosen muscles) for Botox injection:
$$Q_{d}=\min\!\left(\left\lfloor 1+\frac{p}{v}\right\rfloor,\;p\right), \tag{12}$$
where v characterizes the present value of the iteration counter and Qd defines the number of muscles that necessitate Botox injections.
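Under a floor reading of the bracket in Eq (12), capped at $p$ (an interpretation assumed here, since the original notation is ambiguous), the injection schedule can be sketched and checked as:

```python
def n_injected(p, v):
    # Eq (12), floor interpretation assumed and capped at p: the number of
    # "muscles" (decision variables) treated at iteration v shrinks over time.
    return min(1 + p // v, p)
```

The count starts near `p` at the first iteration and decays toward 1 as the iteration counter grows, which matches the narrowing-treatment behavior described above.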
The doctor determines which muscles to inject with Botox depending on the person's face and wrinkles. Accordingly, the following formulation is used to choose the variables to be injected for each population member:
$$D^{bs}_{k}=\{g_{1},g_{2},\ldots,g_{l},\ldots,g_{Q_{d}}\},\quad g_{l}\in\{1,2,\ldots,p\}\;\text{and}\;\forall j,m\in\{1,2,\ldots,Q_{d}\},\,j\neq m:\;g_{j}\neq g_{m}, \tag{13}$$
where $D^{bs}_{k}$ signifies the set of decision variables of the $k$th population member selected for Botox injection, and $g_{l}$ states the index of the $l$th decision variable selected for injection.
The amount of Botox injected for each population member is calculated using the below equation, which mirrors the doctor's decision in determining the drug quantity based on patient desires and expertise:
$$\vec{D}_{k}=\begin{cases}\vec{Z}_{Mean}-\vec{Z}_{k}, & v<\frac{V}{2},\\[2pt] \vec{Z}_{Best}-\vec{Z}_{k}, & \text{else},\end{cases} \tag{14}$$
where $\vec{D}_{k}=(d_{k,1},\ldots,d_{k,l},\ldots,d_{k,p})$ characterizes the amount of Botox injected for the $k$th member, $V$ is the total number of iterations, $\vec{Z}_{Best}$ defines the best member of the population, and $\vec{Z}_{Mean}$ signifies the mean population position (i.e., $\vec{Z}_{Mean}=\frac{1}{Q}\sum_{k=1}^{Q}\vec{Z}_{k}$). In this phase of BoA, a dynamic weight factor $\Psi$ is included to help the population members continually update their locations. The equation for $\Psi$ is stated as follows:
$$\Psi=\frac{e^{2(1-v/V)}-e^{-2(1-v/V)}}{e^{2(1-v/V)}+e^{-2(1-v/V)}}=\tanh\!\left(2\left(1-\frac{v}{V}\right)\right). \tag{15}$$
Because of this factor, BoA achieves an enhanced global search: the value of $\Psi$ is larger at the beginning of the iterations and decreases adaptively toward the end. BoA can thus maximize convergence speed and perform local searches more effectively. The weighted injection amount becomes
$$\vec{D}_{k}=\begin{cases}\left(\vec{Z}_{Mean}-\vec{Z}_{k}\right)\Psi, & v<\frac{V}{2},\\[2pt] \left(\vec{Z}_{Best}-\vec{Z}_{k}\right)\Psi, & \text{else}.\end{cases} \tag{16}$$
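The dynamic weight factor of Eq (15) is simply a tanh of twice the remaining iteration fraction, which can be checked numerically (the function name is illustrative):

```python
import math

def dynamic_weight(v, V):
    # Eq (15): Psi = tanh(2(1 - v/V)); large early (global search),
    # decaying to zero at the final iteration (local refinement).
    x = 2.0 * (1.0 - v / V)
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))
```

At the first iteration the factor is close to tanh(2) ≈ 0.96, and it reaches exactly 0 at `v = V`, which produces the coarse-to-fine search behavior described above.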
The appearance of faces is altered by the wrinkles disappearing after a Botox injection into the facial muscles. Accordingly, a new location is calculated for each BoA member after the Botox injection based on the below equation:
$$\vec{Z}^{New}_{k}:\;z^{New}_{k,g_{l}}=z_{k,g_{l}}+t_{k,g_{l}}\cdot d_{k,g_{l}}, \tag{17}$$
where $\vec{Z}^{New}_{k}$ characterizes the new location of the $k$th member after injecting Botox, $d_{k,g_{l}}$ indicates the $g_{l}$th dimension of the Botox injection amount $\vec{D}_{k}$ for the $k$th member, $z^{New}_{k,g_{l}}$ specifies its $g_{l}$th dimension, and $t_{k,g_{l}}$ designates a random number with uniform distribution on the interval [0, 1]. If the fitness value improves, this new position replaces the member's previous location in accordance with the below expression:
$$\vec{Z}_{k}=\begin{cases}\vec{Z}^{New}_{k}, & H^{New}_{k}<H_{k},\\[2pt] \vec{Z}_{k}, & \text{else},\end{cases} \tag{18}$$
where $H^{New}_{k}$ characterizes the fitness function value of the new position $\vec{Z}^{New}_{k}$. The pseudocode of feature selection using IBoA is delivered in Algorithm 1.
Algorithm 1: Feature selection using IBoA

Start
    Initialize the population size Q and the total number of iterations V
    Set the fitness function, constraints, and variables
    Build the initial population matrix randomly
    Evaluate the fitness function
    Assess the best candidate solution Z_Best
    For v = 1 to V
        Update the number of decision variables for injecting Botox using Eq (12)
        For k = 1 to Q
            Determine the variables nominated for Botox injection based on Eq (13)
            Calculate the amount of Botox injection based on Eq (16)
            For each selected dimension g_l
                Calculate the new location of the kth IBoA member based on Eq (17)
            End
            Compute the fitness function for Z_k^New
            Update the kth member of IBoA using Eq (18)
        End
        Save the best candidate solution obtained so far
    End
    Output the best solution (selected features)
Stop
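The update loop of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fitness function here is a toy sphere objective standing in for the unspecified feature-selection criterion, and the injection is applied to all dimensions rather than only the variables chosen by Eqs (12)-(13):

```python
import math
import random

random.seed(0)

def sphere(z):
    # Toy fitness (lower is better); stands in for the unspecified
    # feature-selection objective.
    return sum(x * x for x in z)

def iboa_step(Z, k, v, V, fitness):
    """One IBoA position update for member k (Eqs (15)-(18)), a sketch.
    Z is a list of Q position vectors of length p; here the update touches
    every dimension, whereas the paper updates only the decision variables
    chosen for injection via Eqs (12)-(13)."""
    psi = math.tanh(2.0 * (1.0 - v / V))                # dynamic weight, Eq (15)
    best = min(Z, key=fitness)                          # Z_Best
    p = len(Z[k])
    mean = [sum(z[i] for z in Z) / len(Z) for i in range(p)]  # Z_Mean
    ref = mean if v < V / 2 else best                   # Eq (16): explore, then exploit
    D = [(ref[i] - Z[k][i]) * psi for i in range(p)]
    Znew = [Z[k][i] + random.random() * D[i] for i in range(p)]  # Eq (17)
    if fitness(Znew) < fitness(Z[k]):                   # Eq (18): greedy replacement
        Z[k] = Znew
    return Z

Z = [[random.uniform(-5, 5) for _ in range(4)] for _ in range(6)]
start = sphere(Z[0])
for v in range(30):
    Z = iboa_step(Z, 0, v, 30, sphere)
assert sphere(Z[0]) <= start  # the greedy rule never worsens member 0
```

The greedy acceptance in Eq (18) guarantees monotone improvement of each member, which is why the loop above never increases the toy fitness.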
The proposed EWDL-BFSN model utilizes an enhanced optimization-based kernel residual 50 (EK-ResNet50) network for the detection and classification of facial emotions. The EK-ResNet50 network represents a sophisticated approach to enhancing the efficiency and accuracy of emotion detection from facial images. The foundation of this network is the ResNet-50 architecture, a deep convolutional neural network renowned for its capability to handle vanishing gradients and, through residual learning, to make very deep networks trainable. To enhance feature learning and classification, optimization-based approaches and kernel techniques are incorporated. Kernel methods, which are popular for their capability to map input data into higher-dimensional spaces, are specifically utilized to capture the variations and complex patterns in facial expressions. By embedding these kernel strategies into the residual blocks of ResNet-50, the network can acquire the more complex and subtle properties that are essential for differentiating between emotions. The optimization-based techniques improve the network's performance even further by optimizing the learning process: adaptive learning rate adjustments and sophisticated regularization techniques help avoid overfitting and strengthen generalization. Through these modifications, the network learns relevant features more efficiently and remains robust across a variety of datasets.
Furthermore, the residual connections in ResNet-50 mitigate the degradation issue and provide stable gradient flow even in very deep networks by enabling the network to learn identity mappings. For intricate tasks such as FER, where capturing fine-grained features and variations is critical, this stability is significant for sustaining good performance. The EK-ResNet50 network can perform considerably better in real-world scenarios than conventional models, offering faster convergence and greater accuracy. These advantages can greatly benefit real-time emotion detection systems used in psychological analysis, automated customer service, and human-computer interaction. Thus, the strong framework of ResNet-50, combined with sophisticated optimization strategies and kernel techniques, delivers a powerful mechanism for advancing the field of FER. In addition, the ResNet50 model has achieved strong results in the ImageNet classification challenge and addresses the gradient explosion and vanishing issues. Nevertheless, certain issues remain, particularly the inability to learn subtle features, a lengthy operation time, and a large number of parameters. The EWDL-BFSN model proposes the EK-ResNet50 network by considering both the issues of the current ResNet50 [30] and the characteristics of facial emotion images. The EK-ResNet50 network takes the optimal features selected through IBoA as input. The structure of the EK-ResNet50 network is provided in Figure 3, along with the enhancements made to the standard ResNet50 model.
The features of facial emotions are difficult to learn because the richness of detailed information is fundamentally proportional to the number of pixels they occupy. The network input part of ResNet50 consists of a convolution layer with a 7 × 7 kernel, which can learn apparent features. Nonetheless, the complex image background makes it challenging to learn the texture and color information of facial features using a 7 × 7 convolution, which limits the network model's ability to identify facial emotions. More effective subtle-feature learning is necessary to correctly identify facial emotions. Consequently, to better adapt the network design for FER, the 7 × 7 convolution of the first layer is decomposed: in EK-ResNet50, the 7 × 7 convolutional layer is replaced with three stacked 3 × 3 convolutional layers. Besides, a thorough inspection of the ResNet50 structure reveals a large number of 3 × 3 convolutions within the residual module. Thereby, each 3 × 3 convolution is decomposed into 3 × 1 and 1 × 3 convolutions in series. Placing additional nonlinear activation functions between the small convolution layers of the series improves the network's capacity for nonlinear fitting, while the computation involves fewer parameters.
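The parameter savings claimed for the two decompositions can be verified by counting convolution weights. The channel width below is illustrative, not taken from the paper:

```python
def conv_params(kh, kw, c_in, c_out, bias=False):
    # Parameter count of a single convolution layer (weights only by default).
    return kh * kw * c_in * c_out + (c_out if bias else 0)

c = 64  # illustrative channel width, an assumption for this sketch

# Input stem: one 7x7 conv vs. three stacked 3x3 convs (same receptive field).
stem_7x7 = conv_params(7, 7, c, c)
stem_3x3 = 3 * conv_params(3, 3, c, c)

# Residual block: one 3x3 conv vs. a 3x1 followed by a 1x3 conv.
block_3x3 = conv_params(3, 3, c, c)
block_asym = conv_params(3, 1, c, c) + conv_params(1, 3, c, c)

print(stem_7x7, stem_3x3)    # 49c^2 vs. 27c^2: the stacked form is smaller
print(block_3x3, block_asym) # 9c^2 vs. 6c^2: the asymmetric pair is smaller
```

The counts (49c² vs. 27c², and 9c² vs. 6c²) show why the decompositions reduce parameters while keeping the receptive field and adding activation nonlinearities.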
Identity mapping is incorporated into the residual term by the ResNet50 network, and part of the 1 × 1 convolution layer in the identity mapping is employed to match channel dimensions, ensuring that the eigen-matrix summation can be computed. In stages 2-4, the identity mapping of the first residual module is a 1 × 1 convolution layer with a step size of 2. Because the step size exceeds the convolution kernel's size, some image information is never convolutionally processed, making it difficult to learn useful features. The identity mapping tactic is therefore enriched in accordance with the characteristics of facial emotions: in the EK-ResNet50 network, identity mapping is accomplished by employing a convolution layer and a max-pool layer in series. With a step size of 2, the 3 × 3 max-pool layer covers all feature map information while retaining the important features in each area.
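Why the 1 × 1 stride-2 shortcut loses information, while a 3 × 3 stride-2 window does not, can be seen by counting which input pixels each sliding kernel actually reads. A small sketch, assuming no padding:

```python
def pixels_read(size, k, stride):
    """Set of input coordinates touched by a k x k kernel sliding with the
    given stride over a size x size map (no padding), a counting sketch."""
    seen = set()
    for r in range(0, size - k + 1, stride):
        for c in range(0, size - k + 1, stride):
            for dr in range(k):
                for dc in range(k):
                    seen.add((r + dr, c + dc))
    return seen

n = 8
conv_1x1 = pixels_read(n, 1, 2)  # standard ResNet50 shortcut: 1x1, stride 2
pool_3x3 = pixels_read(n, 3, 2)  # EK-ResNet50 shortcut: 3x3 max-pool, stride 2

print(len(conv_1x1), n * n)  # only a quarter of the pixels are ever read
print(len(pool_3x3), n * n)  # overlapping 3x3 windows touch nearly every pixel
```

On an 8 × 8 map, the 1 × 1 stride-2 kernel reads only 16 of 64 pixels, while the overlapping 3 × 3 windows reach 49 (all of them once padding is applied), which is the coverage argument made above.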
To maximize the network's generalization ability, batch normalization (BN) performs global normalization along the batch dimension of the samples. However, BN is sensitive to the batch size, because its mean and standard deviation are computed over the batch; this causes data deviation when evaluating the BN layer during network training. Switchable normalization (SN) [31] stabilizes the network under dissimilar batch sizes, resolves the computation deviation caused by BN, and helps achieve model optimality. In the EK-ResNet50 network, the BN layers in the four phases of the standard ResNet50 model are exchanged with SN layers. This significantly enhances the model's stability as well as its convergence speed in the FER task.
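A minimal sketch of the statistics side of SN [31] may help: the mean and variance applied to each (sample, channel) slice are softmax-weighted blends of instance-norm (IN), layer-norm (LN), and batch-norm (BN) statistics. The function name, tensor layout, and the omission of the learnable affine terms are simplifications of ours, not the paper's implementation:

```python
import math

def softmax(w):
    e = [math.exp(v - max(w)) for v in w]
    s = sum(e)
    return [v / s for v in e]

def switchable_norm_stats(x, w_mean, w_var):
    """Switchable-normalization statistics for x[N][C][L] (spatial dims
    flattened into L): per (sample, channel) mean/variance as softmax-weighted
    blends of IN, LN, and BN statistics. Affine terms and the epsilon-stabilized
    division are omitted in this sketch."""
    N, C = len(x), len(x[0])
    def mv(vals):
        m = sum(vals) / len(vals)
        return m, sum((v - m) ** 2 for v in vals) / len(vals)
    in_s = [[mv(x[n][c]) for c in range(C)] for n in range(N)]             # per sample & channel
    ln_s = [mv([v for c in range(C) for v in x[n][c]]) for n in range(N)]  # per sample
    bn_s = [mv([v for n in range(N) for v in x[n][c]]) for c in range(C)]  # per channel
    pm, pv = softmax(w_mean), softmax(w_var)
    mean = [[pm[0] * in_s[n][c][0] + pm[1] * ln_s[n][0] + pm[2] * bn_s[c][0]
             for c in range(C)] for n in range(N)]
    var = [[pv[0] * in_s[n][c][1] + pv[1] * ln_s[n][1] + pv[2] * bn_s[c][1]
            for c in range(C)] for n in range(N)]
    return mean, var

# With N = C = 1 the three normalizers coincide, so any weighting reproduces
# the plain mean/variance of the activations.
m, v = switchable_norm_stats([[[1.0, 2.0, 3.0]]], [0.3, 0.5, 0.2], [0.0, 0.0, 0.0])
print(m[0][0], v[0][0])
```

Because the blend weights are learned, SN can lean on IN/LN statistics when the batch is small, which is what makes it robust to batch size in a way plain BN is not.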
The walrus optimization algorithm (WOA) is one of the most recent swarm intelligence algorithms inspired by how walruses breed, migrate, roost, escape, gather, and feed based on critical signals (safety and danger signals) [32]. The danger signal is employed in WOA to decide whether to perform the exploration stage or the exploitation stage. During the algorithm's early exploration, the walrus herd moves to a new domain in the solution space if the danger signal reaches specific requirements. On the other hand, the walrus herd reproduces, which defines the exploitation stage. During the exploitation stage, the safety signal is important in defining whether a walrus selects foraging or roosting behavior. In this roosting behavior, male, female, and juvenile walruses communicate with one another in order to move the population in a path that is favorable to survival. Typical foraging characteristics cover gathering and fleeing away, which are managed by danger signals.
In an environment with ordered policing, the walrus herd can accomplish population expansion (looking for the global optimum) and avoid being killed or captured by predators (escaping local optima). Experimentation on different test suites shows that the WOA can handle high-dimensional benchmarks and real-world issues with unique stability properties and highly competitive performance. Moreover, the WOA increases the effectiveness of optimization calculations, encourages the ongoing development and application extension of artificial intelligence, and has become a strong tool for resolving challenging real-world issues. Thereby, the proposed EWDL-BFSN model selects the WOA to tune the hyperparameters of the classification model. Here, each walrus represents a set of tunable parameters, and the gathering behavior of the exploitation stage is imitated to update the optimal hyperparameter values as follows:
$$Y^{u+1}_{j,k}=\frac{Y_1+Y_2}{2}, \tag{19}$$
$$Y_1=Y^{u}_{best}-b_1\times c_1\times\left|Y^{u}_{best}-Y^{u}_{j,k}\right|, \tag{20}$$
$$Y_2=Y^{u}_{second}-b_2\times c_2\times\left|Y^{u}_{second}-Y^{u}_{j,k}\right|, \tag{21}$$
$$b=\chi\times s_1-\chi, \tag{22}$$
$$c=\tan(\varphi), \tag{23}$$
where $Y_1$ and $Y_2$ represent the two weights influencing the gathering behavior of the walrus, $Y^{u}_{second}$ signifies the location of the second walrus (the second-best member) in the current iteration, and $s_1$ indicates a random number between 0 and 1. $\left|Y^{u}_{second}-Y^{u}_{j,k}\right|$ indicates the separation between the current walrus and the second walrus, $b$ and $c$ characterize the gathering coefficients, and $\varphi$ takes values between 0 and $\pi$.
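A scalar sketch of the gathering update in Eqs (19)-(23) follows. The schedule of the coefficient χ is not specified above, so it is passed in by the caller; this is an illustration, not the authors' tuner:

```python
import math
import random

random.seed(1)

def woa_gather(y, y_best, y_second, chi):
    """Gathering-behavior update of the WOA (Eqs (19)-(23)), sketched for a
    scalar position y. chi is supplied by the caller, since its schedule is
    not given here."""
    def pull(y_ref):
        s = random.random()                  # s1, uniform in [0, 1]
        b = chi * s - chi                    # gathering coefficient, Eq (22)
        phi = random.uniform(0.0, math.pi)   # phi in (0, pi)
        c = math.tan(phi)                    # gathering coefficient, Eq (23)
        return y_ref - b * c * abs(y_ref - y)   # Eqs (20)/(21)
    y1 = pull(y_best)
    y2 = pull(y_second)
    return (y1 + y2) / 2.0                   # Eq (19)

# Toy usage: pull a walrus (a hyperparameter value) toward the two leaders.
print(woa_gather(y=5.0, y_best=1.0, y_second=2.0, chi=2.0))
```

Note that with χ = 0 both gathering terms vanish and the update collapses to the midpoint of the best and second positions, which makes the averaging structure of Eq (19) easy to see.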
This section presents the results of the EWDL-BFSN model and specifies its advantages over modern architectures. The EWDL-BFSN model is implemented using the Python platform on a personal computer (PC) with 16 GB of RAM. Testing is performed on an Intel(R) Core(TM) i5-4770 CPU @ 3.20 GHz processor running a 64-bit operating system. For experimentation, the EWDL-BFSN model uses two publicly available datasets, namely the extended Cohn-Kanade (CK+) dataset and the FER-2013 dataset. The learning rate, dropout, and activation function are set to 0.001, 0.3, and ReLU, respectively. The optimizer used in the classification model is the WOA. Numerous performance indicators such as accuracy, sensitivity, specificity, and F1-score are analyzed, and the efficacy of EWDL-BFSN is established over state-of-the-art methods. The sub-sections below cover detailed descriptions of the datasets, performance indicators, performance analysis, and discussion of results.
In the EWDL-BFSN model, two different datasets, namely the CK+ dataset and the FER-2013 dataset, are used for FER experimentation. Comprehensive data for the face emotion classification model is available in FER-2013 from the Kaggle competition; the FER-2013 database was created as part of the ICML 2013 Kaggle challenges [33]. Since then, scientific research on FER has been assessed using this data collection. The images are registered automatically, ensuring that the face in every image is centered and roughly equal in size. The purpose is to use the emotions revealed in a facial expression to classify all faces into eight groups: happiness, sadness, anger, disgust, surprise, contempt, fear, and neutral. This dataset covers 35,887 grayscale images with 48 × 48 pixel resolution. Among all images, 28,709 are used for training and 3589 for testing, and each image is associated with one of the eight emotional states. The other dataset, CK+ [34], is a frequently utilized database for FER investigation. It contains eight expressions from 123 different individuals. In addition, 961 images are included, with each image representing one of the eight basic emotional categories (happiness, sadness, fear, disgust, anger, surprise, neutral, and contempt).
This section provides the descriptions and mathematical formulas for several performance measures, including accuracy, F1-score, sensitivity (also known as recall), and specificity. Classification accuracy is defined as the ratio of correctly recognized samples to the total number of samples. Sensitivity is the ratio of correct positive outcomes to the total number of instances in the positive class. Specificity measures how well the model identifies instances that do not belong to a specified emotion class. The F1-score is a measure that combines recall and precision through their harmonic mean. The expressions below give the equations of the various performance indicators:
$$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}, \tag{24}$$
$$\text{Recall}=\frac{TP}{TP+FN}, \tag{25}$$
$$\text{Specificity}=\frac{TN}{TN+FP}, \tag{26}$$
$$\text{F1-score}=\frac{2\times\text{Precision}\times\text{Recall}}{\text{Precision}+\text{Recall}}, \tag{27}$$
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
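Eqs (24)-(27) can be computed directly from one-vs-rest confusion-matrix counts; the counts below are illustrative only, not taken from the paper's experiments:

```python
def classification_metrics(tp, tn, fp, fn):
    """Metrics of Eqs (24)-(27) from the confusion-matrix counts of one
    emotion class treated one-vs-rest."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq (24)
    recall = tp / (tp + fn)                             # sensitivity, Eq (25)
    specificity = tn / (tn + fp)                        # Eq (26)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)  # Eq (27)
    return accuracy, recall, specificity, f1

# Illustrative counts for one class.
acc, rec, spec, f1 = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print(acc, rec, spec, f1)  # approximately 0.925, 0.9, 0.95, 0.923
```

For a multi-class problem such as FER, these per-class values are typically averaged across the emotion classes.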
This section encompasses a detailed analysis and results comparison for classifying facial emotions as anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. In the evaluation, numerous state-of-the-art models are used to define the effectiveness of the EWDL-BFSN model. Convolutional neural network (CNN), ResNet-101, AlexNet, CNN-VGG-19, inception v3, MobileNet, support vector machine (SVM), ResNet50, HGSO-DLFER, and BIPFER-EOHDL [21,35] are the techniques being compared for classifying facial emotions.
Table 1 presents an analysis of the FER results of the EWDL-BFSN model for the different classes under an 80:20 ratio of training and testing stages using the CK+ dataset. The results reveal that the EWDL-BFSN model performs successfully across all emotion classes (anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise). In the same way, Table 2 offers the corresponding analysis using the FER-2013 dataset under a 70:30 ratio.
Table 1. FER results on the CK+ dataset (training stage 80%, testing stage 20%).

| Class | Accuracy (train) | Sensitivity (train) | Specificity (train) | F1-score (train) | Accuracy (test) | Sensitivity (test) | Specificity (test) | F1-score (test) |
|---|---|---|---|---|---|---|---|---|
| Surprise | 99.24 | 88.22 | 99.51 | 90.82 | 99.56 | 94.15 | 98.97 | 87.82 |
| Sadness | 98.08 | 99.71 | 99.45 | 99.08 | 98.74 | 95.58 | 98.31 | 91.61 |
| Neutral | 99.54 | 99.06 | 99.41 | 96.73 | 99.37 | 97.10 | 98.94 | 95.09 |
| Happiness | 99.70 | 96.00 | 99.71 | 97.39 | 98.71 | 94.05 | 98.25 | 87.55 |
| Fear | 98.52 | 98.01 | 99.79 | 91.90 | 98.61 | 95.52 | 98.72 | 95.89 |
| Disgust | 98.64 | 97.15 | 99.50 | 97.62 | 98.46 | 91.96 | 98.99 | 99.72 |
| Contempt | 99.25 | 96.29 | 100.00 | 99.55 | 98.55 | 92.34 | 100.00 | 100.00 |
| Anger | 99.24 | 96.35 | 99.62 | 96.16 | 98.86 | 94.39 | 98.88 | 93.95 |
Table 2. FER results on the FER-2013 dataset (training stage 70%, testing stage 30%).

| Class | Accuracy (train) | Sensitivity (train) | Specificity (train) | F1-score (train) | Accuracy (test) | Sensitivity (test) | Specificity (test) | F1-score (test) |
|---|---|---|---|---|---|---|---|---|
| Surprise | 99.06 | 88.2 | 99.43 | 90.76 | 99.42 | 94.11 | 98.78 | 87.67 |
| Sadness | 98.03 | 99.67 | 99.41 | 99.05 | 98.46 | 95.52 | 98.24 | 91.46 |
| Neutral | 99.80 | 99.02 | 99.36 | 96.27 | 99.17 | 96.89 | 98.26 | 95.01 |
| Happiness | 99.61 | 95.89 | 99.65 | 97.28 | 98.56 | 94.01 | 98.51 | 87.23 |
| Fear | 98.25 | 98 | 99.71 | 91.82 | 98.47 | 95.05 | 98.27 | 95.35 |
| Disgust | 98.39 | 97.05 | 99.43 | 97.32 | 98.4 | 91.35 | 98.82 | 99.47 |
| Contempt | 99.18 | 96.21 | 99.32 | 99.25 | 98.46 | 92.21 | 99.76 | 100.00 |
| Anger | 99.21 | 96.3 | 99.38 | 96.07 | 98.57 | 94.27 | 98.57 | 93.53 |
The comparison of the accuracy of the proposed EWDL-BFSN and other state-of-the-art methods regarding FER is shown in Figure 4. The comparison demonstrates how accurate the EWDL-BFSN model is compared to CNN, ResNet-101, AlexNet, CNN-VGG-19, Inception v3, MobileNet, SVM, ResNet50, HGSO-DLFER, and BIPFER-EOHDL [21,35]. Using the CK+ and FER-2013 datasets, the overall accuracy accomplished by EWDL-BFSN for FER is 99.37 and 99.25%, respectively, as clearly shown in the graphical representation. The reason for these high values is the usage of EK-ResNet50 with the effective IBoA for feature selection. The state-of-the-art methods show certain insufficiencies in FER compared with the EWDL-BFSN model. The graphical demonstration shows that the ResNet and Naïve Bayes models reach the lowest accuracy scores among the state-of-the-art methods, while the BIPFER-EOHDL model achieves higher accuracy than the other baselines but still lower than that of the EWDL-BFSN model.
The sensitivity of the EWDL-BFSN model in effectively predicting facial emotions on the CK+ and FER-2013 datasets, compared with the state-of-the-art methods, is shown in Figure 5. This comparison demonstrates how well the EWDL-BFSN can predict facial emotions (anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise). The graphical demonstration shows that the EWDL-BFSN model achieves a better recall value than the other state-of-the-art methods. Because the most informative feature vectors are selected through IBoA, the EWDL-BFSN model can classify facial emotion classes with greater exactness. The overall sensitivity achieved by the EWDL-BFSN model is 99.2 and 98.21% on the CK+ and FER-2013 datasets, respectively.
Figure 6 compares the specificity of the EWDL-BFSN model and other state-of-the-art methods in effectively predicting facial emotions using the CK+ and FER-2013 datasets. The graphical demonstration shows that the EWDL-BFSN model has a higher specificity value than the other state-of-the-art methods. The primary cause of this enhancement is the EK-ResNet50 network, which uses the WOA, for classifying facial emotions. Among the state-of-the-art methods, BIPFER-EOHDL has better specificity, closer to that of the EWDL-BFSN model, while ResNet101 and Naïve Bayes have relatively low specificity. Consequently, the EWDL-BFSN model is recognized as effectively identifying the facial emotion classes and correctly rejecting instances that do not belong to a given class.
Figure 7 displays the F1-score of the EWDL-BFSN model and the other state-of-the-art methods for effectively predicting facial emotions using the CK+ and FER-2013 datasets. The EWDL-BFSN model can properly recognize the class labels for given input data. The graphical depiction illustrates that the EWDL-BFSN model achieves F1-scores of around 98.21 and 98.15% for the CK+ and FER-2013 datasets, respectively, outperforming the state-of-the-art methods. Furthermore, the graphical examination clearly demonstrates that optimal parameter selection has allowed the EWDL-BFSN model to surpass numerous deep learning models in terms of F1-score. Consequently, the findings prove that the EWDL-BFSN model outperforms the other state-of-the-art methods in exactly classifying facial emotions.
The comparative analysis of EWDL-BFSN and state-of-the-art models using the CK+ dataset is presented in Table 3, clearly showing that the proposed EWDL-BFSN model has superior performance for FER. Similarly, the comparative analysis of EWDL-BFSN and state-of-the-art models using the FER-2013 dataset is presented in Table 4.
Table 3. Comparative analysis using the CK+ dataset.

| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|
| CNN [21] | 94.76 | 78.5 | 78.53 | 78.08 |
| ResNet-101 [21] | 93.89 | 73.65 | 76.11 | 74.29 |
| AlexNet [21] | 95.88 | 83.25 | 83.1 | 82.5 |
| CNN-VGG19 [35] | 94.03 | 82.95 | 81.59 | 81.75 |
| Inception V3 [35] | 93.74 | 80.23 | 84.06 | 73.82 |
| MobileNet [35] | 92.32 | 89.74 | 83.81 | 86.52 |
| SVM [35] | 91.64 | 89.17 | 82.18 | 87.55 |
| ResNet50 [35] | 88.54 | 90.96 | 83.65 | 85.99 |
| HGSO-DLFER [35] | 98.65 | 98.45 | 84.99 | 87.78 |
| BIPFER-EOHDL [21] | 99.05 | 88.5 | 99 | 90.93 |
| Proposed | 99.37 | 99.2 | 99.48 | 98.21 |
Table 4. Comparative analysis using the FER-2013 dataset (comparison methods from [21]).

| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1-score (%) |
|---|---|---|---|---|
| Naïve Bayes | 40.55 | 40.57 | 45.59 | 40.51 |
| Decision tree | 51.52 | 51.79 | 51.64 | 51.71 |
| SVM | 62.58 | 62.52 | 63.58 | 59.53 |
| Random forest | 62.51 | 63.5 | 64.51 | 60.54 |
| Logistic regression | 51.55 | 51.66 | 49.65 | 49.66 |
| CNN-3 | 80.79 | 81.63 | 81.79 | 81.73 |
| CNN | 64.8 | 61.59 | 64.25 | 61.69 |
| ResNet-101 | 63.43 | 60.11 | 62.99 | 60.52 |
| AlexNet | 65.3 | 61.76 | 64.34 | 62.03 |
| BIPFER-EOHDL | 88.5 | 83.42 | 83.48 | 83.5 |
| Proposed | 99.25 | 98.21 | 97.41 | 98.15 |
The training and testing data from both the CK+ and FER-2013 datasets are combined to examine the accuracy and loss of the EWDL-BFSN model for predicting facial emotions. Around 20% of the data is applied for testing and the remaining 80% is used for training the EWDL-BFSN model. Figure 8 presents the testing and training accuracy of the EWDL-BFSN model for FER. Here, the accuracy of the EWDL-BFSN model is assessed by varying the epoch count from 0 to 300 and observed graphically. The testing and training accuracy curves appear closely matched. Correspondingly, the EWDL-BFSN model trained for 300 epochs and tested on the CK+ and FER-2013 datasets is examined with respect to testing and training loss. As depicted in the graphical representation, the EWDL-BFSN model accomplishes a small loss once the number of epochs is extended. The loss performance obtained on the CK+ and FER-2013 datasets is shown in Figure 9. The proposed study achieves minimal loss due to an efficient data training procedure for the EWDL-BFSN model.
To assess the robustness of each phase of the EWDL-BFSN model, a number of ablation studies are presented for accurately classifying facial emotions. To conduct the ablation investigation, the EWDL-BFSN model is split into four distinct components: module-A, module-B, module-C, and module-D. The effectiveness of each stage in the EWDL-BFSN model is assessed independently based on the accuracy, sensitivity, specificity, and F1-score provided by these modules. Module-A denotes the EWDL-BFSN model operating without pre-processing; module-B indicates that no feature extraction is performed; module-C indicates that no feature selection is performed; and module-D denotes the EWDL-BFSN model performing classification without using the WOA.
The performance obtained by the EWDL-BFSN model in the ablation study is shown in Table 5. Compared to all other modules, module-A shows a decrease in performance. This results from excluding the pre-processing stage: the obtained input image is fed directly into the feature extraction stage, and because of noisy images, the EWDL-BFSN model cannot achieve better performance without pre-processing the input image. In module-B, the pre-processed images are sent directly to IBoA, eliminating the feature extraction procedure; without extracted features, the feature selection step cannot effectively reduce feature dimensionality, which diminishes the EWDL-BFSN model's effectiveness. In the same way, module-C excludes the optimal feature selection process and provides the output of the SqueezeNet feature extraction directly to the classification model. Additionally, module-D analyzes the performance without using the WOA in the EK-ResNet50 network for FER classification and proves that the WOA is necessary to achieve better FER performance. Consequently, every step is important for improving the EWDL-BFSN model's performance in accurately predicting facial emotions.
Table 5. Ablation study results (%).

| Metric | Module-A | Module-B | Module-C | Module-D |
|---|---|---|---|---|
| Accuracy | 93.67 | 95.24 | 94.97 | 96.75 |
| Sensitivity | 93.89 | 94.16 | 95.48 | 96.67 |
| Specificity | 95.17 | 96.78 | 95.89 | 97.73 |
| F1-score | 93.24 | 94.02 | 93.78 | 97.27 |
In this section, the EWDL-BFSN model is evaluated using other standard datasets in order to infer its applicability in different scenarios. For comparison, other state-of-the-art methods are utilized: the deep neural network (DNN) [36], hybrid deep CNN [37], appearance-based fused descriptors model [38], distance and shape signature features–based multilayer perceptron model [39], and deep CNN with bilinear pooling [40]. Table 6 presents the analysis of the EWDL-BFSN model on other standard datasets: the Japanese female facial expression (JAFFE) database, Karolinska directed emotional faces (KDEF), Radboud faces database (RaFD), and Indian movie face database (IMFDB). It is clear that the proposed EWDL-BFSN model performs better than the other state-of-the-art methods on these datasets.
Table 6. Evaluation on other standard datasets.

| Method | Dataset | Accuracy (%) |
|---|---|---|
| Deep neural network (DNN) [36] | JAFFE | 96.91 |
| Hybrid deep CNN [37] | JAFFE | 98.14 |
| Hybrid deep CNN [37] | KDEF | 95.29 |
| Hybrid deep CNN [37] | RaFD | 98.86 |
| Appearance-based fused descriptors model [38] | KDEF | 90.12 |
| Appearance-based fused descriptors model [38] | RaFD | 95.54 |
| Appearance-based fused descriptors model [38] | JAFFE | 96.17 |
| Distance and shape signature features–based multilayer perceptron model [39] | JAFFE | 96.4 |
| Deep CNN with bilinear pooling [40] | IMFDB | 64.17 |
| Proposed model | JAFFE | 99.26 |
| Proposed model | KDEF | 98.76 |
| Proposed model | RaFD | 98.65 |
| Proposed model | IMFDB | 97.56 |
Recently, deep learning–based methods have been shown to be among the most effective for image sentiment analysis. In [41], the sentiment conveyed in tweets was examined regarding a potential emergency at various points within a specified region; two scenarios, binary and multi-class, were considered to test the RASA model, which employed the LSTM technique with word embedding to obtain keywords from tweet feeds and interpretations. A CNN model for human facial expression identification was studied in [42]; seven emotions were predicted and identified using the facial action coding system model, and the findings show that 79.8% accuracy was achieved without the use of optimization techniques. The ResNet18 BNN network, based on the conventional Bayesian neural network design, was introduced in [43] for classifying human facial expressions into seven major classes. When tested on the FER-2013 test set, this model obtained 71.5 and 73.1% accuracy on the PublicTestSet and PrivateTestSet, respectively. Besides, a facial emotion identification system with automatic face detection and facial expression recognition was presented in [44]. Four deep CNNs and a label smoothing technique were employed to deal with mislabeled training data, as opposed to an ensemble model. The accuracy on the ExpW, FER-2013, and SFEW 2.0 datasets was 72.72, 51.97, and 71.82%, respectively. In [45], a face-sensitive CNN (FS-CNN) was offered for the recognition of human emotions. A deep learning–based technique that recognizes online students' real-time activity based on their facial emotions was suggested in [46], and a FER system that can recognize emotions from mask-covered faces was given in [47]. The ability of the developed FER system to identify emotions and their valence only from the eye region was assessed and contrasted with the outcomes attained when the entire face was taken into consideration.
This research presents the EWDL-BFSN model, which integrates numerous sophisticated methods into a comprehensive solution to FER. With a set of well-defined processes covering pre-processing, feature extraction, feature selection, and classification, the EWDL-BFSN model is methodologically designed to improve both the efficacy and the accuracy of emotion detection. Its main advantages are its advanced feature selection and its classification model with parameter tuning. Using GWAF for image pre-processing guarantees the best possible input images for feature extraction. After that, features are extracted from the pre-processed images using SqueezeNet. By choosing the best features, the IBoA further improves the EWDL-BFSN model and ensures that the most pertinent features are used for FER. The EK-ResNet50 network manages the classification and effectively identifies and classifies facial emotions using the chosen features. The WOA, a nature-inspired metaheuristic algorithm, is a crucial constituent in optimizing the tunable parameters of the EK-ResNet50 model, thereby improving its overall performance. Tests of the EWDL-BFSN model on the CK+ and FER-2013 datasets yielded very encouraging results, with overall accuracies of 99.37 and 99.25%, respectively. These measures show the EWDL-BFSN model's superiority over other state-of-the-art techniques for classifying facial emotions. Performance indicators including sensitivity, F1-score, specificity, and accuracy are included to give a thorough assessment of the model's abilities. However, there are some cases in which the model may fail to correctly classify emotions. For instance, the EWDL-BFSN can misclassify emotions when the facial expressions are subtle or ambiguous, such as expressions with low intensity or mixed emotions.
Misclassifications can also result from incomplete training data representation of certain emotions; from variability in illumination, pose, and face occlusion; and from data imbalances where specific emotion classes are underrepresented. To handle failure instances and enhance overall performance, the robustness of the model should be further improved through strategies such as data augmentation and regularization. Furthermore, even though the model works well on the FER-2013 and CK+ datasets, it is still unclear whether it can be employed on other, less regulated datasets. The necessity for exact hyperparameter tuning also highlights a possible area for development, because the model's performance can be compromised by suboptimal hyperparameter selection.
This paper proposes a novel EWDL-BFSN model for effectively identifying facial emotions. The EWDL-BFSN model chooses the best features and tunes classifier hyperparameters in order to recognize facial emotions automatically and effectively. The EWDL-BFSN uses GWAF for pre-processing the collected images and SqueezeNet for extracting key features. The optimal features are chosen by IBoA, whereas emotion recognition and classification are handled by the EK-ResNet50 network. Furthermore, the hyperparameters of the EK-ResNet50 model are optimized through the application of the WOA. The publicly accessible CK+ and FER-2013 datasets are used to train and evaluate the model on the Python platform. A comprehensive simulation study is conducted to verify the strong FER results obtained by the EWDL-BFSN model. According to the simulation results, the FER outcomes of the EWDL-BFSN model are more effective than those of the other current methods; the overall accuracy of the EWDL-BFSN model on the CK+ and FER-2013 datasets is 99.37 and 99.25%, respectively. Even though a high accuracy rate is obtained, the EWDL-BFSN model requires considerable computational resources due to its complexity at both the training and testing phases. Besides, the datasets employed to test the model's performance may not accurately reflect the diversity of real-world situations. To overcome these limitations, future work will develop more effective hybrid models and optimization strategies that minimize computational load without sacrificing performance. To ensure generalizability and robustness, further evaluation will be performed on a wider range of datasets with more variability.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors declare there is no conflict of interest.
[1] Anderson WF, Rosenberg PS, Prat A, et al. (2014) How many etiological subtypes of breast cancer: Two, three, four, or more? J Natl Cancer Inst 106: dju165. doi: 10.1093/jnci/dju165
[2] Sims AH, Howell A, Howell SJ, et al. (2007) Origins of breast cancer subtypes and therapeutic implications. Nat Clin Pract Oncol 4: 516-525. doi: 10.1038/ncponc0908
[3] Paquet ER, Hallett MT (2014) Absolute assignment of breast cancer intrinsic molecular subtype. J Natl Cancer Inst 107: 357.
[4] Perou CM, Sorlie T, Eisen MB, et al. (2000) Molecular portraits of human breast tumours. Nature 406: 747-752. doi: 10.1038/35021093
[5] Sorlie T, Tibshirani R, Parker J, et al. (2003) Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 100: 8418-8423. doi: 10.1073/pnas.0932692100
[6] Amend K, Hicks D, Ambrosone CB (2006) Breast cancer in African-American women: differences in tumor biology from European-American women. Cancer Res 66: 8327-8330. doi: 10.1158/0008-5472.CAN-06-1927
[7] Howlader N, Altekruse SF, Li CI, et al. (2014) US incidence of breast cancer subtypes defined by joint hormone receptor and HER2 status. J Natl Cancer Inst 106: dju055.
[8] Newman LA (2014) Breast cancer disparities: high-risk breast cancer and African ancestry. Surg Oncol Clin N Am 23: 579-592. doi: 10.1016/j.soc.2014.03.014
[9] Dietze EC, Sistrunk C, Miranda-Carboni G, et al. (2015) Triple-negative breast cancer in African-American women: disparities versus biology. Nat Rev Cancer 15: 248-254. doi: 10.1038/nrc3896
[10] Gapstur SM, Dupuis J, Gann P, et al. (1996) Hormone receptor status of breast tumors in black, Hispanic, and non-Hispanic white women. An analysis of 13,239 cases. Cancer 77: 1465-1471.
[11] Hausauer AK, Keegan TH, Chang ET, et al. (2007) Recent breast cancer trends among Asian/Pacific Islander, Hispanic, and African-American women in the US: changes by tumor subtype. Breast Cancer Res 9: R90. doi: 10.1186/bcr1839
[12] Joslyn SA (2002) Hormone receptors in breast cancer: racial differences in distribution and survival. Breast Cancer Res Treat 73: 45-59. doi: 10.1023/A:1015220420400
[13] Tarone RE, Chu KC (2002) The greater impact of menopause on ER- than ER+ breast cancer incidence: a possible explanation (United States). Cancer Causes Control 13: 7-14. doi: 10.1023/A:1013960609008
[14] Dunnwald LK, Rossing MA, Li CI (2007) Hormone receptor status, tumor characteristics, and prognosis: a prospective cohort of breast cancer patients. Breast Cancer Res 9: R6. doi: 10.1186/bcr1639
[15] Bauer KR, Brown M, Cress RD, et al. (2007) Descriptive analysis of estrogen receptor (ER)-negative, progesterone receptor (PR)-negative, and HER2-negative invasive breast cancer, the so-called triple-negative phenotype: a population-based study from the California Cancer Registry. Cancer 109: 1721-1728. doi: 10.1002/cncr.22618
[16] Sineshaw HM, Gaudet M, Ward EM, et al. (2014) Association of race/ethnicity, socioeconomic status, and breast cancer subtypes in the National Cancer Data Base (2010-2011). Breast Cancer Res Treat 145: 753-763. doi: 10.1007/s10549-014-2976-9
[17] Kohler BA, Sherman RL, Howlader N, et al. (2015) Annual report to the nation on the status of cancer, 1975-2011, featuring incidence of breast cancer subtypes by race/ethnicity, poverty, and state. J Natl Cancer Inst 107: djv048.
[18] Morris GJ, Naidu S, Topham AK, et al. (2007) Differences in breast carcinoma characteristics in newly diagnosed African-American and Caucasian patients: a single-institution compilation compared with the National Cancer Institute's Surveillance, Epidemiology, and End Results database. Cancer 110: 876-884. doi: 10.1002/cncr.22836
[19] Hayanga AJ, Newman LA (2007) Investigating the phenotypes and genotypes of breast cancer in women with African ancestry: the need for more genetic epidemiology. Surg Clin North Am 87: 551-568. doi: 10.1016/j.suc.2007.01.003
[20] Newman LA (2015) Disparities in breast cancer and African ancestry: a global perspective. Breast J 21: 133-139. doi: 10.1111/tbj.12369
[21] King MC, Marks JH, Mandell JB (2003) Breast and ovarian cancer risks due to inherited mutations in BRCA1 and BRCA2. Science 302: 643-646. doi: 10.1126/science.1088759
[22] Churpek JE, Walsh T, Zheng Y, et al. (2015) Inherited predisposition to breast cancer among African American women. Breast Cancer Res Treat 149: 31-39. doi: 10.1007/s10549-014-3195-0
[23] Slavich GM, Cole SW (2013) The emerging field of human social genomics. Clin Psychol Sci 1: 331-348. doi: 10.1177/2167702613478594
[24] Geronimus AT (1992) The weathering hypothesis and the health of African-American women and infants: evidence and speculations. Ethn Dis 2: 207-221.
[25] Geronimus AT (1994) The weathering hypothesis and the health of African American women and infants: implications for reproductive strategies and policy analysis. In: Sen G, Snow RC, editors. Power and Decision: The Social Control of Reproduction. Cambridge, MA: Harvard University Press.
[26] Geronimus AT (2001) Understanding and eliminating racial inequalities in women's health in the United States: the role of the weathering conceptual framework. J Am Med Womens Assoc 56: 133-136, 149-150.
[27] Geronimus AT, Thompson JP (2004) To denigrate, ignore, or disrupt: the health impact of policy-induced breakdown of urban African American communities of support. Du Bois Rev 1: 247-279.
[28] Geronimus AT, Hicken M, Keene D, et al. (2006) "Weathering" and age patterns of allostatic load scores among blacks and whites in the United States. Am J Public Health 96: 826-833. doi: 10.2105/AJPH.2004.060749
[29] Geronimus AT, Bound J, Keene D, et al. (2007) Black-white differences in age trajectories of hypertension prevalence among adult women and men, 1999-2002. Ethn Dis 17: 40-48.
[30] Geronimus AT, Hicken MT, Pearson JA, et al. (2010) Do US black women experience stress-related accelerated biological aging? A novel theory and first population-based test of black-white differences in telomere length. Hum Nat 21: 19-38.
[31] Keene DE, Geronimus AT (2011) Community-based support among African American public housing residents. J Urban Health 88: 41-53. doi: 10.1007/s11524-010-9511-z
[32] Geronimus AT (2013) Deep integration: letting the epigenome out of the bottle without losing sight of the structural origins of population health. Am J Public Health 103: S56-S63. doi: 10.2105/AJPH.2013.301380
[33] Williams DR, Mohammed SA, Shields AE (2016) Understanding and effectively addressing breast cancer in African American women: unpacking the social context. Cancer 122: 2138-2149. doi: 10.1002/cncr.29935
[34] Trivers K, Lund M, Porter P, et al. (2009) The epidemiology of triple-negative breast cancer, including race. Cancer Causes Control 20: 1071-1082. doi: 10.1007/s10552-009-9331-1
[35] Kwan ML, Kushi LH, Weltzien E, et al. (2009) Epidemiology of breast cancer subtypes in two prospective cohort studies of breast cancer survivors. Breast Cancer Res 11: R31. doi: 10.1186/bcr2261
[36] Millikan RC, Newman B, Tse CK, et al. (2008) Epidemiology of basal-like breast cancer. Breast Cancer Res Treat 109: 123-139. doi: 10.1007/s10549-007-9632-6
[37] Parise CA, Bauer KR, Brown MM, et al. (2009) Breast cancer subtypes as defined by the estrogen receptor (ER), progesterone receptor (PR), and the human epidermal growth factor receptor 2 (HER2) among women with invasive breast cancer in California, 1999-2004. Breast J 15: 593-602. doi: 10.1111/j.1524-4741.2009.00822.x
[38] Carey LA, Perou CM, Livasy CA, et al. (2006) Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 295: 2492-2502. doi: 10.1001/jama.295.21.2492
[39] Setiawan VW, Monroe KR, Wilkens LR, et al. (2009) Breast cancer risk factors defined by estrogen and progesterone receptor status: the multiethnic cohort study. Am J Epidemiol 169: 1251-1259. doi: 10.1093/aje/kwp036
[40] Gehlert S, Sohmer D, Sacks T, et al. (2008) Targeting health disparities: a model linking upstream determinants to downstream interventions. Health Aff (Millwood) 27: 339-349. doi: 10.1377/hlthaff.27.2.339
[41] Warnecke RB, Oh A, Breen N, et al. (2008) Approaching health disparities from a population perspective: the National Institutes of Health Centers for Population Health and Health Disparities. Am J Public Health 98: 1608-1615. doi: 10.2105/AJPH.2006.102525
[42] Pearlin LI, Schieman S, Fazio EM, et al. (2005) Stress, health, and the life course: some conceptual perspectives. J Health Soc Behav 46: 205-219. doi: 10.1177/002214650504600206
[43] Hill TD, Ross CE, Angel RJ (2005) Neighborhood disorder, psychophysiological distress, and health. J Health Soc Behav 46: 170-186. doi: 10.1177/002214650504600204
[44] Wheaton B (1985) Models for the stress-buffering functions of coping resources. J Health Soc Behav 26: 352-364. doi: 10.2307/2136658
[45] Petticrew M, Fraser JM, Regan MF (1999) Adverse life-events and risk of breast cancer: a meta-analysis. Br J Health Psychol 4: 1-17. doi: 10.1348/135910799168434
[46] Chida Y, Hamer M, Wardle J, et al. (2008) Do stress-related psychosocial factors contribute to cancer incidence and survival? Nat Clin Pract Oncol 5: 466-475. doi: 10.1038/ncponc1134
[47] Holmes TH, Rahe RH (1967) The social readjustment rating scale. J Psychosom Res 11: 213-218. doi: 10.1016/0022-3999(67)90010-4
[48] Brown JS, Meadows SO, Elder GH Jr (2007) Race-ethnic inequality and psychological distress: depressive symptoms from adolescence to young adulthood. Dev Psychol 43: 1295-1311. doi: 10.1037/0012-1649.43.6.1295
[49] Geronimus AT, Pearson JA, Linnenbringer E, et al. (2015) Race-ethnicity, poverty, urban stressors, and telomere length in a Detroit community-based sample. J Health Soc Behav 56: 199-224. doi: 10.1177/0022146515582100
[50] Cheang A, Cooper CL (1985) Psychosocial factors in breast cancer. Stress Med 1: 11-24.
[51] Sorlie T (2004) Molecular portraits of breast cancer: tumour subtypes as distinct disease entities. Eur J Cancer 40: 2667-2675. doi: 10.1016/j.ejca.2004.08.021
[52] Chrousos GP, Torpy DJ, Gold PW (1998) Interactions between the hypothalamic-pituitary-adrenal axis and the female reproductive system: clinical implications. Ann Intern Med 129: 229-240. doi: 10.7326/0003-4819-129-3-199808010-00012
[53] Spiegel D, Butler LD, Giese-Davis J, et al. (2007) Effects of supportive-expressive group therapy on survival of patients with metastatic breast cancer: a randomized prospective trial. Cancer 110: 1130-1138. doi: 10.1002/cncr.22890
[54] Michael YL, Carlson NE, Chlebowski RT, et al. (2009) Influence of stressors on breast cancer incidence in the Women's Health Initiative. Health Psychol 28: 137-146. doi: 10.1037/a0012982
[55] Melhem-Bertrandt A, Conzen S (2010) The relationship between psychosocial stressors and breast cancer biology. Curr Breast Cancer Rep 2: 130-137. doi: 10.1007/s12609-010-0021-5
[56] Cacioppo JT, Hawkley LC (2003) Social isolation and health, with an emphasis on underlying mechanisms. Perspect Biol Med 46: S39-S52. doi: 10.1353/pbm.2003.0049
[57] McClintock MK, Conzen SD, Gehlert S, et al. (2005) Mammary cancer and social interactions: identifying multiple environments that regulate gene expression throughout the life span. J Gerontol B Psychol Sci Soc Sci 60: 32-41.
[58] Williams JB, Pang D, Delgado B, et al. (2009) A model of gene-environment interaction reveals altered mammary gland gene expression and increased tumor growth following social isolation. Cancer Prev Res 2: 850-861. doi: 10.1158/1940-6207.CAPR-08-0238
[59] Hasen NS, O'Leary KA, Auger AP, et al. (2010) Social isolation reduces mammary development, tumor incidence, and expression of epigenetic regulators in wild-type and p53-heterozygotic mice. Cancer Prev Res (Phila) 3: 620-629. doi: 10.1158/1940-6207.CAPR-09-0225
[60] Williams DR, Neighbors HW, Jackson JS (2003) Racial/ethnic discrimination and health: findings from community studies. Am J Public Health 93: 200-208. doi: 10.2105/AJPH.93.2.200
[61] Taylor TR, Williams CD, Makambi KH, et al. (2007) Racial discrimination and breast cancer incidence in US black women: the Black Women's Health Study. Am J Epidemiol 166: 46-54. doi: 10.1093/aje/kwm056
[62] Krieger N, Jahn JL, Waterman PD (2017) Jim Crow and estrogen-receptor-negative breast cancer: US-born black and white non-Hispanic women, 1992-2012. Cancer Causes Control 28: 49-59. doi: 10.1007/s10552-016-0834-2
[63] Williams DR, Collins C (2001) Racial residential segregation: a fundamental cause of racial disparities in health. Public Health Rep 116: 404-416. doi: 10.1016/S0033-3549(04)50068-7
[64] Schulz A, Williams DR, Israel BA, et al. (2002) Racial and spatial relations as fundamental determinants of health in Detroit. Milbank Q 80: 677-707. doi: 10.1111/1468-0009.00028
[65] Massey DS, Denton NA (1993) American Apartheid: Segregation and the Making of the Underclass. Cambridge, MA: Harvard University Press.
[66] Barrett RE, Cho YI, Weaver KE, et al. (2008) Neighborhood change and distant metastasis at diagnosis of breast cancer. Ann Epidemiol 18: 43-47.
[67] Sampson RJ, Morenoff JD, Earls F (1999) Beyond social capital: spatial dynamics of collective efficacy for children. Am Sociol Rev 64: 633-660. doi: 10.2307/2657367
[68] Warner E, Gomez S (2010) Impact of neighborhood racial composition and metropolitan residential segregation on disparities in breast cancer stage at diagnosis and survival between black and white women in California. J Community Health 35: 398-408. doi: 10.1007/s10900-010-9265-2
[69] Linnenbringer E (2014) Social constructions, biological implications: a structural examination of racial disparities in breast cancer subtype. Ann Arbor, MI: University of Michigan.
[70] Linnenbringer E, Geronimus AT (2014) Neighborhood SES, racial concentration, and hormone receptor status among California women diagnosed with breast cancer. Population Association of America Annual Meeting. Boston, MA.
[71] Sampson RJ, Morenoff JD, Gannon-Rowley T (2002) Assessing "neighborhood effects": social processes and new directions in research. Annu Rev Sociol 28: 443-478.
[72] Bernard P, Charafeddine R, Frohlich KL, et al. (2007) Health inequalities and place: a theoretical conception of neighbourhood. Soc Sci Med 65: 1839-1852. doi: 10.1016/j.socscimed.2007.05.037
[73] Cummins S, Curtis S, Diez-Roux AV, et al. (2007) Understanding and representing 'place' in health research: a relational approach. Soc Sci Med 65: 1825-1838. doi: 10.1016/j.socscimed.2007.05.036
[74] Macintyre S, Ellaway A, Cummins S (2002) Place effects on health: how can we conceptualise, operationalise and measure them? Soc Sci Med 55: 125-139. doi: 10.1016/S0277-9536(01)00214-3
[75] Pickett KE, Wilkinson RG (2008) People like us: ethnic group density effects on health. Ethn Health 13: 321-334. doi: 10.1080/13557850701882928
[76] Bécares L, Nazroo J, Stafford M (2009) The buffering effects of ethnic density on experienced racism and health. Health Place 15: 670-678.
[77] Bécares L, Shaw R, Nazroo J, et al. (2012) Ethnic density effects on physical morbidity, mortality, and health behaviors: a systematic review of the literature. Am J Public Health 102: e33-e66.
[78] Keene D, Bader M, Ailshire J (2013) Length of residence and social integration: the contingent effects of neighborhood poverty. Health Place 21: 171-178. doi: 10.1016/j.healthplace.2013.02.002
[79] Small ML (2004) Villa Victoria: The Transformation of Social Capital in a Boston Barrio. Chicago: University of Chicago Press.
[80] Tach LM (2009) More than bricks and mortar: neighborhood frames, social processes, and the mixed-income redevelopment of a public housing project. City Community 8: 269-299. doi: 10.1111/j.1540-6040.2009.01289.x
[81] Pearlin LI (1989) The sociological study of stress. J Health Soc Behav 30: 241-256. doi: 10.2307/2136956
[82] Taylor SE, Repetti RL, Seeman T (1997) Health psychology: what is an unhealthy environment and how does it get under the skin? Annu Rev Psychol 48: 411-447. doi: 10.1146/annurev.psych.48.1.411
[83] Massey DS (2004) Segregation and stratification: a biosocial perspective. Du Bois Rev 1: 7-25.
[84] Jackson JS, Knight KM (2006) Race and self-regulatory health behaviors: the role of the stress response and the HPA axis in physical and mental health disparities. In: Schaie KW, Carstensen LL, editors. Social Structures, Aging, and Self-Regulation in the Elderly. New York: Springer, 189-207.
[85] Cole SW, Hawkley LC, Arevalo JM, et al. (2007) Social regulation of gene expression in human leukocytes. Genome Biol 8: R189.
[86] Traustadóttir T, Bosch PR, Matt KS (2005) The HPA axis response to stress in women: effects of aging and fitness. Psychoneuroendocrinology 30: 392-402. doi: 10.1016/j.psyneuen.2004.11.002
[87] McEwen BS (1998) Protective and damaging effects of stress mediators. N Engl J Med 338: 171-179. doi: 10.1056/NEJM199801153380307
[88] McEwen BS, Wingfield JC (2003) The concept of allostasis in biology and biomedicine. Horm Behav 43: 2-15. doi: 10.1016/S0018-506X(02)00024-7
[89] Parente V, Hale L, Palermo T (2013) Association between breast cancer and allostatic load by race: National Health and Nutrition Examination Survey 1999-2008. Psychooncology 22: 621-628. doi: 10.1002/pon.3044
[90] Morland K, Wing S, Diez Roux A, et al. (2002) Neighborhood characteristics associated with the location of food stores and food service places. Am J Prev Med 22: 23-29. doi: 10.1016/S0749-3797(01)00403-2
[91] Moore LV, Diez Roux AV (2006) Associations of neighborhood characteristics with the location and type of food stores. Am J Public Health 96: 325-331. doi: 10.2105/AJPH.2004.058040
[92] Zenk SN, Schulz AJ, Israel BA, et al. (2006) Fruit and vegetable access differs by community racial composition and socioeconomic position in Detroit, Michigan. Ethn Dis 16: 275-280.
[93] Baker EA, Schootman M, Barnidge E, et al. (2006) The role of race and poverty in access to foods that enable individuals to adhere to dietary guidelines. Prev Chronic Dis 3: A76.
[94] Jackson JS, Knight KM, Rafferty JA (2010) Race and unhealthy behaviors: chronic stress, the HPA axis, and physical and mental health disparities over the life course. Am J Public Health 100: 933-939. doi: 10.2105/AJPH.2008.143446
[95] Zhang SM, Hankinson SE, Hunter DJ, et al. (2005) Folate intake and risk of breast cancer characterized by hormone receptor status. Cancer Epidemiol Biomarkers Prev 14: 2004-2008. doi: 10.1158/1055-9965.EPI-05-0083
[96] Boggs DA, Palmer JR, Wise LA, et al. (2010) Fruit and vegetable intake in relation to risk of breast cancer in the Black Women's Health Study. Am J Epidemiol 172: 1268-1279. doi: 10.1093/aje/kwq293
[97] Esteller M (2008) Epigenetics in cancer. N Engl J Med 358: 1148-1159. doi: 10.1056/NEJMra072067
[98] Gronbaek K, Hother C, Jones PA (2007) Epigenetic changes in cancer. APMIS 115: 1039-1059. doi: 10.1111/j.1600-0463.2007.apm_636.xml.x
[99] Ahuja N, Li Q, Mohan AL, et al. (1998) Aging and DNA methylation in colorectal mucosa and cancer. Cancer Res 58: 5489-5494.
[100] Issa JP (2000) CpG-island methylation in aging and cancer. Curr Top Microbiol Immunol 249: 101-118.
[101] Szyf M, McGowan P, Meaney MJ (2008) The social environment and the epigenome. Environ Mol Mutagen 49: 46-60. doi: 10.1002/em.20357
[102] Fraga MF, Ballestar E, Paz MF, et al. (2005) Epigenetic differences arise during the lifetime of monozygotic twins. Proc Natl Acad Sci U S A 102: 10604-10609. doi: 10.1073/pnas.0500398102
[103] Jovanovic J, Rønneberg JA, Tost J, et al. (2010) The epigenetics of breast cancer. Mol Oncol 4: 242-254. doi: 10.1016/j.molonc.2010.04.002
[104] Weigel RJ, deConinck EC (1993) Transcriptional control of estrogen receptor in estrogen receptor-negative breast carcinoma. Cancer Res 53: 3472-3474.
[105] Ferguson AT, Lapidus RG, Baylin SB, et al. (1995) Demethylation of the estrogen receptor gene in estrogen receptor-negative breast cancer cells can reactivate estrogen receptor gene expression. Cancer Res 55: 2279-2283.
[106] Wei M, Xu J, Dignam J, et al. (2007) Estrogen receptor alpha, BRCA1, and FANCF promoter methylation occur in distinct subsets of sporadic breast cancers. Breast Cancer Res Treat 111: 113-120.
[107] Gaudet MM, Campan M, Figueroa JD, et al. (2009) DNA hypermethylation of ESR1 and PGR in breast cancer: pathologic and epidemiologic associations. Cancer Epidemiol Biomarkers Prev 18: 3036-3043. doi: 10.1158/1055-9965.EPI-09-0678
[108] Feng W, Shen L, Wen S, et al. (2007) Correlation between CpG methylation profiles and hormone receptor status in breast cancers. Breast Cancer Res 9: R57. doi: 10.1186/bcr1762
[109] Widschwendter M, Siegmund KD, Muller HM, et al. (2004) Association of breast cancer DNA methylation profiles with hormone receptor status and response to tamoxifen. Cancer Res 64: 3807-3813. doi: 10.1158/0008-5472.CAN-03-3852
[110] Christensen BC, Kelsey KT, Zheng S, et al. (2010) Breast cancer DNA methylation profiles are associated with tumor size and alcohol and folate intake. PLoS Genet 6: e1001043. doi: 10.1371/journal.pgen.1001043
[111] Cole SW (2010) Elevating the perspective on human stress genomics. Psychoneuroendocrinology 35: 955-962. doi: 10.1016/j.psyneuen.2010.06.008
Cited by: 1. Bhuia MS, Chowdhury R, Afroz M, et al. (2025) Therapeutic efficacy studies on the monoterpenoid hinokitiol in the treatment of different types of cancer. Chem Biodivers. doi: 10.1002/cbdv.202401904
Per-class performance (%) with an 80%/20% train/test split:

| Class | Accuracy (train) | Sensitivity (train) | Specificity (train) | F1-score (train) | Accuracy (test) | Sensitivity (test) | Specificity (test) | F1-score (test) |
|---|---|---|---|---|---|---|---|---|
| Surprise | 99.24 | 88.22 | 99.51 | 90.82 | 99.56 | 94.15 | 98.97 | 87.82 |
| Sadness | 98.08 | 99.71 | 99.45 | 99.08 | 98.74 | 95.58 | 98.31 | 91.61 |
| Neutral | 99.54 | 99.06 | 99.41 | 96.73 | 99.37 | 97.10 | 98.94 | 95.09 |
| Happiness | 99.70 | 96.00 | 99.71 | 97.39 | 98.71 | 94.05 | 98.25 | 87.55 |
| Fear | 98.52 | 98.01 | 99.79 | 91.90 | 98.61 | 95.52 | 98.72 | 95.89 |
| Disgust | 98.64 | 97.15 | 99.50 | 97.62 | 98.46 | 91.96 | 98.99 | 99.72 |
| Contempt | 99.25 | 96.29 | 100.00 | 99.55 | 98.55 | 92.34 | 100.00 | 100.00 |
| Anger | 99.24 | 96.35 | 99.62 | 96.16 | 98.86 | 94.39 | 98.88 | 93.95 |
Per-class performance (%) with a 70%/30% train/test split:

| Class | Accuracy (train) | Sensitivity (train) | Specificity (train) | F1-score (train) | Accuracy (test) | Sensitivity (test) | Specificity (test) | F1-score (test) |
|---|---|---|---|---|---|---|---|---|
| Surprise | 99.06 | 88.2 | 99.43 | 90.76 | 99.42 | 94.11 | 98.78 | 87.67 |
| Sadness | 98.03 | 99.67 | 99.41 | 99.05 | 98.46 | 95.52 | 98.24 | 91.46 |
| Neutral | 99.80 | 99.02 | 99.36 | 96.27 | 99.17 | 96.89 | 98.26 | 95.01 |
| Happiness | 99.61 | 95.89 | 99.65 | 97.28 | 98.56 | 94.01 | 98.51 | 87.23 |
| Fear | 98.25 | 98 | 99.71 | 91.82 | 98.47 | 95.05 | 98.27 | 95.35 |
| Disgust | 98.39 | 97.05 | 99.43 | 97.32 | 98.4 | 91.35 | 98.82 | 99.47 |
| Contempt | 99.18 | 96.21 | 99.32 | 99.25 | 98.46 | 92.21 | 99.76 | 100.00 |
| Anger | 99.21 | 96.3 | 99.38 | 96.07 | 98.57 | 94.27 | 98.57 | 93.53 |
Comparison with existing methods on the CK+ dataset (performances in %):

| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| CNN [21] | 94.76 | 78.5 | 78.53 | 78.08 |
| ResNet-10 [21] | 93.89 | 73.65 | 76.11 | 74.29 |
| AlexNet [21] | 95.88 | 83.25 | 83.1 | 82.5 |
| CNN-VGG19 [35] | 94.03 | 82.95 | 81.59 | 81.75 |
| Inception V3 [35] | 93.74 | 80.23 | 84.06 | 73.82 |
| MobileNet [35] | 92.32 | 89.74 | 83.81 | 86.52 |
| SVM [35] | 91.64 | 89.17 | 82.18 | 87.55 |
| ResNet50 [35] | 88.54 | 90.96 | 83.65 | 85.99 |
| HGSO-DLFER [35] | 98.65 | 98.45 | 84.99 | 87.78 |
| BIPFER-EOHDL [21] | 99.05 | 88.5 | 99 | 90.93 |
| Proposed | 99.37 | 99.2 | 99.48 | 98.21 |
Comparison with existing methods on the FER-2013 dataset (baselines reported in [21]; performances in %):

| Methods | Accuracy | Sensitivity | Specificity | F1-score |
|---|---|---|---|---|
| Naïve Bayes | 40.55 | 40.57 | 45.59 | 40.51 |
| Decision tree | 51.52 | 51.79 | 51.64 | 51.71 |
| SVM | 62.58 | 62.52 | 63.58 | 59.53 |
| Random forest | 62.51 | 63.5 | 64.51 | 60.54 |
| Logistic regression | 51.55 | 51.66 | 49.65 | 49.66 |
| CNN-3 | 80.79 | 81.63 | 81.79 | 81.73 |
| CNN | 64.8 | 61.59 | 64.25 | 61.69 |
| ResNet-101 | 63.43 | 60.11 | 62.99 | 60.52 |
| AlexNet | 65.3 | 61.76 | 64.34 | 62.03 |
| BIPFER-EOHDL | 88.5 | 83.42 | 83.48 | 83.5 |
| Proposed | 99.25 | 98.21 | 97.41 | 98.15 |
| Metrics | Module-A | Module-B | Module-C | Module-D |
|---|---|---|---|---|
| Accuracy | 93.67 | 95.24 | 94.97 | 96.75 |
| Sensitivity | 93.89 | 94.16 | 95.48 | 96.67 |
| Specificity | 95.17 | 96.78 | 95.89 | 97.73 |
| F1-score | 93.24 | 94.02 | 93.78 | 97.27 |
| Method | Dataset | Accuracy (%) |
|---|---|---|
| Deep neural network (DNN) [36] | JAFFE | 96.91 |
| Hybrid deep CNN [37] | JAFFE | 98.14 |
| Hybrid deep CNN [37] | KDEF | 95.29 |
| Hybrid deep CNN [37] | RaFD | 98.86 |
| Appearance-based fused descriptors model [38] | KDEF | 90.12 |
| Appearance-based fused descriptors model [38] | RaFD | 95.54 |
| Appearance-based fused descriptors model [38] | JAFFE | 96.17 |
| Distance and shape signature features–based multilayer perceptron model [39] | JAFFE | 96.4 |
| Deep CNN with bilinear pooling [40] | IMFDB | 64.17 |
| Proposed model | JAFFE | 99.26 |
| Proposed model | KDEF | 98.76 |
| Proposed model | RaFD | 98.65 |
| Proposed model | IMFDB | 97.56 |
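The per-class accuracy, sensitivity, specificity, and F1 scores reported in the tables above are one-vs-rest quantities that can be derived from a multi-class confusion matrix. A minimal sketch of that computation follows; the function name, variable names, and the example matrix are illustrative, not taken from the paper.

```python
import numpy as np

def per_class_metrics(cm):
    """One-vs-rest accuracy, sensitivity, specificity, and F1 (in %) per class.

    cm: square confusion matrix, cm[i, j] = number of samples of true
        class i that were predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                  # correctly predicted per class
    fn = cm.sum(axis=1) - tp          # row sums: true class i, predicted other
    fp = cm.sum(axis=0) - tp          # column sums: others predicted as i
    tn = total - tp - fn - fp
    return {
        "accuracy": 100 * (tp + tn) / total,
        "sensitivity": 100 * tp / (tp + fn),   # a.k.a. recall
        "specificity": 100 * tn / (tn + fp),
        "f1": 100 * 2 * tp / (2 * tp + fp + fn),
    }

# Example: a hypothetical 3-class confusion matrix.
cm = [[50, 2, 1],
      [3, 45, 2],
      [0, 4, 43]]
m = per_class_metrics(cm)
```

Note that one-vs-rest accuracy counts true negatives, so it is typically much higher than overall multi-class accuracy; this is why every per-class accuracy in the tables sits near the high nineties.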