
Sign language (SL) recognition for individuals with hearing disabilities involves leveraging machine learning (ML) and computer vision (CV) approaches for interpreting and understanding SL gestures. By employing cameras and deep learning (DL) approaches, namely convolutional neural networks (CNNs) and recurrent neural networks (RNNs), these models analyze the facial expressions, hand movements, and body gestures associated with SL. The major challenges in SL recognition comprise the diversity of signs, differences in signing styles, and the need to recognize the context in which signs are used. Therefore, this manuscript develops a sign language recognition using Improved Coyote Optimization Algorithm with Deep Learning (SLR-ICOADL) technique for hearing disabled persons. The goal of the SLR-ICOADL technique is to accomplish an accurate detection model that enables communication for persons using SL as a primary form of expression. At the initial stage, the SLR-ICOADL technique applies a bilateral filtering (BF) approach for noise elimination. Following this, the SLR-ICOADL technique uses Inception-ResNetv2 for feature extraction. Meanwhile, the ICOA is utilized to select the optimal hyperparameter values of the DL model. At last, the extreme learning machine (ELM) classification model is utilized for the recognition of various kinds of signs. To exhibit the better performance of the SLR-ICOADL approach, a detailed set of experiments is performed. The experimental outcomes emphasize that the SLR-ICOADL technique gains promising performance in the SL detection process.
Citation: Mashael M Asiri, Abdelwahed Motwakel, Suhanda Drar. Robust sign language detection for hearing disabled persons by Improved Coyote Optimization Algorithm with deep learning[J]. AIMS Mathematics, 2024, 9(6): 15911-15927. doi: 10.3934/math.2024769
Generally, persons with disabilities such as deafness cannot easily converse with hearing people, so SL helps those who are unable to communicate verbally [1]. In SL, many kinds of motions with numerous hand shapes are used. SL is one of the chief methods of communication between deaf and hearing people. SL detection systems are an effective way to communicate with deaf and mute people [2]. A huge number of deaf and mute people exist across the world, and it is often difficult for hearing people to talk with them since not everybody can recognize SL. To create effective communication between hearing and disabled persons, there is a need to boost the usage of SL detection systems [3]. Non-verbal communication helps deaf and mute people to talk among themselves or with others without any hassle. Deafness refers to a disability that harms a person's hearing capability and makes them unable to listen, whereas muteness refers to an incapacity that damages speaking skills and makes a person unable to talk [4]. If a person is not able to communicate or hear, it is very hard for them to interact with other persons. For this reason, SL plays a vital part: it allows an individual to talk without words [5]. However, one of the main issues is that not many people know SL. Deaf and mute people are capable of talking among themselves by employing SL, but communicating with hearing people is difficult for them due to the lack of SL knowledge. This problem can be addressed by technology-driven solutions [6]. By employing such a solution, people can effortlessly convert SL gestures into general spoken language.
SL detection refers to the use of procedures and models employed to detect a series of gestures and convey their meaning as text or speech [7]. This field spans numerous research areas, such as pattern recognition, human-computer interaction (HCI), natural language processing (NLP), CV, and video acquisition and processing. It is a challenging topic with high difficulty [8]. Currently, typical SL detection systems are primarily separated into two kinds, namely machine vision-based schemes and sensor-based schemes. Automated recognition of human signs is a difficult multi-disciplinary problem that is not completely resolved [9]. Presently, numerous models based on ML techniques have been proposed for SL recognition, and many efforts have been made to detect human signs through the development of DL methods [10]. A DL-based system learns the relevant representations through its architecture, whereas in traditional networks the learning process is largely driven manually.
This manuscript develops an SL detection via an Improved Coyote Optimization Algorithm with DL (SLR-ICOADL) technique for hearing disabled persons. The goal of the SLR-ICOADL technique is to accomplish an accurate detection model that enables communication for persons using SL as a primary form of expression. At the initial stage, the SLR-ICOADL technique applies a bilateral filtering (BF) approach for noise elimination. Following this, the SLR-ICOADL technique uses Inception-ResNetv2 for feature extraction. Meanwhile, the ICOA is utilized to select the optimal hyperparameter values of the DL model. Finally, the extreme learning machine (ELM) classification model is utilized for the recognition of various kinds of signs. The experimental outcomes highlight that the SLR-ICOADL technique gains promising performance in the SL detection process.
Al-onazi et al. [11] presented an Arabic SL gesture classification using the Deer Hunting Optimizer with ML (ASLGC-DHOML) technique. This method mainly pre-processed the input gesture images and produced feature vectors through the DenseNet169 framework. For gesture detection and classification, an MLP algorithm was utilized to recognize and categorize the presented SL gestures. Also, the DHO method was implemented for parameter optimization of the MLP method. In [12], an Indian SL recognition technique was developed by applying ML methods. The developed approach makes use of different images of individuals representing the alphabet in Indian SL. These images are pre-processed, and the technique then employs them for training and testing the ML method.
Aliyev et al. [13] implemented object recognition and classification by leveraging pre-trained lightweight CNN methods. Initially, a database of close to a thousand images was gathered, and the objects of interest in the images were labeled with bounding boxes. For designing, training, analyzing, and deploying the applicable model, the TensorFlow Object Detection API with Python was used, and a pre-trained MobileNet-v2 model was leveraged for these tasks. In [14], an Enhanced Bald Eagle Search Optimization with Transfer Learning-based SL Recognition (EBESO-TLSLR) method was developed. In this model, the SqueezeNet architecture was utilized for feature map generation. To recognize the SL types, the long short-term memory (LSTM) model was employed. Further, the EBESO technique was utilized for optimal hyperparameter selection of the LSTM model.
In [15], a solution was introduced utilizing an ML technique that can recognize hand gestures and convert them into text or speech, and conversely transform speech or text into gestures. A webcam is employed for capturing the region of interest (RoI), that is, identifying the hand movement and the gestures being represented. According to the identified gestures, the corresponding recording is played. In [16], a method was developed that employs a visual hand database based on Arabic SL and converts this visual information into textual data. The database comprises Arabic SL alphabets, and each category signifies a different meaning through its hand sign or gesture. Numerous data augmentation and preprocessing methods were applied to the images. The analysis was carried out by employing diverse pre-trained models on the specified database.
The authors of [17] considered a new supervised learning technique for ISL identification that goes beyond standard classification approaches. This algorithm utilizes advanced models devised for classifying novel interpretations. With the help of a wide-ranging database, complex patterns and nuances were recognized, promoting correct and adaptable decision-making. A CNN method was implemented, not only for data classification but also for refining classification boundaries and iterative learning. Marais et al. [18] examined signer-independent SLR on the LSA64 database and considered various feature extraction methods. Two of these techniques were an InceptionV3-GRU framework that employs raw images as input and a pose-estimation LSTM model. MediaPipe Holistic was applied to extract pose-estimation landmark coordinates. Lastly, a third architecture applies augmentation and TL to the pose-estimation LSTM method.
In this manuscript, a novel SLR-ICOADL methodology is developed for hearing disabled persons. The goal of the SLR-ICOADL technique is to accomplish an accurate detection model that enables communication for persons using SL as a primary form of expression. Figure 1 demonstrates the entire procedure of the proposed SLR-ICOADL methodology.
Initially, the SLR-ICOADL technique applies the BF approach for noise elimination. BF is a non-linear image filter used for noise removal while preserving edges and fine details [19]. Unlike linear filters, which execute a weighted average over a neighborhood of pixel values, BF considers both the spatial and the intensity differences among pixels. It allocates weights depending on both spatial distance and intensity similarity, ensuring that pixels with similar intensity contribute more to the filtered value. In the context of noise elimination, BF is particularly effective at smoothing images while maintaining sharp transitions at edges. This is accomplished by decreasing the influence of pixels with markedly different intensities, thus avoiding blurring across edges. This adaptability makes BF useful for different kinds of noise, including salt-and-pepper noise, as it selectively smooths regions with consistent intensity while preserving edges and details.
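A minimal Python/OpenCV sketch of this denoising step is given below. The neighborhood diameter and the two sigma values are illustrative assumptions rather than tuned settings, and the file names are placeholders.

```python
# Illustrative bilateral-filtering sketch (assumed parameters, placeholder file names).
import cv2

image = cv2.imread("sign_frame.png")            # hypothetical input gesture frame (BGR)
if image is None:
    raise FileNotFoundError("sign_frame.png not found")

denoised = cv2.bilateralFilter(
    image,
    d=9,             # diameter of the pixel neighborhood
    sigmaColor=75,   # intensity-similarity weight: larger values mix more dissimilar colors
    sigmaSpace=75,   # spatial weight: larger values let farther pixels influence each other
)
cv2.imwrite("sign_frame_bf.png", denoised)       # edge-preserving, noise-reduced output
```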
At this stage, the SLR-ICOADL technique uses Inception-ResNetv2 for feature extraction, which integrates inception modules and residual connections for enhanced performance [20]. This hybrid design allows the network to exploit the advantages of both approaches, with a shorter training period while averting vanishing-gradient issues. In addition, residual links permit the network to skip a few layers at training time. Also, Inception-ResNetv2 uses multiple kernel sizes within a single layer to extract patterns at assorted scales, enhancing the network's capability to capture features of varying complexity.
The Inception-ResNet architecture has numerous blocks that contain filter concatenation, convolutional layers, residual (ResNet) connections, ReLU activation functions, and inception structures. Its unique design and effective usage of filters have produced notable results in medical image detection challenges. Therefore, we use the pre-trained Inception-ResNetV2 model as the base for our research.
The Inception-ResNetv2 technique is commonly used in CT-scan image detection due to its capability to extract important features from medical images. This DL framework integrates Inception and ResNet, two well-known DL techniques. Inception modules are useful due to their ability to extract features at multiple scales, whereas ResNet efficiently solves the problem of vanishing gradients during training. The Inception-ResNetv2 technique is designed to overcome the limits of these methods and achieve high accuracy in image detection challenges. Furthermore, this technique effectively handles huge datasets with reduced computation time, making it a desirable method for image analysis.
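A minimal sketch of using Inception-ResNetV2 purely as a feature extractor is given below. The 299×299 input size and global average pooling are standard defaults for this backbone, while the image file name is only a placeholder.

```python
# Illustrative feature-extraction sketch with an ImageNet-pretrained Inception-ResNetV2.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_resnet_v2 import (
    InceptionResNetV2, preprocess_input)

# Backbone without the classification head; global average pooling yields one vector per image.
backbone = InceptionResNetV2(include_top=False, weights="imagenet", pooling="avg")

img = tf.keras.utils.load_img("sign_frame_bf.png", target_size=(299, 299))
x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]   # shape (1, 299, 299, 3)
features = backbone.predict(preprocess_input(x))        # shape (1, 1536) feature vector
print(features.shape)
```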
In this work, the ICOA is utilized to select the optimal hyperparameter values of the DL model. The COA is a recent optimization model inspired by the behavior of coyotes in their environment and the cultural exchange among coyotes [21]. The coyote is a clever mammal that resides mostly in North and Central America. Despite large-scale attempts to eradicate coyotes in the west, the animal spread into the east and its population increased rather than declined.
In the COA, the social performance of coyotes is measured as a cost function, given by
$$SOC_c^{p,t} = [x_1, x_2, \ldots, x_D] \tag{1}$$
where $SOC_c^{p,t}$ specifies the social condition of coyote $c$, $p$ defines the group (pack), and $t$ designates the simulation time.
Initially, a few arbitrary coyote candidates are designated as solution candidates in this space. This procedure is expressed as
$$SOC_{c,j}^{p,t} = Lr_j + r \times (Ur_j - Lr_j) \tag{2}$$
where $r \in [0,1]$ is a random value, $Ur_j$ defines the upper bound, and $Lr_j$ denotes the lower bound of the $j$th variable in the search space.
An objective function for every coyote is given as
$$obj_c^{p,t} = f\left(SOC_c^{p,t}\right) \tag{3}$$
Here, coyotes can occasionally leave one pack and join another; the probability of this exchange is given by
$$P_1 = 0.005 \times N_c^2 \tag{4}$$
where $N_c$ represents the number of coyotes in a pack, which is kept less than or equal to $\sqrt{200}$ so that $P_1$ does not exceed 1. The pack population of coyotes was therefore restricted to 14 to improve model diversity.
An optimal candidate is defined as an alpha coyote, which is demonstrated by the formula
$$\alpha^{p,t} = \left\{ soc_c^{p,t} \;\middle|\; c = \arg\min_{c} f\left(soc_c^{p,t}\right) \right\} \tag{5}$$
The cultural tendency of the coyotes in a pack is computed as the median of their social conditions:
$$cul_j^{p,t} = \begin{cases} R_{\frac{N_c+1}{2},\,j}^{p,t}, & N_c \text{ is odd} \\[4pt] \dfrac{1}{2}\left(R_{\frac{N_c}{2},\,j}^{p,t} + R_{\frac{N_c}{2}+1,\,j}^{p,t}\right), & \text{otherwise} \end{cases} \tag{6}$$
Here, $R^{p,t}$ denotes the ranked social conditions of the coyotes in pack $p$ at time $t$ for variable $j$.
Further, the birth of a new coyote (pup) depends on a combination of the social conditions of two parent coyotes and environmental factors, arithmetically expressed as
$$Ble_j^{p,t} = \begin{cases} soc_{r_1,j}^{p,t}, & r_j < pr_s \ \text{or} \ j = j_1 \\[2pt] soc_{r_2,j}^{p,t}, & r_j \geq pr_s + pr_a \ \text{or} \ j = j_2 \\[2pt] \sigma_j, & \text{otherwise} \end{cases} \tag{7}$$
where $r_j$ is a random value in the range [0, 1], $\sigma_j$ is a random value within the bounds of the $j$th design variable, $r_1$ and $r_2$ designate random coyotes in pack $p$, $j_1$ and $j_2$ signify two random design variables, $pr_a$ describes the association probability, and $pr_s$ refers to the scatter probability that controls the cultural diversity of the coyotes in the pack, obtained by
$$pr_s = \frac{1}{d} \tag{8}$$
$$pr_a = \frac{1 - pr_s}{2} \tag{9}$$
where $d$ signifies the dimension of the design variables.
The survival of the new pup ($Ble$) is decided as follows: if exactly one coyote in the pack has a worse social condition than the pup, the pup survives and that coyote dies; if more than one coyote is worse, the pup survives and the oldest of the worse coyotes dies; otherwise, the pup dies. The probability of the pup dying is about 15%. Figure 2 illustrates the steps involved in the COA.
Two aspects are employed for defining cultural transition among clusters. These two aspects are given as
$$\delta_1 = \alpha^{p,t} - soc_{cr_1}^{p,t} \tag{10}$$
$$\delta_2 = cul^{p,t} - soc_{cr_2}^{p,t} \tag{11}$$
where $\delta_1$ denotes the cultural difference between the alpha coyote and a randomly selected coyote ($cr_1$), and $\delta_2$ defines the cultural difference between the pack culture and another randomly selected coyote ($cr_2$).
The update of each coyote's social condition depends on the influence of the alpha (leader) and of the pack, computed as
$$nsoc_c^{p,t} = soc_c^{p,t} + r_1 \times \delta_1 + r_2 \times \delta_2 \tag{12}$$
where r1 and r2 define random values between 0 and 1.
Lastly, the new cost after the update and the greedy acceptance rule are given by
$$nobj_c^{p,t} = f\left(nsoc_c^{p,t}\right) \tag{13}$$
$$soc_c^{p,t+1} = \begin{cases} nsoc_c^{p,t}, & nobj_c^{p,t} < obj_c^{p,t} \\[2pt] soc_c^{p,t}, & \text{otherwise} \end{cases} \tag{14}$$
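A simplified NumPy sketch of the social-condition update in Eqs (10)-(14) is given below. The pack size, search bounds, and the sphere objective are toy assumptions used only to illustrate the mechanics of the update.

```python
# Toy sketch of one COA pack update following Eqs (10)-(14).
import numpy as np

def coa_update_pack(pack, f, cult):
    """Update the social condition of every coyote in one pack."""
    n_c = pack.shape[0]
    costs = np.array([f(soc) for soc in pack])
    alpha = pack[np.argmin(costs)]                      # best coyote (alpha) of the pack
    for c in range(n_c):
        cr1, cr2 = np.random.randint(0, n_c, size=2)    # two randomly chosen coyotes
        delta1 = alpha - pack[cr1]                      # Eq (10): influence of the alpha
        delta2 = cult - pack[cr2]                       # Eq (11): influence of the pack culture
        r1, r2 = np.random.rand(2)
        new_soc = pack[c] + r1 * delta1 + r2 * delta2   # Eq (12): candidate social condition
        if f(new_soc) < costs[c]:                       # Eqs (13)-(14): greedy acceptance
            pack[c] = new_soc
    return pack

# Toy usage: a 5-coyote pack in 3 dimensions optimizing a sphere function.
pack = np.random.uniform(-5.0, 5.0, size=(5, 3))
cult = np.median(pack, axis=0)                          # cultural tendency, Eq (6)
pack = coa_update_pack(pack, lambda x: float(np.sum(x ** 2)), cult)
```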
The COA provides good solutions for optimization problems; however, it has a major drawback of premature convergence. The ICOA introduces two mechanisms to resolve this early-convergence problem.
The first mechanism enhances the COA by controlling the step size so as to reach better solutions. The update rule is modified so that individuals initially move with a large step size, balancing local and global search, and the step size is then decreased so that the search concentrates on a local region of the solution space in the final iterations:
$$Nsoc_c^{p,t} = \begin{cases} soc_c^{p,t} + r_1 \times \delta_1 + r_2 \times \delta_2 + \left(soc_c^{best} - soc_c^{p,t}\right) \times \gamma, & rand > 0.5 \\[4pt] soc_c^{p,t} + r_1 \times \delta_1 + r_2 \times \delta_2 - \left(soc_c^{best} - soc_c^{p,t}\right) \times \gamma, & rand \leq 0.5 \end{cases} \tag{15}$$
$$\gamma = \begin{cases} \left(\dfrac{f\left(soc_c^{best}\right)}{f\left(soc_c^{worst}\right)}\right)^{2}, & \text{if } f\left(soc_c^{worst}\right) \neq 0 \\[6pt] 1, & \text{if } f\left(soc_c^{worst}\right) = 0 \end{cases} \tag{16}$$
where $f\left(soc_c^{best}\right)$ and $f\left(soc_c^{worst}\right)$ are the best and worst values of the objective function, respectively.
The self-adaptive weighting model adjusts the weight applied to the social behavior of the coyotes by taking the alpha value into account, thereby reducing the gap between the worst and best performances. The chaos concept is the second mechanism, used to resolve the local-optimum problem; it also helps to decrease the time complexity.
The ICOA generates pseudo-random values that possess sequential and ergodic properties. A logistic map is used as the chaotic mechanism and is applied to the three parameters ($\eta$, $r_1$, $r_2$). The formulas are given by
$$\eta_{k+1}^{new} = \eta_k^{new} + \phi_k \times \eta_k^{new} \tag{17}$$
$$r_{1,k+1}^{new} = r_{1,k}^{new} + \phi_k \times r_{1,k}^{new} \tag{18}$$
$$r_{2,k+1}^{new} = r_{2,k}^{new} + \phi_k \times r_{2,k}^{new} \tag{19}$$
$$\phi_{k+1} = 4\,\phi_k\,(1 - \phi_k) \tag{20}$$
where the initial value $\phi_1$ is a randomly generated number within [0, 1] and $\phi_k$ denotes the value at the $k$th chaotic iteration.
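A short sketch of this chaotic component, following Eqs (17)-(20) as written, is given below; the starting values of $\eta$, $r_1$, and $r_2$ are arbitrary illustrative choices.

```python
# Logistic-map driven update of the ICOA control parameters, Eqs (17)-(20).
import numpy as np

phi = float(np.random.rand())      # phi_1: random initial value in [0, 1]
eta, r1, r2 = 0.5, 0.3, 0.7        # illustrative starting values of eta, r1, r2

for k in range(10):                # a few chaotic iterations
    eta = eta + phi * eta          # Eq (17)
    r1 = r1 + phi * r1             # Eq (18)
    r2 = r2 + phi * r2             # Eq (19)
    phi = 4.0 * phi * (1.0 - phi)  # Eq (20): logistic map in its chaotic regime
    print(f"k={k + 1}: phi={phi:.4f}, eta={eta:.4f}, r1={r1:.4f}, r2={r2:.4f}")
```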
The fitness function is a significant factor in steering the solution of the ICOA approach. The hyperparameter selection process encodes candidate solutions and evaluates their quality. In this case, the ICOA methodology takes accuracy as the main criterion to design the fitness function (FF), which is expressed as
$$\text{Fitness} = \max(P) \tag{21}$$
$$P = \frac{TP}{TP + FP} \tag{22}$$
where FP and TP denote the false and true positive values, respectively.
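A worked example of this fitness computation is shown below, assuming hypothetical true-positive and false-positive counts from a validation run.

```python
# Precision-based fitness of Eqs (21)-(22) with hypothetical counts.
def fitness(tp: int, fp: int) -> float:
    return tp / (tp + fp)           # Eq (22): precision P; Eq (21) selects hyperparameters maximizing it

print(fitness(tp=4980, fp=20))      # 0.996 -> candidates with higher fitness are retained by the ICOA
```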
Next, the ELM classification model is utilized for the recognition of various kinds of signs. The ELM technique simplifies the training of single-hidden-layer networks compared with backpropagation (BP) [22]. Unlike conventional learning models, ELM randomly picks the input weights and the biases of the hidden layer (HL) neurons, so these parameters are fixed before the training data are examined. The ELM neural network is presented as a simplified single-HL feed-forward neural network (SLFN).
SLFN is expressed as
$$y = g\left(b_0 + \sum_{j=1}^{h} w_{j0}\, v_j\right), \qquad v_j = f_j\left(b_j + \sum_{i=1}^{n} a_{ij}\, s_i\, x_i\right) \tag{23}$$
where $v_j$ are the HL neurons ($j = 1, \ldots, h$), $a_{ij}$ are the weights of the connections between the input variables and the HL neurons, $w_{j0}$ are the weights of the connections between the HL neurons and the output neuron, $b_j$ are the biases of the HL neurons, $b_0$ is the bias of the output neuron, $f_j$ is the activation function of the HL neurons, $g$ is the activation function of the output neuron, and $s_i$ are dual variables.
The ELM output function is given as
$$f_L(x) = \sum_{i=1}^{L} \beta_i\, h_i(x) = h(x)\,\beta \tag{24}$$
where $\beta = [\beta_1, \ldots, \beta_L]^{T}$ is the output weight vector and $h(x) = [h_1(x), \ldots, h_L(x)]$ represents the non-linear feature mapping of the learning machine. The $h_i$ formula is given by
$$h_i(x) = G(a_i, b_i, x), \quad b_i \in \mathbb{R} \tag{25}$$
where $G(a_i, b_i, x)$ is a continuous non-linear function.
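A minimal NumPy sketch of an ELM classifier in the spirit of Eqs (23)-(25) is given below; the hidden-layer size, the sigmoid activation, and the random data standing in for Inception-ResNetV2 features are illustrative assumptions.

```python
# ELM sketch: random hidden weights/biases, sigmoid hidden layer, pseudo-inverse output weights.
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, Y, n_hidden=200):
    """X: (N, d) feature matrix, Y: (N, C) one-hot labels."""
    a = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (fixed, never trained)
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))        # hidden-layer outputs h(x), Eq (25)
    beta = np.linalg.pinv(H) @ Y                  # output weights: least-squares solution of H beta = Y
    return a, b, beta

def elm_predict(X, a, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
    return np.argmax(H @ beta, axis=1)            # Eq (24): f_L(x) = h(x) beta, then class decision

# Toy usage with random vectors standing in for 1536-dim features and 29 sign classes.
X = rng.normal(size=(100, 1536))
y = rng.integers(0, 29, size=100)
a, b, beta = elm_fit(X, np.eye(29)[y])
print((elm_predict(X, a, b, beta) == y).mean())   # training accuracy of the toy model
```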
In this section, the performance of the SLR-ICOADL technique is evaluated on the SL dataset [23]. The dataset includes samples from 29 classes. For experimental validation, the dataset is split into training and testing sets in an 80:20 ratio. Table 1 and Figure 3 report the overall SL detection results of the SLR-ICOADL technique in terms of distinct metrics. The results highlight that the SLR-ICOADL technique recognizes various kinds of signs proficiently. In addition, it is noticed that the SLR-ICOADL technique reaches an average precn, recal, accuy, and Fscore of 99.97%, 99.97%, 99.96%, and 99.98%, respectively.
Sign | Precision | Recall | Accuracy | F-Score | Sign | Precision | Recall | Accuracy | F-Score |
A | 100.00 | 100.02 | 99.93 | 99.96 | P | 99.98 | 99.94 | 99.97 | 100.00 |
B | 99.95 | 99.96 | 99.96 | 99.97 | Q | 99.95 | 99.95 | 99.93 | 99.94 |
C | 99.99 | 99.99 | 99.95 | 99.97 | R | 100.01 | 99.99 | 99.98 | 99.98 |
D | 99.93 | 100.01 | 100.00 | 99.95 | S | 99.98 | 100.01 | 99.93 | 100.00 |
E | 99.97 | 99.99 | 99.93 | 100.02 | T | 99.95 | 100.02 | 99.97 | 99.99 |
F | 99.94 | 99.92 | 100.00 | 100.01 | U | 99.97 | 99.99 | 99.94 | 99.96 |
G | 99.98 | 99.96 | 99.96 | 100.01 | V | 99.93 | 99.98 | 100.02 | 100.02 |
H | 99.96 | 99.94 | 99.94 | 99.92 | W | 100.00 | 99.99 | 99.93 | 99.96 |
I | 99.99 | 99.95 | 99.93 | 99.96 | X | 99.98 | 100.00 | 100.02 | 100.01 |
J | 100.00 | 99.98 | 99.93 | 100.02 | Y | 99.99 | 99.93 | 100.01 | 99.95 |
K | 99.95 | 99.95 | 99.93 | 99.95 | Z | 99.94 | 99.93 | 99.99 | 100.02 |
L | 99.95 | 100.00 | 99.92 | 100.00 | Space | 99.98 | 99.96 | 99.97 | 100.01 |
M | 99.96 | 99.96 | 99.98 | 100.00 | Nothing | 99.96 | 99.95 | 99.97 | 99.92 |
N | 99.92 | 99.93 | 99.94 | 99.95 | Delete | 99.98 | 99.96 | 99.98 | 99.92 |
O | 99.96 | 99.97 | 100.00 | 99.94 | Average | 99.97 | 99.97 | 99.96 | 99.98 |
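A sketch of the evaluation protocol (80:20 split and macro-averaged metrics) is given below; the feature vectors, labels, and predictions are placeholders standing in for the outputs of the trained model.

```python
# Evaluation-protocol sketch: stratified 80:20 split and macro-averaged metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 1536))   # placeholder feature vectors
labels = rng.integers(0, 29, size=1000)    # 29 sign classes

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.20, stratify=labels, random_state=0)

y_pred = y_te                              # placeholder predictions from the trained classifier
prec, rec, f1, _ = precision_recall_fscore_support(y_te, y_pred, average="macro")
print(accuracy_score(y_te, y_pred), prec, rec, f1)
```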
Figure 4 shows the accuracy and loss curves of the SLR-ICOADL technique. The simulation values imply that the accuracy tends to increase and the loss tends to decrease as the epoch count grows. It is also apparent that the training loss is minimal and the validation accuracy is superior on the test database.
Table 2 illustrates the comparative analysis of the SLR-ICOADL technique with other recent methods [24]. Figure 5 compares the precn and recal of the SLR-ICOADL methodology with recent algorithms. The simulation values show that the SLR-ICOADL technique achieves better performance. In terms of precn, the SLR-ICOADL method attains the highest precn of 99.97%, while the ODTL-SLRC, SGD optimizer, RMSProp optimizer, and Adam optimizer models achieve lower precn values of 99.93%, 99.90%, 99.92%, and 99.89%, respectively. In addition, in terms of recal, the SLR-ICOADL methodology reaches the maximum recal of 99.97%, while the ODTL-SLRC, SGD optimizer, RMSProp optimizer, and Adam optimizer techniques obtain lower recal values of 99.93%, 99.90%, 99.89%, and 99.84%, respectively.
Methods | Precision | Recall | Accuracy | F-Score |
SLR-ICOADL | 99.97 | 99.97 | 99.96 | 99.98 |
ODTL-SLRC | 99.93 | 99.93 | 99.93 | 99.94 |
SGD optimizer | 99.90 | 99.90 | 99.90 | 99.84 |
RMSProp optimizer | 99.92 | 99.89 | 99.75 | 99.88 |
Adam optimizer | 99.89 | 99.84 | 99.25 | 99.89 |
Figure 6 represents the accuy and Fscore outcomes of the SLR-ICOADL method compared with other algorithms. The experimental values imply that the SLR-ICOADL technique achieves better performance. In terms of accuy, the SLR-ICOADL methodology reaches a superior accuy of 99.96%, whereas the ODTL-SLRC, SGD optimizer, RMSProp optimizer, and Adam optimizer approaches accomplish lower accuy values of 99.93%, 99.90%, 99.75%, and 99.25%, respectively. Additionally, in terms of Fscore, the SLR-ICOADL system attains the maximum Fscore of 99.98%, while the ODTL-SLRC, SGD optimizer, RMSProp optimizer, and Adam optimizer methodologies achieve lower Fscore values of 99.94%, 99.84%, 99.88%, and 99.89%, respectively.
Table 3 compares the overall recognition rate (RR) and computation time (CT) of the SLR-ICOADL technique with existing models. In Figure 7, a comparison of the SLR-ICOADL technique is performed in terms of RR. Based on RR, the SLR-ICOADL technique gains a maximum RR of 99.95%, while the KNN, SVM, ANN, CNN, and ODTL-SLRC techniques attain lower RR values of 96.25%, 98.10%, 98.11%, 99.89%, and 99.93%, respectively.
Methods | Recognition rate (%) | Computation time (min) |
KNN algorithm | 96.25 | 16.54 |
SVM model | 98.10 | 14.34 |
ANN model | 98.11 | 15.41 |
CNN model | 99.89 | 11.21 |
ODTL-SLRC | 99.93 | 6.44 |
SLR-ICOADL | 99.95 | 3.90 |
The CT results of the SLR-ICOADL technique compared with recent approaches are shown in Figure 8. The results indicate that the KNN, SVM, ANN, and CNN models result in ineffectual CT values of 16.54 min, 14.34 min, 15.41 min, and 11.21 min, respectively. Although the ODTL-SLRC technique offers a reasonable CT of 6.44 min, the SLR-ICOADL technique reports the lowest CT of 3.90 min. These results demonstrate the enhanced SL recognition performance of the SLR-ICOADL technique.
In this manuscript, a new SLR-ICOADL methodology was developed for hearing disabled persons. The goal of the SLR-ICOADL technique is to accomplish an accurate detection model that enables communication for persons using SL as a primary form of expression. Primarily, the SLR-ICOADL technique applies the BF approach for noise elimination. Following this, the SLR-ICOADL technique uses Inception-ResNetv2 for feature extraction. Then, the ICOA is utilized to select the optimal hyperparameter values of the DL model. Finally, an ELM-based classification model is utilized for the recognition of various kinds of signs. To demonstrate the better performance of the SLR-ICOADL algorithm, a detailed set of experiments was performed. The experimental outcomes demonstrate that the SLR-ICOADL technique has promising performance in the SL detection process.
All authors of this manuscript contributed equally.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no KSRG-2023-420.
The authors declare that they have no conflict of interest. The manuscript was written through the contributions of all authors. All authors have approved the final version of the manuscript.
[1] | N. Basnin, L. Nahar, M. S. Hossain, An integrated CNN-lSTM model for bangla lexical sign language recognition, in Proc. of Int. Conf. on Trends in Computational and Cognitive Engineering, Advances in Intelligent Systems and Computing Book Series, Springer, Singapore, 1309 (2020), 695–707. https://doi.org/10.1007/978-981-33-4673-4_57 |
[2] | C. U. Bharathi, G. Ragavi, K. Karthika, Signtalk: Sign language to text and speech conversion, in 2021 Int. Conf. on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), Coimbatore, India, 2021, 1–4. https://doi.org/10.1109/ICAECA52838.2021.9675751 |
[3] | F. Beser, M. A. Kizrak, B. Bolat, T. Yildirim, Recognition of sign language using capsule networks, in 2018 26th Signal Processing and Communications Applications Conf. (SIU), Izmir, Turkey, 2018, 1–4. https://doi.org/10.1109/ICAECA52838.2021.9675751 |
[4] | M. A. Ahmed, B. B. Zaidan, A. A. Zaidan, M. M. Salih, M. M. b. Lakulu, A review on systems-based sensory gloves for sign language recognition state of the art between 2007 and 2017, Sensors, 18 (2018), 2208. https://doi.org/10.3390/s18072208 |
[5] | Y. Dai, Z. Luo, Review of unsupervised person re-identification, J. New Media, 3 (2021), 129–136. https://doi.org/10.32604/jnm.2021.023981 |
[6] | K. Snoddon, Sign language planning and policy in Ontario teacher education, Lang. Policy, 20 (2021), 577–598. https://doi.org/10.1007/s10993-020-09569-7 |
[7] | H. M. Mohammdi, D. M. Elbourhamy, An intelligent system to help deaf students learn Arabic sign language, Interact. Learn. Envir., 2021, 1–16. https://doi.org/10.1080/10494820.2021.1920431 |
[8] | M. P. Kumar, M. Thilagaraj, S. Sakthivel, C. Maduraiveeran, M. P. Rajasekaran, S. Rama, Sign language translator using LabVIEW enabled with internet of things, Smart Intelligent Computing and Applications, Smart Innovation, Systems and Technologies Book Series, Springer, Singapore, 104 (2019), 603–612. https://doi.org/10.1007/978-981-13-1921-1_59 |
[9] | X. R. Zhang, J. Zhou, W. Sun, S. K. Jha, A lightweight CNN based on transfer learning for COVID-19 diagnosis, Comput. Mater. Con., 72 (2022), 1123–1137. https://doi.org/10.32604/cmc.2022.024589 |
[10] | A. Kumar, R. Kumar, A novel approach for ISL alphabet recognition using extreme learning machine, Int. J. Inf. Technol., 13 (2021), 349–357. https://doi.org/10.1007/s41870-020-00525-6 |
[11] | B. B. Al-onazi, M. K. Nour, H. Alshahran, M. A. Elfaki, M. M. Alnfiai, R. Marzouk, et al., Arabic sign language gesture classification using Deer Hunting Optimization with machine learning model, Comput. Mater. Con., 75 (2023). |
[12] | M. Potnis, D. Raul, M. Inamdar, Recognition of Indian Sign Language using Machine Learning Algorithms, In 2021 8th International Conference on Signal Processing and Integrated Networks (SPIN), IEEE, 2021,579–584. https://doi.org/10.1007/s41870-020-00525-6 |
[13] | S. Aliyev, A. Abd Almisreb, S. Turaev, Azerbaijani sign language recognition using machine learning approach, In Journal of Physics: Conference Series, IOP Publishing, 2251 (2022), 012007. https://doi.org/10.1088/1742-6596/2251/1/012007 |
[14] | M. M. Asiri, A. Motwakel, S. Drar, Enhanced Bald Eagle Search Optimizer with Transfer Learning-based Sign Language Recognition for Hearing-impaired Persons, J. Disabil. Res., 2 (2023), 86–93. |
[15] | S. Anthoniraj, V. Ganashree, B. J. R. Umdor, G. D. Sai, B. R. Navya, Sign Language Interpreter Using Machine Learning, In 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), (pp. 1–6). IEEE, 2021. https://doi.org/10.1109/ICAECA52838.2021.9675693 |
[16] | M. Zakariah, Y. A. Alotaibi, D. Koundal, Y. Guo, M. Mamun Elahi, Sign language recognition for Arabic alphabets using transfer learning technique, Comput. Intell. Neurosci., 2022 (2022), Article ID 4567989. https://doi.org/10.1155/2022/4567989 |
[17] | N. K. Bose Duraimutharasan, K. Sangeetha, Machine Learning and Vision Based Techniques for Detecting and Recognizing Indian Sign Language, Revue d'Intelligence Artificielle, 37 (2023). |
[18] | M. Marais, D. Brown, J. Connan, A. Boby, Improving signer-independence using pose estimation and transfer learning for sign language recognition, In International Advanced Computing Conference, (pp. 415–428), Cham: Springer Nature Switzerland, 2022. https://doi.org/10.1109/ICAECA52838.2021.9675693 |
[19] | B. Desai, U. Kushwaha, S. Jha, M. NMIMS, Image filtering-techniques algorithms and applications, Applied GIS, 7 (2020), 101. |
[20] | M. Neshat, M. Ahmedb, H. Askarid, M. Thilakaratnee, S. Mirjalilia, Hybrid Inception Architecture with Residual Connection: Fine-tuned Inception-ResNet Deep Learning Model for Lung Inflammation Diagnosis from Chest Radiographs, 2023. arXiv preprint arXiv: 2310.02591. |
[21] | A. Sari, A. Majdi, M. J. C. Opulencia, A. Timoshin, D. T. N. Huy, N. D. Trung, et al., New optimized configuration for a hybrid PV/diesel/battery system based on coyote optimization algorithm: A case study for Hotan county, Energy Rep., 8 (2022), 15480–15492. https://doi.org/10.1016/j.egyr.2022.11.059 |
[22] | H. Dehghanisanij, S. Emami, V. Rezaverdinejad, A. Amini, Potential of the hazelnut tree search–ELM hybrid approach in estimating yield and water productivity, Appl. Water Sci., 13 (2023), 61. |
[23] | https://www.kaggle.com/datasets/grassknoted/asl-alphabet |
[24] | F. Alrowais, S. S. Alotaibi, S. Dhahbi, R. Marzouk, A. Mohamed, A. M. Hilal, Sign Language Recognition and Classification Model to Enhance Quality of Disabled People, Networks, 9 (2022), 10. https://doi.org/10.32604/cmc.2022.029438 |