
A novel improved deep learning model based on Bi-LSTM algorithm for intrusion detection in WSN

  • The widespread deployment of wireless sensor networks (WSNs) has recently increased the possibility of attacks. In the cyber-physical system, the WSN is considered a significant component, consisting of several hops that self-organize from moving or stationary sensors. Attackers perform abusive operations to enter, seize, and control the WSN. In order to stop such attacks on sensor networks in the future, the network traffic data was evaluated, dangerous traffic activity was monitored, and nodes were investigated. In this study, we proposed a new improved bidirectional long short-term memory (Bi-LSTM) algorithm, an enhancement of the LSTM model, to address the issue of intrusion detection in WSN systems in four steps. In the preprocessing step, the initial raw data collected from the network is prepared for training the learning models. In the second step, the analyzed data is classified according to the types of network traffic and the attacks are detected. In the next step, our proposed improved learning models showed results with higher accuracy than traditional detection methods. This study assessed the performance of deep learning algorithms on two different datasets, the 'WSN-DS' and 'KDDCup99' datasets, which were chosen to identify and classify different DoS attacks. The primary common WSN attacks, flooding, scheduling, blackhole, and grayhole, are included in the 'WSN-DS' dataset, while denial-of-service (DoS), user-to-root (U2R), remote-to-local (R2L), and Probe attacks are in the 'KDDCup99' dataset. Our newly improved proposed model based on Bi-LSTM achieved an accuracy of 100% on the 'WSN-DS' dataset and 99.9% on the 'KDDCup99' dataset.

    Citation: Ra'ed M. Al-Khatib, Laila Heilat, Wala Qudah, Salem Alhatamleh, Asef Al-Khateeb. A novel improved deep learning model based on Bi-LSTM algorithm for intrusion detection in WSN[J]. Networks and Heterogeneous Media, 2025, 20(2): 532-565. doi: 10.3934/nhm.2025024




    Wireless sensor networks (WSNs) have recently gained great importance and become a critical research field due to their various real-time applications, including the healthcare industry, building security control, critical military monitoring, and monitoring of jungle fires [1,2]. Figure 1 illustrates the general framework of the WSN, which consists of several dispersed nodes organized into several clusters. The cluster head (CH) is a single master sensor node that is remotely connected to the sensor nodes in each cluster. This CH node compiles the information gathered from every sensor node within each group and forwards it to the base station. Therefore, the data may be vulnerable to hacking or denial-of-service attacks during this final processing or analysis stage. Given this widespread nature, it has become necessary to protect data sent over the internet and maintain the network through intrusion detection systems (IDS). An IDS constantly monitors data movement in the network and determines whether it is normal or has been attacked. In the case of WSN and Bluetooth networks, heavyweight authentication techniques and security protocols cannot be used due to limited resources [3].

    Figure 1.  General framework for WSN structure, some parts adapted from [2].

    There are many challenges, but the main challenge is to protect the WSN. Using traditional methods, such as encryption, is difficult due to limited processing capabilities [4]. Moreover, most existing systems fail to classify adverse events accurately. After feature selection, the best subset of features from the dataset can be used for classification [5]. Feature selection dynamically adjusts the number of chosen features using various methods. Consequently, the underlying challenge of categorizing network data requires the creation of intelligent and efficient methods that can recognize security breaches through behavior and network traffic analysis [6,7]. To solve the aforementioned issues, we propose in this study a new model based on an improved bidirectional long short-term memory (Bi-LSTM) to build a powerful intrusion detection system. Our new improved Bi-LSTM-based model is trained and evaluated using the 'WSN-DS' and 'KDDCup99' datasets.

    The rest of this paper is organized as follows: Section 1 explains the essential ideas of the intrusion detection domain. Section 2 reviews previous work. Section 3 gives background on IDS and bidirectional LSTM techniques. Section 4 discusses our methodology, based on enhancing the Bi-LSTM technique, for tackling the IDS problem. Section 5 covers the findings from the experiments. Section 6 presents the conclusion with future remarks.

    A growing collection of literature examines various elements of IDSs and WSNs. The significant interest in WSNs has recently driven research and practical applications in IDSs. This section gives a brief overview of several significant related works in the field.

    In [8], the authors discussed four community intrusion detection systems (CIDSs) designed for WSNs. These CIDSs were developed using genetic algorithms, neural networks, embedded systems, fuzzy neural networks, and support vector machine (SVM) classifiers, and were compared in this study based on their respective methodologies [9]. They found that SVM produced the lowest errors, during both the training and testing evaluation processes.

    The authors in [10] presented a deep learning-based method for quickly detecting and handling network intrusions in WSNs. The main goals were minimizing resource overhead, attaining high accuracy, and enhancing intrusion detection efficacy. The suggested strategy was developed based on deep learning techniques, which is in line with the increasing interest in applying these approaches to the subject of WSN security.

    The study in [11] introduced a contrastive learning model for network intrusion detection within a supervised prototype-label embedding framework. Traffic features and prototypes with embedded labels are placed in a shared embedding space, and the comparison of their distances serves as the classification criterion. The model was evaluated on both the NSL-KDD and UNSW-NB15 datasets, achieving 83.7% accuracy on NSL-KDD and 89.9% accuracy on UNSW-NB15. It faced the challenge of class imbalance, which compromised its ability to detect minority attack classes, and it still needs optimization for broader applicability across other network environments.

    The work in [12] investigated how to predict the connectivity of wireless internet of things (IoT) devices based on usage data collected over 30 days from more than 6214 mobile devices. They applied machine learning techniques such as AutoRegressive Integrated Moving-Average (ARIMA), ARIMA with eXogenous covariates (ARIMAX), Hidden Markov Models, Exponential Smoothing, Logistic Regression, Random Forest, and Gradient Boosting to the gathered data. ARIMAX achieved the highest accuracy, reaching up to 93%, followed by Random Forest and Logistic Regression, with ARIMA also exceeding 90%. In contrast, ARIMAX consumed more computing time, leaving room for simpler models. This research also indicated the need for efficient feature selection because no clear patterns were present in the data, and it recommended applying clustering techniques for better prediction.

    In [3], the researchers integrated many techniques for intrusion detection classification, taking into account the unique characteristics and resource limitations of these networks. The proposal uses a likelihood vector machine (LSVM) and cuckoo search greedy optimization (CSGO). It combined locally linear embedding, a genetic algorithm, an oversampling process, and SVM to form the CSGO-LSVM model [13]. This technique produced the lowest error rate of 11%.

    The primary objective in [4] was to assess how well various machine learning (ML) methods worked for WSN security. The proposal measured the efficiency of many intrusion detection algorithms. It focused on the importance of choosing the best algorithm to achieve WSN security, and provided good insight into the development and application of effective IDSs. Several ML algorithms were used on the 'KDDCup99' and 'WSN-DS' datasets. The Naïve Bayes algorithm showed high accuracy, as True Positive (TP) was approximately 92.9% for 'KDDCup99', and 95.3% for the 'WSN-DS' dataset. Likewise, the TP rate using Random Forest was between 99.7 and 100%. The final results were approximately 99.4–100% for both used datasets.

    The outcome of [6] had a substantial impact on the WSN security sector. It presented a flexible and effective intrusion detection system that suits different operating conditions, obtaining high accuracy in intrusion detection and reducing the rate of false alarms. In terms of classification accuracy, the model outperformed some common classifiers, such as C4.5, decision tree (DT) classifiers with fuzzy temporal rules, SVMs, and multilayer perceptron (MLP) classifiers. In five studies, the skillful application of fuzzy temporal rules was tested, resulting in improved predictions and a greater ability to interpret the system. Previous studies have explored various mathematical models to analyze network actions, including convergence analysis in graph-based systems. For instance, the work published in [14] presented a comprehensive study that demonstrated almost second-order uniform accuracy on k-star graphs, validated through extensive numerical experiments.

    Numerous studies have investigated various computational methods to examine network behavior and enhance intrusion detection systems in WSNs. For instance, in [15], numerical investigations have demonstrated the effectiveness of finite difference methods in solving complex fluid dynamics problems, particularly in simulating turbulent flows with high accuracy. This methodological robustness aligns with the need for precise modeling in network security, where accurate anomaly detection relies on advanced mathematical approaches. A recent study in [16] applied uniform convergence analysis to semi-linear systems with boundary and interior layers, in order to demonstrate a new improved computational efficiency and stability model. This aligns with our approach, where numerical stability plays a crucial role in optimizing the Bi-LSTM-based model for intrusion detection in WSNs.

    The paper in [17] presented an intrusion detection system specifically introduced for WSNs, based on gradient boosting machine (GBM) techniques: the SLGBM method, which combines sequential backward selection (SBS) with LightGBM. Taking into consideration the particular difficulties faced by WSNs, SLGBM is used to enhance their security. WSNs, particularly those used in smart settings, can benefit from SLGBM as an effective and flexible intrusion detection method. The WSN-DS dataset was used for the test. The final findings demonstrated a high degree of accuracy in identifying the normal class and the four attack types: Normal, Grayhole, Flooding, Blackhole, and Scheduling, with detection rates of 99.8, 99.1, 96.5, 99.4, and 96.1%, respectively. The results also indicate a clear improvement in the F-measure scores for time division multiple access (TDMA) scheduling attacks.

    The advanced numerical methods in [18], used to address time-delay interaction and propagation problems in the previous study, depend on improving the agreement between parameters. This work may contribute to enhancing the performance of smart algorithms, such as the Bi-LSTM model, in the field of intrusion detection in WSNs. Overall, it was a valuable study since the effectiveness of advanced neural networks in detection relies on the development of high-performance algorithms.

    In [1], the authors introduced the 'WSN-DS' dataset, which is a valuable dataset for evaluating IDS in WSNs. It offers a unified standard with different stealth scenarios and network configurations. Thus, facilitating research and development to detect intrusion in WSN networks. A single hidden layer combined with a 10-fold cross-validation technique produced the greatest results. The classification accuracy for flood attacks was 99.4%, for black hole assaults was 92.8%, for gray hole attacks was 75.6%, for scheduling attacks was 92.2%, and for the typical scenario (no attacks) was an astonishing 99.8%. Table 1 provides an overview summary of previous research models.

    Table 1.  Summary of the related works.
    | Ref. | Method | Used dataset | Accuracy | Key findings and performance |
    |------|--------|--------------|----------|------------------------------|
    | [8] | CIDS in WSNs | Two hundred images of ten individuals | NA | Compared four community IDSs (embedded systems, GA, ANN, SVM, fuzzy neural network). SVM had the lowest errors during training and testing |
    | [11] | Supervised contrastive learning with prototype-label embeddings | NSL-KDD and UNSW-NB15 | 83.7%, 89.9% | Improved classification performance but struggled with class imbalance, leading to lower detection rates for minority attack classes |
    | [12] | ARIMA, ARIMAX, HMM, ETS, Logistic Regression, Random Forest, GBM | 6,214 mobile devices, 30 days of activity data | ARIMAX: 93%; others: >90% | ARIMAX achieved the highest accuracy but had a high computational cost. Random Forest and Logistic Regression provided similar accuracy with lower training time. No clear connectivity pattern was observed across the devices |
    | [3] | CSGO-LSVM for enhanced intrusion detection | NSL-KDD and UNSW-NB15 | 99.65% | Utilized CSGO-LSVM to enhance intrusion detection, achieving the lowest error rate |
    | [4] | Machine learning algorithms for WSN security | KDDCup99 and WSN-DS datasets | 92.9% and 95.3% | Systematically compared machine learning algorithms for WSN security, highlighting the effectiveness of Naïve Bayes, IBK, and Random Forest |
    | [6] | Adaptive intrusion detection in WSNs | KDDCup99 dataset | NA | Developed an effective intrusion detection system tailored to WSN challenges, improving accuracy and minimizing false alarms |
    | [17] | SLGBM for WSN intrusion detection | WSN-DS dataset | 99.8% | Proposed SLGBM as an efficient solution for intrusion detection in WSNs used in smart environments |
    | [1] | WSN-DS dataset for evaluation | WSN-DS dataset | 99.4% for flooding attacks | Introduced the 'WSN-DS' dataset for testing and comparing IDSs in WSNs, providing a standardized benchmark for experimentation |


    We were inspired by numerical methods using gradient mesh refinement in [19] to improve the accuracy and stability of our proposed model. The technique in [19] is mainly used to capture sharp changes in numerical solutions, similar to our efforts to extract fine-grained features from mesh data [20].

    The IDS aims to detect intrusions or harmful activities using security information and can respond when any intrusion occurs [3]. The most important IDSs come in two types, which are as follows:

    1- Network intrusion detection systems, whose function is to analyze network traffic.

    2- Host-based intrusion detection systems, whose function is to monitor files, primarily those of the underlying operating system.

    This is important because it guarantees a high level of protection, particularly in modern work environments that require a high level of safety. The IDS must also be adaptable, because attackers may craft new attacks specifically designed to evade detection. IDSs frequently rely on two complementary strategies: signature-based detection and anomaly-based detection. Signature-based identification searches for specific known patterns, such as byte sequences in network traffic or instruction sequences used by harmful programs. The term arises from antivirus programs, which refer to these patterns as signatures. Although signature-based detection can easily identify known attacks, it cannot discover new attacks, and previously unknown attackers are not unusual. Anomaly-based detection, in contrast, is designed to detect deviations from normal behavior, which is needed because of the rapid spread of new malicious software. A common approach is to build a model of normal activity and compare observed behavior against this trusted model; this makes it possible to discover new types of attacks, but it can also produce false positives.

    In 1997, LSTM was introduced as a deep learning (DL) model for the time domain [21]. It mainly differs from conventional recurrent neural networks in that the LSTM network includes a memory unit and a forgetting gate in its hidden layer. During the training phase, long-term information can flow through the network, improving its memory capacity. Figure 2 illustrates the structure of the LSTM cell [21], which contains four computational units: the memory unit and three gates for forgetting, input, and output.

    Figure 2.  LSTM cell structure, adapted from [21].

    Using the Sigmoid activation function, a new value $f_t$ is created from the current layer input $x_t$ and the previous hidden layer output $h_{t-1}$. This function determines whether information learned from the previous cell state $C_{t-1}$ will be remembered or forgotten. The modified cell state $C_t$ is then saved.

    The "Sigmoid" function, denoted as $\sigma$, has an output range of $[0, 1]$. Within LSTM, $f_t$ determines the degree to which long-term memory information is forgotten, acting as the forget gate's value within the range $[0, 1]$. The $b_f$ represents the bias of the forget gate, while $W_f$ denotes the weight matrix associated with it. In LSTM, the threshold operation consists of the "Sigmoid" activation function followed by a dot product operation; information is updated or retained based on this mechanism. This choice is made before the hidden layer output from the previous time step enters the forget gate. On the other hand, information flow is facilitated by the constant horizontal shifts in the cell state. The equations for the functions connecting $h_{t-1}$, $x_t$, and $f_t$ are as follows:

    $I_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, (3.1)
    $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$, (3.2)
    $C_t = f_t \times C_{t-1} + I_t \times \tilde{C}_t$, (3.3)
    $\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$, (3.4)

    where $b_c$ signifies the bias of the memory unit, $C_t$ is the sum of the updated state and the retained previous state, $I_t$ represents the value produced by the input gate, $\tilde{C}_t$ represents the transitory (candidate) cell state, $W_c$ is the weight matrix of the memory unit, and $\tanh$ refers to the hyperbolic tangent activation function.

    $h_t = o_t \times \tanh(C_t)$, (3.5)
    $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, (3.6)

    where $h_t$ is the hidden state obtained through the output gate, $b_o$ is the output gate's bias, $\tanh(C_t)$ ranges from $-1$ to $1$, $o_t$ is the output gate value, and $W_o$ represents the output gate's weight matrix.
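    To make the gate interactions in Eqs (3.1)–(3.6) concrete, the following minimal NumPy sketch evaluates one LSTM time step; the hidden size, input size, and random weights are illustrative placeholders rather than values used in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_i, W_f, W_c, W_o, b_i, b_f, b_c, b_o):
    """One LSTM step following Eqs (3.1)-(3.6); each W acts on [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    i_t = sigmoid(W_i @ z + b_i)           # input gate, Eq (3.1)
    f_t = sigmoid(W_f @ z + b_f)           # forget gate, Eq (3.2)
    c_tilde = np.tanh(W_c @ z + b_c)       # candidate cell state, Eq (3.4)
    c_t = f_t * c_prev + i_t * c_tilde     # updated cell state, Eq (3.3)
    o_t = sigmoid(W_o @ z + b_o)           # output gate, Eq (3.6)
    h_t = o_t * np.tanh(c_t)               # hidden state, Eq (3.5)
    return h_t, c_t

# Toy usage with random parameters (illustrative hidden size 4, input size 3).
rng = np.random.default_rng(0)
hid, inp = 4, 3
W = {k: rng.standard_normal((hid, hid + inp)) * 0.1 for k in "ifco"}
b = {k: np.zeros(hid) for k in "ifco"}
h, c = np.zeros(hid), np.zeros(hid)
h, c = lstm_step(rng.standard_normal(inp), h, c,
                 W["i"], W["f"], W["c"], W["o"], b["i"], b["f"], b["c"], b["o"])
```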

    After passing through the input, output, and forget gates, the signal helps retain and store information for the current time step. The LSTM architecture determines how the input information propagates from the input to the output. As a result, the forecasting error of the model gradually accumulates over time until it finally experiences a sharp spike, which is proportional to the amount of time that has elapsed.

    For power load forecasting over a single day, the cumulative LSTM prediction error is shown in Figure 3. For short-term load forecasting techniques, the training dataset typically lasts for one day or one week [22]. In Figure 3, three days' worth of data is utilized as an example of forecasting [23].

    Figure 3.  Rate accumulate error of the forecasting model, adapted from [21].

    Bi-LSTM is proposed to solve this cumulative error issue. The Bi-LSTM includes two layers, and two passes are used to calculate the hidden vector: one computes it from front to back, and the other from back to front, and together they determine the neural network's outputs, as depicted in Figure 4.

    Figure 4.  Basic structure of Bi-LSTM model, adapted from [21].

    In the bidirectional structure, the results are stored in the memory unit, and a directional loop links the layers with previous information. By integrating the past output with the neural network's present output, the results are determined. Consequently, even as the volume of input data in the series increases, both the exploding gradient and vanishing gradient problems are mitigated.

    In a bidirectional structure, prediction results are derived by considering both the last entry and the preceding entry to prevent the loss of relevant information. Figure 4 demonstrates that the front layer continually stores the output, which computes the forward direction from 1 to t.

    Every time step, the output of the backward layer's calculation of the reverse time series is saved. The bi-LSTM combines the final outputs generated at each time point using the outputs of the forward and backward layers.

    The following formulas explain how the Bi-LSTM model works:

    $s_t = f(U x_t + W s_{t-1})$, (3.7)
    $s'_t = f(U' x_t + W' s'_{t+1})$, (3.8)
    $o_t = g(V s_t + V' s'_t)$, (3.9)

    where

    $s_t$ is the forward hidden layer's state variable at a particular time $t$;

    $x_t$ is the input vector;

    $g$ and $f$ are activation functions;

    $s'_t$ is the backward (reverse) hidden layer's state variable at time $t$;

    $o_t$ is the output layer's state variable at time $t$;

    $V'$, $W'$, and $U'$ are the corresponding reverse weight matrices;

    $V$, $W$, and $U$ are the weight matrices between the input layer and the hidden layer and between the hidden layer and the output layer.

    The forward and backward layers do not share any state weight matrices [23]. At every time step, the outcome is determined by calculating the forward layer and the backward layer independently: the forward state $s_t$ and the backward state $s'_t$ are computed, and their contributions are added to obtain the final output $o_t$.
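    The simplified recurrences in Eqs (3.7)–(3.9) can be sketched as follows; the weight matrices, activation choices, and dimensions are illustrative assumptions, and a real implementation would replace the plain recurrent update with the full LSTM cell described earlier.

```python
import numpy as np

def bidirectional_pass(X, U, W, U_b, W_b, V, V_b, f=np.tanh, g=lambda z: z):
    """Simplified bidirectional recurrence from Eqs (3.7)-(3.9).
    X has shape (T, input_dim); forward and backward states are combined per step."""
    T = X.shape[0]
    hid = W.shape[0]
    s_fwd = np.zeros((T, hid))      # forward hidden states s_t
    s_bwd = np.zeros((T, hid))      # backward hidden states s'_t
    for t in range(T):              # forward sweep, Eq (3.7)
        prev = s_fwd[t - 1] if t > 0 else np.zeros(hid)
        s_fwd[t] = f(U @ X[t] + W @ prev)
    for t in reversed(range(T)):    # backward sweep, Eq (3.8)
        nxt = s_bwd[t + 1] if t + 1 < T else np.zeros(hid)
        s_bwd[t] = f(U_b @ X[t] + W_b @ nxt)
    return g(s_fwd @ V.T + s_bwd @ V_b.T)   # per-step outputs o_t, Eq (3.9)
```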

    We use two different types of datasets ('WSN-DS' and 'KDDCup99'). These datasets have been prepared and designed by researchers in the literature as standard benchmarks that provide comprehensive test scenarios for the attacks. i) The 'WSN-DS' is a dataset for intrusion detection systems in wireless sensor networks; the data collection mechanism for 'WSN-DS' was explained in [1]. ii) The 'KDDCup99' dataset was produced by the Defense Advanced Research Projects Agency (DARPA) from collected network traffic [24,25]. These two types of datasets are presented and fully discussed in the subsequent sections.

    The wireless sensor network dataset ('WSN-DS') is collected and constructed from packets sent and received within a WSN, while requiring only a low-cost supervision service [1]. It is not enough to extract the relevant information from the packets sent and received in the wireless network; we also have to make sure that vital information about the network is collected in a way that facilitates the identification, classification, and eventual neutralization of any possible threats. Every sensor employed in this study is involved in the observation process and must be able to continually observe a group of nearby nodes so that the workload is shared among sensor nodes. Determining the optimal number of neighboring nodes that each sensor node should monitor effectively was the issue at hand. To determine this number, numerous experiments were carried out, and a concise overview of the results can be seen in Table 2.

    Table 2.  Measurements from five simulation tests (A–E) used to determine how many neighboring nodes each sensor node should monitor.
    | Number of neighbors to observe | Maximum number of monitors that a certain node can have (A / B / C / D / E) | Minimum number of monitors on a certain node (A / B / C / D / E) | Total number of nodes under observation (A / B / C / D / E) |
    |---|---|---|---|
    | 3 | 6 / 7 / 7 / 6 / 7 | 0 / 0 / 0 / 1 / 0 | 97 / 99 / 99 / 100 / 97 |
    | 4 | 7 / 9 / 8 / 8 / 9 | 0 / 0 / 0 / 1 / 0 | 99 / 99 / 99 / 100 / 99 |
    | 5 | 10 / 9 / 10 / 10 / 10 | 1 / 1 / 1 / 1 / 2 | 100 / 100 / 100 / 100 / 100 |
    | 6 | 11 / 12 / 11 / 10 / 13 | 1 / 1 / 1 / 2 / 2 | 100 / 100 / 100 / 100 / 100 |

     | Show Table
    DownLoad: CSV

    One node can detect seven sensor nodes, and every sensor node detects three of its neighbors' nodes. Stated differently, the base station (BS) has been notified about the same node seven times by seven separate observing nodes. These reports could be examined for consistency to ensure that the information received is accurate. In certain cases, no sensor was keeping an eye on a certain sensor node. This suggests that to obtain information on every network sensor node, monitoring just three nearby nodes is insufficient.

    Additionally, monitoring four neighbors improved things. However, in each of the five test cases, all sensor nodes are observed only when the number of monitored neighbors is 5. Similar outcomes were seen when a sensor node observed six of its neighbors. As a result, the experiments revealed that monitoring five neighbors is sufficient to obtain data about any node within the network; an additional observation would merely add computational complexity. The first step in the simulation is therefore to select 5 neighbors to watch. Every node sends out a 'Hello' signal, and each node then chooses the first five nodes it hears from. After that, it keeps monitoring them over the simulation time, ensuring that after each round, every node reports to its cluster head (CH). The BS then receives the reports from the CHs. For security reasons, only one report from a node may be monitored, especially if there are suspicions about the CH. Moreover, if the node is located further away from the CH than from the BS, the reports can be sent directly to the BS; however, this approach will result in higher energy consumption.

    After a detailed analysis, we extracted 23 variables from the LEACH routing algorithm, which helps determine the condition of each network node. These extracted 23 attributes are listed in the following order.

    Features of WSN-DS Dataset:

    Node ID: A special number that can be used to identify the sensor node at any time and in any round.

    Time: Simulation duration.

    Is it CH?: Indicates whether the node is a CH (value = 1) or a regular node (value = 0).

    CH ID: The ID of the CH in the current round.

    RSSI: The received signal strength indication between the node and its CH for the current round.

    Range to CH: The distance between the node and its CH in the current round.

    Maximum distance to CH: The greatest separation between a cluster node and the CH.

    Average distance to CH: The average distance between a cluster node and its CH.

    Present energy level: The node's remaining energy in the current round.

    Energy usage: The amount of energy consumed in the previous round.

    ADV_CH send: The number of broadcast messages sent to the nodes to advertise the CH.

    ADV_CH received: The number of CH advertisement messages received from CHs.

    Join request send: The number of join request messages sent by the node to the CH in order to request a join.

    Join request received: The number of join request messages received by the CH from nodes asking to join as members.

    ADV_SCH send: The total number of broadcast messages sent to the nodes to advertise the TDMA schedule.

    ADV_SCH received: The number of TDMA schedule messages received from CHs.

    Rank: The position of the node within the TDMA schedule.

    Data sent: The number of data packets sent by the sensor node to its CH.

    Data received: The number of data packets received from the CH.

    Data sent to BS: The number of data packets sent to the base station.

    Distance CH to BS: The distance between the CH and the BS.

    Send code: The sending code of the cluster.

    Attack type: The traffic class of the node; if the node is not an attacker, it is labeled Normal. There are five potential values for this class: Normal, Blackhole, Scheduling (TDMA), Flooding, and Grayhole [1].

    The used dataset incorporates the four denial-of-service (DoS) attack types identified in the LEACH protocol: Flooding, Scheduling (TDMA), Blackhole, and Grayhole attacks. Every one of these attacks will be modeled in this section. The network terrain has been split up into ten sections to guarantee that the attacker nodes are distributed appropriately. Subsequently, the attacker ratios are dispersed at random throughout these locations following the simulation scenarios.

    One type of DoS attack, known as a "Blackhole attack", intercepts the CH role at the start of the round to interfere with the LEACH protocol. In LEACH, data packets from any node must first be transmitted to its CH in order to reach the BS. By adopting the role of CH, the Blackhole attacker drops these packets instead of forwarding them to the BS [26]. Algorithm 1 illustrates the Blackhole attack method. In the simulation environment, the Blackhole attack has been built by randomly introducing attackers at intensities of 10, 30, and 50% of the sensor nodes. All packets relayed through these attackers, while they pose as CHs, are dropped on their way to the BS.

    Algorithm 1 Blackhole attack model
    N refers to Network size
    SN refers to Sensor Node
    MN refers to Malicious Node
    CH refers to Cluster Head
    BS refers to Base Station
    CM refers to a member of the cluster
    NC refers to the list of cluster heads
    x indicates an integer number between 0 and N−1
    ∀ SNi, in range 0 < i ≤ N, compute T(SNi) and a random value rSNi
    IF (rSNi < T(SNi)) THEN
            SNi = CH
    ELSE
            SNi = CM
    ENDIF
    ∀ CHj, where j ∈ NC
    {
    CHj broadcasts the advertisement message (ADV_CH)
    x CMs join CHj
    CHj produces the TDMA schedule
    The x CMs use their assigned TDMA slots to send data to CHj
    }
    }
    IF CHj = MN THEN
            executes the attack by discarding every packet
    ELSE
            delivers collected data to BS
    ENDIF
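    As an illustration of the election and forwarding steps in Algorithm 1, the sketch below combines the standard LEACH threshold T(n) with the Blackhole behavior of dropping every packet; the CH probability p and the bookkeeping of past CH rounds are illustrative assumptions, not parameters taken from the WSN-DS simulations.

```python
import random

P_CH = 0.05  # assumed desired CH ratio per round (illustrative)

def leach_threshold(round_no, was_ch_recently):
    """Standard LEACH threshold T(n): zero if the node served as CH
    within the last 1/P_CH rounds, otherwise p / (1 - p*(r mod 1/p))."""
    if was_ch_recently:
        return 0.0
    return P_CH / (1.0 - P_CH * (round_no % int(1 / P_CH)))

def elect_role(round_no, was_ch_recently):
    """A node becomes CH when its random draw falls below the threshold."""
    r = random.random()
    return "CH" if r < leach_threshold(round_no, was_ch_recently) else "CM"

def ch_forward(packets, malicious):
    """Blackhole behaviour from Algorithm 1: a malicious CH drops every packet."""
    return [] if malicious else packets
```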

    Through a type of DoS attack called a Grayhole attack, an attacker can interfere with the LEACH protocol and act as a CH for other nodes. Consequently, when the fake CH receives data packets from other nodes, it may drop certain packets or parts of packets (selectively or randomly) to keep them from reaching the BS [27]. The Grayhole attack technique is displayed in Algorithm 2. Similar to the Blackhole attack, the Grayhole attack is simulated by randomly injecting attackers at 10, 30, and 50% of the sensor nodes. The decision regarding whether or not to forward a specific packet is made at random; however, a selective judgment can also be made depending on how sensitive the sensed data carried by the packet is.

    Algorithm 2 Grayhole attack model
    N refers to the size of the Network
    SN refers to the node for sensor
    MN refers to the Malicious Node
    CH refers to the Cluster Head
    BS refers to the Base Station
    CM refers to the member in the cluster
    NC refers to the list of the cluster heads
    x refers to an integer value between 0 and N−1
    ∀ SNi, in range 0 < i ≤ N, calculate T(SNi) and a random value rSNi
    IF (rSNi < T(SNi)) THEN
          SNi = CH
    ELSE
          SNi = CM
    ENDIF
    ∀ CHj, where j ∈ NC
    {
    CHj broadcasts the advertisement message (ADV_CH)
    x CMs join CHj
    CHj makes a schedule for TDMA
    x CMs transfer data to CHj during the relevant TDMA time slot
    }
    IF CHj = MN THEN
          drops certain packets to carry out the attack (selectively or randomly)
    ELSE
          delivers compiled data to BS
    ENDIF

    A flooding attack is a type of DoS assault in which the attacker alters the LEACH protocol in different ways. By using strong transmission strength to deliver a high number of advertising CH messages (ADV_CH), this study investigates the effects of flooding assaults. As a result, when sensors pick up a lot of ADV CH signals, they will require more energy and time to determine which CH to join. Furthermore, the attacker tries to fool the victim into choosing it as a CH in an attempt to drain their energy, particularly for nodes that are far away from it [28]. Algorithm 3 demonstrates the operation of the flooding attack algorithm. There are several methods in which flooding assaults have been implemented in the simulation environment. According to certain research, the attacker delivered 10 ADV CH communications, while other investigations show that they sent 50 CH ADV messages, or a number randomly selected between 10 and 50. Theoretically, there will be an increase in the number of ADV CH messages sent and received, and more energy consumption per round based on the ratios of multiple attackers.

    Algorithm 3 Flooding attack model
    N refers to the size of the Network
    SN refers to the node for Sensor
    MN refers to the Malicious Node in the network
    CH refers to the Cluster Head
    BS refers to the Base Station
    CM refers to the member in the cluster
    NC refers to the list of the cluster heads
    x refers to an integer value between 0 and N−1
    ∀ SNi, in range 0 < i ≤ N, calculate T(SNi) and a random value rSNi
    IF (rSNi < T(SNi)) THEN
          SNi = CH
    ELSE
          SNi = CM
    ENDIF
    ∀ CHj, where j ∈ NC
    {
    IF CHj = MN THEN
          CHj transmits a large number of advertisement messages
                     (ADV_CH) with strong transmission power
    ELSE
          CHj transmits a normal advertisement message (ADV_CH)
    ENDIF
    x CMs join CHj
    CHj establishes a TDMA schedule
    x and CMs deliver data to CHj within the relevant TDMA time frame
    }

    Scheduling attacks take place during the setup phase of the LEACH protocol, when CHs develop the TDMA schedules for the data transmission time slots. The attacker, assuming the identity of a CH, gives every node the same time slot for data transfer; to do this, it changes the TDMA scheduling from broadcast to unicast. The packet collisions brought about by these modifications result in data loss. The procedure of the scheduling (TDMA) attack is shown in Algorithm 4. The scheduling attack is executed by setting the same time for each cluster member (CM) to send its data packet. Attackers can affect the network in various ways, such as wasting node energy or dropping data packets, which negatively impacts the services of WSNs. Therefore, a technique to recognize these types of attacks and protect the different WSN services is desperately needed.

    Algorithm 4 Scheduling (TDMA) attack model
    N refers to the size of the Network
    SN refers to the node for sensor
    MN refers to the Malicious Node
    CH refers to the cluster Head
    BS refers to the Base Station
    CM refers to the cluster Member
    NC refers to the list of the cluster heads
    x refers to an integer value between 0 and N−1
    ∀ SNi, 0 < i ≤ N, calculate T(SNi) and a random value rSNi
    IF (rSNi < T(SNi)) THEN
          SNi = CH
    ELSE
          SNi = CM
    ENDIF
    ∀ CHj, where j ∈ NC
    {
    CHj transmits the advertisement message (ADV_CH)
    x CMs join CHj
    IF CHj = MN THEN
          CHj generates a TDMA timetable to execute the attack and gives every node
                     the same time slot for data transmission
    ELSE
          CHj creates a typical TDMA timetable
    ENDIF
    x and CMs transfer data to CHj during the relevant TDMA time slot
    CHj delivers compiled data to BS
    }

    The use of several intelligence and data mining techniques is made possible by the WSN-DS dataset. Consequently, sensor nodes will be more capable of identifying typical behaviors and warning signs of intruders, enabling a prompt to take appropriate action. To evaluate the effectiveness of our proposed model, we compute the accuracy metrics for the used dataset in recognizing and classifying the four distinct types of DoS attacks, based on using improved bi-LSTM in this work [1].

    DARPA produced the 'KDDCup99' data collection using network traffic obtained from the 1998 DARPA evaluation dataset [24,25,29]. Through the preprocessing step, 41 different features are produced for every network connection. The 'KDDCup99' dataset has 4,898,430 records, more than the total number of records in all the other datasets combined. The DoS attack, unauthorized remote access (R2L), unauthorized root access (U2R), and probing (Probe) are the four primary attack categories, as shown in Table 3. Many data mining techniques have been applied to the 'KDDCup99' dataset to detect network traffic breaches.

    Table 3.  Category of KDDCup99 dataset for attacks.
    | Attack category | Attack type |
    |---|---|
    | Denial-of-Service (DoS) | Worm, udpstorm, processtable, apache2, pod, smurf, teardrop, back, neptune, and land (10) |
    | R2L | FTP write, guess password, imap, multihop, phf, spy, warezclient, warezmaster, and snmpguess |
    | Probe | Mscan, saint, nmap, portsweep, ipsweep, and satan (6) |
    | U2R | Loadmodule, sqlattack, ps, xterm, buffer overflow, rootkit, and perl (7) |


    The features of the 'KDDCup99' dataset are split into four subjects: Features dependent on time (23–31), content (features 10–22), hosts (features 32–41), and basic features (features 1–9) [30,31].

    Features of KDDCup99 dataset:

    Duration: The amount of time spent connected.

    Protocol-type: The type of protocol used in the connection.

    Service: The network service on the destination.

    Flag: The status of the connection (normal or error).

    Src-bytes: The number of bytes sent from the source to the destination.

    DST-bytes: The number of bytes sent from the destination to the source.

    Land: It will be set to 1 if the source and destination port numbers and IP addresses match and to 0 otherwise.

    Wrong fragment: The total number of wrong (faulty) fragments in a connection.

    Urgent: The number of packets classified as urgent.

    Hot: The number of "hot" indicators, such as accesses to system directories.

    Num-failed-logins: The total number of failed login attempts.

    Logged-in: The login status (1 in the case of a successful login, 0 otherwise).

    Num compromised: The overall count of situations that have been breached.

    Root shell: The root shell status is 1 when achieved and 0 when not.

    Su-attempted: Value is set to 0 unless the "su-root" command is used, in which case it equals 1.

    The term "num-root" refers to the total number of operations carried out as a root.

    Num-shells: The quantity of shell requests for that particular connection.

    Num-file-creations: The total number of actions taken in the process of creating a file.

    Num-access-files: The number of operations on access control files.

    Num-outbound-cmds: The total number of outbound commands in a file transfer protocol (FTP) session.

    Is-host-login: Set to 1 if the login is as root or admin, and to 0 otherwise.

    Is-guest-login: Set to 1 if a guest logs in; otherwise, set to 0.

    Count: The number of connections to the same destination host as the current connection.

    Srv-count: The quantity of connections made to the same port.

    Serror-rate: The percentage of connections in count (#23) with the flag (#4) activated on any of the connections—s0, s1, s2, or s3.

    Srv-serror-rate: The portion of connections with the Active flag (#4) enabled on s0, s1, s2, or s3 that add up to the Srv count (#24).

    Rerror-rate: The proportion of connections in count (#23) that have the flag (#4) REJ activated. 'REJ' is one of the 'KDDCup99' dataset flags; the full list of flags used in the KDD1999 (or 'KDDCup99') dataset is: ('RSTO', 'RSTOS0', 'REJ', 'SF', 'SH', 'OTH', 'S0', 'S1', 'S2', 'S3', 'RSTR') [32].

    Srv-rerror-rate: The proportion of connections in the srv count (#24) that have the flag (#4) REJ activated.

    Same-srv-rate: The proportion of all connections in the count (#23) that went to the same service.

    Diff-srv-rate is the proportion of all connections in the count that were sent to various services (#23).

    Srv-diff-host-rate: The percentage of connections from the srv count (#24) that were directed to distinct destination computers.

    DST-host-count: The overall number of connections utilizing the IP address of the target host.

    DST-host-same-srv-rate: The proportion of connections in the DST host count (#32) that belonged to the same service.

    DST-host-diff-srv-rate: The percentage of connections in the DST host count (#32) that used different services.

    DST-host-same-src-port-rate: The percentage of connections in the DST host srv count (#33) that used the same source port.

    DST-host-srv-diff-host-rate: The percentage of connections to various destination computers in the total number of DST host SRVs (#33).

    DST-host-serror-rate: The percentage of connections gathered in the DST host count (#32) with the flag (#4) raised on s0, s1, s2, or s3.

    DST-host-srv-serror-rate: The percentage of connections in the DST host srv count (#33) that have the flag (#4) raised on s0, s1, s2, or s3.

    DST-host-rerror-rate: The proportion of connections in the DST host count (#32) with the flag (#4) REJ enabled.

    DST-host-srv-rerror-rate: The percentage of connections in the DST host srv count (#33) that have the flag (#4) REJ enabled.

    Label: Attacks' class label.
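    Among the 41 features listed above, protocol type, service, and flag are categorical while the rest are numeric, so a typical preparation step is to one-hot encode these three columns and integer-encode the label before training. The sketch below assumes the records are available as a CSV file with these column names (kddcup99.csv, protocol_type, service, flag, label); the file name and exact column labels are illustrative and may differ from a particular distribution of the dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Illustrative file/column names; the actual KDDCup99 dump may differ.
df = pd.read_csv("kddcup99.csv")

# One-hot encode the three categorical features of the dataset.
df = pd.get_dummies(df, columns=["protocol_type", "service", "flag"])

# Encode the attack label (Normal, DoS, Probe, U2R, R2L) as integers.
df["label"] = LabelEncoder().fit_transform(df["label"])

X = df.drop(columns=["label"]).to_numpy(dtype="float32")
y = df["label"].to_numpy()
```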

    This section provides a detailed explanation of the suggested DL-based model for WSN intrusion detection. The main focus of our proposed model is to use a new improved Bi-LSTM algorithm to solve the intrusion detection problem in WSN systems. The full details of our proposed model are illustrated in Figure 5.

    Figure 5.  Flowchart of the proposed model.

    The full details of the proposed model are discussed as follows. First, we present a robust deep learning model, an enhanced Bi-LSTM algorithm, for intrusion detection in WSN. Second, the raw 'WSN-DS' dataset is preprocessed by converting the categorical traffic labels into numerical values and preparing it for the model. The architecture of the Bi-LSTM model integrates embedding, bidirectional recurrent layers, and fully connected layers, which leads to better pattern recognition through capturing the temporal dependencies between sequence elements. The accuracy of the model has been observed to increase while its loss keeps decreasing within ten epochs when trained on 80% of the dataset. During this process, the generalization power and intrusion detection abilities of the model can be observed.

    This section introduces our newly proposed model, a potent deep-learning model for intrusion detection in WSN systems. It is built based on a new enhancement for the Bi-LSTM algorithm. In essence, we modified the earlier LSTM model created in [33,34]. Then, the Bi-LSTM model, a specific type of recurrent neural network (RNN), is a fantastic choice for figuring out temporal connections in sequences. Because bi-directionality allows the model to use context from both the past and the future while making predictions, it facilitates a richer sequence comprehension.

    Input Gate ($i_t$) in Eq (4.1): The input gate determines the amount of new data that must be stored in the cell. A Sigmoid activation function is used to generate values between 0 and 1:

    $i_t = \sigma(w_{xi} x_t + w_{hi} h_{t-1} + w_{mi} m_{t-1} + b_i)$. (4.1)

    Forget Gate ($f_t$) in Eq (4.2): The forget gate ensures the retention of necessary information from the previous cell state. Like the input gate, it uses a Sigmoid activation function:

    $f_t = \sigma(w_{xf} x_t + w_{hf} h_{t-1} + w_{mf} m_{t-1} + b_f)$. (4.2)

    Memory Cell ($m_t$) in Eq (4.3): The memory cell updates its state based on the forget ($f_t$) and input ($i_t$) gates:

    $m_t = f_t \odot m_{t-1} + i_t \odot \tanh(w_{xm} x_t + w_{hm} h_{t-1} + b_m)$. (4.3)

    Output Gate ($o_t$) in Eq (4.4): The output gate regulates the amount of data passed to the next layer:

    $o_t = \sigma(w_{xo} x_t + w_{ho} h_{t-1} + w_{mo} m_t + b_o)$. (4.4)

    Hidden State ($h_t$) in Eq (4.5): The hidden state of the LSTM cell is computed using the output gate and memory cell:

    $h_t = o_t \odot \tanh(m_t)$. (4.5)

    Bi-LSTM Output: The final hidden state in a Bi-LSTM is obtained by concatenating the forward and backward LSTM outputs:

    $h^{\text{Bi-LSTM}}_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$. (4.6)

    Classification Layer: The final output is generated by passing the Bi-LSTM features through a dense layer with Softmax activation:

    $y = \mathrm{softmax}(W_{\text{out}} h^{\text{Bi-LSTM}}_t + b_{\text{out}})$. (4.7)
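    To tie Eqs (4.1)–(4.7) together, the following NumPy sketch evaluates one step of the modified cell (note the additional memory-state terms $w_{mi} m_{t-1}$, $w_{mf} m_{t-1}$, and $w_{mo} m_t$ feeding the gates), followed by the concatenation and Softmax layer; the parameter dictionary p and all weight shapes are illustrative placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def improved_cell_step(x_t, h_prev, m_prev, p):
    """One step of the cell in Eqs (4.1)-(4.5); p holds the weight matrices and biases."""
    i_t = sigmoid(p["wxi"] @ x_t + p["whi"] @ h_prev + p["wmi"] @ m_prev + p["bi"])   # Eq (4.1)
    f_t = sigmoid(p["wxf"] @ x_t + p["whf"] @ h_prev + p["wmf"] @ m_prev + p["bf"])   # Eq (4.2)
    m_t = f_t * m_prev + i_t * np.tanh(p["wxm"] @ x_t + p["whm"] @ h_prev + p["bm"])  # Eq (4.3)
    o_t = sigmoid(p["wxo"] @ x_t + p["who"] @ h_prev + p["wmo"] @ m_t + p["bo"])      # Eq (4.4)
    h_t = o_t * np.tanh(m_t)                                                          # Eq (4.5)
    return h_t, m_t

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(h_forward, h_backward, W_out, b_out):
    """Eq (4.6): concatenate the last forward and backward hidden states;
    Eq (4.7): map the concatenated vector to class probabilities."""
    h_bi = np.concatenate([h_forward, h_backward])
    return softmax(W_out @ h_bi + b_out)
```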

    The second-order homogeneous numerical methods proposed in the study of [35] were used to solve differential problems with perturbed turning points. These methods ensure uniform convergence in networks affected by perturbations, which contributes to improving the accuracy of advanced models such as Bi-LSTM algorithms in processing variable data in intrusion detection applications in wireless networks.

    We first extracted and transformed the raw 'WSN-DS' dataset. A three-step preprocessing sequence was executed to make the data suitable for analysis with the Bi-LSTM model. The first step involved turning the linguistic terms for the five categories of traffic (Normal, 'Flooding', 'TDMA', 'Grayhole', and 'Blackhole') found in the dataset's last column into numerical values. This crucial column, now referred to as the traffic class column, became the pivotal point for further investigation and model building. These linguistic terms function as labels for the respective traffic types in every row and are crucial to the classification process of our proposed Bi-LSTM model. The dataset is then divided into training and testing sets: 20% of the data is used as the testing set, and 80% is used as the training set.
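    A hedged sketch of this preprocessing with pandas and scikit-learn is shown below; the file name WSN-DS.csv and the assumption that the traffic class is the last column are illustrative, and the actual CSV layout may differ.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("WSN-DS.csv")                     # illustrative file name

# The last column holds the linguistic traffic labels
# (Normal, Flooding, TDMA, Grayhole, Blackhole); map them to integers.
label_col = df.columns[-1]
df[label_col] = LabelEncoder().fit_transform(df[label_col])

X = df.drop(columns=[label_col]).to_numpy(dtype="float32")
y = df[label_col].to_numpy()

# 80% training / 20% testing split, as used in this study.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)
```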

    The Bi-LSTM model comprises the following layers: the embedding layer, two bidirectional LSTM layers (the first with return_sequences = True), and a dense output layer. The embedding layer, which is the top layer, converts encoded inputs into dense vectors; this crucial step captures the semantic relationships between the elements of the input sequence. The first bidirectional LSTM layer is the second layer. It processes sequences in both directions and, because the return_sequences = True parameter is used, returns an output for every time step instead of only the last one, which is helpful when stacking several LSTM layers. The third layer is the second Bi-LSTM layer. It processes the sequences produced by the previous layer to form a condensed representation: unlike the previous layer, it produces a single output for the whole input sequence. The last layer is the dense layer with a Softmax activation function. This output layer produces a probability distribution over the classes for each input sequence, and the number of units in this layer is determined by the number of classes in the classification task.
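    A minimal Keras sketch of this layer stack might look as follows; the vocabulary size, embedding dimension, and LSTM unit counts are illustrative placeholders rather than the exact hyperparameters of the study.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

NUM_CLASSES = 5          # Normal, Flooding, TDMA, Grayhole, Blackhole
VOCAB_SIZE = 10000       # illustrative: size of the encoded-input vocabulary
EMBED_DIM = 64           # illustrative embedding dimension

model = Sequential([
    Embedding(VOCAB_SIZE, EMBED_DIM),                 # dense vectors for encoded inputs
    Bidirectional(LSTM(64, return_sequences=True)),   # first Bi-LSTM, outputs every time step
    Bidirectional(LSTM(32)),                          # second Bi-LSTM, condensed representation
    Dense(NUM_CLASSES, activation="softmax"),         # class probability distribution
])
```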

    There are certain measures taken to ensure the convergence of the Bi-LSTM model. Hyperparameter tuning is the first step, where learning rate and batch size are tuned, as well as the selection of the Adam optimizer for stabilization and speed-up of convergence. Early stopping is also applied, wherein validation loss is monitored so that overfitting can be prevented, and the training process is halted when validation loss stops decreasing. It's important to incorporate certain techniques to ensure smooth convergence and improve generalization. Regularization methods like L2 regularization and dropout can be beneficial. Additionally, to prevent issues such as exploding or vanishing gradients, we can use gradient clipping and appropriate weight initialization methods like He initialization or Xavier initialization. Finally, both training and validation loss curves are monitored to ensure model convergence is carried out properly without oscillation or divergence.

    Using a batch size of 64, our proposed Bi-LSTM model is trained and validated over ten epochs. Throughout the training phase, the model showed a positive learning trend, steadily increasing its accuracy, and decreasing the training loss throughout subsequent epochs. This suggests that the proposed model is successful in identifying dependencies and patterns in the training set. A standard benchmark is used in the validation process for the proposed model. Accuracy measures and loss across epochs improve the final outputs of the proposed model, particularly the functional aspect of the model with unknown data.
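    Continuing the sketches above (and assuming the inputs have been integer-encoded to suit the embedding layer), the convergence measures and the training configuration described here could be wired together as follows; the learning rate, clipping value, and patience are illustrative choices rather than reported settings.

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

# Adam optimizer with gradient clipping for stability (illustrative values).
optimizer = Adam(learning_rate=1e-3, clipnorm=1.0)
model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping monitors validation loss to guard against overfitting.
early_stop = EarlyStopping(monitor="val_loss", patience=3,
                           restore_best_weights=True)

history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=10, batch_size=64,         # as described in the text
                    callbacks=[early_stop])
```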

    It is ideal for training and validation measures to closely match, as this shows how effectively the model generalizes. Large deviations or gaps can indicate problems with overfitting or improper fitting.

    Accuracy is a critical measure for evaluating the effectiveness of our proposed model in classification functions. It is defined as the proportion of accurately anticipated occurrences to all instances in the dataset. When dealing with unbalanced datasets, accuracy may not always be the optimum statistic, while being an often-used and reasonable indicator of a model's general soundness. Eq (5.1) is the main mathematical formula for computing the accuracy metric.

    $\text{Accuracy} = \dfrac{\text{Count of Accurate Predictions}}{\text{Number of Predictions Overall}}$. (5.1)
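    On the held-out predictions, Eq (5.1) reduces to a simple comparison, continuing the earlier sketches:

```python
import numpy as np

y_pred = model.predict(X_test).argmax(axis=1)   # predicted class per test sample
accuracy = np.mean(y_pred == y_test)            # Eq (5.1): correct predictions / all predictions
print(f"Test accuracy: {accuracy:.4f}")
```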

    The training accuracy curve shows a constant rise, indicating that our proposed model based on enhanced Bi-LSTM is gradually improving its classification accuracy on the training set. This will be fully discussed in Subsection 5.3. Meanwhile, the validation accuracy curve gave information about how well the proposed model was generalized. It showed steady growth over epochs without significant variations in an ideal situation. So, it is better to have a smaller test loss because it shows that the proposed model is correctly predicting the test data. A higher proportion of test accuracy suggests that the model performs well in classifying the test samples. Using this deep learning model, the research achieved 100% test accuracy and 0 test loss.

    Other experiments were done to check the results of the standard LSTM model and to show the benefits of using our proposed Bi-LSTM model. We found the following outcomes: ⅰ) The paper published in [27] used LSTM on the 'WSN-DS' dataset and obtained 96.86% accuracy. ⅱ) Another work published in [36] used the standard LSTM on the 'KDDCup99' dataset and achieved 93.72% accuracy. Furthermore, we implemented new experiments that reapply LSTM as a baseline deep learning model to both used datasets ('WSN-DS' and 'KDDCup99'). The accuracies obtained by applying the LSTM model are 65% for the 'WSN-DS' dataset and 94% for the 'KDDCup99' dataset. These results indicate that the standard LSTM model alone is not adequate for this intrusion detection task. Consequently, these new experiments confirm the advantages of using our newly proposed Bi-LSTM model.

    The 'WSN-DS' dataset has four attack types in addition to normal traffic, as shown in Table 4. The objective of the classification task is to categorize cases into the distinct classes (Blackhole, Flooding, Grayhole, Normal, and TDMA). A binary 2D matrix represents a sample of the test data, where the content of Table 4 can be interpreted as follows:

    Table 4.  WSN-DS attack type.
    | Attack ID | Attack name | Description |
    |---|---|---|
    | 1 | Normal | Normal network behavior |
    | 2 | Flooding | Denial-of-Service attack flooding the network |
    | 3 | TDMA | Time Division Multiple Access attack |
    | 4 | Grayhole | Grayhole attack disrupting network communication |
    | 5 | Blackhole | Blackhole attack absorbing network traffic |


    ● Rows: Each row of the test data denotes a distinct instance or observation. For example, the first row corresponds to the instance with identifier '246296', which could be a sample ID or index.

    ● Columns: Every column denotes a particular kind of attack. The names of the columns are TDMA, Flooding, Blackhole, Normal, and Grayhole.

    The 'KDDCup99' dataset, as shown in Table 5, includes a variety of attack types, each distinguished by a particular label in the 'label' column. 'Normal' denotes normal network behavior, 'DoS' denotes instances of Denial-of-Service attacks, 'Probe' denotes instances of Probe attacks, 'U2R' (User to Root) denotes attempts at unauthorized elevation to administrator or superuser privileges, and 'R2L' (Remote to Local) denotes attempts at unauthorized access from a remote location. The attack-type descriptions in the 'KDDCup99' dataset are organized in the same way as those of the 'WSN-DS' dataset, offering valuable information about the characteristics and nature of each kind of intrusion.

    Table 5.  KDDCup99 attack type.
    Attack ID Attack name Description
    1 Normal Normal network behavior
    2 DoS Denial-of-Service Attack
    3 Probe Probe Attack
    4 U2R User to Root Attack
    5 R2L Remote to Local Attack


    An essential tool for evaluating classification models is the confusion matrix (CM), which provides important information about how well our proposed model performs against various attack types. In this section, we examine the CM metrics closely to give an overview of the model's classification capabilities, as shown in Figure 6 for the 'WSN-DS' dataset and Figure 7 for the 'KDDCup99' dataset.

    Figure 6.  WSN-DS confusion matrix.
    Figure 7.  KDDCup99 confusion matrix.

    The CM provides a more thorough breakdown of the prediction outcomes as follows (the sketch after this list shows how these per-class counts can be extracted):

    ● True positives (TP): Cases correctly classified as the attack type under consideration.

    ● True negatives (TN): Cases correctly identified as not belonging to the attack type under consideration.

    ● False positives (FP): Cases incorrectly assigned to the attack type under consideration although they belong to another class.

    ● False negatives (FN): Cases of the attack type under consideration that are incorrectly assigned to a different class.
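    As an illustration, the following minimal sketch (using scikit-learn with toy labels rather than the actual test split) shows how the per-class TP, FP, FN, and TN counts can be derived from a multi-class confusion matrix such as those in Figures 6 and 7.

```python
# Minimal sketch with toy labels: extracting per-class TP, FP, FN, TN
# from a multi-class confusion matrix (scikit-learn and NumPy assumed).
import numpy as np
from sklearn.metrics import confusion_matrix

labels = ["Normal", "Flooding", "TDMA", "Grayhole", "Blackhole"]              # WSN-DS classes
y_true = ["Normal", "Normal", "Flooding", "Grayhole", "Blackhole", "TDMA"]    # toy ground truth
y_pred = ["Normal", "Flooding", "Flooding", "Grayhole", "Blackhole", "TDMA"]  # toy predictions

cm = confusion_matrix(y_true, y_pred, labels=labels)

tp = np.diag(cm)                       # correctly predicted instances of each class
fp = cm.sum(axis=0) - tp               # predicted as this class but belonging to another
fn = cm.sum(axis=1) - tp               # belonging to this class but predicted as another
tn = cm.sum() - (tp + fp + fn)         # all remaining instances
```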

    In evaluating the accuracy of the proposed model, we can rely on different methodologies, including the a priori and a posteriori error estimates described in [37], as such methods are used to ensure the convergence of the final results. This enhances the effectiveness of the proposed Bi-LSTM model in detecting intrusions in wireless sensor networks. To this end, numerical models have been improved by applying advanced methods, such as the higher-order numerical method proposed in [38] to address quasi-turbulent interaction and diffusion problems with a large time delay.

    The following metrics are calculated for each attack type (a brief computation sketch follows the list):

    ● Recall: The percentage of actual positive cases that the proposed model classifies correctly.

    ● Specificity: The percentage of actual negative cases that the model classifies correctly.

    ● Precision: The percentage of cases predicted as positive that are actually positive.

    ● F1-score: The harmonic mean of precision and recall, providing a balanced assessment of model effectiveness that considers both metrics [39].

    ● Accuracy: The proportion of all predictions that match the true labels.
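    A compact way to obtain these per-class metrics is sketched below; it relies on scikit-learn's classification_report for Precision, Recall, F1-score, and Support (as reported in Tables 6 and 7), while Specificity, which the report does not include, is derived from the confusion matrix. The labels are toy values, not the paper's test data.

```python
# Minimal sketch with toy labels: per-class Precision, Recall, F1-score,
# Support, and Specificity in the style of Tables 6 and 7.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

y_true = np.array([0, 0, 1, 1, 2, 2, 3, 4])   # toy ground-truth class indices
y_pred = np.array([0, 0, 1, 1, 2, 2, 3, 4])   # toy predictions

# Precision, Recall, F1-score, Support per class, plus macro/weighted averages
print(classification_report(y_true, y_pred, digits=4))

# Specificity (true negative rate) per class from the confusion matrix
cm = confusion_matrix(y_true, y_pred)
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
fn = cm.sum(axis=1) - tp
tn = cm.sum() - (tp + fp + fn)
specificity = tn / (tn + fp)
```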

    According to the classification report, our proposed model achieved flawless Recall, F1-score, and Precision for each type of attack. The interpretation of the results for the 'KDDCup99' and 'WSN-DS' datasets is as follows:

    ● Precision (1.00): Every instance classified as a specific attack type actually belongs to that type; there are no false positives in the model's predictions.

    ● Recall (1.00): Every case of a particular attack type is correctly identified; there are no false negatives.

    ● F1-score (1.00): The harmonic mean of precision and recall is perfect, reflecting a well-rounded performance that sacrifices neither precision nor recall.

    ● Accuracy (1.00): The model has exceptional overall accuracy, with every prediction matching the true labels.

    Precision, Recall, and F1-score values of 1.00 indicate perfect classification performance, which is a remarkable accomplishment. It suggests that our proposed approach, based on the enhanced Bi-LSTM model, has learned the patterns in the data extremely well and can differentiate between the various attack types with zero errors. Taking the imbalanced structure of the dataset into account, the macro and weighted averages further demonstrate the overall quality of the model across all classes. The per-attack results are shown in Table 6 for 'WSN-DS' and Table 7 for 'KDDCup99', and the corresponding performance data are also depicted in Figure 8 and Figure 9.

    Table 6.  WSN-DS precision-recall analysis.
    Attack name Recall F1-score Precision Support
    TDMA 100% 100% 100% 1309
    Flooding 100% 100% 100% 641
    Grayhole 100% 100% 100% 2935
    Blackhole 100% 100% 100% 2043
    Normal 100% 100% 100% 67965
    Accuracy 100% 74933
    Macro Avg 100% 100% 100% 74933
    Weighted Avg 100% 100% 100% 74933

    Table 7.  KDDCup99 precision-recall analysis.
    Class Recall F1-score Precision Support
    Denial of service 100% 100% 100% 78354
    Normal 100% 100% 100% 19366
    Probe 100% 100% 100% 834
    U2R 100% 100% 100% 15
    R2L 100% 100% 100% 235
    Accuracy 100% 98804
    Macro Avg 100% 100% 100% 98804
    Weighted Avg 100% 100% 100% 98804

    Figure 8.  WSN-DS performance matrices by attack type.
    Figure 9.  KDDCup99 performance matrices by attack type.

    The precision-recall curves show that our proposed model can effectively balance recall and precision across the various attack scenarios; higher values indicate better performance. The overall performance is evaluated using the area under the precision-recall curve (AUC-PR): a higher AUC-PR indicates greater success in striking a balance between recall and precision, and values closer to 1.0, which our proposed Bi-LSTM model achieves, indicate a model that captures positive cases while avoiding false positives. Precision-recall curve analysis is an effective way to complement the classification report, as it provides more information about how the model behaves at different classification thresholds. The precision-recall curve is shown in Figure 10 for the 'WSN-DS' dataset and in Figure 11 for the 'KDDCup99' dataset.

    Figure 10.  WSN-DS precision-recall curve.
    Figure 11.  KDDCup99 precision-recall curve.
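    One way to reproduce such curves is a one-vs-rest analysis over the softmax outputs. The sketch below, using toy scores and placeholder variable names rather than the paper's pipeline, computes for each class the precision-recall curve and the average precision, a standard estimate of AUC-PR.

```python
# Minimal sketch with toy data: one-vs-rest precision-recall curves and
# average precision (an estimate of AUC-PR) per class.
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.preprocessing import label_binarize

y_true = np.array([0, 1, 2, 2, 1, 0])              # toy class indices
y_score = np.array([[0.9, 0.05, 0.05],             # toy softmax outputs, one row per sample
                    [0.1, 0.80, 0.10],
                    [0.2, 0.20, 0.60],
                    [0.1, 0.30, 0.60],
                    [0.2, 0.70, 0.10],
                    [0.8, 0.10, 0.10]])
y_onehot = label_binarize(y_true, classes=[0, 1, 2])

for k in range(y_score.shape[1]):
    precision, recall, _ = precision_recall_curve(y_onehot[:, k], y_score[:, k])
    ap = average_precision_score(y_onehot[:, k], y_score[:, k])
    print(f"class {k}: average precision (AUC-PR estimate) = {ap:.3f}")
```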

    A receiver operating characteristic (ROC) curve is a graph that illustrates the performance of a classification model across all classification thresholds. The curve plots two parameters: the true positive rate and the false positive rate. Since the True Positive Rate (TPR) and Recall are interchangeable, Recall can be expressed in terms of the TPR metric as follows:

    $$\text{TPR} = \frac{TP}{TP + FN}. \tag{5.2}$$

    Then, the False Positive Rate (FPR) is defined as follows:

    $$\text{FPR} = \frac{FP}{FP + TN}. \tag{5.3}$$

    The ROC curve plots TPR versus FPR at different classification thresholds. Lowering the classification threshold results in more instances being classified as positive, which raises both the number of true positives and the number of false positives. One performance indicator used to assess a classification model's ability to discriminate between classes is the area under the ROC curve (AUC-ROC). AUC-ROC values range between 0 and 1: a model with an AUC of 0.5 performs no better than random guessing, whereas a model with an AUC of 1.0 is a perfect discriminator. Computing and analyzing the AUC-ROC values for each attack type provides an evaluation of the proposed model's capacity to discern between the various kinds of network attacks, as depicted in Figure 12 for the 'WSN-DS' dataset and Figure 13 for the 'KDDCup99' dataset.

    Figure 12.  ROC curve for 'WSN-DS' dataset.
    Figure 13.  ROC curve for 'KDDCup99' dataset.
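    The corresponding one-vs-rest computation for the ROC analysis is sketched below, again with toy scores: roc_curve evaluates Eqs (5.2) and (5.3) at every threshold, and roc_auc_score returns the per-class AUC-ROC.

```python
# Minimal sketch with toy data: one-vs-rest ROC curves and AUC-ROC per class.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.preprocessing import label_binarize

y_true = np.array([0, 1, 2, 2, 1, 0])              # toy class indices
y_score = np.array([[0.9, 0.05, 0.05],             # toy softmax outputs
                    [0.1, 0.80, 0.10],
                    [0.2, 0.20, 0.60],
                    [0.1, 0.30, 0.60],
                    [0.2, 0.70, 0.10],
                    [0.8, 0.10, 0.10]])
y_onehot = label_binarize(y_true, classes=[0, 1, 2])

for k in range(y_score.shape[1]):
    fpr, tpr, _ = roc_curve(y_onehot[:, k], y_score[:, k])   # TPR and FPR, Eqs (5.2)-(5.3)
    auc_k = roc_auc_score(y_onehot[:, k], y_score[:, k])
    print(f"class {k}: AUC-ROC = {auc_k:.3f}")
```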

    The learning curve is a helpful tool for assessing training dynamics and performance over multiple epochs. The accuracy learning curve is best explained as follows. ⅰ) Training Accuracy: as the model progresses through epochs, its training accuracy grows steadily, suggesting that it is learning effectively from the training set. ⅱ) Validation Accuracy: the validation accuracy shows a similar rising trend, indicating that the model can effectively generalize to previously unseen data. The convergence of the training and validation accuracies indicates that the model has learned and generalized without overfitting. A gradually decreasing training loss shows that the model is reducing the errors made during training, and a downward-trending validation loss indicates good generalization ability. If the training and validation losses converge, the model is unlikely to be overfitting the training set. In this study, our proposed deep learning model based on Bi-LSTM produced an ideal learning curve, as shown in Figure 14 for the 'WSN-DS' dataset and Figure 15 for the 'KDDCup99' dataset.

    Figure 14.  WSN-DS learning curve.
    Figure 15.  Learning curve for 'KDDCup99' dataset.
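    As a practical note, learning curves of this form can be obtained directly from the History object that Keras' model.fit returns. The sketch below assumes the metric names 'accuracy' and 'val_accuracy' and plots the accuracy and loss curves over epochs.

```python
# Minimal sketch: plotting accuracy and loss learning curves from a Keras
# History object (`history` is assumed to be the return value of model.fit
# trained with validation data).
import matplotlib.pyplot as plt

def plot_learning_curves(history):
    """Plot training/validation accuracy and loss over epochs."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    ax_acc.plot(history.history["accuracy"], label="training")
    ax_acc.plot(history.history["val_accuracy"], label="validation")
    ax_acc.set_xlabel("epoch"); ax_acc.set_ylabel("accuracy"); ax_acc.legend()
    ax_loss.plot(history.history["loss"], label="training")
    ax_loss.plot(history.history["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch"); ax_loss.set_ylabel("loss"); ax_loss.legend()
    plt.tight_layout()
    plt.show()
```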

    Comparing our Bi-LSTM-based proposal with other similar works on the same WSN-DS dataset, as shown in Table 8, the intrusion detection accuracy reached 100%, the highest score in intrusion classification. To further support the proposed algorithm, it was also applied to another dataset, the 'KDDCup99' dataset; compared with similar works, a high classification accuracy of up to 99.9% is achieved, as shown in Table 9.

    Table 8.  Comparison of Bi-LSTM algorithms and other methods using WSN-DS dataset.
    Reference Method Used dataset Accuracy
    [10] Deep learning for WSN intrusion detection. WSN-DS The best accuracy is 69.76%
    [4] Machine learning algorithms for WSN security. WSN-DS datasets. 95.3%
    [17] SLGBM for WSN intrusion detection. WSN-DS dataset. The best accuracy is 99.8%
    [1] WSN-DS dataset for evaluation. WSN-DS dataset. The best accuracy is 99.4% for flooding attacks
    [27] CNN, DNN, RNN, and CNN+RNN (LSTM) WSN-DS 96.86%
    Our Proposed model Bi-LSTM model for evaluation. WSN-DS dataset. The best accuracy is 100.0%

    Table 9.  Comparison of Bi-LSTM algorithms and other methods using 'KDDCup99' dataset.
    Reference Method Used dataset Accuracy
    [3] CSGO-LSVM for enhanced intrusion detection. NSL-KDD. 0.11
    [4] Machine learning algorithms for WSN security. KDDCup99 dataset. 92.9%
    [6] Adaptive intrusion detection in WSNs. Dataset called 'KDDCup99'. NA
    [36] CNN, LSTM-SA KDDCup99 93.72%
    Our proposed model Bi-LSTM model for Evaluation. 'KDDCup99' dataset. The accuracy is 99.9%


    Table 10 presents the hyperparameters used for training the proposed Bi-LSTM model for intrusion detection in wireless sensor networks. It includes the learning rate, which controls the speed of weight updates during training, the optimizer used to enhance model performance, and the batch size (BS), which affects training stability and learning efficiency. These values were carefully chosen to balance convergence speed and model accuracy: the 'Adam' optimizer was selected to ensure stable weight updates, while a BS of 64 was used to optimize training efficiency without excessive memory consumption. A configuration sketch is given after the table.

    Table 10.  Hyperparameter settings.
    Hyperparameter Value
    Learning rate 0.001
    Batch size 64
    Optimizer Adam optimizer
    Dropout rate 0.5
    L2 regularization 0.0001
    Number of epochs 20
    Hidden layer size [256, 128, 64, 32]
    Gradient clipping 5.0
    Weight initialization Default (Keras)

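    To make the table concrete, the sketch below shows one way these settings could be wired into a Keras model. The input shape, number of classes, and the mapping of the hidden-layer sizes [256, 128, 64, 32] onto Bi-LSTM and dense layers are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical configuration sketch based on Table 10: Adam (lr = 0.001,
# gradient clipping 5.0), dropout 0.5, L2 regularization 0.0001, batch size 64,
# 20 epochs. Input shape, class count, and layer arrangement are placeholders.
from tensorflow.keras import layers, models, optimizers, regularizers

n_timesteps, n_features, n_classes = 1, 18, 5      # placeholder dimensions
l2 = regularizers.l2(1e-4)

model = models.Sequential([
    layers.Input(shape=(n_timesteps, n_features)),
    layers.Bidirectional(layers.LSTM(256, return_sequences=True, kernel_regularizer=l2)),
    layers.Dropout(0.5),
    layers.Bidirectional(layers.LSTM(128, kernel_regularizer=l2)),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu", kernel_regularizer=l2),
    layers.Dense(32, activation="relu", kernel_regularizer=l2),
    layers.Dense(n_classes, activation="softmax"),  # Keras default weight initialization
])

model.compile(
    optimizer=optimizers.Adam(learning_rate=1e-3, clipnorm=5.0),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
#                     batch_size=64, epochs=20)
```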

    This study relied on several assumptions to establish the efficacy and feasibility of the proposed improved Bi-LSTM for intrusion detection in WSNs. The main assumption is that the datasets used for training and testing reflect intrusion scenarios found in real situations, thereby making the model applicable in practice. In addition, all sensors were presumed to operate under normal conditions, without hardware problems or communication delays, which cannot always be guaranteed in real deployments. Feature extraction was also presumed to capture patterns adequate for the proposed Bi-LSTM model to identify intrusions. Some research papers, such as [40], have investigated the computational complexity of the Bi-LSTM model compared with other deep learning models. Finally, the training set was presumed to contain a large collection of possible intrusion patterns, so that our proposed model would be robust against widely documented attacks.

    Likewise, sufficient computational resources were presumed to be available for training and testing to minimize performance bottlenecks. Despite these assumptions, this study also has several limitations. One serious limitation is dataset bias, since the available datasets do not capture all forms of intrusion. Another limitation is the computational complexity of Bi-LSTM [40], whose real-time performance restricts its use in resource-constrained WSN scenarios. Additionally, environmental factors such as node failures, congestion, or attacks might affect the model's accuracy in real-world deployments. Lastly, scalability is a concern, since the proposed approach may have difficulty handling very large networks with numerous nodes and dynamic configurations.

    This study enhanced the security and reliability of the communication flow in WSNs. The system employs new improvements to the Bi-LSTM model, a powerful type of RNN that analyzes the input data from both a forward and a backward perspective. This improved Bi-LSTM design helps the model learn sequence dependencies and contextual relationships in network traffic more effectively. The Bi-LSTM architecture has two LSTM networks: one processes the data forward and the other processes it backward, allowing for a better understanding of intrusion patterns. Experimental results reveal that our proposed approach effectively detects malicious data transmissions with nearly perfect accuracy: 100% on the 'WSN-DS' dataset and 99.9% on the 'KDDCup99' dataset. This reflects the effectiveness of the Bi-LSTM as a strong solution for WSN intrusion detection. A key lesson from this study is the trade-off between the proposed model's performance and its computational cost, which can affect real-time use in energy-limited WSNs.

    Future research should investigate optimization methods such as pruning and quantization, make detection robust against additional types of attacks, and further reduce false positives and false negatives. Models should also be adapted to various environments and hardened against adversarial attacks; these directions present good opportunities for future work.

    All authors contributed equally to this research study, and also to the writing and revising of the paper.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The authors are grateful for the professional work of the anonymous reviewers, as their valuable comments and suggestions have significantly enhanced the presentation of this research study.

    The authors declare there is no conflict of interest.



    [1] I. Almomani, B. Al-Kasasbeh, M. Al-Akhras, Wsn-DS: A dataset for intrusion detection systems in wireless sensor networks, J. Sens., 2016 (2016), 4731953. https://doi.org/10.1155/2016/4731953 doi: 10.1155/2016/4731953
    [2] K. M. Nahar, R. M. Al-Khatib, M. Barhoush, A. A. Halin, Mpf-leach: Modified probability function for cluster head election in leach protocol, Int. J. Comput. Appl. Technol., 60 (2019), 267–280. https://doi.org/10.1504/IJCAT.2019.100295 doi: 10.1504/IJCAT.2019.100295
    [3] D. Hemanand, G. V. Reddy, S. S. Babu, K. R. Balmuri, T. Chitra, S. Gopalakrishnan, An intelligent intrusion detection and classification system using csgo-lsvm model for wireless sensor networks (WSNs), Int. J. Intell. Syst. Appl. Eng., 10 (2022), 285–293.
    [4] M. S. Alsahli, M. M. Almasri, M. Al-Akhras, A. I. Al-Issa, M. Alawairdhi, Evaluation of machine learning algorithms for intrusion detection system in WSN, Int. J. Adv. Comput. Sci. Appl., 12 (2021).
    [5] R. M. Al-Khatib, N. E. A. Al-qudah, M. S. Jawarneh, A. Al-Khateeb, A novel improved lemurs optimization algorithm for feature selection problems, J. King Saud. Univ. Comput. Inf. Sci., 35 (2023), 101704. https://doi.org/10.1016/j.jksuci.2023.101704 doi: 10.1016/j.jksuci.2023.101704
    [6] P. Nancy, S. Muthurajkumar, S. Ganapathy, S. Santhosh Kumar, M. Selvi, K. Arputharaj, Intrusion detection using dynamic feature selection and fuzzy temporal decision tree classification for wireless sensor networks, IET Commun., 14 (2020), 888–895. https://doi.org/10.1049/iet-com.2019.0172 doi: 10.1049/iet-com.2019.0172
    [7] A. Migdady, A. Al-Aiad, R. M. Al-Khatib, Efficientnet deep learning model for pneumothorax disease detection in chest X-rays images, Int. J. Bus. Inf. Syst., 2022.
    [8] N. K. Mittal, A survey on wireless sensor network for community intrusion detection systems, in 2016 3rd International Conference on Recent Advances in Information Technology (RAIT), Dhanbad, India, (2016), 107–111. https://doi.org/10.1109/RAIT.2016.7507884
    [9] K. M. Nahar, R. M. Al-Khatib, M. A. Al-Shannaq, M. M. Barhoush, An efficient holy Quran recitation recognizer based on SVM learning model, Jordanian J. Comput. Inf. Technol., 6 (2020), 395–417.
    [10] H. S. Sharma, A. Sarkar, M. M. Singh, An efficient deep learning-based solution for network intrusion detection in wireless sensor network, Int. J. Syst. Assur. Eng. Manage., 14 (2023), 2423–2446. https://doi.org/10.1007/s13198-023-02090-0 doi: 10.1007/s13198-023-02090-0
    [11] M. Lopez-Martin, A. Sanchez-Esguevillas, J. I. Arribas, B. Carro, Supervised contrastive learning over prototype-label embeddings for network intrusion detection, Inf. Fusion, 79 (2022), 200–228. https://doi.org/10.1016/j.inffus.2021.09.014 doi: 10.1016/j.inffus.2021.09.014
    [12] M. L. Martin, A. Sanchez-Esguevillas, B. Carro, Review of methods to predict connectivity of iot wireless devices, Novel Appl. Mach. Learn. Network Traffic Anal. Prediction, (2017), 95.
    [13] R. M. Al-Khatib, M. A. Al-Betar, M. A. Awadallah, K. M. Nahar, M. M. A. Shquier, A. M. Manasrah, et al., MGA-TSP: Modernised genetic algorithm for the travelling salesman problem, Int. J. Reasoning-based Intell. Syst., 11 (2019), 215–226. https://doi.org/10.1504/IJRIS.2019.102541 doi: 10.1504/IJRIS.2019.102541
    [14] D. Sarkar, S. Kumar, P. Das, H. Ramos, Higher-order convergence analysis for interior and boundary layers in a semi-linear reaction-diffusion system networked by a k-star graph with non-smooth source terms, Networks Heterogen. Media, 19 (2024), 1085–1115. https://doi.org/10.3934/nhm.2024048 doi: 10.3934/nhm.2024048
    [15] S. Kumar, P. Das, K. Kumar, Adaptive mesh based efficient approximations for darcy scale precipitation-dissolution models in porous media, Int. J. Numer. Methods Fluids, 96 (2024)a, 1415–1444. https://doi.org/10.1002/fld.5294
    [16] S. Kumar, P. Das, A uniformly convergent analysis for multiple scale parabolic singularly perturbed convection-diffusion coupled systems: Optimal accuracy with less computational time, Appl. Numer. Math., 207 (2025), 534–557. https://doi.org/10.1016/j.apnum.2024.09.020 doi: 10.1016/j.apnum.2024.09.020
    [17] S. Jiang, J. Zhao, X. Xu, Slgbm: An intrusion detection mechanism for wireless sensor networks in smart environments, IEEE Access, 8 (2020), 169548–169558. 10.1109/ACCESS.2020.3024219 doi: 10.1109/ACCESS.2020.3024219
    [18] S. Saini, P. Das, S. Kumar, Parameter uniform higher order numerical treatment for singularly perturbed robin type parabolic reaction diffusion multiple scale problems with large delay in time, Appl. Numer. Math., 196 (2024), 1–21. https://doi.org/10.1016/j.apnum.2023.10.003 doi: 10.1016/j.apnum.2023.10.003
    [19] K. Kumar, P. C. Podila, P. Das, H. Ramos, A graded mesh refinement approach for boundary layer originated singularly perturbed time-delayed parabolic convection diffusion problems, Math. Methods Appl. Sci., 44 (2021), 12332–12350. https://doi.org/10.1002/mma.7358 doi: 10.1002/mma.7358
    [20] M. S. Jawarneh, S. M. Shah, M. M. Aljawarneh, R. M. Al-Khatib, M. G. Al-Bashayreh, Rib bone extraction towards liver isolating in ct scans using active contour segmentation methods, Int. J. Adv. Comput. Sci. Appl., 16 (2025).
    [21] C. Cai, Y. Tao, T. Zhu, Z. Deng, Short-term load forecasting based on deep learning bidirectional lstm neural network, Appl. Sci., 11 (2021), 8129. https://doi.org/10.3390/app11178129 doi: 10.3390/app11178129
    [22] R. M. Al-Khatib, N. K. T. El-Omari, M. A. Al-Betar, Innovative cloud computing object-oriented model to unify heterogeneous data, Int. J. Oper. Res., 46 (2023), 289–322. https://doi.org/10.1504/IJOR.2023.129410 doi: 10.1504/IJOR.2023.129410
    [23] N. A. Alrajeh, S. Khan, B. Shams, Intrusion detection systems in wireless sensor networks: A review, Int. J. Distrib. Sens. Netw., 9 (2013), 167575. https://doi.org/10.1155/2013/167575 doi: 10.1155/2013/167575
    [24] W. Lee, S. J. Stolfo, K. W. Mok, Mining in a data-flow environment: Experience in network intrusion detection, in Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (1999), 114–124.
    [25] W. Lee, S. J. Stolfo, K. W. Mok, A data mining framework for building intrusion detection models, in Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No. 99CB36344), Oakland, CA, USA, (1999), 120–132. https://doi.org/10.1109/SECPRI.1999.766909
    [26] M. Dener, C. Okur, S. Al, A. Orman, WSN-BFSF: A new dataset for attacks detection in wireless sensor networks, IEEE Internet Things J., 11 (2023), 2109–2125. https://doi.org/10.1109/JIOT.2023.329220 doi: 10.1109/JIOT.2023.329220
    [27] S. Salmi, L. Oughdir, Performance evaluation of deep learning techniques for dos attacks detection in wireless sensor network, J. Big Data, 10 (2023), 1–25. https://doi.org/10.1186/s40537-023-00692-w doi: 10.1186/s40537-023-00692-w
    [28] H. S. Sharma, M. M. Singh, A. Sarkar, Machine learning-based dos attack detection techniques in wireless sensor network: A review, in Proceedings of the International Conference on Cognitive and Intelligent Computing: ICCIC 2021, Springer Nature Singapore, Singapore, 2 (2023), 583–591.
    [29] B. J. Santhosh, Kddcup99, 2025. Available from: https://ieee-dataport.org/documents/kddcup99#files.
    [30] S. Choudhary, N. Kesswani, Analysis of KDD-Cup'99, NSL-KDD and UNSW-NB15 datasets using deep learning in LoT, Procedia Comput. Sci., 167 (2020), 1561–1573. https://doi.org/10.1016/j.procs.2020.03.367 doi: 10.1016/j.procs.2020.03.367
    [31] H. Rashaideh, A. Sawaie, M. A. Al-Betar, L. M. Abualigah, M. M. Al-Laham, R. M. Al-Khatib, et al., A grey wolf optimizer for text document clustering, Int. J. Intell. Syst., 29 (2020), 814–830. https://doi.org/10.1515/jisys-2018-0194 doi: 10.1515/jisys-2018-0194
    [32] S. Overflow, Kdd1999 dataset features exolaination, 2025. Available from: https://stackoverflow.com/questions/17024961/kdd1999-dataset-features-exolaination.
    [33] K. F, Long-short-term memory (lstmwsn_1), 2023. Available from: https://www.kaggle.com/code/kamalfstudify/lstmwsn-1.
    [34] Q. Abuein, M. Ra'ed, A. Migdady, M. S. Jawarneh, A. Al-Khateeb, Arsa-tweets: A novel arabic sarcasm detection system based on deep learning model, Heliyon, 10 (2024), e36892. https://doi.org/10.1016/j.heliyon.2024.e36892 doi: 10.1016/j.heliyon.2024.e36892
    [35] P. Das, An a posteriori based convergence analysis for a nonlinear singularly perturbed system of delay differential equations on an adaptive mesh, Numer. Algorithms, 81 (2019), 465–487. https://doi.org/10.1007/s11075-018-0557-4 doi: 10.1007/s11075-018-0557-4
    [36] K. L. Chiew, B. Hui, An improved network intrusion detection method based on cnn-lstm-sa, J. Adv. Res. Appl. Sci. Eng. Technol., 1 (2025), 225–238. https://doi.org/10.37934/araset.44.1.225238 doi: 10.37934/araset.44.1.225238
    [37] S. Kumar, S. Kumar, P. Das, Second-order a priori and a posteriori error estimations for integral boundary value problems of nonlinear singularly perturbed parameterized form, Numer. Algorithms, (2024)b, 1–28. https://doi.org/10.1007/s11075-024-01918-5
    [38] S. Kumar, P. Das, Impact of mixed boundary conditions and nonsmooth data on layer-originated nonpremixed combustion problems: Higher-order convergence analysis, Stud. Appl. Math., 153 (2024), e12763. https://doi.org/10.1111/sapm.12763 doi: 10.1111/sapm.12763
    [39] I. Abu Doush, M. A. Al-Betar, M. A. Awadallah, A. I. Hammouri, R. M. Al-Khatib, S. ElMustafa, et al., Harmony search algorithm for patient admission scheduling problem, Int. J. Intell. Syst., 29 (2019), 540–553. https://doi.org/10.1515/jisys-2018-0094 doi: 10.1515/jisys-2018-0094
    [40] S. Siami-Namini, N. Tavakoli, A. S. Namin, The performance of lstm and bilstm in forecasting time series, in 2019 IEEE International Conference on Big Data (Big Data), (2019), 3285–3292. https://doi.org/10.1109/BigData47090.2019.9005997
    © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).