
Citation: Caterina Calgaro, Claire Colin, Emmanuel Creusé. A combined finite volume - finite element scheme for a low-Mach system involving a Joule term[J]. AIMS Mathematics, 2020, 5(1): 311-331. doi: 10.3934/math.2020021
[1] | Ali Al Khabyah . Mathematical aspects and topological properties of two chemical networks. AIMS Mathematics, 2023, 8(2): 4666-4681. doi: 10.3934/math.2023230 |
[2] | Zhibin Du, Ayu Ameliatul Shahilah Ahmad Jamri, Roslan Hasni, Doost Ali Mojdeh . Maximal first Zagreb index of trees with given Roman domination number. AIMS Mathematics, 2022, 7(7): 11801-11812. doi: 10.3934/math.2022658 |
[3] | Zeeshan Saleem Mufti, Ali Tabraiz, Qin Xin, Bander Almutairi, Rukhshanda Anjum . Fuzzy topological analysis of pizza graph. AIMS Mathematics, 2023, 8(6): 12841-12856. doi: 10.3934/math.2023647 |
[4] | Ali N. A. Koam, Ali Ahmad, Azeem Haider, Moin A. Ansari . Computation of eccentric topological indices of zero-divisor graphs based on their edges. AIMS Mathematics, 2022, 7(7): 11509-11518. doi: 10.3934/math.2022641 |
[5] | Muhammad Kamran Jamil, Muhammad Imran, Aisha Javed, Roslan Hasni . On the first general Zagreb eccentricity index. AIMS Mathematics, 2021, 6(1): 532-542. doi: 10.3934/math.2021032 |
[6] | Muhammad Umar Mirza, Rukhshanda Anjum, Maged Z. Youssef, Turki Alsuraiheed . A comprehensive study on fuzzy and crisp graph indices: generalized formulae, proximity and accuracy analysis. AIMS Mathematics, 2023, 8(12): 30922-30939. doi: 10.3934/math.20231582 |
[7] | Edil D. Molina, José M. Rodríguez-García, José M. Sigarreta, Sergio J. Torralbas Fitz . On the Gutman-Milovanović index and chemical applications. AIMS Mathematics, 2025, 10(2): 1998-2020. doi: 10.3934/math.2025094 |
[8] | Zhen Lin . The biharmonic index of connected graphs. AIMS Mathematics, 2022, 7(4): 6050-6065. doi: 10.3934/math.2022337 |
[9] | Chenxu Yang, Meng Ji, Kinkar Chandra Das, Yaping Mao . Extreme graphs on the Sombor indices. AIMS Mathematics, 2022, 7(10): 19126-19146. doi: 10.3934/math.20221050 |
[10] | Milica Anđelić, Tamara Koledin, Zoran Stanić . Notes on Hamiltonian threshold and chain graphs. AIMS Mathematics, 2021, 6(5): 5078-5087. doi: 10.3934/math.2021300 |
The field of optimization plays an important role in formulating many daily life problems in different fields of science, engineering, medicine, and biology. The main goal of optimization is to find the best possible or optimal solution by taking the maximization or minimization from all possible solutions under certain constraints.
Many techniques have been developed to solve optimization problems such as dynamic programming, greedy technique, integer programming, and metaheuristics. The major challenges for solving many optimization problems are (1) the high computation time that is required to find an optimal solution, and (2) the low accuracy of the approximate solution when manipulating large-size problems. Bioinformatics is a research field that has many problems that can be formulated as optimization tasks. Examples of optimization problems in bioinformatics are molecular docking [1], deoxyribonucleic acid (DNA) motifs [2,3,4], ribonucleic acid (RNA) structure comparisons [5,6], and RNA structure prediction [7].
We focus on the RNA structure comparison problem. The study of molecular similarity permits the classification of molecules into groups, estimation of their evolutionary history, identification of functional motifs, and thus prediction of their biological function [8].
RNA is a single-stranded polymer that consists of four different nucleotides—adenine (A), guanine (G), cytosine (C), and uracil (U). These nucleotides are connected by phosphodiester bonds. The shape of the RNA structure is determined when the RNA folds back on itself [1]. In this case, a hydrogen bond exists between two interacting nucleotides and forms Watson-Crick (G-C and A-U) and wobble (G-U) base pairs [9].
In general, various computational models have been used to formulate the problem of RNA structure comparison, such as the tree [10,11,12], arc-annotated sequence (AAS) [13], and alignment-free [12,14]. The research in this paper is interested in the algorithms that solve the RNA structure comparison based on AAS. In this case, the RNA structure comparison problem is defined as follows [13]: Given two RNAs as AASs, the goal is to determine the maximum common subsequence between two AASs under the condition that all arcs connecting the subsequence's nucleotides of RNA are preserved. This goal is named the longest arc-preserving common subsequence (LAPCS).
There are two special cases of LAPCS [15]: c-fragment LAPCS, and c-diagonal LAPCS. In c-fragment LAPCS, is a LAPCS such that the fragment bases in the first sequence are only permitted to match fragment bases at the same location in the second sequence, where each sequence is divided into fragments, each of size c, except the last one, where c≥1. In c-diagonal LAPCS, is a LAPCS, such that the base ai is only permitted to match a base in the range bi−c,…,bi+c, where c≥0.
In AAS, many algorithms have been proposed based on different approaches [4,5,13,16,17,18,19]: (1) Type of proposed algorithm: exact or approximation. (2) Type or level of RNA structure: Crossing, nested, chain, and plain (see Figure 1). (3) Type of platform used: Sequential or high-performance systems. Table 1 illustrates the different proposed algorithms to solve RNA structure alignment based on the LAPCS model. From Table 1, the following points are observed: (1) The running time for exact algorithms is non-polynomial in the general case of RNA structure level, crossing type. (2) No experimental studies for finding the optimal solution in the general case. (3) The running time for a heuristic solution requires high computational time in the general case. (4) The best-known heuristic algorithm for RNA structure comparison in the general case is the algorithm proposed by Blum and Blesa [5], named the BB algorithm. (5) All proposed algorithms are designed and implemented on a computer with a single processor and no parallel implementation for any algorithm based on LAPCS.
Ref | Level of RNA Structure | Exact/Approximation | Time | Technique | Experimental Study |
[13] | (Crossing, T) | Exact | NP-hard | MIS | No |
[16] | (Nested, Nested) | Exact | NP-hard | MIS | No |
[17] | (Nested, Chain) | Exact | O(lAl3B) | DP | No |
[18] | (Plain, Plain) | Exact | O(lAlB) | DP | No |
[17] | (Crossing, Crossing) | Heuristic | O(lAlB) | DP and MIS | Yes |
[5] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and MIS | Yes |
[19] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and Maximum Clique | Yes |
[4] | (Crossing, Plain) | Heuristic | O(lAlB) | DP and Lookup table | Yes |
Note: *: T∈{Crossing, Nested, Chain, Plain}, DP: Dynamic programming, MIS: Maximal independent set, NP: non-deterministic polynomial. |
The novelty of this research paper focuses on designing a new parallel heuristic algorithm for three goals. The first is reducing the running time of the best-known heuristic algorithm, BB. The second is increasing the accuracy of the output of the best-known heuristic algorithm, BB. The third is to the best of our knowledge, no previous parallel algorithms have been proposed based on LAPCS. To verify these goals, we used the high-performance computing methodology to design a parallel algorithm for solving RNA structure alignment problem. The developed algorithm is based on two levels of parallelism and implemented on a multicore system of 16 threads. The results show that the parallel proposed algorithm outperforms the BB algorithm from running time and accuracy perspectives.
The organization of the paper includes an introduction, four sections, and a conclusion. The mathematical and computer background of the RNA comparison structure problem is given in Section 2. The details of the proposed parallel algorithm are introduced in Section 3. In Section 4, the experimental configurations used in the experiments including simulated and real data are given. The results of the experiments are discussed and analyzed in Section 5. Finally, the conclusion of using the high-performance system in RNA comparison is given.
In this section, a set of definitions and mathematical formula related to the LAPCS are given as follows.
Assume that the set of nucleotides of RNA is Σ={A,C,G,U} and the set RNA structure type is T={Crossing,Nested,Chain, Plain}.
A subsequence A′=a′1a′2…a′l of a string A=a1a2…alA is generated by deleting lA−l symbols from A. A common subsequence C of two strings A and B is a subsequence that appears in both strings and is formally represented as:
C={ci,j=(i,j), s.t. ai=bj,1≤i≤lA,1≤j≤lB}. | (1) |
A longest common subsequence (LCS) C of two strings A and B is a common subsequence between A and B and has a maximal length. Formally,
C={ci,j=(i,j) , s.t.ai=bj,1≤i≤lA,1≤j≤lB} and |C|≥|C′|,∀common subsequenceC′. | (2) |
The LCS for two substrings a1a2…ai and b1b2…bj is computed by the dynamic programming (DP) technique using the following equation.
LCS[i,j]={0i=0orj=0LCS[i−1,j−1]+1ai=bjandi,j≥1Max{LCS[i−1,j],LCS[i,j−1]}ai≠bjandi,j≥1 | (3) |
An arc-annotated sequence (AAS) is a pair (A,PA), where (1) A is a string (sequence) of length lA over Σ, A=a1a2…alA. (2) PA is the set of arcs, where each arc represents an unordered pair of any two complementary nucleotides. Formally, the set PA is defined as follows:
PA={(i,j):1≤i<j≤lA,andaiandajarecomplementary}. | (4) |
∀(i1,j1),(i2,j2)∈PA:(i1=i2⟺j1=j2)andi1≠j2. | (5) |
A common subsequence C of two arc-annotated sequences, (A,PA) and (B,PB), is named an arc-preserving common subsequence (APCS) if C is a subsequence for A and B and preserves all the arcs that link subsequence nucleotides. Formally,
Eq (2) and ∀ci1,j1,ci2,j2∈C,(i1,i2)∈PA⟺(j1,j2)∈PB. | (6) |
The APCS with maximal length is named the longest arc-preserving common subsequence, LAPCS. Formally,
C is APCS and |C|≥|C′|, for every APCSC′. | (7) |
The relation between two arcs is as follows [20].
● Two arcs (i1,j1) and (i2,j2) are crossing iff i1 < i2 < j1<j2 or i2 < i1 < j2<j1.
● Two arcs (i1,j1) and (i2,j2) are nested iff i2<i1<j2⟺i2<j1<j2,fori1<j1andi2<j2.
● Two arcs (i1,j1) and (i2,j2) are chain iff i1<i2⟺j1<i2,fori2<j2.
The main purpose of this section is to describe in detail how to use the parallel concept to develop an efficient parallel algorithm based on the best-known algorithm for RNA comparison, the BB algorithm. The proposed algorithm named PBB.
The BB algorithm consists of many steps, where the main two steps are as follows. The first step is generating a number, nsol, of common subsequences, S, from the two strings A and B. The second step uses the Cplex tool [21] to find the APCS for the output of the first step based on the concept of the maximum independent set (MIS). The MIS can be determined by construct a graph G = (V, E), where (1) V is the set of matched pairs that are built from the common subsequences, V={vi,j=(i,j):(i,j)∈S}, and (2) E={(vi,j,vi′,j′):(i,j)and(i′,j′)arebreakingarc−preservingconditions, (i,j)and(i′,j′)∈S}. Then, using integer linear program model, the problem can be formulated as a selection of a maximal number of non-conflicting binary variables [6]. The two main steps are repeated many times which is based on the length of the input string A, lA seconds. In each iteration, the algorithm updates the set of solutions and determines the best value of the solution by comparing the best old solution with the current values of the solution. Figure 2 illustrates the execution of BB algorithm on two RNAs of length 15 each.
The BB algorithm uses four parameters [5]: (1) nsol is the number of APCS constructions per iteration. (2) tmax is the maximum time allowed to find MIS using Cplex tool when the number of common subsequences is greater than or equal to nsol. (3) drate and lsize are two parameters used in the process of generation of a random common subsequence.
The proposed algorithm is based on using two levels of parallelism as follows. Assume that the number of threads is k, and k = k1*k2. The reason for writing k as a product of integer numbers is to make the parallelism in two levels. The first level is based on parallelizing the generation of common subsequences from A and B. This means that when generating nsol common subsequences only as in the BB algorithm, the proposed algorithm generates k1*nsol common subsequences such that each thread generates nsol common subsequences. The second level is how to use parallelization to find APCS using the MIS method.
Additionally, in the BB algorithm, the two main steps are repeated many times based on the length of string A. Therefore, the proposed parallel algorithm reduces this repetition time to approximately lA/k1 seconds. Also, the proposed PBB algorithm uses some parallel subroutines such as parallel maximal independent set, PMIS, algorithm [21], parallel binary tree technique [22,23], and parallel LCS, PLCS, algorithm [24,25].
The complete steps for the proposed PBB algorithm, PBB, are as follows.
Algorithm: PBB |
Input: Two AASs (A,PA) and (B,PB), and three integers k, k1 and k2 such that k = k1*k2 and k1, k2≥2. |
Step 1: Set the values of parameters in a constant time, nsol, tmax, drate and lsize, based on the length of A as suggested in [5], where |A| ≥ |B|, see [5, Table 4]. |
Step 2: Generate the set of matched pairs between A and B using k threads. This step can be performed by dividing B into k substrings, Bi, of approximately equal size, |B|/k. Then each thread finds the set of matched pairs, Ri, between A and Bi. Finally, R is the union set of all matched pairs, R=⋃ki=1Ri, generated by k threads that can be computed by the parallel binary tree (PBT) paradigm. |
Step 3: Find the LCS of A and B by applying the parallel LCS (PLCS) algorithm using k threads. The list L represents the set of match pairs of LCS of A and B. |
Step 4: Apply the parallel MIS, PMIS, algorithm on list L using the Cplex tool and set the results to the current solution Sbest. |
Step 5: Each thread ti, 1≤i≤k1, performs the following: |
Step 5.1: Initialization: timer=0, Si_best=Sbest, Si=Sbest. |
Step 5.2: Repeat the following steps until timer is greater than or equal to |A|/k1 seconds: |
Step 5.2.1.: Repeat the following nsol iterations: |
Step 5.2.1.1: Generate a random common subsequence, say Ci, from the list R as in [5]. |
Step 5.2.1.2: Apply the PMIS algorithm on Ci using k2 threads and obtain the list LCi. |
Step 5.2.1.3: Update Si=Si∪LCi. |
Step 5.2.2: Apply the PMIS algorithm on Si using k2 threads within time limit tmax and obtain the list LSi. |
Step 5.2.3: Update Si_best with LSi if |LSi|>|Si_best|. |
Step 6: Calculate Sbest=⋃k1i=1Si_best using k threads and the PBT paradigm. |
Step 7: Apply the PMIS algorithm on Sbest using k threads within time limit tmax and obtain the list LAPCS. |
Output: LAPCS |
Note that in the case of using the PBT paradigm using k threads, see Steps 2 and 6. If the value of k1 is small, then the proposed algorithm will perform these steps sequentially. Also, Figure 3 shows the flow chart of the proposed parallel algorithm.
In this section, we describe the experimental setup used to implement the proposed parallel algorithm on a multicore system. The experimental setup includes the platform used in the implementation, data generation used in the comparison, and the number of threads used in each level of parallelism.
For the platform used in the implementation, the experimental studies are based on a multicore system that can execute 16 threads in parallel using a processor with a speed of 2.4 GHz and a memory capacity of 24 GB. The system works under the Linux operating system. All algorithms were programmed using the Java programming language. The Java thread features were used to implement the parallel region. Additionally, all compared algorithms used IBM ILOG CPLEX v12.8 [21] as a tool to find a good heuristic solution for the MIS problem in sequential and parallel cases
For data generation, experimental comparisons between the algorithms are performed using two different datasets. The first dataset is generated as artificial data that is used for two purposes: (1) Evaluating the proposed algorithm compared to the best-known algorithm for RNA structure comparison, the BB algorithm; and (2) determining the best approach to assign 16 threads for the parallel case as two levels of parallelism. The second dataset is a real biological data for RNA structures that is used to ensure that the proposed parallel algorithm is also efficient for real data in terms of time and accuracy.
For artificial data, two parameters affect the generation of the RNA sequence: The sequence length, n, and the number of arcs, m. For fixed values of n and m, the system will generate an RNA sequence of length n containing m arcs such that the type of RNA structure is crossing. The set of values of n is {100,200,300,400,500}, while the set of values of m depends on n and is equal to n/2, n/5, and n/10. The sequence will be generated randomly and the appearance of each letter in the sequence is 1/4.
For fixed values of n and m, the running time of the compared algorithms is measured by taking the average value for 30 instances. Therefore, for a fixed value of n, the running time of the algorithm is the average of 90 instances because there are three values of m. Therefore, there are 5 × 90 = 450 comparisons of RNA structures. Additionally, the length of APCS is computed for each m to study the effect of the number of arcs on the performance of each algorithm.
To implement the parallel proposed algorithm, PBB, on a multicore system consisting of 16 threads, the number of threads, k = k1*k2, can be represented in different ways as follows: (1) 16 = 2 × 8, (2) 16 = 4 × 4, and (3) 16 = 8 × 2. In general, k1 and k2 are used for the first and second levels of parallelism, respectively. Therefore, the parallel algorithm can be represented as three parallel versions, PBB1, PBB2, and PBB3 for 16 = 2 × 8, 16 = 4 × 4, and 16 = 8 × 2, respectively. Thus, the PBB1 algorithm uses 2 threads in the first level of parallelism and 8 threads in the second level of parallelism.
For real data of RNA structures, experimental studies focus on different real data as, shown in Table 2. The Ribonuclease P RNA database contains a large number of RNAs; therefore, we selected 12 RNAs only. For Group Ⅰ introns (Group A, B, and E), we selected all RNAs in the dataset. The details of each selected RNA, such as name, length, and number of arcs, are provided in Appendix A.
RNA Database | Number of Selected RNAs | Total Number of Comparisons | Location | Ref |
Ribonuclease P RNA | 12 (out of 470) | 66 | http://www.rnasoft.ca/strand/ | [26] |
Group i introns (A) | 12 | 66 | https://crw2-comparative-rna-web.org/group-i-introns/ | [27] |
Group i introns (B) | 21 | 210 | ||
Group i introns (E) | 6 | 15 |
For each RNA group that consists of β RNAs, the total number of possible comparisons between each pair is given by:
(β2)=β(β−1)/2. | (8) |
Therefore, the methodology used in the experimental study applies the two algorithms, sequential and parallel, to all pairs of RNAs. For example, the RNA database named "Group Ⅰ introns (Group E)" contains 6 RNAs: M.anisopliae.4, C.hypophloia, E.nigra, H.rubra, M.anisopliae.2, and C.parasitica. Therefore, the total number of comparisons is 15 as follows. (1) Five comparisons between M.anisopliae.4 and every RNA in {C.hypophloia, E.nigra, H.rubra, M.anisopliae.2, C.parasitica}. (2) Four comparisons between C.hypophloia and every RNA in {E.nigra, H.rubra, M.anisopliae.2, C.parasitica}. (3) Three comparisons between E.nigra and every RNA in {H.rubra, M.anisopliae.2, C.parasitica}. (4) Two comparisons between H.rubra and every RNA in {M.anisopliae.2, C.parasitica}. (5) One comparison between M.anisopliae.2 and C.parasitica.
In general, the experimental studies on real datasets include 357 comparisons to find the LAPCS. For each pair of RNAs in real dataset, two measurements are calculated. The first measure is the length of the APCS, while the second measure is the running time of executing the algorithm.
In this section, the results and analysis of the experimental studies on artificial data and real data are discussed in the next two subsections.
The results of comparing four algorithms, one sequential and three parallel, are shown in Figure 4 and Table 3. The analysis of data results in the figure and table indicates the following.
n | m | BB | PBB1 | PBB2 | PBB3 |
100 | n/10 | 59.3 | 59.3 | 59.3 | 59.3 |
n/5 | 56.3 | 56.6 | 56.5 | 56.5 | |
n/2 | 46.2 | 46.5 | 46.6 | 46.8 | |
200 | n/10 | 121.4 | 121.3 | 121.5 | 121.6 |
n/5 | 112.4 | 112.3 | 113.0 | 113.1 | |
n/2 | 89.4 | 89.4 | 89.9 | 89.1 | |
300 | n/10 | 180.5 | 181.2 | 181.5 | 181.1 |
n/5 | 168.9 | 169.4 | 169.0 | 169.4 | |
n/2 | 134.7 | 135.0 | 135.3 | 135.4 | |
400 | n/10 | 236.5 | 238.2 | 237.5 | 237.6 |
n/5 | 218.9 | 220.6 | 219.3 | 218.6 | |
n/2 | 166.9 | 170.9 | 169.6 | 168.2 | |
500 | n/10 | 291.4 | 294.7 | 296.2 | 293.8 |
n/5 | 264.9 | 271.1 | 273.5 | 270.1 | |
n/2 | 199.3 | 207.5 | 203.3 | 203.5 |
From the running time perspective, the results, in Figure 4, illustrate the following observations. First, the running time of all parallel algorithms is less than the running time of the sequential algorithm. For example, the running time for the BB algorithm is 335.6 second when n = 300, whereas the running times for the three parallel algorithms, PBB1, PBB2, and PBB3, are 186.3,111.5, and 82.0, seconds, respectively. Second, the PBB3 algorithm is faster than the PBB2 algorithm, and the PBB2 algorithm is faster than the PBB1 algorithm. For example, the running time for PBB3 algorithm is 127.4 seconds when n = 500, while the running times for the two other algorithms, PBB2, and PBB1, are 184.2 and 308.3 seconds, respectively. Third, the percentage of improvements, on average, for the three parallel algorithms, PBB1, PBB2, and PBB3, are 44.78%, 66.97%, and 77.60%, respectively, where the percentage of improvement is measured by 1-Tpar/Tseq. Fourth, the average values of speed up for the parallel algorithms, PBB1, PBB2, and PBB3, are 1.8, 3, and 4.5, respectively, where the speed up is equal to Tseq/Tpar.
From the length of output viewpoint, the results in Table 3 illustrate the following observations. First, the length of the APCS generated by parallel algorithms is approximately equal to the length of the APCS generated by the sequential algorithm when n ≤ 200. For example, the average length of APCS, for 30 instances, generated by the four algorithms, BB, PBB1, PBB2, and PBB3, are 46.20, 46.53, 46.60, and 46.77, respectively, when n = 100, m = n/2. Second, the length of the APCS generated by parallel algorithms is greater than the length of the APCS generated by sequential algorithm when n > 200. For example, the average length of APCS, for 30 instances, generated by the four algorithms, BB, PBB1, PBB2, and PBB3, are 166.9,170.9,169.6, and 168.2, respectively, when n = 400, m = n/2. Third, the difference between the length of the APCS generated by parallel algorithms and the BB algorithm increases with increasing values of n. For example, the average difference between the length of the APCS generated by BB algorithm and PBB1 algorithm is 2.5 when n = 400, whereas the difference equal to 5.9 when n = 500. Fourth, the length of the APCS generated by the PBB1 and PBB2 algorithms is almost greater than that generated by the PBB3 algorithm.
A non-parametric statistical test known as the Wilcoxon signed-rank test [27] was employed to ascertain whether there exist statistically significant variations in the length of the output for the four algorithms, BB, PBB1, PBB2, and PBB3. The significant level used in the test is equal to 0.05. The results of implementing the test on each pair of algorithms, six pairs of algorithms, show the following observations (see additional file “math-09-05-550-supplementary”). (1) There was a significant difference between all parallel algorithms, PBB1, PBB2, and PBB3, and the sequential algorithm, BB; except in one case, there is no significant difference between BB and PBB1 when n = 200. (2) In the case of the two algorithms, PBB1 and PBB2, the PBB2 algorithm is better than the PBB1 algorithm when n = 200 and 300, while the PBB1 algorithm is better than the PBB2 algorithm when n = 400. Otherwise, there is no significant difference between the two algorithms. (3) In the case of the two algorithms, PBB1 and PBB3, the PBB3 algorithm is better than the PBB1 algorithm when n = 200, while the PBB1 algorithm is better than the PBB3 algorithm when n = 400 and 500. Otherwise, there is no significant difference between the two algorithms. (4) In the case of the two algorithms, PBB2 and PBB3, the PBB3 algorithm is better than the PBB2 algorithm when n = 200, while the PBB2 algorithm is better than the PBB3 algorithm when n = 400 and 500. Otherwise, there is no significant difference between the two algorithms.
From the memory required by each algorithm viewpoint, Table 4 illustrates the values of the memories in GB. The results show the following observations: First, the memory required by the BB algorithm is less than all parallel algorithms, PBB1, PBB2, and PBB3. Second, the memory required by the PBB1 algorithm is less than that required by the PBB2 algorithm, and the memory required by the PBB2 algorithm is less than that required by the PBB3 algorithm. The memory required for the PBB3 algorithm is high compared to other parallel algorithms due to the manipulation of 8 APCS simultaneously using the Cplex tool, while the two other algorithms manipulate 4 and 2 APCS.
n | BB | PBB1 | PBB2 | PBB3 |
100 | 0.14 | 0.16 | 0.23 | 0.31 |
200 | 0.14 | 0.15 | 0.24 | 0.36 |
300 | 0.15 | 0.19 | 0.34 | 0.55 |
400 | 0.17 | 0.18 | 0.53 | 0.72 |
500 | 0.26 | 0.28 | 0.53 | 0.73 |
As a result, from the analysis of previous data, the parallel algorithms PBB1 and PBB2 have good performance from the length of output measurement compared to the other algorithm. Additionally, the PBB2 algorithm has better performance than the PBB1 algorithm from a running time perspective, which is more important than memory because the amount of storage is not high for all parallel algorithms. Therefore, the parallel algorithm PBB2 was selected to evaluate the parallelization of the BB algorithm for the real dataset as in the next subsection.
In this subsection, a comparison between the sequential algorithm and the selected parallel algorithm, PBB2, is performed to verify that the parallelism enhances the sequential algorithm from the points of view of the length of the output and running time. Table 5 shows the results of two measurements, time and length of LAPCS, for two algorithms, BB and PBB2, on four datasets of real RNAs.
RNA Database | Running Time in Seconds | Length of LAPCS | ||||||
BB | PBB2 | % of Improvement |
Difference (PBB2-BB) | Diff = 0 | Diff > 0 | CV | ||
BB | PBB2 | |||||||
Ribonuclease P RNA | 398.1 | 107.5 | 73% | [0, 7] | 50% | 50% | 0.179 | 0.175 |
Group i introns (Group A) | 1003.2 | 259.7 | 74% | [0, 7] | 45.4% | 54.6% | 0.362 | 0.362 |
Group i introns (Group B) | 1263 | 343.7 | 71.4% | [0, 10] | 68.1% | 31.9% | 0.633 | 0.632 |
Group i introns (Group E) | 551.1 | 183.7 | 66.6% | [0, 5] | 53.3% | 46.7% | 0.163 | 0.161 |
For the running time measurement, Table 5 shows the running time of both algorithms and the percentage of improvement in the running for the proposed parallel algorithm PBB2 compared to BB algorithm. For example, the two algorithms, BB and PBB2, were run on the Ribonuclease P RNA dataset and obtained the following results. (1) For 66 cases, the average running times of the BB and PBB2 algorithms are 398.1 and 107.5 seconds, respectively. (2) The PBB2 algorithm outperforms the BB algorithm with a percentage of improvement of 73%. Additionally, the running times for BB algorithm on the two dataset, Group A and B, are higher than the other dataset because the two datasets contain RNA with length greater than 1000 (see Appendix A).
For the length of APCS measurement, Table 5 displays the range of variation between the PBB2 and BB algorithms' outputs for the length of APCS measurement, as well as the percentage of cases where the PBB2 algorithm's generated LAPCS is longer (or equal to) the BB algorithm's generated LAPCS. For example, the two algorithms, BB and PBB2, were run on the Group i introns (Group A) dataset and obtained the following results: (1) The length of LAPCS generated by the PBB2 algorithm is greater than or equal to the output of the BB algorithm, with a difference from 0 to 7. (2) In 45.4% of the comparison cases, both algorithms generate the LAPCS with the same lengths. On the other hand, the PBB2 algorithm generates LAPCS with a length greater than that generated from the BB algorithm, with a percentage of 54.6%. Additionally, the difference between PBB2 and BB algorithms is sometimes large, such as in the Group i introns (Group B) dataset, where the maximum difference is 10. (3) The results of measuring the coefficient of variation (CV) of both algorithms for the length of LAPCS is almost equal. (4) The PBB2 algorithm has a significant difference compared to BB algorithm when we use Wilcoxon signed-rank test for all cases, except one case when the dataset is Group i introns (Group E).
On average, in all cases, the PBB2 algorithm outperforms the BB algorithm in terms of running time, with an improvement of approximately 71%. Additionally, the PBB2 algorithm generates LAPCS with a length greater than that generated by the BB algorithm, with at least 1 in 45.8% of the cases.
Identifying the similarity structure between two RNA structures is challenging in bioinformatics due to the high computational time required to find an optimal solution. In this paper, the RNA structure is represented as the longest arc-preserving common subsequence model. Then high-performance computing paradigms and metaheuristic techniques are utilized to develop an efficient parallel algorithm on a multicore system. The developed parallel algorithm outperforms the best-known sequential algorithm in terms of the running time and length of output.
Additionally, the developed algorithm was tested on artificial and real data to measure the percentage of improvement in the running 71% on average and increasing the length of output by approximately 45% of all cases. On the other side, the proposed algorithm required a little bit more space.
There are many open research questions based on the contributions of this paper: How can we replace the MIS method with another heuristic algorithm to increase the efficiency of the proposed algorithm? (2) How can we use the graphic processing units (GPUs) to implement the parallel proposed algorithm?
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors are grateful to the referees for their valuable comments that helped to improve the paper. The authors would like to acknowledge the support provided by Deputy for Research & Innovation, Ministry of Education through Initiative of Institutional Funding at University of Ha'il–Saudi Arabia through project number IFP-22 187.
This work was supported by Deputy for Research & Innovation, Ministry of Education through Initiative of Institutional Funding at University of Ha'il–Saudi Arabia through project number IFP-22 187.
The authors declare that they have no conflicts of interest.
RNA Database | Name of RNA | length | #arcs |
Ribonuclease P RNA | Allochromatium_vinosum | 369 | 119 |
Bacteroides_thetaiotaomicron | 361 | 121 | |
Porphyromonas_gingivalis | 398 | 131 | |
Haemophilus_influenza | 377 | 124 | |
Halococcus_morrhuae | 475 | 154 | |
Haloferax_volcanii | 433 | 142 | |
Mycoplasma_genitalium | 384 | 119 | |
Mycoplasma_pneumoniae | 369 | 112 | |
Serratia_marcescens | 378 | 225 | |
Shewanella_putrefaciens | 354 | 115 | |
Streptomyces_bikiniensis | 397 | 135 | |
Streptomyces_lividans | 405 | 138 | |
Group Ⅰ intron (Group A) | m.S.cerevisiae | 1210 | 101 |
b.Bacteriophage.RB3 | 1115 | 69 | |
b.coliphage.T4.A2.NRDD | 1058 | 92 | |
c.C.eugametos | 1053 | 82 | |
b.coliphage.T4.A2.TD | 1036 | 77 | |
C.reinhardii | 953 | 97 | |
S.luteus 2449 | 611 | 107 | |
coliphage | 607 | 70 | |
Bacteriophage.SP0 | 908 | 81 | |
m.S.cerevisiae | 803 | 100 | |
S.luteus 2504 | 373 | 104 | |
Bacteriophage.beta | 403 | 87 | |
Group Ⅰ intron (Group B) | m.P.anserina | 2630 | 66 |
c.C.moewusii | 1844 | 110 | |
m.T.papilionaceus | 1780 | 74 | |
m.M.senile | 1710 | 86 | |
m.N.crassa | 1180 | 59 | |
m.E.nidulans | 1105 | 57 | |
m.S.cerevisiae | 1075 | 82 | |
C.albicans | 459 | 127 | |
S.negevensis | 736 | 93 | |
S.luteus.1974 | 623 | 89 | |
N.aquatica | 416 | 116 | |
S.luteus.1923 | 555 | 148 | |
C.pallidostigmatica | 924 | 64 | |
C.saccharophila | 427 | 101 | |
A.castellanii | 846 | 75 | |
C.dubliniensis | 621 | 114 | |
S.pombe | 343 | 68 | |
M.senile | 880 | 94 | |
S.luteus.2500 | 617 | 191 | |
M.grisea | 285 | 66 | |
W.mrakii | 409 | 72 | |
Group Ⅰ intron (Group E) | M.anisopliae.4 | 430 | 91 |
C.hypophloia | 526 | 138 | |
E.nigra | 523 | 145 | |
H.rubra | 557 | 136 | |
M.anisopliae.2 | 429 | 86 | |
C.parasitica | 640 | 124 |
[1] |
M. Avila, J. Principe and R. Codina, A finite element dynamical nonlinear subscale approximation for the low Mach number flow equations, J. Comput. Phys., 230 (2011), 7988-8009. doi: 10.1016/j.jcp.2011.06.032
![]() |
[2] |
A. Beccantini, E. Studer, S. Gounand, et al. Numerical simulations of a transient injection flow at low Mach number regime, Int. J. Numer. Meth. Eng., 76 (2008), 662-696. doi: 10.1002/nme.2331
![]() |
[3] | A. Bradji and R. Herbin, Discretization of coupled heat and electrical diffusion problems by finiteelement and finite-volume methods, IMA J. Numer. Anal., 28 (2008), 469-495. |
[4] |
D. Bresch, E. H. Essoufi and M. Sy, Effect of density dependent viscosities on multiphasic incompressible fluid models, J. Math. Fluid Mech., 9 (2007), 377-397. doi: 10.1007/s00021-005-0204-4
![]() |
[5] |
D. Bresch, V. Giovangigli and E. Zatorska, Two-velocity hydrodynamics in fluid mechanics: Part I Well posedness for zero Mach number systems, J. Math. Pure. Appl., 104 (2015), 762-800. doi: 10.1016/j.matpur.2015.05.003
![]() |
[6] |
C. Calgaro, E. Chane-Kane, E. Creusé, et al. L∞-stability of vertex-based MUSCL finite volume schemes on unstructured grids: simulation of incompressible flows with high density ratios, J. Comput. Phys., 229 (2010), 6027-6046. doi: 10.1016/j.jcp.2010.04.034
![]() |
[7] |
C. Calgaro, C. Colin and E. Creusé, A combined finite volumes - finite elements method for a low-Mach model, Int. J. Numer. Meth. Fl., 90 (2019), 1-21. doi: 10.1002/fld.4706
![]() |
[8] |
C. Calgaro, C. Colin, E. Creusé, et al. Approximation by an iterative method of a low-Mach model with temperature dependant viscosity, Math. Method. Appl. Sci., 42 (2019), 250-271. doi: 10.1002/mma.5342
![]() |
[9] |
C. Calgaro, E. Creusé and T. Goudon, An hybrid finite volume-finite element method for variable density incompressible flows, J. Comput. Phys., 227 (2008), 4671-4696. doi: 10.1016/j.jcp.2008.01.017
![]() |
[10] |
C. Calgaro, E. Creusé and T. Goudon, Modeling and simulation of mixture flows: application to powder-snow avalanches, Comput. Fluids, 107 (2015), 100-122. doi: 10.1016/j.compfluid.2014.10.008
![]() |
[11] |
C. Calgaro, M. Ezzoug and E. Zahrouni, Stability and convergence of an hybrid finite volumefinite element method for a multiphasic incompressible fluid model, Commun. Pure Appl. Anal., 17 (2018), 429-448. doi: 10.3934/cpaa.2018024
![]() |
[12] | C. Cancès and C. Guichard, Convergence of a nonlinear entropy diminishing control volume finite element scheme for solving anisotropic degenerate parabolic equations, Math. Comput., 85 (2016), 549-580. |
[13] |
C. Chainais-Hillairet, Discrete duality finite volume schemes for two-dimensional drift-diffusion and energy-transport models, Int. J. Numer. Meth. Fl., 59 (2009), 239-257. doi: 10.1002/fld.1393
![]() |
[14] |
C. Chainais-Hillairet, Y.-J. Peng and I. Violet, Numerical solutions of Euler-Poisson systems for potential flows, Appl. Numer. Math., 59 (2009), 301-315. doi: 10.1016/j.apnum.2008.02.006
![]() |
[15] |
D. R. Chenoweth and S. Paolucci, Natural convection in an enclosed vertical air layer with large horizontal temperature differences, J. Fluid Mech., 169 (1986), 173-210. doi: 10.1017/S0022112086000587
![]() |
[16] | P. G. Ciarlet, Introduction to numerical linear algebra and optimisation, Cambridge Texts in Applied Mathematics. Cambridge University Press, Cambridge, 1989. |
[17] | C. Colin, Analyse et simulation numérique par méthode combinée Volumes Finis - Eléments Finis de modèles de type Faible Mach, PhD thesis, Université de Lille, 2019. |
[18] |
Y. Coudière, J.-P. Vila and P. Villedieu, Convergence rate of a finite volume scheme for a two-dimensional convection-diffusion problem, ESAIM: Mathematical Modelling and Numerical Analysis, 33 (1999), 493-516. doi: 10.1051/m2an:1999149
![]() |
[19] | R. Danchin and X. Liao, On the well-posedness of the full low Mach number limit system in general critical Besov spaces, Commun. Contemp. Math., 14 (2012), 1250022. |
[20] |
S. Dellacherie, On a diphasic low Mach number system, ESAIM: Mathematical Modelling and Numerical Analysis, 39 (2005), 487-514. doi: 10.1051/m2an:2005020
![]() |
[21] |
K. Domelevo and P. Omnes, A finite volume method for the Laplace equation on almost arbitrary two-dimensional grids, ESAIM: Mathematical Modelling and Numerical Analysis, 39 (2005), 1203-1249. doi: 10.1051/m2an:2005047
![]() |
[22] |
J. Droniou and R. Eymard, A mixed finite volume scheme for anisotropic diffusion problems on any grid, Numer. Math., 105 (2006), 35-71. doi: 10.1007/s00211-006-0034-1
![]() |
[23] |
P. Embid, Well-posedness of the nonlinear equations for zero Mach number combustion, Commun. Part. Diff. Eq., 12 (1987), 1227-1283. doi: 10.1080/03605308708820526
![]() |
[24] |
R. Eymard and T. Gallouët, H-convergence and numerical schemes for elliptic problems, SIAM J. Numer. Anal., 41 (2003), 539-562. doi: 10.1137/S0036142901397083
![]() |
[25] | R. Eymard, T. Gallouët and R. Herbin, Finite volume methods, Handbook of numerical analysis, 7 (2000), 713-1018. |
[26] |
R. Eymard, T. Gallouët and R. Herbin, A cell-centered finite-volume approximation for anisotropic diffusion operators on unstructured meshes in any space dimension, IMA J. Numer. Anal., 26 (2006), 326-353. doi: 10.1093/imanum/dri036
![]() |
[27] | R. Herbin, J.-C. Latché and K. Saleh, Low Mach number limit of a pressure correction MAC scheme for compressible barotropic flows. In: Finite volumes for complex applications VIII- methods and theoretical aspects, volume 199 of Springer Proc. Math. Stat., pages 255-263. Springer, Cham, 2017. |
[28] | R. Herbin, J.-C. Latché and K. Saleh, Low Mach number limit of some staggered schemes for compressible barotropic flows, arXiv preprint, arXiv:1803.09568. |
[29] |
V. Heuveline, On higher-order mixed FEM for low Mach number flows: application to a natural convection benchmark problem, Int. J. Numer. Meth. Fl., 41 (2003), 1339-1356. doi: 10.1002/fld.454
![]() |
[30] |
F. Huang and W. Tan, On the strong solution of the ghost effect system, SIAM J. Math. Anal., 49 (2017), 3496-3526. doi: 10.1137/16M106964X
![]() |
[31] |
P. Le Quéré, C. Weisman, H. Paillère, et al. Modelling of natural convection flows with large temperature differences: a benchmark problem for low Mach number solvers. Part I. Reference solutions, ESAIM: Mathematical Modelling and Numerical Analysis, 39 (2005), 609-616. doi: 10.1051/m2an:2005027
![]() |
[32] |
C. D. Levermore, W. Sun and K. Trivisa, Local well-posedness of a ghost effect system, Indiana Univ. Math. J., 60 (2011), 517-576. doi: 10.1512/iumj.2011.60.4179
![]() |
[33] | P.-L. Lions, Mathematical topics in fluid mechanics: Volume 2: Compressible models, volume 10 of Oxford Lecture Series in Mathematics and its Applications. The Clarendon Press, Oxford University Press, New York, 1998. |
[34] |
A. Majda and J. Sethian, The derivation and numerical solution of the equations for zero Mach number combustion, Combust. Sci. Technol., 42 (1985), 185-205. doi: 10.1080/00102208508960376
![]() |
1. | Xiujun Zhang, Zhiqiang Zhang, Natarajan Chidambaram, Subhasri Jaganathan, Narasimhan Devadoss, Vignesh Ravi, On degree and distance-based topological indices of certain interconnection networks, 2022, 137, 2190-5444, 10.1140/epjp/s13360-022-03010-0 | |
2. | Kinkar Chandra Das, Sourav Mondal, Zahid Raza, On Zagreb connection indices, 2022, 137, 2190-5444, 10.1140/epjp/s13360-022-03437-5 | |
3. | Sourav Mondal, Kinkar Chandra Das, Zagreb connection indices in structure property modelling, 2023, 69, 1598-5865, 3005, 10.1007/s12190-023-01869-5 | |
4. | Deepa Balasubramaniyan, Natarajan Chidambaram, Vignesh Ravi, Muhammad Kamran Siddiqui, QSPR analysis of anti‐asthmatic drugs using some new distance‐based topological indices: A comparative study, 2024, 124, 0020-7608, 10.1002/qua.27372 | |
5. | Yaşar NACAROĞLU, ON LEAP ZAGREB INDICES OF A SPECIAL GRAPH OBTAINED BY SEMIGROUPS, 2023, 6, 2618-5660, 16, 10.33773/jum.1333260 |
Ref | Level of RNA Structure | Exact/Approximation | Time | Technique | Experimental Study |
[13] | (Crossing, T) | Exact | NP-hard | MIS | No |
[16] | (Nested, Nested) | Exact | NP-hard | MIS | No |
[17] | (Nested, Chain) | Exact | O(lAl3B) | DP | No |
[18] | (Plain, Plain) | Exact | O(lAlB) | DP | No |
[17] | (Crossing, Crossing) | Heuristic | O(lAlB) | DP and MIS | Yes |
[5] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and MIS | Yes |
[19] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and Maximum Clique | Yes |
[4] | (Crossing, Plain) | Heuristic | O(lAlB) | DP and Lookup table | Yes |
Note: *: T∈{Crossing, Nested, Chain, Plain}, DP: Dynamic programming, MIS: Maximal independent set, NP: non-deterministic polynomial. |
RNA Database | Number of Selected RNAs | Total Number of Comparisons | Location | Ref |
Ribonuclease P RNA | 12 (out of 470) | 66 | http://www.rnasoft.ca/strand/ | [26] |
Group i introns (A) | 12 | 66 | https://crw2-comparative-rna-web.org/group-i-introns/ | [27] |
Group i introns (B) | 21 | 210 | ||
Group i introns (E) | 6 | 15 |
n | m | BB | PBB1 | PBB2 | PBB3 |
100 | n/10 | 59.3 | 59.3 | 59.3 | 59.3 |
n/5 | 56.3 | 56.6 | 56.5 | 56.5 | |
n/2 | 46.2 | 46.5 | 46.6 | 46.8 | |
200 | n/10 | 121.4 | 121.3 | 121.5 | 121.6 |
n/5 | 112.4 | 112.3 | 113.0 | 113.1 | |
n/2 | 89.4 | 89.4 | 89.9 | 89.1 | |
300 | n/10 | 180.5 | 181.2 | 181.5 | 181.1 |
n/5 | 168.9 | 169.4 | 169.0 | 169.4 | |
n/2 | 134.7 | 135.0 | 135.3 | 135.4 | |
400 | n/10 | 236.5 | 238.2 | 237.5 | 237.6 |
n/5 | 218.9 | 220.6 | 219.3 | 218.6 | |
n/2 | 166.9 | 170.9 | 169.6 | 168.2 | |
500 | n/10 | 291.4 | 294.7 | 296.2 | 293.8 |
n/5 | 264.9 | 271.1 | 273.5 | 270.1 | |
n/2 | 199.3 | 207.5 | 203.3 | 203.5 |
n | BB | PBB1 | PBB2 | PBB3 |
100 | 0.14 | 0.16 | 0.23 | 0.31 |
200 | 0.14 | 0.15 | 0.24 | 0.36 |
300 | 0.15 | 0.19 | 0.34 | 0.55 |
400 | 0.17 | 0.18 | 0.53 | 0.72 |
500 | 0.26 | 0.28 | 0.53 | 0.73 |
RNA Database | Running Time in Seconds | Length of LAPCS | ||||||
BB | PBB2 | % of Improvement |
Difference (PBB2-BB) | Diff = 0 | Diff > 0 | CV | ||
BB | PBB2 | |||||||
Ribonuclease P RNA | 398.1 | 107.5 | 73% | [0, 7] | 50% | 50% | 0.179 | 0.175 |
Group i introns (Group A) | 1003.2 | 259.7 | 74% | [0, 7] | 45.4% | 54.6% | 0.362 | 0.362 |
Group i introns (Group B) | 1263 | 343.7 | 71.4% | [0, 10] | 68.1% | 31.9% | 0.633 | 0.632 |
Group i introns (Group E) | 551.1 | 183.7 | 66.6% | [0, 5] | 53.3% | 46.7% | 0.163 | 0.161 |
Ref | Level of RNA Structure | Exact/Approximation | Time | Technique | Experimental Study |
[13] | (Crossing, T) | Exact | NP-hard | MIS | No |
[16] | (Nested, Nested) | Exact | NP-hard | MIS | No |
[17] | (Nested, Chain) | Exact | O(lAl3B) | DP | No |
[18] | (Plain, Plain) | Exact | O(lAlB) | DP | No |
[17] | (Crossing, Crossing) | Heuristic | O(lAlB) | DP and MIS | Yes |
[5] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and MIS | Yes |
[19] | (Crossing, Crossing) | Heuristic | lA seconds | Generate a set of common subsequences and Maximum Clique | Yes |
[4] | (Crossing, Plain) | Heuristic | O(lAlB) | DP and Lookup table | Yes |
Note: *: T∈{Crossing, Nested, Chain, Plain}, DP: Dynamic programming, MIS: Maximal independent set, NP: non-deterministic polynomial. |
RNA Database | Number of Selected RNAs | Total Number of Comparisons | Location | Ref |
Ribonuclease P RNA | 12 (out of 470) | 66 | http://www.rnasoft.ca/strand/ | [26] |
Group i introns (A) | 12 | 66 | https://crw2-comparative-rna-web.org/group-i-introns/ | [27] |
Group i introns (B) | 21 | 210 | ||
Group i introns (E) | 6 | 15 |
n | m | BB | PBB1 | PBB2 | PBB3 |
100 | n/10 | 59.3 | 59.3 | 59.3 | 59.3 |
n/5 | 56.3 | 56.6 | 56.5 | 56.5 | |
n/2 | 46.2 | 46.5 | 46.6 | 46.8 | |
200 | n/10 | 121.4 | 121.3 | 121.5 | 121.6 |
n/5 | 112.4 | 112.3 | 113.0 | 113.1 | |
n/2 | 89.4 | 89.4 | 89.9 | 89.1 | |
300 | n/10 | 180.5 | 181.2 | 181.5 | 181.1 |
n/5 | 168.9 | 169.4 | 169.0 | 169.4 | |
n/2 | 134.7 | 135.0 | 135.3 | 135.4 | |
400 | n/10 | 236.5 | 238.2 | 237.5 | 237.6 |
n/5 | 218.9 | 220.6 | 219.3 | 218.6 | |
n/2 | 166.9 | 170.9 | 169.6 | 168.2 | |
500 | n/10 | 291.4 | 294.7 | 296.2 | 293.8 |
n/5 | 264.9 | 271.1 | 273.5 | 270.1 | |
n/2 | 199.3 | 207.5 | 203.3 | 203.5 |
n | BB | PBB1 | PBB2 | PBB3 |
100 | 0.14 | 0.16 | 0.23 | 0.31 |
200 | 0.14 | 0.15 | 0.24 | 0.36 |
300 | 0.15 | 0.19 | 0.34 | 0.55 |
400 | 0.17 | 0.18 | 0.53 | 0.72 |
500 | 0.26 | 0.28 | 0.53 | 0.73 |
RNA Database | Running Time in Seconds | Length of LAPCS | ||||||
BB | PBB2 | % of Improvement |
Difference (PBB2-BB) | Diff = 0 | Diff > 0 | CV | ||
BB | PBB2 | |||||||
Ribonuclease P RNA | 398.1 | 107.5 | 73% | [0, 7] | 50% | 50% | 0.179 | 0.175 |
Group i introns (Group A) | 1003.2 | 259.7 | 74% | [0, 7] | 45.4% | 54.6% | 0.362 | 0.362 |
Group i introns (Group B) | 1263 | 343.7 | 71.4% | [0, 10] | 68.1% | 31.9% | 0.633 | 0.632 |
Group i introns (Group E) | 551.1 | 183.7 | 66.6% | [0, 5] | 53.3% | 46.7% | 0.163 | 0.161 |