Spatio-temporal Keywords Queries in HBase

  • Received: 01 May 2015 Revised: 01 August 2015 Published: 01 January 2016
  • As data volumes grow to the scale of tens of billions, HBase, a distributed key-value database, plays a significant role in providing effective, high-throughput data service and management. However, applications involving spatio-temporal data still lack a good solution because query processing in HBase is inefficient for such data. In this paper, we propose the spatio-temporal keyword search problem for HBase, which is a meaningful issue in real life and a new challenge on this platform. To solve this problem, we design a novel access model for HBase that contains row keys for indexing the spatio-temporal dimensions and Bloom filters for quickly detecting the existence of query keywords. We then develop two algorithms for spatio-temporal keyword queries: one is suitable for queries with ordinary selectivity, and the other is a parallel algorithm based on MapReduce aimed at large range queries. We evaluate our algorithms on a real dataset, and the empirical results show that they are capable of handling spatio-temporal keyword queries efficiently.

    Citation: Xiaoying Chen, Chong Zhang, Zonglin Shi, Weidong Xiao. Spatio-temporal Keywords Queries in HBase[J]. Big Data and Information Analytics, 2016, 1(1): 81-91. doi: 10.3934/bdia.2016.1.81

    Alkaptonuria (AKU, OMIM 203500) is a rare metabolic disease inherited as a Mendelian autosomal recessive trait and was first described by Garrod [1],[2]. The global prevalence of AKU is extremely low, estimated at 1:250,000 live births [3]. However, the incidence is higher in certain regions such as Jordan, India, Slovakia and the Dominican Republic [4]. AKU results from a defect in the enzyme homogentisate 1,2-dioxygenase (HGD; EC 1.13.11.5), which is involved in the catabolic pathway of the amino acids phenylalanine and tyrosine [5]. Consequently, homogentisic acid, an intermediate of this pathway, accumulates and is released into the circulation. A portion of the circulating homogentisic acid is excreted in the urine, which turns black on standing or upon alkalinisation [6]. The remaining homogentisic acid in the bloodstream is distributed to various connective tissues and, after oxidation and polymerization, is deposited as ochronotic pigment, mainly in cartilage, tendons and ligaments [7],[8]. The gene encoding HGD is located on human chromosome 3 at q21–q23 and is expressed particularly in hepatic and renal cells [9],[10]. The HGD gene consists of 14 exons with a transcript length of 1674 nucleotides (NCBI Reference Sequence: NM_000187.4) that encodes a 445-amino-acid protein.

    The initial clinical manifestations of AKU are homogentisic aciduria, ochronosis and arthropathy of large joints such as the knees and hips [11]. At advanced stages of the disease, AKU patients present with additional clinical features including Achilles tendon rupture, arthroplasty, aortic valve disease and stone formation, most commonly renal and prostatic [12]. The severity of the disease varies between AKU patients, but it increases with age owing to the ongoing accumulation of homogentisic acid in various body tissues and fluids [13]. Therefore, individual AKU patients may not show the same signs and symptoms, even among siblings of the same family [14]. Taylor et al. (2011) showed that joint tissues become prone to pigmentation only after focal changes in the cells and extracellular matrix have occurred [15]. Moreover, the authors described a spectrum of pigmentation in articular cartilage in which pigment derived from homogentisic acid (HGA) is initially deposited at the boundary of subchondral bone and calcified cartilage and ends with complete pigmentation of the cartilage matrix in more advanced ochronosis [15]. Radiographic and imaging techniques such as MRI and CT are widely used to determine the extent and severity of joint and spinal damage [3]. Moreover, quality of life rather than life span is affected, mainly as a result of altered mechanical properties of large joints as well as degenerative changes and calcification of spinal intervertebral discs [16].

    The disease is primarily diagnosed by measuring the level of homogentisic acid in blood and urine samples using gas chromatography-mass spectrometry (GC-MS) [17]. Differential diagnosis is mandatory because the spinal and joint symptoms of other disorders such as osteoarthritis, rheumatoid arthritis and ankylosing spondylitis resemble those of AKU [18],[19]. The clinical trials SONIA 1 and SONIA 2 demonstrated the efficacy and safety of nitisinone as a therapy for AKU [20],[21]. Nitisinone delays disease progression by blocking the formation of homogentisic acid and its subsequent accumulation in tissues [22],[23]. Corneal keratopathy secondary to hypertyrosinemia has been reported in some AKU patients receiving nitisinone; this major adverse effect can be reversed by dose reduction and dietary restriction [24]. In addition, AKU patients must adjust their lifestyle to slow and manage the symptoms of the disease [14]. Lowering dietary protein intake significantly reduces the level of circulating tyrosine, and a subsequent reduction in homogentisic acid has also been observed [25].

    AKU patients are either homozygous or compound heterozygous for the identified variants of the HGD gene [26],[27]. In addition to DNA sequencing, multiplex ligation-dependent probe amplification (MLPA) can be used to detect AKU mutations, following the recent discovery of numerous large-scale deletions in AKU [13]. The HGD mutation database was constructed in 2010 and, so far, 213 unique DNA variants have been reported among AKU patients worldwide [28]. Remarkably, HGD mutations in Arab populations are rarely reported [29],[30]. In Jordan, only two reports have been published despite the high prevalence of the disease and the identification of more than 40 cases in south Jordan [31],[32]. The aim of the present study was to investigate pathogenic variants among members of a single Jordanian family diagnosed with AKU. We identified a novel HGD mutation in addition to the common A122V variant. The novel mutation, located in exon 10, has not yet been listed in the HGD mutation database or in other databases, including the AKU database, ClinVar, HGMD and ApreciseKUre. To the best of our knowledge, this is the first study to report this mutation.

    The study was conducted from September 2019 to June 2020. Peripheral blood and urine samples were collected from members of a Jordanian AKU family (n = 23). None of the parents enrolled in the study were consanguineous except the mother II:8 and the father II:9, who are first cousins, as depicted in the family pedigree (Figure 1). To investigate HGD variants among participants, genomic DNA was extracted from whole blood using the Quick-gDNA Miniprep Kit (Zymo Research, USA) according to the manufacturer's instructions. Briefly, 100 µl of whole blood was mixed with 400 µl of Genomic Lysis Buffer (4:1) in a 1.5 ml microcentrifuge tube, and the sample was incubated for 10 minutes at room temperature. The lysate was transferred to a Zymo-Spin column in a collection tube and centrifuged at 10,000× g for 1 minute. After centrifugation, 200 µl of DNA Pre-Wash Buffer was added to the spin column, which was centrifuged again at 10,000× g for 1 minute. Finally, 500 µl of g-DNA Wash Buffer was added to the spin column and centrifuged at 10,000× g for 1 minute. The DNA was eluted from the column using 100 µl of elution buffer.

    The quantity and quality (260/280 and 260/230) of the extracted DNA were measured by NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA). The extracted DNA was stored at −20°C until analysis.

    Figure 1.  Pedigree chart showing the relationships between all participating family members. The black arrow at the lower left corner of AKU patient III:1 indicates the proband in this study. The pedigree also illustrates the segregation of AKU alleles through the family members. WT indicates wild-type.

    AKU patients were diagnosed based on family history, clinical examination and routine analysis for homogentisic aciduria using ferric chloride solution or by overnight standing of urine, as illustrated in Figure 2. The presence of homogentisic acid was additionally confirmed by measuring the urinary homogentisic acid level using GC-MS. Written informed consent was obtained from all participants prior to inclusion in the study. The study protocol was approved by the Committee of Research Ethics at Mutah University, Jordan (Certificate of approval No. 201957; approval date September 18, 2019). All procedures followed were in accordance with the Declaration of Helsinki and its contemporary amendments.

    Figure 2.  Urine sample collected from an AKU patient. (A) Fresh urine sample; (B) after standing for 24 hours, the urine turned brownish-black due to oxidation of urinary homogentisic acid.

    The 14 exons of the human HGD gene of the proband III:1 were amplified from the extracted genomic DNA using the specific primers listed in Table 1. This was followed by a segregation analysis of exon 6 and exon 10 in all participating members (Figure 1). The primer pair for each exon was designed with the Primer3 (v. 0.4.0) online tool to include the flanking intronic regions containing the splice-site sequences [33],[34]. The primers were purchased from Macrogen, South Korea. PCR reactions were prepared according to the standard protocol using Phusion High-Fidelity PCR Master Mix (Thermo Fisher, USA), 300 nM of each of the forward and reverse primers and 100 ng of genomic DNA. PCR reactions were run under the standard programme with a specific annealing temperature for each primer pair (range 52–60°C).

    The programme consisted of an initial denaturation for 30 sec at 98°C, followed by 35 cycles of denaturation for 10 sec at 98°C, annealing for 30 sec at 52–60°C and extension for 30 sec at 72°C, and a final extension step at 72°C for 7 min. PCR products were separated by electrophoresis on a 2% agarose gel and then purified with ExoSAP-IT PCR Product Cleanup Reagent (Applied Biosystems, USA) according to the manufacturer's instructions. Briefly, 5 µL of post-PCR reaction product was mixed with 2 µL of ExoSAP-IT reagent. The mixture was then incubated at 37°C for 15 minutes to degrade the remaining primers and nucleotides, followed by another incubation at 80°C for 15 minutes to inactivate the ExoSAP-IT reagent. Purified PCR fragments were sequenced in both directions using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA) and the ABI 3730xl gene analyser at Macrogen Inc, South Korea. Segregation analysis of the candidate variants was then performed for all participating members of the enrolled family.
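    As a rough illustration, the cycling programme described above can be written down as data and its total hold time added up. This is only a sketch: the names `Step`, `CYCLE` and `total_hold_seconds` are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: str      # kept as text because annealing is given as a range
    seconds: int


# Values taken from the programme described in the text.
INITIAL = Step("initial denaturation", "98", 30)
CYCLE = (
    Step("denaturation", "98", 10),
    Step("annealing", "52-60", 30),   # primer-pair specific
    Step("extension", "72", 30),
)
FINAL = Step("final extension", "72", 7 * 60)
N_CYCLES = 35


def total_hold_seconds() -> int:
    """Sum of programmed hold times; ramping between steps is not included."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


if __name__ == "__main__":
    total = total_hold_seconds()
    print(f"Programmed hold time: {total} s ({total / 60:.1f} min)")
```

    The 35 cycles account for most of the run, so the real duration will be somewhat longer than this sum once ramping is added.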

    Table 1.  PCR primers for genomic amplification of the human HGD exons (5′ → 3′).
    Exon Forward Reverse PCR product size (bp)
    E1 GAGTTAGACAATTCTTTCAGC ATGAACAAAGGCAAGGGATG 418
    E2 GCAATATCCAGCACTCTTCTGA CCCCTATGACTTGGGAAACC 437
    E3 GGGGCAAGTCACATCAAAAG GCTGGCAGGAAGTTCATTCT 416
    E4 TTGGCAGCATGGAAATAACC TTTGAGCAGAAAACAGACACACT 517
    E5 AGCATGAAAAGCAGCATCAG ACGCAGGTGGTTTTGTCTCT 560
    E6 GTCAGTAAATTCAGGCTCCTTAGA TCCATCCTCCCTTTTCTGTTT 521
    E7 CGCTATTCTTTCATTCCCTCA GTCCAGAAGAGATGGGCAAA 530
    E8 ACAAGTTCCTTGCCTGGTGA CTCAGATTCCCTCCTCGTTG 439
    E9 CCAAGCAGCTCAACAAACAA AGTGAGACAGCGAAGGGAGA 319
    E10 CTCTCTTCCCTTCCCCTCAC TTTGTAGTGCCGTAGTGGTATGA 551
    E11 TCTCCCAAAGGACGGTAAAA CTCCCTCACCAAAGGACAAA 392
    E12 CAGATCCCTACCCCAAACCT CACGAGCCAAATGAACCTCT 600
    E13 TGCCAAGAATGCCAATATGA CCCTCTTTTGACTCTTCCTCTG 478
    E14 ACCAGAGCCACAACTCAGG CTGCCAGGTTTGTCTCATCA 576


    The sequence data were analysed with ChromasPro software (Technelysium Pty Ltd, South Brisbane, Australia). The effects of missense variants identified in the coding regions were predicted using the in silico tools PolyPhen2, SIFT and Mutation Taster. Variants occurring with a frequency of ≥1% were classified as benign. The identified variants were queried in the HGD mutation database, the AKU database, ClinVar, the Human Gene Mutation Database (HGMD) and the ApreciseKUre database.

    A Jordanian family of 23 individuals (10 males and 13 females) participated in the current study. The sociodemographic characteristics of the participants are shown in Table 2. The proband III:1 was diagnosed with AKU when he was 11 years old. The proband's mother confirmed that his urine had been black since childhood, and the brownish-black colour of the nappies prompted her to seek medical advice. The proband III:1 was not diagnosed as an AKU patient until 2010, after a visit to a local paediatrician. Subsequently, his sister III:4, who also showed the black urine sign, was diagnosed with AKU. Additionally, two AKU patients (III:8 and III:9) who are cousins of the proband were diagnosed with the disease during the screening of the proband's family members and relatives. Genomic DNA was extracted from whole blood samples of all participants to distinguish healthy individuals from carriers, since carriers are asymptomatic. The relationships between the family members are illustrated in the pedigree shown in Figure 1.

    PCR amplification of the 14 exons was performed to genotype the proband III:1. Analysis of the exon sequences revealed two different pathogenic mutations, in exons 6 and 10, and led to the identification of different HGD alleles, as shown in Figure 3.

    Table 2.  Sociodemographic characteristics of the study participants.
    Sociodemographic Variable Frequency Family member code in pedigree
    Gender
     • Male 10 See pedigree (square indicates male and circle indicates female)
     • Female 13
    Age
     • 1–20 6 III: 2,3,4,5,7,8
     • 21–40 4 II: 7 and III: 1,6,9
     • 41–60 8 II: 1,2,3,4,5,6,8,9
     • 61–80 5 I: 1,2,3,4,5
    Marital status
     • Single 9 III: 1,2,3,4,5,6,7,8,9
     • Married (non-consanguineous) 12 I: 1,2,3,4,5 and II: 1,2,3,4,5,6,7
     • Consanguineous marriage (first-cousins) 2 II: 8,9
    Nationality
     • Jordanian 23 All participants
     • Others 0
    Diagnosed with AKU (urinary level of homogentisic acid)/date
     • Healthy (ND*) 10 I: 1,3 / II: 2,3,6,7 and III:2,3,6,7
     • AKU carrier (ND*)/genotyped in this study 9 I: 2,4,5 / II: 1,4,5,8,9 and III: 5
     • AKU patient (2g/24h & 1.5 g/24h, respectively) /diagnosed in 2010 2 III: 1,4
     • AKU patient (2.2g/24h & 1.9 g/24h, respectively) / diagnosed in 2020 2 III: 8,9
    Occupation status
     • Not working 5 I: 1,2,3,4,5
     • Student 8 III: 1,2,3,4,5,6,7,8
     • Working 10 II: 1,2,3,4,5,6,7,8,9 and III: 9
    Educational level
     • Illiterate 4 I: 2,3,4,5
     • Attending school 6 III: 2,3,4,5,7,8
     • Attending university 3 III: 1,6,9
     • Qualified/ academic level 10 I: 1 and II: 1,2,3,4,5,6,7,8,9

    Note: ND*: not detected.

    Figure 3.  DNA sequence analysis of exon 6 and exon 10 of the HGD gene in the family members enrolled in our study. (A) chromatograms show the wild-type sequence in healthy participants; (B) chromatograms show heterozygous variants in AKU carriers; (C) chromatogram shows the homozygous pattern of the novel variant R225C identified in AKU patients.

    The fathers I:1 and I:3 have two normal alleles, whereas the mother I:2 was heterozygous, with one normal allele (denoted HGDNr) and one mutated allele (denoted HGDA122V) carrying the conservative missense variant A122V, a c.365C>T transition in exon 6. This recurrent mutation has been reported in the HGD mutation database and is classified as pathogenic, with a hexamer-disrupting effect on the HGD protein [3],[27]. The mutated allele found in the carriers I:4 and I:5 from the first generation carried a different pathogenic missense mutation, located in exon 10 (c.673C>T, p.R225C); this allele was denoted HGDR225C. Segregation analysis of the participating members of the second generation (II:1–9) revealed that II:2, II:3, II:6 and II:7 were wild-type (WT) and healthy, II:1 and II:4 were carriers of the HGDA122V mutation, and II:5, II:8 and II:9 were carriers of the HGDR225C mutation, having inherited the corresponding maternal mutated allele and the paternal normal allele. To follow the segregation of the two mutated alleles (HGDA122V and HGDR225C), we genotyped all participating siblings from the third generation. Mutational analysis revealed that III:1 and III:4 were compound heterozygous (HGDA122V/HGDR225C), having inherited one mutated allele from each parent. The daughter III:5 did not show the AKU phenotype, but genotyping demonstrated that she was a heterozygous AKU carrier (HGDNr/HGDR225C), having inherited the mutated allele from her mother and the normal allele from her father. The distribution of the alleles in the progeny is shown in Figure 1. The siblings III:8 and III:9, in contrast, were homozygous for HGDR225C. The chromatograms of the alleles are illustrated in Figure 3. Sanger sequencing showed that the variants fully segregate with the disease in affected family members. This segregation is in agreement with the autosomal recessive pattern of AKU inheritance, which strongly suggests that these missense mutations are pathogenic. Moreover, the two missense variants (c.365C>T, p.Ala122Val and c.673C>T, p.Arg225Cys) were detected with allele frequencies of 0.005569% and 0.001196%, respectively, in the gnomAD database. Both variants were predicted to be deleterious, probably damaging and disease causing by SIFT, PolyPhen2 and Mutation Taster, respectively, as shown in Table 3.
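    The recessive logic behind this segregation result can be checked with a short sketch. The genotypes below are transcribed from the text; the healthy third-generation members III:2, III:3, III:6 and III:7 are taken from Table 2 and assumed here to be wild-type on both alleles. The names `GENOTYPES` and `classify` are hypothetical helpers, not part of the study's methods.

```python
from collections import Counter

WT, A122V, R225C = "WT", "A122V", "R225C"

# Genotypes per pedigree code, as described in the text and Table 2.
GENOTYPES = {}
for member in ["I:1", "I:3", "II:2", "II:3", "II:6", "II:7",
               "III:2", "III:3", "III:6", "III:7"]:
    GENOTYPES[member] = (WT, WT)          # healthy; III:2,3,6,7 assumed WT/WT
for member in ["I:2", "II:1", "II:4"]:
    GENOTYPES[member] = (WT, A122V)       # carriers of A122V
for member in ["I:4", "I:5", "II:5", "II:8", "II:9", "III:5"]:
    GENOTYPES[member] = (WT, R225C)       # carriers of R225C
for member in ["III:1", "III:4"]:
    GENOTYPES[member] = (A122V, R225C)    # compound heterozygous
for member in ["III:8", "III:9"]:
    GENOTYPES[member] = (R225C, R225C)    # homozygous


def classify(genotype):
    """Autosomal recessive rule: two mutated alleles are needed for disease."""
    mutated = sum(allele != WT for allele in genotype)
    return ("healthy", "carrier", "affected")[mutated]


STATUS_COUNTS = Counter(classify(g) for g in GENOTYPES.values())

if __name__ == "__main__":
    print(dict(STATUS_COUNTS))
```

    Under these assumptions the counts reproduce the breakdown in Table 2 (10 healthy members, 9 carriers and 4 patients), and every affected member carries two mutated alleles.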

    AKU is a degenerative disease, and several studies have shown that it results from structural rather than regulatory gene mutations [27],[35]. The signs and symptoms of the disease are mainly caused by the accumulation of homogentisic acid in tissues instead of its conversion into maleylacetoacetic acid, leading to the deposition of dark ochronotic pigment in connective tissues, particularly the skin, the sclera of the eye, cardiac valves, the spine and the cartilage of large joints [27]. In a study of a large cohort of AKU patients, Ascher et al. reported no correlation between the severity of the disease and the type of HGD variant [27]. Numerous variants of the HGD gene have been reported and listed in the HGD mutation database and the AKU database, so the disorder displays marked allelic heterogeneity [7],[28]. Genetic analysis of DNA extracted from diagnosed AKU patients has shown that the disease presents with either a homozygous or a compound heterozygous pattern of the different pathogenic variants of the HGD gene [27],[28].

    The Middle East and Arab countries, including Jordan, are regions known for high rates of consanguinity and endogamy for cultural, ethnic, socioeconomic and historical reasons [36]. A study conducted in Jordan demonstrated a strong association between consanguinity and autosomal recessive diseases due to the increased level of homozygosity among the offspring of consanguineous matings [37]. Our study was conducted in a small village in south Jordan known for a high rate of marriages between relatives or within one's own tribe or community. Consequently, families are genetically isolated and share at least one common ancestor. This helps in highlighting specific conditions related to AKU at the molecular level and in identifying allelic variants directly implicated in the loss of HGD enzyme function.

    Table 3.  The effects of the missense mutations found in Jordanian family members as predicted by PolyPhen2, SIFT and Mutation Taster.
    Field | Exon 6 variant | Exon 10 variant
    Variant: exon number | E6 | E10
    Variant: chromosome location (GRCh37) | 3:120369690 | 3:120363267
    Variant: nucleotide change, ENST00000283871.10 (HGD-201) | c.365C>T | c.673C>T
    Variant: protein effect, ENST00000283871.10 (HGD-201) | p.Ala122Val | p.Arg225Cys
    Variant: variant effect | Missense | Missense
    Family member | I:2, II:1, II:4, III:1, III:4 | I:4, I:5, II:5, II:8, II:9, III:1, III:4, III:5, III:8, III:9
    Mutation database: ClinVar | Likely Pathogenic | N/A
    Mutation database: HGD mutation database | Pathogenic | N/A
    dbSNP ID | rs544956641 | rs756789146
    Protein prediction: SIFT # | 0.01/Deleterious | 0/Deleterious
    Protein prediction: PolyPhen-2 * | 0.996/Probably damaging | 1/Probably damaging
    Protein prediction: Mutation Taster ^ | 0.999/Disease causing | 0.999/Disease causing
    MAF gnomAD (%) | 0.005569 | 0.001196
    References | [3],[32],[41] | N/A

    Notes: #: variants with scores from 0.0 to 0.05 are considered deleterious, and variants with scores closer to 0.0 are more confidently predicted to be deleterious; *: variants with scores from 0.85 to 1.0 are more confidently predicted to be damaging; ^: scores range from 0 to 1, and variants with higher scores are predicted to be more likely to be pathogenic.
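    The cut-offs given in the notes to Table 3, together with the ≥1% frequency rule stated in the methods, can be applied mechanically, as in the following minimal sketch. The class and function names are hypothetical, and the code only restates the thresholds quoted in this article; it does not call SIFT, PolyPhen-2 or gnomAD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantScores:
    name: str
    sift: float                # lower means more deleterious
    polyphen2: float           # higher means more damaging
    gnomad_maf_percent: float  # allele frequency in percent


# Thresholds as quoted in the Table 3 notes and in the methods.
SIFT_DELETERIOUS_MAX = 0.05
POLYPHEN_CONFIDENT_MIN = 0.85
BENIGN_MAF_PERCENT = 1.0


def interpret(variant: VariantScores) -> dict:
    """Apply the quoted cut-offs to one variant."""
    return {
        "sift_deleterious": variant.sift <= SIFT_DELETERIOUS_MAX,
        "polyphen_probably_damaging": variant.polyphen2 >= POLYPHEN_CONFIDENT_MIN,
        "benign_by_frequency": variant.gnomad_maf_percent >= BENIGN_MAF_PERCENT,
    }


# Scores copied from Table 3.
A122V = VariantScores("p.Ala122Val", sift=0.01, polyphen2=0.996,
                      gnomad_maf_percent=0.005569)
R225C = VariantScores("p.Arg225Cys", sift=0.0, polyphen2=1.0,
                      gnomad_maf_percent=0.001196)

if __name__ == "__main__":
    for variant in (A122V, R225C):
        print(variant.name, interpret(variant))
```

    Both variants fall on the deleterious side of each score threshold and far below the 1% frequency cut-off.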


    Titus et al. (2000) showed that the HGD protomer consists of a 280-amino-acid N-terminal domain and a 140-amino-acid C-terminal domain [38]. The authors also showed that HGD protomers associate as a hexamer, arranged as a dimer of trimers, to form the functional HGD enzyme [38]. Homogentisic acid is the substrate of this enzyme and binds the catalytic site at residues Glu341, His335 and His371 in the presence of the cofactor Fe2+ [10]. The activity of the enzyme is highly sensitive to mutations in the HGD sequence, and the complex structure of the protein can be disrupted in different ways depending on the type of mutation. Some mutations specifically affect the stability of the protomer itself (protomer destabilization), such as the G205V and A267V variants; others interfere with the catalytic site (active site disruption), such as the R330S and P332R variants; and others affect hexamer assembly (hexamer disruption), such as the G152R and G185R variants [27],[39]. Bernini et al. suggested a new strategy for the treatment of AKU that targets the defective HGD enzyme with pharmacological chaperones, which can restore the structural stability of the native HGD enzyme disrupted by the various missense mutations [40]. These chaperones are able to rescue the activity and function of the HGD enzyme partially or completely and are therefore considered a promising AKU treatment [40].

    In our study we identified a novel missense mutation, R225C, in addition to the founder missense mutation A122V, which was previously reported in Jordanian AKU patients [32]. The proband III:1 and his sister III:4 were found to be compound heterozygous, carrying the novel mutation from their mother and the recurrent mutation from their father. Further analysis of the variants in their relatives revealed that their cousins were homozygous for the novel mutation. The proband's cousins are the offspring of a first-cousin consanguineous mating, and their parents, who are AKU carriers, inherited the novel mutation from their maternal grandmother. Our results clearly demonstrate that the mother and the father of the proband do not share a common ancestor, because they inherited two different alleles from their corresponding maternal grandmothers. Indeed, the presence of two variants indicates that two independent founders are implicated in the prevalence of AKU in Jordan.

    A122V is a relatively common AKU mutation that is present in AKU chromosomes from different geographical regions [3],[41]. Phornphutkul et al. (2002) reported the A122V mutation in addition to 23 novel variants in a study of 58 AKU patients [3]. Molecular dynamics simulation and functional analysis showed that the common A122V variant impairs HGD enzyme function through a disruptive effect on the hexamer [27]. We analysed the effect of the novel R225C mutation on the structure of the HGD protein complex using the computational tool mCSM-PPI2. We found that the mutation destabilizes the hexamer by disrupting protomer-protomer interactions, as illustrated in Figure 4. The analysis predicted a decrease in binding affinity of close to 1 kcal/mol (ΔΔG^Affinity = −0.98 kcal/mol), highlighting the vital role of the wild-type residue.
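    To give a feel for the size of this predicted change, a free-energy difference can be converted into a fold change in the dissociation constant using the standard relation ΔG = RT ln Kd. The sketch below is an illustration only, not part of the authors' analysis: it assumes a temperature of 298.15 K, treats a negative ΔΔG as reduced affinity (the convention used in the text), and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def fold_change_in_kd(ddg_kcal_per_mol: float,
                      temperature_k: float = 298.15) -> float:
    """Fold increase in Kd implied by a change in binding free energy.

    Assumes negative ddG means weaker binding, so Kd rises by exp(-ddG / RT).
    """
    return math.exp(-ddg_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


if __name__ == "__main__":
    fold = fold_change_in_kd(-0.98)
    print(f"Predicted fold increase in Kd: {fold:.1f}")
```

    Under these assumptions, a change of about 1 kcal/mol corresponds to binding between protomers that is roughly five times weaker.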

    Figure 4.  The predicted effect of the novel missense AKU mutation on the structure of the HGD protein complex using the mCSM-PPI2 tool. (A) The hexameric structure of the active HGD protein, which consists of six chains A, B, C, D, E and F (red, orange, yellow, grey, light blue and purple, respectively); (B) Nitrogen atoms (dark blue balls) in the side chain of arginine at position 225 in WT chain A (red) make four hydrogen bonds (dashed red lines) with residues of the adjacent chain D (grey); (C) Arginine at position 225 in WT chain A (red) makes five polar bonds (dashed orange lines) with residues of the adjacent chain D (grey); (D) Cysteine (the yellow ball indicates the sulphur atom) at position 225 in the mutated chain A (red) has low affinity for residues in chain D because it has lost all the polar and hydrogen bonds present in the WT chain, resulting in loose intermolecular interactions between the chains and a subsequent reduction in hexamer stability.

    Interestingly, the arginine at position 225 (Figure 5) appears to be a recurrent site of mutation: different studies have reported the substitution of arginine (R) with other amino acids such as histidine (c.674G>A) [42], leucine (c.674G>T) [3] and proline (c.674G>C) [39], in addition to cysteine (c.673C>T) as detected in our study.
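    The relationship between these nucleotide changes and the resulting amino acids can be illustrated with the standard genetic code. The sketch below does not use the actual HGD reference sequence; instead it asks which arginine codons are consistent with all four reported substitutions at codon 225. The helper names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_position(cdna_pos: int) -> tuple:
    """Map a 1-based coding position to (codon number, 0-based offset in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def mutated_residue(codon: str, cdna_pos: int, ref: str, alt: str):
    """Translate the codon after the substitution; None if ref does not match."""
    _, offset = codon_position(cdna_pos)
    if codon[offset] != ref:
        return None
    return CODON_TABLE[codon[:offset] + alt + codon[offset + 1:]]


# (coding position, reference base, alternate base, reported residue)
REPORTED = [
    (673, "C", "T", "C"),  # R225C, this study
    (674, "G", "A", "H"),  # R225H
    (674, "G", "T", "L"),  # R225L
    (674, "G", "C", "P"),  # R225P
]

ARG_CODONS = [codon for codon, aa in CODON_TABLE.items() if aa == "R"]
CONSISTENT = sorted(
    codon for codon in ARG_CODONS
    if all(mutated_residue(codon, pos, ref, alt) == aa
           for pos, ref, alt, aa in REPORTED)
)

if __name__ == "__main__":
    print("c.673 ->", codon_position(673))
    print("Arg codons consistent with all four changes:", CONSISTENT)
```

    Positions c.673 and c.674 are the first and second bases of codon 225, and only the CGT and CGC arginine codons yield Cys, His, Leu and Pro for the four reported changes, which is consistent with the changes being described as affecting the same residue.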

    Figure 5.  Distribution of the HGD variants identified in the present study. The common A122V variant was identified in exon 6 and the novel R225C variant in exon 10. Previously reported missense variants that occur at the same position in exon 10 but substitute different amino acids are marked with a red asterisk (R225H [42], R225L [3] and R225P [39]).

    Additionally, the novel mutation was not found on the same haplotype as the second HGD mutation, A122V, which strongly suggests that the novel variant does not represent a frequent polymorphism. Moreover, Usher et al. (2015) showed that mutation at position 225 of the protein reduces hexamer stability because the arginine at position 225 in the wild-type protein is located at the interface between protomers and is involved in intermolecular interactions [39]. Consequently, any alteration at this position is expected to disrupt these intermolecular interactions and thereby destabilise the hexamer [39]. Furthermore, our investigation of the effect of the novel AKU-associated mutation on enzyme structure and function, and the evaluation of its potential pathogenicity using prediction tools specifically designed for the interpretation of missense variants, indicated that the previously unreported AKU variant in exon 10 is pathogenic rather than a benign polymorphism. This novel mutation will be submitted to the HGD mutation database.

    The current study provides insight into the AKU alleles present in the members of the third generation of an AKU family from the south of Jordan. In summary, our study reveals a novel pathogenic missense variant, R225C, in addition to the recurrent mutation A122V in the HGD gene among members of a Jordanian AKU family. Since Jordan is a country with a high rate of consanguineous marriages, there is a need for a national screening project covering different areas of Jordan to diagnose more AKU patients. In the future, we plan to carry out genetic analysis of a larger cohort, as other pathogenic variants of HGD may be detected among Jordanian AKU patients. Moreover, to prevent recurrence of the identified pathogenic variants, testing of at-risk relatives and prenatal molecular diagnosis for pregnancies at increased risk are recommended.

    [1] HBase, 2015. Available from: http://hbase.apache.org.
    [2] Hadoop, 2015. Available from: http://hadoop.apache.org.
    [3] J. Blustein and A. El-Maazawi, Bloom filters. A tutorial, analysis, and survey, Halifax, NS: Dalhousie University, (2002), 1-31.
    [4] C. Cheng, C. Sun, X. Xu and D. Zhang, A multi-dimensional index structure based on improved VA-file and CAN in the cloud, International Journal of Automation and Computing, 11(2014), 109-117.
    [5] G. Cong, C. S. Jensen and D. Wu, Efficient retrieval of the top k most relevant spatial web objects, VLDB Endowment, 2(2009), 337-348.
    [6] I. D. Felipe, V. Hristidis and N. Rishe, Keyword search on spatial databases, In ICDE, (2008), 656-665.
    [7] C. S. Jensen, D. Lin and B. C. Ooi, Query and update efficient B+-tree based indexing of moving objects, VLDB Endowment, 30(2004), 768-779.
    [8] B. Moon, H. V. Jagadish, C. Faloutsos and J. H. Saltz, Analysis of the clustering properties of the Hilbert space-filling curve, IEEE Transactions on Knowledge and Data Engineering, 13(2001), 124-141.
    [9] S. Nishimura, S. Das, D. Agrawal and A. E. Abbadi, MD-HBase: A Scalable Multidimensional Data Infrastructure for Location Aware Services, In MDM, 1(2011), 7-16.
    [10] W. Zhou, J. Lu, Z. Luan, S. Wang, G. Xue and S. Yao, SNB-index: A SkipNet and B+ tree based auxiliary Cloud index, Cluster Computing, 17(2014), 453-462.
  • © 2016 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)