Identification of hormone binding proteins based on machine learning methods

1 Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
2 National Research Institute for Family Planning, Beijing 100081, China
3 National Center of Human Genetic Resources, Beijing 100081, China
4 Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China
5 Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China

Special Issues: Machine Learning in Molecular Biology

Hormone binding protein (HBP), a soluble carrier protein, plays an important role in the growth of humans and other animals. HBP interacts selectively and non-covalently with hormones. Accurate identification of HBPs is therefore an important prerequisite for understanding their biological functions and molecular mechanisms. Because experimental methods for identifying HBPs remain labor-intensive and costly, computational methods that identify them accurately and efficiently are needed. In this paper, a machine learning-based method is proposed to identify HBPs, in which the samples are encoded using the optimal tripeptide composition selected by the binomial distribution method. In the 5-fold cross-validation test, the proposed method achieved an overall accuracy of 97.15%. For the convenience of the scientific community, a user-friendly webserver called HBPred2.0 was built and is freely accessible at http://lin-group.cn/server/HBPred2.0/.
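
The encoding and feature-ranking steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes one commonly used formulation in which a tripeptide's confidence level in a class is one minus the binomial probability of seeing at least its observed count in that class by chance. All function names here are hypothetical, and the classifier and cross-validation steps are omitted.

```python
from itertools import product
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20^3 = 8000 possible tripeptides, in a fixed order.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPEPTIDES)}


def tripeptide_counts(seq):
    """Count overlapping tripeptides; windows with non-standard residues are skipped."""
    counts = [0] * len(TRIPEPTIDES)
    for k in range(len(seq) - 2):
        idx = INDEX.get(seq[k:k + 3])
        if idx is not None:
            counts[idx] += 1
    return counts


def tripeptide_composition(seq):
    """Encode a sequence as 8000 tripeptide frequencies (all zeros if no valid window)."""
    counts = tripeptide_counts(seq)
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def binomial_tail(n, N, q):
    """P(X >= n) for X ~ Binomial(N, q). Direct summation; intended for small counts."""
    return sum(comb(N, m) * q ** m * (1 - q) ** (N - m) for m in range(n, N + 1))


def class_counts(seqs):
    """Sum tripeptide counts over all sequences of one class."""
    totals = [0] * len(TRIPEPTIDES)
    for s in seqs:
        for i, c in enumerate(tripeptide_counts(s)):
            totals[i] += c
    return totals


def rank_by_confidence(pos_seqs, neg_seqs):
    """Rank tripeptides by confidence level (assumed formulation, see lead-in).

    Returns (confidence, tripeptide) pairs, highest confidence first.
    """
    pos, neg = class_counts(pos_seqs), class_counts(neg_seqs)
    grand_total = sum(pos) + sum(neg)
    if grand_total == 0:
        raise ValueError("no valid tripeptides in either class")
    q_pos = sum(pos) / grand_total  # chance that a random tripeptide is from the positive class
    q_neg = 1.0 - q_pos
    scored = []
    for i, tri in enumerate(TRIPEPTIDES):
        N = pos[i] + neg[i]
        if N == 0:
            cl = 0.0  # tripeptide never observed: no evidence either way
        else:
            # Keep the larger of the two per-class confidence levels.
            cl = 1.0 - min(binomial_tail(pos[i], N, q_pos),
                           binomial_tail(neg[i], N, q_neg))
        scored.append((cl, tri))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored
```

In a pipeline of this kind, the top-ranked tripeptides would typically be added one at a time to the feature set, and the subset giving the best cross-validated accuracy would be kept as the "optimal" tripeptide composition.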

Keywords: hormone binding protein; tripeptide composition; binomial distribution method; feature selection; support vector machine; webserver

Citation: Jiu-Xin Tan, Shi-Hao Li, Zi-Mei Zhang, Cui-Xia Chen, Wei Chen, Hua Tang, Hao Lin. Identification of hormone binding proteins based on machine learning methods. Mathematical Biosciences and Engineering, 2019, 16(4): 2466-2480. doi: 10.3934/mbe.2019123

© 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).
