Research article

MTNA: A deep learning based predictor for identifying multiple types of N-terminal protein acetylated sites

  • Received: 27 May 2023; Revised: 18 July 2023; Accepted: 23 July 2023; Published: 03 August 2023
  • N-terminal acetylation is a specific protein modification that occurs only at the N-terminus but plays a significant role in protein stability, folding, subcellular localization and protein-protein interactions. Computational methods can efficiently identify N-terminal acetylated sites across large sets of proteins. However, because the number of labeled proteins is limited, existing tools cover only certain subtypes of N-terminal acetylated sites on frequently detected amino acids. For example, NetAcet covers only alanine, glycine, serine and threonine, and N-Ace makes predictions for alanine, glycine, methionine, serine and threonine. As experimental data on N-terminal acetylated sites has grown, N-terminal protein acetylation has been observed on nearly ten types of amino acids. To facilitate comprehensive analysis, we have developed MTNA (Multiple Types of N-terminal Acetylation), a deep learning network capable of accurately predicting N-terminal protein acetylation sites for various amino acids at the N-terminus. MTNA not only outperforms existing tools but can also identify rare types of N-terminal protein acetylated sites occurring on less studied amino acids.

    Citation: Yongbing Chen, Wenyuan Qin, Tong Liu, Ruikun Li, Fei He, Ye Han, Zhiqiang Ma, Zilin Ren. MTNA: A deep learning based predictor for identifying multiple types of N-terminal protein acetylated sites. Electronic Research Archive, 2023, 31(9): 5442-5456. doi: 10.3934/era.2023276

  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)