On the sensitivity of feature ranked lists for large-scale biological data

  • Received: 01 June 2012 Accepted: 29 June 2018 Published: 01 April 2013
  • MSC: 92-08.

  • The problem of feature selection for large-scale genomic data, for example from DNA microarray experiments, is one of the fundamental and well-investigated problems in modern computational biology. From the computational point of view, a selected gene list should have good predictive power; it should also be interpretable and well explained from the biological point of view. Recently, another property of selected gene lists has been increasingly investigated, namely their stability, which measures how the content and/or the gene order change when the data are perturbed. In this paper we propose a new approach to the analysis of gene list stability, termed the sensitivity index, that does not require any data perturbation and allows the gene list that is most reliable in a biological sense to be chosen.
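
    For orientation, the perturbation-based notion of stability mentioned in the abstract can be sketched as follows. This is not the sensitivity index proposed in the paper, which by the abstract's own description needs no perturbation; it is only a minimal illustration of the conventional approach, under assumptions that are ours and not the authors': synthetic Gaussian data, a Welch-type t-score for ranking, a stratified bootstrap as the perturbation, and Jaccard overlap of top-k lists as the stability measure. All function names (`make_data`, `rank_features`, `jaccard`, `topk_stability`) are hypothetical.

```python
import random
import statistics


def make_data(n_per_class=20, n_features=200, n_informative=10, shift=2.0, seed=1):
    """Synthetic two-class data (an assumption for illustration only):
    the first n_informative features are shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [
                rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                for j in range(n_features)
            ]
            X.append(row)
            y.append(label)
    return X, y


def rank_features(X, y):
    """Return feature indices ordered by decreasing absolute Welch-type t-score."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        # Small constant guards against a zero standard error.
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / (se + 1e-12))
    return sorted(range(n_features), key=lambda j: -scores[j])


def jaccard(s, t):
    """Jaccard overlap of two lists, treated as sets (order is ignored)."""
    s, t = set(s), set(t)
    return len(s & t) / len(s | t)


def topk_stability(X, y, k, n_resamples=30, seed=0):
    """Mean Jaccard overlap between the top-k list on the full data and
    the top-k lists obtained on stratified bootstrap resamples."""
    rng = random.Random(seed)
    reference = rank_features(X, y)[:k]
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    overlaps = []
    for _ in range(n_resamples):
        # Resample with replacement within each class to keep class sizes fixed.
        sample = [rng.choice(idx0) for _ in idx0] + [rng.choice(idx1) for _ in idx1]
        Xb = [X[i] for i in sample]
        yb = [y[i] for i in sample]
        overlaps.append(jaccard(reference, rank_features(Xb, yb)[:k]))
    return statistics.mean(overlaps)


if __name__ == "__main__":
    X, y = make_data()
    for k in (5, 10, 50):
        print(k, round(topk_stability(X, y, k), 3))
```

    A value near 1 means the top-k list barely changes under resampling; a value near 0 means it is almost entirely replaced. Jaccard overlap captures only changes in list content, not in gene order, so an order-aware measure would be needed for the second aspect of stability named in the abstract.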

    Citation: Danuta Gaweł, Krzysztof Fujarewicz. On the sensitivity of feature ranked lists for large-scale biological data[J]. Mathematical Biosciences and Engineering, 2013, 10(3): 667-690. doi: 10.3934/mbe.2013.10.667



  • © 2013 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)