Research article

Worry less about the algorithm, more about the sequence of events

  • Received: 28 June 2020; Accepted: 20 September 2020; Published: 28 September 2020
  • Background: Many algorithms exist for learning network structure and parameters from data, including Grow-Shrink, Incremental Association Markov Blanket (IAMB), Fast IAMB, Interleaved IAMB, Hill Climbing, Restricted Maximization, and Max-Min Hill Climbing. These algorithms optimize the fit to the data while ignoring the order in which the variables in the network occur.

    Objective: This paper examines whether sequence information (i.e., knowledge that one variable occurs before another) can make algorithms for learning directed acyclic graph networks more accurate.

    Methods: A 13-variable network was simulated, where information on the sequence of occurrence of some of the variables was assumed to be known. In each simulation, 10,000 observations were generated. These observations were used by four conditional-dependency and four search-and-score algorithms to discover the network from the simulated data. The partial sequence was used to prohibit a directed arc from a later variable to an earlier one. The area under the receiver operating characteristic curve (AROC) was used to compare the accuracy of the sequence-constrained and unconstrained algorithms in predicting the last node in the network. In addition, we examined the performance of sequence-constrained algorithms in a real data set: 1.3 million disability assessments of 296,051 residents in Veterans Affairs nursing homes, where the sequence of occurrence of variables was inferred from the average age at which each disability occurred. We constructed three networks using the Grow-Shrink algorithm, one without sequence constraints and the other two using two permutations of the observed sequence. The fit of these three models to the data was examined using the Bayesian Information Criterion (BIC).

    Results: In the simulated data, algorithms that used sequence constraints (AROC = 0.94, confidence interval, C.I., 0.86 to 1) were significantly more accurate than the same algorithms without sequence constraints (AROC = 0.74, C.I. 0.65 to 0.83). The agreement between the discovered and the observed networks improved from a range of 0.54–0.97 to a range of 0.88–1. In the real data set, the Bayesian network constructed with use of the sequence had a 6% lower BIC score.

    Conclusions: Sequence information improved the accuracy of all eight learning algorithms and should be routinely examined when learning network structure from data.
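The sequence constraint described in the Methods is straightforward to reproduce in any structure-learning toolkit that accepts arc blacklists. The following is a minimal sketch in Python using the pgmpy library (not the software used in the paper): given an assumed partial order of variables, it blacklists every arc pointing from a later variable to an earlier one, learns the network with and without the constraint by score-based hill climbing, and compares the two BIC scores. The variable names, the ordering, and the file observations.csv are hypothetical placeholders.

```python
# Minimal sketch of sequence-constrained structure learning, assuming the
# pgmpy library. This is an illustration, not the paper's original code.
from itertools import combinations

import pandas as pd
from pgmpy.estimators import BicScore, HillClimbSearch

# Hypothetical partial order: variables listed earlier are assumed to occur
# earlier in time (e.g., ranked by the average age of onset of a disability).
order = ["A", "B", "C", "D", "E"]

# The sequence constraint: prohibit every directed arc that points from a
# later variable back to an earlier one.
black_list = [(later, earlier) for earlier, later in combinations(order, 2)]

data = pd.read_csv("observations.csv")  # hypothetical file of observations

est = HillClimbSearch(data)
score = BicScore(data)

unconstrained = est.estimate(scoring_method=score)
constrained = est.estimate(scoring_method=score, black_list=black_list)

# Note: pgmpy's BicScore is higher-is-better (log-likelihood minus a
# complexity penalty), the opposite of the lower-is-better BIC convention
# quoted in the abstract.
print("BIC score, unconstrained:", score.score(unconstrained))
print("BIC score, constrained:  ", score.score(constrained))
```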
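The paper's accuracy comparison, AROC for predicting the last node in the network, can be sketched in the same setting. The loop below continues from the names defined above; the binary 0/1 coding of the variables and the reuse of the training data for evaluation are simplifying assumptions for illustration.

```python
# Minimal sketch of the AROC comparison for the last node in the sequence,
# continuing from `data`, `order`, and `constrained` above. Assumes every
# variable is coded 0/1; the row-by-row inference loop is slow but simple.
from pgmpy.inference import VariableElimination
from pgmpy.models import BayesianNetwork
from sklearn.metrics import roc_auc_score

target = order[-1]  # the last node in the assumed sequence

model = BayesianNetwork(constrained.edges())
model.fit(data)  # maximum-likelihood estimates of the conditional tables

infer = VariableElimination(model)
evidence_cols = [c for c in model.nodes() if c != target]

probs = []
for _, row in data.iterrows():
    q = infer.query([target], evidence={c: row[c] for c in evidence_cols},
                    show_progress=False)
    probs.append(q.values[1])  # P(target = 1), assuming states 0 and 1

print("AROC for last node:", roc_auc_score(data[target], probs))
```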

    Citation: Farrokh Alemi. Worry less about the algorithm, more about the sequence of events. Mathematical Biosciences and Engineering, 2020, 17(6): 6557-6572. doi: 10.3934/mbe.2020342





  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
