In this paper, we defined a novel edit distance for merge trees, which we argued to be suitable for a broad range of applications. Relying also on some technical results contained in other works, we investigated its stability properties, which ended up being analogous to the ones of the 1-Wasserstein distance between persistence diagrams. We tested and compared our metric against the interleaving distance in several simulations and case studies, highlighting the trade-off between stability and sensitivity when choosing the appropriate metric for a given data analysis problem, much alike the bias-variance trade-off in statistical modeling. In the appendix, we also compared our metric with other edit distances appearing in the literature, with both theoretic and practical considerations.
Citation: Matteo Pegoraro. A finitely stable edit distance for merge trees[J]. AIMS Mathematics, 2025, 10(7): 17179-17231. doi: 10.3934/math.2025769
In this paper, we defined a novel edit distance for merge trees, which we argued to be suitable for a broad range of applications. Relying also on some technical results contained in other works, we investigated its stability properties, which ended up being analogous to the ones of the 1-Wasserstein distance between persistence diagrams. We tested and compared our metric against the interleaving distance in several simulations and case studies, highlighting the trade-off between stability and sensitivity when choosing the appropriate metric for a given data analysis problem, much alike the bias-variance trade-off in statistical modeling. In the appendix, we also compared our metric with other edit distances appearing in the literature, with both theoretic and practical considerations.
| [1] | H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, et al., Persistence images: A stable vector representation of persistent homology, J. Mach. Learn. Res., 18 (2017), 1–35. |
| [2] |
M. Balasubramanian, E. L. Schwartz, The isomap algorithm and topological stability, Science, 295 (2002), 7. https://doi.org/10.1126/science.295.5552.7a doi: 10.1126/science.295.5552.7a
|
| [3] | D. Beers, J. Leygonie, The fiber of persistent homology for trees, arXiv: 2303.16176, (2023). https://doi.org/10.48550/arXiv.2303.16176 |
| [4] | K. Beketayev, D. Yeliussizov, D. Morozov, G. H. Weber, B. Hamann, Measuring the distance between merge trees, In: Topological methods in data analysis and visualization III. Mathematics and visualization, Cham: Springer, 2014,151–165. https://doi.org/10.1007/978-3-319-04099-8_10 |
| [5] |
S. Biasotti, D. Giorgi, M. Spagnuolo, B. Falcidieno, Reeb graphs for shape analysis and applications, Theor. Comput. Sci., 392 (2008), 5–22. https://doi.org/10.1016/j.tcs.2007.10.018 doi: 10.1016/j.tcs.2007.10.018
|
| [6] | O. Bobrowski, S. Mukherjee, J. E. Taylor, Topological consistency via kernel estimation, Bernoulli, 23 (2017), 288–328. |
| [7] | P. Bubenik, Statistical topological data analysis using persistence landscapes, J. Mach. Learn. Res., 16 (2015), 77–102. |
| [8] | R. Cardona, J. Curry, T. Lam, M. Lesnick, The universal $\ell^p$-metric on merge trees, 224 (2022), 24: 1–24: 20. https://doi.org/10.4230/LIPIcs.SoCG.2022.24 |
| [9] |
G. Carlsson, F. Mémoli, Classifying clustering schemes, Found. Comput. Math., 13 (2013), 221–252. https://doi.org/10.1007/s10208-012-9141-9 doi: 10.1007/s10208-012-9141-9
|
| [10] |
L. Cavinato, M. Pegoraro, A. Ragni, M. Sollini, P. A. Erba, F. Ieva, Author correction: Imaging-based representation and stratification of intra-tumor heterogeneity via tree-edit distance, Sci. Rep., 12 (2022), 19607. https://doi.org/10.1038/s41598-022-23752-2 doi: 10.1038/s41598-022-23752-2
|
| [11] | F. Chazal, D. Cohen-Steiner, M. Glisse, L. J. Guibas, S. Y. Oudot, Proximity of persistence modules and their diagrams, In: Proceedings of the twenty-fifth annual symposium on Computational geometry, New York: Association for Computing Machinery, 2009,237–246. https://doi.org/10.1145/1542362.1542407 |
| [12] |
F. Chazal, B. T. Fasy, F. Lecci, A. Rinaldo, L. Wasserman, Stochastic convergence of persistence landscapes and silhouettes, J. Comput. Geom., 6 (2015), 140–161. https://doi.org/10.20382/jocg.v6i2a8 doi: 10.20382/jocg.v6i2a8
|
| [13] |
D. Cohen-Steiner, H. Edelsbrunner, J. Harer, Stability of persistence diagrams, Discrete Comput. Geom., 37 (2007), 103–120. https://doi.org/10.1007/s00454-006-1276-5 doi: 10.1007/s00454-006-1276-5
|
| [14] |
J. Curry, The fiber of the persistence map for functions on the interval, J. Appl. Comput. Topology, 2 (2018), 301–321. https://doi.org/10.1007/s41468-019-00024-z doi: 10.1007/s41468-019-00024-z
|
| [15] |
J. Curry, J. DeSha, A. Garin, K. Hess, L. Kanari, B. Mallery, From trees to barcodes and back again ii: Combinatorial and probabilistic aspects of a topological inverse problem, Computational Geometry, 116 (2024), 102031. https://doi.org/10.1016/j.comgeo.2023.102031 doi: 10.1016/j.comgeo.2023.102031
|
| [16] |
J. Curry, H. B. Hang, W. Mio, T. Needham, O. B. Okutan, Decorated merge trees for persistent topology, J. Appl. Comput. Topology, 6 (2022), 371–428. https://doi.org/10.1007/s41468-022-00089-3 doi: 10.1007/s41468-022-00089-3
|
| [17] | V. de Silva, E. Munch, A. Patel, Categorified reeb graphs, Discrete Comput. Geom., 55 (2016), 854–906. https://doi.org/10.1007/s00454-016-9763-9 |
| [18] |
H. Edelsbrunner, D. Letscher, A. Zomorodian, Topological persistence and simplification, Discrete Comput. Geom., 28 (2002), 511–533. https://doi.org/10.1007/s00454-002-2885-2 doi: 10.1007/s00454-002-2885-2
|
| [19] | H. Edelsbrunner, J. Harer, Persistent homology—a survey, In: Surveys on discrete and computational geometry, Providence: American Mathematical Society, 2008,257–282, |
| [20] | K. H. Esbensen, D. Guyot, F. Westad, L. P. Houmoller, Multivariate data analysis: in practice: An introduction to multivariate data analysis and experimental design, Norway: CAMO Process AS, 2002. |
| [21] |
E. F. Touli, Y. S. Wang, Fpt-Algorithms for computing gromov-hausdorff and interleaving distances between trees, J. Comput. Geom., 13 (2022), 89–124, https://doi.org/10.20382/jocg.v13i1a4 doi: 10.20382/jocg.v13i1a4
|
| [22] |
M. Febrero, P. Galeano, W. González-Manteiga, Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels, Environmetrics, 19 (2008), 331–345. https://doi.org/10.1002/env.878 doi: 10.1002/env.878
|
| [23] | J. Felsenstein, Inferring phylogenies, Sunderland: Sinauer Associates, 2003. |
| [24] | F. Ferraty, P. Vieu, Nonparametric functional data analysis: theory and practice, New Youk: Springer, 2006, https://doi.org/10.1007/0-387-36620-2. |
| [25] |
E. Gasparovic, E. Munch, S. Oudot, K. Turner, B. Wang, Y. S. Wang, Intrinsic interleaving distance for merge trees, La Matematica, 4 (2025), 40–65, https://doi.org/10.1007/s44007-024-00143-9. doi: 10.1007/s44007-024-00143-9
|
| [26] | A. Hatcher, Algebraic topology, Cambrige: Cambridge University Press, 2001. |
| [27] | E. Hong, Y. Kobayashi, A. Yamamoto, Improved methods for computing distances between unordered trees using integer programming, In: Combinatorial optimization and applications. COCOA 2017, Cham: Springer, 2017, 45–60. https://doi.org/10.1007/978-3-319-71147-8_4 |
| [28] |
T. Jiang, L. S. Wang, K. Z. Zhang, Alignment of trees—an alternative to tree edit, Theor. Comput. Sci., 143 (1995), 137–148. https://doi.org/10.1016/0304-3975(95)80029-9 doi: 10.1016/0304-3975(95)80029-9
|
| [29] |
L. Kanari, A. Garin, K. Hess, From trees to barcodes and back again: theoretical and statistical perspectives, Algorithms, 13 (2020), 335. https://doi.org/10.3390/a13120335 doi: 10.3390/a13120335
|
| [30] |
M. Lesnick, The theory of the interleaving distance on multidimensional persistence modules, Found. Comput. Math., 15 (2015), 613–650. https://doi.org/10.1007/s10208-015-9255-y doi: 10.1007/s10208-015-9255-y
|
| [31] | J. Milnor, Morse theory, Princeton: Princeton University Press, 1963. https://doi.org/10.1515/9781400881802 |
| [32] |
F. Murtagh, P. Contreras, Algorithms for hierarchical clustering: An overview, ii, WIREs Data Min. Knowl., 7 (2017), e1219. https://doi.org/10.1002/widm.1219 doi: 10.1002/widm.1219
|
| [33] |
P. Ost, K. Decaestecker, B. Lambert, V. Fonteyne, L. Delrue, N. Lumen, et al., Prognostic factors influencing prostate cancer-specific survival in non-castrate patients with metastatic prostate cancer, The Prostate, 74 (2014), 297–305. https://doi.org/10.1002/pros.22750 doi: 10.1002/pros.22750
|
| [34] |
A. Patel, Generalized persistence diagrams, J. Appl. Comput. Topology, 1 (2018), 397–419. https://doi.org/10.1007/s41468-018-0012-6 doi: 10.1007/s41468-018-0012-6
|
| [35] | M. Pegoraro, A metric for tree-like topological summaries, arXiv: 2108.13108v3, (2021). |
| [36] | M. Pegoraro, A persistence-driven edit distance for trees with abstract weights, arXiv: 2304.12088v4, (2025). https://doi.org/10.48550/arXiv.2304.12088 |
| [37] | M. Pegoraro, A finitely stable edit distance for functions defined on merge trees, arXiv: 2108.13108v9, (2025). https://doi.org/10.48550/arXiv.2108.13108 |
| [38] |
M. Pegoraro, A graph-matching formulation of the interleaving distance between merge trees, AIMS Mathematics, 10 (2025), 13025–13081. https://doi.org/10.3934/math.2025586 doi: 10.3934/math.2025586
|
| [39] | M. Pegoraro, P. Secchi, Functional data representation with merge trees, arXiv: 2108.13147v6, (2024). https://doi.org/10.48550/arXiv.2108.13147 |
| [40] |
M. Pont, J. Vidal, J. Delon, J. Tierny, Wasserstein distances, geodesics and barycenters of merge trees, IEEE T. Vis. Comput. Gr., 28 (2021), 291–301. https://doi.org/10.1109/TVCG.2021.3114839 doi: 10.1109/TVCG.2021.3114839
|
| [41] |
C. Ramos-Carreño, J. L. Torrecilla, M. Carbajo-Berrocal, P. Marcos, A. Suárez, scikit-fda: A Python package for functional data analysis, J. Stat. Softw., 109 (2024), 1–37. https://doi.org/10.18637/jss.v109.i02 doi: 10.18637/jss.v109.i02
|
| [42] | J. O. Ramsay, B. W. Silverman, Functional data analysis, 2 Eds., New York: Springer, 2005, https://doi.org/10.1007/b98888 |
| [43] |
L. M. Sangalli, P. Secchi, S. Vantini, Analysis of AneuRisk65 data: K-mean alignment, Electron. J. Statist., 8 (2014), 1891–1904. https://doi.org/10.1214/14-EJS938A doi: 10.1214/14-EJS938A
|
| [44] |
L. M. Sangalli, P. Secchi, S. Vantini, A. Veneziani, A case study in exploratory functional data analysis: Geometrical features of the internal carotid artery, J. Am. Stat. Assoc., 104 (2009), 37–48. https://doi.org/10.1198/jasa.2009.0002 doi: 10.1198/jasa.2009.0002
|
| [45] |
S. M. Selkow, The tree-to-tree editing problem, Inform. Process. Lett., 6 (1977), 184–186. https://doi.org/10.1016/0020-0190(77)90064-3 doi: 10.1016/0020-0190(77)90064-3
|
| [46] |
Y. Shinagawa, T. L. Kunii, Y. L. Kergosien, Surface coding based on Morse theory, IEEE Comput. Graph., 11 (1991), 66–78. https://doi.org/10.1109/38.90568 doi: 10.1109/38.90568
|
| [47] |
R. Sridharamurthy, T. B. Masood, A. Kamakshidasan, V. Natarajan, Edit distance between merge trees, IEEE T. Vis. Comput. Gr., 26 (2020), 1518–1531. https://doi.org/10.1109/TVCG.2018.2873612 doi: 10.1109/TVCG.2018.2873612
|
| [48] |
R. Sridharamurthy, V. Natarajan, Comparative analysis of merge trees using local tree edit distance, IEEE T. Vis. Comput. Gr., 29 (2021), 1518–1530. https://doi.org/10.1109/TVCG.2021.3122176 doi: 10.1109/TVCG.2021.3122176
|
| [49] | E. F. Touli, Frechet-like distances between two merge trees, arXiv: 2004.10747v1, (2020). |
| [50] | R. D. Tuddenham, M. M. Snyder, Physical growth of California boys and girls from birth to eighteen years, Publ. Child. Dev. Univ. Calif., 1 (1954), 183–364. |
| [51] | L. M. Sangalli, P. Secchi, S. Vantini, V. Vitelli, Functional clustering and alignment methods with applications, Communications in Applied and Industrial Mathematics, 1 (2010), 205–224. |
| [52] | F. Wetzels, M. Anders, C. Garth, Taming horizontal instability in merge trees: On the computation of a comprehensive deformation-based edit distance, 2023 Topological Data Analysis and Visualization (TopoInVis), Melbourne, Australia, 2023, 82–92. https://doi.org/10.1109/TopoInVis60193.2023.00015 |
| [53] | F. Wetzels, C. Garth, A deformation-based edit distance for merge trees, In: 2022 Topological Data Analysis and Visualization (TopoInVis), 2022, 29–38. https://doi.org/10.1109/TopoInVis57755.2022.00010 |
| [54] |
F. Wetzels, H. Leitte, C. Garth, Branch decomposition-independent edit distances for merge trees, Comput. Graph. Forum, 41 (2022), 367–378. https://doi.org/10.1111/cgf.14547 doi: 10.1111/cgf.14547
|
| [55] |
F. Wetzels, H. Leitte, C. Garth, Accelerating computation of stable merge tree edit distances using parameterized heuristics, IEEE T. Vis. Comput. Gr., 31 (2025), 3706–3718. https://doi.org/10.1109/TVCG.2025.3567120 doi: 10.1109/TVCG.2025.3567120
|
| [56] |
K. Q. Wu, S. Zhang, A contour tree based visualization for exploring data with uncertainty, Int. J. Uncertain. Quan., 3 (2013), 203–223. https://doi.org/10.1615/Int.J.UncertaintyQuantification.2012003956 doi: 10.1615/Int.J.UncertaintyQuantification.2012003956
|
| [57] |
K. Z. Zhang, A constrained edit distance between unordered labeled trees, Algorithmica, 15 (1996), 205–222. https://doi.org/10.1007/BF01975866 doi: 10.1007/BF01975866
|