This paper explores the theoretical and practical connections between topological data analysis (TDA), game theory, and data poisoning attacks. We demonstrate how the topological structure of data can influence strategic interactions in adversarial settings, and how game-theoretic frameworks can model the interplay between defenders and attackers in machine learning systems. We introduce novel formulations that bridge these fields, developing metrics to quantify the topological vulnerability of data structures to poisoning attacks. Our analysis reveals that persistence diagrams from TDA can serve as powerful tools both for detecting poisoning attempts and for designing robust defense mechanisms. We propose a Nash equilibrium-based approach to determine optimal poisoning and defense strategies, supported by mathematical formulations and theoretical guarantees.
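To make the persistence-diagram idea concrete, the following minimal sketch computes the 0-dimensional persistence deaths of a point cloud (equivalently, the single-linkage merge heights, i.e., the edge weights of the Euclidean minimum spanning tree) and shows how injected outliers produce a long-lived connected component. This is an illustrative toy, not the paper's own pipeline: the dataset, the outlier positions, and the use of the maximum H0 death as a detection statistic are assumptions for the example.

```python
import numpy as np

def h0_deaths(points):
    """0-dimensional persistence deaths of a Euclidean point cloud.

    Every H0 class is born at filtration value 0 and dies when its
    component merges with another; these death times are exactly the
    edge weights of the minimum spanning tree (computed here via Prim).
    """
    n = len(points)
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    best = d[0].copy()          # cheapest connection of each vertex to the tree
    deaths = []
    for _ in range(n - 1):
        masked = np.where(visited, np.inf, best)
        j = int(np.argmin(masked))      # next vertex to join the tree
        deaths.append(masked[j])        # MST edge weight = H0 death time
        visited[j] = True
        best = np.minimum(best, d[j])   # relax connections through j
    return np.sort(np.array(deaths))

rng = np.random.default_rng(0)
clean = rng.normal(size=(40, 2), scale=0.3)            # one tight cluster
poisoned = np.vstack([clean, [[5.0, 5.0], [-5.0, 4.0]]])  # two injected outliers

d_clean = h0_deaths(clean)
d_pois = h0_deaths(poisoned)
# outliers create long-lived H0 classes: d_pois.max() far exceeds d_clean.max()
```

In a full treatment one would compare clean and suspect diagrams with the bottleneck or Wasserstein distance rather than a single summary statistic; the maximum death is used here only because it is the simplest quantity that separates the two clouds.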
Citation: Massimiliano Ferrara. Modeling by topological data analysis and game theory for analyzing data poisoning phenomena[J]. AIMS Mathematics, 2025, 10(7): 15457-15475. doi: 10.3934/math.2025693
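The Nash equilibrium-based view of poisoning can likewise be sketched as a small zero-sum game between an attacker (row player, maximizing damage) and a defender (column player). The payoff matrix below is purely illustrative, the strategy labels and numbers are assumptions, not values from the paper; the closed-form mixed equilibrium of a 2x2 zero-sum game follows from the standard indifference conditions.

```python
def zero_sum_2x2_nash(A):
    """Mixed Nash equilibrium of a 2x2 zero-sum game.

    A[i][j] is the row player's payoff (the column player loses A[i][j]).
    Assumes no saddle point, so both players mix; each player's mixture
    makes the opponent indifferent between their two actions.
    """
    den = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    p = (A[1][1] - A[1][0]) / den  # prob. the row player plays row 0
    q = (A[1][1] - A[0][1]) / den  # prob. the column player plays col 0
    v = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) / den  # value of the game
    return p, q, v

# illustrative payoffs: attacker rows {feature poisoning, label flipping},
# defender columns {topological filtering, robust training}; entries = attacker gain
A = [[2.0, -1.0], [-1.0, 1.0]]
p, q, v = zero_sum_2x2_nash(A)
# here p = q = 0.4 and the game value is 0.2: both sides must randomize,
# and the attacker still extracts a small expected gain
```

Larger strategy spaces (e.g., poisoning budgets over many data regions) require linear programming or support enumeration instead of this closed form, but the equilibrium logic is the same.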