Drivers of harmful algal blooms in coastal areas of Eastern Mediterranean: a machine learning methodological approach

Androniki Tamvakis; George Tsirtsis; Michael Karydis; Kleanthis Patsidis; Giorgos D. Kokkoris

doi:10.3934/mbe.2021322

Mathematical Biosciences and Engineering

2021, Volume 18, Issue 5: 6484-6505. doi: 10.3934/mbe.2021322


Research article


Department of Marine Sciences, Faculty of Environment, University of the Aegean, University Hill, GR81100, Mytilene, Greece

Received: 26 May 2021 Accepted: 19 July 2021 Published: 28 July 2021

Harmful algal species are present in the Mediterranean Sea and are often associated with toxic events affecting nearby coastal zones. The presence of 18 marine microalgal genera with potentially harmful characteristics was predicted using several machine learning techniques, based exclusively on a small set of abiotic variables already identified as drivers of blooms. The Random Forest (RF) algorithm achieved the best predictive performance, correctly identifying the presence of most genera in a mean of 89.2% of total samples. Although RF showed lower predictive performance for genera present in a low number of samples, its predictive power remained at least "fair" in these cases. The main tree-based advantage of RF was then used to assess the importance of the input variables in predicting the presence of the algal genera. Temperature had the strongest effect on the presence of the genera, although this effect varied among genera. Finally, the genera were clustered based on their response to the considered abiotic variables, and common trends in an ecological context were identified.
Keywords: harmful algal; machine learning; Random Forest; abiotic parameters; Eastern Mediterranean
Citation: Androniki Tamvakis, George Tsirtsis, Michael Karydis, Kleanthis Patsidis, Giorgos D. Kokkoris. Drivers of harmful algal blooms in coastal areas of Eastern Mediterranean: a machine learning methodological approach[J]. Mathematical Biosciences and Engineering, 2021, 18(5): 6484-6505. doi: 10.3934/mbe.2021322
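
The abstract reports performance both as a share of correctly classified samples and with a qualitative label ("fair") for genera that occur in few samples. The following is a minimal sketch of why those two views can diverge for rare classes. It assumes that the qualitative label refers to a chance-corrected agreement statistic such as Cohen's kappa, which the abstract does not state. The presence/absence vectors and the function names are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(observed, predicted):
    """Share of samples whose predicted presence/absence matches the observation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def cohen_kappa(observed, predicted):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(observed)
    p_o = accuracy(observed, predicted)
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Expected agreement by chance: product of the marginal frequencies,
    # summed over the classes (here: presence = 1, absence = 0).
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when both vectors contain a single identical
        # class; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical rare genus: present in 10 of 100 samples.
observed = [1] * 10 + [0] * 90
# Hypothetical classifier output: 4 presences found, 6 missed,
# 2 false presences, 88 absences correctly identified.
predicted = [1] * 4 + [0] * 6 + [1] * 2 + [0] * 88

rare_accuracy = accuracy(observed, predicted)
rare_kappa = cohen_kappa(observed, predicted)
print(f"accuracy = {rare_accuracy:.3f}, kappa = {rare_kappa:.3f}")
```

In this made-up case the accuracy is 0.92 while kappa is about 0.46, because most of the agreement comes from the many absences that are easy to predict. This is one reason a chance-corrected measure is often reported alongside accuracy when a class is rare.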




© 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)