Ambiguity in identifying parameters of an SIR model when fitting epidemic incidence data

B Shayak; Sana Jahedi; James A Yorke

doi:10.3934/mbe.2026036

Mathematical Biosciences and Engineering

2026, Volume 23, Issue 4: 913–939.


Research article


1. Department of Engineering, University of Maryland, College Park, MD, USA
2. Department of Mathematics and Statistics, Oakland University, Rochester, MI, USA
3. Department of Mathematics, University of Maryland, College Park, MD, USA

Received: 20 April 2025 Revised: 08 December 2025 Accepted: 14 January 2026 Published: 03 March 2026

Abstract

When a new pathogen emerges, determining its key transmission parameters is crucial for formulating public health policies and controlling its spread. However, not every parameter is "identifiable". A model parameter is said to be "globally structurally identifiable" from a given kind of perfect data if it can be determined uniquely from that data; whether it is identifiable therefore depends on what kind of perfect data is available. In this work, we developed a new mathematical concept, the "decay-growth ratio", and used it to prove that the basic reproduction number is globally structurally identifiable from knowledge of this ratio and that, with a bit more information, the duration of infectiousness can be determined as well. That result, however, assumes perfect, noise-free data, which is unattainable in reality. A parameter is said to be "practically identifiable" if it can be reliably estimated from finite, noisy data. Practical identifiability depends inherently on both the nature of the available data and the inferential methodology employed. We proved that neither the basic reproduction number nor the duration of infectiousness is practically identifiable from the common summary statistics of an outbreak, specifically its mean, standard deviation, and amplitude. In fact, we showed that, given any outbreak and any value greater than one for the basic reproduction number, there is an SIR solution with the same mean and standard deviation as the outbreak. We further demonstrated how this result can be extended to more complex multi-compartment epidemic models. Moreover, we provided indistinguishable fits to real epidemic data with extremely different parameter choices. This insensitivity of fit quality to parameter choices means that traditional curve fitting cannot reliably infer the key outbreak parameters from case data alone. Taken together, our results highlight fundamental limits on estimating epidemic parameters from incidence data, i.e., the rate of new cases, or from prevalence data, i.e., the number of infected people at a given time.
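The claim about matching the mean and standard deviation can be illustrated numerically. The sketch below is not the authors' proof or code. It assumes the standard normalized SIR equations s' = -βsi, i' = βsi - γi, with basic reproduction number R0 = β/γ and incidence βsi. The function names, the parameter values (R0 = 1.5 and 4, γ = 0.2 per day), and the initial infected fraction are all illustrative assumptions. It relies on two elementary observations: for a fixed R0, changing γ only rescales time, so the standard deviation of the incidence curve is proportional to 1/γ, and shifting the start date of the outbreak moves the mean without changing the spread.

```python
def sir_incidence(r0, gamma, i0=1e-6, n_steps=20000, span=100.0):
    """Integrate the normalized SIR model with a fixed-step RK4 scheme.

    s' = -beta*s*i, i' = beta*s*i - gamma*i, with beta = r0*gamma.
    Returns (times, incidence), where incidence = beta*s*i.
    The time horizon is span/gamma, so it scales with the infectious period.
    """
    beta = r0 * gamma
    dt = span / gamma / n_steps

    def deriv(s, i):
        new_inf = beta * s * i
        return -new_inf, new_inf - gamma * i

    s, i = 1.0 - i0, i0
    times, incidence = [], []
    for k in range(n_steps):
        times.append(k * dt)
        incidence.append(beta * s * i)
        k1s, k1i = deriv(s, i)
        k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
    return times, incidence


def mean_and_std(times, incidence):
    """Mean and standard deviation of time, weighted by the incidence curve."""
    total = sum(incidence)
    mean = sum(t * y for t, y in zip(times, incidence)) / total
    var = sum((t - mean) ** 2 * y for t, y in zip(times, incidence)) / total
    return mean, var ** 0.5


def match_mean_and_std(target_mean, target_std, r0):
    """Pick gamma and a start-date shift so an SIR curve with the given r0
    has the target mean and standard deviation (hypothetical helper)."""
    # For fixed r0, the standard deviation is proportional to 1/gamma.
    _, unit_std = mean_and_std(*sir_incidence(r0, gamma=1.0))
    gamma = unit_std / target_std
    times, incidence = sir_incidence(r0, gamma)
    mean, _ = mean_and_std(times, incidence)
    shift = target_mean - mean  # moving the start date changes only the mean
    return gamma, shift, [t + shift for t in times], incidence


# Reference "outbreak": an SIR curve with assumed R0 = 1.5 and gamma = 0.2/day.
ref_times, ref_inc = sir_incidence(r0=1.5, gamma=0.2)
ref_mean, ref_std = mean_and_std(ref_times, ref_inc)

# A very different R0 that is tuned to reproduce the same two statistics.
gamma_alt, shift, alt_times, alt_inc = match_mean_and_std(ref_mean, ref_std, r0=4.0)
alt_mean, alt_std = mean_and_std(alt_times, alt_inc)

print(f"reference: R0=1.5, infectious period={1 / 0.2:.1f} d, "
      f"mean={ref_mean:.2f}, std={ref_std:.2f}")
print(f"alternate: R0=4.0, infectious period={1 / gamma_alt:.1f} d, "
      f"mean={alt_mean:.2f}, std={alt_std:.2f}, start shift={shift:.2f} d")
```

Under these assumptions, the second curve reproduces the mean and standard deviation of the first even though its basic reproduction number and infectious period are very different, which is the kind of ambiguity the abstract describes.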
Keywords:
- SIR model
- basic reproduction number
- decay-growth ratio
- parameter estimation
- identifiability
- practical identifiability
- structural identifiability
Citation: B Shayak, Sana Jahedi, James A Yorke. Ambiguity in identifying parameters of an SIR model when fitting epidemic incidence data. Mathematical Biosciences and Engineering, 2026, 23(4): 913–939. doi: 10.3934/mbe.2026036



© 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)