Loading [MathJax]/jax/output/SVG/jax.js
Research article Special Issues

Generating patient-specific virtual tumor populations with reaction-diffusion models and molecular imaging data

  • The use of mathematical tumor growth models coupled to noisy imaging data has been suggested as a possible component in the push towards precision medicine. We discuss the generation of population and patient-specific virtual populations in this context, providing in silico experiments to demonstrate how intra- and inter-patient heterogeneity can be estimated by applying rigorous statistical procedures to noisy molecular imaging data, and how the noise properties of such data can be analyzed to estimate uncertainties in predicted patient outcomes.

    Citation: Nick Henscheid. Generating patient-specific virtual tumor populations with reaction-diffusion models and molecular imaging data[J]. Mathematical Biosciences and Engineering, 2020, 17(6): 6531-6556. doi: 10.3934/mbe.2020341

    Related Papers:

    [1] Rebecca Everett, Kevin B. Flores, Nick Henscheid, John Lagergren, Kamila Larripa, Ding Li, John T. Nardini, Phuong T. T. Nguyen, E. Bruce Pitman, Erica M. Rutter . A tutorial review of mathematical techniques for quantifying tumor heterogeneity. Mathematical Biosciences and Engineering, 2020, 17(4): 3660-3709. doi: 10.3934/mbe.2020207
    [2] Luca Bertelli, Frédéric Gibou . Fast two dimensional to three dimensional registration of fluoroscopy and CT-scans using Octrees on segmentation maps. Mathematical Biosciences and Engineering, 2012, 9(3): 527-537. doi: 10.3934/mbe.2012.9.527
    [3] Ziyou Zhuang . Optimization of building model based on 5G virtual reality technology in computer vision software. Mathematical Biosciences and Engineering, 2021, 18(6): 7936-7954. doi: 10.3934/mbe.2021393
    [4] Christian Engwer, Markus Knappitsch, Christina Surulescu . A multiscale model for glioma spread including cell-tissue interactions and proliferation. Mathematical Biosciences and Engineering, 2016, 13(2): 443-460. doi: 10.3934/mbe.2015011
    [5] Yu Lei, Zhi Su, Xiaotong He, Chao Cheng . Immersive virtual reality application for intelligent manufacturing: Applications and art design. Mathematical Biosciences and Engineering, 2023, 20(3): 4353-4387. doi: 10.3934/mbe.2023202
    [6] Wencong Zhang, Yuxi Tao, Zhanyao Huang, Yue Li, Yingjia Chen, Tengfei Song, Xiangyuan Ma, Yaqin Zhang . Multi-phase features interaction transformer network for liver tumor segmentation and microvascular invasion assessment in contrast-enhanced CT. Mathematical Biosciences and Engineering, 2024, 21(4): 5735-5761. doi: 10.3934/mbe.2024253
    [7] Christophe Prud'homme, Lorenzo Sala, Marcela Szopos . Uncertainty propagation and sensitivity analysis: results from the Ocular Mathematical Virtual Simulator. Mathematical Biosciences and Engineering, 2021, 18(3): 2010-2032. doi: 10.3934/mbe.2021105
    [8] Bethan Morris, Lee Curtin, Andrea Hawkins-Daarud, Matthew E. Hubbard, Ruman Rahman, Stuart J. Smith, Dorothee Auer, Nhan L. Tran, Leland S. Hu, Jennifer M. Eschbacher, Kris A. Smith, Ashley Stokes, Kristin R. Swanson, Markus R. Owen . Identifying the spatial and temporal dynamics of molecularly-distinct glioblastoma sub-populations. Mathematical Biosciences and Engineering, 2020, 17(5): 4905-4941. doi: 10.3934/mbe.2020267
    [9] Duane C. Harris, Giancarlo Mignucci-Jiménez, Yuan Xu, Steffen E. Eikenberry, C. Chad Quarles, Mark C. Preul, Yang Kuang, Eric J. Kostelich . Tracking glioblastoma progression after initial resection with minimal reaction-diffusion models. Mathematical Biosciences and Engineering, 2022, 19(6): 5446-5481. doi: 10.3934/mbe.2022256
    [10] Liang-Sian Lin, Susan C Hu, Yao-San Lin, Der-Chiang Li, Liang-Ren Siao . A new approach to generating virtual samples to enhance classification accuracy with small data—a case of bladder cancer. Mathematical Biosciences and Engineering, 2022, 19(6): 6204-6233. doi: 10.3934/mbe.2022290
  • The use of mathematical tumor growth models coupled to noisy imaging data has been suggested as a possible component in the push towards precision medicine. We discuss the generation of population and patient-specific virtual populations in this context, providing in silico experiments to demonstrate how intra- and inter-patient heterogeneity can be estimated by applying rigorous statistical procedures to noisy molecular imaging data, and how the noise properties of such data can be analyzed to estimate uncertainties in predicted patient outcomes.


    Broadly speaking, mathematical oncology seeks to develop a collection of in silico models that accurately predict the growth and progression of malignant tumors and their response to treatment [1]. While the basic scientific goal is to develop a rigorous quantitative science of cancer [2], a more readily attainable goal is to use existing models and data to inform clinical practice in some substantive way [3]. One approach to clinical mathematical oncology, which we call mathematical model-based precision medicine, is to use mathematical models to make patient-specific prognostic and therapeutic predictions which are subsequently employed to direct treatment choices and predict patient outcomes. In our view, such predictions must be conditioned on patient data in order for the strategy to qualify as precision medicine [4,5,6].

    Two essential features of oncology are heterogeneity and uncertainty. Heterogeneity can manifest as interpatient (between-patient), intrapatient (within-patient) and intratumor (within-tumor); see [7] for a more complete discussion of the clinical impact of heterogeneity in cancer and [8] for a review of the mathematical strategies to address heterogeneity. From a modeling perspective, heterogeneity implies a statistical approach where model parameters are treated as spatially distributed random variables with both population and patient-specific probability distributions [5]. Heterogeneity also highlights the issue of uncertainty: For a given patient, most if not all model parameters are unknown prior to the acquisition of patient-specific data, and even with such data, uncertainties may remain, since model parameters may only be partially identifiable and the data acquisition system may introduce additional irreducible noise. Image data provides a clear example of this issue, as imaging introduces information loss due to detector noise, low resolution and the possibility of null functions [9]. This uncertainty must be accounted for if model-based predictions are to be used reliably in the clinical context.

    Once a relevant mathematical model has been selected, prediction takes place in silico, whereby a simulated patient is produced and Quantities-of-Interest (QoIs) are calculated. In this context, we say that the parameter values and resulting simulations correspond to a virtual patient. Owing to the heterogeneity and uncertainty issues discussed above, in both population and patient-specific applications of mathematical oncology it is desirable to generate suitably randomized Virtual Patient Populations (VPPs). If the VPP represents an untreated group, it is called a Virtual Control Population (VCP) [4,10], while if the VPP represents a treatment group, it is called a Virtual Treatment Population (VTP). With appropriate VPPs, an in silico Virtual Clinical Trial (VCT) can be performed to assess treatment choices, again in either the population or patient-specific case. Population VPPs and VCTs can assist in making general prognosis and treatment recommendations or simulate a regulatory trial [11], while a patient-specific VPP and/or VCT can be used to assess uncertainty in a particular patient's disease progression or treatment response [6,12]. While population VCTs have found usage in device design and regulatory contexts [13], patient-specific VPPs and VCTs have seen limited usage.

    In previous work [5,6], we have highlighted the application of spatiotemporal random field modeling to address heterogeneity and uncertainty. In this work, we discuss specific VPP-based strategies towards the practical implementation of the framework presented in [5,6]. Specifically, we describe a method for generating VCPs by pairing a spatiotemporal reaction-diffusion equation (RDE) model for avascular, pre-metastatic tumor growth with spatially inhomogenous random field coefficients. We discuss the RDE model in sections 2.1 and 2.2 and the random field models in section 2.3. Together, these models provide a solution of the 'forward problem' for VPP generation. Two important inverse problems then arise. In the first, we consider in sections 4.1 and 4.2 the problem of estimating the RDE coefficient fields for a specific patient from noisy molecular imaging data; proof-of-concept simulation results are given in sections 5.1 and 5.2. For the second inverse problem, we must estimate the statistical parameters that define the random field coefficient models from a collection of many patients' data, hence solving the statistical calibration problem for virtual populations; we discuss this problem in section 4.3. Both maximum likelihood and Bayesian techniques are presented, and both require an accurate likelihood model connecting model parameters to available noisy data. To this end, in section 3 we discuss statistical models for molecular Emission Computed Tomography (ECT) data. We make this emphasis both because well-validated likelihood models for ECT are available [9], and because it is possible to model a direct connection between physiological processes such as tumor cell density and spatial growth rate, and certain emission imaging techniques and tracers.

    Note that the results given in sections 5.1 and 5.2 are generated in a pure simulation setting, where both ground-truth tumors and image data are simulated. This allows for methodological comparisons with known ground truth and tight control of all nuisance parameters that influence the data. A summary of future work that extends these results is provided in section 6.

    In this section, we introduce the tumor growth model that we employ for the remainder of the paper. We emphasize that the models we present are simplifications of real tumor biology, which is complex and not yet fully understood. These models are intended to be more 'clinically relevant' [3] than biologically accurate, due to their relative simplicity and amenability to patient-specific inference from spatiotemporal imaging data. In section 2.2 we discuss the parameter-to-solution map, in section 2.3 we discuss random field models for the tumor growth model coefficients, and in section 2.4 discuss the notions of virtual populations and virtual clinical trials.

    A wide variety of Partial Differential Equation (PDE) models have been employed to model spatiotemporal tumor growth, and the derivation and mathematical analysis of such models is adequately described elsewhere [14,15,16,17,18,19,20,21]. Here, we are interested in computational techniques for generating virtual patient populations, and thus consider a relatively simple PDE model (2.1) that has gained traction in the context of GlioBlastoma Multiforme (GBM). The randomization and statistical calibration techniques we present are independent of the tumor growth model, however, so the choice of Eq (2.1) is largely for illustration purposes.

    We denote a spatiotemporal density of tumor cells by n=n(x,t), with xV being a ddimensional position vector in the spatial domain V and 0tT, where T is the maximum number of days simulated. The units of n are cells per unit volume, and we take (x,t) to have units of cm and days, respectively. For in vivo tumors d=3, though for simulation purposes and modeling of in vitro and certain in vivo experimental setups we also consider spatial domains in d=2 as a reasonable approximation. The general reaction-diffusion equation for n is

    {nt=(Dn)+G(n)(x,t)V×[0,T]n(x,0)=n0(x)^nun=0boundary conditions (2.1)

    The spatially varying scalar diffusion coefficient D=D(x) describes the rate of apparent cell diffusion, while the growth function G(n) describes the growth and competition between the tumor and its environment. The boundary condition is taken to be the no-flux (Neumann) condition ^nun=0, where ^nu is the outward boundary normal of the domain V. Note that a treatment function T(n) can be added to Eq (2.1) to model an intervention such as chemotherapy, radiotherapy or surgical resection, but we postpone discussion of such models to a future work.

    While in the scalar tumor growth context the Gompertz function G(n)=ρnln(n/κ) has gained plausible mechanistic support via the notion of quiescence [22], to our knowledge the spatial RDE growth function is largely a phenomenological choice. The Fisher-KPP growth function is frequently selected in the GBM context, and is the one we will use in this paper:

    G(n)=ρ(x)n(x,t)(1n(x,t)κ(x)) (2.2)

    The growth ρ(x) and carrying capacity κ(x) functions, which have respective units of cells per day and cells, will be discussed further below.

    Note that in Eqs (2.1) and (2.2), several additional parameter functions are required to fully specify the solution to the PDE. These include the diffusion coefficient D(x), the initial cell density n0(x), the growth ρ(x) and carrying capacity κ(x). These parameter functions are certainly patient-specific, since they correspond to local physiology and nutrient concentrations. For notational simplicity, we will use the symbol β to denote the collection of all parameters necessary to specify a solution to Eq (2.1). In the context of this paper, a virtual patient then corresponds to a particular choice of β, which we will indicate by saying that patient j has parameter βj. We will assume for the remainder of this article the initial condition is given by a Gaussian

    n0(x)=Aexp(12σ2xx02) (2.3)

    with x0=[0.5,0.5]T, A=5/2πσ2 and σ2=1e4; this represents an initial collection of 5 tumor cells, well-localized around the center of our computational domain. The collection of all remaining unknown parameter fields is thus β=(D,ρ,κ). To avoid pathological situations, we also assume the technical condition that βB, where B is a set of parameter functions selected to guarantee that a positive, bounded solution n(x,t) of Eq (2.1) exists for all βB and for 0tT. Selecting such a B requires careful technical analysis of Eq (2.1); see [23].

    While we assume that each patient corresponds to a unique set of parameters β=βj, it is likely that βj is effectively unknown or uncertain for a given patient. To address this, we treat β as a function space-valued random vector in both the population and patient-specific cases, and provide corresponding random field models in section 2.3 below. In a simulation setting, we must choose a computational representation for βj, which we indicate with a superscript v for 'virtual'. So, β(v)j is a virtual (i.e., computational) sample corresponding to a sample for β.

    Assuming Eq (2.1) is classically well-posed for every (or almost every) βB, we can construct the deterministic parameter-to-solution map

    n(model)j=F(βj) (2.4)

    where n(model)j=n(model)j(x,t) solves Eq (2.1) with the parameter vector β=βj, corresponding to patient j. In a simulation setting, the forward map (2.4) is implemented via a numerical differential equation solver. We have employed a finite difference method-of-lines approach with a five-point spatial stencil and Runge-Kutta time stepper, as discussed in [24]. The result is an approximate solution to Eq (2.1), where the virtual coefficient sample β(v)j is processed through computational algorithm F(v) to produce the virtual tumor cell density n(v)j:

    n(v)j=F(v)(β(v)j) (2.5)

    The choice of discretization specifies the format for the virtual cell density n(v)j; with our finite difference scheme, we can assume that n(v)j takes the form of an Nx×Ny×Nt voxel array (for d=2). The specification of β(v)j is discussed below; we will assume a voxel-free form (Eq (2.8) below) that can be evaluated on an arbitrary grid when needed. Note that the virtual cell density will differ both from a patient's true cell density and from a measured cell density by discretzation, model and measurement errors; see [8] for a discussion of these errors. A demonstration of our scheme implementing (2.5) in d=2 is shown in Figure 1.

    Figure 1.  A realization of the tumor growth model (2.1) with Fisher-KPP growth term (2.2) (bottom row). The time-independent random field coefficients (top row, frames 2–4) are draws from ρLB(200,0.1,2e3), DLB(20,1e7,0.04) and κLB(100,5e7,0.1), as defined in section 2.3. The tumor burden N(t), defined in Eq (2.13), is shown for each time point. The initial condition n0(x) is shown top left.

    In the next section, we will introduce random field models that allow Eqs (2.4) and (2.5) to be used to generate virtual tumor populations. Note briefly that we have made assumptions that allow us to analyze and simulate Eq (2.1) as a Random PDE (RPDE) model, that is as a classically well-posed PDE whose coefficients are randomized. Note that this is in contrast to a Stochastic PDE (SPDE) model, which requires more advanced analysis and simulation techniques. See e.g., [5] for discussion of a technique based on characteristic functionals (introduced briefly below), and [25,26,27] for further discussion of the distinction between SPDE and RPDE.

    As discussed in the introduction, any mathematical model that is employed in a clinical precision medicine context must account for the possibility of heterogeneity and uncertainty by assuming that model parameters are sampled from a probability distribution. Since the model parameters that specify Eqs (2.1) and (2.2) are spatial functions, we employ the theory of random fields. We will avoid technical discussions, sufficing to say that one can treat β as a function space-valued random vector, written formally as β:ΩB, where (Ω,F,P) is a probability space and B is a set of functions mentioned previously, selected to guarantee the existence of the forward map (2.4). We assume that BX, where X is a Hilbert space with inner product (β,φ)X, taken to be the L2(V) inner product for vector-valued functions. In the most general setting, the complete statistics of a second-order random process are fully specified by its marginal distributions (through the Kolmogorov construction), its abstract probability distribution Pβ, or its characteristic functional: [5,9,28,29]

    Ψβ(φ)=E[exp(i(β,φ)X)]. (2.6)

    Frequently, a process is completely specified by its first few moments e.g., its mean function ˉβ(x)=E[β] and its covariance function k(x,x)=E[(β(x)ˉβ(x))(β(x)ˉβ(x))T]. While its use requires functional calculus, the characteristic functional (2.6) is typically a compact and efficient way to describe the complete statistics of a random field, and its use in the biological context is a topic of current investigation [5]. Refer to the recent books [27] and [30] for a complete discussion of the theory of Hilbert space-valued random vectors.

    In this work, we will simulate random fields by first assuming a convenient functional form for the realizations of the process, then randomizing the corresponding (finitely many) parameters to achieve desired statistics. This method allows us to guarantee that realizations of β satisfy conditions for Eq (2.1) to be well-posed, and allows for straightforward simulations using standard code packages. In particular, the random field models we use will take the form of a randomized synthesis map, which for a scalar-valued function φj(x) takes the form:

    φj=Φ(θj) (2.7)

    where Φ():RNX is a synthesis map that maps a (finite-dimensional) patient-specific parameter θj to the function φ(v)j(x). By selecting a probability distribution P(θj|θp) that depends on the population parameter θpΘRp, the synthesis map (2.7) produces a random field. In sections 4.1 and 4.2, we discuss ways to estimate a particular θj from image data, while in section 4.3 we discuss ways to estimate the population parameter θp from a collection of images.

    Because the coefficients in the RDE (2.1) must be non-negative and bounded to be physiologically realistic, it is advantageous to begin with a random field model that always produces non-negative, bounded realizations. For our simulations, we employ 'lumpy-type' random field models [31,32], a realization of which has the functional form

    φ(x)=Lmaxl=1bl(xxl;γ). (2.8)

    If the lump function (x;γ) and the lump amplitudes bl are non-negative and bounded, and the number of lumps is finite, then Eq (2.8) is guaranteed to produce non-negative and bounded realizations. For simplicity, we use isotropic unnormalized Gaussian lumps:

    (x;σ2)=exp(12σ2x2) (2.9)

    Random fields with this choice of lump function were proposed to model soft tissues with smoothly varying characteristics [31], while more sophisticated choices of (x), such as the clustered lump technique [32], can model more complex tissues such as breast or brain tissue. Ultimately, the choice of (x) and its shape parameter γ requires a statistical calibration step such as that discussed in section 4.3.

    We randomize Eq (2.8) as follows. We first select a lump shape parameter γ=σ2=σ20, a constant lump amplitude b0, and a mean number of lumps ˉLLmax. We then draw LPoi(ˉL), and set b=b0 for 1L, and b=0 otherwise. Lastly, we draw x1,xL, I.I.D. uniform in the domain V and form the sum (2.8). Note that the resulting random field realization can be written in terms of a nonlinear synthesis map (2.7) with N=(d+1)Lmax. We use the notation LB(ˉL,b0,σ20) to indicate a lumpy-type field with mean number of lumps ˉL, amplitude b0 and isotropic lump variance σ20. A demonstration of the behavior of lumpy-type random fields of the form given in Eq (2.8) is shown in Figure 1, top row.

    We reiterate that a single realization of the random field specified in Eq (2.8) is fully specified by the parameters σ20,b0 and the lump center list [xl], while the complete statistics of the random field are fully specified by ˉL,σ20 and b0. This distinction is important when we discuss parameter estimation problems in sections 4.1–4.3.

    Suppose that qRn is a low-dimensional vector quantifying patient status or prognosis, for instance (informally) q=[metric of tumor burden,metric of normal tissue function]. In the model-based precision medicine context, such a vector is generally called a quantitative biomarker [6,33,34]. The purpose of a virtual patient population, and more generally a virtual clinical trial, is to predict the distribution of q for either a control or treatment group. We will assume for convenience that q=qR is a scalar functional of β and n:

    q=M(β,n) (2.10)

    Given a virtual population of simulated parameters and tumors, (β(v)1,n(v)1),,(β(v)J,n(v)J), we estimate the biomarker q for each virtual patient using a method M(v), resulting in a vector

    Q(v)=[M(v)(β(v)1,n(v)1),,M(v)(β(v)J,n(v)J)]RJ (2.11)

    If Q(v) is generated from a population distribution, that is, the random vector β models inter-patient heterogeneity, then we say that Q(v) is a population virtual biomarker sample. If β models patient-specific uncertainties, for instance if β is sampled from the conditional random vector βj|gj where gj is some patient-specific data, then we say that Q(v) is a patient-specific virtual biomarker sample. Statistical properties of q such as its mean, standard deviation and distribution can be estimated from Q(v). By generating virtual control and treatment groups, Eq (2.11) can be used to simulate a clinical trial. If the VPP employed in Eq (2.11) represents a treatment group, decisions regarding the intervention can be made based on these statistics. For instance, a rigorous decision-theoretic approach to patient-specific treatment selection would define an additional function U(q) which measures the utility of achieving the biomarker q(τ) via the intervention τ. Since q is a random vector, the typical approach is to maximize the expected utility, that is, we seek the intervention τ which solves

    τ=argmaxτE[U(q(τ))] (2.12)

    While numerous patient-specific therapy optimization schemes have been suggested [35], to our knowledge a scheme based on Eq (2.12) has not seen practical implementation in this context. The VPP methodologies presented here may prove applicable to a future application of Eq (2.12), since the expected value in Eq (2.12) can be approximated via Eq (2.11).

    An example of a quantity-of-interest that we will use to illustrate our methods is the tumor burden, which is the total number of tumor cells as calculated from n(x,t):

    N(t)=Vn(x,t)dx (2.13)

    If the goal is to predict N(t) for some t in the future, a VPP allows one to assess the uncertainty in this value as a result of uncertainty in the patient-specific parameter βj being 'propagated' through the models F and M [25,30]. An illustration of uncertainty propagation is demonstrated in Figure 2, where several choices of the population parameter θp are chosen and the resulting (estimated) probability distribution of N(100) is shown. Further examples are given in section 5 below. Another possible QoI, the integrated log-kill, is similar to Eq (2.13) but with the integrand n(x,t) replaced by ln(n(x,t)/n0(x)). This metric is derived from a mass action drug effect and is applicable in chemotherapy evaluation [5]. Note that we do not discuss the important topic of exactly which scalar QoIs have real clinical impact in terms of patient outcomes; refer to e.g., [33,34].

    Figure 2.  Demonstration of the influence of the random field parameters θp on the distribution of the biomarker (2.13). The parameters (b0,σ20) controlling the statistics of ρ(x)LB(200,b0,σ20) are changed, and for each pair, a virtual biomarker sample of size J=512 is computed with q=M(v)(n(v))=N(100). Relative frequency histograms are displayed for each virtual biomarker sample.

    In a pure simulation setting, a virtual biomarker sample of the form shown in Eq (2.11) can be computed without reference to any particular patient-specific data, so long as the corresponding probability distributions are well-calibrated. However, the model-based precision medicine context requires that patient-specific data be used to create individualized VPPs, hence making uncertainty-informed decisions about patient j possible. We will focus on the case of molecular Emission Computed Tomography (ECT) data, for reasons discussed below.

    In the clinic, image data typically takes the form of reconstructed tomographic imaging studies such as MRI, CT, or 18F-FDG Positron Emission Tomography (PET). Note that while it can be difficult to precisely relate these standard tomographic data to either nj or βj, it is nonetheless possible to postulate such relationships and thereby extract useful information from these images. An example of such an approach is the usage of Apparent Diffusion Coefficient (ADC) maps, generated using Diffusion-Weighted MRI (DWMRI), to estimate nj through the application of a model which relates tissue cellularity to ADC [3,36,37]. The basic difficulty in such efforts is that malignancy is not directly measured by any of these standard techniques: For example, ADC maps estimate the apparent diffusion coefficient of water molecules and CT measures X-Ray attenuation, neither of which is uniquely altered by the presence of a malignant tumor. While increased 18F-FDG uptake implies increased cellular metabolism along the glycolytic pathway, a byproduct of malignant growth, this effect is not unique to malignant cells, since benign tumors can also display an increased metabolic rate [38].

    In this article we consider a wide class of imaging techniques known as Emission Computed Tomography (ECT), broadly defined as any imaging modality (such as PET) which detects a molecular tracer distribution via optical or nuclear detection techniques. We choose to focus on ECT data because the wide variety of available tracers allows for more direct modeling relating these data to tumor cell density nj and the RDE coefficient fields βj. As an example, genetically engineered cancer cells can be made to express Green Fluorescent Protein (GFP), which acts as a photon activity distribution when excited [39], and thus ECT, typically in the form of intravital microscopy (e.g., window chambers [40]) can be used to image nj directly in vivo, at least in the small animal setting. Meanwhile, the growth factor ρj can potentially be measured using the PET tracer 18F-FLT, which is a radioactive version of the molecule thymadine used in DNA synthesis [41,42]. Ex vivo, the Ki-67 immunohistochemistry stain would typically be employed to measure proliferation, but this requires either a biopsy or sacrifice of the animal. Current progress in small animal imaging is working towards producing in vivo imaging agents for common immunohistochemistries, but to our knowledge an in vivo Ki-67 analog is not available. The other advantages of ECT are the possibility to perform polyscopic imaging (that is, measurement of several distinct physiological processes simultaneously) [43], and the wealth of literature on the statistics of raw and reconstructed ECT data [9].

    To employ the statistical estimation strategies discussed in section 4, we require statistical models for ECT data. We emphasize the case of raw, un-reconstructed ECT data because the statistics are better understood: Most reconstructed images produced in the clinic are generated by a proprietary 'black box' method provided by the imaging system manufacturer. Our framework can still be applied to such data by carefully modifying the H operator defined in Eq (3.3) to account for the reconstruction step, however it can be difficult to calibrate an appropriate noise model for likelihood-based estimation. If the MLEM algorithm is known to have been employed for reconstruction, see [44] for a discussion of the resulting noise characteristics.

    In an ECT system, a chemical tracer is engineered to emit some measurable particle, such as photons, charged particles or nuclear decay products [45]. A tracer corresponds to a particle activity distribution fj(x,t), having units of particles emitted per unit volume per unit time. For the case of GFP imaging discussed above, a reasonable model is f(GFP)j(x,t)nj(x,t), that is, the photon activity is proportional to the actual tumor cell density, with constant of proportionality related to the photon yield per cell (i.e., the number of photons detected, on average, per cell). Similarly, we can assume f(18F-FLT)jρj, the growth factor. The energy produced by fj(x,t) propagates through the tissue medium and is detected by specialized ECT imaging hardware, which can range from optical CCD and CMOS detectors to scintillation-type detectors for high-energy particles. The statistics of raw ECT data are discussed at length elsewhere [9,46], so we summarize by noting that two data formats are commonly available, known as binned-mode data and particle-processed data. Because the presentation of particle-processing requires a more technical discussion of imaging detectors, we only consider binned-mode data in this work; see e.g., [45,46,47] for a discussion of particle processing. In a future work we will address the usage of particle processing data in mathematical oncology.

    Binned-mode ECT data arises when detected particles are sorted into spatial bins corresponding to a discretization of the detector. These spatial bins need not correspond to physical detector pixels, which many ECT imaging detectors do not have. The physics-based statistics of particle counting nearly always lead to a Poisson data model for particle count data, which can be written in terms of a noisy linear functional of the activity fj as follows. Suppose that the imaging system has M detector bins, with data for patient j labeled gj1,,gjM. The statistical model for binned-mode ECT data is that gj is a Poisson random vector with mean

    ˉgjm=T0Vhm(x,t)fj(x,t)dxdt (3.1)

    where hm(x,t) is called the m-th sensitivity function of the imaging system (assumed to be fixed for all patients), fj(x,t) is the particle activity distribution corresponding to the tracer administered to patient j, and T is the total imaging time. For tomographic systems, hm(x,t) must account for both the propagation within the patient, typically following a Radiative Transport Equation (RTE), as well as the geometry and blur characteristics of the imaging detector [6,9]. For this work, we will consider a simplified imaging geometry, intended to model intravital microscopy techniques such as mouse window chambers, where optical and/or gamma-ray images are collected with a planar detector above a pseudo-2D activity distribution. To a reasonable approximation, a Gaussian blur model

    hm(x,t;A,σ2blur)=A2πσ2blurexp(xxm2/2σ2blur) (3.2)

    is then appropriate. This blur accounts for both imaging elements such as lenses and collimators as well as position estimation blur [9,46]. More sophisticated imaging system models, for instance 3D tomographic systems, will be discussed in a future work (see section 6).

    Expressing Eq (3.1) as a linear continuous-to-discrete operator H:XRM allows us to write a binned-mode system model succinctly as

    gj=ˉgj+ηj=Hfj+ηj, (3.3)

    where ˉgj=[ˉgj1,,ˉgjM]=HfjRM is defined via Eq (3.1). Considering fj as a parameter, the probability distribution for the count vector gj is thus

    P(gj|fj)=Mm=1exp(ˉgjm)ˉggjmjmgjm!=Mm=1exp((Hfj)m)(Hfj)gjmmgjm! (3.4)

    If fj has been parameterized using a finite-dimensional parameter θj, for instance via a synthesis map of the form shown in Eq (2.8), we can define a nonlinear function H:RNRM, where for image reconstruction typically N<M, as follows:

    H(θj)=HΦ(θj) (3.5)

    This function computes the mean image data ˉgj for the synthesized object fj=Φ(θj), and is implemented computationally via numerical quadrature methods applied to Eq (3.1). The component functions of H are denoted Hm(θj):RNR. Note that if the synthesis method is nonlinear, as in Eq (2.8), then H is a nonlinear function of the vector θj, while if Φ is linear, then H is linear. In either case, we can write a probability distribution for gj|θj as follows:

    P(gj|θj)=Mm=1exp(Hm(θj))Hm(θj)gjmgjm! (3.6)

    We will make use of Eqs (3.4)–(3.6) for likelihood-based estimation methods in section 4 below.

    Assume that for patient j, ECT data gjRM has been collected, where gj can either be monoscopic (a single imaging study) or polyscopic (consisting of multiple imaging studies, corresponding to multiple tracers). In the previous section, we provided statistical models for gj that relate it to the activity distribution fj via Eq (3.3). As discussed there, ECT tracers allow fj to be directly related to nj and ρj. In this section, we discuss statistical techniques for estimating fj (hence nj and ρj) from gj. In section 4.1, we discuss point estimation techniques for an individual patient, where a single estimate of fj is produced, then in section 4.2 we discuss patient-specific Bayesian strategies where the solution consists of samples from a posterior distribution. In section 4.3, we discuss methods for estimating a population parameter θp from a database G=[g1,,gJ] of images collected for J patients.

    Given data gjRM for patient j, we consider the estimation of the underlying activity distribution fj using the imaging system model gj=Hfj+ηj, given in Eq (3.3). To approach this infinite-dimensional inverse problem, we select a finite-dimensional parameterization fj=Φ(θj), where θjRN, and consider the problem of estimating θj from gj, that is, solving the inverse problem gj=H(θj)+ηj. Note that moving from fj to θj might introduce an approximation error, owing to the fact that fj need not be expressible in same form as the synthesis used for the reconstruction; in ECT imaging, fj is typically quite smooth, so this approximation error is typically insignificant when compared to the influence of detector resolution and noise. While typical image reconstruction algorithms assume a voxel-based parameterization of fj, we elect to use a lumpy-type synthesis of the form given in Eq (2.8). As discussed, the activity distribution can correspond to the cell density nj, or the growth factor ρj(x). We denote a generic estimate of θj produced from gj by a hat, i.e., ˆθj is an estimate of θj produced from the data gj. While a wide array of strategies exist to solve this inverse problem, we consider only likelihood-based methodologies.

    Recall from section 3.2 that the probability distribution for binned-mode ECT data is Poisson. For an activity distribution fj parameterized using the synthesis map Φ, i.e. fj=fj(θj), we thus have a likelihood function

    L(θj|gj)=P(gj|θj)=P(gj|fj(θj))=P(gj|Φ(θj))=Mm=1exp(Hm(θj))Hm(θj)gjmgjm! (4.1)

    where P(gj|fj) is the Poisson distribution given in Eq (3.4), and Hm(θj) are the components of the function defined in Eq (3.5). Note the abuse of notation in Eq (4.1): The two functions P(gj|θj) and P(gj|fj) have different parameter types, but we take Eq (4.1) to define the former in terms of the latter. Taking its logarithm (and ignoring the θj-independent constant) results in the log-likelihood function

    (θj|gj)=ln(L(θj|gj))=Mm=1[gjmln(ˉgjm)ˉgjm]=Mm=1[gjmln(Hm(θj))Hm(θj)] (4.2)

    We now define the Maximum Likelihood Estimate (MLE) as the maximum of either Eq (4.1) or Eq (4.2) over the set Θ of allowed parameters:

    ˆθ(ML)j=argmaxθΘL(θ|gj)=argmaxθΘ(θ|gj)=argminθΘ[(θ|gj)] (4.3)

    A variety of optimization algorithms exist to compute ˆθ(ML)j, including the Expectation Maximization (EM) algorithm and Newton-type methods [48,49]. If the imaging operator H(θj) is linear, the EM algorithm applied to Poisson data leads to an iterative method known in the imaging community as the MLEM algorithm. Starting from ˆθ(0)j, the MLEM algorithm performs the iterative update

    ˆθ(k+1)jn=ˆθ(k)jnsnMm=1Hmngm(Hˆθ(k)j)m,sn=Mm=1Hmn (4.4)

    We will employ the method (4.4) to compute some MLEs for linear imaging models in section 5, while for nonlinear models we use the Matlab algorithm fmincon, which is a based on an interior point method with BFGS quasi-Newton steps [50]. A demonstration of algorithm (4.4) applied to a reconstruction of nj(x,t) from image data of the form (3.1) is shown in Figure 3. Note that the reconstruction displays visual artifacts due to the function H(θ) having a nontrivial null space. We have found that these artifacts have minimal impact on the task at hand, which is to predict the progression of n(x,t) and N(t). Future work will analyze in more detail the question of task-based image quality, as defined in [9,51], in the mathematical oncology context.

    Figure 3.  Illustration of the MLEM image reconstruction algorithm (4.4) for k=102,103 and 104 iterations. The true nj(x,t) shown on the right. The visible artifacts are due to the fact that H has a nontrivial null space which is not accounted for in the MLE (4.3). Regularization strategies can alleviate this, though we have found that for the task of predicting tumor burden N(t), these artifacts have minimal impact.

    We note briefly that under certain technical conditions, an MLE is asymptotically consistent and efficient, meaning for well-specified likelihood models, the MLE converges to the true parameter and attains the Cramér-Rao lower bound. Asymptotically, the distribution of the estimate ˆθ(ML)j (as a random variable depending on the data gj) converges to a normal with covariance given by the inverse of the Fisher information [9]:

    M(ˆθ(ML)jθj)N(0,F1),Fnn=Egj(2(θj|gjm)θjnθjn) (4.5)

    The Fisher information matrix F=F(θj) can be used to design more efficient measurement systems by 'tuning' the design parameters of the system to modify the likelihood in a desirable manner [52]. Note also that the Fisher information matrix can be evaluated 'off-line', by computing the expected values in Eq (4.5), which allows one to analyze the uncertainty in the MLE in a frequentist setting. While F depends on the true θj, for many ECT systems this dependence is minimal (i.e., the Fisher information matrix is nearly constant with respect to θj), which allows one to compute a single (system-dependent) F to be used for all MLEs derived from the system. One can also estimate confidence intervals for scalar quantities of interest derived from estimates of θj using transformation rules for the Fisher information matrix [53].

    To use the MLE procedure to generate a patient-specific virtual population, we assume that gj=Hfj, where either fj=nj(x,t0), fj=ρj, or fj=[nj(x,t0),ρj(x)]T, that is, we image either the cell density at a single time point, or the growth factor, or both the cell density and the growth factor. From this data, we take the following steps:

    (i) Use MLE to obtain ˆθ(n)j (and ˆθ(ρ)j, if measured) from which we synthesize ˆnj (and ˆρj, if measured).

    (ii) Randomize the remaining parameters D, κ (and ρ, if not measured) using a well-calibrated random field model such as Eq (2.8), resulting in D(1)j,,D(J)j, κ(1)j,,κ(J)j (and ρ(1)j,,ρ(J)j, if required).

    (iii) Generate a patient-specific virtual population by using ˆnj as the initial condition, either ˆρj or ρ(1)j,,ρ(J)j for the growth factor, and D(1)j,,D(J)j, κ(1)j,,κ(J)j for the diffusion coefficient and carrying capacity, and solving the forward problem (2.5) for each sample.

    (iv) With the resulting VPP, compute any desired quantitative biomarkers as in Eq (2.11).

    Note that in the above procedure, we make no attempt to estimate D and κ by performing the nonlinear parameter estimation task implied by Eq (2.4), nor do we attempt to estimate n(x,t) for any t<t0. From single time point data such as this, an identifiability (i.e., uniqueness) issue arises preventing such estimation from succeeding. This procedure is used in sections 5.1 and 5.2 to produce virtual populations.

    Note that one issue that is not addressed in the above technique is uncertainty inherent in the noisy data gj, namely that gj being random means the MLE is also random. While Fisher information analysis can be employed to analyze this uncertainty in an off-line frequentist setting, this analysis does not provide a clear method for producing alternates to ˆθj, since one only has access to a single realization of gj. The Bayesian methodology discussed below provides an avenue towards producing many plausible estimates from a single realization of gj.

    In a patient-specific Bayesian solution to the inverse problem gj=Hfj+ηj, we assume that fj were sampled from a prior distribution P0(f). Note that P0(f) need not correspond to a well-calibrated model of the true population in order to proceed with the inference, but the method is typically much more reliable if it does. Then, since gj is statistically coupled to fj through Eq (3.3), we can consider the conditional random vector gj|fj, whose statistics are described by the likelihood function L(fj|gj)=P(gj|fj) as discussed previously. A Bayesian inverse solution to the inference problem is to consider the posterior random vector fj|gj instead of a single point estimate ˆfj. Samples from the posterior can be used to generate a virtual population as described in the procedure above, where now ˆnj and ˆρj are replaced by samples from their corresponding posteriors to account for uncertainty in the data gj.

    Bayesian personalization of tumor growth has been considered in [21,54], for example. The former considers an RDE similar to Eq (2.1) but requires two segmented MRI images to perform the personalization. The latter generates ensemble members (i.e., virtual patients) by using fixed RDE coefficients (β in our notation) and imposing an ad hoc additive Gaussian noise model on the system evolution, in order to leverage the ensemble Kalman filter technique.

    The distribution of the posterior fj|gj is given via Bayes rule, which states informally that Ppost(fj|gj)P0(f)L(fj|gj). More rigorously [55], we would say that

    dPpostdP0L(fj|gj)=exp((fj|gj))

    Practically speaking, in this paper we consider (via Eqs (2.7) and (2.8)) finite-dimensional parameterizations of f, so that when coupled with binned-mode ECT data, the result is a finite-dimensional Bayesian inverse problem [56]. Taking as before a lumpy-type parameterization fj=Φ(θj), and with the data described by the nonlinear function ˉgj=H(θj)=HΦ(θj) defined in section 3.2, we now assign a prior on the vector θj, which we assume has PDF p0(θj). The Poisson likelihood L(θj|gj) is given in Eq (3.6), leading to a description of the posterior as

    p(θj|gj)p0(θj)L(θj|gj) (4.6)

    If either the prior is infinite-dimensional or the ECT data is assumed to be of particle-processing form, a fully infinite-dimensional Bayesian approach must be considered [55].

    To draw samples from the posterior (4.6), we apply a Markov Chain Monte Carlo (MCMC) algorithm based on the one presented in [57]. First, we assume for each tracer fj an expansion of the form Eq (2.8), so that fj=Φ(θj) with θj=[b1,,bLmax,x1,,xLmax,γ1,,γLmax]. Note that in some cases, we reduce the computational burden by fixing the xl and/or γl and only sample the remaining parameters. The MCMC procedure is then as follows:

    (i) Given the current sample θ(n)j, we propose a new sample θjπ(θj|θ(n)j) by randomly selecting Lmcmc lumps and perturbing their amplitudes bl, location xl and shape γl by a Gaussian with diagonal covariance Σ. This proposal is symmetric.

    (ii) From θj, the activity fj is synthesized using Eq (2.8) and the proposed mean image ˉg=Hfj is simulated using Eq (3.5).

    (iii) The acceptance probability is computed using the Metropolis-Hastings rule [25]:

    Q(θj,θ(n)j)=min{1,p(θj|gj)p(θ(n)j|gj)}=min{1,p0(θj)L(θj|gj)p0(θ(n)j)L(θ(n)j|gj)},

    after which we accept θ(n+1)j=θj with probability Q(θj,θ(n)j), and take θ(n+1)j=θ(n)j with probability 1Q(θj,θ(n)j).

    By construction of the Metropolis-Hastings algorithm, the Markov chain samples θ(1)j,θ(2)j, have stationary distribution given by the posterior p(θj|gj), and can thus be used to synthesize virtual patients for the calculation of any quantitative biomarkers as discussed in section 2.4. An example of this approach is discussed in section 5.1. Note that the standard Metropolis-Hastings algorithm performs best for θ with relatively small dimension; various modifications for high-dimensional problems have also been proposed [25,30,55].

    As illustrated in Figure 2, the population parameter θp which specifies the random field models used to generate the virtual population has a substantial influence on the statistics of any biomarkers calculated using Eq (2.11). The question of how to estimate θp from data thus arises. Such calibration problems in the context of VCTs have been discussed in [58,59], but their methodology accounts only for ODE models arising in systems pharmacology and does not directly apply to our context of PDE tumor growth models, random fields and imaging data. In this section, we discuss briefly a technique to estimate the population parameters which specify a lumpy-type model (2.8) from a database of ECT images G=[g1,,gJ].

    Suppose we assume a lumpy-type model for each of the RDE coefficient fields, β=[D,ρ,κ], say each is of the form given in Eq (2.8) with (x) given by a Gaussian (2.9). While the lump function (x) can itself be the target of calibration (see [60], for instance), we assume Eq (2.9) for simplicity. The statistics of the random field LB(ˉL,b0,σ2) are thus fully specified by the three scalars ˉL,b0 and σ2, which we write as θp=[ˉL,b,σ2]Θ=(0,)3. The problem of calibrating a population distribution such as this is to take a database of noisy image data G=[g1,,gJ] for a population of J patients and produce an estimate ˆθp=[ˆˉL,ˆb,^σ2]. We can again apply either MLE or the Bayes approach as discussed in sections 4.1 and 4.2.

    To develop a likelihood model for θp, we consider first the generic case where gj consists of direct images of some activity fj, for instance fjρj as discussed in section 5.1. This case corresponds to that considered in [61]. Assuming inter-subject independence, we have

    L(θp|G)=p(G|θp)=Jj=1P(gj|θp) (4.7)

    Using the law of total probability and the fact that the distribution of the image data is independent of the object statistics, we can write

    P(gj|θp)=Efj|θp[P(gj|fj)] (4.8)

    In (binned-mode) ECT imaging, the probability distribution P(gj|fj) is given by the Poisson (3.4), which depends on the imaging operator H. The expression (4.8) states that the probability of observing a given image gj if the population parameter is θp is the probability of observing gj given the object fj, averaged over of all possible random field realizations fj with statistics specified by the parameter θp. The expectation in Eq (4.8) is ostensibly infinite dimensional (since the fj are functions), though it can be approximated using Monte Carlo:

    P(gj|θp)1JJj=1P(gj|fj),fjP(fj|θp) (4.9)

    Equation (4.9) specifies a computational method for approximating the likelihood (4.7): For fixed θp, compute Eq (4.9) for each image in the database, then compute the product (or its logarithm) in Eq (4.7). The MLE of θp is then defined as before in Eq (4.3), with θpΘp, where Θp is the set of admissible population parameters. Note that the estimate (4.9) can be computed in parallel using graphics processors, which drastically reduces the time required for each likelihood calculation, which would have previously been a bottleneck in the application of a traditional MLE. Despite this, the MLE optimization problem (4.3) applied to the likelihood (4.7) poses a challenge for standard numerical optimization routines due to scaling and non-convexity issues. See section 6 for a brief discussion of future work aimed at addressing these issues.

    In this section, we discuss two in silico experiments that illustrate the methodologies discussed above. For both, a common 'ground truth' cell density path n(x,t) is generated by sampling a random field for each coefficient and simulating until t=T=365 days. In the first experiment, we use (simulated) ECT data for fj(x)=n(x,100) to compute both an MLE and a collection of posterior samples for the tumor cell density at t=t0=100 days. Then, a virtual population is generated by using the resulting ˆnj (or posterior samples) as the initial condition in Eq (2.1), randomizing the remaining coefficients and simulating until t=T. The resulting populations are then compared to the 'true' n(x,t), using the tumor burden N(t) defined in Eq (2.13) as the quantity-of-interest. In the second experiment, we use ECT data and MLE for both the growth factor ρ(x) and the cell density n(x,t0), again generating a VPP by using the estimated cell density as the initial condition n0(x) and ρj=ˆρj, then randomizing both D and κ.

    For both experiments, we use the following random fields, both to generate the initial 'true' tumor path n(x,t) and to randomize unmeasured coefficients. We assume DLB(20,1e7,0.04), ρLB(200,0.25,0.002), and κLB(100,5e7,0.1). These parameters were selected to achieve an average tumor size of approximately 108 cells after the first year; obviously a calibration procedure such as the one outlined in section 4.3 would be required to estimate population parameters for a real population of tumors. The exact sample used is shown in Figure 4.

    Figure 4.  The 'true' tumor cell density, generated using Eq (2.1) with β=(D,ρ,κ) sampled from the random field models specified in the section 5 introduction above. The simulated imaging studies are performed at t=100 days (second panel).

    In this experiment, we suppose that g(n)j=H(n)nj+η(n)j consists of ECT measurements of the cell density nj(x,t0), with t0=100 days, and that no other measurements are available. We assume that the experimental setup is confined to a thin slab so that a 2D simulation described in section 3.1 is appropriate, and that the imaging system corresponds to Eq (3.1) with blur kernel given by Eq (3.2). The blur is taken to be σ2blur=0.0212, which corresponds to an imaging resolution (Full-Width-Half-Max, FWHM) of 500 μm, and we assume that the photon yield is A=1e3 (detected photons per cell). Once the mean image data ˉgj=Hfj is computed using Eq (3.1), a randomized image is produced using poissrnd.

    To perform the image reconstruction for nj(x,t0), we apply the linear MLEM algorithm (4.4) with a reconstruction of the type (2.8), with fixed lump centers and lump variance, so that θ=[b1,,bLmax]. Specifically, we fix the lump centers on a grid of size 25×25, uniformly spaced on [0.35,0.65]2, and set the lump variance to σ2=1e2/256. The matrix H with columns given by ˉgl=H((xxl)) is then computed using Eq (3.1), and the MLEM algorithm (4.4) applied for Kmax=5000 iterations, with initial guess ˆθ(0)j=HTgj. From the final ˆθ(Kmax)j, we synthesize the estimate ˆnj(x,t0) using Eq (2.8). The results of this reconstruction are shown in Figure 5.

    Figure 5.  Comparison of the 'true' tumor shown in Figure 4 (left, imaged at t0=100 days) to the maximum likelihood reconstruction ˆn(ML)j (middle). The original noisy image gj=Hnj+ηj and the mean image evaluated at the MLE ˉgj=Hˆn(ML)j=Hˆθ(ML)j are shown to the right. The MLEM algorithm (4.4) with 5000 iterations was employed. The reconstruction grid is shown on the middle plot (black dots). The computational time required for this reconstruction was about 30 seconds.

    To demonstrate the usage of this estimate to generate a virtual tumor population as discussed in section 4.1, we use the resulting ˆnj(x,t0) as the initial condition n0(x) in Eq (2.1), and randomize the remaining coefficient fields using the lumpy-type random field models described in the introduction to this section to produce a virtual population of size J=64. In Figure 6, the quantity-of-interest N(t) is shown for the resulting virtual population, while in Figure 7, a collection of 10 virtual tumors is shown for t=150 (i.e., 50 days post-imaging).

    Figure 6.  Illustration of a VPP-based prediction of tumor burden N(t) when an MLE of n(x,t0) is used as the initial condition and remaining coefficients are randomized (J=64). The 'true' tumor path (bold), empirical average of the VPP (dotted) and a one standard deviation confidence band (dashed line) are shown. Individual VPP samples are shown in light grey. The left plot shows the time interval from 0 (tumor initiation) to 1 year, while the right shows the interval from imaging to 30 days post-imaging. The VPP appears biased towards slower trajectories than the actual.
    Figure 7.  Example VPP Samples for t=150, i.e., 50 days post-imaging, generated using the MLEM estimate of nj(x,100) as the initial condition and randomizing the other coefficient fields as described in the main text. All plots are shown on [0.3,0.7]2 and with a common colorbar. It is apparent that the tumor morphology is largely consistient across the population, though some virtual tumors demonstrate small 'islands' of growth and the overall rate of growth is variable, as seen in Figure 6.

    To demonstrate the usage of the Bayesian methods discussed in section 4.2, we generated samples from the posterior nj|gj to use as part of the virtual population generation procedure. For the MCMC procedure, we use the algorithm outlined in section 4.2, moving Lmcmc=3 lumps for each proposal with a proposal distribution of N(0,σ2mcmcI), taking σmcmc=1e4. The prior for θj=[b1,,BLmax] is chosen to be I.I.D. uniform on the interval [0,1e10]. Starting from θ(0)j=ˆθ(ML)j, i.e., starting at the maximum likelihood estimate, we generate J=512 samples, then synthesize n(1)j,,n(J)j. Note that the subscript remains j as these samples correspond to patient j, while the superscript indicates that these are posterior samples, generated using MCMC. Starting at the maximum likelihood estimate (which for our prior corresponds to the MAP estimate) reduces the need for an expensive burn-in period in which the first J0 samples are thrown out to ensure that the chain is sampling from the posterior. The acceptance rate for our chain (number of proposals accepted divided by total number of samples) is 0.495. Using each posterior sample as an initial condition, we again generate a virtual population by solving Eq (2.1) with the remaining coefficients D,ρ,κ randomized according to the lumpy-type field models specified previously. The results of this virtual population are shown in Figure 8.

    Figure 8.  Illustration of the virtual population generated using Bayesian posterior sampling for the initial condition n0(x). The empirical average path ˉN(t) (dotted) and confidence band (dashed) are shown in comparison to the true tumor path (solid), as well as the J=512 paths constituting the virtual population (light grey).

    For both the maximum likelihood and Bayesian solutions, the resulting virtual populations under-estimate, on average, the true tumor burden both over the short- and long-term, which indicates that additional data may be needed to estimate other model coefficients to gain a more accurate patient-specific prediction; we discuss on possibility in section 5.2 below. While the Bayesian technique accounts for uncertainty inherent in the image data g(n)j, this uncertainty appears relatively small as it relates to the prediction of N(t); the variance of N(t) for the virtual population generated using the Bayesian technique is only slightly larger than the variance for the MLE population.

    In Experiment 1, we saw that the virtual population under-estimated the true tumor path in both the MLE and Bayesian examples, suggesting that additional data may be required. In this experiment, we assume that the image data gj consists of (simulated) polyscopic data of the following form. We assume that we have access first to cell density measurements of the form g(n)j=H(n)nj+η(n)j, where H(n) is the same as in Experiment 1. The second dataset is of the form g(ρ)j=H(ρ)ρj+η(ρ)j, where we now have σ2blur=0.0425 (1 mm FWHM). As in the first experiment, we assume that imaging is performed at t=t0=100. The image reconstruction for nj is performed using MLEM as in experiment 1 (Figure 5). For ρj(x), we use MLE with a nonlinear synthesis map as follows. The lumpy-type expansion (2.8) with Gaussian lump (2.9) is assumed where now θj=[x1,,xLmax,b,σ2], that is, we fit the lump positions and a common lump amplitude and variance. The result is a reconstruction of the form

    ˆρj=ˆbLmaxl=1exp(12ˆσ2xˆxl2),

    where we have chosen Lmax=60 for this experiment. We assume further the constraints that xl[0,1]2, b(0,), and σ2(0,0.1], and use the Matlab algorithm fmincon to perform the MLE optimization (4.3). The results of this reconstruction are shown in Figure 9.

    Figure 9.  Maximum likelihood reconstruction of the growth factor ρj (left) from simulated imaging data (top right). The reconstruction (middle) with the resulting lump centers highlighted (black dots) are shown, as well as the mean image at the MLE ˉgρj=H(ρ)ˆρj (bottom right). The computational time for this reconstruction was several hours. While the reconstruction may appear visually poor, it is sufficient for the task of predicting n(x,t) and N(t), as illustrated in Figures 10 and 11.

    In Figure 10, we compare the tumor burden predicted using the virtual population to the 'true' tumor path. Over the first 30 day post-imaging period, the ensemble mean ˉN(t) for the virtual population tracks the true tumor path closely, indicating that if both n and ρ are estimated, the result appears unbiased over this short time period. This is also demonstrated by looking at the individual virtual population samples in Figure 11: Visual inspection reveals a closer overall appearance to the 'true' tumor than was apparent in Figure 7. Over a longer time period, the ensemble still trends towards a lower overall growth rate than the 'true' tumor path, as seen previously in Figure 6.

    Figure 10.  Prediction of tumor burden N(t) when MLEs of n(x,t0) and ρ(x) are used for the initial condition and growth factor. D(x) and κ(x) are randomized to produce a VPP (J=64). The original tumor path (bold) and empirical average of the VPP (dotted), as well as a one-σ confidence band (dashed) and individual VPP samples (light grey) are shown. The left plot shows t=0 to t=365, while the right plot shows t=100 (imaging time) to t=130. Over the 30 day post-imaging window, the VPP appears nearly unbiased (compare to Figure 6).
    Figure 11.  Example VCT Samples for t=150, i.e. 50 days post-imaging, generated using the MLEM estimate of nj(x,100) as the initial condition, the MLE of ρj(x) for the growth factor, and randomizing D(x) and κ(x) as described in the main text. All plots are shown on [0.3,0.7]2 and with a common colorbar. In contrast to Figure 7, the morphology apparent in the virtual population is nearly identical; heterogeneity in the overall growth is apparent in Figure 10.

    In this article, we have presented a rigorous, end-to-end statistical framework for addressing both inter- and intra-patient heterogeneity for the commonly used reaction-diffusion tumor growth model (2.1), emphasizing the usage of raw molecular emission imaging data to estimate unknown model parameters. We demonstrated these methods in two in silico experiments, using a simulated ground truth tumor path and simulated imaging data to illustrate patient-specific Virtual Patient Population (VPP) generation.

    In section 2.3, we presented a family of random field models that are both easy to simulate and demonstrate a wide variety of spatial heterogeneity; in section 4.3, we discussed how the parameters defining the statistics of these random field models can be estimated from a collection of molecular images, and discussed some of the computational challenges inherent in this task. Future work will discuss algorithms for solving this MLE problem, as well as additional estimation strategies to calibrate more general random field models from noisy data.

    In section 3, we discussed molecular Emission Computed Tomography (ECT) imaging systems and their associated statistical models. For simplicity, we emphasized the case of performing direct, possibly polyscopic, 2D imaging of a thin sample. While the 2D approximation is very limiting in general, it is applicable to the experimental setting of mouse window chambers, a topic of future work in our group. In another future work, we will address fully 3D tomographic ECT systems with particle-processing data and the associated computational and image quality questions that arise. Understanding image data quality in a task performance setting can help design better imaging systems [51], and to our knowledge, there is currently minimal effort formally evaluating the quality of image data in the mathematical oncology context.

    In section 4, we presented general likelihood-based statistical procedures for estimating both patient-specific and population parameters from noisy molecular imaging data, and in section 5, we presented two experiments to illustrate these procedures. In the first experiment, measurements of nj(x,100) are used to construct a maximum likelihood estimate and posterior samples, which are subsequently used to generate a patient-specific VPP. In the second experiment, we add an additional imaging study, using only maximum likelihood to estimate both nj(x,100) and ρj(x). The resulting virtual population demonstrated a more accurate prediction of the true Nj(t), over a 30 day post-imaging prediction period. In a future work, we will investigate the question of optimizing imaging acquisition schemes for the task of predicting N(t). We will also address the addition of treatment models and evaluation techniques discussed in [5] to our virtual population methodology, working towards providing a general strategy to solving the optimization problem (2.12).

    The author was supported by the NIBIB grant R01EB000803 and travel funding from the Statistical and Applied Mathematical Sciences Institute. The author also acknowledges very useful comments from reviewers, discussions with the SAMSI working group on tumor heterogeneity and with Harrison H. Barrett and Luca Caucci from the University of Arizona.

    The authors declare no conflict of interest for this paper.



    [1] R. C. Rockne, A. Hawkins-Daarud, K. R. Swanson, J. P. Sluka, J. A. Glazier, P. Macklin, et al., The 2019 mathematical oncology roadmap, Phys. Biol., 16 (2019), 041005. doi: 10.1088/1478-3975/ab1a09
    [2] J. T. Oden, E. A. Lima, R. C. Almeida, Y. Feng, M. N. Rylander, D. Fuentes, et al., Toward predictive multiscale modeling of vascular tumor growth, Arch. Comput. Methods Eng., 23 (2016), 735-779. doi: 10.1007/s11831-015-9156-x
    [3] T. E. Yankeelov, N. Atuegwu, D. Hormuth, J. A. Weis, S. L. Barnes, M. I. Miga, et al., Clinically relevant modeling of tumor growth and treatment response, Sci. Trans. Med., 5 (2013), 187ps9.
    [4] A. Baldock, R. Rockne, A. Boone, M. Neal, A. Hawkins-Daarud, D. Corwin, et al., From patientspecific mathematical neuro-oncology to precision medicine, Front. Oncol., 3 (2013).
    [5] N. Henscheid, E. Clarkson, K. J. Myers, H. H. Barrett, Physiological random processes in precision cancer therapy, PloS One, 13 (2018), e0199823.
    [6] N. Henscheid, Quantifying Uncertainties in Imaging-Based Precision Medicine, Ph.D thesis, University of Arizona, 2018.
    [7] P. L. Bedard, A. R. Hansen, M. J. Ratain, L. L. Siu, Tumour heterogeneity in the clinic, Nature, 501 (2013), 355. doi: 10.1038/nature12627
    [8] R. Everett, K. B. Flores, N. Henscheid, J. Lagergren, K. Larripa, D. Li, et al., A tutorial review of mathematical techniques for quantifying tumor heterogeneity, Math. Biosci. Eng., 17 (2020), 3660. doi: 10.3934/mbe.2020207
    [9] H. Barrett, K. Myers, Foundations of Image Science, John Wiley & Sons, New York, 2004.
    [10] K. Swanson, R. Rostomily, E. Alvord Jr, A mathematical modelling tool for predicting survival of individual patients following resection of glioblastoma: a proof of principle, Br. J. Cancer, 98 (2008), 113.
    [11] A. Badano, C. G. Graff, A. Badal, D. Sharma, R. Zeng, F. W. Samuelson, et al., Evaluation of digital breast tomosynthesis as replacement of full-field digital mammography using an in silico imaging trial, JAMA Network Open, 1 (2018), e185474.
    [12] A. Karolak, D. A. Markov, L. J. McCawley, K. A. Rejniak, Towards personalized computational oncology: from spatial models of tumour spheroids, to organoids, to tissues, J. R. Soc., Interface, 15 (2018), 20170703.
    [13] E. Abadi, W. P. Segars, B. M. Tsui, P. E. Kinahan, N. Bottenus, A. F. Frangi, et al., Virtual clinical trials in medical imaging: a review, J. Med. Imaging, 7 (2020), 042805.
    [14] K. R. Swanson, C. Bridge, J. Murray, E. C. Alvord, Virtual and real brain tumors: using mathematical modeling to quantify glioma growth and invasion, J. Neurol. Sci., 216 (2003), 1-10. doi: 10.1016/j.jns.2003.06.001
    [15] P. Tracqui, Biophysical models of tumour growth, Rep. Prog. Phys., 72 (2009), 056701. doi: 10.1088/0034-4885/72/5/056701
    [16] J. T. Oden, E. E. Prudencio, A. Hawkins-Daarud, Selection and assessment of phenomenological models of tumor growth, Math. Models Methods Appl. Sci., 23 (2013), 1309-1338.
    [17] E. M. Rutter, T. L. Stepien, B. J. Anderies, J. D. Plasencia, E. C. Woolf, A. C. Scheck, et al., Mathematical analysis of glioma growth in a murine model, Sci. Rep., 7 (2017), 2508. doi: 10.1038/s41598-017-02462-0
    [18] J. A. Weis, M. I. Miga, L. R. Arlinghaus, X. Li, A. B. Chakravarthy, V. Abramson, et al., A mechanically coupled reaction-diffusion model for predicting the response of breast tumors to neoadjuvant chemotherapy, Phys. Med. Biol., 58 (2013), 5851. doi: 10.1088/0031-9155/58/17/5851
    [19] C. Hogea, C. Davatzikos, G. Biros, An image-driven parameter estimation problem for a reaction-diffusion glioma growth model with mass effects, J. Math. Biol., 56 (2008), 793-825. doi: 10.1007/s00285-007-0139-x
    [20] T. L. Stepien, E. M. Rutter, Y. Kuang, Traveling waves of a go-or-grow model of glioma growth, SIAM J. Appl. Math., 78 (2018), 1778-1801. doi: 10.1137/17M1146257
    [21] E. J. Kostelich, Y. Kuang, J. M. McDaniel, N. Z. Moore, N. L. Martirosyan, M. C. Preul, Accurate state estimation from uncertain data and models: an application of data assimilation to mathematical models of human brain tumors, Biol. Direct, 6 (2011), 64. doi: 10.1186/1745-6150-6-64
    [22] M. Gyllenberg, G. F. Webb, Quiescence as an explanation of Gompertzian tumor growth, Growth Dev. Aging., 53 (1989), 25-33.
    [23] J. D. Logan, An Introduction to Nonlinear Partial Differential Equations, John Wiley & Sons, 2008.
    [24] W. Hundsdorfer, J. G. Verwer, Numerical Solution of Time-Dependent Advection-DiffusionReaction Equations, Springer Science & Business Media, 2013.
    [25] R. C. Smith, Uncertainty Quantification: Theory, Implementation, and Applications, SIAM, 2013.
    [26] X. Han, P. E. Kloeden, Random Ordinary Differential Equations and their Numerical Solution, Springer, 2017.
    [27] G. J. Lord, C. E. Powell, T. Shardlow, Introduction to Computational Stochastic PDEs, Cambridge, 2014.
    [28] E. Clarkson, H. H. Barrett, Characteristic functionals in imaging and image-quality assessment: tutorial, JOSA A, 33 (2016), 1464-1475. doi: 10.1364/JOSAA.33.001464
    [29] I. Gel'fand, N. Vilenkin, Generalized Functions, Vol. 4: Applications of Harmonic Analysis, Academic Press, New York, 1964.
    [30] T. Sullivan, Introduction to Uncertainty Quantification, Springer, 2015.
    [31] J. Rolland, H. H. Barrett, Effect of random background inhomogeneity on observer detection performance, JOSA A, 9 (1992), 649-658. doi: 10.1364/JOSAA.9.000649
    [32] F. Bochud, C. Abbey, M. Eckstein, Statistical texture synthesis of mammographic images with clustered lumpy backgrounds, Opt. Express, 4 (1999), 33-42. doi: 10.1364/OE.4.000033
    [33] L. G. Kessler, H. X. Barnhart, A. J. Buckler, K. R. Choudhury, M. V. Kondratovich, A. Toledano, et al., The emerging science of quantitative imaging biomarkers terminology and definitions for scientific studies and regulatory submissions, Stat. Methods Med. Res., 24 (2015), 9-26. doi: 10.1177/0962280214537333
    [34] J. P. O'connor, E. O. Aboagye, J. E. Adams, H. J. Aerts, S. F. Barrington, A. J. Beer, et al., Imaging biomarker roadmap for cancer studies, Nat. Rev. Clin. Oncol., 14 (2017), 169. doi: 10.1038/nrclinonc.2016.162
    [35] A. M. Jarrett, D. Faghihi, D. A. H. II, E. A. Lima, J. Virostko, G. Biros, et al., Optimal control theory for personalized therapeutic regimens in oncology: Background, history, challenges, and opportunities, J. Clin. Med., 9 (2020), 1314. doi: 10.3390/jcm9051314
    [36] A. Anderson, J. Xie, J. Pizzonia, R. Bronen, D. Spencer, J. Gore, Effects of cell volume fraction changes on apparent diffusion in human cells, Magn. Reson. Imaging, 18 (2000), 689-695. doi: 10.1016/S0730-725X(00)00147-8
    [37] D. A. Hormuth II, J. A. Weis, S. L. Barnes, M. I. Miga, E. C. Rericha, V. Quaranta, et al., Predicting in vivo glioma growth with the reaction diffusion equation constrained by quantitative magnetic resonance imaging data, Phys. Biol., 12 (2015), 046006. doi: 10.1088/1478-3975/12/4/046006
    [38] J. Czernin, Oncological applications of FDG-PET, in PET Molecular Imaging and its Biological Applications, Springer, (2004), 321-388.
    [39] R. M. Hoffman, Application of GFP imaging in cancer, Lab. Invest., 95 (2015), 432-452. doi: 10.1038/labinvest.2014.154
    [40] R. Schafer, H. M. Leung, A. F. Gmitro, Multi-modality imaging of a murine mammary window chamber for breast cancer research, BioTechniques, 57 (2014), 45-50.
    [41] H. Vesselle, J. Grierson, M. Muzi, J. M. Pugsley, R. A. Schmidt, P. Rabinowitz, et al., In vivo validation of 3 deoxy-3-[18F] fluorothymidine ([18F] FLT) as a proliferation imaging tracer in humans: correlation of [18F] FLT uptake by positron emission tomography with KI-67 immunohistochemistry and flow cytometry in human lung tumors, Clin. Cancer Res., 8 (2002), 3315-3323.
    [42] E. T. McKinley, G. D. Ayers, R. A. Smith, S. A. Saleh, P. Zhao, M. K., et al., Limits of [18 F]-FLT PET as a biomarker of proliferation in oncology, PloS One, 8 (2013), e58938.
    [43] E. Clarkson, M. A. Kupinski, H. H. Barrett, L. Furenlid, A task-based approach to adaptive and multimodality imaging, Proc. IEEE, 96 (2008), 500-511. doi: 10.1109/JPROC.2007.913553
    [44] H. H. Barrett, D. W. Wilson, B. M. Tsui, Noise properties of the EM algorithm. I. theory, Phys. Med. Biol., 39 (1994), 833. doi: 10.1088/0031-9155/39/5/004
    [45] Y. Ding, L. Caucci, H. H. Barrett, Charged-particle emission tomography, Med. Phys., 44 (2017), 2478-2489. doi: 10.1002/mp.12245
    [46] H. H. Barrett, W. C. Hunter, B. W. Miller, S. K. Moore, Y. Chen, L. R. Furenlid, Maximumlikelihood methods for processing signals from gamma-ray detectors, IEEE Trans. Nuclear Sci., 56 (2009), 725-735. doi: 10.1109/TNS.2009.2015308
    [47] A. K. Jha, H. H. Barrett, E. C. Frey, E. Clarkson, L. Caucci, M. A. Kupinski, Singular value decomposition for photon-processing nuclear imaging systems and applications for reconstruction and computing null functions, Phys. Med. Biol., 60 (2015), 7359. doi: 10.1088/0031-9155/60/18/7359
    [48] A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B (methodol.), 39 (1977), 1-22.
    [49] G. McLachlan, T. Krishnan, The EM Algorithm and Extensions, John Wiley & Sons, 2007.
    [50] R. H. Byrd, J. C. Gilbert, J. Nocedal, A trust region method based on interior point techniques for nonlinear programming, Math. Program., 89 (2000), 149-185. doi: 10.1007/PL00011391
    [51] H. H. Barrett, K. J. Myers, C. Hoeschen, M. A. Kupinski, M. P. Little, Task-based measures of image quality and their relation to radiation dose and patient risk, Phys. Med. Biol., 60 (2015), R1.
    [52] H. H. Barrett, J. Denny, R. F. Wagner, K. J. Myers, Objective assessment of image quality. II. Fisher information, Fourier crosstalk, and figures of merit for task performance, JOSA A, 12 (1995), 834-852.
    [53] Y. Pawitan, In All Likelihood: Statistical Modelling and Inference Using Likelihood, Oxford University Press, 2001.
    [54] M. Lê, H. Delingette, J. Kalpathy-Cramer, E. R. Gerstner, T. Batchelor, J. Unkelbach, et al., MRI based Bayesian personalization of a tumor growth model, IEEE Trans. Med. Imaging, 35 (2016), 2329-2339. doi: 10.1109/TMI.2016.2561098
    [55] A. M. Stuart, Inverse problems: a Bayesian perspective, Acta Numerica, 19 (2010), 451-559. doi: 10.1017/S0962492910000061
    [56] J. Kaipio, E. Somersalo, Statistical and Computational Inverse Problems, Springer Science & Business Media, 2006.
    [57] M. A. Kupinski, J. W. Hoppin, E. Clarkson, H. H. Barrett, Ideal-observer computation in medical imaging with use of markov-chain monte carlo techniques, JOSA A, 20 (2003), 430-438. doi: 10.1364/JOSAA.20.000430
    [58] R. Allen, T. R. Rieger, C. J. Musante, Efficient generation and selection of virtual populations in quantitative systems pharmacology models, CPT: Pharmacometrics Syst. Pharmacol., 5 (2016), 140-146. doi: 10.1002/psp4.12063
    [59] T. R. Rieger, R. J. Allen, L. Bystricky, Y. Chen, G. W. Colopy, Y. Cui, et al., Improving the generation and selection of virtual populations in quantitative systems pharmacology models, Prog. Biophys. Mol. Biol., 139 (2018), 15-22. doi: 10.1016/j.pbiomolbio.2018.06.002
    [60] B. Galerne, A. Leclaire, L. Moisan, Texton noise, in Computer Graphics Forum, Wiley Online Library, 2017.
    [61] M. A. Kupinski, E. Clarkson, J. W. Hoppin, L. Chen, H. H. Barrett, Experimental determination of object statistics from noisy images, JOSA A, 20 (2003), 421-429. doi: 10.1364/JOSAA.20.000421
  • This article has been cited by:

    1. Eulalie Courcelles, Jean-Pierre Boissel, Jacques Massol, Ingrid Klingmann, Riad Kahoul, Marc Hommel, Emmanuel Pham, Alexander Kulesza, Solving the Evidence Interpretability Crisis in Health Technology Assessment: A Role for Mechanistic Models?, 2022, 4, 2673-3129, 10.3389/fmedt.2022.810315
    2. Qicai Huang, Miaochao Chen, Adaptive Extraction of Oil Painting Texture Features Based on Reaction Diffusion Equation, 2021, 2021, 1687-9139, 1, 10.1155/2021/4464985
    3. Amine MOUSTAFİD, Set-Valued Stabilization of Reaction-Diffusion Model by Chemotherapy and or Radiotherapy, 2023, 6, 2645-8845, 147, 10.33401/fujma.1299982
    4. Yash Vats, Mani Mehra, Dietmar Oelz, Abhishek Kumar Singh, A New Perspective for Scientific Modelling: Sparse Reconstruction-Based Approach for Learning Time-Space Fractional Differential Equations, 2024, 19, 1555-1415, 10.1115/1.4066330
  • Reader Comments
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Metrics

Article views(4050) PDF downloads(97) Cited by(4)

Figures and Tables

Figures(11)

Other Articles By Authors

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return

Catalog