Research article

Transient thermoelastic responses in spherical elastic porous media using a fractional two-phase-lag model with space-time nonlocality

  • This study investigated the impact of the fractional Caputo-tempered two-phase-lag (FCT-TPL) heat conduction model on thermoelastic vibrations within a medium containing spherical cavities and voids. In the proposed model, the nonlocality of time and space is integrated to unify classical and generalized thermoelastic theories, enabling a thorough investigation of size-dependent phenomena and the scattering characteristics of thermo-mechanical waves. Also, by integrating fractional calculus with tempered derivatives, the proposed model adeptly captures the complex interaction between localized thermal effects and nonlocal mechanical responses, particularly in materials with pronounced microstructural features. The fractional order and tempering parameter are shown to play a crucial role in controlling thermal relaxation times and the amplitude of thermoelastic vibrations. The findings reveal that the integration of the fractional Caputo-tempered derivative, along with temporal and spatial nonlocal effects, into the two-phase-lag model significantly improves the accuracy of predicting transient thermoelastic responses in materials with cavities and voids.

    Citation: Kareem Alanazi, Ahmed E. Abouelregal. Transient thermoelastic responses in spherical elastic porous media using a fractional two-phase-lag model with space-time nonlocality[J]. AIMS Mathematics, 2025, 10(5): 12661-12688. doi: 10.3934/math.2025571

    Related Papers:

    [1] Paolo Dell'Aversana . Reservoir geophysical monitoring supported by artificial general intelligence and Q-Learning for oil production optimization. AIMS Geosciences, 2024, 10(3): 641-661. doi: 10.3934/geosci.2024033
    [2] Dell’Aversana Paolo, Bernasconi Giancarlo, Chiappa Fabio . A Global Integration Platform for Optimizing Cooperative Modeling and Simultaneous Joint Inversion of Multi-domain Geophysical Data. AIMS Geosciences, 2016, 2(1): 1-31. doi: 10.3934/geosci.2016.1.1
    [3] Paolo Dell’Aversana, Gianluca Gabbriellini, Alfonso Iunio Marini, Alfonso Amendola . Application of Musical Information Retrieval (MIR) Techniques to Seismic Facies Classification. Examples in Hydrocarbon Exploration. AIMS Geosciences, 2016, 2(4): 413-425. doi: 10.3934/geosci.2016.4.413
    [4] Paolo Dell'Aversana . Reservoir prescriptive management combining electric resistivity tomography and machine learning. AIMS Geosciences, 2021, 7(2): 138-161. doi: 10.3934/geosci.2021009
    [5] Thompson Lennox, Velasco Aaron A., Kreinovich Vladik . A Multi-Objective Optimization Framework for Joint Inversion. AIMS Geosciences, 2016, 2(1): 63-87. doi: 10.3934/geosci.2016.1.63
    [6] Zamora Azucena, A.Velasco Aaron . Inversion of Gravity Anomalies Using Primal-Dual Interior Point Methods. AIMS Geosciences, 2016, 2(2): 116-151. doi: 10.3934/geosci.2016.2.116
    [7] Santiago Quinteros, Aleksander Gundersen, Jean-Sebastien L'Heureux, J. Antonio H. Carraro, Richard Jardine . Øysand research site: Geotechnical characterisation of deltaic sandy-silty soils. AIMS Geosciences, 2019, 5(4): 750-783. doi: 10.3934/geosci.2019.4.750
    [8] Eve-Agnès Fiorentino, Sheldon Warden, Maksim Bano, Pascal Sailhac, Thomas Perrier . One-off geophysical detection of chlorinated DNAPL during remediation of an industrial site: a case study. AIMS Geosciences, 2021, 7(1): 1-21. doi: 10.3934/geosci.2021001
    [9] Ayesha Nadeem, Muhammad Farhan Hanif, Muhammad Sabir Naveed, Muhammad Tahir Hassan, Mustabshirha Gul, Naveed Husnain, Jianchun Mi . AI-Driven precision in solar forecasting: Breakthroughs in machine learning and deep learning. AIMS Geosciences, 2024, 10(4): 684-734. doi: 10.3934/geosci.2024035
    [10] John D Alexopoulos, Nikolaos Voulgaris, Spyridon Dilalos, Georgia S Mitsika, Ioannis-Konstantinos Giannopoulos, Vassileios Gkosios, Nena Galanidou . A geophysical insight of the lithostratigraphic subsurface of Rodafnidia area (Lesbos Isl., Greece). AIMS Geosciences, 2023, 9(4): 769-782. doi: 10.3934/geosci.2023041



    In mathematics, computer science, and economics, as well as in other disciplines such as geophysics, solving an optimization problem consists of finding the best of all possible solutions in a given model space [1]. This goal is achieved by minimizing (or maximizing) some objective function that, in many practical cases, measures the difference between observed and predicted quantities. For instance, a typical optimization problem in geophysics is finding an Earth model, expressed as a spatial distribution of seismic velocities, that minimizes the differences between observed and predicted seismic travel times [2].

    Optimization techniques can be divided into approaches that explore the model space locally and approaches that perform a global or quasi-global search for the solution. In the first case, we generally run into the problem of convergence towards local minima (or local maxima) of the cost function: the final solution depends strongly on the initial model and on the exploration path in the parameter space. In general, when we apply local optimization techniques, we search for a solution in a limited portion of the model space, converging towards solutions that may not correspond to the best one for our specific problem. To address this issue, global optimization techniques aim to find the global minimum (or the global maximum) of the objective function over the given set. Unfortunately, finding the global minimum (or maximum) of a function is commonly a difficult task: analytical methods are frequently not applicable, and numerical solution strategies are often insufficient [3]. Typical techniques based on a global or quasi-global search of the model space [4] include stochastic methods such as direct Monte Carlo sampling. Other methods rely on heuristics to explore the model space in a more or less intelligent way; these include, for instance, Ant Colony Optimization (ACO), simulated annealing, and evolutionary algorithms (e.g., genetic algorithms and evolution strategies). Despite their many advantages, these global optimization methods are generally difficult to put into practice, especially in three dimensions, because of the very expensive computations required for large parameter spaces.

    To address the intrinsic problems of both local and global optimization methods, in this paper we propose to reformulate optimization problems in terms of Reinforcement Learning (RL). Our approach aims to teach an "artificial agent" to search for the global minimum of the cost function in the model space, exploiting the advantages offered by a large suite of Reinforcement Learning algorithms. These algorithms are aimed at mapping situations to actions through the maximization of a "numerical reward signal" [5,6,7,8,9,10,11,12,13]. In every particular state, an artificial agent learns progressively by continuous interaction with its environment. This can be a true physical environment, as happens, for instance, when we want to teach an agent to move through a real physical space. More generally, the environment can consist of a virtual space with which one or more artificial agents interact. The effect of every agent's action is returned by the modified environment in terms of a reward (or a punishment) and a new state. The reward depends on the "quality" of the agent's actions: high rewards correspond to actions with a positive impact on the agent's target, and vice versa. For instance, if the objective of the artificial agent is to find the exit from a maze in the shortest possible time (or through the shortest path), the agent will receive a positive reward every time it moves in a way that brings it closer to the exit.

    The final objective of such a learning strategy is to maximize the total reward accumulated over all iterations (the cumulative reward), and not just the immediate reward. In the maze example, this means that the agent's objective is to find a global strategy to escape from the maze, rather than just selecting a single local step forward that could lead it into a dead end. This is a crucial point, because the goal of Reinforcement Learning methods is to optimize the agent's actions over a long-term horizon. This intrinsically forward-looking character of RL algorithms can be profitably exploited to find global solutions to many optimization/inversion problems in geophysics (as well as in other fields). In fact, it is easy to grasp the analogies and possible points of connection between geophysical inversion problems and Reinforcement Learning: in the first case, the goal is to find an Earth model that corresponds to a minimum value of a certain cost function; in the second case, the goal is to find an optimal policy through which an agent can maximize its total reward. Both are examples of optimization problems.

    In the next methodological section, we will see how the geophysical inverse problem can be reformulated as a Reinforcement Learning strategy. For that purpose, we will use a combination of Q-Learning, Temporal Difference, and Epsilon-Greedy algorithms, and we will see that these methods are well suited to optimizing the exploration of the parameter space in inversion problems. Finally, we will test our approach on synthetic geo-electric data and on a seismic data set available in the public domain.

    Reinforcement Learning includes a suite of algorithms and techniques through which an "artificial agent" learns an optimal "behavior" by interacting with a dynamic "environment" and by maximizing a "reward metric" for the task, without being explicitly programmed for that task and without human intervention. The artificial agent selects the actions that increase the cumulative reward, r ∈ R, achievable from a given state, s ∈ S (Figure 1).

    Figure 1.  Conceptual scheme of Reinforcement Learning.

    A "discount factor", γ, is applied to long-term rewards in order to give progressively lower weight to rewards received far in the future. The agent's goal is to learn, by trial and error, a "policy" that maximizes this cumulative long-term reward. The policy, often denoted by the symbol π, is a function that maps the current environment state, s, belonging to the set S of all possible states, to an action, a, belonging to the set A of all possible actions:

    \pi(s): S \rightarrow A. \qquad (1)
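    For reference, the discounted cumulative reward (the return) that such a policy is meant to maximize can be written in the standard Reinforcement Learning form; this expression is not part of the original text, but it follows directly from the description of the discount factor above:

    G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma \le 1.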

    There are many different Reinforcement Learning techniques. Among them, the Q-Learning method [14] is particularly suitable for solving optimization/inverse problems. The name derives from the Q-function, which provides a measure of the Quality (in terms of effectiveness for a given task) of an action taken by the agent from a certain state. It is defined as follows:

    Q(s,a): S \times A \rightarrow \mathbb{R}. \qquad (2)

    The Bellman equation below provides an operative definition of the maximum cumulative reward. This is given by the reward r that the agent receives for entering the current state s through action a, plus the maximum future reward obtainable from the next state s′ over all possible actions a′ available from that state:

    Q(s,a) = r + \gamma \max_{a'} Q(s',a'). \qquad (3)

    In formula (3), the symbol γ indicates the "discount factor", introduced to balance the contribution of future rewards against the immediate reward. The value of Q(s, a) can be found recursively: the algorithm starts from random (or arbitrary guess) values for the Q-function; then, as the agent explores its environment, the initial Q values progressively converge towards the optimal ones, based on the positive and/or negative feedback that the agent receives from the environment. The "Temporal Difference" (TD) method (formula (4) below) provides a practical way of updating the Q values:

    Q^{\mathrm{new}}(s_t,a_t) \leftarrow Q(s_t,a_t) + \alpha \left[ r_t + \gamma \max_{a} Q(s_{t+1},a) - Q(s_t,a_t) \right]. \qquad (4)

    We can see that the new value of Q for state s_t and action a_t is obtained by adding to the previous Q value a new term (in square brackets) called the temporal difference. This term is multiplied by a factor α that represents the learning rate and is commonly set empirically by the user. The temporal difference consists of the immediate reward, r_t, plus the maximum Q value over all the actions that the agent can take from the state s_{t+1} (multiplied by the above-mentioned discount factor, γ), minus the old value of Q.
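    As an illustration, the tabular update of formula (4) can be sketched in a few lines of Python; the function below is a generic Q-Learning/Temporal Difference step with a dictionary-based Q-table, and all names and default parameter values are illustrative rather than taken from the article.

```python
def td_update(q_table, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-Learning / Temporal Difference step (formula (4)).

    q_table : dict mapping (state, action) -> Q value
    s, a    : current state and the action just taken
    reward  : immediate reward r_t received for that action
    s_next  : state reached after the action
    actions : actions available from s_next
    """
    # Best estimated value reachable from the next state
    max_q_next = max((q_table.get((s_next, a2), 0.0) for a2 in actions), default=0.0)
    # Temporal difference: (target) - (current estimate)
    td = reward + gamma * max_q_next - q_table.get((s, a), 0.0)
    # Move the current estimate towards the target by the learning rate alpha
    q_table[(s, a)] = q_table.get((s, a), 0.0) + alpha * td
    return q_table[(s, a)]
```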

    Now, we must explain how we define the Q values in the frame of our integrated Inversion-Reinforcement Learning approach (called, briefly, RL-Inv). In other words, we must clarify how we assign a reward to the artificial agent (the optimization algorithm) while it explores the model space. In our method, we set the Q-function inversely proportional to the cost function (which, in turn, depends on the difference between observed and predicted responses) after a certain number N of iterations; the user sets this value of N empirically. Indeed, we assume that a good convergence path towards a low final misfit represents a reasonable long-term reward for our Reinforcement Learning agent. In that case, a low misfit (as well as a low value of the cost function) corresponds to a high reward and a high Q value.

    For instance, let us suppose that we apply a least-squares optimization algorithm to solve our inverse problem; that algorithm plays the role of our agent. In that case, we can define the cost function Φ(m) as follows:

    \Phi(\mathbf{m}) = \left(\mathbf{d}_{\mathrm{obs}} - g(\mathbf{m})\right)^{T} \mathbf{W}_{d} \left(\mathbf{d}_{\mathrm{obs}} - g(\mathbf{m})\right) + \eta\, \mathbf{m}^{T} \mathbf{R}\, \mathbf{m}. \qquad (5)

    In formula (5), m represents the vector of model parameters (the model vector); d_obs is the data vector (the observations); g(m) is the forward operator by which we calculate the predicted response for the model vector m; the superscript T indicates the transpose; W_d is the data covariance matrix that takes data uncertainties into account; R is a smoothing operator applied to the model vector m as a regularization term; and η is a factor regulating the weight of the smoothing term in the cost function.
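    For clarity, a direct transcription of formula (5) into Python could look as follows; the argument names are hypothetical, and the forward operator g(m) is passed in as a callable.

```python
import numpy as np

def cost_function(m, d_obs, g, W_d, R, eta):
    """Regularized least-squares cost, formula (5).

    m     : model parameter vector (1D array)
    d_obs : observed data vector
    g     : callable returning the predicted response g(m)
    W_d   : data weighting matrix (accounts for data uncertainties)
    R     : smoothing (regularization) operator
    eta   : weight of the smoothing term
    """
    residual = d_obs - g(m)                  # observed minus predicted data
    data_term = residual.T @ W_d @ residual  # weighted data misfit
    model_term = eta * (m.T @ R @ m)         # smoothness/regularization term
    return float(data_term + model_term)
```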

    In our procedure, we calculate Φ(m) at each iteration and store its value, so that we can calculate and store the corresponding Q value as follows:

    Q(s_t,a_t) \propto \frac{1}{\Phi(\mathbf{m})}. \qquad (6)

    Next, let us clarify how the Q-Learning formulas contribute to the inversion. In the frame of the Q-Learning approach, we need to estimate a cumulative reward that takes into account both the immediate and the long-term reward. In our approach, the immediate reward is given by the inverse of the cost function after just one or two iterations, as in formula (6). The long-term reward, instead, is given by the inverse of the cost function estimated after a "significant number" of iterations (this number depends on the inverse problem and is decided by the user, case by case). In this way, we set a policy that minimizes the cost function through a balanced combination of short-term and long-term views. This concept will be further expanded in the next two sections.
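    The following minimal sketch shows one possible way to turn a stored cost-function history into the short-term and long-term rewards described above (formula (6)); the choice of using the first two iterations and the small stabilizing constant are illustrative assumptions, not details from the article.

```python
def rewards_from_costs(cost_history, n_short=2, eps=1e-12):
    """Short- and long-term rewards as inverse cost values (formula (6)).

    cost_history : list of Phi(m) values, one per iteration, for a candidate model
    n_short      : number of early iterations defining the short-term reward
    eps          : small constant guarding against division by zero
    """
    i_short = min(n_short, len(cost_history)) - 1
    short_term = 1.0 / (cost_history[i_short] + eps)  # reward after 1-2 iterations
    long_term = 1.0 / (cost_history[-1] + eps)        # reward after N iterations
    return short_term, long_term
```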

    The Bellman equation (3) and the Temporal Difference iterative method (4) allow us to estimate and progressively update the values of the Q-function during the optimization (inversion) process. These values depend on the starting models and on the exploration paths in the model space. The goal of our approach is to find an optimal policy for our optimization agent; such a policy coincides with the "optimal" exploration/exploitation path in the model space, i.e., the one that maximizes the Q-function. Hence, a crucial point is how the model space (which represents the environment of our Reinforcement Learning approach) is explored.

    In the frame of geophysical inversion (as well as of other optimization problems), the environment of the Reinforcement Learning problem is represented by the space of model parameters, or model space (Figure 2). As mentioned earlier, the agent corresponds to the optimization algorithm through which we try to minimize the cost function. At each iteration, the algorithm performs an action: it explores the environment in order to update the current geophysical model, with the goal of reducing the misfit between observed and predicted responses. In our approach, we perform this exploration using the Epsilon-Greedy algorithm, which provides an effective strategy for addressing the well-known "exploration vs. exploitation" dilemma. Let us explain the basics of this strategy and the reason why we included it in our approach.

    Figure 2.  Conceptual link between the Reinforcement Learning approach and the exploration of the model space in optimization problems.

    Exploration allows an agent to improve its current state with each action, leading to a long-term benefit. In the frame of geophysical inversion, this corresponds to retrieving a distribution of model parameters that lowers the cost function (or the misfit) and, consequently, improves the Earth model. Exploitation, on the other hand, means choosing the greedy action that yields the largest short-term reward based on the agent's current action-value estimates. For instance, in the case of gradient-based optimization methods, this action corresponds to taking repeated steps in the direction opposite to the gradient of the cost function. The crucial point is that being greedy with respect to immediate action-reward estimates may not actually lead towards the maximum long-term reward, causing sub-optimal behaviour. In other words, trying to minimize the cost function at each step may not represent the optimal inversion policy.

    Epsilon-Greedy is an effective approach for balancing exploration and exploitation by choosing randomly between these two possibilities. The term "epsilon" refers to the probability of choosing to explore, which is commonly lower than the probability of exploiting. In other words, the optimization/inversion algorithm exploits most of the time, with a small chance of exploring: it updates the model parameters under the condition of reducing the cost function at each iteration (exploitation), but it also explores the model parameters in different directions, with a lower probability (epsilon ≪ 1), even if that choice implies a temporary increase of the cost function. Figure 3 shows a scheme of this approach and its pseudo-code.

    Figure 3.  Scheme of the Epsilon-Greedy approach (left) and its pseudo-code (right).
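    A generic Epsilon-Greedy selection rule consistent with the description above (and not a transcription of the pseudo-code in Figure 3) can be sketched as follows; candidate models are identified by their index in a vector of cumulative rewards.

```python
import numpy as np

def epsilon_greedy_choice(q_values, epsilon, rng=None):
    """Select a candidate-model index with the Epsilon-Greedy rule.

    q_values : 1D array of cumulative rewards (Q values), one per candidate model
    epsilon  : probability of exploring (picking a random model)
    """
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: random model
    return int(np.argmax(q_values))              # exploit: best model so far
```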

    At the same time, by applying the Bellman equation and the Temporal Difference method, we aim at a long-term reward, that is, minimizing the cost function after a significant number N of iterations (and not just at each individual iteration). This strategy allows us to sample large portions of the model space that would otherwise be excluded by a traditional greedy optimization strategy. In the end, we obtain the optimal inversion policy: the one that uses the best exploitation/exploration strategy, produces the lowest final value of the cost function, and yields the best inverted model.

    The block diagram of Figure 4 summarizes the entire procedure, showing the sequence of steps through which we update the model parameters by maximizing the Q-function through a combination of the Epsilon-Greedy exploration strategy and the Bellman/Temporal Difference equations.

    Figure 4.  Block diagram of the Reinforcement Learning-Inversion (RL-Inv) approach.

    With reference to Figure 4, and in order to better clarify how and where the Q-Learning formulas contribute to the inversion process, we schematize the entire workflow through the following key steps (a schematic code sketch follows the list):

    1) Create m starting models (process initialization).
    2) Choose n (the number of iterations).
    3) Run n iterations for each model.
    4) Update each model after the n iterations.
    5) Calculate the inverse of the cost function (formula (6)) after 1 or 2 iterations (short-term reward for each model).
    6) Calculate the inverse of the cost function (formula (6)) after n iterations (long-term reward for each model).
    7) Calculate (or update) the cumulative reward (Q values) using the Bellman and TD formulas (formulas (3) and (4)).
    8) Store the Q values and update the Q-Table.
    9) Choose epsilon (for the Epsilon-Greedy method), as shown in Figure 3.
    10) Select the model with the highest total reward with probability 1 − epsilon (exploitation).
    11) Alternatively, select a random model with probability epsilon (exploration).
    12) Perturb the selected model to create m new initial models.
    13) Iterate from step 3.
    14) Exit the loop when the cost function and the cumulative reward Q are stationary.
    15) Finally, select the model with the highest Q value (lowest cost function).
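    To make the workflow concrete, the sketch below strings the previous snippets together into a schematic RL-Inv loop. It is not the authors' implementation: the helper callables (run_iterations, perturb), the stopping criterion, and all parameter values are hypothetical stand-ins for the problem-specific parts.

```python
import numpy as np

def rl_inversion(starting_models, run_iterations, perturb, n_iter=10, n_outer=50,
                 epsilon=0.2, alpha=0.1, gamma=0.9, n_short=2, tol=1e-3, rng=None):
    """Schematic RL-Inv loop following steps 1-15 above.

    starting_models : list of m initial model vectors (step 1)
    run_iterations  : callable(model, n) -> (updated_model, cost_history),
                      wrapping the local optimizer that acts as the agent (steps 3-4)
    perturb         : callable(model, rng) -> new candidate model (step 12)
    """
    if rng is None:
        rng = np.random.default_rng()
    models = list(starting_models)
    q_values = np.zeros(len(models))
    best_model, best_cost = None, np.inf

    for _ in range(n_outer):
        for i, m in enumerate(models):
            m_new, costs = run_iterations(m, n_iter)                     # steps 3-4
            short = 1.0 / (costs[min(n_short, len(costs)) - 1] + 1e-12)  # step 5
            long_term = 1.0 / (costs[-1] + 1e-12)                        # step 6
            # Bellman/TD-style update of the cumulative reward (steps 7-8)
            q_values[i] += alpha * (short + gamma * long_term - q_values[i])
            models[i] = m_new
            if costs[-1] < best_cost:
                best_model, best_cost = m_new, costs[-1]
        # Epsilon-Greedy selection of the seed model (steps 9-11)
        if rng.random() < epsilon:
            seed = models[int(rng.integers(len(models)))]
        else:
            seed = models[int(np.argmax(q_values))]
        # New generation of candidates around the selected model (steps 12-13);
        # Q values are carried over as rough estimates for the new candidates.
        models = [perturb(seed, rng) for _ in models]
        if best_cost < tol:  # crude proxy for the stationarity check of step 14
            break
    return best_model, best_cost                                         # step 15
```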

    In this section, we discuss two tests in which we apply the RL-Inv method to two types of data sets. In the first case, we use synthetic data obtained through a simulated resistivity survey; in the second case, we use refraction seismic data available in the public domain. For each test, we compare the final models obtained through a "standard" inversion/optimization approach and through the RL-Inv methodology.

    In this test, we simulated the acquisition of DC (direct current) geo-electric data along a 550 m long line, with electrodes deployed at a regular spacing of 10 m. The upper panel of Figure 5 shows the "true" resistivity scenario in which we simulated the survey: the model consists of two stacked resistive layers embedded in a uniform conductive background. The lower panel of the same figure shows the data (apparent resistivity section) of the simulated DC response. After adding 5% Gaussian noise to the simulated response, our goal was to invert the synthetic data in order to retrieve the correct resistivity model. We started from a half-space initial guess, assuming no a priori information.

    Figure 5.  "True" (original) resistivity model (upper panel) and observed apparent resistivity (lower panel). Colour scale represents resistivity, in Ω·m.

    Despite its apparent simplicity, the resistivity model shown in Figure 5 is not easy to retrieve by data inversion without using any prior information: many equivalent geophysical models can honour the data equally well if no constraint is applied. The inversion algorithm that we used in this case is a "standard" damped least-squares optimization algorithm that iteratively minimizes a cost function like the one expressed by formula (5). The regularization operator consists of a smoothing functional that favours smooth model solutions. As a consequence, the two resistive layers cannot be adequately distinguished and, after the inversion process, they appear "mixed" into a single layer, as clearly shown in Figure 6.

    Figure 6.  Inverted resistivity model (upper panel) using a Damped Least Square Optimization algorithm. The "true" model is shown again in the lower panel, for comparison.

    Next, we inverted the same synthetic data again, this time through our Reinforcement Learning approach (RL-Inv), in order to verify whether it was possible to find an inverse solution more consistent with the original resistivity model. Figure 7 shows the inverted resistivity model (upper panel). In this case, the RL-Inv solution shows the two resistive layers properly separated. Furthermore, they were retrieved with almost correct resistivity values, although the resistivity of the upper layer is slightly overestimated.

    Figure 7.  Inverted resistivity model (upper panel) using the RL-Inv approach. The "true" model is shown again in the lower panel, for comparison.

    Figure 8 shows the cross plot of predicted vs. observed apparent resistivity for both inversion results. This type of graph is useful because it provides a synoptic view of the misfit between observed and predicted geo-electrical responses: in the case of a perfect fit, the points would lie on a 45-degree line (the green line in the figure). The scattering of the points around the ideal best-fit line is a measure of the misfit and of the noise in the data. Both cross plots show some level of scattering and of resistivity overestimation; however, the misfit of the second inversion result (from RL-Inv) is smaller than the one obtained through the traditional damped least-squares approach. Furthermore, the second cross plot shows two clusters of scattered points that are related to the two separate resistive layers.

    Figure 8.  Cross plot of predicted vs. observed apparent resistivity for the Damped Least Square inversion result (upper panel), compared with the cross plot for RL-inv results (lower panel).

    In summary, the RL-Inv approach produced results that are more consistent with the original resistivity scenario used for the simulation.

    In this second example, we applied the RL-Inv method to a classical refraction seismic data set with a heterogeneous overburden and a high-velocity bedrock. This data set is included in the examples provided in the public-domain repository prepared for testing the open-source pyGIMLi software library [15]. Figure 9 shows the data set in terms of travel times vs. offsets. The complex trends of the travel-time curves vs. offset suggest significant variability in the velocity field: frequent changes in the slope of the curves indicate lateral as well as vertical velocity variations. Such complexity in the data space corresponds to a similar complexity in the model space. In scenarios like this, our RL-Inv approach can help to find a global solution to the refraction tomography problem, limiting the risk of falling into local minima of the cost function during the inversion process. We followed the scheme of Figure 3, exploring the model space through the Epsilon-Greedy approach. First, we created an initial Q-Table based on the cost-function values (here expressed in terms of χ² values) for a set of different starting models (Table 1). Next, the optimization agent started exploring the model space (in this case, the unknown model parameter is the P-wave velocity, Vp) through the Epsilon-Greedy approach.

    Figure 9.  Data set: refraction travel-times (s) vs. offsets, x(m).
    Table 1.  Q-Table filled with the inverse values of the cost function for each search direction.


    Figure 10 shows an example of "model selection histograms" obtained by exploring the model space with the Epsilon-Greedy method. The bars of each histogram are proportional to the probability of selecting one model among many possible starting models. In this example, we considered just 20 candidate models, for illustrative purposes. For each model, we calculated the cumulative reward using the Bellman formulas, as explained in the methodological section. We can see that, for low values of the epsilon parameter, the method selects almost exclusively the model(s) with a high cumulative reward (some examples are indicated by the arrows in Figure 10). This corresponds to adopting a greedy strategy, with a prevalence of exploitation of the model(s) with a high reward. On the other hand, by choosing high values of epsilon, model selection tends to become random, allowing the model space to be explored along directions that would otherwise be ignored. In other words, an appropriate setting of the epsilon parameter allows a balanced policy between exploration and exploitation in the model space during the inversion process. In this specific case, we ran many trials setting the epsilon parameter in the range between 0.0 and 1.0. There is no absolute rule for finding the optimal value of epsilon; however, a good strategy is to make epsilon variable: as the trials increase, epsilon should decrease. Indeed, as the trials increase, we have less need for exploration and more to gain from exploitation, in order to get the maximum benefit from our policy.

    Figure 10.  Example of "Model selection histograms" using the Epsilon-Greedy method, for variable values of epsilon. Test on 20 different models.
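    One common way to implement the decreasing-epsilon schedule described above is a simple exponential decay with the trial number; the function and parameter values below are purely illustrative, not taken from the article.

```python
def decaying_epsilon(trial, eps_start=1.0, eps_min=0.05, decay=0.95):
    """Exponentially decaying exploration probability.

    Early trials favour exploration (large epsilon); later trials favour
    exploitation of the best models found so far (small epsilon).
    """
    return max(eps_min, eps_start * decay ** trial)
```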

    During the inversion process, the Q-Table was progressively updated. As explained earlier, the rule for updating the Q-Table is given by the Bellman equation and the iterative Temporal Difference method. In summary, the agent (the minimization algorithm) explores the model space and selects the optimal path, which corresponds to the direction in the parameter space with the highest cumulative reward. At the same time, it does not neglect to explore alternative directions in the model space, although with lower probability. After many iterations, the agent learns to move in the model space following the most convenient policy, i.e., the one that allows it to find the global minimum of the cost function. Our inversion test seems to confirm the effectiveness of this strategy, as in the previous test. Figure 11 shows some examples of velocity models obtained by travel-time tomography, together with the corresponding ray tracing. Each individual model corresponds to a certain point of the cost function in the model space; for each path explored in the model space, we therefore have a corresponding suite of cost-function values. Finally, the best model (left panel of Figure 12) is the one retrieved through the RL-Inv approach: it shows the Vp distribution that corresponds to the highest cumulative reward. For comparison, the right panel of the same figure shows the Vp model obtained without the support of the RL approach, using a "standard" optimization approach. Compared with the RL-Inv solution, the "standard" solution tends to overestimate the bedrock velocity and is unable to properly resolve the heterogeneities in the overburden.

    Figure 11.  Examples of velocity models obtained by travel-time tomography, with the corresponding ray tracing.
    Figure 12.  Comparison between the inverted Vp models obtained by RL-Inv (left) and by a "standard" seismic refraction tomography approach based on a generalized Gauss-Newton optimization method (right).

    We introduced a new optimization/inversion approach fully integrated with the Q-Learning, Temporal Difference, and Epsilon-Greedy methods. These methods expand the exploration of the model space, minimize the misfit, and limit the problem of falling into local minima during the inversion. The advantages of our approach are clearly highlighted by the comparative test results on multidisciplinary data (electrical and seismic). Finally, we remark that we expect the greatest benefits from our method in those applications where an extended exploration of the model space is difficult or prohibitive, due to the size of the data and model spaces and the complexity of the inversion problem. Interesting cases include, for instance, full-waveform seismic inversion and simultaneous joint inversion of multi-physics data.

    The author declares no conflict of interest.

    pyGIMLi examples data repository: https://github.com/gimli-org/example-data/blob/master/traveltime/koenigsee.sgt.



    [1] W. Pabst, The linear theory of thermoelasticity from the viewpoint of rational thermomechanics, Cera.-Silikaty, 49 (2005), 242–251. Available from: https://www.ceramics-silikaty.cz/index.php?page=cs_detail_doi&id=611.
    [2] W. Nowacki, Dynamic problems of thermoelasticity, Springer Science & Business Media, 1975.
    [3] Q. Wang, S. Ge, D. Wu, H. Ma, J. Kang, M. Liu, et al., Evolution of microstructural characteristics during creep behavior of Inconel 718 alloy, Mater. Sci. Eng. A, 857 (2022), 143859. https://doi.org/10.1016/j.msea.2022.143859 doi: 10.1016/j.msea.2022.143859
    [4] D. Y. Tzou, A unified field approach for heat conduction from macro-to micro-scales, ASME J. Heat Transfer, 117 (1995), 8–16. https://doi.org/10.1115/1.2822329 doi: 10.1115/1.2822329
    [5] D. Y. Tzou, Macro-to microscale heat transfer: the lagging behavior, John Wiley & Sons, 2014.
    [6] D. Y. Tzou, The generalized lagging response in small-scale and high-rate heating, Int. J. Heat Mass Transfer, 38 (1995), 3231–3240. https://doi.org/10.1016/0017-9310(95)00052-B doi: 10.1016/0017-9310(95)00052-B
    [7] H. W. Lord, Y. Shulman, A generalized dynamical theory of thermoelasticity, J. Mech. Phys. Solids, 15 (1967), 299–309. https://doi.org/10.1016/0022-5096(67)90024-5 doi: 10.1016/0022-5096(67)90024-5
    [8] A. E. Green, K. Lindsay, Thermoelasticity, J. Elasticity, 2 (1972), 1–7. https://doi.org/10.1007/BF00045689 doi: 10.1007/BF00045689
    [9] M. Shariati, M. Shishesaz, H. Sahbafar, M. Pourabdy, M. Hosseini, A review on stress-driven nonlocal elasticity theory, J. Comput. Appl. Mech., 52 (2021), 535–552. https://doi.org/10.22059/jcamech.2021.331410.653 doi: 10.22059/jcamech.2021.331410.653
    [10] A. C. Eringen, J. L. Wegner, Nonlocal continuum field theories, Appl. Mech. Rev., 56 (2003), B20–B22. https://doi.org/10.1115/1.1553434 doi: 10.1115/1.1553434
    [11] C. Polizzotto, Stress gradient versus strain gradient constitutive models within elasticity, Int. J. Solids Struct., 51 (2014), 1809–1818. https://doi.org/10.1016/j.ijsolstr.2014.01.021 doi: 10.1016/j.ijsolstr.2014.01.021
    [12] E. C. Aifantis, Gradient deformation models at nano, micro, and macro scales, J. Mech. Behav. Mater., 121 (1999), 189–202. https://doi.org/10.1115/1.2812366 doi: 10.1115/1.2812366
    [13] F. A. C. M. Yang, A. C. M. Chong, D. C. C. Lam, P. Tong, Couple stress based strain gradient theory for elasticity, Int. J. Solids Struct., 39 (2002), 2731–2743. https://doi.org/10.1016/S0020-7683(02)00152-X doi: 10.1016/S0020-7683(02)00152-X
    [14] C. W. Lim, G. Zhang, J. Reddy, A higher-order nonlocal elasticity and strain gradient theory and its applications in wave propagation, J. Mech. Phys. Solids, 78 (2015), 298–313. https://doi.org/10.1016/j.jmps.2015.02.001 doi: 10.1016/j.jmps.2015.02.001
    [15] S. Li, W. Zheng, L. Li, Spatiotemporally nonlocal homogenization method for viscoelastic porous metamaterial structures, Int. J. Mech. Sci., 282 (2024), 109572.
    [16] A. Overvig, S. A. Mann, A. Alù, Spatio-temporal coupled mode theory for nonlocal metasurfaces, Light: Sci. Appl., 13 (2024), 28. https://doi.org/10.1038/s41377-023-01350-9 doi: 10.1038/s41377-023-01350-9
    [17] Y. Zhang, D. Nie, X. Mao, L. I. Li, A thermodynamics-consistent spatiotemporally-nonlocal model for microstructure-dependent heat conduction, Appl. Math. Mech., 45 (2024), 1929–1948. https://doi.org/10.1007/s10483-024-3180-7 doi: 10.1007/s10483-024-3180-7
    [18] A. E. Abouelregal, M. Marin, A. Öchsner, A modified spatiotemporal nonlocal thermoelasticity theory with higher-order phase delays for a viscoelastic micropolar medium exposed to short-pulse laser excitation, Continuum Mech. Thermodyn., 37 (2025), 15. https://doi.org/10.1007/s00161-024-01342-z doi: 10.1007/s00161-024-01342-z
    [19] A. E. Abouelregal, M. Marin, Y. Alhassan, D. Atta, A novel space–time nonlocal thermo-viscoelastic model with two-phase lags for analyzing heat diffusion in a half-space subjected to a heat source, Iran. J. Sci. Technol. Trans. Mech. Eng., 49 (2025), 1315–1332. https://doi.org/10.1007/s40997-025-00835-9 doi: 10.1007/s40997-025-00835-9
    [20] M. Lazar, E. Agiasofitou, Nonlocal elasticity of Klein–Gordon type: Fundamentals and wave propagation, Wave Motion, 114 (2022), 103038. https://doi.org/10.1016/j.wavemoti.2022.103038 doi: 10.1016/j.wavemoti.2022.103038
    [21] E. Agiasofitou, M. Lazar, Nonlocal elasticity of Klein–Gordon type with internal length and time scales: Constitutive modelling and dispersion relations, PAMM, 23 (2023), e202300065. https://doi.org/10.1002/pamm.202300065 doi: 10.1002/pamm.202300065
    [22] A. A. Kilbas, H. M. Srivastava, J. J. Trujillo, Theory and applications of fractional differential equations, Elsevier, 2006.
    [23] M. Caputo, M. Fabrizio, A new definition of fractional derivative without singular kernel, Prog. Fract. Differ. Appl., 1 (2015), 73–85. http://dx.doi.org/10.12785/pfda/010201 doi: 10.12785/pfda/010201
    [24] M. Caputo, M. Fabrizio, Applications of new time and spatial fractional derivatives with exponential kernels, Prog. Fract. Differ. Appl., 2 (2016), 1–11. http://dx.doi.org/10.18576/pfda/020101 doi: 10.18576/pfda/020101
    [25] A. Atangana, D. Baleanu, Caputo-Fabrizio derivative applied to groundwater flow within confined aquifer, J. Eng. Mech., 143 (2017), D4016005. https://doi.org/10.1061/(ASCE)EM.1943-7889.0001091 doi: 10.1061/(ASCE)EM.1943-7889.0001091
    [26] A. Atangana, D. Baleanu, New fractional derivatives with nonlocal and non-singular kernel: Theory and application to heat transfer model, Therm. Sci., 20 (2016), 763–769. https://doi.org/10.2298/TSCI160111018A doi: 10.2298/TSCI160111018A
    [27] S. Saifullah, A. Ali, A. Khan, K. Shah, T. Abdeljawad, A novel tempered fractional transform: Theory, properties and applications to differential equations, Fractals, 31 (2023), 2340045. https://doi.org/10.1142/S0218348X23400455 doi: 10.1142/S0218348X23400455
    [28] A. Liemert, A. Kienle, Fundamental solution of the tempered fractional diffusion equation, J. Math. Phys., 56 (2015), 113504. https://doi.org/10.1063/1.4935475 doi: 10.1063/1.4935475
    [29] M. Medve, M. Pospíšil, Generalized Laplace transform and tempered Ψ-Caputo fractional derivative, Math. Model. Anal., 28 (2023), 146–162. https://doi.org/10.3846/mma.2023.16370 doi: 10.3846/mma.2023.16370
    [30] J. Deng, L. Zhao, Y. Wu, Fast predictor-corrector approach for the tempered fractional differential equations, Numer. Algorithms, 74 (2017), 717–754. https://doi.org/10.1007/s11075-016-0169-9 doi: 10.1007/s11075-016-0169-9
    [31] A. Zavaliangos, L. Anand, Thermo-elasto-viscoplasticity of isotropic porous metals, J. Mech. Phys. Solids, 41 (1993), 1087–1118. https://doi.org/10.1016/0022-5096(93)90056-L doi: 10.1016/0022-5096(93)90056-L
    [32] M. I. Othman, A. Sur, Transient response in an elasto-thermo-diffusive medium in the context of memory-dependent heat transfer, Waves Rand. Compl. Media, 31 (2021), 2238–2261. https://doi.org/10.1080/17455030.2020.1737758 doi: 10.1080/17455030.2020.1737758
    [33] A. Hobiny, F. S. Alzahrani, I. Abbas, Three-phase lag model of thermo-elastic interaction in a 2D porous material due to pulse heat flux, Int. J. Numer. Methods Heat Fluid Flow, 30 (2020), 5191–5207. https://doi.org/10.1108/HFF-03-2020-0122 doi: 10.1108/HFF-03-2020-0122
    [34] P. Liu, T. He, Dynamic response of thermoelastic materials with voids subjected to ramp-type heating under three-phase-lag thermoelasticity, Mech. Adv. Mater. Struct., 29 (2022), 1386–1394. https://doi.org/10.1080/15376494.2020.1821137 doi: 10.1080/15376494.2020.1821137
    [35] V. Gupta, M. S. Barak, S. Das, Impact of memory-dependent heat transfer on Rayleigh waves propagation in nonlocal piezo-thermo-elastic medium with voids, Int. J. Numer. Methods Heat Fluid Flow, 34 (2024), 1902–1926. https://doi.org/10.1108/HFF-10-2023-0615 doi: 10.1108/HFF-10-2023-0615
    [36] S. Mondal, A. Sur, Thermo-hydro-mechanical interaction in a poroelastic half-space with nonlocal memory effects, Int. J. Appl. Comput. Math., 10 (2024), 68. https://doi.org/10.1007/s40819-024-01717-5 doi: 10.1007/s40819-024-01717-5
    [37] F. Ebrahimi, K. Khosravi, A. Dabbagh, A novel spatial–temporal nonlocal strain gradient theorem for wave dispersion characteristics of FGM nanoplates, Waves Rand. Com. Media, 34 (2024), 3490–3509. https://doi.org/10.1080/17455030.2021.1979272 doi: 10.1080/17455030.2021.1979272
    [38] M. Jia, S. Y. Lou, Integrable nonlinear Klein–Gordon systems with PT nonlocality and/or space–time exchange nonlocality, Appl. Math. Lett., 130 (2022), 108018. https://doi.org/10.1016/j.aml.2022.108018 doi: 10.1016/j.aml.2022.108018
    [39] B. Singh, Wave propagation in context of Moore–Gibson–Thompson thermoelasticity with Klein–Gordon nonlocality, Vietnam J. Mech., 46 (2024), 104–118. https://doi.org/10.15625/0866-7136/19728 doi: 10.15625/0866-7136/19728
    [40] F. Ebrahimi, K. Khosravi, A. Dabbagh, Wave dispersion in viscoelastic FG nanobeams via a novel spatial–temporal nonlocal strain gradient framework, Waves Rand. Com. Media, 34 (2024), 2962–2984. https://doi.org/10.1080/17455030.2021.1970282 doi: 10.1080/17455030.2021.1970282
    [41] M. E. Elzayady, A. E. Abouelregal, F. Alsharif, H. Althagafi, M. Alsubhi, Y. Alhassan, Two-stage heat-transfer modeling of cylinder-cavity porous magnetoelastic bodies, Mech. Time-Depend. Mater., 28 (2024), 2819–2840. https://doi.org/10.1007/s11043-024-09691-7 doi: 10.1007/s11043-024-09691-7
    [42] L. Anitha, R. M. Devi, R. Selvamani, F. Ebrahimi, Nonlocal couple stress vibration of pasted thermo elastic multilayered cylinder with hall current and multi dual phase lags, Mech. Solids, 59 (2024), 1659–1671. https://doi.org/10.1134/S0025654424603045 doi: 10.1134/S0025654424603045
    [43] H. Guo, Z. Xu, F. Shang, T. He, A new constitutive theory of nonlocal piezoelectric thermoelasticity based on nonlocal single-phase lag heat conduction and structural transient thermo-electromechanical response of piezoelectric nanorod, Mech. Adv. Mater. Struct., 31 (2024), 11737–11754. https://doi.org/10.1080/15376494.2024.2311240 doi: 10.1080/15376494.2024.2311240
    [44] M. Arai, K. Masui, One-dimensional thermo-elastic wave analysis for dynamic thermoelasticity coupled with dual-phase-lag heat conduction model, Mech. Eng. J., 11 (2024), 24-00255. https://doi.org/10.1299/mej.24-00255 doi: 10.1299/mej.24-00255
    [45] K. Zakaria, M. A. Sirwah, A. E. Abouelregal, A. F. Rashid, Photothermoelastic interactions in silicon microbeams resting on linear Pasternak foundation based on DPL model, Int. J. Appl. Mech., 13 (2021), 2150079. https://doi.org/10.1142/S1758825121500794 doi: 10.1142/S1758825121500794
    [46] E. C. D. Oliveira, J. A. Machado, A review of definitions for fractional derivatives and integral, Math. Probl. Eng., 2014 (2014), 1–7. https://doi.org/10.1155/2014/238459 doi: 10.1155/2014/238459
    [47] F. I. A. Amir, A. Moussaoui, R. Shafqat, M. H. El Omari, S. Melliani, The Hadamard ψ-Caputo tempered fractional derivative in various types of fuzzy fractional differential equations, Soft Comput., 28 (2024), 9253–9270. https://doi.org/10.1007/s00500-024-09821-w doi: 10.1007/s00500-024-09821-w
    [48] A. E. Abouelregal, Y. Alhassan, S. S. Alsaeed, M. Marin, M. E. Elzayady, MGT photothermal model incorporating a generalized Caputo fractional derivative with a tempering parameter: Application to an unbounded semiconductor medium, Contemp. Math., 5 (2024), 6556–6581. https://doi.org/10.37256/cm.5420245963 doi: 10.37256/cm.5420245963
    [49] A. E. Abouelregal, M. Marin, A. Foul, S. S. Askar, Thermomagnetic responses of a thermoelastic medium containing a spherical hole exposed to a timed laser pulse heat source, Case Stud. Therm. Eng., 56 (2024), 104288. https://doi.org/10.1016/j.csite.2024.104288 doi: 10.1016/j.csite.2024.104288
    [50] A. Kuznetsov, On the convergence of the Gaver–Stehfest algorithm, SIAM J. Numer. Anal., 51 (2013), 2984–2998. https://doi.org/10.1137/13091974X doi: 10.1137/13091974X
    [51] F. A. S. Martins, G. J. Weymar, I. da Cunha Furtado, F. Tumelero, R. da Silva Brum, R. S. de Quadros, et al., Analysis of EAHE through a coupled mathematical model solved by Laplace transform and Gaver-Stehfest algorithm, Cienc. Nat., 45 (2023), e74745. https://doi.org/10.5902/2179460X74745 doi: 10.5902/2179460X74745
    [52] S. Sheikh, L. Khalsa, G. Makkad, V. Varghese, Fractional dual-phase-lag hygrothermoelastic model for a sphere subjected to heat-moisture load, Arch. Appl. Mech., 94 (2024), 1379–1396. https://doi.org/10.1007/s00419-024-02583-9 doi: 10.1007/s00419-024-02583-9
    [53] Z. Xue, H. Zhang, J. Liu, M. Wen, Thermoelastic response of porous media considering spatial scale effects of heat transfer and deformation, J. Eng. Mech., 151 (2025), 05024002. https://doi.org/10.1061/JENMDT.EMENG-7730 doi: 10.1061/JENMDT.EMENG-7730
    [54] R. A. Fathy, E. E. Eraki, M. I. Othman, Effects of rotation and nonlocality on the thermoelastic behavior of micropolar materials in the 3PHL model, Iran. J. Sci. Technol. Trans. Mech. Eng., 49 (2025), 165–180. https://doi.org/10.1007/s40997-025-00834-w doi: 10.1007/s40997-025-00834-w
  • This article has been cited by:

    1. Valeria Giampaolo, Paolo Dell’Aversana, Luigi Capozzoli, Gregory De Martino, Enzo Rizzo, Optimization of Aquifer Monitoring through Time-Lapse Electrical Resistivity Tomography Integrated with Machine-Learning and Predictive Algorithms, 2022, 12, 2076-3417, 9121, 10.3390/app12189121
    2. Yulong Zhao, Ruike Luo, Longxin Li, Ruihan Zhang, Deliang Zhang, Tao Zhang, Zehao Xie, Shangui Luo, Liehui Zhang, A review on optimization algorithms and surrogate models for reservoir automatic history matching, 2024, 233, 29498910, 212554, 10.1016/j.geoen.2023.212554
    3. Ravichandran Sowmya, Manoharan Premkumar, Pradeep Jangir, Newton-Raphson-based optimizer: A new population-based metaheuristic algorithm for continuous optimization problems, 2024, 128, 09521976, 107532, 10.1016/j.engappai.2023.107532
    4. Chang Soon Kim, Van Quan Dao, Jinje Park, Byungho Jang, Seok-Ju Lee, Minwon Park, Lei Chen, Combining finite element and reinforcement learning methods to design superconducting coils of saturated iron-core superconducting fault current limiter in the DC power system, 2023, 18, 1932-6203, e0294657, 10.1371/journal.pone.0294657
    5. Sungil Kim, Tea-Woo Kim, Suryeom Jo, Artificial intelligence in geoenergy: bridging petroleum engineering and future-oriented applications, 2025, 15, 2190-0558, 10.1007/s13202-025-01939-3
  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
