1. Introduction
Diabetes, characterized by insufficient insulin secretion or an inadequate cellular response to insulin, leads to sustained elevated blood glucose levels and poses significant health risks [1,2]. Over 1.5 million people worldwide die of diabetes-related complications each year, and 422 million people had been diagnosed as of 2019. Projections indicate a surge to 700 million patients by 2045, firmly establishing diabetes as a leading global cause of mortality. Treatment for type 1 diabetes relies on oral medications or insulin injections aimed at maintaining blood glucose within the normal range [3,4]. Nevertheless, real-time monitoring remains challenging, often requiring invasive blood sampling. Mobility issues, particularly prevalent among middle-aged and older patients, lead to inconvenience and frequent hospital visits. Addressing these challenges is imperative, so the exploration of noninvasive or predictive approaches becomes crucial [5]. Our research responds to this need, contributing methods that can enhance the quality of life of diabetic patients.
With the rapid progress in wearable devices and microsensor technology, the exploration of wearable sensor devices for blood glucose level monitoring has gained considerable attention [6,7]. Continuous glucose monitoring technology enables real-time tracking of blood glucose levels [8,9]. This is achieved by projecting future changes in glucose levels through frequent synchronized data sampling. In recent years, deep learning technologies have successfully personalized predictions of future blood glucose levels. These predictions utilize continuous glucose monitoring records, insulin usage information and individually reported physiological data.
Researchers are investigating methods to predict blood glucose levels from physiological data, aiming to reduce the need for frequent patient visits and the discomfort caused by punctures [10]. Precise blood glucose prediction can alleviate the burden of type 1 diabetes [11]. Personalized prediction faces challenges: factors such as carbohydrate intake, insulin timing, sleep quality and physical activity all influence glucose fluctuations. Unlike a uniform model, personalized prediction adopts a distinct model for each patient. With pre-classification, multitask learning partially accommodates the differing glucose dynamics of patients of different ages and genders [12,13].
The primary goal of this study is to develop a pre-classification-based multitask deep learning model for blood glucose prediction. The model uses continuous glucose monitoring (CGM) data up to time point T, together with other life-event data, to forecast glucose levels at T + PH, where the prediction horizons (PH) considered are 30 and 60 minutes. Patient data were first pre-classified by sex and age. The proposed framework, TG-DANet (TCN-GRU-D-Attention Network), then predicted glucose levels. Multidimensional time-series data were pre-processed and aligned with the monitoring records and life-event data before being fed into TG-DANet for training on the pre-classified data. Based on the classification outcomes, five sub-models Mi (i = 1, ..., 5) were trained for personalized blood glucose prediction. The primary contributions are as follows.
1) We propose a data-driven blood glucose level prediction model based on pre-classification that accurately forecasts future blood glucose levels.
2) The pre-classification deep learning approach, based on TG-DANet, helps effectively categorize and predict patients of different ages and sexes. This approach yields outstanding predictive performance within personalized patient contexts.
3) The introduction of an enhanced GRU model (GRU-D) that incorporates a decay mechanism to regulate fading hidden states optimizes the capture of long-term dependencies based on time steps, thereby enhancing the model's predictive capability.
4) Blood glucose prediction analysis, using real clinical patient data, demonstrated the remarkable accuracy of the model. Comparative evaluations against baseline models and the published literature confirm the superiority of the proposed model.
The rest of this paper is organized as follows. Section 2 discusses relevant research efforts related to predicting blood glucose levels, emphasizing the shortcomings of current studies. Section 3 outlines the dataset that was adopted and the pre-processing methods used. Section 4 elaborates on the modeling process of the personalized blood glucose prediction system and the methodology for optimizing the model configuration through hyperparameter tuning. Section 5 presents and analyzes the experimental results. Finally, Section 6 encapsulates the research findings, articulates the conclusions drawn and outlines potential areas for future research.
2. Related works
The prevalent predictive models for blood glucose levels include data-driven models [14] and physiological and hybrid models [15,16]. Among these, data-driven models exhibit superior flexibility and generality compared to physiological and hybrid models. Data-driven models do not require many physiological parameters or specialized knowledge [17]. They can rapidly establish accurate blood glucose prediction models, yielding predictive performance similar to physiological models. Therefore, we used data-driven models to predict blood glucose levels [18].
In previous relevant studies, numerous scholars have utilized various algorithms, including Kalman filtering [19,20], artificial neural networks [21,22], XGBoost [23,24] and autoregressive integrated moving average (ARIMA) [25,26], to predict blood glucose levels. However, these studies often rely solely on calibrated individual or limited physiological data for predicting blood glucose levels. Consequently, these models exhibit deficiencies in incorporating relevant lifestyle data and continuous glucose monitoring information, making them inadequate for addressing personalized patient variations. For instance, in 2021, Md Fazle Rabby et al. [26] employed Kalman filtering and the StackLSTM algorithm for predicting blood glucose levels. Asiye Şahin et al. [27] proposed using an artificial neural network (ANN). Yiyang Wang et al. [28] utilized the XGBoost algorithm for prediction, while Federico D'Antoni et al. [29] introduced the autoregressive shifting algorithm for glucose prediction. However, these studies did not consider the impact of multiple variables, resulting in limited predictive accuracy. Although autoregressive models are considered classical statistical approaches, their implementation requires significant domain expertise, making them less suitable for computer scientists conducting disease prediction research. Consequently, deep learning and machine learning methods have gained widespread popularity because they can produce favorable outcomes without requiring extensive domain knowledge. Deep learning algorithms have shown promising results in predicting blood glucose levels. For instance, CNN [30,31], DRNN [32], FCNN [33], CRNN [34] and multilayer LSTM models have been extensively studied for predicting blood glucose levels. In addition, studies have explored using multiple variables to predict blood glucose levels. For example, the multitask prediction model (D-MTL) proposed by Shuvo et al. [35] experimented with selected features and ultimately identified four variables: continuous glucose monitoring data, insulin dosage, carbohydrate intake and fingertip glucose content. These features were input into the model for predicting blood glucose. Experimental results indicated an RMSE evaluation metric of 18.06 ± 2.74 mg/dL within a 30-minute prediction window. Tao Yang et al. [36] introduced a deep learning framework that utilizes an automated channel for personalized prediction of blood glucose levels. The authors utilized continuous glucose monitoring data, carbohydrate intake, insulin dosage and time-related information to predict patient blood glucose levels. Although these prediction methodologies have achieved certain levels of success, their selection of features primarily relies on empirical grounds, thus failing to maximize their value in clinical practice.
The existing limitations in blood glucose prediction research primarily manifest in the following aspects: 1) Temporal scale: predictive models struggle to capture real-time changes in the influencing factors. 2) Lack of relevant clinical data: data on medication dosages, specialized diets or specific physiological conditions are often missing. 3) External interference: the external environment and the patient's lifestyle can degrade the accuracy of predictive models. This study focused on using the authentic clinical dataset OhioT1DM to predict blood glucose levels for the next 30 and 60 minutes. In addition to standard model evaluations, the proposed model was assessed using Clarke Error Grid Analysis. Together, these methodologies provide a more comprehensive evaluation of model performance, yielding more robust results and conclusions.
3. Dataset and pre-processing
The OhioT1DM dataset is used to validate the proposed model. Sourced from real-world clinical data collected at Ohio University, this dataset is intended for blood glucose prediction, and its use requires adherence to the relevant protocols and an application for usage permission. The OhioT1DM [37] dataset includes data from 12 individuals diagnosed with type 1 diabetes who took part in the Blood Glucose Level Prediction (BGLP) challenges in 2018 and 2020, covering eight weeks per participant. It includes continuous glucose monitoring (CGM) data collected at 5-minute intervals, fingertip capillary blood glucose levels obtained through self-monitoring, insulin dosages (bolus and basal), self-reported meal times, estimated carbohydrate intake, self-reported physical activity, sleep patterns, stress levels and data from Basis Peak or Empatica Embrace devices. The Basis Peak wristband data includes 5-minute heart rate, galvanic skin response (GSR), skin temperature, ambient temperature and step count. Each patient is represented by one XML file for training and one for testing, giving 24 XML files for the 12 patients. The final ten days of each patient's data were allocated for testing, and the remaining data were used for training. Table 1 summarizes the dataset, including gender, age and sample counts for the training and testing sets.
Evaluating the model's performance using continuous glucose monitoring (CGM) data alone does not comprehensively reflect its capabilities. Hence, we employed a feature-incrementation approach to observe the model's performance and analyze the significance of the features. A series of ablation experiments was conducted to facilitate feature selection, and the quantitative results are presented in Tables 2–6. According to relevant studies, an individual's sex and age can influence blood glucose levels [38,39]. Consequently, by implementing a pre-classification strategy, we categorized patients according to gender and age to develop personalized prediction models. Building upon the OhioT1DM dataset, we divided the data into five classes corresponding to the five personalized prediction models proposed in this study.
These models are denoted TG-DANeti (i = 1, 2, ..., 5). They share TG-DANet as their foundational architecture and differ primarily in their parameters, which were obtained using separate training datasets and the Optuna hyperparameter optimization framework, thereby improving the customization of blood glucose prediction for each patient. Tables 2–4 and 6 show that five input features significantly influence the CGM trends, whereas Table 5 identifies four such input features. Other combinations of additional features did not yield significant performance improvements and could compromise predictive accuracy.
To enhance the stability of the model, data smoothing was applied before incorporating the selected features. Given potential sensor quality issues [40], power interruptions and connectivity problems, random missing values may be present in continuous glucose monitoring (CGM) data. Linear interpolation was used to fill short gaps. To maintain the integrity of physiological patterns, we considered the duration of missing values and discarded samples with continuous gaps exceeding two hours, as prolonged intervals could adversely affect prediction accuracy. Pre-processing of the other features followed the procedure used for the CGM data. Because the model relies on sample data with consistent scales, features whose lengths did not match the CGM data were filled to the same length. After handling missing values, data normalization was performed to ensure uniform scaling across input features.
Specifically, linear interpolation [41] was applied to the training set, while linear extrapolation [42,43] was employed for the test set to avoid using future data. Additionally, Kalman filtering [44,45] was used to pre-process the blood glucose data and mitigate sensor reading and device errors. We underscore that the target variable was deliberately excluded from both the smoothing and filtering processes: artificially increasing predictability could come at the cost of physiological accuracy, and CGM values are already filtered by the manufacturer, so this choice avoids inadvertently distorting the signal.
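For concreteness, a minimal sketch of the training-set gap rule is given below. The column name, the toy series and the 24-step limit (two hours at the 5-minute CGM resolution) are illustrative assumptions, not the authors' code.

```python
import numpy as np
import pandas as pd

# Toy CGM series on the dataset's 5-minute grid (values are illustrative).
idx = pd.date_range("2021-01-01 00:00", periods=12, freq="5min")
cgm = pd.Series([110, 112, np.nan, np.nan, 120, 123, 125, np.nan,
                 130, 128, 126, 124], index=idx, name="cgm")

# Linearly interpolate interior gaps, but never bridge more than
# 24 consecutive missing steps (2 hours at 5-minute resolution).
cgm_filled = cgm.interpolate(method="linear", limit=24, limit_area="inside")

# Windows that still contain NaN (i.e., gaps longer than two hours)
# would be discarded when training samples are assembled.
```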
Despite the substantial advancements made by continuous glucose monitoring (CGM) devices in real-time glucose monitoring, their accuracy is constrained by limitations such as measurement errors, latency, inconvenience of use and high costs. These limitations often lead to outliers, which, in turn, result in misleading predictive outcomes.
Time-series smoothing [46] has proven to be a practical approach to overcoming these issues. This study used double exponential smoothing to preprocess the data, making the blood glucose level data more continuous and stable. This enhancement aimed to improve the predictive accuracy of the model.
Double exponential smoothing [47] preprocessing was applied to all the feature data. The purpose was to capture the level and trend components of the data as they change over time. The mathematical principles underlying double exponential smoothing involve two primary stages.
One: Level component smoothing, Eq (1):

$$L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1}) \tag{1}$$

where $L_t$ is the level component at time $t$, $Y_t$ is the actual value at time $t$, $\alpha$ is the level smoothing coefficient ($0 < \alpha < 1$) and $T_{t-1}$ is the trend component at time $t-1$.
Two: Trend component smoothing, Eq (2):

$$T_t = \beta(L_t - L_{t-1}) + (1 - \beta)T_{t-1} \tag{2}$$

where $\beta$ is the smoothing coefficient for the trend component ($0 < \beta < 1$).
Finally, the double exponential smoothing of the data is represented by Eq (3):

$$\hat{Y}_t = L_t + T_t \tag{3}$$
where $\hat{Y}_t$ represents the smoothed value and $L_t$ and $T_t$ are the current level and trend components, respectively. Because the smoothing coefficients α and β of the level and trend components are unknown parameters, we conducted multiple experiments and set α to 0.9 and β to 0.1. Figure 1 illustrates the effect of different values of α and β after double exponential smoothing of the first 1000 points of the training data of patient #559. As shown in Figure 1(a)–(d), the relative error with respect to the original data decreases as α increases and as β decreases. Pre-processing the data with double exponential smoothing balances smoothness and responsiveness: by weighting past observations, the method combines short- and long-term trends and effectively attenuates noise and sudden fluctuations.
This allows for the extraction of more stable and accurate blood glucose trends. Figure 1(a)–(d) illustrates the influence of selected combinations of α and β, in the range 0.1–0.9, on the CGM data. In practice, appropriate values of α and β are typically selected based on empirical evidence and analysis of the actual data. Here, the choice of α = 0.9 and β = 0.1 is grounded in the observed differences in minimum relative error (MEL); smaller MEL indicates a better smoothing effect. In specific computations, the MEL is 11.744 mg/dL when α is 0.9 and β is 0.1, as calculated using Eq (4).
$$\mathrm{MEL} = \frac{1}{n}\sum_{t=1}^{n}\left|\hat{Y}_t - Y_t\right| \tag{4}$$

where the smoothed data $\hat{Y}_t$ are the data that have undergone the smoothing process, the original data $Y_t$ are the raw, unprocessed data and $n$ is the number of samples.
To determine the optimal parameter combination, α and β values from 0 to 1, in steps of 0.01, were systematically explored. Through cross-validation on each dataset, the combination α = 0.9 and β = 0.1 was ultimately identified as yielding the best performance. Figure 1 shows the smoothing effects as α and β each take the values 0.1, 0.3, 0.6 and 0.9. Through these experiments and observations, the parameter setting achieving the best smoothing of the data in this study was determined.
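The smoothing procedure of Eqs (1)–(3) and the deviation measure of Eq (4) reduce to a few lines of code. The sketch below follows the standard Holt formulation; the initialization of the level and trend components is a common convention and an assumption on our part.

```python
import numpy as np

def double_exponential_smoothing(y, alpha=0.9, beta=0.1):
    """Holt's double exponential smoothing, Eqs (1)-(3)."""
    level, trend = y[0], y[1] - y[0]   # common initialization (assumed)
    smoothed = np.empty(len(y))
    smoothed[0] = y[0]
    for t in range(1, len(y)):
        prev_level = level
        level = alpha * y[t] + (1 - alpha) * (level + trend)      # Eq (1)
        trend = beta * (level - prev_level) + (1 - beta) * trend  # Eq (2)
        smoothed[t] = level + trend                               # Eq (3)
    return smoothed

cgm = np.array([120.0, 125.0, 131.0, 128.0, 135.0, 140.0])
smooth = double_exponential_smoothing(cgm, alpha=0.9, beta=0.1)
mel = np.mean(np.abs(smooth - cgm))   # mean absolute deviation, cf. Eq (4)
```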
4. Proposed model
Personalized prediction of blood glucose levels was achieved through three major approaches: 1) Training a single model to predict blood glucose levels for all patients; 2) training independent models for each patient; and 3) introducing a pre-classification training model for blood glucose prediction. Because blood glucose dynamics vary for each patient, training a single predictive model for all patients' blood glucose levels is inaccurate. However, training separate models for each patient incurs significant costs. Therefore, the pre-classification predictive model proposed in this study addresses the issue of inaccurate predictions made by a single model and the high costs associated with training individual models for each patient. By categorizing patients based on gender and age and applying the model to data within the respective category, we can improve prediction accuracy while reducing computational costs. In this study, we trained five models to predict the blood glucose levels in patients.
Although traditional RNN structures, as described in [48], can retain historical information to enhance prediction accuracy, they have limited capabilities in modeling complex temporal patterns and long-range dependencies. In addition, they are prone to the issues of vanishing and exploding gradients. These concerns have been partially addressed in the context of Gated Recurrent Unit (GRU) recurrent neural networks. By introducing a time-step decay mechanism, the GRU can more effectively manage the preservation of historical information, thereby enhancing its ability to model long sequential dependencies. To further improve prediction accuracy, this study integrates a Temporal Convolutional Network (TCN) [49] into the model prediction process. The TCN is used to extract local features from input sequences. However, relying solely on a single TCN-GRU model may not fully capture essential features and long sequential relationships. Therefore, by incorporating an attention mechanism, the time-series model dynamically weighs different time steps to capture crucial temporal dependencies and improve prediction accuracy and interpretability. The proposed workflow for predicting personalized patient blood glucose levels is shown in Figure 2.
By pre-classifying the dataset and aggregating the data of the same category, this study divided the data into five distinct classes based on gender and age. Five training models were established for each class, each utilizing the proposed TG-DANet algorithm for training. This resulted in distinct model parameters and prediction outcomes. Using the CGM data predictions as a baseline, a stepwise feature training and selection approach was applied to these five models. Among males aged 20–40 years, the optimal feature combination consisted of Continuous Glucose Monitoring (CGM), Sleep, FG, C and Bl. The best feature combination within the 40–60 age range was CGM, Ba, FG, C and Bl. The optimal feature combination for individuals aged 60–80 included CGM, FG, C, Bl and GSR. Similarly, the optimal feature combination for females aged 20–40 included CGM, C, Bl and ST. Within the age range of 40–60, the best combination of features involved CGM, C, Bl, GSR and ST.
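To make the routing explicit, the sketch below encodes the five group-specific feature sets listed above. The grouping keys follow our reading of the text (the 40–60 and 60–80 rows without an explicit sex are taken as male), the abbreviation expansions are inferred from the dataset description and the helper function is our illustrative construction.

```python
# Feature abbreviations as used in the text (expansions assumed:
# FG = fingertip glucose, C = carbohydrates, Bl = bolus insulin,
# Ba = basal insulin, GSR = galvanic skin response, ST = skin temperature).
GROUP_FEATURES = {
    ("male",   "20-40"): ["CGM", "Sleep", "FG", "C", "Bl"],
    ("male",   "40-60"): ["CGM", "Ba", "FG", "C", "Bl"],
    ("male",   "60-80"): ["CGM", "FG", "C", "Bl", "GSR"],
    ("female", "20-40"): ["CGM", "C", "Bl", "ST"],
    ("female", "40-60"): ["CGM", "C", "Bl", "GSR", "ST"],
}

def pre_classify(sex: str, age: int) -> tuple:
    """Assign a patient to one of the five pre-classification groups."""
    bracket = "20-40" if age < 40 else "40-60" if age < 60 else "60-80"
    return (sex, bracket)

features = GROUP_FEATURES[pre_classify("male", 52)]  # -> CGM, Ba, FG, C, Bl
```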
In the proposed TG-DANet algorithm, a dropout rate of 0.2 is employed to prevent overfitting and improve the accuracy of the output data. Each output consists of 12 data points, representing predictions for a 60-minute interval, with a 5-minute interval between each data point. In this study, predictions were made for blood glucose levels over the next 30 and 60 minutes. The GRU neural network, a variant of recurrent neural networks, addresses the issue of vanishing gradients commonly found in traditional RNNs while demonstrating improved training and inference efficiency. The GRU introduces gating mechanisms, such as update gates and reset gates, to effectively regulate the flow of information. It uses candidate hidden states to balance retaining previous memories and incorporating new information. This enables it to effectively capture long-range dependencies and excel in processing lengthy sequential data. Compared to Long Short-Term Memory (LSTM), the Gated Recurrent Unit (GRU) offers a more concise structure with fewer parameters, reducing the risk of overfitting. The architecture of the TG-DANet network model proposed in this study is shown in Figure 3. Within each GRU network [50,51], the following equations define the update and reset gates.
$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \tag{5}$$

$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \tag{6}$$

$$\tilde{h}_t = \tanh\left(W_h \cdot [r_t \odot h_{t-1}, x_t]\right) \tag{7}$$

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \tag{8}$$

In the above four equations, Eq (5) represents the update gate of the GRU network: $z_t$ is the output of the update gate, $\sigma$ denotes the sigmoid activation function, $W_z$ is the weight of the update gate, $h_{t-1}$ is the hidden state from the previous time step and $x_t$ is the input at the current time step. In Eq (6), $r_t$ is the output of the reset gate and $W_r$ is the weight of the reset gate. In Eq (7), $\tilde{h}_t$ denotes the candidate hidden state, $W_h$ is the weight matrix of the candidate hidden state and $\odot$ denotes element-wise multiplication. In Eq (8), $h_t$ is the hidden state at the current time step.
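The four gate equations can be traced step by step in the following sketch. Bias terms are omitted, matching the description above, and the weight shapes in the usage example are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU update following Eqs (5)-(8); biases omitted for brevity."""
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(Wz @ hx)                                       # Eq (5)
    r_t = sigmoid(Wr @ hx)                                       # Eq (6)
    h_tilde = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]))  # Eq (7)
    return (1 - z_t) * h_prev + z_t * h_tilde                    # Eq (8)

rng = np.random.default_rng(0)
h = gru_step(x_t=rng.normal(size=4), h_prev=np.zeros(8),
             Wz=rng.normal(size=(8, 12)), Wr=rng.normal(size=(8, 12)),
             Wh=rng.normal(size=(8, 12)))
```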
GRU-D is an enhanced model based on the Gated Recurrent Unit (GRU) designed to address long-term dependency issues. Its uniqueness lies in incorporating a decay mechanism to control the retention of historical information, thereby mitigating the challenges associated with long-term dependencies. The decay coefficient, referred to as "decay," controls the extent of attenuation and effectively accounts for missingness in the input features and RNN states, leading to enhanced predictive performance. In this study, the decay coefficient was set to 0.43.
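A simplified reading of this mechanism is sketched below: the previous hidden state is attenuated before each update so that stale information fades. The fixed coefficient 0.43 follows the text; GRU-D as originally published learns per-feature decay from the time since the last observation, so this constant-decay form is a deliberate simplification.

```python
DECAY = 0.43  # decay coefficient reported in the text

def gru_d_step(x_t, h_prev, Wz, Wr, Wh):
    # Attenuate the retained history, then apply the standard GRU update
    # (gru_step from the previous sketch).
    return gru_step(x_t, DECAY * h_prev, Wz, Wr, Wh)
```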
Temporal Convolutional Networks (TCN) [52] are convolutional neural networks designed for time-series prediction. A TCN uses temporal convolution operations to capture patterns and features in sequences, exploiting the parallel computation of one-dimensional convolutions and the capacity to model both short- and long-term dependencies; consequently, TCNs excel in multistep prediction tasks. A key feature of the TCN is its use of residual connections, which alleviate vanishing gradients and enable deeper networks to be trained effectively.
In a TCN, one-dimensional convolution operations capture local and global patterns within time sequences. For an input sequence $x$, the convolution operation can be expressed as Eq (9):

$$y[t] = f\left(w * x[t:t+k-1] + b\right) \tag{9}$$
In Eq (9), $y[t]$ represents the output of the convolution operation at time step $t$, i.e., the feature value. The function $f$ is the activation function; in this study, the Rectified Linear Unit (ReLU) was used. The variable $w$ is the weight of the convolutional kernel, a filter of dimensions $(k, 1)$, where $k$ is the kernel size and determines the number of time steps covered by each convolution operation. The expression $x[t:t+k-1]$ denotes the window of the input sequence $x$ from time step $t$ to $t+k-1$, which is multiplied element-wise with the kernel and summed to produce the convolution. Finally, $b$ is the bias term added after the convolution. Within the TCN [53], residual connections are incorporated into the outputs of the convolutional layers to facilitate the training of deep networks: the output of the convolutional layer is added to the input, yielding the output of a residual block. Specifically, the structure of the residual connection is described by Eq (10).
$$y[t] = x[t] + f\left(w * x[t:t+k-1] + b\right) \tag{10}$$

where $y[t]$ represents the output of the residual block at time step $t$ and $x[t]$ is the value of the input sequence at time step $t$. The term $f(w * x[t:t+k-1] + b)$ denotes the output of the convolutional layer, obtained by applying the activation function to the convolution operation. Adding $x[t]$ to this output, the residual connection produces the final output of the block.
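A residual TCN block of this kind might look as follows in Keras. The filter count, kernel size and dilation rate are illustrative, and the 1×1 convolution on the shortcut (used when channel counts differ) is a standard device from the TCN literature rather than a detail given in the text.

```python
from tensorflow.keras import layers

def tcn_residual_block(x, filters, kernel_size, dilation_rate, dropout=0.2):
    # Dilated causal convolution with ReLU activation, as in Eq (9).
    y = layers.Conv1D(filters, kernel_size, padding="causal",
                      dilation_rate=dilation_rate, activation="relu")(x)
    y = layers.Dropout(dropout)(y)
    y = layers.Conv1D(filters, kernel_size, padding="causal",
                      dilation_rate=dilation_rate, activation="relu")(y)
    # Match channel counts so the residual sum of Eq (10) is well defined.
    if x.shape[-1] != filters:
        x = layers.Conv1D(filters, 1, padding="same")(x)
    return layers.Add()([x, y])   # residual connection, Eq (10)

inp = layers.Input(shape=(48, 5))   # 48 time steps, 5 features (illustrative)
out = tcn_residual_block(inp, filters=64, kernel_size=3, dilation_rate=2)
```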
4.1. Hyperparameter optimization
To achieve optimal prediction results, we utilized the advanced hyperparameter optimization framework, Optuna, to optimize and fine-tune the parameters of the proposed algorithm. These hyperparameters include the number of hidden units in the GRU-D network, learning rate, dropout rate, activation function, choice of optimizer and decay rate. By adjusting these hyperparameters, the prediction accuracy of the model could be improved. The specific outcomes of the hyperparameter tuning process are presented in Table 7. The optimal hyperparameters were then used as inputs for the proposed model to make predictions. This resulted in reduced values for the evaluation metrics, such as RMSE and MAE.
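A minimal Optuna study over the hyperparameters named above might look as follows. The search ranges are illustrative, and build_tg_danet and train_and_validate are hypothetical stand-ins for the actual model constructor and training routine.

```python
import optuna

def objective(trial):
    params = {
        "units":      trial.suggest_categorical("units", [32, 64, 128]),
        "lr":         trial.suggest_float("lr", 1e-4, 1e-2, log=True),
        "dropout":    trial.suggest_float("dropout", 0.1, 0.5),
        "activation": trial.suggest_categorical("activation", ["relu", "tanh"]),
        "optimizer":  trial.suggest_categorical("optimizer", ["adam", "rmsprop"]),
        "decay":      trial.suggest_float("decay", 0.1, 0.9),
    }
    model = build_tg_danet(params)             # hypothetical model builder
    return train_and_validate(model, params)   # returns validation RMSE

study = optuna.create_study(direction="minimize")  # minimize validation RMSE
study.optimize(objective, n_trials=100)
print(study.best_params)
```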
4.2. Evaluation metrics
4.2.1. Root mean square error and mean absolute error
The metrics used in this study to evaluate the performance of the regression model included the root mean square error (RMSE) and mean absolute error (MAE). Smaller values of RMSE and MAE indicate better model performance.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

where $y_i$ denotes the true value of the $i$-th sample, $\hat{y}_i$ denotes the predicted value of the $i$-th sample and $n$ represents the number of samples.
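Both metrics reduce to a few NumPy operations:

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred))
```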
4.2.2. Clarke error grid analysis
Clarke Error Grid Analysis (EGA) is a pivotal clinical metric for evaluating the accuracy of blood glucose predictions. It examines the disparities between the actual measured values and the corresponding predicted values, serving as a robust benchmark for assessing the efficacy of prediction models. The outcomes of blood glucose level predictions are systematically categorized into five zones, denoted Zones A, B, C, D and E; the significance of each zone is detailed in Table 8.
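As an illustration, Zone A, the clinically safest region, is commonly defined as a prediction within 20% of the reference or both values below 70 mg/dL. A sketch of that single criterion follows; the full grid also delimits Zones B–E with more intricate boundaries.

```python
import numpy as np

def clarke_zone_a(ref, pred):
    """Boolean mask of predictions falling in Zone A of the Clarke grid."""
    ref, pred = np.asarray(ref, float), np.asarray(pred, float)
    within_20_percent = np.abs(pred - ref) <= 0.2 * ref
    both_hypoglycemic = (ref < 70) & (pred < 70)
    return within_20_percent | both_hypoglycemic

share_a = clarke_zone_a([100, 60, 180], [115, 65, 150]).mean()  # fraction in A
```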
4.2.3. Consensus error grid analysis
Consensus Error Grid Analysis (CEGA) [54,55] constitutes a vital methodology for assessing the performance of classification models, particularly within the realm of blood glucose level prediction. This analytical framework, deeply rooted in statistical methods and predictive modeling, serves as a crucial tool for the evaluation of blood glucose level prediction applications, emphasizing precision and reliability [56]. CEGA involves a detailed exploration of the concordance between predicted and observed outcomes, organized within a predefined grid characterized by zones indicating the severity of errors, ranging from inconsequential to clinically significant [57]. The zones, labeled A through E, represent the gradation of errors, with A and B reflecting clinically acceptable deviations and C, D and E indicating errors of increasing severity.
5. Experimental results
5.1. Experimental setup
This section presents the experimental results and their configurations. In the experiments, the root-mean-square error (RMSE) and mean absolute error (MAE) were employed as evaluation metrics for the models. The proposed models predicted blood glucose levels for the next 30 and 60 minutes. The OhioT1DM dataset was divided into training and testing sets for model training and evaluation. The experimental setup featured a computer with an Intel Core i7-8565U CPU, 12 GB of DDR4 memory and a 256 GB solid-state drive. An NVIDIA GeForce RTX 2080 Ti graphics card was also used to accelerate the computations through GPU acceleration. The operating system used was Windows 10 Professional Edition (Version 21H2). The experimental implementation was performed using the Python programming language (version: 3.8.10), along with machine learning libraries such as TensorFlow (version: 2.11.0), Keras (version: 2.11.0) and Scikit-learn (version: 0.24.2).
At the beginning of the training, an initial learning rate of 0.001 was set using the Adam optimizer. The dropout rate was set to 0.3, and the chosen activation function was ReLU. The hidden layer comprises 64 units, with a batch size of 64 and a decay rate of 0.40. The GRU-D model was fine-tuned using the Optuna hyperparameter optimization framework [58,59]. Optuna is an open-source Python library specializing in automated hyperparameter optimization and machine learning model tuning. Using various optimization algorithms, such as Bayesian optimization, Optuna explores the parameter space and helps users identify the optimal hyperparameter configuration, thereby improving model performance and effectiveness. During the training process, the training set was divided into multiple mini-batches. Each epoch updated the model parameters using the average loss of these mini-batches. This mini-batch training approach can improve model performance while reducing computation time and resource usage.
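Under these settings, the compile-and-fit step might look as follows. build_tg_danet and the train/validation arrays are hypothetical placeholders for the actual model constructor and the pre-processed OhioT1DM tensors.

```python
import tensorflow as tf

model = build_tg_danet(units=64, dropout=0.3, activation="relu")  # hypothetical
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # initial LR 0.001
    loss="mse",
    metrics=[tf.keras.metrics.RootMeanSquaredError(),
             tf.keras.metrics.MeanAbsoluteError()],
)
# Mini-batch training: each epoch averages the loss over batches of 64 samples.
model.fit(train_x, train_y, batch_size=64, epochs=100,
          validation_data=(val_x, val_y))
```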
5.2. Experimental results and discussion
5.2.1. Comparing the experimental results for different PHs
Table 9 presents the RMSE and MAE evaluation metrics for each tested model, demonstrating the predictive performance of the proposed models across the 12 patients. From the data in Table 9, it is evident that the proposed models yield satisfactory predictive results. As the PH increases, predictive accuracy decreases slightly. Specifically, for PHs of 30 and 60 minutes, the RMSE values ranged from 15.851 to 18.951 mg/dL and the MAE values ranged from 7.951 to 14.303 mg/dL. In terms of predictive efficacy, the TG-DANet2 model demonstrated the most accurate prediction for patient #563. Notably, the predictive outcomes varied among the five sub-models used in this study: TG-DANet2 performed best, whereas TG-DANet5 showed comparatively poorer performance. However, owing to the adopted pre-classification training paradigm, each model's predictions were influenced by patient-specific data. The personalized effects of patient data are leveraged by combining patient information with the Optuna hyperparameter optimization framework, which adjusts the model parameters using individual patient features and physiological data, enabling customized treatment recommendations. The distinct predictive outcomes of each sub-model can be attributed to variations and idiosyncrasies in the physiological data of individual patients. The proposed approach enables dynamic updating of model parameters based on real-time patient physiological data, thereby facilitating more accurate predictions. Overall, the averaged predictions also demonstrated satisfactory performance.
For a precise assessment of the algorithm's performance, Figure 4(a),(b) displays the 24-hour glucose prediction trajectories for patient #596 within the prediction horizons of 30 and 60 minutes. In Figure 4, the red dashed line represents the reference values of actual glucose levels, while the blue solid line denotes the glucose levels forecasted by the algorithm. As the PH increases, the accuracy of the predictions gradually diminishes.
Figure 5 shows the Clarke Error Grid Analysis (EGA) [60] chart for patient #596 within the prediction horizons of 30 and 60 minutes. In the figure, the data points are distributed along the bisecting line of Zone A, indicating the high accuracy of the proposed model. The predicted values show some dispersion as the PH increases, yet they remain within Zones A and B, indicating that the predictive outcomes are of practical significance in clinical applications.
Tables 10–14 present the consensus error grid analysis of the five classification models. Specifically, Table 10 gives a comprehensive comparative examination of diverse blood glucose ranges, encompassing overall sensor readings and their distribution within specific intervals. In the 40–80 mg/dL range, a predominant proportion of readings achieved A and B classifications, attaining an accuracy rate of 97.16%. The 81–120 mg/dL range exhibited a comparable trend, with 97.6% of readings falling into the A and B categories (78.95% in A). Readings within the 121–240 mg/dL range were prevalent, with accuracy rates of 98.56% for A and B combined and 73.25% for A alone. In the 241–400 mg/dL range, the accuracy rate for the A category reached 99.47%. Overall, the average accuracy of the 23,420 readings was 96.25%, with the A and B categories representing 74.67% and 21.58%, respectively. This implies that the majority of readings fall within the target range, showcasing high accuracy, particularly in the moderate blood glucose range. Nevertheless, certain readings within specific ranges deviate from the target and warrant closer scrutiny.
Tables 11–14 consistently reveal a substantial proportion of readings in both A and B categories across diverse algorithms, indicating an overarching high level of accuracy.
5.2.2. Comparison of results of different forecasting methods
In this study, we explored the application of deep learning to blood glucose level prediction and developed a personalized prediction framework named TG-DANet, validated on the OhioT1DM dataset. The proposed framework exhibits high accuracy in blood glucose level prediction, for the following reasons. First, we employed double exponential smoothing to pre-process the time-series data, eliminating the influence of noise and outliers. Second, a decay factor was introduced into the GRU network model to control the retention of historical information, mitigating long-term dependency issues. Third, the pre-classification approach to predicting patient-specific blood glucose levels enhances the model's ability to personalize predictions. Compared with other studies, the algorithm proposed in this research demonstrates superior accuracy and practicality in predicting blood glucose levels.
Table 15 compares state-of-the-art methods for predicting blood glucose levels using the OhioT1DM clinical dataset. Although some studies have extended the prediction duration to 120 minutes (equivalent to 24 data points), most related studies have focused on a range of 60 minutes. Therefore, this study primarily compared the prediction durations at 30 and 60 minutes using RMSE and MAE as evaluation metrics. Numerous methods have been proposed in the existing literature for predicting blood glucose levels. To validate the superiority of the proposed algorithm, it is essential to compare it with the existing literature.
Some researchers have employed machine learning algorithms like XGBoost [61], as well as deep learning algorithms such as Convolutional Neural Networks (CNN) [35], Deep Recurrent Neural Networks (DRNN) [62], Neural Physiological Encoder (NPE) combined with Long Short-Term Memory (LSTM) [63], improved deep learning models like Auto-LSTM [36], RNN [33], Deep Multitask Stack Long Short-Term Memory (DM-StackLSTM) [36], Cutting-Edge Deep Neural Networks (CE-DNN) [64], Multitask Long Short-Term Memory (MTL-LSTM) [65], GluNet [66,67], ANN [27], Nested-DE [29], LSTM-TCN, Shallow-Net [68], RNN [69], CRNN [70] and the Weighted LSTM model (W-DLSTM) [67] for blood glucose level prediction. However, these algorithms often predict the results for 12 patients and then average them to evaluate model performance. This approach fails to account for individual physiological variability and thus cannot achieve personalized predictions. To address this issue, we introduce a pre-classification prediction model and utilize real-time parameter updates to improve prediction accuracy, thereby capturing individual differences more effectively. By categorizing the data into five classes and generating separate predictions, we obtain five distinct prediction outcomes that better reflect the model's personalized prediction capability. Across prediction horizons of 30 and 60 minutes, both the proposed sub-models and the averaged predictions outperform the models published in the literature. The average RMSE of our proposed personalized prediction model is 16.896 mg/dL at 30 minutes and 28.881 mg/dL at 60 minutes, indicating good predictive accuracy within the range of predictions in our study.
In summary, compared with the methods documented in the published literature, the algorithm proposed in this study is superior in predicting blood glucose levels. Within a 30-minute prediction horizon, the RMSE values for TG-DANet1, TG-DANet2, TG-DANet3, TG-DANet4, TG-DANet5 and TG-DANetavg are 16.552, 16.252, 16.685, 16.283, 18.214 and 16.896 mg/dL, respectively; the corresponding MAE values are 8.524, 8.324, 11.251, 8.841, 12.951 and 9.978 mg/dL. For a 60-minute prediction horizon, the RMSE values are 26.153, 31.032, 29.214, 25.954, 32.052 and 28.881 mg/dL, respectively, with corresponding MAE values of 16.365, 17.246, 21.062, 17.864, 23.254 and 19.347 mg/dL. Across all prediction horizons, model accuracy decreases as the prediction interval increases. Overall, the algorithmic framework proposed in this study achieves remarkable predictive accuracy in personalized blood glucose level prediction.
Therefore, the proposed prediction framework demonstrates heightened accuracy and robustness in managing and forecasting blood glucose levels for individuals with type 1 diabetes. Real-time parameter updates based on individual physiological data are achieved through the personalized prediction models, enabling more accurate blood glucose predictions. Furthermore, integrating this model into relevant medical devices for real-time decision-making could effectively prevent adverse blood glucose events. Our findings have significant implications for managing the conditions of patients with type 1 diabetes, assisting physicians in decision-making and improving patients' quality of life. Through personalized prediction, patients can receive tailored medical services based on their specific circumstances, effectively controlling blood glucose levels, mitigating the risk of complications and enhancing the overall quality of daily life. The pre-classification approach is well structured and the results are relevant; however, the available data could be affected by uncertainties that may degrade performance [71,72]. In light of such uncertainties, a fuzzy logic-based pre-classifier might be a valuable avenue for future exploration [73].
6. Conclusions
A personalized and dynamic understanding of blood glucose concentration is crucial for effective diabetes management, disease control and assessment of progression. This study presents a dynamic and personalized multitask blood glucose prediction model to address this challenge. Leveraging pre-classification, the physiological data of patients are categorized by age and gender, and an enhanced GRU network model is used for prediction, improving the accuracy of personalized blood glucose forecasting. The experimental results are evaluated from both analytical and clinical perspectives and demonstrate the effectiveness of the proposed personalized multitask prediction model at the 30-minute and 60-minute prediction horizons. Our approach is superior to the latest machine learning and deep learning methods in terms of prediction accuracy and feature fusion. Applying this model to wearable devices could enable real-time patient predictions and extend to other essential types of forecasts, a potential area for future research. While this study has made significant strides in advancing personalized blood glucose prediction, certain limitations must be acknowledged. The dataset used is of limited scale, involving only 12 participants for training and testing; further validation on an independent and more diverse dataset is essential to confirm the generalizability of the proposed models. Despite these limitations, our work lays the groundwork for timely monitoring, adjustment of treatment plans and informed clinical decision-making, all critical to enhancing patient care and quality of life. Future research should focus on expanding the datasets for robust model validation and exploring applications in diverse healthcare settings.
Use of AI tools declaration
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
Conflict of interest
The authors declare that there are no conflicts of interest.