
Citation: Martin O. Steinhauser. Multiscale modeling, coarse-graining and shock wave computer simulations in materials science[J]. AIMS Materials Science, 2017, 4(6): 1319-1357. doi: 10.3934/matersci.2017.6.1319
Medical and health care bears on the health of hundreds of millions of people and is a basic livelihood issue around the world. In China, the most populous country in the world, medical resources are not abundant enough, which results in an imbalance between the supply of and demand for medical services. According to information released by the National Health Commission of China, by the end of 2020 China had only 2.9 doctors per 1000 people, which is roughly one doctor for every 345 people. To address this shortage, disease prediction has received increasing attention from academia and industry. While image-based disease prediction [1] has been well studied, text-based disease prediction [2] remains difficult, both because the Chinese language itself is hard to process and because real and reliable clinical corpora are hard to obtain.
Since the National Health and Family Planning Commission of China issued the "Basic Specifications for Electronic Health Records (Trial)", many hospitals have accumulated large numbers of electronic health records (EHRs). EHRs are detailed records of medical activities, mostly written by doctors, and include structured data (lab tests, vital signs, etc.) and unstructured data (chief complaints, history of present illness, etc.). With the development and popularity of EHRs, more and more scholars have become interested in disease prediction. Existing work has focused on graph-based [3] and classification-based [4] methods for disease prediction from EHRs. Graph-based methods exploit the relationships between symptoms and diseases. Classification-based methods extract features from EHRs and predict the patient's disease from them. Early research was mainly based on manually designed rules and traditional machine learning methods. Rule-based methods have a high accuracy rate, but constructing the rules requires the participation of medical experts, which is time-consuming and labor-intensive. Traditional machine learning methods, such as Support Vector Machine (SVM) [5] and Random Forest [6], avoid this problem but struggle to express the deeper semantic information in EHRs. With the development of deep learning, its application to disease prediction [7, 8] has significantly improved performance. However, existing methods mainly focus on a single type of structured medical data [9] and ignore the differences and connections between varied types of medical data [10]. For example, structured fields such as gender differ in nature from free text, so using the same encoder to represent both may be insufficient. Furthermore, entity information is often ignored in disease prediction.
In order to solve the above problems, we propose a novel disease prediction model with multi-type data, and the overall structure is shown in Figure 1.
The contributions of this paper are as follows:
1) Entity information is integrated with text information to better obtain the representation of EHRs.
2) A multi-type data fusion model is proposed, which represents each type of information in its own way and clearly improves both the accuracy of prediction and the interpretability of the feature representations.
3) Evaluation on real EHRs from a Grade III, Class B general hospital in Gansu Province, China, shows that the multi-type data fusion model outperforms previous EHR-based disease prediction methods.
Disease prediction uses computational techniques to extract features from EHRs and predict diseases. Early research was mainly based on rules and knowledge-reasoning expert systems [11]. Such methods are simple and easy to understand, but they require many medical experts to construct the rules and are not flexible enough. With the continuous development of machine learning, more and more researchers have applied it to disease prediction. Palaniappan et al. [12] proposed a method for predicting heart disease using Naive Bayes and Decision Tree, which was developed into a heart disease prediction system. Ananthakrishnan et al. [13] used logistic regression to diagnose Crohn's disease and ulcerative colitis. Dreiseitl et al. [14] compared the performance of K-nearest Neighbor, SVM and logistic regression in the diagnosis of skin diseases and found that SVM performed better.
With the development of deep learning in NLP tasks, many deep learning methods have been applied to disease prediction. Yang et al. [15] proposed a Convolutional Neural Network (CNN) model to obtain textual information in EHRs and perform disease prediction. An et al. [16] obtained different features of EHRs based on the BiLSTM model and fused different types of features to predict cardiovascular disease. Wang et al. [17] proposed a prediction method based on BiLSTM and CNN to model characters and words in EHRs, respectively. Du et al. [18] utilized a multigraph structural LSTM model and considered spatio-temporal characteristics to predict foodborne diseases. Rasmy et al. [19] utilized the CovRNN model to learn the representations of patients with COVID-19 and make relevant predictions, such as mortality and hospital stay. Sha and Wang [20] proposed a hierarchical GRU-based model to predict clinical outcomes based on the medical codes of the patient's previous visits. With the introduction of the pre-trained models ELMo [21], OpenAI GPT [22] and BERT [23], significant improvements have been achieved in various NLP tasks, and these models have also been applied in the medical field. Zhang et al. [24] proposed a BERT-based model with an enhanced layer to encode EHRs for auxiliary diagnosis in obstetrics. BioBERT [25] is a pre-trained model that was trained on general and biomedical domain corpora. Mugisha et al. [26] utilized BioBERT to obtain representations of EHRs and make predictions of pneumonia. These methods have improved the accuracy of disease prediction to a certain extent, but some shortcomings remain. On the one hand, disease prediction models based on traditional machine learning are often limited by feature engineering and by the algorithms themselves, and they depend heavily on manual rules, so the models generalize poorly.
On the other hand, these methods are mainly modeled by a single type of data information, and few of them pay enough attention to different data types.
In this paper, we propose a multi-type data fusion model based on EHRs; the model structure is shown in Figure 2. The model can be divided into two parts: text representation and entity representation. The text representation module uses BERT to encode the textual information, while the numerical information is encoded by one-hot encoding and min-max normalization. The textual and numerical encodings are then sent to a multi-head self-attention layer, which uses the numerical information to enhance the textual information and yields a better text representation. The entity information is extracted with a relatively mature named entity recognition technique. The pre-trained BERT model encodes the characters of the entities, and TextCNN then extracts features to obtain the entity representations. Finally, the two types of information are fused to obtain the final representation of the patient and to predict the disease.
The information contained in an EHR can be divided into structured and unstructured data. Unstructured data refers to textual information, while structured data in this study refers to demographics and physical examination results, which are significant for disease prediction. For example, an older patient is more likely to have a cerebral infarction. It is therefore worthwhile to convert the structured data into numerical information for better representations. An example is shown in Table 1.
Initial description | Gender: Female; Age: 67; Marital Status: Married; Family History: None; T: 36.8 ℃; P: 62 beats/min; R: 18 beats/min; BP: 140/90 mmHg |
Numerical information | {(0, 67, 1, 0, 36.8, 62, 18, 140, 90)} |
Patient demographics include age, gender, marital status, family history and more. Only adult patients are considered in this study, and their ages are split into 5 groups: (18, 25), (25, 45), (45, 65), (65, 89) and (89, +∞). The patient's physical examination includes blood pressure (BP), heartbeat (R), pulse (P), body temperature (T), etc., and is encoded by min-max normalization. The encoded demographic and physical examination information is concatenated to obtain the numerical representation of the patient, $Z_{num}$.
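The encoding of the structured fields can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the field names, the value ranges in `RANGES` and the handling of group boundaries (an age equal to a boundary falls into the lower group) are assumptions made for the example.

```python
# Hypothetical value ranges for min-max normalization; in practice they would
# be estimated from the training data.
RANGES = {
    "T": (35.0, 42.0),     # body temperature, degrees Celsius
    "P": (40.0, 180.0),    # pulse, per minute
    "R": (8.0, 40.0),      # the "R" reading, per minute
    "SBP": (70.0, 220.0),  # systolic blood pressure, mmHg
    "DBP": (40.0, 130.0),  # diastolic blood pressure, mmHg
}

AGE_BOUNDS = [25, 45, 65, 89]  # upper bounds of the first four age groups


def age_group_one_hot(age):
    """One-hot vector over the five age groups listed in the text."""
    index = sum(age > bound for bound in AGE_BOUNDS)
    return [1.0 if i == index else 0.0 for i in range(len(AGE_BOUNDS) + 1)]


def min_max(value, low, high):
    """Scale value into [0, 1], clipping values outside [low, high]."""
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def encode_numeric(record):
    """Concatenate demographic and physical-examination features."""
    features = [float(record["gender"]), float(record["married"]),
                float(record["family_history"])]
    features += age_group_one_hot(record["age"])
    features += [min_max(record[name], *RANGES[name]) for name in RANGES]
    return features


# The patient from Table 1, with blood pressure split into two readings.
patient = {"gender": 0, "age": 67, "married": 1, "family_history": 0,
           "T": 36.8, "P": 62, "R": 18, "SBP": 140, "DBP": 90}
z_num = encode_numeric(patient)
```

Clipping in `min_max` keeps out-of-range readings inside [0, 1]; whether the original pipeline clips or not is not stated in the text.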
Text information in the EHRs includes the chief complaint, history of present illness, etc. Extracting features from EHR texts with appropriate algorithms supports better disease prediction for patients. Due to the sparseness of Chinese EHRs, traditional methods such as Doc2vec cannot accurately obtain the text representation of Chinese EHRs. A pre-trained model based on transfer learning, by contrast, can achieve better results on small-scale samples after being pre-trained on large data sets and then fine-tuned. Therefore, we utilize the pre-trained language model BERT to obtain textual representations of EHRs. The input text sequence is as follows:
[CLS] Chinese Electronic Health Record [SEP]
where [CLS] is the start tag of the text and [SEP] is the separator tag. After the EHR is fed into the BERT model, the last-layer hidden state of [CLS] is used as the representation $C$ of the entire EHR. To integrate numerical and textual information into a better text representation, we introduce multi-head self-attention to enhance the textual representation of EHRs:
$$Q = K = V = W_c\,\mathrm{concat}(Z_{num}, C), \tag{3.1}$$
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V, \tag{3.2}$$
$$Z_{text} = \mathrm{concat}(head_1, head_2, \ldots, head_h)\,W_o, \quad \text{where } head_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V}), \tag{3.3}$$
where $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$, $W_c$ and $W_o$ are trainable parameters, $d_k$ is the dimension of the keys, and $Z_{text}$ is the final text representation of the EHR after enhancement with the numerical information.
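Eqs. (3.2) and (3.3) can be illustrated with a small pure-Python sketch. The helper names, the toy matrices and the identity projections are assumptions made for the example; the text does not specify how the concatenated vector is arranged into a sequence, so `x` below is simply a small matrix with one row per position.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention, Eq. (3.2)."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)


def multi_head(x, heads, w_o):
    """Eq. (3.3): run every head on Q = K = V = x, concatenate, project.

    heads is a list of (W_q, W_k, W_v) projection triples, one per head.
    """
    outputs = [attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
               for w_q, w_k, w_v in heads]
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)


# Toy input: two positions, two dimensions, two heads with identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0]]
w_o = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]
z_text = multi_head(x, [(identity, identity, identity)] * 2, w_o)
```

In a trained model the projection matrices are learned, and a tensor library would replace the list-based `matmul`.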
Through the analysis of EHRs, we found that entity information (symptoms, medicines, etc.) is important for disease prediction. For example, the symptom "coughing" may indicate an increased risk of bronchitis. It is therefore necessary to introduce a relatively mature named entity recognition technique to extract entity information from EHRs. We use the BiLSTM-CRF model, which captures contextual information comprehensively, learns the relationships between contexts, and converts the extracted context information into a label for each Chinese character. The architecture of the BiLSTM-CRF model is illustrated in Figure 3. The model uses the BIO (Begin, Inside, Outside) tagging scheme. First, the Skip-Gram [27] algorithm is used to train character embeddings on the EHRs. A sentence is represented as a sequence of character vectors $Q = (q_1, q_2, \ldots, q_n)$, where $n$ is the length of the EHR. Second, the embeddings $(q_1, q_2, \ldots, q_n)$ are given as input to the BiLSTM layer. At step $t$, the output state of the forward LSTM is the hidden vector $\overrightarrow{h_t}$, and the output state of the backward LSTM is the hidden vector $\overleftarrow{h_t}$. The two networks use different parameters, and the representation of a character, $h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$, is obtained by concatenating its forward and backward hidden vectors. Next, a fully connected layer maps the hidden state vectors $(h_1, h_2, \ldots, h_n) \in \mathbb{R}^{n \times m}$ to $k$ dimensions, where $k$ is the number of labels in the label set. The extracted sentence features are thus represented as a matrix $P = (p_1, p_2, \ldots, p_n) \in \mathbb{R}^{n \times k}$. Finally, the parameters of the CRF layer are represented by a matrix $A$, where $A_{ij}$ denotes the score of the transition from the $i$-th label to the $j$-th label. For a sequence of labels $y = (y_1, y_2, \ldots, y_n)$, the score of the tag sequence is calculated as follows:
$$\mathrm{score}(x, y) = \sum_{i=1}^{n} P_{i, y_i} + \sum_{j=1}^{n+1} A_{y_{j-1}, y_j}. \tag{3.4}$$
The score of the whole sequence is the sum of the scores of all characters in the sentence, determined by the output matrix $P$ of the BiLSTM layer and the transition matrix $A$ of the CRF layer; $y_0$ and $y_{n+1}$ denote the start and end tags of the sequence. A softmax function then yields the conditional probability of a path by normalizing this score over all possible tag paths $y'$:
$$P(y \mid x) = \frac{e^{\mathrm{score}(x, y)}}{\sum_{y'} e^{\mathrm{score}(x, y')}}. \tag{3.5}$$
During the training phase, the goal of the model is to maximize the log-probability of the correct tag sequence. During prediction, the score of each candidate sequence is calculated with the trained parameters, and the optimal path is found with the Viterbi algorithm, which is based on dynamic programming:
$$\arg\max_{y'} \mathrm{score}(x, y'). \tag{3.6}$$
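The sequence score of Eq. (3.4) and the Viterbi search of Eq. (3.6) can be sketched as follows. This is an illustrative pure-Python version, not the authors' code: the start and end transitions are passed as separate vectors (`start`, `end`), and all scores in the example are made-up numbers for a three-tag set (0 = O, 1 = B, 2 = I).

```python
from itertools import product


def sequence_score(emissions, transitions, start, end, tags):
    """Eq. (3.4): emission scores plus transition scores, including the
    transitions out of the start tag and into the end tag."""
    score = start[tags[0]] + end[tags[-1]]
    score += sum(emissions[i][t] for i, t in enumerate(tags))
    score += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return score


def viterbi(emissions, transitions, start, end):
    """Eq. (3.6): highest-scoring tag path found by dynamic programming."""
    k = len(start)
    best = [start[t] + emissions[0][t] for t in range(k)]
    back = []
    for row in emissions[1:]:
        step, pointers = [], []
        for t in range(k):
            prev = max(range(k), key=lambda p: best[p] + transitions[p][t])
            pointers.append(prev)
            step.append(best[prev] + transitions[prev][t] + row[t])
        best = step
        back.append(pointers)
    last = max(range(k), key=lambda t: best[t] + end[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last] + end[last]


# Toy scores: rows of `emissions` are characters, columns are tags O, B, I.
emissions = [[0.1, 2.0, 0.0], [0.2, 0.1, 1.5], [1.0, 0.3, 0.2]]
transitions = [[0.5, 0.2, -5.0], [0.0, -1.0, 1.0], [0.3, 0.0, 0.4]]
start = [0.0, 0.0, -5.0]  # a sequence should not start with I
end = [0.0, 0.0, 0.0]

path, score = viterbi(emissions, transitions, start, end)

# Exhaustive enumeration over all 3**3 paths, for comparison only.
brute_force = max(
    sequence_score(emissions, transitions, start, end, list(tags))
    for tags in product(range(3), repeat=len(emissions))
)
```

The enumeration grows exponentially with sequence length, whereas the Viterbi recursion is linear in the length and quadratic in the number of tags, which is why it is used for decoding.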
For an input sequence $S = \{x_1, x_2, \ldots, x_n\}$ of the EHR, $x_i$ represents the $i$-th character in the EHR. The sequence is fed into the BERT model to obtain the representation of each character:
$$H = [h_1, h_2, \ldots, h_n] = \mathrm{BERT}([x_1, x_2, \ldots, x_n]). \tag{3.7}$$
Suppose that the vectors $h_i$ to $h_j$ are the final hidden state vectors from BERT for the symptom entity $e_{sy_i}$; we average them to obtain a vector representation for each entity:
$$e_{sy_i} = \frac{1}{j - i + 1} \sum_{t=i}^{j} h_t. \tag{3.8}$$
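To make Eq. (3.8) concrete, the sketch below first recovers entity spans from BIO tags and then averages the character vectors inside each span. The tag strings (`B-Sym`, `I-Sym`, `B-Med`), the function names and the two-dimensional toy vectors are assumptions for illustration; real BERT hidden states would have 768 dimensions.

```python
def bio_spans(tags):
    """Return (entity_type, start, end) triples, end inclusive, from BIO tags."""
    spans, start, kind = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and start is not None and tag[2:] == kind
        if not inside:
            if start is not None:
                spans.append((kind, start, i - 1))
            start, kind = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans


def span_mean(hidden, start, end):
    """Eq. (3.8): average hidden[start..end] into one entity vector."""
    span = hidden[start:end + 1]
    return [sum(column) / len(span) for column in zip(*span)]


def entity_vectors(hidden, tags):
    """Group the averaged entity vectors by entity type."""
    vectors = {}
    for kind, start, end in bio_spans(tags):
        vectors.setdefault(kind, []).append(span_mean(hidden, start, end))
    return vectors


# Four characters: a two-character symptom, one other character, one medicine.
tags = ["B-Sym", "I-Sym", "O", "B-Med"]
hidden = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 6.0]]
entities = entity_vectors(hidden, tags)
```

An `I-` tag that does not continue an open span of the same type is ignored here; other conventions treat it as the start of a new entity.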
We obtain the symptom entity embeddings $E_{sy} = [e_{sy_1}, e_{sy_2}, \ldots, e_{sy_n}]$ and, in the same way, the medicine and abnormal inspection result entity embeddings $E_{med} = [e_{med_1}, e_{med_2}, \ldots, e_{med_n}]$ and $E_{abn} = [e_{abn_1}, e_{abn_2}, \ldots, e_{abn_n}]$. We perform convolution operations separately on the symptom, medicine and abnormal inspection result entities to extract features for each entity type. The convolution is carried out between a convolution kernel $w$ and the symptom entity embeddings in the $i$-th window $e_{sy_{i:i+h-1}}$ of $E_{sy}$, which yields the feature $c_{sy_i}$:
$$c_{sy_i} = f(e_{sy_{i:i+h-1}} \cdot w + b), \tag{3.9}$$
where $w \in \mathbb{R}^{h \times d}$ is the convolution kernel, $h$ is the height of the kernel, $d$ is the dimension of the character embedding in BERT, $b \in \mathbb{R}$ is a bias term, and $f$ is a non-linear function.
This filter is applied to each possible window of the entity matrix, $\{e_{sy_{1:h}}, e_{sy_{2:h+1}}, \ldots, e_{sy_{n-h+1:n}}\}$, to produce a feature map $c_{sy} = [c_{sy_1}, c_{sy_2}, \ldots, c_{sy_{n-h+1}}]$. Max pooling is then applied over the feature map, and the maximum value $c'_{sy} = \max\{c_{sy}\}$ is taken. In the same way, we convolve the medicine and abnormal inspection result entities to obtain the representations $c'_{med}$ and $c'_{abn}$. The symptom, medicine and abnormal inspection result representations are concatenated to obtain the final entity representation of the patient:
$$Z_{entity} = \mathrm{concat}(c'_{sy}; c'_{med}; c'_{abn}). \tag{3.10}$$
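The convolution and max-pooling steps of Eqs. (3.9) and (3.10) can be sketched in plain Python. Several details here are assumptions rather than facts from the text: the non-linearity $f$ is taken to be ReLU, the kernel values are arbitrary, and an entity type with fewer entities than the kernel height contributes 0.

```python
def conv_feature_map(entities, kernel, bias=0.0):
    """Eq. (3.9) with f = ReLU: slide an h x d kernel over the entity matrix."""
    h = len(kernel)
    feature_map = []
    for i in range(len(entities) - h + 1):
        window = entities[i:i + h]
        total = sum(w * e
                    for w_row, e_row in zip(kernel, window)
                    for w, e in zip(w_row, e_row))
        feature_map.append(max(0.0, total + bias))
    return feature_map


def entity_representation(groups, kernels):
    """Max-pool each kernel's feature map per entity type and concatenate,
    in the spirit of Eq. (3.10)."""
    z_entity = []
    for entities in groups:  # e.g. symptom, medicine, abnormal result
        for kernel in kernels:
            feature_map = conv_feature_map(entities, kernel)
            z_entity.append(max(feature_map, default=0.0))
    return z_entity


# Toy data: three symptom entities and one medicine entity, dimension d = 2.
symptoms = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]
medicines = [[1.0, 1.0]]
kernel_h2 = [[1.0, 0.0], [0.0, 1.0]]  # a single kernel of height h = 2

feature_map = conv_feature_map(symptoms, kernel_h2)
z_entity = entity_representation([symptoms, medicines], [kernel_h2])
```

With several kernels per height, as is usual for TextCNN, `z_entity` simply grows by one pooled value per kernel and entity type.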
By concatenating the text and entity representations, the final representation of the EHR is obtained as $Z_{patient} = \mathrm{concat}(Z_{text}; Z_{entity})$, whose size is the sum of the sizes of its components, $d_{text} + d_{entity}$. The EHR representation $Z_{patient}$ is sent to a fully connected layer, and the probability of each disease class is calculated by the softmax activation function:
$$y = \mathrm{softmax}(w \cdot Z_{patient} + b), \tag{3.11}$$
where $y$ denotes the predicted probability distribution over the $K$ disease classes ($K = 9$), and $y_i$ is the probability that the input EHR belongs to the $i$-th disease.
The model is trained with the cross-entropy loss function, with the goal of minimizing the loss:
$$Loss = -\sum_{T \in Corpus} \sum_{i=1}^{K} \hat{y}_i(T) \log\big(y_i(T)\big), \tag{3.12}$$
where $T$ is the input EHR, $Corpus$ denotes the training sample set, $K$ is the number of classes, $\hat{y}_i(T)$ is the true label of $T$ for class $i$ (1 for the correct disease and 0 otherwise), and $y_i(T)$ is the predicted probability from Eq. (3.11).
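The classifier of Eq. (3.11) and the loss of Eq. (3.12) can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and the weights are set to zero so that the prediction is uniform over the nine classes. It treats the true label as a class index, which corresponds to a one-hot label vector in Eq. (3.12).

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(z_patient, weights, bias):
    """Eq. (3.11): one weight row and one bias per disease class."""
    logits = [sum(w * z for w, z in zip(row, z_patient)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def cross_entropy(batch_probs, labels):
    """Eq. (3.12) with one-hot labels: sum of -log p(true class)."""
    return -sum(math.log(probs[label])
                for probs, label in zip(batch_probs, labels))


# K = 9 classes and an untrained (all-zero) layer give a uniform prediction.
num_classes = 9
z_patient = [1.0, 2.0]
weights = [[0.0, 0.0] for _ in range(num_classes)]
bias = [0.0] * num_classes

probs = predict(z_patient, weights, bias)
loss = cross_entropy([probs], [3])  # true class index 3, chosen arbitrarily
```

For a uniform prediction the loss per record equals $\log K$, which is a convenient sanity check at the start of training.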
Large-scale Chinese EHR datasets with entity information are not always readily accessible. To facilitate research on Chinese EHRs, we collected a large raw dataset from a Grade III, Class B general hospital in Gansu Province, China, which contained 61,233 EHRs. We selected 8 diseases, namely cerebral infarction (CI), vertebrobasilar insufficiency (VBI), coronary atherosclerotic heart disease (CAHD), cholecystitis, bronchitis, degenerative spondylitis, intestinal obstruction and type 2 diabetic peripheral neuropathy (T2DM), together with a group of other diseases, to form the Chinese Electronic Health Record (CEHR) dataset. Before the experiments, the following preprocessing was carried out on the CEHR text:
1) De-identification: Delete the patient's private information from the CEHRs, such as 'name', 'place of birth' and 'occupation'.
2) Selecting the required CEHRs: Chinese EHRs contain a large number of missing values. Therefore, records with unfilled personal information and records of fewer than 200 words are removed.
3) Labeling entity information: We referred to a number of annotation specifications [28, 29] to label entity information. The CEHR corpus contains 3 types of entities: symptom (Sym), medicine (Med) and abnormal inspection result (Abn). Sym: a symptom is a subjective feeling described by the patient or an objective fact observed by others, such as dizziness. Med: a medicine is the name of a drug used during treatment, such as aspirin, excluding dosage, method of administration, etc. Abn: an abnormal inspection result is an abnormal change or abnormal examination result found through examination procedures or observed by doctors, such as increased lung markings.
After the above processing, we selected 8290 CEHRs as experimental data and split them into training, validation and test sets in proportions of 70, 10 and 20%, respectively. Table 2 shows the distribution of CEHRs, in descending order of data volume. The statistics of the entity information for our experiments are shown in Table 3.
Disease | Training set | Test set | Validation set |
CI | 700 | 200 | 100 |
VBI | 700 | 200 | 100 |
CAHD | 700 | 200 | 100 |
bronchitis | 700 | 200 | 100 |
degenerative spondylitis | 700 | 200 | 100 |
T2DM | 700 | 200 | 100 |
other diseases | 700 | 200 | 100 |
cholecystitis | 511 | 146 | 73 |
intestinal obstruction | 392 | 112 | 56 |
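The per-class counts in Table 2 are consistent with applying the 70/10/20 split to each disease separately. The short check below assumes integer arithmetic with the test set taking the remainder; the function name is hypothetical, and whether the authors split per class in exactly this way is not stated in the text.

```python
def split_counts(n, train_pct=70, val_pct=10):
    """Return (training, validation, test) sizes for n records.

    The test set takes whatever remains after training and validation.
    """
    train = n * train_pct // 100
    val = n * val_pct // 100
    return train, val, n - train - val


# Class totals obtained by summing each row of Table 2.
class_sizes = [1000] * 7 + [730, 560]
splits = [split_counts(n) for n in class_sizes]
total_records = sum(class_sizes)
```

Note that `split_counts` returns the sizes in the order training, validation, test, whereas the columns of Table 2 are ordered training, test, validation.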
Disease | Avg. number of entities | Max. number of entities | Min. number of entities |
CI | 16.83 | 23 | 9 |
VBI | 13.71 | 20 | 8 |
CAHD | 15.16 | 27 | 7 |
bronchitis | 17.47 | 29 | 6 |
degenerative spondylitis | 12.18 | 14 | 5 |
T2DM | 16.38 | 32 | 8 |
other diseases | 16.08 | 33 | 6 |
cholecystitis | 18.94 | 30 | 9 |
intestinal obstruction | 16.16 | 27 | 8 |
The goal of this paper is to obtain EHR features and use them for disease prediction. We assess the quality of disease prediction with standard evaluation metrics for classification tasks, namely Accuracy, Precision, Recall and F1-score, which are defined as follows:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{4.1}$$
$$Recall = \frac{TP}{TP + FN} \tag{4.2}$$
$$Precision = \frac{TP}{TP + FP} \tag{4.3}$$
$$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{4.4}$$
where TP is the number of positive samples predicted as positive, FP is the number of negative samples predicted as positive, FN is the number of positive samples predicted as negative, and TN is the number of negative samples predicted as negative.
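Eqs. (4.1) to (4.4) translate directly into code. The sketch below uses hypothetical function names and made-up confusion counts; how the per-class values are averaged over the nine classes is not specified in the text and is therefore left out.

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (4.1)."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Eq. (4.2)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Eq. (4.3)."""
    return tp / (tp + fp)


def f1_score(p, r):
    """Eq. (4.4): harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Made-up confusion counts for a single class (one-vs-rest).
tp, fp, fn, tn = 8, 2, 2, 88
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
acc = accuracy(tp, tn, fp, fn)
```

As a consistency check on the reported results, the precision and recall given for the proposed model in Table 4 (93.62 and 90.28) yield an F1-score that rounds to 91.92, matching the table.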
To protect patient privacy and reduce noise in the data, the EHRs in this paper are preprocessed in several ways, including privacy removal, data cleaning, entity labeling and disease standardization. The BERT version is BERT-base-Chinese; its main hyperparameters are a hidden size of 768, 12 Transformer blocks, 12 attention heads and a maximum input length of 512. In the convolutional module, the heights of the filters are 2, 3 and 4. During training, we used a learning rate of 5e-5, a dropout rate of 0.5 and a batch size of 32.
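For reproducibility, these settings can be collected in a single configuration object. The class and field names below are hypothetical; the values are the ones stated in the text, with the number of fusion attention heads taken from the head-count experiment reported with Figure 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConfig:
    """Hyperparameters reported in the text (field names are illustrative)."""
    bert_version: str = "BERT-base-Chinese"
    hidden_size: int = 768
    transformer_blocks: int = 12
    bert_heads: int = 12
    max_input_length: int = 512
    filter_heights: tuple = (2, 3, 4)
    fusion_heads: int = 8  # best setting in the head-count experiment
    learning_rate: float = 5e-5
    dropout: float = 0.5
    batch_size: int = 32
    num_classes: int = 9


config = FusionConfig()
```

A frozen dataclass prevents accidental changes to the settings during a run.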
We conducted experiments to compare the performance of our model with other disease prediction models.
SVM [5]: The Chinese EHRs are segmented into words with the PKUSEG tool, the TF-IDF algorithm extracts key information to obtain the representation of the EHRs, and SVM is then used for disease prediction.
CNN [15]: CNN is used to obtain features from Chinese EHRs, and the probability of the patient's disease is then computed by sending the features to fully connected layers.
BiLSTM: The model utilizes BiLSTM to extract features and feeds them into fully connected layers and activation layers for disease prediction.
RCNN [30]: The model utilizes RCNN to obtain the textual features of EHRs, and then sends them into fully connected layers and activation layers for disease prediction.
BERT [23]: The model uses the pre-trained model BERT to extract the features of Chinese EHRs for disease prediction tasks.
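For the SVM baseline in the list above, the TF-IDF weighting step can be sketched as follows. This is one common textbook variant (relative term frequency times log(N / document frequency)), assumed here for illustration; the text does not say which variant was used, and the English tokens stand in for the output of Chinese word segmentation. Documents are assumed to be non-empty.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    tf is the relative term frequency within a document and idf is
    log(N / document frequency); other variants add smoothing.
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Three toy "records", already segmented into words.
docs = [["cough", "fever", "cough"],
        ["fever", "dizziness"],
        ["dizziness", "nausea"]]
vectors = tf_idf(docs)
```

Terms that appear in every document receive a weight of zero under this variant, which is why smoothed versions are often preferred in practice.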
We compared overall performance of our proposed model with baseline models on a test set of CEHR datasets. Table 4 shows the experimental results of baseline models and our proposed model.
Method | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
SVM | 89.06 | 87.39 | 86.91 | 87.15 |
CNN | 89.55 | 88.44 | 87.55 | 87.99 |
BiLSTM | 89.39 | 88.53 | 86.97 | 87.74 |
RCNN | 89.76 | 89.04 | 87.51 | 88.27 |
BERT | 91.68 | 90.83 | 89.26 | 90.04 |
Our-model | 94.66 | 93.62 | 90.28 | 91.92 |
As shown in Table 4, our method is more effective than the other methods, with an F1-score of 91.92%. The methods in Table 4 can be divided into traditional machine learning and deep learning methods. SVM, as a traditional machine learning model, cannot learn the complex feature representation of EHRs in depth. The BERT model only obtains the text information of the EHRs and ignores the entity and numerical information; our model improves on the F1-score of the mainstream BERT model by 1.88 percentage points. This suggests that entity information is very important for both patient representation and disease prediction. The experimental results show that the multi-type data fusion model can fully obtain the features of Chinese EHRs, and that disease prediction based on this model is effective and feasible.
As shown in Table 5, the 8 diseases and the group of other diseases are listed in descending order of data volume (the number of records for each disease is shown in Table 2), together with the F1-score for each disease. Our model has the highest F1-score for all 8 diseases and for the other diseases, indicating that it represents patients effectively across these classes. For the diseases with less data, such as cholecystitis and intestinal obstruction, our model shows a clear improvement over BERT, the strongest baseline on these classes, of 2.61 and 2.50 percentage points, respectively. For VBI and degenerative spondylitis the improvement over BERT is smaller, 1.25 and 1.31 percentage points, respectively. The main reason is that these two diseases have few entities, so the model cannot learn the features of entity information well.
Disease | SVM | CNN | BiLSTM | RCNN | BERT | Our model |
CI | 82.26 | 82.91 | 82.94 | 83.38 | 86.41 | 88.78 |
VBI | 82.45 | 83.19 | 83.83 | 86.27 | 87.18 | 88.43 |
CAHD | 89.72 | 90.74 | 90.43 | 90.88 | 92.03 | 93.83 |
bronchitis | 89.79 | 89.87 | 90.17 | 90.59 | 93.48 | 95.15 |
degenerative spondylitis | 89.91 | 90.14 | 89.81 | 90.41 | 90.22 | 91.53 |
T2DM | 90.37 | 90.91 | 89.85 | 89.49 | 91.48 | 93.43 |
other diseases | 84.27 | 86.23 | 85.17 | 85.94 | 88.54 | 89.96 |
cholecystitis | 88.69 | 89.72 | 88.75 | 88.35 | 90.91 | 93.52 |
intestinal obstruction | 86.89 | 88.23 | 88.72 | 89.14 | 90.18 | 92.68 |
To verify the importance and role of different types of information in CEHR representation and to better understand the behavior of the proposed fusion model, we conduct an ablation study. T, E and N represent textual information, entity information and numerical information, respectively. As shown in Table 6, with T + E + N the model achieved a 91.92% F1-score on the test set, which is 1.09, 1.56 and 0.75 percentage points higher than the models without textual, entity and numerical information, respectively, indicating that each type of information affects disease prediction. Among them, entity information has the greatest impact, which shows that it plays a key role in our model. Fusing multiple types of data improves the performance of the model and, at the same time, makes the model more interpretable.
Method | Precision (%) | Recall (%) | F1-score (%) |
T + E + N | 93.62 | 90.28 | 91.92 |
T + E | 92.18 | 90.18 | 91.17 |
T + N | 91.69 | 89.06 | 90.36 |
E + N | 91.19 | 90.47 | 90.83 |
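The F1-score differences quoted for the ablation study follow directly from Table 6, as the short calculation below shows. The dictionary keys are labels chosen for this example: each entry holds the F1-score of the variant that omits the named information type.

```python
full_f1 = 91.92  # T + E + N

# F1-score of the model without each information type (rows of Table 6).
without = {
    "text": 90.83,       # E + N
    "entity": 90.36,     # T + N
    "numerical": 91.17,  # T + E
}

# Drop in F1-score, in percentage points, when each type is removed.
drops = {name: round(full_f1 - f1, 2) for name, f1 in without.items()}
most_important = max(drops, key=drops.get)
```

Removing entity information costs the most, which is the basis for the statement that entity information has the greatest impact.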
In order to choose a better entity acquisition method, we compared the CRF and BiLSTM-CRF models, and the results are shown in Figure 4.
We performed a comparison of the CRF and BiLSTM-CRF models for the identification of entity information in CEHRs. The Precisions for the two models are 88.57 and 89.24%. The Recalls for the two models are 87.54 and 88.76%. The F1-scores for the two models are 88.05 and 89.01%. We find that the BiLSTM-CRF model outperforms the CRF model. So, the BiLSTM-CRF model is selected as the entity extraction model in CEHRs.
Multi-head attention was adopted to fuse textual information and numerical information. As shown in Figure 5, when the number of heads of the multi-head attention is 8, the model achieves the best performance, with an F1-score of 91.92%. As the number of heads increases, the performance of the model gets better, but the number of heads should not be set too large, as otherwise F1-score will decrease. When the number of heads reached 12, the F1-score decreased by 0.31%, because excessive attention would introduce noise and reduce the performance of the model.
This experiment examines whether the BERT-based EHR representation is better than the traditional Word2vec and Doc2vec representations. As shown in Table 7, using BERT for text and entity embedding works better than using Word2vec and Doc2vec embeddings: the F1-score of the BERT-based model is 3.17 percentage points higher than that of the combined Word2vec+Doc2vec model. The reason is that the BERT model, trained on character vectors, can alleviate the problem of polysemy to a certain extent.
Model | Precision (%) | Recall (%) | F1-score (%) |
Word2vec+Doc2vec | 89.15 | 88.36 | 88.75 |
Our-model | 93.62 | 90.28 | 91.92 |
This paper proposes a disease prediction method based on a multi-type data fusion mechanism for EHRs. The model uses multi-head self-attention to fuse numerical features into textual information and enhance the text representation. The TextCNN model forms the entity representation, and the text and entity representations are concatenated to obtain the final representation of the EHR. This method addresses the problems of inadequate representation and difficult feature extraction when an EHR contains various types of data. The experimental results show that the multi-type data fusion model can effectively learn the feature representation of EHRs and achieve disease prediction. In future work, we will try to incorporate more information, such as time series data and external knowledge bases, to further improve the quality and efficiency of disease prediction.
We would like to thank the anonymous reviewers for their valuable comments. The publication of this article is supported by the National Natural Science Foundation of China (No. 62163033), the Natural Science Foundation of Gansu Province (No. 21JR7RA781, No. 21JR7RA116), the Lanzhou Talent Innovation and Entrepreneurship Project (No. 2021-RC-49) and the Northwest Normal University Major Research Project Incubation Program (No. NWNU-LKZD2021-06).
The authors declare there is no conflict of interest.
[1] | Phillips RR (2001) Crystals, defects and microstructures: Modeling across scales, Cambridge: Cambridge University Press. |
[2] | Yip S (2005) Handbook of materials modeling, Berlin: Springer. |
[3] | Steinhauser MO (2017) Computational Multiscale Modeling of Fluids and Solids-Theory and Applications, 2nd Edition, Berlin: Springer. |
[4] |
McNeil PL, Terasaki M (2001) Coping with the inevitable: how cells repair a torn surface membrane. Nat Cell Biol 3: E124–E129. doi: 10.1038/35074652
![]() |
[5] | Schmidt M, Kahlert U, Wessolleck J, et al. (2014) Characterization of a setup to test the impact of high-amplitude pressure waves on living cells. Sci Rep 4: 3849. |
[6] |
Gambihler S, Delius M, Ellwart JW (1992) Transient increase in membrane permeability of L1210 cells upon exposure to lithotripter shock waves in vitro. Naturwissenschaften 79: 328–329. doi: 10.1007/BF01138714
![]() |
[7] | Gambihler S, Delius M, Ellwart JW (1994) Permeabilization of the plasma membrane of L1210 mouse leukemia cells using lithotripter shock waves. J Membrane Biol 141: 267–275. |
[8] | Kodama T, Doukas AG, Hamblin MR (2002) Shock wave-mediated molecular delivery into cells. BBA-Mol Cell Res 1542: 186–194. |
[9] |
Bao G, Suresh S (2003) Cell and molecular mechanics of biological materials. Nat Mater 2: 715–725. doi: 10.1038/nmat1001
![]() |
[10] |
Tieleman DP, Leontiadou H, Mark AE, et al. (2003) Simulation of Pore Formation in Lipid Bilayers by Mechanical Stress and Electric Fields. J Am Chem Soc 125: 6382–6383. doi: 10.1021/ja029504i
![]() |
[11] |
Sundaram J, Mellein BR, Mitragotri S (2003) An Experimental and Theoretical Analysis of Ultrasound-Induced Permeabilization of Cell Membranes. Biophys J 84: 3087–3101. doi: 10.1016/S0006-3495(03)70034-4
![]() |
[12] |
Doukas AG, Kollias N (2004) Transdermal drug delivery with a pressure wave. Adv Drug Deliver Rev 56: 559–579. doi: 10.1016/j.addr.2003.10.031
![]() |
[13] |
Coussios CC, Roy RA (2008) Applications of Acoustics and Cavitation to Noninvasive Therapy and Drug Delivery. Annu Rev Fluid Mech 40: 395–420. doi: 10.1146/annurev.fluid.40.111406.102116
![]() |
[14] |
Prausnitz MR, Langer R (2008) Transdermal drug delivery. Nat Biotechnol 26: 1261–1268. doi: 10.1038/nbt.1504
![]() |
[15] |
Ashley CE, Carnes EC, Phillips GK, et al. (2011) The targeted delivery of multicomponent cargos to cancer cells by nanoporous particle-supported lipid bilayers. Nat Mater 10: 389–397. doi: 10.1038/nmat2992
![]() |
[16] |
Koshiyama K, Wada S (2011) Molecular dynamics simulations of pore formation dynamics during the rupture process of a phospholipid bilayer caused by high-speed equibiaxial stretching. J Biomech 44: 2053–2058. doi: 10.1016/j.jbiomech.2011.05.014
![]() |
[17] |
Steinhauser MO (2016) On the Destruction of Cancer Cells Using Laser-Induced Shock-Waves: A Review on Experiments and Multiscale Computer Simulations. Radiol Open J 1: 60–75. doi: 10.17140/ROJ-1-110
![]() |
[18] | Krehl POK (2009) History of Shock Waves, Explosions and Impact: A Chronological and Biographical Reference, Berlin: Springer. |
[19] |
Steinhauser MO, Schneider J, Blumen A (2009) Simulating dynamic crossover behavior of semiflexible linear polymers in solution and in the melt. J Chem Phys 130: 164902. doi: 10.1063/1.3111038
![]() |
[20] |
Rodriguez V, Saurel R, Jourdan G, et al. (2013) Solid-particle jet formation under shock-wave acceleration. Phys Rev E 88: 063011. doi: 10.1103/PhysRevE.88.063011
![]() |
[21] |
Zheng J, Chen QF, Gu YJ, et al. (2012) Hugoniot measurements of double-shocked precompressed dense xenon plasmas. Phys Rev E 86: 066406. doi: 10.1103/PhysRevE.86.066406
![]() |
[22] |
Falk K, Regan SP, Vorberger J, et al. (2013) Comparison between x-ray scattering and velocityinterferometry measurements from shocked liquid deuterium. Phys Rev E 87: 043112. doi: 10.1103/PhysRevE.87.043112
![]() |
[23] |
Brujan EA, Matsumoto Y (2014) Shock wave emission from a hemispherical cloud of bubbles in non-Newtonian fluids. J Non-Newton Fluid 204: 32–37. doi: 10.1016/j.jnnfm.2013.12.003
![]() |
[24] |
Iakovlev S, Iakovlev S, Buchner C, et al. (2014) Resonance-like phenomena in a submerged cylindrical shell subjected to two consecutive shock waves: The effect of the inner fluid. J Fluid Struct 50: 153–170. doi: 10.1016/j.jfluidstructs.2014.05.013
![]() |
[25] |
Bringa EM, Caro A,Wang YM, et al. (2005) Ultrahigh strength in nanocrystalline materials under shock loading. Science 309: 1838–1841. doi: 10.1126/science.1116723
![]() |
[26] |
Kadau K, Germann TC, Lomdahl PS, et al. (2007) Shock waves in polycrystalline iron. Phys Rev Lett 98: 135701. doi: 10.1103/PhysRevLett.98.135701
![]() |
[27] |
Knudson MD, Desjarlais MP, Dolan DH (2008) Shock-Wave Exploration of the High-Pressure Phases of Carbon. Science 322: 1822–1825. doi: 10.1126/science.1165278
![]() |
[28] |
Gurnett DA, Kurth WS (2005) Electron plasma oscillations upstream of the solar wind termination shock. Science 309: 2025–2027. doi: 10.1126/science.1117425
![]() |
[29] |
Gurnett DA, Kurth WS (2008) Intense plasma waves at and near the solar wind termination shock. Nature 454: 78–80. doi: 10.1038/nature07023
![]() |
[30] |
Dutton Z, Budde M, Slowe C, et al. (2001) Observation of quantum shock waves created with ultra-compressed slow light pulses in a Bose-Einstein condensate. Science 293: 663–668. doi: 10.1126/science.1062527
![]() |
[31] |
Damski B (2006) Shock waves in a one-dimensional Bose gas: From a Bose-Einstein condensate to a Tonks gas. Phys Rev A 73: 043601. doi: 10.1103/PhysRevA.73.043601
![]() |
[32] |
Chang JJ, Engels P, Hoefer MA (2008) Formation of dispersive shock waves by merging and splitting Bose-Einstein condensates. Phys Rev Lett 101: 170404. doi: 10.1103/PhysRevLett.101.170404
![]() |
[33] | Millot M, Dubrovinskaia N, Černok A, et al. (2015) Planetary science. Shock compression of stishovite and melting of silica at planetary interior conditions. Science 347: 418–420. |
[34] |
Bridge HS, Lazarus AJ, Snyder CW, et al. (1967) Mariner V: Plasma and Magnetic Fields Observed near Venus. Science 158: 1669–1673. doi: 10.1126/science.158.3809.1669
![]() |
[35] |
McKee CF, Draine BT (1991) Interstellar shock waves. Science 252: 397–403. doi: 10.1126/science.252.5004.397
![]() |
[36] | McClure S, Dorfmüller C (2002) Extracorporeal shock wave therapy: Theory and equipment. Clin Tech Equine Pract 2: 348–357. |
[37] |
Lingeman JE, McAteer JA, Gnessin E, et al. (2009) Shock wave lithotripsy: advances in technology and technique. Nat Rev Urol 6: 660–670. doi: 10.1038/nrurol.2009.216
![]() |
[38] | Cherenkov PA (1934) Visible emission of clean liquids by action of gamma radiation. Dokl Akad Nauk SSSR 2: 451–454. |
[39] |
Mach E, Salcher P (1887) Photographische Fixirung der durch Projectile in der Luft eingeleiteten Vorgänge. Ann Phys 268: 277–291. doi: 10.1002/andp.18872681008
![]() |
[40] |
Kühn M, Steinhauser MO (2008) Modeling and simulation of microstructures using power diagrams: Proof of the concept. Appl Phys Lett 93: 034102. doi: 10.1063/1.2959733
![]() |
[41] |
Walsh JM, Rice MH (1957) Dynamic compression of liquids from measurements on strong shock waves. J Chem Phys 26: 815–823. doi: 10.1063/1.1743414
![]() |
[42] | Asay JR, Chhabildas LC (2003) Paradigms and Challenges in Shock Wave Research, High-Pressure Shock Compression of Solids VI, New York: Springer-Verlag New York, 57–119. |
[43] |
Steinhauser MO, Grass K, Strassburger E, et al. (2009) Impact failure of granular materials-Nonequilibrium multiscale simulations and high-speed experiments. Int J Plasticity 25: 161–182. doi: 10.1016/j.ijplas.2007.11.002
![]() |
[44] |
Watson E, Steinhauser MO (2017) Discrete Particle Method for Simulating Hypervelocity Impact Phenomena. Materials 10: 379. doi: 10.3390/ma10040379
![]() |
[45] |
Holian BL, Lomdahl PS (1998) Plasticity induced by shock waves in nonequilibrium moleculardynamics simulations. Science 280: 2085–2088. doi: 10.1126/science.280.5372.2085
![]() |
[46] |
Kadau K, Germann TC, Lomdahl PS, et al. (2002) Microscopic view of structural phase transitions induced by shock waves. Science 296: 1681–1684. doi: 10.1126/science.1070375
![]() |
[47] |
Chen M, McCauley JW, Hemker KJ (2003) Shock-Induced Localized Amorphization in Boron Carbide. Science 299: 1563–1566. doi: 10.1126/science.1080819
![]() |
[48] | Holian BL (2004) Molecular dynamics comes of age for shockwave research. Shock Waves 13: 489–495. |
[49] |
Germann TC, Kadau K (2008) Trillion-atom molecular dynamics becomes a reality. Int J Mod Phys C 19: 1315–1319. doi: 10.1142/S0129183108012911
![]() |
[50] | Ciccotti G, Frenkel G, McDonald IR (1987) Simulation of Liquids and Solids, Amsterdam: North-Holland. |
[51] | Allen MP, Tildesley DJ (1987) Computer Simulation of Liquids, Oxford, UK: Clarendon Press. |
[52] |
Liu WK, Hao S, Belytschko T, et al. (1999) Multiple scale meshfree methods for damage fracture and localization. Comp Mater Sci 16: 197–205. doi: 10.1016/S0927-0256(99)00062-2
![]() |
[53] |
Gates TS, Odegard GM, Frankland SJV, et al. (2005) Computational materials: Multi-scale modeling and simulation of nanostructured materials. Compos Sci Technol 65: 2416–2434. doi: 10.1016/j.compscitech.2005.06.009
![]() |
[54] | Steinhauser MO (2013) Computer Simulation in Physics and Engineering, 1st Edition, Berlin: deGruyter. |
[55] |
Finnis MW, Sinclair JE (1984) A simple empirical N-body potential for transition metals. Philos Mag A 50: 45–55. doi: 10.1080/01418618408244210
![]() |
[56] |
Kohn W (1996) Density functional and density matrix method scaling linearly with the number of atoms. Phys Rev Lett 76: 3168–3171. doi: 10.1103/PhysRevLett.76.3168
![]() |
[57] |
Car R, Parrinello M (1985) Unified approach for molecular dynamics and density-functional theory. Phys Rev Lett 55: 2471–2474. doi: 10.1103/PhysRevLett.55.2471
![]() |
[58] |
Elstner M, Porezag D, Jungnickel G, et al. (1998) Self-consistent-charge density-functional tightbinding method for simulations of complex materials properties. Phys Rev B 58: 7260–7268. doi: 10.1103/PhysRevB.58.7260
![]() |
[59] |
Sutton AP, Finnis MW, Pettifor DG, et al. (1988) The tight-binding bond model. J Phys C-Solid State Phys 21: 35–66. doi: 10.1088/0022-3719/21/1/007
![]() |
[60] | Szabo A, Ostlund NS (1996) Modern quantum chemistry: introduction to advanced electronic structure theory, (Dover Books on Chemistry), New York: Dover Publications. |
[61] |
Kadau K, Germann TC, Lomdahl PS (2006) Molecular dynamics comes of age: 320 billion atom simulation on BlueGene/L. Int J Mod Phys C 17: 1755–1761. doi: 10.1142/S0129183106010182
![]() |
[62] |
Fineberg J (2003) Materials science: close-up on cracks. Nature 426: 131–132. doi: 10.1038/426131a
![]() |
[63] |
Buehler M, Hartmaier A, Gao H, et al. (2004) Atomic plasticity: description and analysis of a onebillion atom simulation of ductile materials failure. Comput Method Appl M 193: 5257–5282. doi: 10.1016/j.cma.2003.12.066
![]() |
[64] |
Abraham FF, Gao HJ (2000) How fast can cracks propagate? Phys Rev Lett 84: 3113–3116. doi: 10.1103/PhysRevLett.84.3113
![]() |
[65] |
Bulatov V, Abraham FF, Kubin L, et al. (1998) Connecting atomistic and mesoscale simulations of crystal plasticity. Nature 391: 669–672. doi: 10.1038/35577
![]() |
[66] |
Gross SP, Fineberg J, Marder M, et al. (1993) Acoustic emissions from rapidly moving cracks. Phys Rev Lett 71: 3162–3165. doi: 10.1103/PhysRevLett.71.3162
![]() |
[67] | Courant R (1943) Variational Methods for the Solution of Problems of Equilibrium and Vibrations. B Am Math Soc 49: 1–23. |
[68] |
Lucy LB (1977) A numerical approach to the testing of the fission hypothesis. Astron J 82: 1013–1024. doi: 10.1086/112164
![]() |
[69] |
Cabibbo N, Iwasaki Y, Schilling K (1999) High performance computing in lattice QCD. Parallel Comput 25: 1197–1198. doi: 10.1016/S0167-8191(99)00045-9
![]() |
[70] |
Evertz HG (2003) The loop algorithm. Adv Phys 52: 1–66. doi: 10.1080/0001873021000049195
![]() |
[71] | Holm EA, Battaile CC (2001) The computer simulation of microstructural evolution. JOM 53: 20–23. |
[72] |
Nielsen SO, Lopez CF, Srinivas G, et al. (2004) Coarse grain models and the computer simulation of soft materials. J Phys-Condens Mat 16: 481–512. doi: 10.1088/0953-8984/16/15/R03
![]() |
[73] |
Praprotnik M, Site LD, Kremer K (2008) Multiscale simulation of soft matter: From scale bridging to adaptive resolution. Annu Rev Phys Chem 59: 545–571. doi: 10.1146/annurev.physchem.59.032607.093707
![]() |
[74] | Karimi-Varzaneh HA, Müller-Plathe F (2011) Coarse-Grained Modeling for Macromolecular Chemistry, In: Kirchner B, Vrabec J, Topics in Current Chemistry, Berlin, Heidelberg: Springer, 326–321. |
[75] | Müller-Plathe F (2002) Coarse-graining in polymer simulation: from the atomistic to the mesoscopic scale and back. Chem Phys Chem 3: 755–769. |
[76] |
Abraham FF, Broughton JQ, Broughton JQ, et al. (1998) Spanning the length scales in dynamic simulation. Comp Phys 12: 538–546. doi: 10.1063/1.168756
![]() |
[77] |
Abraham FF, Brodbeck D, Rafey R, et al. (1994) Instability dynamics of fracture: A computer simulation investigation. Phys Rev Lett 73: 272–275. doi: 10.1103/PhysRevLett.73.272
![]() |
[78] |
Abraham FF, Brodbeck D, Rudge WE, et al. (1998) Ab initio dynamics of rapid fracture. Model Simul Mater Sc 6: 639–670. doi: 10.1088/0965-0393/6/5/010
![]() |
[79] | Warshel A, LevittM(1976) Theoretical studies of enzymic reactions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. J Mol Biol 103: 227–249. |
[80] | Winkler RG, Steinhauser MO, Reineker P (2002) Complex formation in systems of oppositely charged polyelectrolytes: a molecular dynamics simulation study. Phys Rev E 66: 021802. |
[81] |
Dünweg B, Reith D, Steinhauser M, et al. (2002) Corrections to scaling in the hydrodynamic properties of dilute polymer solutions. J Chem Phys 117: 914–924. doi: 10.1063/1.1483296
![]() |
[82] |
Stevens MJ (2004) Coarse-grained simulations of lipid bilayers. J Chem Phys 121: 11942–11948. doi: 10.1063/1.1814058
![]() |
[83] | Steinhauser MO (2005) A molecular dynamics study on universal properties of polymer chains in different solvent qualities. Part I. A review of linear chain properties. J Chem Phys 122: 094901. |
[84] |
Steinhauser MO, Hiermaier S (2009) A Review of Computational Methods in Materials Science: Examples from Shock-Wave and Polymer Physics. Int J Mol Sci 10: 5135–5216. doi: 10.3390/ijms10125135
![]() |
[85] |
Goetz R, Gompper G, Lipowsky R (1999) Mobility and elasticity of self-assembled membranes. Phys Rev Lett 82: 221–224. doi: 10.1103/PhysRevLett.82.221
![]() |
[86] |
Lipowsky R (2004) Biomimetic membrane modelling: pictures from the twilight zone. Nat Mater 3: 589–591. doi: 10.1038/nmat1208
![]() |
[87] |
Lyubartsev AP (2005) Multiscale modeling of lipids and lipid bilayers. Eur Biophys J 35: 53–61. doi: 10.1007/s00249-005-0005-y
![]() |
[88] |
Orsi M, Michel J, Essex JW (2010) Coarse-grain modelling of DMPC and DOPC lipid bilayers. J Phys-Condens Mat 22: 155106. doi: 10.1088/0953-8984/22/15/155106
![]() |
[89] | Steinhauser MO (2012) Introduction to Molecular Dynamics Simulations: Applications in Hard and Soft Condensed Matter Physics, InTech. |
[90] | Alberts B, Bray D, Johnson A, et al. (2000) Molecular Biology of the Cell, 4 Edition, New York: Garland Science, Taylor and Francis Group. |
[91] |
Steinhauser MO, Steinhauser MO, Schmidt M (2014) Destruction of cancer cells by laserinduced shock waves: recent developments in experimental treatments and multiscale computer simulations. Soft Matter 10: 4778–4788. doi: 10.1039/C4SM00407H
![]() |
[92] | Tozzini V (2004) Coarse-grained models for proteins. Curr Opin Struc Biol 15: 144–150. |
[93] |
Ayton GS, Noid WG, Voth GA (2007) Multiscale modeling of biomolecular systems: in serial and in parallel. Curr Opin Struc Biol 17: 192–198. doi: 10.1016/j.sbi.2007.03.004
![]() |
[94] |
Forrest LR, Sansom MS (2000) Membrane simulations: bigger and better? Curr Opin Struc Biol 10: 174–181. doi: 10.1016/S0959-440X(00)00066-X
![]() |
[95] |
Woods CJ, Mulholland AJ (2008) Multiscale modelling of biological systems. RSC Special Periodicals Report: Chemical Modelling, Applications and Theory 5: 13–50. doi: 10.1039/b608778g
![]() |
[96] | Steinhauser MO (editor) (2016) Special Issue of the Journal Materials: Computational Multiscale Modeling and Simulation in Materials Science. Available from: http://www.mdpi.com/journal/materials/special issues/modeling and simulation. |
[97] |
Brendel W (1986) Shock Waves: A New Physical Principle in Medicine. Eur Surg Res 18: 177–180. doi: 10.1159/000128523
![]() |
[98] | Wang CJ (2003) An overview of shock wave therapy in musculoskeletal disorders. Chang Gung Med J 26: 220–232. |
[99] |
Wang ZJZ, DesernoM (2010) A systematically coarse-grained solvent-free model for quantitative phospholipid bilayer simulations. J Phys Chem B 114: 11207–11220. doi: 10.1021/jp102543j
![]() |
[100] |
Wang ZB, Wu J, Fang LQ, et al. (2011) Preliminary ex vivo feasibility study on targeted cell surgery by high intensity focused ultrasound (HIFU). Ultrasonics 51: 369–375. doi: 10.1016/j.ultras.2010.11.002
![]() |
[101] |
Wang S, Frenkel V, Zderic V (2011) Optimization of pulsed focused ultrasound exposures for hyperthermia applications. J Acoust Soc Am 130: 599–609. doi: 10.1121/1.3598464
![]() |
[102] |
Paul W, Smith GD, Yoon DY (1997) Static and dynamic properties of an-C100H202 melt from molecular dynamics simulations. Macromolecules 30: 7772–7780. doi: 10.1021/ma971184d
![]() |
[103] |
Kreer T, Baschnagel J, Mueller M, et al. (2001) Monte Carlo Simulation of long chain polymer melts: Crossover from Rouse to reptation dynamics. Macromolecules 34: 1105–1117. doi: 10.1021/ma001500f
![]() |
[104] |
Krushev S, Paul W, Smith GD (2002) The role of internal rotational barriers in polymer melt chain dynamics. Macromolecules 35: 4198–4203. doi: 10.1021/ma0115794
![]() |
[105] |
Bulacu M, van der Giessen E (2005) Effect of bending and torsion rigidity on self-diffusion in polymer melts: A molecular-dynamics study. J Chem Phys 123: 114901. doi: 10.1063/1.2035086
![]() |
[106] | Kratky O, Porod G (1949) Röntgenuntersuchung gelöster Fadenmoleküle. Recl Trva Chim Pays-Bas 68: 1106–1122. |
[107] | Doi M, Edwards SF (1986) The Theory of Polymer Dynamics, Oxford: Clarendon Press. |
[108] |
Harris RA, Hearst JE (1966) On Polymer Dynamics. J Chem Phys 44: 2595–2602. doi: 10.1063/1.1727098
![]() |
[109] | Hearst JE, Harris RA (1967) On Polymer Dynamics. III. Elastic Light Scattering. J Chem Phys 46: 398–398. |
[110] |
Harnau L, Winkler RG, Reineker P (1997) Influence of stiffness on the dynamics of macromolecules in a melt. J Chem Phys 106: 2469–2476. doi: 10.1063/1.473154
![]() |
[111] |
Harnau L, WInkler RG, Reineker P (1999) On the dynamics of polymer melts: Contribution of Rouse and bending modes. EPL 45: 488–494. doi: 10.1209/epl/i1999-00193-6
![]() |
[112] |
Steinhauser MO (2008) Static and dynamic scaling of semiflexible polymer chains-a molecular dynamics simulation study of single chains and melts. Mech Time-Depend Mat 12: 291–312. doi: 10.1007/s11043-008-9062-9
![]() |
[113] |
Guenza M (2003) Cooperative dynamics in semiflexibile unentangled polymer fluids. J Chem Phys 119: 7568–7578. doi: 10.1063/1.1606674
![]() |
[114] | Piran T (2004) Statistical Mechanics of Membranes and Interfaces, 2 edition, World Scientific Publishing Co., Inc. |
[115] |
Schindler T, Kröner D, Steinhauser MO (2016) On the dynamics of molecular self-assembly and the structural analysis of bilayer membranes using coarse-grained molecular dynamics simulations. BBA-Biomembranes 1858: 1955–1963. doi: 10.1016/j.bbamem.2016.05.014
![]() |
[116] |
Brannigan G, Lin LCL, Brown FLH (2006) Implicit solvent simulation models for biomembranes. Eur Biophys J 35: 104–124. doi: 10.1007/s00249-005-0013-y
![]() |
[117] |
Chang R, Ayton GS, Voth GA (2005) Multiscale coupling of mesoscopic- and atomistic-level lipid bilayer simulations. J Chem Phys 122: 244716. doi: 10.1063/1.1931651
![]() |
[118] |
Huang MJ, Kapral R, Mikhailov AS, et al. (2012) Coarse-grain model for lipid bilayer selfassembly and dynamics: Multiparticle collision description of the solvent. J Chem Phys 137: 055101. doi: 10.1063/1.4736414
![]() |
[119] |
Pandit SA, Scott HL (2009) Multiscale simulations of heterogeneous model membranes. BBA-Biomembranes 1788: 136–148. doi: 10.1016/j.bbamem.2008.09.004
![]() |
[120] |
Farago O (2003) "Water-free" computer model for fluid bilayer membranes. J Chem Phys 119: 596–605. doi: 10.1063/1.1578612
![]() |
[121] |
Brannigan G, Philips PF, Brown FLH (2005) Flexible lipid bilayers in implicit solvent. Phys Rev E 72: 011915. doi: 10.1103/PhysRevE.72.011915
![]() |
[122] |
Yuan H, Huang C, Li J, et al. (2010) One-particle-thick, solvent-free, coarse-grained model for biological and biomimetic fluid membranes. Phys Rev E 82: 011905. doi: 10.1103/PhysRevE.82.011905
![]() |
[123] |
Noguchi H (2011) Solvent-free coarse-grained lipid model for large-scale simulations. J Chem Phys 134: 055101. doi: 10.1063/1.3541246
![]() |
[124] |
Weiner SJ, Kollman PA, Case DA, et al. (1984) A new force field for molecular mechanical simulation of nucleic acids and proteins. J Am Chem Soc 106: 765–784. doi: 10.1021/ja00315a051
![]() |
[125] |
Paul W, Yoon DY, Smith GD, et al. (1995) An Optimized United Atom Model for Simulations of Polymethylene Melts. J Chem Phys 103: 1702–1709. doi: 10.1063/1.469740
![]() |
[126] |
Siu SWI, Vácha R, Jungwirth P, et al. (2008) Biomolecular simulations of membranes: physical properties from different force fields. J Phys Chem 128: 125103. doi: 10.1063/1.2897760
![]() |
[127] |
Drouffe JM, Maggs AC, Leibler S, et al. (1991) Computer simulations of self-assembled membranes. Science 254: 1353–1356. doi: 10.1126/science.1962193
![]() |
[128] |
Goetz R, Lipowsky R (1998) Computer simulations of bilayer membranes: Self-assembly and interfacial tension. J Chem Phys 108: 7397–7409. doi: 10.1063/1.476160
![]() |
[129] |
Noguchi H, Takasu M (2001) Self-assembly of amphiphiles into vesicles: A Brownian dynamics simulation. Phys Rev E 64: 041913. doi: 10.1103/PhysRevE.64.041913
![]() |
[130] |
Bourov GK, Bhattacharya A (2005) Brownian dynamics simulation study of self-assembly of amphiphiles with large hydrophilic heads. J Chem Phys 122: 44702. doi: 10.1063/1.1834495
![]() |
[131] |
Steinhauser MO, Grass K, Thoma K, et al. (2006) Impact dynamics and failure of brittle solid states by means of nonequilibrium molecular dynamics simulations. EPL 73: 62–68. doi: 10.1209/epl/i2005-10353-2
![]() |
[132] |
Yang S, Qu J (2014) Coarse-grained molecular dynamics simulations of the tensile behavior of a thermosetting polymer. Phys Rev E 90: 012601. doi: 10.1103/PhysRevE.90.012601
![]() |
[133] | Eslami H, Müller-Plathe F (2013) How thick is the interphase in an ultrathin polymer film? Coarse-grained molecular dynamics simulations of polyamide-6,6 on graphene. J Phys Chem 117: 5249–5257. |
[134] |
Ganzenm¨uller GC, Hiermaier S, Steinhauser MO (2011) Shock-wave induced damage in lipid bilayers: a dissipative particle dynamics simulation study. Soft Matter 7: 4307–4317. doi: 10.1039/c0sm01296c
![]() |
[135] |
Huang WX, Chang CB, Sung HJ (2012) Three-dimensional simulation of elastic capsules in shear flow by the penalty immersed boundary method. J Comput Phys 231: 3340–3364. doi: 10.1016/j.jcp.2012.01.006
![]() |
[136] | Pazzona FG, Demontis P (2012) A grand-canonical Monte Carlo study of the adsorption properties of argon confined in ZIF-8: local thermodynamic modeling. J Phys Chem 117: 349–357. |
[137] |
Pogodin S, Baulin VA (2010) Coarse-grained models of phospholipid membranes within the single chain mean field theory. Soft Matter 6: 2216–2226. doi: 10.1039/b927437e
![]() |
[138] |
Wang Y, Sigurdsson JK, Brandt E, et al. (2013) Dynamic implicit-solvent coarse-grained models of lipid bilayer membranes: fluctuating hydrodynamics thermostat. Phys Rev E 88: 023301. doi: 10.1103/PhysRevE.88.023301
![]() |
[139] |
Koshiyama K, Kodama T, Yano T, et al. (2006) Structural Change in Lipid Bilayers and Water Penetration Induced by Shock Waves: Molecular Dynamics Simulations. Biophys J 91: 2198–2205. doi: 10.1529/biophysj.105.077677
![]() |
[140] |
Koshiyama K, Kodama T, Yano T, et al. (2008) Molecular dynamics simulation of structural changes of lipid bilayers induced by shock waves: Effects of incident angles. BBA-Biomembranes 1778: 1423–1428. doi: 10.1016/j.bbamem.2008.03.010
![]() |
[141] |
Lechuga J, Drikakis D, Pal S (2008) Molecular dynamics study of the interaction of a shock wave with a biological membrane. Int J Numer Mech Fluids 57: 677–692. doi: 10.1002/fld.1588
![]() |
[142] |
Kodama T, Kodama T, Hamblin MR, et al. (2000) Cytoplasmic molecular delivery with shock waves: importance of impulse. Biophys J 79: 1821–1832. doi: 10.1016/S0006-3495(00)76432-0
![]() |
[143] |
Doukas AG, McAuliffe DJ, Lee S, et al. (1995) Physical factors involved in stress-wave-induced cell injury: The effect of stress gradient. Ultrasound Med Biol 21: 961–967. doi: 10.1016/0301-5629(95)00027-O
![]() |
[144] |
Doukas AG, Flotte TJ (1996) Physical characteristics and biological effects of laser-induced stress waves. Ultrasound Med Biol 22: 151–164. doi: 10.1016/0301-5629(95)02026-8
![]() |
[145] |
Lee S, Doukas AG (1999) Laser-generated stress waves and their effects on the cell membrane. IEEE J Sel Top Quant 5: 997–1003. doi: 10.1109/2944.796322
![]() |
[146] |
Español P (1997) Dissipative Particle Dynamics with energy conservation. EPL 40: 631–636. doi: 10.1209/epl/i1997-00515-8
![]() |
[147] |
Steinhauser MO, Schindler T (2017) Particle-based simulations of bilayer membranes: selfassembly, structural analysis, and shock-wave damage. Comp Part Mech 4: 69–86. doi: 10.1007/s40571-016-0126-3
![]() |
[148] | Hansen JP, McDonald IR (2005) Theory of Simple Liquids, Academic Press. |
Initial description | Gender:Female; Age:67; Marital Status:Married; Family History:None; T:36.8 ℃; P:62 beats/min; R:18 beats/min; Bp:140/90 mmHg |
Numerical information | {(0, 67, 1, 0, 36.8, 62, 18, 140, 90)} |
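The mapping from the initial description to the numerical tuple can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function name `encode` and the lookup tables are hypothetical, and only the codes visible in the example row (Female → 0, Married → 1, no family history → 0) come from the table. All other codes are assumed placeholders.

```python
import re

# Category codes. Only Female -> 0, Married -> 1 and None -> 0 appear in the
# example row; every other code here is an assumed placeholder.
GENDER = {"Female": 0, "Male": 1}
MARITAL = {"Married": 1, "Unmarried": 0}
FAMILY_HISTORY = {"None": 0}


def encode(description: str) -> tuple:
    """Turn a 'Key:Value; Key:Value; ...' description into a numeric tuple."""
    fields = {}
    for part in description.split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            fields[key.strip()] = value.strip()

    def number(text: str):
        # Keep the leading number and drop the unit, e.g. "62 beats/min" -> 62.
        value = float(re.match(r"[-+]?\d+(?:\.\d+)?", text).group())
        return int(value) if value.is_integer() else value

    # Blood pressure is written as "systolic/diastolic unit".
    systolic, diastolic = fields["Bp"].split()[0].split("/")
    return (
        GENDER[fields["Gender"]],
        number(fields["Age"]),
        MARITAL[fields["Marital Status"]],
        FAMILY_HISTORY.get(fields["Family History"], 1),  # assumed: any history -> 1
        number(fields["T"]),
        number(fields["P"]),
        number(fields["R"]),
        int(systolic),
        int(diastolic),
    )
```

Applied to the description in the table, this sketch yields the nine values of the numerical row in the same order: gender, age, marital status, family history, temperature, pulse, respiration, systolic and diastolic pressure.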
Disease | Training set | Test set | Validation set |
CI | 700 | 200 | 100 |
VBI | 700 | 200 | 100 |
CAHD | 700 | 200 | 100 |
bronchitis | 700 | 200 | 100 |
degenerative spondylitis | 700 | 200 | 100 |
T2DM | 700 | 200 | 100 |
other diseases | 700 | 200 | 100 |
cholecystitis | 511 | 146 | 73 |
intestinal obstruction | 392 | 112 | 56 |
Disease | Avg number | Max number | Min number |
CI | 16.83 | 23 | 9 |
VBI | 13.71 | 20 | 8 |
CAHD | 15.16 | 27 | 7 |
bronchitis | 17.47 | 29 | 6 |
degenerative spondylitis | 12.18 | 14 | 5 |
T2DM | 16.38 | 32 | 8 |
other diseases | 16.08 | 33 | 6 |
cholecystitis | 18.94 | 30 | 9 |
intestinal obstruction | 16.16 | 27 | 8 |
Method | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
SVM | 89.06 | 87.39 | 86.91 | 87.15 |
CNN | 89.55 | 88.44 | 87.55 | 87.99 |
BiLSTM | 89.39 | 88.53 | 86.97 | 87.74 |
RCNN | 89.76 | 89.04 | 87.51 | 88.27 |
BERT | 91.68 | 90.83 | 89.26 | 90.04 |
Our model | 94.66 | 93.62 | 90.28 | 91.92 |
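The F1-score column above can be checked against the precision and recall columns, assuming the standard definition of F1 as the harmonic mean of precision and recall (the table itself does not state the formula). The helper names `f1_score` and `max_deviation` are illustrative only. The numbers are copied from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the table above.
REPORTED = {
    "SVM": (87.39, 86.91, 87.15),
    "CNN": (88.44, 87.55, 87.99),
    "BiLSTM": (88.53, 86.97, 87.74),
    "RCNN": (89.04, 87.51, 88.27),
    "BERT": (90.83, 89.26, 90.04),
    "Our model": (93.62, 90.28, 91.92),
}


def max_deviation(rows=REPORTED) -> float:
    """Largest absolute gap between recomputed and reported F1 values."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for name, (p, r, f1) in REPORTED.items():
        print(f"{name:10s} computed {f1_score(p, r):.2f}  reported {f1:.2f}")
```

Under that assumption, every recomputed value agrees with the reported one to two decimal places, so the three columns are mutually consistent.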
Disease | SVM | CNN | BiLSTM | RCNN | BERT | Our model |
CI | 82.26 | 82.91 | 82.94 | 83.38 | 86.41 | 88.78 |
VBI | 82.45 | 83.19 | 83.83 | 86.27 | 87.18 | 88.43 |
CAHD | 89.72 | 90.74 | 90.43 | 90.88 | 92.03 | 93.83 |
bronchitis | 89.79 | 89.87 | 90.17 | 90.59 | 93.48 | 95.15 |
degenerative spondylitis | 89.91 | 90.14 | 89.81 | 90.41 | 90.22 | 91.53 |
T2DM | 90.37 | 90.91 | 89.85 | 89.49 | 91.48 | 93.43 |
other diseases | 84.27 | 86.23 | 85.17 | 85.94 | 88.54 | 89.96 |
cholecystitis | 88.69 | 89.72 | 88.75 | 88.35 | 90.91 | 93.52 |
intestinal obstruction | 86.89 | 88.23 | 88.72 | 89.14 | 90.18 | 92.68 |
Method | Precision (%) | Recall (%) | F1-score (%) |
T + E + N | 93.62 | 90.28 | 91.92 |
T + E | 92.18 | 90.18 | 91.17 |
T + N | 91.69 | 89.06 | 90.36 |
E + N | 91.19 | 90.47 | 90.83 |
Model | Precision (%) | Recall (%) | F1-score (%) |
Word2vec+Doc2vec | 89.15 | 88.36 | 88.75 |
Our model | 93.62 | 90.28 | 91.92 |