
Non-existence of solutions to some degenerate coercivity elliptic equations involving measures data

  • In this paper, we mainly consider the non-existence of solutions u obtained by approximation to the following quasilinear elliptic problem with principal part having degenerate coercivity:

    $$\begin{cases} -\operatorname{div}\left(\dfrac{|\nabla u|^{p-2}\nabla u}{(1+|u|)^{(p-1)\theta}}\right)+|u|^{q-1}u=\lambda, & x\in\Omega,\\ u=0, & x\in\partial\Omega, \end{cases}$$

    provided

    $$q>\frac{r(p-1)[1+\theta(p-1)]}{r-p},$$

    where $\Omega$ is a bounded smooth subset of $\mathbb{R}^N$ ($N>2$), $1<p<N$, $q>1$, $0\le\theta<1$, and $\lambda$ is a measure concentrated on a set of zero $r$-capacity ($p<r\le N$).

    Citation: Maoji Ri, Shuibo Huang, Canyun Huang. Non-existence of solutions to some degenerate coercivity elliptic equations involving measures data[J]. Electronic Research Archive, 2020, 28(1): 165-182. doi: 10.3934/era.2020011

    The number of users of Internet medical service platforms is growing every year. One example is Guahao.COM, which is supported by the National Health and Family Planning Commission of China (NHFPC). As of July 2014, the Guahao website had been connected to the information systems of more than 900 key hospitals in 23 provinces across the country, with more than 30 million real-name registered users and more than 100 thousand experts from key hospitals. The Guahao website is committed to connecting hospitals, doctors and patients via the Internet and promoting the efficient sharing of information among them. The platform naturally presents a relatively clear network structure [1], covering data of multiple dimensions such as the numbers of people and hospitals and the satisfaction ratings. By exploiting these data and analyzing them effectively, we can help patients understand the competence of each doctor and find a more suitable doctor for themselves, better meeting their individual needs and improving the quality of the medical platform. However, it is difficult to find the most effective doctors for patients with specific diseases from these complex data.

    In this research, we use patients' review information to find the most eligible doctor for a specific disease. We believe that doctors with many positive comments from patients perform better, and vice versa. We build a medical heterogeneous information network in which we predict each doctor's real performance on a specific disease, which we call the experience score. The higher the score, the better the doctor's performance. The medical heterogeneous information network [3] is shown in Figure 1, where nodes of different types represent doctors and diseases, and edges represent doctors' treatment of diseases.

    Figure 1.  Doctor-disease heterogeneous information network.

    Firstly, to better capture the performance of doctors, we embed two kinds of features. (i) Embedding information: a patient publishes comments on the doctor's personal home page, expressing criticism, appreciation or suggestions based on his/her medical experience. (ii) Embedding network: we extract the doctor-patient-disease heterogeneous network [4] from millions of review records, each of which represents a patient visiting a doctor for a specific disease.

    Secondly, in order to obtain more subtle and abstract features of doctors and diseases, we use an autoencoder [5] to learn the embedded representation of the network. The autoencoder is an important training model in deep learning [2,6] that automatically learns data representations by attempting to reconstruct its inputs at the output layer. At the same time, when we embed textual features in the network, we use the autoencoder to learn a general representation of the textual information and extract hidden features, which avoids the high-dimensional and sparse textual content.

    Finally, considering that a doctor may be good at several specific diseases and that a patient evaluates only a limited number of doctors, we adopt Extreme Gradient Boosting (XGBoost) [7,8] for the sparse heterogeneous network. The XGBoost algorithm has many advantages needed in this paper, including the ability to build models relatively quickly, to process continuous and categorical data naturally, to handle missing data naturally, to be robust to outliers in the input, and to scale well on large datasets.

    In brief, we use a large number of patients' evaluations of doctors to propose a new method named SeekDoc, which predicts the experience score of a doctor for a specific disease, so as to find the most effective doctors for that disease. We summarize our contributions as follows:

    ● We build a doctor-disease heterogeneous network to explore the potential links between the disease and the doctor, and then collect reviews and rating records from patients and utilize network-embedding techniques to represent doctor-disease vector pairs.

    ● We use an autoencoder to obtain latent features. Then, the XGBoost algorithm is adopted to predict doctors' experience scores for specific diseases.

    ● Experiment results with real-world large-scale datasets demonstrate the effectiveness of SeekDoc.

    The rest of the paper is organized as follows. Preliminaries are presented in Section 2, followed by Section 3, which describes the proposed method SeekDoc in detail. In Section 4, we describe the experiments and evaluations. Section 5 describes related work. We conclude the paper and discuss future work in Section 6.

    We aim to predict the best doctor for a specific disease. Before presenting the proposed algorithm, we briefly show the required notations and definitions.

    Definition 1. (Heterogeneous Medical Information Network, HMIN) In the process of medical diagnosis, the network of participants connected by various relationships is a heterogeneous medical information network. Denote the number of node types, covering the main medical participants, by $T$. $A$ represents the set of nodes, $A=\{A_t\}_{t=1}^{T}$, where $A_t$ is the set of nodes of type $t$. An undirected network is usually represented as a graph $G(V,E,W)$, where the node set $V=A$, the edge set $E$ is a binary relation on $V$, and $W$ is a weight mapping defined on the edges $E$. When $T\ge 2$, the information network is called a heterogeneous information network.

    Definition 2. (Doctor-Disease Network, 2DN) A 2DN is defined as a graph $G(V_{doctor},V_{disease},E,W)$, where $V_{doctor}$ and $V_{disease}$ are two sets of nodes that represent doctors and diseases, $E$ is the set of links, each of which connects a doctor and a disease, and $W$ is the corresponding set of weights.
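
    As a concrete illustration of Definition 2, the sketch below builds a tiny weighted doctor-disease network with the Networkx library used in our preprocessing environment; the node names and weights are hypothetical placeholders, not values from the dataset.

```python
# A minimal sketch of a 2DN (Definition 2): doctor nodes, disease nodes,
# and weighted edges. All node names and weights below are hypothetical.
import networkx as nx

G = nx.Graph()
G.add_nodes_from(["doctor_1", "doctor_2"], node_type="doctor")            # V_doctor
G.add_nodes_from(["disease_flu", "disease_asthma"], node_type="disease")  # V_disease
G.add_edge("doctor_1", "disease_flu", weight=0.8)     # an edge in E with its weight in W
G.add_edge("doctor_2", "disease_asthma", weight=0.5)

print(G["doctor_1"]["disease_flu"]["weight"])         # -> 0.8
```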

    Based on the two definitions, we further use the comment text to quantify the weight between two adjacent nodes. We select this measure for two main reasons. First, comment text provides an intuitive way to characterize the node relationship: generally, the better a doctor performs on a disease, the more positive the comments the patient will leave about the doctor. Second, text-to-vector methods are mature, so it is reasonable to derive numerical representations from text.

    Definition 3. (Doctor-Disease Network Weight, 2DNW) Given a pair of a doctor and a disease (i.e., the $i$th doctor in $V_{doctor}$ and the $j$th disease in $V_{disease}$), SeekDoc collects their textual weights, represented by $w_i$ and $w_j$ respectively, from the textual comments.

    SeekDoc first aggregates all comments on the $i$th doctor from the online content data, then it picks keywords, denoted as $word_{i1}, word_{i2}, \ldots, word_{ip}$ (suppose $p$ keywords are found in total), from the aggregated comments. Using the Word2Vec [9] technique, these $p$ keywords are then converted to vectors $v_{i1}, v_{i2}, \ldots, v_{ip}$, respectively. The weight $w_{i1}$ characterizing the comments that patients left for the $i$th doctor is taken as the center of the keyword vectors, i.e.,

    $w_{i1}=\frac{1}{p}\sum_{k=1}^{p}v_{ik}$ (2.1)

    In a similar way, SeekDoc aggregates the comments containing the name of a disease, then extracts keywords from these comments. The keywords are converted to vectors $v_{j1}, v_{j2}, \ldots, v_{jp}$ through Word2Vec. Then the center of these keyword vectors is used to characterize the $j$th disease. We denote this weight as $w_{j1}$.
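
    As a sketch of how the textual weight in Eq. (2.1) can be computed, the snippet below averages the Word2Vec vectors of a keyword list with gensim; the toy corpus, keyword list and parameter values are illustrative assumptions rather than the exact pipeline used in the paper.

```python
# A minimal sketch of Eq. (2.1): the textual weight is the centroid of the
# Word2Vec vectors of the extracted keywords. Corpus and keywords are toy data.
import numpy as np
from gensim.models import Word2Vec

def textual_weight(keywords, w2v_model):
    """Return the mean Word2Vec vector of the keywords, i.e. w_i = (1/p) * sum_k v_ik."""
    vectors = [w2v_model.wv[w] for w in keywords if w in w2v_model.wv]
    if not vectors:                                   # no keyword found in the vocabulary
        return np.zeros(w2v_model.wv.vector_size)
    return np.mean(vectors, axis=0)

# Toy usage: train a small Word2Vec model on tokenized comments.
corpus = [["patient", "good", "recovery"], ["kind", "doctor", "effective"]]
model = Word2Vec(sentences=corpus, vector_size=50, min_count=1, seed=1)
w_i1 = textual_weight(["good", "effective"], model)   # centroid of keyword vectors
```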

    In this section, we describe the framework of SeekDoc. Afterward, we introduce two parts of the framework in detail.

    The framework of SeekDoc is shown in Figure 2. First, we build a heterogeneous information network with doctor and disease data and use the Node2Vec* [10] technique to obtain representations of doctors and diseases. At the same time, we use Word2Vec† [9] to transform the comment data into a feature representation. Then we concatenate these two feature representations and apply an autoencoder to learn a reconstructed embedded feature. At last, we use Extreme Gradient Boosting on the reconstructed feature to improve the accuracy of the prediction and compute the experience score of a doctor for a specific disease.

    * https://github.com/eliorc/node2vec/

    † https://github.com/3Top/word2vec-api/

    Figure 2.  The framework of SeekDoc.
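
    The following sketch illustrates the feature-construction step of Figure 2 under simplifying assumptions: node embeddings are learned from a toy doctor-disease graph with the node2vec package cited above and then concatenated with the textual weights of Eq. (2.1); the graph, embedding dimensions and walk parameters are placeholders.

```python
# A minimal sketch of the Figure 2 feature step: Node2Vec embeddings of the
# doctor-disease graph concatenated with textual weights. Toy graph and
# parameter values are illustrative assumptions.
import networkx as nx
import numpy as np
from node2vec import Node2Vec

G = nx.Graph()
G.add_edge("doctor_1", "disease_flu", weight=1.0)
G.add_edge("doctor_2", "disease_flu", weight=2.0)
G.add_edge("doctor_2", "disease_asthma", weight=1.0)

n2v = Node2Vec(G, dimensions=32, walk_length=10, num_walks=50, workers=1)
emb = n2v.fit(window=5, min_count=1)     # returns a gensim Word2Vec model over node names

def pair_feature(doctor, disease, w_doctor_text, w_disease_text):
    """Concatenate network embeddings and textual weights for a (doctor, disease) pair."""
    return np.concatenate([emb.wv[doctor], emb.wv[disease],
                           w_doctor_text, w_disease_text])

x = pair_feature("doctor_1", "disease_flu", np.zeros(50), np.zeros(50))
```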

    We design an autoencoder structure to learn the embedded features. The basic framework of an autoencoder is a neural network, which includes an input layer, an output layer and at least one hidden layer. A single-hidden-layer autoencoder is illustrated in Figure 3.

    Figure 3.  Single hidden layer autoencoder.

    In Figure 3, for an input $X$, assuming $X=(x_1,x_2,\ldots,x_d)$ with $x_i\in[0,1]$, the autoencoder first maps the input $X$ to a hidden layer, representing it as $H=(h_1,h_2,\ldots,h_d)$ with $h_i\in[0,1]$; this process is called encoding. The specific form of the output $H$ of the hidden layer is

    $H=\sigma_1(W_1X+b_1)$ (3.1)

    where $W_1$ represents the weight matrix between the input layer and the hidden layer, $b_1$ represents the bias vector, and $\sigma_1$ is a nonlinear mapping, usually an activation function. At this layer, we use the Rectified Linear Unit (ReLU) [11]. The ReLU activation function is defined as follows:

    $\sigma_1(x)=\max(0,x)$ (3.2)

    The output $H$ is called the hidden-layer variable, and a variable $Z$ is reconstructed from it. Here, the output $Z$ has the same structure as the input $X$, and this process is called decoding. The specific form of the output layer $Z$ is

    $Z=\sigma_2(W_2H+b_2)$ (3.3)

    where $\sigma_2$ is the Tanh activation function, which is defined as

    $\sigma_2(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$ (3.4)

    The output of the output layer can be thought of as a prediction of the raw data $X$ using the feature $H$. The weight matrix $W_2$ in the decoding process can be regarded as the inverse of the encoding process, that is, $W_2=W_1^{T}$; $b_2$ represents the corresponding bias vector.

    In order to minimize the reconstruction error between the reconstructed Z and the original X, we define its loss function:

    $l=\|X-Z\|^{2}$ (3.5)
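
    A minimal PyTorch sketch of the single-hidden-layer autoencoder defined by Eqs. (3.1)-(3.5), with a ReLU encoder, a Tanh decoder using tied weights ($W_2=W_1^{T}$) and a squared-error reconstruction loss, is given below; the layer sizes and the random input batch are assumptions for illustration.

```python
# A sketch of the tied-weight autoencoder of Eqs. (3.1)-(3.5).
# Layer sizes and the input batch are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedAutoencoder(nn.Module):
    def __init__(self, in_dim=128, hidden_dim=32):
        super().__init__()
        self.W1 = nn.Parameter(torch.randn(hidden_dim, in_dim) * 0.01)  # encoder weights
        self.b1 = nn.Parameter(torch.zeros(hidden_dim))
        self.b2 = nn.Parameter(torch.zeros(in_dim))

    def forward(self, x):
        h = F.relu(F.linear(x, self.W1, self.b1))           # Eqs. (3.1)-(3.2)
        z = torch.tanh(F.linear(h, self.W1.t(), self.b2))   # Eqs. (3.3)-(3.4), W2 = W1^T
        return h, z

model = TiedAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 128)              # a toy batch of concatenated embedded features
h, z = model(x)
loss = F.mse_loss(z, x)              # Eq. (3.5): squared reconstruction error (mean form)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```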

    Extreme Gradient Boosting (XGBoost) is an ensemble learning algorithm based on gradient boosting. Its principle is to achieve accurate classification through the iterative combination of weak classifiers. The gradient boosting machine algorithm uses the idea of gradient descent when generating each tree: based on all the trees generated in the previous steps, it moves in the direction that minimizes the given objective function. XGBoost is an implementation of the gradient boosting machine that automatically leverages multi-threading of the CPU for parallelism and improves the algorithm.

    Then, we introduce the model construction of XGBoost.

    Given a data set $D=\{(x_i,y_i)\}$ ($|D|=n$, $x_i\in\mathbb{R}^m$, $y_i\in\mathbb{R}$), the tree ensemble model is represented as:

    $\hat{y}_i=\sum_{k=1}^{K}f_k(x_i),\quad f_k\in\mathcal{F}$ (3.6)

    where $\mathcal{F}=\{f(x)=w_{q(x)}\}$ ($q:\mathbb{R}^m\rightarrow T$, $w\in\mathbb{R}^{T}$) is the function space of regression trees, $x_i$ represents the feature vector of the $i$th data point, $K$ represents the number of trees, $q$ maps each sample to the index of the leaf it falls into, $T$ represents the number of leaves, and each $f_k$ corresponds to an independent tree structure $q$ and leaf weights $w$.

    The original objective function form is as follows:

    $Obj(\Theta)=\sum_{i}^{n}l(y_i,\hat{y}_i)+\sum_{k=1}^{K}\Omega(f_k)$ (3.7)

    where the first part is the training error between the predicted value $\hat{y}_i$ and the true value $y_i$, and the second part is the sum of the complexities of the trees. The complexity is computed as:

    $\Omega(f)=\gamma T+\frac{1}{2}\lambda\sum_{j=1}^{T}w_j^{2}$ (3.8)

    Then, at each step, we add a new function while keeping the model obtained so far. The detailed process is as follows:

    $\hat{y}_i^{(0)}=0$ (3.9)
    $\hat{y}_i^{(1)}=f_1(x_i)=\hat{y}_i^{(0)}+f_1(x_i)$ (3.10)
    $\hat{y}_i^{(2)}=f_1(x_i)+f_2(x_i)=\hat{y}_i^{(1)}+f_2(x_i)$ (3.11)
    $\hat{y}_i^{(t)}=\sum_{k=1}^{t}f_k(x_i)=\hat{y}_i^{(t-1)}+f_t(x_i)$ (3.12)

    $\hat{y}_i^{(t)}$ is the model prediction for the $i$th sample at the $t$th iteration; it retains the prediction $\hat{y}_i^{(t-1)}$ of iteration $t-1$ and adds a new function $f_t(x_i)$. The new function added at each iteration is chosen to minimize the objective function as much as possible. Thus, the rewritten objective function is:

    $Obj^{(t)}=L^{(t)}=\sum_{i}^{n}l(y_i,\hat{y}_i)+\sum_{i=1}^{t}\Omega(f_i)=\sum_{i}^{n}l\left(y_i,\hat{y}_i^{(t-1)}+f_t(x_i)\right)+\Omega(f_t)+\mathrm{constant}$ (3.13)

    We optimize this objective function with respect to $f_t$. When the error function $l$ is the squared error, the objective function can be written as:

    $L^{(t)}=\sum_{i}^{n}\left[2\left(\hat{y}_i^{(t-1)}-y_i\right)f_t(x_i)+f_t(x_i)^{2}\right]+\Omega(f_t)+\mathrm{constant}$ (3.14)

    For the case of other error functions, a Taylor expansion is used to approximate the objective function; the details are as follows:

    $g_i=\partial_{\hat{y}^{(t-1)}}\, l\left(y_i,\hat{y}^{(t-1)}\right),\qquad h_i=\partial_{\hat{y}^{(t-1)}}^{2}\, l\left(y_i,\hat{y}^{(t-1)}\right),$
    $\tilde{L}^{(t)}\simeq\sum_{i}^{n}\left[l\left(y_i,\hat{y}^{(t-1)}\right)+g_i f_t(x_i)+\frac{1}{2}h_i f_t^{2}(x_i)\right]+\Omega(f_t)+\mathrm{constant}$ (3.15)

    After removing the constant term, a relatively uniform objective function is obtained:

    $\tilde{L}^{(t)}\simeq\sum_{i}^{n}\left[g_i f_t(x_i)+\frac{1}{2}h_i f_t^{2}(x_i)\right]+\Omega(f_t)$ (3.16)
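
    To make Eqs. (3.13)-(3.16) concrete, the sketch below supplies the first- and second-order terms $g_i$ and $h_i$ of a squared-error loss to XGBoost as a custom objective; the random data and booster parameters are placeholders, and in practice the library's built-in objectives are used.

```python
# A sketch of Eqs. (3.15)-(3.16): for l(y, yhat) = (y - yhat)^2 we have
# g_i = 2*(yhat - y) and h_i = 2, which XGBoost consumes as a custom objective.
# Training data below are random placeholders.
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    y = dtrain.get_label()
    grad = 2.0 * (preds - y)            # g_i: first derivative of the loss
    hess = 2.0 * np.ones_like(preds)    # h_i: second derivative of the loss
    return grad, hess

X = np.random.rand(100, 8)
y = np.random.rand(100)
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                    num_boost_round=20, obj=squared_error_obj)
```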

    In this section, we first introduce statistical information about the real medical data set in this paper. Then, we compare SeekDoc with several baselines on this data set. The experimental results indicate that SeekDoc has better performance in many evaluation indicators.

    We perform the experiments on a real and comprehensive data set of doctors and patients. The dataset comes from Guahao.COM (https://www.guahao.com/), an online clinical discussion platform serving 554 hospitals in China, where a patient can rate a doctor after making an appointment online or consulting the doctor. It contains all patient appointments, reviews and ratings from July 2012 to December 2015.

    Table 1 gives some detailed statistics about the dataset. The dataset includes 28625 doctors, who have received 14284 ratings and comments from patients, covering 358 categories of disease. The patients' rating range for a doctor is [-2, 3], with -2 meaning that the patient is very dissatisfied with the doctor and 3 meaning that the patient is satisfied with the doctor. Based on these data, the algorithm aims to give good recommendations.

    Table 1.  The detail of the doctor-disease network dataset.
    Data      Attributes                        Numbers
    Doctors   Number of Doctors (NoD)           28625
              NoD with comments                 2719
              NoD with appointments             1999
              NoD with consultations            8463
    Patients  Number of Comments                14284
              Number of effective comments      10935
              Grade of Efficacy                 [-2, 3]
    Disease   Number of Diseases                358


    In addition to the data described in the table above, we also consider the properties of the data themselves, as shown in the following figures. Figure 4(a) shows the distribution of doctors by whether they are directors. Figure 4(b) summarizes the doctors' profile information, including the numbers of reservations, consultations and followers. The x-axis indicates the number of doctors with the given numbers of followers, appointments and consultations related to the disease, and the y-axis corresponds to the amounts of attention, appointments and consultations.

    Figure 4.  Properties of the data.

    Next, in order to display the rating information more intuitively, we extracted the patients' ratings of the doctors' treatment effects, shown in Figure 5. It is easy to see from the figure that most patients are satisfied with the results of their doctors' treatment, and only a small number of patients think they did not receive effective treatment.

    Figure 5.  Statistics on the evaluation of doctors' treatment effect.

    In this section, we compare the proposed method with eleven baseline configurations for predicting doctor-disease pair scores. Note that the first five algorithms are used to compare the quality of the recommendation results, and the remaining six are used to compare the effects of different feature combinations. They are as follows:

    Multi-Layer Perceptron (MLP): the MLP [12] algorithm is a deep learning method that learns through a neural network consisting of an input layer, hidden layers and an output layer, learning task-oriented features through the hierarchical structure.

    Kernel Ridge (KR): the KR [13] algorithm combines ridge regression (linear regression) with kernel techniques; it learns a linear function in the space induced by the respective kernel and the data.

    Gaussian Naive Bayes (GNB): the GNB [14] algorithm inherits from Naive Bayes, with the feature likelihood assumed to be Gaussian.

    Nonnegative Matrix Factorization (NMF): the NMF [15] algorithm makes all components after decomposition non-negative and at the same time achieves non-linear dimensionality reduction.

    Dr.Right! (DR!): the DR! [16] algorithm develops a data analytical framework which incorporates the so-called network-textual embedding, together with data-imbalance-aware mixture multi-classification models, to rate doctors for specific diseases.

    Textual Features (SeekDoc-T): only textual features are embedded as inputs in this model.

    Heterogeneous Network Features (SeekDoc-HN): only heterogeneous network features are embedded as inputs in this model.

    Textual Features + Heterogeneous Network Features (SeekDoc-THN): textual features and heterogeneous network features are embedded as inputs in this model.

    Textual Features + Autoencoder (SeekDoc-TE): textual features processed by the autoencoder are used as inputs in this model.

    Heterogeneous Network Features + Autoencoder (SeekDoc-HNE): heterogeneous network features processed by the autoencoder are used as inputs in this model.

    Textual Features + Heterogeneous Network Features + Autoencoder (SeekDoc-THNE): textual features and heterogeneous network features processed by the autoencoder are used as inputs in this model.

    Our experiments evaluate the proposed method from different perspectives, including its error, stability and comprehensiveness.

    More concretely, we use the MSE, RMSE and SDE metrics to evaluate the error and the stability of the proposed prediction framework. The descriptions of the different metrics are as follows:

    $MSE=\frac{\sum_{i}^{N}|S_i-\hat{S}_i|^{2}}{N}$ (4.1)
    $RMSE=\sqrt{\frac{\sum_{i}^{N}|S_i-\hat{S}_i|^{2}}{N}}$ (4.2)
    $SDE=\sqrt{\frac{\sum_{i}^{N}\left(Error_i-\overline{Error_i}\right)^{2}}{N}}$ (4.3)

    where $N$ represents the size of the data set, $S_i$ represents the doctor-disease score observed in the data set, that is, the patient's score for the doctor's treatment effect, and $\hat{S}_i$ indicates the predicted doctor-disease score. $|S_i-\hat{S}_i|$ is the absolute error of the prediction, expressed as $Error_i$, and $\overline{Error_i}$ represents the average error. The MSE is the average of the squared errors between the observed and predicted doctor-disease scores; the smaller the value, the lower the prediction error. The RMSE is the square root of the MSE; the smaller the value, the more accurate the prediction. The SDE is the standard deviation of the errors; the smaller the SDE, the more stable the performance of the model.

    In order to provide guidance for patients to find the right doctor, we use the softmax [17] function to classify the doctors, and use ACC to evaluate the quality of the classification.

    $ACC=\frac{\sum_{i}^{N}\left(S_i==\hat{S}_i\right)}{N}$ (4.4)

    The ACC refers to the probability that the predicted score is the same as the observed doctor-disease score. The higher the value, the higher the accuracy of the prediction.
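
    A small numpy sketch of the four metrics in Eqs. (4.1)-(4.4) is shown below; the observed and predicted scores are hypothetical values used only to illustrate the computation.

```python
# A sketch of the metrics in Eqs. (4.1)-(4.4) with hypothetical scores.
import numpy as np

def evaluate(S, S_hat):
    S, S_hat = np.asarray(S, dtype=float), np.asarray(S_hat, dtype=float)
    err = np.abs(S - S_hat)                          # Error_i = |S_i - S_hat_i|
    mse = np.mean(err ** 2)                          # Eq. (4.1)
    rmse = np.sqrt(mse)                              # Eq. (4.2)
    sde = np.sqrt(np.mean((err - err.mean()) ** 2))  # Eq. (4.3)
    acc = np.mean(S == S_hat)                        # Eq. (4.4)
    return mse, rmse, sde, acc

print(evaluate([3, 1, -2, 2], [3, 0, -2, 1]))        # toy observed vs. predicted scores
```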

    In this work, the machine learning pipeline includes three parts. First, data preprocessing is carried out on Python 2.7.15 with several scientific computing libraries, such as Numpy 1.15.1, Xlrd 1.1.0, Networkx 2.1, Matplotlib 2.2.3 and Scipy 1.1.0. Next, in order to abstract high-level features from the high-dimensional data, we train the autoencoder on Python 3.6.8 with several scientific computing libraries, such as Sklearn 0.20.2, Numpy 1.15.4 and Pytorch 1.0.0. Finally, the main program and the baseline algorithms are run on Python 2.7.15 with several scientific computing libraries, such as Numpy 1.15.1, Sklearn 0.19.2 and XGBoost 0.6.

    Parametric Details: We set learning_rate = 0.1, n_estimators = 1000, max_depth = 30, min_child_weight = 1, gamma = 0.2, subsample = 0.8, objective = 'multi:softmax', scale_pos_weight = 1 for XGBoost.
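
    For illustration, the snippet below instantiates an XGBoost classifier with the parameter settings listed above and evaluates it with 5-fold cross-validation; the feature matrix and the labels are random placeholders standing in for the embedded doctor-disease features and the experience scores.

```python
# A sketch of the XGBoost setup with the parameters listed above.
# X and y are random placeholders for the embedded features and score classes.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

clf = XGBClassifier(learning_rate=0.1, n_estimators=1000, max_depth=30,
                    min_child_weight=1, gamma=0.2, subsample=0.8,
                    objective='multi:softmax', scale_pos_weight=1)

X = np.random.rand(200, 16)                # placeholder embedded features
y = np.random.randint(0, 6, size=200)      # placeholder classes (scores in [-2, 3] shifted to 0..5)
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation accuracy
print(scores.mean())
```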

    In this section, we evaluate the performance of the proposed algorithm. The experimental results are shown in Figure 6 and indicate the state-of-the-art performance of SeekDoc. We use MSE, RMSE, SDE and ACC to evaluate the proposed method. The average results of the proposed method and the other five baselines were obtained with 5-fold cross-validation. From the experimental results, we can draw the following conclusions:

    Figure 6.  The performance comparison between SeekDoc and the five baseline algorithms.

    (1) SeekDoc has the lowest MSE value (0.5857), compared with the MSE values of MLP (0.7330), KR (0.7043), GNB (9.3391), NMF (22.1384) and DR! (4.1895), as shown in Figure 6(a).

    (2) SeekDoc has the lowest RMSE value (0.7653), compared with the RMSE values of MLP (0.8562), KR (0.8392), GNB (3.4204), NMF (4.7051) and DR! (2.0598), as shown in Figure 6(b).

    (3) SeekDoc has a lower SDE value (0.4487) than GNB (4.0169), NMF (0.4518) and DR! (2.7102), but a little higher than MLP (0.3299) and KR (0.3454), as shown in Figure 6(c).

    (4) SeekDoc has the best ACC value (0.6914), higher than the ACC values of MLP (0.3791), KR (0.4485), GNB (0.2830), NMF (0.0033) and DR! (0.5099), as shown in Figure 6(d).

    SeekDoc has the best performance on MSE and RMSE, which means that the error between our predicted doctor performance and reality is the smallest. Although the SDE value is not better than those of the MLP and KR algorithms, the experimental results still show that our algorithm has good stability. We also classify doctors using the softmax function and obtain the highest ACC value, which means that the prediction result is closest to the real situation. In a word, all experimental results demonstrate the superiority of SeekDoc over the other baseline algorithms from the perspective of comprehensive evaluation.

    In this section, in order to better understand the algorithm, we present robustness checks in two parts: (1) a study of feature embedding; (2) a study of time consumption.

    In this section, we demonstrate the effect of the embedded features on the MSE values. The results are shown in Table 2.

    Table 2.  MSE values for feature embedding.
    Model     SeekDoc-T   SeekDoc-HN   SeekDoc-THN   SeekDoc-TE   SeekDoc-HNE   SeekDoc-THNE
    MLP       5.5678      12.7355      5.5378        0.7171       0.8140        0.7330
    KR        2.3868      3.2435       3.1785        0.6942       0.7726        0.7043
    GNB       13.0730     6.6758       11.1109       11.9811      9.1277        11.6990
    NMF       20.1934     19.9607      20.1565       22.1384      22.1384       22.1384
    DR!       5.0913      5.0739       5.0879        3.8736       3.2792        4.2428
    SeekDoc   2.7067      2.5982       2.6897        0.8081       0.8479        0.5857


    From Table 2, we compare the MSE values of the six algorithms on six different feature combinations. It is obvious that SeekDoc achieves the best overall performance on this dataset. On the one hand, SeekDoc attains a lower MSE than the other five baselines for the SeekDoc-HN (DIHN), SeekDoc-THN (DITHN) and SeekDoc-THNE (DITHNE) features, respectively. On the other hand, when the method uses the DITHNE features, the minimum MSE value is obtained; the second smallest value is obtained with the DITE features, followed, in order from smallest to largest, by the DIHNE, DIHN, DITHN and DIT features. From the perspective of MSE values, the feature embedding method we use is reasonable and effective.

    We further evaluate the time consumption of SeekDoc and the other five algorithms, measured in seconds. The running times are shown in Figure 7. It can be seen that DR! is the most time-consuming among all algorithms, while SeekDoc takes only a little longer than the remaining algorithms. However, SeekDoc performs much better than the other algorithms, which shows that SeekDoc is both effective and efficient.

    Figure 7.  The running-time comparison of SeekDoc with the other algorithms.

    In our paper, we propose an advanced approach named SeekDoc that helps patients find the most influential doctor for a given disease using online healthcare comment data. We summarize the most relevant studies as follows.

    Recently, it has become very important for patients to find a doctor who is suitable for their disease, and more and more work has been devoted to healthcare. Edward et al. [18] proposed a graph-based attention model to address the problem of insufficient data, which supplements electronic health records (EHR) with the hierarchical information inherent in medical ontologies. Choi et al. [19] proposed Med2Vec, which not only learns representations of both medical codes and visits from large EHR datasets with over a million visits, but also allows the learned representations to be interpreted, as confirmed positively by clinical experts. To exploit the potential information captured in EHRs, Ayoub et al. [20] proposed MI-BiLSTM, a multimodal bidirectional long short-term memory-based framework for cardiovascular risk prediction that integrates medical texts and structured clinical information. Suresh et al. [21] combined a multi-task learning model with clinical prediction to help doctors and nurses give patients more appropriate treatment. Yin Zhang et al. [22] proposed a doctor recommendation method based on hybrid matrix factorization. Jiang Ling et al. [23] investigated approaches for measuring user similarity on online health social websites. Chang Xu et al. [24] proposed an online medical service recommendation scheme that protects privacy in the electronic healthcare system, taking the doctor's reputation score and the similarity between user needs and doctor information as the basis for recommending medical services. Yan et al. [25] proposed a hybrid recommendation algorithm (PMF-CNN) based on deep learning for doctor recommendation; the PMF-CNN model uses a convolutional neural network to learn the context features of review information, so as to extract more accurate feature representations for modeling the reviews. Bo Jin et al. [26] proposed an effective and robust architecture for heart prediction. Ling Chen et al. [27] proposed a new network-based algorithm that ranks heterogeneous objects in a medical information network. Mateo et al. [28] showed that a healthcare expert system can be implemented on a group cooperation model.

    As an important training model in deep learning, the autoencoder has received more and more attention from researchers for its good performance in natural language processing. Jiawei Zhang et al. [29] introduced an embedding framework based on deep autoencoders, which aims to learn the embedding vectors of users in emerging networks and to reduce the degradation of embedding performance caused by sparse network structure. Bengio et al. [30] empirically studied a hierarchical unsupervised learning algorithm, explored variants to better understand its success, and extended it to situations where the inputs are continuous or where the structure of the input distribution does not reveal enough about the variable to be predicted in a supervised task. Goodfellow et al. [31] proposed a number of empirical tests that directly measure the degree to which some learned features are invariant to different input transformations; their results further justify the use of "deep" vs. "shallower" representations, but suggest that mechanisms beyond merely stacking one autoencoder on top of another may be important for achieving invariance. Xiong Dapeng et al. [32] presented a computational framework, DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences.

    Extreme Gradient Boosting (XGBoost) is a scalable machine learning system for tree boosting proposed by Chen in 2016. The advantage of the XGBoost algorithm is that it can automatically utilize the multi-threading of the CPU for parallelism [33] while also improving the accuracy of the algorithm.

    In this paper, we proposed a novel model for predicting patient-doctor scores using the XGBoost method. Compared with popular methods such as MLP, GPC, KR, GNB, NMF and Dr. Right!, our method exhibits superior performance in predicting influential doctors. Besides the popular methods, we also use subsets of the features (DIT, DIHN, DITHN, DITE, DIHNE, DITHNE) as inputs to demonstrate the state-of-the-art performance of the feature-embedding algorithm. In the experimental data analysis and preprocessing, we used node embedding vectors to represent doctors and diseases together with their neighborhood network information, and used word embedding vectors to represent the patient comments in detail. Further, we used an autoencoder to extract latent features from the network. All of this work is based on the basic XGBoost model. Finally, extensive experiments on real-world datasets demonstrate the effectiveness of SeekDoc.

    SeekDoc brings a certain improvement in predicting the most effective doctors, but there are still some shortcomings; for example, the feature embedding is based only on the network structure, without deeper analysis and mining of the heterogeneous network itself. In future work, we will consider adding more case data to SeekDoc to extract more influential features from the network. In addition, the performance of the model can be improved by tuning the parameters or combining it with other algorithms to further optimize the method.

    This work is supported by the Fundamental Research Funds for the Central Universities 2412018QD022, NSFC (under Grant No.61976050, 61972384, 21473025), Jilin Provincial Science and Technology Department under Grant No. 20190302109GX and Jilin Education Department No. JJKH20200791KJ. We are grateful to all anonymous reviewers whose insightful comments have helped us to improve the work.

    All authors declare no conflicts of interest in this paper.


