Research article

Data governance and Gensini score automatic calculation for coronary angiography with deep-learning-based natural language extraction

  • With the widespread adoption of electronic health records, the amount of stored medical data has been increasing. Clinical data, often in the form of semi-structured or unstructured electronic medical records (EMRs), contains rich patient information. However, due to the use of natural language by physicians when composing these records, the effectiveness of traditional methods such as dictionaries, rule matching, and machine learning in the extraction of information from these unstructured texts falls short of clinical standards. In this paper, a novel deep-learning-based natural language extraction method is proposed to overcome current shortcomings in data governance and Gensini score automatic calculation in coronary angiography. A pre-trained model called bidirectional encoder representation from transformers (BERT) with strong text feature representation capabilities is employed as the feature representation layer. It is combined with bidirectional long short-term memory (BiLSTM) and conditional random field (CRF) models to extract both global and local features from the text. The study included an evaluation of the model on a dataset from a hospital in China and it was compared with another model to validate its practical advantages. Hence, the BiLSTM-CRF model was employed to automatically extract relevant coronary angiogram information from EMR texts. The achieved F1 score was 91.19, which is approximately 0.87 higher than the BERT-BiLSTM-CRF model.

    Citation: Feng Li, Mingfeng Jiang, Hongzeng Xu, Yi Chen, Feng Chen, Wei Nie, Li Wang. Data governance and Gensini score automatic calculation for coronary angiography with deep-learning-based natural language extraction[J]. Mathematical Biosciences and Engineering, 2024, 21(3): 4085-4103. doi: 10.3934/mbe.2024180




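    Let $q$ be an integer. For each integer $a$ with

    1 \leqslant a < q, \quad (a, q) = 1,

    we know that [1] there exists one and only one $\bar{a}$ with

    1 \leqslant \bar{a} < q

    such that

    a\bar{a} \equiv 1 (\bmod q).

    Define

    R(q) := \{a : 1 \leqslant a \leqslant q, (a, q) = 1, 2 \nmid a+\bar{a}\},
    r(q) := \# R(q).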
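    The definition of $r(q)$ can be explored numerically. The following is a minimal brute-force sketch, not part of the original paper; the function names `lehmer_count` and `euler_phi` are hypothetical, and the code assumes Python 3.8 or later for `pow(a, -1, q)`.

```python
from math import gcd


def lehmer_count(q: int) -> int:
    """Count a in [1, q) with gcd(a, q) = 1 such that a + a_bar is odd,
    where a_bar is the inverse of a modulo q taken in [1, q)."""
    count = 0
    for a in range(1, q):
        if gcd(a, q) != 1:
            continue
        a_bar = pow(a, -1, q)  # modular inverse (Python 3.8+)
        if (a + a_bar) % 2 == 1:
            count += 1
    return count


def euler_phi(q: int) -> int:
    """Euler function, computed naively."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


if __name__ == "__main__":
    # Compare r(q) with phi(q)/2 for a few odd primes.
    for q in (5, 11, 13, 101, 1009):
        print(q, lehmer_count(q), euler_phi(q) / 2)
```

    For small moduli the count can deviate noticeably from $\phi(q)/2$; the asymptotic formulas quoted below concern large $q$.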

    The work [2] posed the problem of finding a nontrivial estimate for $r(q)$ when $q$ is an odd prime. Zhang [3,4] gave several asymptotic formulas for $r(q)$, one of which is

    r(q) = \frac{1}{2}\phi(q)+O\left(q^{\frac{1}{2}}d^{2}(q)\log^{2} q\right),

    where $\phi(q)$ is the Euler function and $d(q)$ is the divisor function. Lu and Yi [5] studied a generalization of the Lehmer problem over short intervals. Let $n \geqslant 2$ be a fixed positive integer, and let $q \geqslant 3$ and $c$ be integers with

    (nc, q) = 1.

    They defined

    r_{n}(\theta_{1}, \theta_{2}, c; q) = \#\{(a, b) \in [1, \theta_{1}q] \times [1, \theta_{2}q] \mid ab \equiv c (\bmod q), n \nmid a+b\},

    where $0 < \theta_{1}, \theta_{2} \leqslant 1$, and obtained

    r_{n}(\theta_{1}, \theta_{2}, c; q) = \left(1-\frac{1}{n}\right)\theta_{1}\theta_{2}\varphi(q)+O\left(q^{1/2}\tau^{6}(q)\log^{2} q\right),

    where the $O$ constant depends only on $n$. In addition, Xi and Yi [6] considered the generalized Lehmer problem over short intervals. Han and Liu [7] gave an upper bound for another generalization of the Lehmer problem over incomplete intervals.

    Guo and Yi [8] also found that the Lehmer problem has good distribution properties on Beatty sequences. For fixed real numbers $\alpha$ and $\beta$, the Beatty sequence is defined by

    \mathcal{B}_{\alpha, \beta} := (\lfloor \alpha n+\beta \rfloor)_{n = 1}^{\infty}.

    Beatty sequences are linear sequences. Based on these results, we conjecture that the Lehmer problem also has good distribution properties in some non-linear sequences.

    The Piatetski-Shapiro sequence is a non-linear sequence, defined by

    \mathcal{N}^{c} = \{\lfloor n^{c} \rfloor : n \in \mathbb{N}\},

    where $c \in \mathbb{R}$ is a non-integer with $c > 1$, and $\lfloor z \rfloor$ denotes the integer part of $z \in \mathbb{R}$. This sequence was first introduced by Piatetski-Shapiro [9] to study prime numbers in sequences of the form $\lfloor f(n) \rfloor$, where $f(n)$ is a polynomial. A positive integer is called square-free if it is a product of distinct primes. The distribution of square-free numbers in the Piatetski-Shapiro sequence has been studied extensively. Stux [10] found that, as $x$ tends to infinity,

    \sum\limits_{\substack{n \leqslant x \\ \lfloor n^{c} \rfloor \text{ is square-free}}} 1 = \left(\frac{6}{\pi^{2}}+o(1)\right)x, \quad \text{for } 1 < c < \frac{4}{3}. (1.1)

    In 1978, Rieger [11] improved the range to $1 < c < 3/2$ and obtained

    \frac{6x}{\pi^{2}}+O\left(x^{(2c+1)/4+\varepsilon}\right), \quad \text{for } 1 < c < \frac{3}{2}.

    Building on these results, we extend the problem by investigating

    R(c; q) := \sum\limits_{\substack{n \in \mathcal{N}^{c} \cap R(q) \\ n \text{ is square-free}}} 1

    and the admissible range of $c$ as $q$ tends to infinity. Using exponential sums and Kloosterman sums together with fairly detailed calculations, we obtain the following result, which helps in understanding the distribution properties of the Lehmer problem.

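    Theorem 1.1. Let $q$ be a sufficiently large odd integer, and let

    \gamma := 1/c \quad \text{and} \quad c \in \left(1, \frac{4}{3}\right).

    Then

    \begin{align} R(c;q) = &\frac{3}{\pi^{2}}\prod\limits_{p\mid q}(1+p^{-1})^{-1}q^{\gamma}+O\left( \prod\limits_{p\mid q}(1-p^{-\frac{1}{2}})^{-1} q^{\gamma-\frac{1}{2}}\right)\\ &+O\left(q^{\frac{7}{13}\gamma+\frac{4}{13}}\prod\limits_{p\mid q}(1-p^{-\frac{1}{2}})^{-1}\log q\right)+O\left( q^{\frac{3}{4}}d^{3}(q)\log q \right)+O\left(q^{ \gamma -\frac{1}{6}} d^{2}(q)\log^3 q\right), \end{align}

    where the $O$ constant depends only on $c$.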
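    To get a feel for the quantity in Theorem 1.1, one can count $R(c;q)$ directly and compare it with the main term. The sketch below is illustrative only: the helper names are hypothetical, floating-point powers are used for $\lfloor n^{c} \rfloor$ (adequate for small $q$ but not a rigorous computation), and the theorem itself only describes the behaviour for large odd $q$.

```python
from math import floor, gcd, pi


def is_squarefree(n: int) -> bool:
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def piatetski_shapiro_upto(c: float, bound: int) -> set:
    """All values floor(n**c), n >= 1, that do not exceed bound."""
    values, n = set(), 1
    while floor(n ** c) <= bound:
        values.add(floor(n ** c))
        n += 1
    return values


def lehmer_ps_count(c: float, q: int) -> int:
    """Brute-force R(c; q): square-free n in the Piatetski-Shapiro
    sequence with gcd(n, q) = 1 and n + n_bar odd."""
    ps = piatetski_shapiro_upto(c, q)
    total = 0
    for n in range(1, q):
        if n in ps and gcd(n, q) == 1 and is_squarefree(n):
            if (n + pow(n, -1, q)) % 2 == 1:
                total += 1
    return total


def main_term(c: float, q: int) -> float:
    """(3 / pi^2) * prod_{p | q} (1 + 1/p)^(-1) * q^(1/c)."""
    prod, m, p = 1.0, q, 2
    while p * p <= m:
        if m % p == 0:
            prod /= 1 + 1 / p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        prod /= 1 + 1 / m
    return 3 / pi ** 2 * prod * q ** (1 / c)


if __name__ == "__main__":
    for q in (1009, 10007, 100003):
        print(q, lehmer_ps_count(1.2, q), round(main_term(1.2, q), 2))
```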

    This paper consists of three main sections. The introduction covers the origins and developments of the Lehmer problem, along with several interesting results. It also presents relevant findings related to the Piatetski-Shapiro sequences. The second section collects the definitions and lemmas used throughout the paper. The third section outlines the calculation, where we use additive characters to convert the congruence conditions into exponential sum problems. We then employ Kloosterman sums and exponential sum methods to derive an asymptotic formula.

    To complete the proof of the theorem, we need the following several definitions and lemmas.

    In this paper, we denote by $\lfloor t \rfloor$ and $\{t\}$ the integral part and the fractional part of $t$, respectively. As is customary, we put

    \mathbf{e}(t) := e^{2\pi i t} \quad \text{and} \quad \{t\} := t-\lfloor t \rfloor.

    The notation $\|t\|$ is used to denote the distance from the real number $t$ to the nearest integer; that is,

    \|t\| := \min\limits_{n \in \mathbb{Z}}|t-n|.

    The symbol $\mathop{{\sum}^{\prime}}$ indicates that the variable summed over takes values coprime to the number $q$. Throughout the paper, $\varepsilon$ always denotes an arbitrarily small positive constant, which may not be the same at different occurrences; the implied constants in the symbols $O$, $\ll$, and $\gg$ may depend (where obvious) on the parameters $c$ and $\varepsilon$, but are absolute otherwise. For given functions $F$ and $G$, the notations

    F \ll G, \quad G \gg F \quad \text{and} \quad F = O(G)

    are all equivalent to the statement that the inequality

    |F| \leqslant C|G|

    holds with some constant $C > 0$.

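    Lemma 2.1. Let $1_{c}(m)$ denote the characteristic function of numbers in a Piatetski-Shapiro sequence. Then

    1_{c}(m) = \gamma m^{\gamma-1}+O(m^{\gamma-2})+\psi(-(m+1)^{\gamma})-\psi(-m^{\gamma}),

    where

    \psi(t) = t-\lfloor t \rfloor-\frac{1}{2} \quad \text{and} \quad \gamma = 1/c.

    Proof. Note that an integer $m$ has the form

    m = \lfloor n^{c} \rfloor

    for some integer $n$ if and only if

    m \leqslant n^{c} < m+1, \quad \text{that is,} \quad -(m+1)^{\gamma} < -n \leqslant -m^{\gamma}.

    So

    \begin{align} 1_{c}(m) & = \lfloor -m^{\gamma} \rfloor-\lfloor -(m+1)^{\gamma} \rfloor \\ & = -m^{\gamma}-\psi(-m^{\gamma})+(m+1)^{\gamma}+\psi(-(m+1)^{\gamma}) \\ & = \gamma m^{\gamma-1}+O(m^{\gamma-2})+\psi(-(m+1)^{\gamma})-\psi(-m^{\gamma}). \end{align}

    This completes the proof.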
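    The first identity in the proof, which writes the characteristic function as a difference of two integer parts, is easy to test numerically. The following sketch is an illustration under stated assumptions: the names `ps_indicator` and `ps_members` are hypothetical, and floating-point powers are used, so it is only reliable when no $n^{c}$ in the tested range is extremely close to an integer.

```python
from math import floor


def ps_indicator(m: int, c: float) -> int:
    """floor(-m^gamma) - floor(-(m+1)^gamma) with gamma = 1/c."""
    gamma = 1.0 / c
    return floor(-(m ** gamma)) - floor(-((m + 1) ** gamma))


def ps_members(c: float, bound: int) -> set:
    """Values floor(n**c), n >= 1, not exceeding bound (direct enumeration)."""
    members, n = set(), 1
    while floor(n ** c) <= bound:
        members.add(floor(n ** c))
        n += 1
    return members


if __name__ == "__main__":
    c, bound = 1.2, 59
    members = ps_members(c, bound)
    mismatches = [m for m in range(1, bound + 1)
                  if ps_indicator(m, c) != (1 if m in members else 0)]
    print("mismatches:", mismatches)
```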

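    Lemma 2.2. Let $H \geqslant 1$ be an integer and $a_{h}, b_{h}$ be real numbers. Then

    \left|\psi(t)-\sum\limits_{0 < |h| \leqslant H}a_{h}\mathbf{e}(th)\right| \leqslant \sum\limits_{|h| \leqslant H}b_{h}\mathbf{e}(th), \quad a_{h} \ll \frac{1}{|h|}, \quad b_{h} \ll \frac{1}{H}.

    Proof. In 1985, Vaaler showed how Beurling's function could be used to construct a trigonometric polynomial approximation to $\psi(x)$. For each positive integer $N$, Vaaler's construction yields a trigonometric polynomial $\psi^{*}$ of degree $N$ which satisfies

    |\psi^{*}(x)-\psi(x)| \leqslant \frac{1}{2N+2}\sum\limits_{|n| \leqslant N}\left(1-\frac{|n|}{N+1}\right)\mathbf{e}(nx),

    where

    \begin{align} \psi^{*}(x) & = -\sum\limits_{1 \leqslant |n| \leqslant N}(2\pi i n)^{-1}\hat{J}_{N+1}(n)\mathbf{e}(nx), \\ H(z) & = \frac{\sin^{2}\pi z}{\pi^{2}}\left\{\sum\limits_{n = -\infty}^{\infty}\frac{\mathrm{sgn}(n)}{(z-n)^{2}}+\frac{2}{z}\right\}, \quad J(z) = \frac{1}{2}H^{\prime}(z), \\ H_{N}(z) & = \frac{\sin^{2}\pi z}{\pi^{2}}\left\{\sum\limits_{|n| \leqslant N}\frac{\mathrm{sgn}(n)}{(z-n)^{2}}+\frac{2}{z}\right\}, \quad J_{N}(z) = \frac{1}{2}H_{N}^{\prime}(z), \end{align}

    and $\mathrm{sgn}(n)$ is the sign of $n$. The Fourier transform $\hat{J}(t)$ satisfies

    \hat{J}(t) = \begin{cases} 1, & t = 0; \\ \pi t(1-|t|)\cot \pi t+|t|, & 0 < |t| < 1; \\ 0, & |t| \geqslant 1. \end{cases}

    For brevity, we denote

    a_{h} = -(2\pi i h)^{-1}\hat{J}_{H+1}(h) \ll \frac{1}{|h|}, \quad b_{h} = \frac{1}{2H+2}\left(1-\frac{|h|}{H+1}\right) \ll \frac{1}{H}.

    More details can be found in Theorem A.6 of the Appendix of [12].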

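    Lemma 2.3. Denote

    Kl(m, n; q) = \mathop{\sum\limits_{a = 1}^{q}\sum\limits_{b = 1}^{q}}_{ab \equiv 1 (\bmod q)}\mathbf{e}\left(\frac{ma+nb}{q}\right),

    then

    Kl(m, n; q) \ll (m, n, q)^{\frac{1}{2}}q^{\frac{1}{2}}d(q),

    where $(m, n, q)$ is the greatest common divisor of $m$, $n$ and $q$, and $d(q)$ is the number of positive divisors of $q$.

    Proof. The proof is given in [13].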
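    The Kloosterman sum of Lemma 2.3 can be evaluated directly for small moduli. The sketch below is not from the paper: the function names are hypothetical, and comparing against the right-hand side with implied constant 1 is an assumption made purely for the experiment (for prime $q$ it agrees with the classical Weil bound $2\sqrt{q}$ when $q \nmid mn$).

```python
import cmath
from math import gcd, sqrt


def kloosterman(m: int, n: int, q: int) -> complex:
    """Sum of e((m*a + n*b)/q) over a mod q with gcd(a, q) = 1, b = inverse of a."""
    total = 0j
    for a in range(1, q + 1):
        if gcd(a, q) != 1:
            continue
        b = pow(a, -1, q)  # modular inverse (Python 3.8+)
        total += cmath.exp(2j * cmath.pi * (m * a + n * b) / q)
    return total


def divisor_count(q: int) -> int:
    return sum(1 for k in range(1, q + 1) if q % k == 0)


def lemma_bound(m: int, n: int, q: int) -> float:
    """(m, n, q)^(1/2) * q^(1/2) * d(q), with the implied constant set to 1."""
    return sqrt(gcd(gcd(m, n), q)) * sqrt(q) * divisor_count(q)


if __name__ == "__main__":
    q = 13
    worst = max(abs(kloosterman(m, n, q)) / lemma_bound(m, n, q)
                for m in range(q) for n in range(q))
    print("largest ratio |Kl| / bound for q = 13:", round(worst, 4))
```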

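    Lemma 2.4. (Korobov [14]) Let $\alpha$ be a real number, $Q$ be an integer, and $P$ be a positive integer. Then

    \left|\sum\limits_{x = Q+1}^{Q+P}\mathbf{e}(\alpha x)\right| \leqslant \min\left(P, \frac{1}{2\|\alpha\|}\right).

    Lemma 2.5. (Karatsuba [15]) For any number $b$, $U > 0$, $K \geqslant 1$, let

    a = \frac{s}{r}+\frac{\theta}{r^{2}}, \quad (r, s) = 1, \quad r \geqslant 1, \quad |\theta| \leqslant 1.

    Then

    \sum\limits_{k \leqslant K}\min\left(U, \frac{1}{\|ak+b\|}\right) \ll \left(\frac{K}{r}+1\right)(U+r\log r).

    Lemma 2.6. Suppose $f$ is continuously differentiable, $f^{\prime}(n)$ is monotonic, and

    \|f^{\prime}(n)\| \geqslant \lambda_{1} > 0

    on $I$. Then

    \sum\limits_{n \in I}\mathbf{e}(f(n)) \ll \lambda_{1}^{-1}.

    Proof. See [12, Theorem 2.1].

    Lemma 2.7. Let $k$ be a positive integer, $k \geqslant 2$. Suppose that $f(n)$ is a real-valued function with $k$ continuous derivatives on $[N, 2N]$. Further suppose that

    0 < F \leqslant f^{(k)}(n) \leqslant hF.

    Then

    \left|\sum\limits_{N < n \leqslant 2N}\mathbf{e}(f(n))\right| \ll F^{\kappa}N^{\lambda}+F^{-1},

    where the implied constant is absolute.

    Proof. See [12, Chapter 3].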

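    By the definition of the Möbius function

    \mu(n) = \begin{cases} (-1)^{\omega(n)}, & p^{2} \nmid n \text{ for every prime } p \mid n, \\ 0, & p^{2} \mid n \text{ for some prime } p, \end{cases}

    it is clear that $n$ is square-free if and only if

    \mu^{2}(n) = 1,

    where $\omega(n)$ is the number of prime divisors of $n$. So

    R(c; q) = \frac{1}{2}\mathop{{\sum}^{\prime}}_{n = 1}^{q}\left(1-(-1)^{n+\bar{n}}\right)\mu^{2}(n)1_{c}(n) = \frac{1}{2}(R_{1}-R_{2}), (3.1)

    where

    R_{1} = \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)1_{c}(n)

    and

    R_{2} = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)1_{c}(n).

    From Lemma 2.1, we have

    \begin{align} R_{1} & = \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)1_{c}(n) \\ & = \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\gamma n^{\gamma-1}+O(n^{\gamma-2})+\psi(-(n+1)^{\gamma})-\psi(-n^{\gamma})\right) \\ & = R_{11}+R_{12}, \end{align} (3.2)

    where

    R_{11} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\gamma n^{\gamma-1}+O(n^{\gamma-2})\right) = \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\gamma n^{\gamma-1}+O\left(\mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)n^{\gamma-2}\right).

    Let

    \mathcal{D} = \{d : p \mid d \Rightarrow p \mid q\}

    and let $\lambda(n)$ be the Liouville function. When $n \in R(q)$,

    \mu^{2}(n) = \begin{cases} \sum\limits_{\substack{dm = n \\ d \in \mathcal{D}}}\lambda(d)\mu^{2}(m), & (n, q) = 1, \\ 0, & (n, q) > 1. \end{cases} (3.3)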
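    The convolution behind Eq (3.3) can be checked by machine: the sum over factorizations $n = dm$ with every prime factor of $d$ dividing $q$ should equal $\mu^{2}(n)$ when $(n, q) = 1$ and vanish when $(n, q) > 1$. The sketch below is illustrative; all function names are hypothetical and the factorization is naive trial division.

```python
from math import gcd


def prime_factors(n: int) -> list:
    """Prime factors of n, listed with multiplicity."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def mu_squared(n: int) -> int:
    fs = prime_factors(n)
    return 1 if len(fs) == len(set(fs)) else 0


def liouville(n: int) -> int:
    return (-1) ** len(prime_factors(n))


def in_D(d: int, q: int) -> bool:
    """True when every prime factor of d divides q."""
    return all(q % p == 0 for p in prime_factors(d))


def convolution_side(n: int, q: int) -> int:
    """Sum of lambda(d) * mu^2(n/d) over divisors d of n with d in D."""
    return sum(liouville(d) * mu_squared(n // d)
               for d in range(1, n + 1)
               if n % d == 0 and in_D(d, q))


if __name__ == "__main__":
    q = 15
    ok = all(convolution_side(n, q) == (mu_squared(n) if gcd(n, q) == 1 else 0)
             for n in range(1, 301))
    print("identity holds for n <= 300, q = 15:", ok)
```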

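    We just consider the first term of $R_{11}$. Applying Euler summation [1],

    \begin{align} \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\gamma n^{\gamma-1} & = \sum\limits_{n = 1}^{q}\sum\limits_{\substack{dm = n \\ d \in \mathcal{D}}}\lambda(d)\mu^{2}(m)\gamma(dm)^{\gamma-1} \\ & = \sum\limits_{d \in \mathcal{D}}\lambda(d)d^{\gamma-1}\sum\limits_{m \leqslant \frac{q}{d}}\mu^{2}(m)\gamma m^{\gamma-1} \\ & = \sum\limits_{d \in \mathcal{D}}\lambda(d)d^{\gamma-1}\sum\limits_{m \leqslant \frac{q}{d}}\left(\sum\limits_{l^{2} \mid m}\mu(l)\right)\gamma m^{\gamma-1} \\ & = \sum\limits_{d \in \mathcal{D}}\lambda(d)d^{\gamma-1}\sum\limits_{l \leqslant (\frac{q}{d})^{\frac{1}{2}}}\mu(l)l^{2\gamma-2}\sum\limits_{m \leqslant \frac{q}{dl^{2}}}\gamma m^{\gamma-1} \\ & = \sum\limits_{d \in \mathcal{D}}\lambda(d)d^{\gamma-1}\sum\limits_{l \leqslant (\frac{q}{d})^{\frac{1}{2}}}\mu(l)l^{2\gamma-2}\left(\left(\frac{q}{dl^{2}}\right)^{\gamma}+O\left(\left(\frac{q}{dl^{2}}\right)^{\gamma-1}\right)\right) \\ & = q^{\gamma}\sum\limits_{d \in \mathcal{D}}\lambda(d)d^{-1}\sum\limits_{l \leqslant (\frac{q}{d})^{\frac{1}{2}}}\mu(l)l^{-2}+O\left(\sum\limits_{d \in \mathcal{D}}\sum\limits_{l \leqslant (\frac{q}{d})^{\frac{1}{2}}}q^{\gamma-1}\right) \\ & = q^{\gamma}\sum\limits_{d \in \mathcal{D}}\lambda(d)d^{-1}\left(\sum\limits_{l = 1}^{\infty}\mu(l)l^{-2}+O\left(\left(\frac{q}{d}\right)^{-\frac{1}{2}}\right)\right)+O\left(\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\gamma-\frac{1}{2}}\right) \\ & = \frac{6}{\pi^{2}}\prod\limits_{p \mid q}(1+p^{-1})^{-1}q^{\gamma}+O\left(\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\gamma-\frac{1}{2}}\right), \end{align}

    thus

    R_{11} = \frac{6}{\pi^{2}}\prod\limits_{p \mid q}(1+p^{-1})^{-1}q^{\gamma}+O\left(\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\gamma-\frac{1}{2}}\right). (3.4)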

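    For $R_{12}$, by Lemma 2.2, we have

    R_{12} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\psi(-(n+1)^{\gamma})-\psi(-n^{\gamma})\right) = R_{121}+O(R_{122}), (3.5)

    where

    R_{121} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\sum\limits_{0 < |h| \leqslant H}a_{h}\left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right)

    and

    R_{122} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\sum\limits_{|h| \leqslant H}b_{h}\left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)+\mathbf{e}(-n^{\gamma}h)\right)\right).

    Define

    f(t) = \mathbf{e}\left(\left((dt)^{\gamma}-(dt+1)^{\gamma}\right)h\right)-1,

    then

    \begin{align} f(t) &\ll |h|(dt)^{\gamma-1}, \\ \frac{\partial f(t)}{\partial t} &\ll |h|d^{\gamma-1}t^{\gamma-2}. \end{align}

    By Lemma 2.2 and Eq (3.3),

    \begin{align} R_{121} & = \sum\limits_{0 < |h| \leqslant H}a_{h}\sum\limits_{d \in \mathcal{D}}\lambda(d)\left(\sum\limits_{1 < m \leqslant \frac{q}{d}}\mu^{2}(m)\mathbf{e}(-(dm)^{\gamma}h)f(m)\right) \\ &\ll \sum\limits_{0 < |h| \leqslant H}|h|^{-1}\sum\limits_{d \in \mathcal{D}}\left|\int_{0}^{\frac{q}{d}}f(t)\mathrm{d}\left(\sum\limits_{1 < m \leqslant t}\mu^{2}(m)\mathbf{e}(-(dm)^{\gamma}h)\right)\right| \\ &\ll \sum\limits_{0 < |h| \leqslant H}|h|^{-1}\sum\limits_{d \in \mathcal{D}}\left|f\left(\frac{q}{d}\right)\left(\sum\limits_{1 < m \leqslant \frac{q}{d}}\mu^{2}(m)\mathbf{e}(-(dm)^{\gamma}h)\right)\right| \\ &\quad+\sum\limits_{0 < |h| \leqslant H}|h|^{-1}\sum\limits_{d \in \mathcal{D}}\left|\int_{0}^{\frac{q}{d}}\frac{\partial f(t)}{\partial t}\sum\limits_{1 < m \leqslant t}\mu^{2}(m)\mathbf{e}(-(dm)^{\gamma}h)\mathrm{d}t\right|. \end{align}

    Let

    (\kappa, \lambda) = \left(\frac{1}{6}, \frac{2}{3}\right)

    be an exponent pair. Applying Lemma 2.7, it is easy to see that

    \begin{align} \sum\limits_{1 < m \leqslant t}\mu^{2}(m)\mathbf{e}(-(dm)^{\gamma}h) & = \sum\limits_{m \leqslant t}\left(\sum\limits_{l^{2} \mid m}\mu(l)\right)\mathbf{e}(-(dm)^{\gamma}h) \\ &\ll \sum\limits_{l \leqslant t^{\frac{1}{2}}}\left|\sum\limits_{m \leqslant \frac{t}{l^{2}}}\mathbf{e}(-(dl^{2}m)^{\gamma}h)\right| \\ &\ll \sum\limits_{l \leqslant t^{\frac{1}{2}}}\log q\left(\left((dl^{2})^{\gamma}|h|\left(\frac{t}{l^{2}}\right)^{\gamma-1}\right)^{\frac{1}{6}}\left(\frac{t}{l^{2}}\right)^{\frac{2}{3}}+\left((dl^{2})^{\gamma}|h|\left(\frac{t}{l^{2}}\right)^{\gamma-1}\right)^{-1}\right) \\ &\ll \log q\sum\limits_{l \leqslant t^{\frac{1}{2}}}\left((d^{\gamma}|h|)^{\frac{1}{6}}t^{\frac{1}{6}\gamma+\frac{1}{2}}l^{-1}+(d^{\gamma}|h|)^{-1}t^{1-\gamma}l^{-2}\right) \\ &\ll (d^{\gamma}|h|)^{\frac{1}{6}}t^{\frac{1}{6}\gamma+\frac{1}{2}}\log^{2} q+(d^{\gamma}|h|)^{-1}t^{1-\gamma}\log q \\ &\ll (d^{\gamma}|h|)^{\frac{1}{6}}t^{\frac{1}{6}\gamma+\frac{1}{2}}\log^{2} q, \end{align}

    thus

    \begin{align} R_{121} &\ll \sum\limits_{0 < |h| \leqslant H}|h|^{-1}\sum\limits_{d \in \mathcal{D}}|h|q^{\gamma-1}(d^{\gamma}|h|)^{\frac{1}{6}}\left(\frac{q}{d}\right)^{\frac{1}{6}\gamma+\frac{1}{2}}\log^{2} q \\ &\quad+\sum\limits_{0 < |h| \leqslant H}|h|^{-1}\sum\limits_{d \in \mathcal{D}}\int_{0}^{\frac{q}{d}}|h|d^{\gamma-1}t^{\gamma-2}(d^{\gamma}|h|)^{\frac{1}{6}}t^{\frac{1}{6}\gamma+\frac{1}{2}}\log^{2} q\,\mathrm{d}t \\ &\ll \sum\limits_{0 < |h| \leqslant H}|h|^{\frac{1}{6}}\sum\limits_{d \in \mathcal{D}}d^{-\frac{1}{2}}q^{\frac{7}{6}\gamma-\frac{1}{2}}\log^{2} q+\sum\limits_{0 < |h| \leqslant H}|h|^{\frac{1}{6}}\sum\limits_{d \in \mathcal{D}}d^{\frac{7}{6}\gamma-1}\log^{2} q\int_{0}^{\frac{q}{d}}t^{\frac{7}{6}\gamma-\frac{3}{2}}\mathrm{d}t \\ &\ll H^{\frac{7}{6}}\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\frac{7}{6}\gamma-\frac{1}{2}}\log^{2} q. \end{align} (3.6)

    For $R_{122}$, the contribution from $h \neq 0$ can be bounded by methods similar to those used for Eq (3.6). Taking

    H = q^{\frac{9}{13}-\frac{7}{13}\gamma} \geqslant 1,

    we obtain

    \begin{align} R_{122} & = b_{0}\mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)+\sum\limits_{0 < |h| \leqslant H}b_{h}\mathop{{\sum}^{\prime}}_{n = 1}^{q}\mu^{2}(n)\left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)+\mathbf{e}(-n^{\gamma}h)\right) \\ &\ll H^{-1}q+H^{\frac{7}{6}}\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\frac{7}{6}\gamma-\frac{1}{2}}\log q \\ &\ll q^{\frac{7}{13}\gamma+\frac{4}{13}}\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}\log^{2} q. \end{align} (3.7)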
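    The exponent of $H$ chosen above is what one obtains by balancing the two power terms $H^{-1}q$ and $H^{\frac{7}{6}}q^{\frac{7}{6}\gamma-\frac{1}{2}}$ in the bound for $R_{122}$. The short computation below is added as a worked step and ignores the logarithmic and product factors:

```latex
H^{-1}q = H^{\frac{7}{6}}q^{\frac{7}{6}\gamma-\frac{1}{2}}
\iff H^{\frac{13}{6}} = q^{\frac{3}{2}-\frac{7}{6}\gamma}
\iff H = q^{\frac{9}{13}-\frac{7}{13}\gamma},
\qquad \text{so that} \qquad
H^{-1}q = q^{1-\frac{9}{13}+\frac{7}{13}\gamma} = q^{\frac{7}{13}\gamma+\frac{4}{13}}.
```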

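    It follows from Eqs (3.5)–(3.7) that

    R_{12} \ll q^{\frac{7}{13}\gamma+\frac{4}{13}}\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}\log^{2} q.

    Hence

    R_{1} = \frac{6}{\pi^{2}}\prod\limits_{p \mid q}(1+p^{-1})^{-1}q^{\gamma}+O\left(\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}q^{\gamma-\frac{1}{2}}\right)+O\left(q^{\frac{7}{13}\gamma+\frac{4}{13}}\prod\limits_{p \mid q}(1-p^{-\frac{1}{2}})^{-1}\log^{2} q\right). (3.8)

    Similarly,

    R_{2} = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)1_{c}(n) = R_{21}+R_{22}, (3.9)

    where

    R_{21} = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)\left(\gamma n^{\gamma-1}+O(n^{\gamma-2})\right)

    and

    R_{22} = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)\left(\psi(-(n+1)^{\gamma})-\psi(-n^{\gamma})\right).

    Again, we just consider the first term of $R_{21}$.

    \begin{align} \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)\gamma n^{\gamma-1} & = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\left(\sum\limits_{d^{2} \mid n}\mu(d)\right)\gamma n^{\gamma-1} \\ & = \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\sum\limits_{\substack{d^{2} \mid n \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma n^{\gamma-1}+\mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\sum\limits_{\substack{d^{2} \mid n \\ q^{\frac{1}{4}} < d \leqslant q^{\frac{1}{2}}}}\mu(d)\gamma n^{\gamma-1}. \end{align} (3.10)

    It is easy to see that

    \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\sum\limits_{\substack{d^{2} \mid n \\ q^{\frac{1}{4}} < d \leqslant q^{\frac{1}{2}}}}\mu(d)\gamma n^{\gamma-1} \ll \sum\limits_{n = 1}^{q}\sum\limits_{\substack{d^{2} \mid n \\ q^{\frac{1}{4}} < d \leqslant q^{\frac{1}{2}}}}\gamma n^{\gamma-1} \ll q^{\gamma-\frac{1}{4}}. (3.11)

    For integers $m$ and $a$, one has

    \frac{1}{q}\sum\limits_{s = 1}^{q}\mathbf{e}\left(\frac{s(m-a)}{q}\right) = \begin{cases} 1, & m \equiv a (\bmod q); \\ 0, & m \not\equiv a (\bmod q). \end{cases}

    This gives

    \begin{align} &\mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\sum\limits_{\substack{d^{2} \mid n \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma n^{\gamma-1} \\ & = \mathop{\sum\limits_{n = 1}^{q}\sum\limits_{m = 1}^{q}}_{nm \equiv 1 (\bmod q)}(-1)^{n+m}\sum\limits_{\substack{d^{2} \mid n \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma n^{\gamma-1}\mathop{\sum\limits_{a = 1}^{q-1}}_{a = m}\mathop{\sum\limits_{b = 1}^{q-1}}_{b = n}1 \\ & = \mathop{\sum\limits_{n = 1}^{q}\sum\limits_{m = 1}^{q}}_{nm \equiv 1 (\bmod q)}\mathop{\sum\limits_{a = 1}^{q-1}}_{a = m}\mathop{\sum\limits_{b = 1}^{q-1}}_{b = n}(-1)^{a+b}\sum\limits_{\substack{d^{2} \mid b \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma b^{\gamma-1} \\ & = \mathop{\sum\limits_{n = 1}^{q}\sum\limits_{m = 1}^{q}}_{nm \equiv 1 (\bmod q)}\mathop{\sum\limits_{a = 1}^{q-1}}_{a \equiv m (\bmod q)}\mathop{\sum\limits_{b = 1}^{q-1}}_{b \equiv n (\bmod q)}(-1)^{a+b}\sum\limits_{\substack{d^{2} \mid b \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma b^{\gamma-1} \\ & = \mathop{\sum\limits_{n = 1}^{q}\sum\limits_{m = 1}^{q}}_{nm \equiv 1 (\bmod q)}\sum\limits_{a = 1}^{q-1}\sum\limits_{b = 1}^{q-1}(-1)^{a+b}\sum\limits_{\substack{d^{2} \mid b \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma b^{\gamma-1} \\ &\quad\times \left(\frac{1}{q}\sum\limits_{s = 1}^{q}\mathbf{e}\left(\frac{s(m-a)}{q}\right)\right)\left(\frac{1}{q}\sum\limits_{t = 1}^{q}\mathbf{e}\left(\frac{t(n-b)}{q}\right)\right) \\ & = \frac{1}{q^{2}}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}\left(\mathop{\sum}_{nm \equiv 1 (\bmod q)}\mathbf{e}\left(\frac{sm+tn}{q}\right)\right) \\ &\quad\times \left(\sum\limits_{a = 1}^{q-1}(-1)^{a}\mathbf{e}\left(-\frac{sa}{q}\right)\right)\left(\sum\limits_{b = 1}^{q-1}(-1)^{b}\mathbf{e}\left(-\frac{tb}{q}\right)\sum\limits_{\substack{d^{2} \mid b \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma b^{\gamma-1}\right). \end{align} (3.12)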
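    The orthogonality relation for additive characters used to obtain Eq (3.12) detects the congruence $m \equiv a \ (\bmod\ q)$. A small numerical illustration follows; the function name `additive_indicator` is hypothetical and the code is only a sanity check, not part of the proof.

```python
import cmath


def additive_indicator(m: int, a: int, q: int) -> complex:
    """(1/q) * sum_{s=1}^{q} e(s(m - a)/q); close to 1 when m = a mod q, else close to 0."""
    total = sum(cmath.exp(2j * cmath.pi * s * (m - a) / q) for s in range(1, q + 1))
    return total / q


if __name__ == "__main__":
    q = 7
    for m, a in ((10, 3), (4, 3), (0, 7), (5, 6)):
        print(m, a, round(abs(additive_indicator(m, a, q)), 6))
```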

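    From Lemma 2.3,

    \mathop{\sum}_{nm \equiv 1 (\bmod q)}\mathbf{e}\left(\frac{sm+tn}{q}\right) = Kl(s, t; q) \ll (s, t, q)^{\frac{1}{2}}q^{\frac{1}{2}}d(q). (3.13)

    Note that the estimate

    \left|\sum\limits_{a = 1}^{q-1}(-1)^{a}\mathbf{e}\left(-\frac{sa}{q}\right)\right| \ll \frac{1}{|\mathbf{e}(\frac{1}{2}-\frac{s}{q})-1|} \ll \frac{1}{|\cos\frac{s}{q}\pi|} (3.14)

    holds. By Abel summation and Lemma 2.4, we have

    \begin{align} &\sum\limits_{b = 1}^{q-1}(-1)^{b}\mathbf{e}\left(-\frac{tb}{q}\right)\sum\limits_{\substack{d^{2} \mid b \\ d \leqslant q^{\frac{1}{4}}}}\mu(d)\gamma b^{\gamma-1} \\ & = \sum\limits_{d \leqslant q^{\frac{1}{4}}}\mu(d)\sum\limits_{b = 1}^{\lfloor\frac{q-1}{d^{2}}\rfloor}(-1)^{d^{2}b}\mathbf{e}\left(-\frac{td^{2}b}{q}\right)\gamma(d^{2}b)^{\gamma-1} \\ &\ll \sum\limits_{d \leqslant q^{\frac{1}{4}}}d^{2(\gamma-1)}\left|\sum\limits_{b = 1}^{\lfloor\frac{q-1}{d^{2}}\rfloor}(-1)^{d^{2}b}\mathbf{e}\left(-\frac{td^{2}b}{q}\right)\gamma b^{\gamma-1}\right| \\ &\ll \sum\limits_{d \leqslant q^{\frac{1}{4}}}d^{2(\gamma-1)}\gamma\left(\frac{q}{d^{2}}\right)^{\gamma-1}\max\limits_{1 \leqslant \beta \leqslant \frac{q-1}{d^{2}}}\left|\sum\limits_{b = 1}^{\beta}(-1)^{d^{2}b}\mathbf{e}\left(-\frac{td^{2}b}{q}\right)\right| \\ &\ll \sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \mid d}}\gamma q^{\gamma-1}\max\limits_{1 \leqslant \beta \leqslant \frac{q-1}{d^{2}}}\left|\sum\limits_{b = 1}^{\beta}\mathbf{e}\left(-\frac{td^{2}b}{q}\right)\right|+\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \nmid d}}\gamma q^{\gamma-1}\max\limits_{1 \leqslant \beta \leqslant \frac{q-1}{d^{2}}}\left|\sum\limits_{b = 1}^{\beta}(-1)^{b}\mathbf{e}\left(-\frac{td^{2}b}{q}\right)\right| \\ &\ll \sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \mid d}}q^{\gamma-1}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{d^{2}}{q}t\|}\right)+\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \nmid d}}q^{\gamma-1}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{1}{2}-\frac{d^{2}}{q}t\|}\right). \end{align} (3.15)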
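    The estimate (3.14) can be verified by summing the geometric series explicitly. The following worked step is added for clarity; it uses that $q$ is odd and introduces the auxiliary symbol $z$, which does not appear in the original text:

```latex
\sum_{a=1}^{q-1}(-1)^{a}\mathbf{e}\left(-\frac{sa}{q}\right)
= \sum_{a=1}^{q-1}z^{a} = \frac{z^{q}-z}{z-1},
\qquad z := \mathbf{e}\left(\frac{1}{2}-\frac{s}{q}\right).
% For odd q and integer s, z^q = e(q/2 - s) = -1 and z \neq 1, hence
\left|\sum_{a=1}^{q-1}(-1)^{a}\mathbf{e}\left(-\frac{sa}{q}\right)\right|
= \frac{|1+z|}{|z-1|}
= \frac{2\left|\sin\frac{s}{q}\pi\right|}{2\left|\cos\frac{s}{q}\pi\right|}
\leqslant \frac{1}{\left|\cos\frac{s}{q}\pi\right|}.
```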

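    For brevity, combining Eqs (3.13)–(3.15), we denote

    \begin{align} R_{211} &: = q^{-2}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}(s, t, q)^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{1}{|\cos\frac{s}{q}\pi|}\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \mid d}}q^{\gamma-1}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{d^{2}}{q}t\|}\right) \\ &\ll q^{\gamma-3}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}(s, t, q)^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{1}{|\cos\frac{s}{q}\pi|}\sum\limits_{d \leqslant q^{\frac{1}{4}}}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{d^{2}}{q}t\|}\right) \\ &\ll q^{\gamma-3}\sum\limits_{u \mid q}u^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\sum\limits_{s = 1}^{\frac{q}{u}}\frac{1}{|\cos\frac{su}{q}\pi|}\sum\limits_{d \leqslant q^{\frac{1}{4}}}\sum\limits_{t = 1}^{\frac{q}{u}}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{d^{2}}{q/u}t\|}\right) \end{align}

    and

    R_{212} := q^{-2}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}(s, t, q)^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{1}{|\cos\frac{s}{q}\pi|}\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ 2 \nmid d}}q^{\gamma-1}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{1}{2}-\frac{d^{2}}{q}t\|}\right).

    Let

    \left(d^{2}, \frac{q}{u}\right) = r, \quad \left(\frac{d^{2}}{r}, \frac{q}{ur}\right) = 1.

    Making use of Lemma 2.5, we have

    \sum\limits_{t = 1}^{\frac{q}{u}}\min\left(\frac{q-1}{d^{2}}, \frac{1}{2\|\frac{d^{2}}{q/u}t\|}\right) \ll \left(\frac{q/u}{q/(ur)}+1\right)\left(\frac{q-1}{d^{2}}+\frac{q}{ur}\log q\right) \ll \frac{qr}{d^{2}}+\frac{q}{u}\log q.

    Inserting this into $R_{211}$, we get

    \begin{align} R_{211} &\ll q^{\gamma-3}\sum\limits_{u \mid q}u^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\sum\limits_{s = 1}^{\frac{q}{u}}\frac{1}{|1-2\frac{su}{q}|}\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ (d^{2}, \frac{q}{u}) = r}}\left(\frac{qr}{d^{2}}+\frac{q}{u}\log q\right) \\ &\ll q^{\gamma-3}\sum\limits_{u \mid q}u^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{q}{u}\log q\sum\limits_{\substack{d \leqslant q^{\frac{1}{4}} \\ r \mid d^{2}, \, r \mid \frac{q}{u}}}\left(\frac{qr}{d^{2}}+\frac{q}{u}\log q\right) \\ &\ll q^{\gamma-3}\sum\limits_{u \mid q}u^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{q}{u}\log q\sum\limits_{r \mid \frac{q}{u}}\sum\limits_{d \leqslant q^{\frac{1}{4}}r^{-\frac{1}{2}}}\left(\frac{q}{d^{2}}+\frac{q}{u}\log q\right) \\ &\ll q^{\gamma-3}\sum\limits_{u \mid q}u^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{q}{u}\log q\left(qd(q)+\frac{q^{\frac{5}{4}}}{u}d(q)\log q\right) \\ &\ll q^{\gamma-\frac{1}{4}}d^{3}(q)\log^{3} q. \end{align}

    By the same method as for $R_{211}$,

    R_{212} \ll q^{\gamma-\frac{1}{4}}d^{3}(q)\log^{3} q.

    From Eqs (3.10) and (3.11) and the estimates of $R_{211}$ and $R_{212}$, it follows that

    R_{21} \ll q^{\gamma-\frac{1}{4}}+R_{211}+R_{212} \ll q^{\gamma-\frac{1}{4}}d^{3}(q)\log^{3} q. (3.16)

    By methods similar to those for $R_{12}$ and $R_{21}$,

    R_{22} = R_{221}+O(R_{222}), (3.17)

    where

    R_{221} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)\left(\sum\limits_{0 < |h| \leqslant H}a_{h}\left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right)

    and

    R_{222} := \mathop{{\sum}^{\prime}}_{n = 1}^{q}(-1)^{n+\bar{n}}\mu^{2}(n)\left(\sum\limits_{|h| \leqslant H}b_{h}\left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)+\mathbf{e}(-n^{\gamma}h)\right)\right).

    It is obvious that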

    \begin{align} R_{221} = &\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}}\left(\sum\limits_{d^{2} \mid n}\mu(d)\right)\left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right) \\ = &\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{d \leqslant q^{\frac{1}{6}}}\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right) \\ & +\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{q^{\frac{1}{6}} < d \leqslant q^{\frac{1}{2}} }\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right). \end{align} (3.18)

    From the estimate

    \mathbf{e}\left(n^{\gamma}h-(n+1)^{\gamma}h\right)-1 \ll ( n^{\gamma} -(n+1)^{\gamma})h \ll \gamma n^{\gamma -1} h,

    by partial summation,

    \begin{align} &\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{q^{\frac{1}{6}} < d \leqslant q^{\frac{1}{2}} }\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right) \\ &\ll\mathop{{\sum}^{\prime}}_{n = 1}^{q} \mathop{\sum\limits_{d^{2} \mid n}}_{q^{\frac{1}{6}} < d \leqslant q^{\frac{1}{2}} } \left| \sum\limits_{0 < |h|\leqslant H}a_{h} \mathbf{e}(-n^{\gamma}h) \left(\mathbf{e}\left(n^{\gamma}h-(n+1)^{\gamma}h\right)-1\right)\right| \\ &\ll\mathop{{\sum}^{\prime}}_{n = 1}^{q} \mathop{\sum\limits_{d^{2} \mid n}}_{q^{\frac{1}{6}} < d \leqslant q^{\frac{1}{2}} } \gamma n^{\gamma-1}H \log H \\ & \ll q^{\gamma-\frac{1}{6}}H \log H . \end{align} (3.19)

    For the other term of $R_{221}$,

    \begin{align} &\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{d \leqslant q^{\frac{1}{6}} }\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right)\\ & = \mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{d \leqslant q^{\frac{1}{6}} }\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right) \\ &\quad\times \left(\frac{1}{q}\sum\limits_{a = 1}^{q}\sum\limits_{s = 1}^{q}\mathbf{e}(\frac{s(m-a)}{q}) \right)\left(\frac{1}{q}\sum\limits_{b = 1}^{q}\sum\limits_{t = 1}^{q}\mathbf{e}(\frac{t(n-b)}{q})\right) \\ & = \frac{1}{q^{2}}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q} \left( \mathop{\sum}_{nm \equiv 1 (\bmod q)} \mathbf{e}(\frac{sm+tn}{q}) \right)\left(\sum\limits_{a = 1}^{q-1}(-1)^{a}\mathbf{e}(-\frac{sa}{q})\right)\\ &\quad\times \left(\sum\limits_{b = 1}^{q-1}(-1)^{b}\mathbf{e}(-\frac{tb}{q}) \mathop{\sum\limits_{d^{2} \mid b}}_{d \leqslant q^{\frac{1}{6} } }\mu(d) \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(b+1)^{\gamma}h\right)-\mathbf{e}(-b^{\gamma}h)\right) \right ). \end{align} (3.20)

    We just need to give an estimation of the last part in (3.20). Similarly, let

    g (x) = \mathbf{e}\left(\left((d^{2}x)^{\gamma}-(d^{2}x+1)^{\gamma}\right)h \right)-1,

    then

    \begin{align} g (x) &\ll\ |h| (d^2x)^{\gamma-1}, \\ \frac{\partial g(x)}{\partial x} &\ll |h| d^{2\gamma-2}x^{\gamma-2}. \end{align}

    By partial summation,

    \begin{align} &\sum\limits_{b = 1}^{q-1}(-1)^{b}\mathbf{e}(-\frac{tb}{q}) \mathop{\sum\limits_{d^{2} \mid b}}_{d \leqslant q^{\frac{1}{6} } }\mu(d) \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(b+1)^{\gamma}h\right)-\mathbf{e}(-b^{\gamma}h)\right) \\ & = \sum\limits_{0 < |h|\leqslant H}a_{h} \sum\limits_{d \leqslant q^{ \frac{1}{6} }}\mu(d)\sum\limits_{1 \leqslant b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right)g(b), \\ & \ll\sum\limits_{0 < |h|\leqslant H}|h|^{-1} \sum\limits_{d \leqslant q^{ \frac{1}{6} }} \left|\int_{ 1}^{ \lfloor\frac{q-1}{d^2}\rfloor} g(x) \mathrm{d}\left(\sum\limits_{1 < b \leqslant x} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right) \right) \right|\\ & \ll\sum\limits_{0 < |h|\leqslant H}|h|^{-1} \sum\limits_{d \leqslant q^{ \frac{1}{6} }} \left|g( \lfloor\frac{q-1}{d^2}\rfloor) \sum\limits_{ 1 \leqslant b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right)\right|\\ &\quad +\sum\limits_{0 < |h|\leqslant H}|h|^{-1} \sum\limits_{d \leqslant q^{ \frac{1}{6} }} \left|\int_{ 1}^{ \lfloor\frac{q-1}{d^2}\rfloor}\frac{\partial g(x)}{\partial x} \sum\limits_{ 1 \leqslant b \leqslant x} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right) \mathrm{d}x\right|, \end{align}

    where

    g( \lfloor\frac{q-1}{d^2}\rfloor) \ll |h|q^{\gamma-1}

    and

    \begin{align} &\sum\limits_{ 1 \leqslant b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right) \\ & = \sum\limits_{ 1 \leqslant b \leqslant q^{\frac{1}{6}} } \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right)+\sum\limits_{ q^{\frac{1}{6}} < b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right). \end{align}

    It is obvious that

    \begin{align} \sum\limits_{ 1 \leqslant b \leqslant q^{\frac{1}{6}} } \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right)\ll q^{\frac{1}{6}}. \end{align}

    Suppose $q$ is large enough. For $b > q^{\frac{1}{6}}$, when $2 \nmid d$ or $q \nmid td^2$,

    \left\| (\frac{d^{2}}{2}-\frac{td^{2}}{q})-\gamma d^{2\gamma}b^{\gamma-1}h\right\|^{-1} \geqslant \frac{1}{2} \left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1} > 0,

    and applying Lemma 2.6, we have

    \begin{align} \sum\limits_{q^{\frac{1}{6}} < b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right) \ll& \left\| (\frac{d^{2}}{2}-\frac{td^{2}}{q})-\gamma d^{2\gamma}b^{\gamma-1}h\right\|^{-1} \\ \ll& \left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1} . \end{align}

    So

    \sum\limits_{ 1 \leqslant b \leqslant \lfloor\frac{q-1}{d^2}\rfloor} \mathbf{e}\left((\frac{d^{2}}{2}-\frac{td^{2}}{q})b-(d^{2}b)^{\gamma}h\right) \ll \begin{cases} q^{\frac{1}{6}}+ \left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1} , & 2 \nmid d \text{ or } q \nmid td^2;\\ \frac{q}{d^2}, & 2 \mid d \text{ and } q \mid td^2 ;\end{cases}

    which means

    \begin{align} &\sum\limits_{b = 1}^{q-1}(-1)^{b}\mathbf{e}(-\frac{tb}{q}) \mathop{\sum\limits_{d^{2} \mid b}}_{d \leqslant q^{\frac{1}{6} } }\mu(d) \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(b+1)^{\gamma}h\right)-\mathbf{e}(-b^{\gamma}h)\right) \\ &\ll\sum\limits_{0 < |h|\leqslant H} |h|^{-1}\mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \nmid d \text{ or } q \nmid td^2} |h|q^{\gamma-1} \left(q^{\frac{1}{6}}+ \left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1}\right) +\sum\limits_{0 < |h|\leqslant H} |h|^{-1}\mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \mid d \text{ and } q \mid td^2} |h|q^{\gamma} d^{-2} \\ &\ll H q^{ \gamma -1 }\left(q^{\frac{1}{3}}+ \mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \nmid d \text{ or } q \nmid td^2}\left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1}\right)+ H q^{ \gamma }\mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \mid d \text{ and } q \mid td^2}d^{-2} . \end{align}

    We denote

    \begin{align} T(c)&: = \sum\limits_{d\leqslant q^{\frac{1}{6} }} \# \left\{ (\frac{q}{u}-2t)d^{2} \equiv c (\bmod 2\frac{q}{u}), t\leqslant \frac{q}{u}\right\} \\ &\ll \sum\limits_{d\leqslant q^{\frac{1}{6} }} (\frac{q}{u}, d^{2}) \\ &\ll q^{\frac{1}{3} }d(q), \end{align}

    thus,

    \begin{align} &\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mathop{\sum\limits_{d^{2} \mid n}}_{d \leqslant q^{\frac{1}{6}} }\mu(d) \left( \sum\limits_{0 < |h|\leqslant H}a_{h} \left(\mathbf{e}\left(-(n+1)^{\gamma}h\right)-\mathbf{e}(-n^{\gamma}h)\right)\right)\\ & \ll q^{-2}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}(s,t,q)^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{H q^{ \gamma- 1}}{|\cos\frac{s}{q}\pi|}\left(q^{\frac{1}{ 3}}+ \mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \nmid d \text{ or } q \nmid td^2} \left\| (\frac{1}{2}-\frac{t }{q})d^{2}\right\|^{-1}\right) \\ &\quad+q^{-2}\sum\limits_{s = 1}^{q}\sum\limits_{t = 1}^{q}(s,t,q)^{\frac{1}{2}}d(q)q^{\frac{1}{2}}\frac{H q^{ \gamma}}{|\cos\frac{s}{q}\pi|}\mathop{\sum\limits_{d \leqslant q^{ \frac{1}{6} }}}_{2 \mid d \text{ and } q \mid td^2}d^{-2} \\ & \ll Hq^{ \gamma-\frac{5}{2}}\sum\limits_{u\mid q}u^{\frac{1}{2}} d(q) \sum\limits_{s = 1}^{\frac{q}{u}}\frac{1}{|1-2 \frac{su}{q} |} \sum\limits_{t = 1}^{\frac{q}{u}} \left(q^{\frac{1}{3}}+ \sum\limits_{d \leqslant q^{\frac{1}{6} }} \left\| (\frac{1}{2}-\frac{ut }{q})d^{2}\right\|^{-1} \right) \\ &\quad+Hq^{ \gamma-\frac{3}{2}} \sum\limits_{u\mid q}u^{\frac{1}{2}} d(q) \sum\limits_{s = 1}^{\frac{q}{u}}\frac{1}{|1-2 \frac{su}{q} |} \sum\limits_{d \leqslant q^{ \frac{1}{6} }}\mathop{\sum\limits_{t = 1}^{\frac{q}{u}}}_{ q \mid td^2}d^{-2} \\ & \ll H q^{ \gamma-\frac{1}{6}} d^2(q) \log q+ Hq^{ \gamma-\frac{1}{3}}d^2(q) \log q \\ &\quad+ Hq^{ \gamma- \frac{3}{2}}\sum\limits_{u\mid q}u^{-\frac{1}{2}} d(q) \log^{2}q \max\limits_{C} \sum\limits_{C < c\leqslant2C} \|\frac{C}{2\frac{q}{u}}\|^{-1} T(c) \\ &\ll Hq^{ \gamma -\frac{1}{6}}d^2(q) \log^2 q. \end{align} (3.21)

    With Eqs (3.18) and (3.19), we have

    \begin{align} R_{221}&\ll Hq^{ \gamma -\frac{1}{6}}d^2(q) \log^2 q+H q^{ \gamma -\frac{1}{6}} \log H. \end{align} (3.22)

    For $R_{222}$, the contribution from $h = 0$ can be bounded by methods similar to those for $R_{21}$, and the contribution from $h \neq 0$ can be bounded by methods similar to those for $R_{221}$. Taking

    H = \log q ,

    we obtain

    \begin{align} R_{222}& = b_{0} \mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}} \mu^{2}(n) +\mathop{{\sum}^{\prime}}_{n = 1}^{q} (-1)^{n+\bar{n}}\mu^{2}(n)\left(\sum\limits_{0 < |h|\leqslant H}b_{h} \left(e\left(-(n+1)^{\gamma}h\right)+e\left(-n^{\gamma}h\right)\right)\right) \\ &\ll H^{-1}q^{\frac{3}{4}}d^{3}(q)\log^2 q +Hq^{ \gamma -\frac{1}{6}}d^2(q) \log^2 q\\ &\ll q^{\frac{3}{4}}d^{3}(q)\log q+ q^{ \gamma -\frac{1}{6}}d^{2}(q) \log^3 q . \end{align} (3.23)

    Following from Eqs (3.16), (3.22), and (3.23),

    \begin{align} R_{2}\ll q^{\gamma-\frac{1}{6}} d^{2}(q) \log^3 q+ q^{\frac{3}{4}}d^{3}(q)\log q . \end{align} (3.24)

    Hence, from Eqs (3.1), (3.8), and (3.24), we derive that

    \begin{align} R(c;q) = &\frac{3}{\pi^{2}}\prod\limits_{p\mid q}(1+p^{-1})^{-1}q^{\gamma}+O\left( \prod\limits_{p\mid q}(1-p^{-\frac{1}{2}}) ^{-1} q^{\gamma-\frac{1}{2}}\right)\\ &+O\left(q^{\frac{7}{13}\gamma+\frac{4}{13}}\prod\limits_{p\mid q}(1-p^{-\frac{1}{2}})^{-1}\log q\right)+O\left( q^{\frac{3}{4}}d^{3}(q)\log q \right)+O(q^{ \gamma -\frac{1}{6}} d^{2}(q)\log^3 q). \end{align}

    We need the error terms to be smaller than the main term, so

    \begin{cases} \frac{7}{13}\gamma+\frac{4}{13} < \gamma ,\\ \frac{3}{4} < \gamma ,\end{cases}

    which means the range of $c$ is $(1, \frac{4}{3})$. The reason why the range of $c$ is changed is that $R(c; q)$ requires $q$ to be large enough.
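    The two conditions can be solved for $c = 1/\gamma$ directly. This short check is added as a worked step and compares exponents of $q$ only, ignoring the divisor-function and logarithmic factors:

```latex
\frac{7}{13}\gamma+\frac{4}{13} < \gamma
\iff \frac{4}{13} < \frac{6}{13}\gamma
\iff \gamma > \frac{2}{3}
\iff c < \frac{3}{2},
\qquad
\frac{3}{4} < \gamma \iff c < \frac{4}{3}.
% The exponents \gamma - 1/2 and \gamma - 1/6 are smaller than \gamma for every c,
% so the binding constraint is the second one, giving 1 < c < 4/3.
```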

    In this paper, we generalize the Lehmer problem by considering the count of square-free numbers in the intersection of the Lehmer set and the Piatetski-Shapiro sequence when $q$ is a sufficiently large odd integer. Using exponential sums and Kloosterman sums, we study its asymptotic properties and give an asymptotic formula as $q$ tends to infinity.

    Based on this result, we will consider some distribution problems similar to the Lehmer problem with more special sequences, which is significant for understanding the distribution properties of those problems.

    Xiaoqing Zhao: calculations, writing and editing; Yuan Yi: methodology and reviewing. All authors have read and agreed to the published version of the manuscript.

    In preparing this manuscript, we employed the language model ChatGPT-4 for the purpose of grammatical corrections. It did not influence the calculations and conclusion in this paper.

    This work is supported by Natural Science Foundation No.12271422 of China. The authors would like to express their gratitude to the referee for very helpful and detailed comments.

    The authors declare that there are no conflicts of interest regarding the publication of this paper.



    [1] T. Wang, P. Xuan, Z. Liu, T. Zhang, Assistant diagnosis with Chinese electronic medical records based on CNN and BiLSTM with phrase-level and word-level attentions, BMC Bioinf. , 21 (2020). https://doi.org/10.1186/s12859-020-03554-x doi: 10.1186/s12859-020-03554-x
    [2] J. Tsai, G. Bond, A comparison of electronic records to paper records in mental health centers, Int. J. Qual. Health Care, 20 (2008), 136–143. https://doi.org/10.1093/intqhc/mzm064 doi: 10.1093/intqhc/mzm064
    [3] Y. Hu, Research on the information diagnostic technology based on medical information, University of Electronic Science and Technology of China, 2015.
    [4] Z. Obermeyer, E. J. Emanuel, Predicting the future—big data, machine learning, and clinical medicine, N. Engl. J. Med. , 375 (2016), 1216–1219. https://doi.org/10.1056/NEJMp1606181 doi: 10.1056/NEJMp1606181
    [5] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature, 521 (2015), 436–444. https://doi.org/10.1038/nature14539 doi: 10.1038/nature14539
    [6] J. Yang, Y. Guan, B. He, C. Qu, Q. Yu, Y. Liu, et al., Corpus construction for named entities and entity relations on chinese electronic medical records, J. Softw. , 27 (2016), 2725–2746.
    [7] L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE, 77 (1989), 257–286. https://doi.org/10.1109/5.18626 doi: 10.1109/5.18626
    [8] A. Roberts, R. Gaizauskas, M. Hepple, Extracting clinical relationships from patient narratives, in Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, (2008), 10–18. https://doi.org/10.3115/1572306.1572309
    [9] J. Lafferty, A. McCallum, F. C. N. Pereira, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, in Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001), (2001), 282–289. https://repository.upenn.edu/handle/20.500.14332/6188
    [10] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. , 9 (1997), 1735–1780.
    [11] J. Devlin, M. W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, preprint, arXiv: 1810.04805, 2018.
    [12] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, et al., Deep contextualized word representations, Assoc. Comput. Linguist. , 1 (2018), 2227–2237. https://doi.org/10.18653/v1/N18-1202 doi: 10.18653/v1/N18-1202
    [13] T. Young, D. Hazarika, S. Poria, E. Cambria, Recent trends in deep learning based natural language processing, IEEE Comput. Intell. Mag. , 13 (2018), 55–75. https://doi.org/10.1109/MCI.2018.2840738 doi: 10.1109/MCI.2018.2840738
    [14] L. Ouyang, Y. Tian, H. Tang, B. Zhang, Chinese named entity recognition based on B-LSTM neural network with additional features, in International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage, (2017), 269–279. https://doi.org/10.1007/978-3-319-72389-1_22
    [15] Y. Xiang, Chinese named entity recognition with character-word mixed embedding, in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, (2017), 2055–2058.
    [16] H. Yang, H. Gao, Toward sustainable virtualized healthcare: Extracting medical entities from Chinese online health consultations using deep neural networks, Sustainability, 10 (2018), 3292. https://doi.org/10.3390/su10093292 doi: 10.3390/su10093292
    [17] W. Zhang, S. Jiang, S. Zhao, K. Hou, Y. Liu, L. Zhang, A BERT-BiLSTM-CRF model for Chinese electronic medical records named entity recognition, in 2019 12th International Conference on Intelligent Computation Technology and Automation (ICICTA), (2019), 166–169. https://doi.org/10.1109/ICICTA49267.2019.00043
    [18] X. Zhang, Y. Zhang, Q. Zhang, Y. Ren, T. Qiu, J. Ma, et al., Extracting comprehensive clinical information for breast cancer using deep learning methods, Int. J. Med. Inf. , 132 (2019), 103985.
    [19] L. Li, L. Jin, Y. Jiang, D. Huang, Recognizing biomedical named entities based on the sentence vector/twin word embeddings conditioned bidirectional LSTM, in Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, (2016), 165–176. https://doi.org/10.1007/978-3-319-47674-2_15
    [20] M. Habibi, L. Weber, M. Neves, D. L. Wiegandt, U. Leser, Deep learning with word embeddings improves biomedical named entity recognition, Bioinformatics, 33 (2017), i37–i48. https://doi.org/10.1093/bioinformatics/btx228 doi: 10.1093/bioinformatics/btx228
    [21] J. P. C. Chiu, E. Nichols, Named entity recognition with bidirectional LSTM-CNNs, Trans. Assoc. Comput. Linguist., 4 (2016), 357–370. https://doi.org/10.1162/tacl_a_00104 doi: 10.1162/tacl_a_00104
    [22] L. Li, Y. Guo, Biomedical named entity recognition with CNN-BLSTM-CRF, J. Chin. Inf. Newsp., (2018), 116–122.
    [23] D. S. Sachan, P. Xie, M. Sachan, P. Xing, Effective use of bidirectional language modeling for transfer learning in biomedical named entity recognition, in Proceedings of the 3rd Machine Learning for Healthcare Conference, (2018), 383–402.
    [24] E. F. Tjong K. Sang, J. Veenstra, in Proceedings of the Ninth Conference on European Chapter of the Association for Computational Linguistics, (1999), 173–179. https://doi.org/10.3115/977035.977059
    [25] X. Dong, S. Chowdhury, L. Qian, Y. Guan, J. Yang, Q. Yu, Transfer bi-directional LSTM rnn for named entity recognition in chinese electronic medical records, in 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom), (2017), 12–15. https://doi.org/10.1109/HealthCom.2017.8210840
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)