Research article Special Issues

LangMoDHS: A deep learning language model for predicting DNase I hypersensitive sites in mouse genome


  • Received: 22 July 2022 Revised: 12 September 2022 Accepted: 18 September 2022 Published: 24 October 2022
  • DNase I hypersensitive sites (DHSs) are specific genomic regions that are critical for detecting and understanding cis-regulatory elements. Although many methods have been developed to detect DHSs, a large gap remains in practice. We presented a deep learning-based language model for predicting DHSs, named LangMoDHS. LangMoDHS mainly comprised a convolutional neural network (CNN), a bi-directional long short-term memory (Bi-LSTM) network and feed-forward attention. The CNN and the Bi-LSTM were stacked in parallel, which helped to accumulate multiple-view representations from primary DNA sequences. We conducted 5-fold cross-validation and independent tests over 14 tissues and 4 developmental stages. The experiments showed that LangMoDHS is competitive with or slightly better than iDHS-Deep, the latest method for predicting DHSs. The experiments also indicated a substantial contribution of the CNN, the Bi-LSTM, and the attention to DHS prediction. We implemented LangMoDHS as a user-friendly web server, which is accessible at http://www.biolscience.cn/LangMoDHS/. We used indices related to information entropy to explore the sequence motifs of DHSs, and the analysis provided some insight into DHSs.

    Citation: Xingyu Tang, Peijie Zheng, Yuewu Liu, Yuhua Yao, Guohua Huang. LangMoDHS: A deep learning language model for predicting DNase I hypersensitive sites in mouse genome[J]. Mathematical Biosciences and Engineering, 2023, 20(1): 1037-1057. doi: 10.3934/mbe.2023048




    Let $\mathcal{A}$ denote the class of functions of the form

    $$f(z)=z+a_2z^2+a_3z^3+a_4z^4+\cdots, \tag{1.1}$$

    which are analytic in the open unit disk $\mathbb{D}=\{z:|z|<1\}$ and normalized by $f(0)=0$ and $f'(0)=1$. Recall that $S\subset\mathcal{A}$ is the class of univalent functions in $\mathbb{D}$, with the starlike and convex functions as subclasses whose geometric conditions are $\mathrm{Re}\left(\frac{zf'(z)}{f(z)}\right)>0$ and $\mathrm{Re}\left(1+\frac{zf''(z)}{f'(z)}\right)>0$, respectively. These two well-known subclasses have been used to define many further subclasses of analytic functions from different perspectives, and the resulting literature is extensive.

    A function $f$ is said to be subordinate to a function $g$, written $f\prec g$, if there exists a Schwarz function $w(z)$ such that

    $$f(z)=g(w(z)),\quad z\in\mathbb{D}, \tag{1.2}$$

    where $w(0)=0$ and $|w(z)|<1$ for $z\in\mathbb{D}$. Let $P$ denote the class of analytic functions $p$ such that $p(0)=1$ and $p(z)\prec\frac{1+z}{1-z}$, $z\in\mathbb{D}$. See [1] for details.

    Goodman [2] proposed the concept of a conic domain to generalize convex functions, which produced the first parabolic region as the image domain of an analytic function. The same author introduced and studied the class $UCV$ of uniformly convex functions, which satisfy

    $$\mathrm{Re}\left\{1+(z-\psi)\frac{f''(z)}{f'(z)}\right\}>0,\quad (z,\psi\in\mathbb{D}).$$

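    Subsequently, Ma and Minda [3] studied the following characterization of $UCV$:

    $$\mathrm{Re}\left\{1+\frac{zf''(z)}{f'(z)}\right\}>\left|\frac{zf''(z)}{f'(z)}\right|,\quad z\in\mathbb{D}. \tag{1.3}$$

    The characterization studied in [3] gave rise to the first parabolic region, of the form

    $$\Omega=\{w:\mathrm{Re}(w)>|w-1|\}, \tag{1.4}$$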

    which was later generalized by Kanas and Wisniowska ([5,6]) to

    $$\Omega_k=\{w:\mathrm{Re}(w)>k|w-1|,\ k\ge 0\}. \tag{1.5}$$

    The domain $\Omega_k$ represents the right half-plane for $k=0$, a hyperbolic region for $0<k<1$, a parabolic region for $k=1$ and an elliptic region for $k>1$ [30].
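    The case distinction above can be sketched in a few lines of Python. This is only an illustration: the function names `in_conic_region` and `region_type` are hypothetical, and the membership test is the defining inequality Re(w) > k|w - 1| evaluated in floating point.

```python
def in_conic_region(w: complex, k: float) -> bool:
    """Return True when w satisfies Re(w) > k|w - 1| for a given k >= 0."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return w.real > k * abs(w - 1)


def region_type(k: float) -> str:
    """Name the shape of the conic domain for a given k >= 0."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return "right half-plane"
    if k < 1:
        return "hyperbolic region"
    if k == 1:
        return "parabolic region"
    return "elliptic region"
```

    For instance, the point w = 1 lies in the domain for every k, since the right-hand side of the inequality vanishes there.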

    The generalized conic region (1.5) has been studied by many researchers, with many interesting results; see, for example, Malik [7] and Malik et al. [8].

    Moreover, the conic domain $\Omega$ was generalized by Noor and Malik [9] to the domain $\Omega[A,B]$, $-1\le B<A\le 1$, given by

    $$\Omega[A,B]=\Big\{u+iv:\left[(B^2-1)(u^2+v^2)-2(AB-1)u+(A^2-1)\right]^2>\left[-2(B+1)(u^2+v^2)+2(A+B+2)u-2(A+1)\right]^2+4(A-B)^2v^2\Big\},$$

    which is called the petal type region.
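    A point-wise membership test for the petal type region can be sketched as follows. The sketch assumes that $w$ belongs to $\Omega[A,B]$ exactly when $q=\frac{(B-1)w-(A-1)}{(B+1)w-(A+1)}$ satisfies $\mathrm{Re}(q)>|q-1|$, which is the form of condition used in (1.7) and (1.8); the helper names `janowski_image` and `in_petal_region` are hypothetical.

```python
def janowski_image(w: complex, A: float, B: float) -> complex:
    """Map w to q = ((B-1)w - (A-1)) / ((B+1)w - (A+1)).

    Raises ZeroDivisionError when (B+1)w equals A+1.
    """
    return ((B - 1) * w - (A - 1)) / ((B + 1) * w - (A + 1))


def in_petal_region(w: complex, A: float, B: float) -> bool:
    """Assumed membership test: Re(q) > |q - 1| with q = janowski_image(w)."""
    if not (-1 <= B < A <= 1):
        raise ValueError("parameters must satisfy -1 <= B < A <= 1")
    q = janowski_image(w, A, B)
    return q.real > abs(q - 1)
```

    With A = 1 and B = -1 the map reduces to q = w, so the test falls back to the parabolic condition Re(w) > |w - 1| of (1.4).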

    A function $p(z)$ is said to be in the class $UP[A,B]$ if and only if

    $$p(z)\prec\frac{(A+1)\tilde p(z)-(A-1)}{(B+1)\tilde p(z)-(B-1)}, \tag{1.6}$$

    where $\tilde p(z)=1+\frac{2}{\pi^2}\left(\log\frac{1+\sqrt z}{1-\sqrt z}\right)^2$.

    Taking $A=1$ and $B=-1$ in (1.8), the usual classes of functions studied by Goodman [1] and Kanas ([5,6]) are obtained.

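    Furthermore, the classes $UCV[A,B]$ and $ST[A,B]$ of uniformly Janowski convex and starlike functions consist of the functions satisfying

    $$\mathrm{Re}\left(\frac{(B-1)\frac{(zf'(z))'}{f'(z)}-(A-1)}{(B+1)\frac{(zf'(z))'}{f'(z)}-(A+1)}\right)>\left|\frac{(B-1)\frac{(zf'(z))'}{f'(z)}-(A-1)}{(B+1)\frac{(zf'(z))'}{f'(z)}-(A+1)}-1\right| \tag{1.7}$$

    and

    $$\mathrm{Re}\left(\frac{(B-1)\frac{zf'(z)}{f(z)}-(A-1)}{(B+1)\frac{zf'(z)}{f(z)}-(A+1)}\right)>\left|\frac{(B-1)\frac{zf'(z)}{f(z)}-(A-1)}{(B+1)\frac{zf'(z)}{f(z)}-(A+1)}-1\right|, \tag{1.8}$$

    respectively, or equivalently

    $$\frac{(zf'(z))'}{f'(z)}\in UP[A,B]$$

    and

    $$\frac{zf'(z)}{f(z)}\in UP[A,B].$$

    Setting $A=1$ and $B=-1$ in (1.7) and (1.8), we obtain the classes of functions investigated by Goodman [2] and Ronning [10].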

    The Fekete-Szegö problem is that of maximizing the non-linear functional $|a_3-\lambda a_2^2|$ over various subclasses of univalent functions. For its history, we refer the reader to [11,12,13,14].

    The error function arose from the normal curve and shows up wherever the normal curve appears. It occurs in diffusion, which is a part of transport phenomena, and it is also useful in biology, mass flow, chemistry, physics and thermomechanics. Abramowitz [15] expanded the error function into the Maclaurin series

    $$\mathrm{erf}(z)=\frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt=\frac{2}{\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{(-1)^n z^{2n+1}}{(2n+1)\,n!}. \tag{1.9}$$
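    As a quick numerical sanity check of the Maclaurin expansion in (1.9), the partial sums can be compared with the error function from the Python standard library. The function name `erf_series` and the default of 30 terms are illustrative choices, not part of the paper.

```python
import math


def erf_series(z: float, terms: int = 30) -> float:
    """Partial sum of (2/sqrt(pi)) * sum_{n>=0} (-1)^n z^(2n+1) / ((2n+1) n!)."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * z ** (2 * n + 1) / ((2 * n + 1) * math.factorial(n))
    return 2.0 / math.sqrt(math.pi) * total


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0):
        print(z, erf_series(z), math.erf(z))
```

    For moderate real arguments the partial sums agree with `math.erf` to near machine precision; for large |z| the alternating series loses accuracy in floating point and a different evaluation method would be preferable.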

    The properties and inequalities of the error function were studied in [16] and [4], while the zeros of the complementary error function

    $$\mathrm{erfc}(z)=1-\mathrm{erf}(z)=\frac{2}{\sqrt{\pi}}\int_z^{\infty}e^{-t^2}\,dt \tag{1.10}$$

    were investigated in [17]; see [18,19] for more details. More recently, [20,21,22] and [23] applied error functions in numerical analysis.

    For $f$ given by (1.1) and $g$ of the form $g(z)=z+b_2z^2+b_3z^3+\cdots$, their Hadamard product (convolution), denoted by $f*g$, is defined as

    $$(f*g)(z)=z+\sum_{n=2}^{\infty}a_nb_nz^n. \tag{1.11}$$

    Let $Erf$ be the normalized analytic function obtained from (1.9) and given by

    $$Erf(z)=\frac{\sqrt{\pi z}}{2}\,\mathrm{erf}(\sqrt z)=z+\sum_{n=2}^{\infty}\frac{(-1)^{n-1}z^n}{(2n-1)(n-1)!}. \tag{1.12}$$

    Therefore, applying the convolution (1.11) to (1.1) and (1.12), we obtain

    $$\epsilon=\mathcal{A}*Erf=\left\{F:F(z)=(f*Erf)(z)=z+\sum_{n=2}^{\infty}\frac{(-1)^{n-1}a_nz^n}{(2n-1)(n-1)!},\ f\in\mathcal{A}\right\}, \tag{1.13}$$

    where $Erf$ is the class that consists of the single function $Erf$. See the concept in Kanas et al. [18] and Ramachandran et al. [19].
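    The coefficients appearing in (1.12) and (1.13) are easy to tabulate exactly. The sketch below assumes the coefficient of $z^n$ in the normalized function is $(-1)^{n-1}/((2n-1)(n-1)!)$ and that the convolution multiplies coefficients term by term, as in (1.11); the names `erf_coefficient` and `convolve_with_erf` are hypothetical.

```python
import math
from fractions import Fraction


def erf_coefficient(n: int) -> Fraction:
    """Coefficient of z^n in the normalized error function, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Fraction((-1) ** (n - 1), (2 * n - 1) * math.factorial(n - 1))


def convolve_with_erf(a):
    """Given [a_2, a_3, ...], return the z^2, z^3, ... coefficients of f * Erf."""
    return [erf_coefficient(n) * Fraction(a_n) for n, a_n in enumerate(a, start=2)]


if __name__ == "__main__":
    # First few coefficients: 1, -1/3, 1/10, -1/42
    print([erf_coefficient(n) for n in range(1, 5)])
```

    In particular the convolved function begins $F(z)=z-\frac{a_2}{3}z^2+\frac{a_3}{10}z^3-\cdots$, which is the expansion used when comparing coefficients later on.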

    Babalola [24] introduced and studied the class of $\lambda$-pseudo starlike functions of order $\beta$ $(0\le\beta<1)$, which satisfy the condition

    $$\mathrm{Re}\left(\frac{z(f'(z))^{\lambda}}{f(z)}\right)>\beta, \tag{1.14}$$

    where $\lambda\ge 1$ $(z\in\mathbb{D})$, and denoted it by $\mathcal{L}_{\lambda}(\beta)$. We observe from (1.14) that for $\lambda=2$ the geometric condition is the product of a function of bounded turning and a starlike function, namely

    $$\mathrm{Re}\left(f'(z)\,\frac{zf'(z)}{f(z)}\right)>\beta.$$

    Olatunji [25] extended the class $\mathcal{L}_{\lambda}(\beta)$ to $\mathcal{L}_{\lambda}^{\beta}(s,t,\Phi)$, whose geometric condition is

    $$\mathrm{Re}\left(\frac{(s-t)z(f'(z))^{\lambda}}{f(sz)-f(tz)}\right)>\beta,$$

    where $s,t\in\mathbb{C}$, $s\ne t$, $\lambda\ge 1$, $0\le\beta<1$, $z\in\mathbb{D}$ and $\Phi(z)$ is the modified sigmoid function. The initial coefficient bounds were obtained and the corresponding Fekete-Szegö inequalities were derived. The contributions of Altinkaya and Özkan [26], Murugusundaramoorthy and Janani [27] and Murugusundaramoorthy et al. [28] to the study of $\lambda$-pseudo starlike functions should also be noted.

    Inspired by the earlier work in [18,19,29], the authors employ the approach of [13] to study coefficient inequalities for certain pseudo subclasses of analytic functions related to the petal type region and defined by the error function. The first few coefficient bounds and the corresponding Fekete-Szegö inequalities are obtained for the classes of functions defined. The results obtained here are new, and varying the parameters involved yields corollaries.

    The following lemmas and definitions are needed for the main results.

    Lemma 1.1. If $p(z)=1+p_1z+p_2z^2+\cdots$ is a function with positive real part in $\mathbb{D}$, then, for any complex $\mu$,

    $$|p_2-\mu p_1^2|\le 2\max\{1,|2\mu-1|\},$$

    and the result is sharp for the functions

    $$p_0(z)=\frac{1+z}{1-z}\quad\text{or}\quad p(z)=\frac{1+z^2}{1-z^2}\quad(z\in\mathbb{D}).$$

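    Lemma 1.2. [29] Let $p\in UP[A,B]$, $-1\le B<A\le 1$, be of the form $p(z)=1+\sum_{n=1}^{\infty}p_nz^n$. Then, for a complex number $\mu$, we have

    $$|p_2-\mu p_1^2|\le\frac{4}{\pi^2}(A-B)\max\left(1,\left|\frac{4}{\pi^2}(B+1)-\frac{2}{3}+4\mu\left(\frac{A-B}{\pi^2}\right)\right|\right). \tag{1.15}$$

    The result is sharp and the equality in (1.15) holds for the functions

    $$p_1(z)=\frac{\frac{2(A+1)}{\pi^2}\left(\log\frac{1+\sqrt z}{1-\sqrt z}\right)^2+2}{\frac{2(B+1)}{\pi^2}\left(\log\frac{1+\sqrt z}{1-\sqrt z}\right)^2+2}$$

    or

    $$p_2(z)=\frac{\frac{2(A+1)}{\pi^2}\left(\log\frac{1+z}{1-z}\right)^2+2}{\frac{2(B+1)}{\pi^2}\left(\log\frac{1+z}{1-z}\right)^2+2}.$$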

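    Proof. For $h\in P$ of the form $h(z)=1+\sum_{n=1}^{\infty}c_nz^n$, we consider

    $$h(z)=\frac{1+w(z)}{1-w(z)},$$

    where $w(z)$ is such that $w(0)=0$ and $|w(z)|<1$. It follows easily that

    $$w(z)=\frac{h(z)-1}{h(z)+1}=\frac{c_1}{2}z+\left(\frac{c_2}{2}-\frac{c_1^2}{4}\right)z^2+\left(\frac{c_3}{2}-\frac{c_2c_1}{2}+\frac{c_1^3}{8}\right)z^3+\cdots. \tag{1.16}$$

    Now, if $\tilde p(z)=1+R_1z+R_2z^2+\cdots$, then from (1.16) one has

    $$\tilde p(w(z))=1+R_1w(z)+R_2(w(z))^2+R_3(w(z))^3+\cdots, \tag{1.17}$$

    where $R_1=\frac{8}{\pi^2}$, $R_2=\frac{16}{3\pi^2}$ and $R_3=\frac{184}{45\pi^2}$, see [30]. Substituting $R_1$, $R_2$ and $R_3$ into (1.17) gives

    $$\tilde p(w(z))=1+\frac{4c_1}{\pi^2}z+\frac{4}{\pi^2}\left(c_2-\frac{c_1^2}{6}\right)z^2+\frac{4}{\pi^2}\left(c_3-\frac{c_1c_2}{3}+\frac{2c_1^3}{45}\right)z^3+\cdots. \tag{1.18}$$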
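    The passage from (1.16) and (1.17) to (1.18) is a truncated power-series composition, which can be checked with exact rational arithmetic. The sketch below pulls the common factor $1/\pi^2$ out of $R_1$, $R_2$, $R_3$ (so it works with 8, 16/3 and 184/45) and compares the composed coefficients with $4c_1$, $4(c_2-c_1^2/6)$ and $4(c_3-c_1c_2/3+2c_1^3/45)$. All helper names are hypothetical.

```python
from fractions import Fraction

N = 4  # keep the coefficients of z^0 .. z^3


def mul(a, b):
    """Product of two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out


def div(num, den):
    """Quotient num/den of truncated power series (den[0] must be non-zero)."""
    q = [Fraction(0)] * N
    for n in range(N):
        acc = num[n] - sum(q[i] * den[n - i] for i in range(n))
        q[n] = acc / den[0]
    return q


def composed_coefficients(c1, c2, c3):
    """z, z^2, z^3 coefficients of pi^2 * (p~(w(z)) - 1) with w = (h-1)/(h+1)."""
    w = div([Fraction(0), c1, c2, c3], [Fraction(2), c1, c2, c3])
    w2 = mul(w, w)
    w3 = mul(w2, w)
    r1, r2, r3 = Fraction(8), Fraction(16, 3), Fraction(184, 45)
    return [r1 * w[n] + r2 * w2[n] + r3 * w3[n] for n in (1, 2, 3)]


def stated_coefficients(c1, c2, c3):
    """The same three coefficients as written in the closed form above."""
    return [4 * c1, 4 * (c2 - c1 ** 2 / 6), 4 * (c3 - c1 * c2 / 3 + 2 * c1 ** 3 / 45)]
```

    Because the arithmetic is exact, agreement for a few arbitrary rational values of $c_1$, $c_2$, $c_3$ gives good evidence that the three polynomial coefficients match.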

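    Since $p\in UP[A,B]$, it follows from relations (1.16), (1.17) and (1.18) that

    $$p(z)=\frac{(A+1)\tilde p(w(z))-(A-1)}{(B+1)\tilde p(w(z))-(B-1)}=\frac{2+(A+1)\frac{4}{\pi^2}c_1z+(A+1)\frac{4}{\pi^2}\left(c_2-\frac{c_1^2}{6}\right)z^2+\cdots}{2+(B+1)\frac{4}{\pi^2}c_1z+(B+1)\frac{4}{\pi^2}\left(c_2-\frac{c_1^2}{6}\right)z^2+\cdots}.$$

    This implies that

    $$p(z)=1+\frac{2(A-B)c_1}{\pi^2}z+\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)c_1^2}{\pi^2}\right)z^2+\frac{8(A-B)}{\pi^2}\left[\left(\frac{(B+1)^2}{\pi^4}+\frac{B+1}{6\pi^2}+\frac{1}{90}\right)c_1^3-\left(\frac{B+1}{\pi^2}+\frac{1}{12}\right)c_1c_2+\frac{c_3}{4}\right]z^3+\cdots. \tag{1.19}$$

    If $p(z)=1+\sum_{n=1}^{\infty}p_nz^n$, then equating the coefficients of $z$ and $z^2$ gives

    $$p_1=\frac{2}{\pi^2}(A-B)c_1$$

    and

    $$p_2=\frac{2}{\pi^2}(A-B)\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)c_1^2}{\pi^2}\right).$$

    Now, for a complex number $\mu$, consider

    $$p_2-\mu p_1^2=\frac{2(A-B)}{\pi^2}\left[c_2-c_1^2\left(\frac{1}{6}+\frac{2(B+1)}{\pi^2}+\frac{2\mu(A-B)}{\pi^2}\right)\right].$$

    This implies that

    $$|p_2-\mu p_1^2|=\frac{2(A-B)}{\pi^2}\left|c_2-c_1^2\left(\frac{1}{6}+\frac{2(B+1)}{\pi^2}+\frac{2\mu(A-B)}{\pi^2}\right)\right|.$$

    Using Lemma 1.1, one has

    $$|p_2-\mu p_1^2|\le\frac{4(A-B)}{\pi^2}\max\{1,|2v-1|\},$$

    where $v=\frac{1}{6}+\frac{2(B+1)}{\pi^2}+\frac{2\mu(A-B)}{\pi^2}$, which completes the proof of the Lemma.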
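    The last step of the proof can be checked numerically: with $v=\frac{1}{6}+\frac{2(B+1)}{\pi^2}+\frac{2\mu(A-B)}{\pi^2}$, the quantity $2v-1$ should coincide with the expression inside the maximum in (1.15). The short sketch below evaluates the bound both ways; the function names are hypothetical and the sample parameter values are arbitrary.

```python
import math


def bound_from_v(A: float, B: float, mu: complex) -> float:
    """Bound (4(A-B)/pi^2) * max(1, |2v - 1|) with v as defined in the proof."""
    v = 1 / 6 + 2 * (B + 1) / math.pi ** 2 + 2 * mu * (A - B) / math.pi ** 2
    return 4 * (A - B) / math.pi ** 2 * max(1.0, abs(2 * v - 1))


def bound_as_stated(A: float, B: float, mu: complex) -> float:
    """Bound written directly in the form of the lemma statement."""
    inner = 4 * (B + 1) / math.pi ** 2 - 2 / 3 + 4 * mu * (A - B) / math.pi ** 2
    return 4 * (A - B) / math.pi ** 2 * max(1.0, abs(inner))


if __name__ == "__main__":
    for params in [(1.0, -1.0, 0.0), (0.5, -0.5, 2.0), (0.8, 0.1, 1 + 2j)]:
        print(params, bound_from_v(*params), bound_as_stated(*params))
```

    For $A=1$, $B=-1$ and $\mu=0$ both forms give $8/\pi^2$, since the maximum is attained by 1.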

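    Definition 1.3. A function $F\in\mathcal{A}$ is said to be in the class $UCV[\lambda,A,B]$, $-1\le B<A\le 1$, if and only if

    $$\mathrm{Re}\left(\frac{(B-1)\frac{(z(F'(z))^{\lambda})'}{F'(z)}-(A-1)}{(B+1)\frac{(z(F'(z))^{\lambda})'}{F'(z)}-(A+1)}\right)>\left|\frac{(B-1)\frac{(z(F'(z))^{\lambda})'}{F'(z)}-(A-1)}{(B+1)\frac{(z(F'(z))^{\lambda})'}{F'(z)}-(A+1)}-1\right|, \tag{1.20}$$

    where $\lambda\ge 1$ is real, or equivalently $\frac{(z(F'(z))^{\lambda})'}{F'(z)}\in UP[A,B]$.

    Definition 1.4. A function $F\in\mathcal{A}$ is said to be in the class $US[\lambda,A,B]$, $-1\le B<A\le 1$, if and only if

    $$\mathrm{Re}\left(\frac{(B-1)\frac{z(F'(z))^{\lambda}}{F(z)}-(A-1)}{(B+1)\frac{z(F'(z))^{\lambda}}{F(z)}-(A+1)}\right)>\left|\frac{(B-1)\frac{z(F'(z))^{\lambda}}{F(z)}-(A-1)}{(B+1)\frac{z(F'(z))^{\lambda}}{F(z)}-(A+1)}-1\right|, \tag{1.21}$$

    where $\lambda\ge 1$ is real, or equivalently $\frac{z(F'(z))^{\lambda}}{F(z)}\in UP[A,B]$.

    Definition 1.5. A function $F\in\mathcal{A}$ is said to be in the class $UM_{\alpha}[\lambda,A,B]$, $-1\le B<A\le 1$, if and only if

    $$\mathrm{Re}\left(\frac{(B-1)\left[(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}\right]-(A-1)}{(B+1)\left[(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}\right]-(A+1)}\right)>\left|\frac{(B-1)\left[(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}\right]-(A-1)}{(B+1)\left[(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}\right]-(A+1)}-1\right|,$$

    where $\alpha\ge 0$ and $\lambda\ge 1$ are real, or equivalently $(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}\in UP[A,B]$.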

    In this section, we shall state and prove the main results, and several corollaries can easily be deduced under various conditions.

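    Theorem 2.1. Let $F\in US[\lambda,A,B]$, $-1\le B<A\le 1$, be of the form (1.13). Then, for a real number $\mu$, we have

    $$|a_3-\mu a_2^2|\le\frac{40(A-B)}{|1-3\lambda|\pi^2}\max\left\{1,\left|\frac{4(B+1)}{\pi^2}-\frac{1}{3}-\frac{2(A-B)}{(1-2\lambda)^2\pi^2}\left(2(2\lambda^2-4\lambda+1)-\frac{9\mu(1-3\lambda)}{5}\right)\right|\right\}.$$

    Proof. If $F\in US[\lambda,A,B]$, $-1\le B<A\le 1$, then it follows from relations (1.18), (1.19) and (1.21) that

    $$\frac{z(F'(z))^{\lambda}}{F(z)}=\frac{(A+1)\tilde p(w(z))-(A-1)}{(B+1)\tilde p(w(z))-(B-1)},$$

    where $w(z)$ is such that $w(0)=0$ and $|w(z)|<1$. The right-hand side of the above expression gets its series form from (1.19) and reduces to

    $$\frac{z(F'(z))^{\lambda}}{F(z)}=1+\frac{2(A-B)c_1}{\pi^2}z+\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)c_1^2}{\pi^2}\right)z^2+\frac{8(A-B)}{\pi^2}\left[\left(\frac{(B+1)^2}{\pi^4}+\frac{B+1}{6\pi^2}+\frac{1}{90}\right)c_1^3-\left(\frac{B+1}{\pi^2}+\frac{1}{12}\right)c_1c_2+\frac{c_3}{4}\right]z^3+\cdots. \tag{2.1}$$

    If $F(z)=z+\sum_{n=2}^{\infty}\frac{(-1)^{n-1}a_nz^n}{(2n-1)(n-1)!}$, then one has

    $$\frac{z(F'(z))^{\lambda}}{F(z)}=1+\frac{1-2\lambda}{3}a_2z+\left(\frac{2\lambda^2-4\lambda+1}{9}a_2^2-\frac{1-3\lambda}{10}a_3\right)z^2+\cdots. \tag{2.2}$$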
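    The expansion (2.2) can be verified exactly for integer values of $\lambda$ by multiplying truncated power series. The sketch below assumes $F(z)=z-\frac{a_2}{3}z^2+\frac{a_3}{10}z^3-\cdots$, as in (1.13), and compares the $z^0$, $z^1$, $z^2$ coefficients of $z(F'(z))^{\lambda}/F(z)$ with $1$, $\frac{1-2\lambda}{3}a_2$ and $\frac{2\lambda^2-4\lambda+1}{9}a_2^2-\frac{1-3\lambda}{10}a_3$. The helper names are hypothetical, and the restriction to integer $\lambda$ is a simplification made only for this check.

```python
from fractions import Fraction

DEG = 3  # keep the coefficients of z^0 .. z^2


def mul(a, b):
    """Product of two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * DEG
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < DEG:
                out[i + j] += ai * bj
    return out


def inverse(a):
    """Reciprocal of a truncated power series with non-zero constant term."""
    b = [Fraction(0)] * DEG
    b[0] = 1 / Fraction(a[0])
    for n in range(1, DEG):
        b[n] = -sum(a[i] * b[n - i] for i in range(1, n + 1)) / a[0]
    return b


def quotient_series(a2, a3, lam: int):
    """Coefficients of z (F')^lam / F up to z^2, for a positive integer lam."""
    f_over_z = [Fraction(1), -a2 / 3, a3 / 10]
    f_prime = [Fraction(1), -2 * a2 / 3, 3 * a3 / 10]
    power = [Fraction(1), Fraction(0), Fraction(0)]
    for _ in range(lam):
        power = mul(power, f_prime)
    return mul(power, inverse(f_over_z))


def stated_series(a2, a3, lam: int):
    """The same coefficients in the closed form compared against."""
    return [
        Fraction(1),
        (1 - 2 * lam) * a2 / 3,
        (2 * lam ** 2 - 4 * lam + 1) * a2 ** 2 / 9 - (1 - 3 * lam) * a3 / 10,
    ]
```

    Calling both functions with the same rational $a_2$, $a_3$ and $\lambda\in\{1,2,3\}$ returns identical lists, which is consistent with the coefficient comparison used in the proof.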

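    From (2.1) and (2.2), comparison of the coefficients of $z$ and $z^2$ gives

    $$a_2=\frac{6(A-B)}{(1-2\lambda)\pi^2}c_1 \tag{2.3}$$

    and

    $$\frac{2\lambda^2-4\lambda+1}{9}a_2^2-\frac{1-3\lambda}{10}a_3=\frac{2(A-B)}{\pi^2}\left(c_2-\frac{1}{6}c_1^2-\frac{2(B+1)}{\pi^2}c_1^2\right).$$

    This implies, by using (2.3), that

    $$a_3=\frac{20(A-B)}{(1-3\lambda)\pi^2}\left[c_2-\frac{1}{6}c_1^2-\frac{2(B+1)}{\pi^2}c_1^2-\frac{2(2\lambda^2-4\lambda+1)(A-B)}{(1-2\lambda)^2\pi^2}c_1^2\right].$$

    Now, for a real number $\mu$, consider

    $$|a_3-\mu a_2^2|=\left|\frac{20(A-B)}{(1-3\lambda)\pi^2}\left(c_2-\frac{1}{6}c_1^2-\frac{2(B+1)}{\pi^2}c_1^2\right)+\frac{40(A-B)^2(2\lambda^2-4\lambda+1)c_1^2}{(1-2\lambda)^2(1-3\lambda)\pi^4}-\frac{36\mu(A-B)^2c_1^2}{(1-2\lambda)^2\pi^4}\right|$$

    $$=\frac{20(A-B)}{|1-3\lambda|\pi^2}\left|c_2-c_1^2\left(\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{2(A-B)(2\lambda^2-4\lambda+1)}{(1-2\lambda)^2\pi^2}+\frac{9\mu(A-B)(1-3\lambda)}{5(1-2\lambda)^2\pi^2}\right)\right|$$

    $$=\frac{20(A-B)}{|1-3\lambda|\pi^2}\left|c_2-vc_1^2\right|,$$

    where $v=\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{A-B}{(1-2\lambda)^2\pi^2}\left(2(2\lambda^2-4\lambda+1)-\frac{9\mu(1-3\lambda)}{5}\right)$. The stated bound then follows from Lemma 1.1.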

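    Theorem 2.2. Let $F\in UCV[\lambda,A,B]$, $-1\le B<A\le 1$, be of the form (1.13). Then, for a real number $\mu$, we have

    $$|a_3-\mu a_2^2|\le\frac{40(A-B)}{3|1+3\lambda|\pi^2}\max\left\{1,\left|\frac{4(B+1)}{\pi^2}-\frac{1}{3}-\frac{2(1+3\lambda)(A-B)}{(1+2\lambda)^2\pi^2}\left(\lambda-\frac{27\mu}{20}\right)\right|\right\}.$$

    Proof. If $F\in UCV[\lambda,A,B]$, $-1\le B<A\le 1$, then it follows from relations (1.18), (1.19) and (1.20) that

    $$\frac{(z(F'(z))^{\lambda})'}{F'(z)}=\frac{(A+1)\tilde p(w(z))-(A-1)}{(B+1)\tilde p(w(z))-(B-1)},$$

    where $w(z)$ is such that $w(0)=0$ and $|w(z)|<1$. The right-hand side of the above expression gets its series form from (1.19) and reduces to

    $$\frac{(z(F'(z))^{\lambda})'}{F'(z)}=1+\frac{2(A-B)c_1}{\pi^2}z+\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)}{\pi^2}c_1^2\right)z^2+\frac{8(A-B)}{\pi^2}\left[\left(\frac{(B+1)^2}{\pi^4}+\frac{B+1}{6\pi^2}+\frac{1}{90}\right)c_1^3-\left(\frac{B+1}{\pi^2}+\frac{1}{12}\right)c_1c_2+\frac{c_3}{4}\right]z^3+\cdots. \tag{2.4}$$

    If $F(z)=z+\sum_{n=2}^{\infty}\frac{(-1)^{n-1}a_nz^n}{(2n-1)(n-1)!}$, then we have

    $$\frac{(z(F'(z))^{\lambda})'}{F'(z)}=1-\frac{2(1+2\lambda)}{3}a_2z+(1+3\lambda)\left(\frac{3a_3}{10}+\frac{2\lambda}{9}a_2^2\right)z^2+\cdots. \tag{2.5}$$

    From (2.4) and (2.5), comparison of the coefficients of $z$ and $z^2$ gives

    $$a_2=\frac{3(A-B)c_1}{(1+2\lambda)\pi^2} \tag{2.6}$$

    and

    $$(1+3\lambda)\left(\frac{3a_3}{10}+\frac{2\lambda}{9}a_2^2\right)=\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)c_1^2}{\pi^2}\right).$$

    This implies, by using (2.6), that

    $$a_3=\frac{10}{3}\left[\frac{2(A-B)}{(1+3\lambda)\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)c_1^2}{\pi^2}\right)+\frac{2\lambda(A-B)^2c_1^2}{(1+2\lambda)^2\pi^4}\right].$$

    Now, for a real number $\mu$, consider

    $$|a_3-\mu a_2^2|=\left|\frac{20(A-B)}{3(1+3\lambda)\pi^2}\left(c_2-\frac{1}{6}c_1^2-\frac{2(B+1)}{\pi^2}c_1^2\right)+\frac{20\lambda(A-B)^2c_1^2}{3(1+2\lambda)^2\pi^4}-\frac{9\mu(A-B)^2c_1^2}{(1+2\lambda)^2\pi^4}\right|$$

    $$=\frac{20(A-B)}{3|1+3\lambda|\pi^2}\left|c_2-c_1^2\left(\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{\lambda(1+3\lambda)(A-B)}{(1+2\lambda)^2\pi^2}+\frac{27\mu(A-B)(1+3\lambda)}{20(1+2\lambda)^2\pi^2}\right)\right|$$

    $$=\frac{20(A-B)}{3|1+3\lambda|\pi^2}\left|c_2-vc_1^2\right|,$$

    where

    $$v=\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{(1+3\lambda)(A-B)}{(1+2\lambda)^2\pi^2}\left(\lambda-\frac{27\mu}{20}\right).$$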

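    Theorem 2.3. Let $F\in UM_{\alpha}[\lambda,A,B]$, $-1\le B<A\le 1$, $\alpha\ge 0$, be of the form (1.13). Then, for a real number $\mu$, we have

    $$|a_3-\mu a_2^2|\le\frac{40(A-B)}{\pi^2|3(\lambda+\alpha+2\alpha\lambda)+\alpha-1|}\max\left\{1,\left|\frac{4(B+1)}{\pi^2}-\frac{1}{3}-\frac{4(A-B)}{[1-2\lambda-\alpha(3+2\lambda)]^2\pi^2}\left(2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha-\frac{9\mu(3(\lambda+\alpha+2\alpha\lambda)+\alpha-1)}{10}\right)\right|\right\}.$$

    Proof. Let $F\in UM_{\alpha}[\lambda,A,B]$, $-1\le B<A\le 1$, $\alpha\ge 0$, be of the form (1.13). Then

    $$(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}=\frac{(A+1)\tilde p(w(z))-(A-1)}{(B+1)\tilde p(w(z))-(B-1)}, \tag{2.7}$$

    where $w(z)$ is such that $w(0)=0$ and $|w(z)|<1$. The right-hand side of the above expression gets its series form from (1.19) and reduces to

    $$(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}=1+\frac{2(A-B)c_1}{\pi^2}z+\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)}{\pi^2}c_1^2\right)z^2+\cdots. \tag{2.8}$$

    If $F(z)=z+\sum_{n=2}^{\infty}\frac{(-1)^{n-1}a_nz^n}{(2n-1)(n-1)!}$, then one has

    $$(1-\alpha)\frac{z(F'(z))^{\lambda}}{F(z)}+\alpha\frac{(z(F'(z))^{\lambda})'}{F'(z)}=(1-\alpha)\left[1+\frac{1-2\lambda}{3}a_2z+\left(\frac{2\lambda^2-4\lambda+1}{9}a_2^2-\frac{1-3\lambda}{10}a_3\right)z^2+\cdots\right]+\alpha\left[1-\frac{2(1+2\lambda)}{3}a_2z+(1+3\lambda)\left(\frac{3a_3}{10}+\frac{2\lambda}{9}a_2^2\right)z^2+\cdots\right]. \tag{2.9}$$

    From (2.8) and (2.9), comparison of the coefficients of $z$ and $z^2$ gives

    $$a_2=\frac{6(A-B)c_1}{[1-2\lambda-\alpha(3+2\lambda)]\pi^2} \tag{2.10}$$

    and

    $$\frac{3(\lambda+\alpha+2\alpha\lambda)+\alpha-1}{10}a_3-\frac{2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha}{9}a_2^2=\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)}{\pi^2}c_1^2\right).$$

    This implies, by using (2.10), that

    $$a_3=\frac{10}{3(\lambda+\alpha+2\alpha\lambda)+\alpha-1}\left[\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)}{\pi^2}c_1^2\right)+\frac{4(A-B)^2[2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha]}{[1-2\lambda-\alpha(3+2\lambda)]^2\pi^4}c_1^2\right].$$

    Now, for a real number $\mu$, consider

    $$|a_3-\mu a_2^2|=\left|\frac{10}{3(\lambda+\alpha+2\alpha\lambda)+\alpha-1}\left[\frac{2(A-B)}{\pi^2}\left(c_2-\frac{c_1^2}{6}-\frac{2(B+1)}{\pi^2}c_1^2\right)+\frac{4(A-B)^2[2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha]}{[1-2\lambda-\alpha(3+2\lambda)]^2\pi^4}c_1^2\right]-\frac{36(A-B)^2\mu c_1^2}{[1-2\lambda-\alpha(3+2\lambda)]^2\pi^4}\right|$$

    $$=\frac{20(A-B)}{\pi^2|3(\lambda+\alpha+2\alpha\lambda)+\alpha-1|}\left|c_2-c_1^2\left[\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{2(A-B)[2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha]}{(1-2\lambda-\alpha(3+2\lambda))^2\pi^2}+\frac{18\mu(A-B)[3(\lambda+\alpha+2\alpha\lambda)+\alpha-1]}{10[1-2\lambda-\alpha(3+2\lambda)]^2\pi^2}\right]\right|$$

    $$=\frac{20(A-B)}{\pi^2|3(\lambda+\alpha+2\alpha\lambda)+\alpha-1|}\left|c_2-vc_1^2\right|,$$

    where

    $$v=\frac{1}{6}+\frac{2(B+1)}{\pi^2}-\frac{2(A-B)[2\lambda^2(1+2\alpha)+2\lambda(3\alpha-2)+1-\alpha]}{(1-2\lambda-\alpha(3+2\lambda))^2\pi^2}+\frac{18\mu(A-B)[3(\lambda+\alpha+2\alpha\lambda)+\alpha-1]}{10[1-2\lambda-\alpha(3+2\lambda)]^2\pi^2}.$$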

    The error function, applied to certain subclasses of analytic functions associated with the petal type domain, has played a central role in this work. The results obtained are new, and varying the parameters involved in the classes of functions defined yields further results that are not yet in the literature.

    The authors would like to thank the referees for their valuable comments and suggestions.

    The authors declare that they have no conflict of interests.



    [1] T. Zhang, A. P. Marand, J. Jiang, PlantDHS: A database for DNase I hypersensitive sites in plants, Nucleic. Acids. Res., 44 (2016), D1148–D1153. https://doi.org/10.1093/nar/gkv962 doi: 10.1093/nar/gkv962
    [2] D. S. Gross, W. T. Garrard, Nuclease hypersensitive sites in chromatin, Annu. Rev. Biochem., 57 (1988), 159–197. https://doi.org/10.1146/annurev.bi.57.070188.001111 doi: 10.1146/annurev.bi.57.070188.001111
    [3] G. E. Crawford, I. E. Holt, J. C. Mullikin, D. Tai, E. D. Green, T. G. Wolfsberg, et al., Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites, Proc. Natl. Acad. Sci., 101 (2004), 992–997. https://doi.org/10.1073/pnas.0307540100 doi: 10.1073/pnas.0307540100
    [4] M. M. Carrasquillo, M. Allen, J. D. Burgess, X. Wang, S. L. Strickland, S. Aryal, et al., A candidate regulatory variant at the TREM gene cluster associates with decreased Alzheimer's disease risk and increased TREML1 and TREM2 brain gene expression, Alzheimer's Dementia, 13 (2017), 663–673. https://doi.org/10.1016/j.jalz.2016.10.005 doi: 10.1016/j.jalz.2016.10.005
    [5] W. Meuleman, A. Muratov, E. Rynes, J. Halow, K. Lee, D. Bates, et al., Index and biological spectrum of human DNase I hypersensitive sites, Nature, 584 (2020), 244–251. https://doi.org/10.1038/s41586-020-2559-3 doi: 10.1038/s41586-020-2559-3
    [6] M. T. Maurano, R. Humbert, E. Rynes, R. E. Thurman, E. Haugen, H. Wang, et al., Systematic localization of common disease-associated variation in regulatory DNA, Science, 337 (2012), 1190–1195. https://doi.org/10.1126/science.1222794 doi: 10.1126/science.1222794
    [7] J. Ernst, P. Kheradpour, T. S. Mikkelsen, N. Shoresh, L. D. Ward, C. B. Epstein, et al., Mapping and analysis of chromatin state dynamics in nine human cell types, Nature, 473 (2011), 43–49. https://doi.org/10.1038/nature09906 doi: 10.1038/nature09906
    [8] M. Mokry, M. Harakalova, F. W. Asselbergs, P. I. de Bakker, E. E. Nieuwenhuis, Extensive association of common disease variants with regulatory sequence, PLoS One, 11 (2016), e0165893. https://doi.org/10.1371/journal.pone.0165893 doi: 10.1371/journal.pone.0165893
    [9] D. Weghorn, F. Coulet, K. M. Olson, C. DeBoever, F. Drees, A. Arias, et al., Identifying DNase I hypersensitive sites as driver distal regulatory elements in breast cancer, Nat. Commun., 8 (2017), 1–16. https://doi.org/10.1038/s41467-017-00100-x doi: 10.1038/s41467-017-00100-x
    [10] W. Jin, Q. Tang, M. Wan, K. Cui, Y. Zhang, G. Ren, et al., Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples, Nature, 528 (2015), 142–146. https://doi.org/10.1038/nature15740 doi: 10.1038/nature15740
    [11] G. E. Crawford, S. Davis, P. C. Scacheri, G. Renaud, M. J. Halawi, M. R. Erdos, et al., DNase-chip: A high-resolution method to identify DNase I hypersensitive sites using tiled microarrays, Nat. Methods, 3 (2006), 503–509. https://doi.org/10.1038/nmeth888 doi: 10.1038/nmeth888
    [12] J. Cooper, Y. Ding, J. Song, K. Zhao, Genome-wide mapping of DNase I hypersensitive sites in rare cell populations using single-cell DNase sequencing, Nat. Protoc., 12 (2017), 2342–2354. https://doi.org/10.1038/nprot.2017.099 doi: 10.1038/nprot.2017.099
    [13] G. E. Crawford, I. E. Holt, J. Whittle, B. D. Webb, D. Tai, S. Davis, et al., Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS), Genome Res., 16 (2006), 123–131. https://doi.org/10.1101/gr.4074106 doi: 10.1101/gr.4074106
    [14] L. Song, G. E. Crawford, DNase-seq: A high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells, Cold Spring Harbor Protoc., 2010 (2010), pdb.prot5384. https://doi.org/10.1101/pdb.prot5384 doi: 10.1101/pdb.prot5384
    [15] W. Zhang, J. Jiang, Genome-wide mapping of DNase I hypersensitive sites in plants, in Plant Functional Genomics, Humana Press, 1284 (2015), 71–89. https://doi.org/10.1007/978-1-4939-2444-8_4
    [16] Y. Wang, K. Wang, Genome-wide identification of DNase I hypersensitive sites in plants, Curr. Protoc., 1 (2021), e148. https://doi.org/10.1002/cpz1.148 doi: 10.1002/cpz1.148
    [17] S. Wang, Q. Zhang, Z. Shen, Y. He, Z. Chen, J. Li, et al., Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture, Mol. Ther. Nucleic Acids, 24 (2021), 154–163. https://doi.org/10.1016/j.omtn.2021.02.014 doi: 10.1016/j.omtn.2021.02.014
    [18] Q. Zhang, Y. He, S. Wang, Z. Chen, Z. Guo, Z. Cui, et al., Base-resolution prediction of transcription factor binding signals by a deep learning framework, PLoS Comp. Biol., 18 (2022), e1009941. https://doi.org/10.1371/journal.pcbi.1009941 doi: 10.1371/journal.pcbi.1009941
    [19] S. Wang, Y. He, Z. Chen, Q. Zhang, FCNGRU: Locating transcription factor binding sites by combing fully convolutional neural network with gated recurrent unit, IEEE J. Biomed. Health. Inf., 26 (2021), 1883–1890. https://doi.org/10.1109/JBHI.2021.3117616 doi: 10.1109/JBHI.2021.3117616
    [20] Q. Zhang, Z. Shen, D. S. Huang, Predicting in-vitro transcription factor binding sites using DNA sequence+ shape, IEEE/ACM Trans. Comput. Biol. Bioinf., 18 (2019), 667–676. https://doi.org/10.1109/TCBB.2019.2947461 doi: 10.1109/TCBB.2019.2947461
    [21] Q. Zhang, S. Wang, Z. Chen, Y. He, Q. Liu, D. S. Huang, Locating transcription factor binding sites by fully convolutional neural network, Briefings Bioinf., 22 (2021), bbaa435. https://doi.org/10.1093/bib/bbaa435 doi: 10.1093/bib/bbaa435
    [22] Y. Zhang, Z. Wang, Y. Zeng, Y. Liu, S. Xiong, M. Wang, et al., A novel convolution attention model for predicting transcription factor binding sites by combination of sequence and shape, Briefings Bioinf., 23 (2022), bbab525. https://doi.org/10.1093/bib/bbab525 doi: 10.1093/bib/bbab525
    [23] Y. Zhang, Z. Wang, Y. Zeng, J. Zhou, Q. Zou, High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method, Briefings Bioinf., 22 (2021), bbab273. https://doi.org/10.1093/bib/bbab273 doi: 10.1093/bib/bbab273
    [24] Y. He, Z. Shen, Q. Zhang, S. Wang, D. S. Huang, A survey on deep learning in DNA/RNA motif mining, Briefings Bioinf., 22 (2021), bbaa229. https://doi.org/10.1093/bib/bbaa229 doi: 10.1093/bib/bbaa229
    [25] W. S. Noble, S. Kuehn, R. Thurman, M. Yu, J. Stamatoyannopoulos, Predicting the in vivo signature of human gene regulatory sequences, Bioinformatics, 21 (2005), i338–i343. https://doi.org/10.1093/bioinformatics/bti1047 doi: 10.1093/bioinformatics/bti1047
    [26] B. Manavalan, T. H. Shin, G. Lee, DHSpred: Support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest, Oncotarget, 9 (2018), 1944. https://doi.org/10.18632/oncotarget.23099 doi: 10.18632/oncotarget.23099
    [27] S. Zhang, W. Zhuang, Z. Xu, Prediction of DNase I hypersensitive sites in plant genome using multiple modes of pseudo components, Anal. Biochem., 549 (2018), 149–156. https://doi.org/10.1016/j.ab.2018.03.025 doi: 10.1016/j.ab.2018.03.025
    [28] Y. Liang, S. Zhang, IDHS-DMCAC: Identifying DNase I hypersensitive sites with balanced dinucleotide-based detrending moving-average cross-correlation coefficient, SAR QSAR Environ. Res., 30 (2019), 429–445. https://doi.org/10.1080/1062936X.2019.1615546 doi: 10.1080/1062936X.2019.1615546
    [29] S. Zhang, Z. Duan, W. Yang, C. Qian, Y. You, IDHS-DASTS: Identifying DNase I hypersensitive sites based on LASSO and stacking learning, Mol. Omics, 17 (2021), 130–141. https://doi.org/10.1039/D0MO00115E doi: 10.1039/D0MO00115E
    [30] B. Liu, R. Long, K. C. Chou, IDHS-EL: Identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework, Bioinformatics, 32 (2016), 2411–2418. https://doi.org/10.1093/bioinformatics/btw186 doi: 10.1093/bioinformatics/btw186
    [31] S. Zhang, J. Lin, L. Su, Z. Zhou, PDHS-DSET: Prediction of DNase I hypersensitive sites in plant genome using DS evidence theory, Anal. Biochem., 564 (2019), 54–63. https://doi.org/10.1016/j.ab.2018.10.018 doi: 10.1016/j.ab.2018.10.018
    [32] Y. Zheng, H. Wang, Y. Ding, F. Guo, CEPZ: A novel predictor for identification of DNase I hypersensitive sites, IEEE/ACM Trans. Comput. Biol. Bioinf., 18 (2021), 2768–2774. https://doi.org/10.1109/TCBB.2021.3053661 doi: 10.1109/TCBB.2021.3053661
    [33] S. Zhang, Q. Yu, H. He, F. Zhu, P. Wu, L. Gu, et al., IDHS-DSAMS: Identifying DNase I hypersensitive sites based on the dinucleotide property matrix and ensemble bagged tree, Genomics, 112 (2020), 1282–1289. https://doi.org/10.1016/j.ygeno.2019.07.017 doi: 10.1016/j.ygeno.2019.07.017
    [34] S. Zhang, T. Xue, Use Chou's 5-steps rule to identify DNase I hypersensitive sites via dinucleotide property matrix and extreme gradient boosting, Mol. Genet. Genomics, 295 (2020), 1431–1442. https://doi.org/10.1007/s00438-020-01711-8 doi: 10.1007/s00438-020-01711-8
    [35] Z. C. Xu, S. Y. Jiang, W. R. Qiu, Y. C. Liu, X. Xiao, IDHSs-PseTNC: Identifying DNase I hypersensitive sites with pseuo trinucleotide component by deep sparse auto-encoder, Lett. Org. Chem., 14 (2017), 655–664. https://doi.org/10.2174/1570178614666170213102455 doi: 10.2174/1570178614666170213102455
    [36] C. Lyu, L. Wang, J. Zhang, Deep learning for DNase I hypersensitive sites identification, BMC genomics, 19 (2018), 155–165. https://doi.org/10.1186/s12864-018-5283-8 doi: 10.1186/s12864-018-5283-8
    [37] P. Feng, N. Jiang, N. Liu, Prediction of DNase I hypersensitive sites by using pseudo nucleotide compositions, Sci. World J., 2014 (2014), 740506. https://doi.org/10.1155/2014/740506 doi: 10.1155/2014/740506
    [38] W. Chen, T. Y. Lei, D. C. Jin, H. Lin, K. C. Chou, PseKNC: A flexible web server for generating pseudo K-tuple nucleotide composition, Anal. Biochem., 456 (2014), 53–60. https://doi.org/10.1016/j.ab.2014.04.001 doi: 10.1016/j.ab.2014.04.001
    [39] W. Chen, H. Lin, K. C. Chou, Pseudo nucleotide composition or PseKNC: An effective formulation for analyzing genomic sequences, Mol. Biosyst., 11 (2015), 2620–2634. https://doi.org/10.1039/C5MB00155B doi: 10.1039/C5MB00155B
    [40] B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, K. C. Chou, Pse-in-One: A web server for generating various modes of pseudo components of DNA, RNA, and protein sequences, Nucleic Acids Res., 43 (2015), W65–W71. https://doi.org/10.1093/nar/gkv458 doi: 10.1093/nar/gkv458
    [41] S. Zhang, Z. Zhou, X. Chen, Y. Hu, L. Yang, PDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine, J. Theor. Biol., 426 (2017), 126–133. https://doi.org/10.1016/j.jtbi.2017.05.030 doi: 10.1016/j.jtbi.2017.05.030
    [42] K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Trans. Pattern Anal. Mach. Intell., 37 (2015), 1904–1916. https://doi.org/10.1109/TPAMI.2015.2389824 doi: 10.1109/TPAMI.2015.2389824
    [43] F. Y. Dao, H. Lv, W. Su, Z. J. Sun, Q. L. Huang, H. Lin, IDHS-deep: an integrated tool for predicting DNase I hypersensitive sites by deep neural network, Briefings Bioinf., 22 (2021), bbab047. https://doi.org/10.1093/bib/bbab047 doi: 10.1093/bib/bbab047
    [44] C. E. Breeze, J. Lazar, T. Mercer, J. Halow, I. Washington, K. Lee, et al., Atlas and developmental dynamics of mouse DNase I hypersensitive sites, bioRxiv, 2020 (2020). https://doi.org/10.1101/2020.06.26.172718 doi: 10.1101/2020.06.26.172718
    [45] W. Li, A. Godzik, Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22 (2006), 1658–1659. https://doi.org/10.1093/bioinformatics/btl158 doi: 10.1093/bioinformatics/btl158
    [46] L. Fu, B. Niu, Z. Zhu, S. Wu, W. Li, CD-HIT: Accelerated for clustering the next-generation sequencing data, Bioinformatics, 28 (2012), 3150–3152. https://doi.org/10.1093/bioinformatics/bts565 doi: 10.1093/bioinformatics/bts565
    [47] X. Tang, P. Zheng, X. Li, H. Wu, D. Q. Wei, Y. Liu, et al., Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species, Methods, 204 (2022), 142–150. https://doi.org/10.1016/j.ymeth.2022.04.011 doi: 10.1016/j.ymeth.2022.04.011
    [48] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, preprint, arXiv: 1301.3781.
    [49] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in Advances in neural information processing systems, 26 (2013), 3111–3119.
    [50] K. Fukushima, S. Miyake, Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position, Pattern Recognt., 15 (1982), 455–469. https://doi.org/10.1016/0031-3203(82)90024-3 doi: 10.1016/0031-3203(82)90024-3
    [51] D. H. Hubel, T. N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol., 160 (1962), 106. https://doi.org/10.1113/jphysiol.1962.sp006837 doi: 10.1113/jphysiol.1962.sp006837
    [52] Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, et al., Handwritten digit recognition with a back-propagation network, in Advances in neural information processing systems, Morgan Kaufmann, 2 (1989), 396–404.
    [53] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput., 9 (1997), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 doi: 10.1162/neco.1997.9.8.1735
    [54] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al., Attention is all you need, in Advances in neural information processing systems, 30 (2017), 6000–6010.
    [55] C. Raffel, D. P. Ellis, Feed-forward networks with attention can solve some long-term memory problems, preprint, arXiv: 1512.08756.
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)