Research article

Research on massive information query and intelligent analysis method in a complex large-scale system

  • With the rapid growth of big data and networked information, information query and intelligent analysis of unstructured massive data in large-scale complex systems have become particularly important. Existing approaches that directly collate, sort, summarize, and store documents for retrieval cannot meet the needs of managing and rapidly retrieving massive data. Taking the standardized storage, effective extraction, and standardized database construction of massive resume information in large-scale social systems as an example, this paper proposes a massive information query and intelligent analysis method. The method exploits the semi-structured features of resume documents and constructs extraction rule models for the various resume fields to extract massive resume information. Built on HBase distributed storage, it uses parallel computing to optimize storage and query efficiency, which supports the intelligent analysis and retrieval of massive resume information. The experimental results show that the method not only greatly improves the extraction precision and recall of resume data, but also clearly outperforms traditional methods in three respects: massive information retrieval, query efficiency, and the intelligent analysis of complex systems.
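    As a concrete illustration of the rule-based extraction idea described above, the following minimal Python sketch applies hypothetical regular-expression rules to a semi-structured resume. The field names, patterns, and sample text are assumptions chosen for illustration only; they are not the extraction rule models, storage layer, or code used in the paper.

```python
import re

# Hypothetical extraction rules for a semi-structured resume (illustrative only).
RULES = {
    "name":  re.compile(r"(?:Name|姓名)[::]\s*(\S+)"),
    "phone": re.compile(r"(1[3-9]\d{9})"),                 # mainland-China mobile number format
    "email": re.compile(r"([\w.+-]+@[\w-]+\.[\w.]+)"),
}

def extract_fields(resume_text: str) -> dict:
    """Apply each rule to the raw resume text and keep the first match per field."""
    record = {}
    for field, pattern in RULES.items():
        match = pattern.search(resume_text)
        if match:
            record[field] = match.group(1)
    return record

if __name__ == "__main__":
    sample = "姓名: 张三\nPhone: 13812345678\nEmail: zhangsan@example.com"
    print(extract_fields(sample))
    # {'name': '张三', 'phone': '13812345678', 'email': 'zhangsan@example.com'}
```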

    Citation: Dailin Wang, Yunlei Lv, Danting Ren, Linhui Li. Research on massive information query and intelligent analysis method in a complex large-scale system[J]. Mathematical Biosciences and Engineering, 2019, 16(4): 2906-2926. doi: 10.3934/mbe.2019143



    Multi-attribute decision-making (MADM) methods play an important role in operations research and modern decision science by effectively evaluating alternatives with multiple attributes. The evaluations of decision makers are often vague and imprecise due to the complexity of the actual decision-making environment. Si et al. [1] presented a novel method to compare picture fuzzy numbers and applied it to decision-making problems. Petrovic and Kankaras [2] developed a hybridized DEMATEL-AHP-TOPSIS method for selecting an air traffic control radar position. Biswas et al. [3] proposed a multi-criteria decision-making framework based on an entropy measure to assess mutual funds. Intuitionistic fuzzy (IF) sets (IFSs), proposed by Atanassov [4], can express the uncertainty and ambiguity of an information system quantitatively and intuitively. Subsequently, Atanassov and Gargov [5] introduced interval-valued IFSs (IVIFSs), which use interval numbers to describe the membership and non-membership functions. IVIFSs express imprecise preferences well in decision making. Thus far, IVIFSs have received considerable attention in decision making [6,7,8,9] and entropy measures [10,11,12,13,14,15].

    With the increasing uncertainty and complexity of management and decision situations, higher requirements are placed on the representation of fuzzy information. As data analysis and processing tools, picture fuzzy sets and fuzzy neutrosophic sets can deal with imprecise and inconsistent information, but their values are expressed as single values. In real decision making, single values cannot accurately describe the reality, uncertainty, and distortion of things. Besides, modeling a continuous universe with IF numbers (IFNs) or interval-valued IFNs (IVIFNs) is difficult. Thus, as an extension of IFSs, the intuitionistic trapezoidal fuzzy numbers (ITrFNs) introduced by Liu and Yuan [16] can express more uncertainty across different dimensions of decision information than IFNs and IVIFNs. An ITrFN extends the discourse universe of an IFS from a discrete set to a continuous set [17]; its prominent characteristic is that trapezoidal fuzzy numbers describe the corresponding membership and non-membership degrees. Thus, ITrFNs not only depict fuzzy concepts such as 'good' or 'excellent' but also present them more richly [16,17]. In recent years, the research and application of intuitionistic triangular fuzzy numbers (ITFNs), a particular case of ITrFNs, have attracted considerable attention from scholars, such as Wang [18], Wei [19], Gao et al. [20], and Yu and Xu [21]. The current achievements are mainly concentrated in two aspects: (1) ranking methods for ITFNs based on score and accuracy functions, and (2) intuitionistic triangular fuzzy aggregation operators. However, there has been no investigation of entropy measures and their application to intuitionistic triangular fuzzy MADM with completely unknown attribute weights. Therefore, the entropy measure and the MADM method under ITrFNs, which are interesting yet relatively sophisticated, deserve to be investigated.

    The technique for order preference by similarity to an ideal solution (TOPSIS) [22] is a well-known method for MADM. Extended TOPSIS methods for MADM problems with IFNs and IVIFNs using the connection numbers of set pair analysis theory were presented in [7] and [8], respectively. Garg and Kumar [6] proposed a TOPSIS approach based on a new exponential distance to handle MADM problems with IVIFN information. Subsequently, Garg and Kumar [9] applied the TOPSIS method to decision problems under a linguistic interval-valued IF (IVIF) environment. Motivated by these TOPSIS methods [6,7,8,9,22], the present work first proposes an entropy measure for intuitionistic trapezoidal fuzzy sets (ITrFSs) based on the TOPSIS idea and then provides an objective weighting approach. Accordingly, a MADM method with unknown weight information under an ITrFN environment is developed. The primary contributions of this study can be summarized briefly as follows. (1) We define a new Hamming distance measure for ITrFSs and discuss its properties. (2) We propose entropy axioms and an entropy measure for ITrFSs, which is the first entropy measure built on the idea of TOPSIS. (3) On this basis, we apply them to determine attribute weights in an ITrFN environment with unknown weight information and propose a method for MADM problems with ITrFNs.

    The remainder of this paper is organized as follows. Section 2 briefly introduces related basic concepts. Section 3 presents an entropy measure for ITrFSs. Section 4 develops an objective approach to determine attribute weights and proposes a MADM method with ITrFNs. Section 5 provides a numerical example to illustrate the feasibility of the proposed method. Section 6 presents our conclusions.

    Definition 1. [16]. A trapezoidal fuzzy number (TrFN) A is a fuzzy set in the set R of real numbers, with its membership function defined by

    $$F_A(x)=\begin{cases}0, & x<a_1,\\ \dfrac{x-a_1}{a_2-a_1}, & a_1\le x\le a_2,\\ 1, & a_2\le x\le a_3,\\ \dfrac{x-a_4}{a_3-a_4}, & a_3\le x\le a_4,\\ 0, & x>a_4,\end{cases} \tag{1}$$

    where $a_1\le a_2\le a_3\le a_4$, $a_1$ and $a_4$ are the lower and upper limits of $A$, respectively, and $[a_2,a_3]$ is the mode interval; the TrFN can be denoted by the four-tuple $(a_1,a_2,a_3,a_4)$.
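    As a computational companion to Eq (1), the sketch below (a generic Python implementation written for illustration, not code from the paper) evaluates the trapezoidal membership function of a four-tuple $(a_1,a_2,a_3,a_4)$:

```python
def trapezoidal_membership(x: float, a1: float, a2: float, a3: float, a4: float) -> float:
    """Membership value F_A(x) of the TrFN (a1, a2, a3, a4), per Eq (1)."""
    if x < a1 or x > a4:
        return 0.0
    if a1 <= x <= a2:
        # rising edge; if a1 == a2 the edge collapses to a point with membership 1
        return 1.0 if a2 == a1 else (x - a1) / (a2 - a1)
    if a2 <= x <= a3:
        return 1.0
    # falling edge on [a3, a4]
    return 1.0 if a4 == a3 else (x - a4) / (a3 - a4)

print(trapezoidal_membership(0.15, 0.1, 0.2, 0.3, 0.4))  # ≈ 0.5, halfway up the rising edge
```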

    Definition 2. Let $X$ be a fixed set, and let $\mu_{\tilde A}(x)=(t_{lA}(x),t_{m1A}(x),t_{m2A}(x),t_{hA}(x))$ and $v_{\tilde A}(x)=(f_{lA}(x),f_{m1A}(x),f_{m2A}(x),f_{hA}(x))$ be TrFNs defined on the unit interval [0, 1]. An intuitionistic trapezoidal fuzzy set (ITrFS) $\tilde A$ over $X$ is defined as $\tilde A=\{(x,\langle\mu_{\tilde A}(x),v_{\tilde A}(x)\rangle)\mid x\in X\}$, where $\mu_{\tilde A}(x)$ and $v_{\tilde A}(x)$ indicate, respectively, the membership degree and non-membership degree of the element $x$ in $\tilde A$, with the condition $0\le t_{hA}(x)+f_{hA}(x)\le 1$. Here $t_{lA}(x)$ and $t_{hA}(x)$ are the lower and upper limits of $\mu_{\tilde A}(x)$, and $[t_{m1A}(x),t_{m2A}(x)]$ is the most possible membership interval of $\mu_{\tilde A}(x)$; $f_{lA}(x)$ and $f_{hA}(x)$ are the lower and upper limits of $v_{\tilde A}(x)$, and $[f_{m1A}(x),f_{m2A}(x)]$ is the non-membership interval of $v_{\tilde A}(x)$.

    For convenience, we call $\tilde\alpha=\langle(t_{lA},t_{m1A},t_{m2A},t_{hA}),(f_{lA},f_{m1A},f_{m2A},f_{hA})\rangle$ an intuitionistic trapezoidal fuzzy number (ITrFN), where

    $$t_{lA},t_{m1A},t_{m2A},t_{hA}\in[0,1],\quad f_{lA},f_{m1A},f_{m2A},f_{hA}\in[0,1],\quad t_{hA}+f_{hA}\in[0,1]. \tag{2}$$

    It is clear that the largest and smallest ITrFNs are $\alpha^+=\langle(1,1,1,1),(0,0,0,0)\rangle$ and $\alpha^-=\langle(0,0,0,0),(1,1,1,1)\rangle$, respectively. When $t_{m1A}=t_{m2A}$ and $f_{m1A}=f_{m2A}$, an ITrFN reduces to an ITFN [16].

    For example, the product quality attribute in an online service trading selection problem can be expressed as the ITrFN ⟨(0.1, 0.2, 0.3, 0.4), (0.2, 0.3, 0.5, 0.6)⟩, where 0.1 and 0.4 indicate the lower and upper limits of the users' satisfaction degree and [0.2, 0.3] is the interval of the most possible satisfaction degree, while 0.2 and 0.6 denote the lower and upper limits of the users' dissatisfaction degree and [0.3, 0.5] is the interval of the most possible dissatisfaction degree.
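    For the numerical sketches that follow, an ITrFN can be stored as a pair of four-tuples. The helper below is an assumed representation chosen for illustration (not prescribed by the paper) and checks the constraints of Eq (2):

```python
from typing import Tuple

# An ITrFN as a (membership 4-tuple, non-membership 4-tuple) pair.
ITrFN = Tuple[Tuple[float, float, float, float], Tuple[float, float, float, float]]

def is_valid_itrfn(alpha: ITrFN) -> bool:
    """Check the constraints of Eq (2): all components in [0, 1] and t_h + f_h <= 1."""
    (tl, tm1, tm2, th), (fl, fm1, fm2, fh) = alpha
    in_unit = all(0.0 <= v <= 1.0 for v in (tl, tm1, tm2, th, fl, fm1, fm2, fh))
    return in_unit and th + fh <= 1.0

product_quality: ITrFN = ((0.1, 0.2, 0.3, 0.4), (0.2, 0.3, 0.5, 0.6))  # example from the text
print(is_valid_itrfn(product_quality))  # True, since 0.4 + 0.6 <= 1
```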

    Definition 3. [18] Let $\tilde\alpha_1=\langle(t_{l1},t_{m11},t_{m21},t_{h1}),(f_{l1},f_{m11},f_{m21},f_{h1})\rangle$ and $\tilde\alpha_2=\langle(t_{l2},t_{m12},t_{m22},t_{h2}),(f_{l2},f_{m12},f_{m22},f_{h2})\rangle$ be two ITrFNs and let $\lambda>0$. The containment relation is

    $$\tilde\alpha_1\subseteq\tilde\alpha_2 \iff t_{l1}\le t_{l2},\ t_{m11}\le t_{m12},\ t_{m21}\le t_{m22},\ t_{h1}\le t_{h2},\ f_{l1}\ge f_{l2},\ f_{m11}\ge f_{m12},\ f_{m21}\ge f_{m22},\ f_{h1}\ge f_{h2}. \tag{3}$$

    Some arithmetic operations between ITrFNs $\tilde\alpha_1$ and $\tilde\alpha_2$ are defined as follows:

    (1) $\tilde\alpha_1+\tilde\alpha_2=\langle(t_{l1}+t_{l2}-t_{l1}t_{l2},\ t_{m11}+t_{m12}-t_{m11}t_{m12},\ t_{m21}+t_{m22}-t_{m21}t_{m22},\ t_{h1}+t_{h2}-t_{h1}t_{h2}),\ (f_{l1}f_{l2},\ f_{m11}f_{m12},\ f_{m21}f_{m22},\ f_{h1}f_{h2})\rangle$;

    (2) $\lambda\tilde\alpha_1=\langle(1-(1-t_{l1})^{\lambda},\ 1-(1-t_{m11})^{\lambda},\ 1-(1-t_{m21})^{\lambda},\ 1-(1-t_{h1})^{\lambda}),\ ((f_{l1})^{\lambda},(f_{m11})^{\lambda},(f_{m21})^{\lambda},(f_{h1})^{\lambda})\rangle$;

    (3) $\tilde\alpha_1^{c}=\langle(f_{l1},f_{m11},f_{m21},f_{h1}),(t_{l1},t_{m11},t_{m21},t_{h1})\rangle$.
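    A minimal sketch of the operations in Definition 3 under the same tuple representation as above (again an illustrative implementation rather than the authors' code):

```python
def itrfn_add(a1, a2):
    """Sum of two ITrFNs: probabilistic sum on membership parts, product on non-membership parts."""
    (t1, f1), (t2, f2) = a1, a2
    t = tuple(x + y - x * y for x, y in zip(t1, t2))
    f = tuple(x * y for x, y in zip(f1, f2))
    return (t, f)

def itrfn_scale(lam, a):
    """Scalar multiple lambda * alpha, lambda > 0."""
    t, f = a
    return (tuple(1 - (1 - x) ** lam for x in t), tuple(x ** lam for x in f))

def itrfn_complement(a):
    """Complement: swap the membership and non-membership tuples."""
    t, f = a
    return (f, t)

def itrfn_contains(a1, a2):
    """Containment alpha1 ⊆ alpha2 per Eq (3)."""
    (t1, f1), (t2, f2) = a1, a2
    return all(x <= y for x, y in zip(t1, t2)) and all(x >= y for x, y in zip(f1, f2))
```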

    Definition 4. Let $\tilde\alpha_1=\langle(t_{l1},t_{m11},t_{m21},t_{h1}),(f_{l1},f_{m11},f_{m21},f_{h1})\rangle$ and $\tilde\alpha_2=\langle(t_{l2},t_{m12},t_{m22},t_{h2}),(f_{l2},f_{m12},f_{m22},f_{h2})\rangle$ be two ITrFNs. The Hamming distance $d(\tilde\alpha_1,\tilde\alpha_2)$ between the ITrFNs $\tilde\alpha_1$ and $\tilde\alpha_2$ is defined as

    $$d(\tilde\alpha_1,\tilde\alpha_2)=\frac{1}{8}\bigl(|t_{l1}-t_{l2}|+|t_{m11}-t_{m12}|+|t_{m21}-t_{m22}|+|t_{h1}-t_{h2}|+|f_{l1}-f_{l2}|+|f_{m11}-f_{m12}|+|f_{m21}-f_{m22}|+|f_{h1}-f_{h2}|\bigr). \tag{4}$$

    Theorem 1. The distance measure $d(\tilde\alpha_1,\tilde\alpha_2)$ satisfies the following properties:

    (i) $0\le d(\tilde\alpha_1,\tilde\alpha_2)\le 1$.

    (ii) $d(\tilde\alpha_1,\tilde\alpha_2)=0$ if and only if $\tilde\alpha_1=\tilde\alpha_2$.

    (iii) $d(\tilde\alpha_1,\tilde\alpha_2)=d(\tilde\alpha_2,\tilde\alpha_1)$.

    (iv) If $\tilde\alpha_3=\langle(t_{l3},t_{m13},t_{m23},t_{h3}),(f_{l3},f_{m13},f_{m23},f_{h3})\rangle$ is an ITrFN and $\tilde\alpha_1\subseteq\tilde\alpha_2\subseteq\tilde\alpha_3$, then $d(\tilde\alpha_1,\tilde\alpha_3)\ge d(\tilde\alpha_1,\tilde\alpha_2)$ and $d(\tilde\alpha_1,\tilde\alpha_3)\ge d(\tilde\alpha_2,\tilde\alpha_3)$.

    Proof. It is easy to see that the distance measure $d(\tilde\alpha_1,\tilde\alpha_2)$ satisfies property (iii) of Theorem 1. We only need to prove (i), (ii) and (iv).

    For (i): by Eq (2), we have

    $0\le|t_{l1}-t_{l2}|\le 1$, $0\le|t_{m11}-t_{m12}|\le 1$, $0\le|t_{m21}-t_{m22}|\le 1$, $0\le|t_{h1}-t_{h2}|\le 1$, $0\le|f_{l1}-f_{l2}|\le 1$, $0\le|f_{m11}-f_{m12}|\le 1$, $0\le|f_{m21}-f_{m22}|\le 1$, $0\le|f_{h1}-f_{h2}|\le 1$.

    It follows that

    $$0\le\frac{1}{8}\bigl(|t_{l1}-t_{l2}|+|t_{m11}-t_{m12}|+|t_{m21}-t_{m22}|+|t_{h1}-t_{h2}|+|f_{l1}-f_{l2}|+|f_{m11}-f_{m12}|+|f_{m21}-f_{m22}|+|f_{h1}-f_{h2}|\bigr)\le 1,$$

    so the inequality $0\le d(\tilde\alpha_1,\tilde\alpha_2)\le 1$ holds.

    For (ii): $d(\tilde\alpha_1,\tilde\alpha_2)=0$ if and only if

    $$\frac{1}{8}\bigl(|t_{l1}-t_{l2}|+|t_{m11}-t_{m12}|+|t_{m21}-t_{m22}|+|t_{h1}-t_{h2}|+|f_{l1}-f_{l2}|+|f_{m11}-f_{m12}|+|f_{m21}-f_{m22}|+|f_{h1}-f_{h2}|\bigr)=0,$$

    which holds if and only if

    $|t_{l1}-t_{l2}|=0$, $|t_{m11}-t_{m12}|=0$, $|t_{m21}-t_{m22}|=0$, $|t_{h1}-t_{h2}|=0$, $|f_{l1}-f_{l2}|=0$, $|f_{m11}-f_{m12}|=0$, $|f_{m21}-f_{m22}|=0$, $|f_{h1}-f_{h2}|=0$.

    Thus $t_{l1}=t_{l2}$, $t_{m11}=t_{m12}$, $t_{m21}=t_{m22}$, $t_{h1}=t_{h2}$, $f_{l1}=f_{l2}$, $f_{m11}=f_{m12}$, $f_{m21}=f_{m22}$ and $f_{h1}=f_{h2}$, that is, $\tilde\alpha_1=\tilde\alpha_2$.

    For (iv): since $\tilde\alpha_1\subseteq\tilde\alpha_2\subseteq\tilde\alpha_3$, we have

    $t_{l1}\le t_{l2}\le t_{l3}$, $t_{m11}\le t_{m12}\le t_{m13}$, $t_{m21}\le t_{m22}\le t_{m23}$, $t_{h1}\le t_{h2}\le t_{h3}$, $f_{l1}\ge f_{l2}\ge f_{l3}$, $f_{m11}\ge f_{m12}\ge f_{m13}$, $f_{m21}\ge f_{m22}\ge f_{m23}$, $f_{h1}\ge f_{h2}\ge f_{h3}$.

    Hence

    $|t_{l1}-t_{l2}|\le|t_{l1}-t_{l3}|$, $|t_{m11}-t_{m12}|\le|t_{m11}-t_{m13}|$, $|t_{m21}-t_{m22}|\le|t_{m21}-t_{m23}|$, $|t_{h1}-t_{h2}|\le|t_{h1}-t_{h3}|$, $|f_{l1}-f_{l2}|\le|f_{l1}-f_{l3}|$, $|f_{m11}-f_{m12}|\le|f_{m11}-f_{m13}|$, $|f_{m21}-f_{m22}|\le|f_{m21}-f_{m23}|$, $|f_{h1}-f_{h2}|\le|f_{h1}-f_{h3}|$.

    Summing these inequalities and dividing by 8 gives $d(\tilde\alpha_1,\tilde\alpha_2)\le d(\tilde\alpha_1,\tilde\alpha_3)$. In the same way, it can be shown that $d(\tilde\alpha_2,\tilde\alpha_3)\le d(\tilde\alpha_1,\tilde\alpha_3)$.

    For example, consider the three ITrFNs $\tilde\alpha_1=\langle(0.3,0.4,0.5,0.6),(0.0,0.1,0.2,0.3)\rangle$, $\tilde\alpha_2=\langle(0.5,0.5,0.6,0.6),(0.0,0.1,0.2,0.3)\rangle$ and $\tilde\alpha_3=\langle(0.5,0.5,0.7,0.7),(0.0,0.1,0.2,0.3)\rangle$. According to Definition 3, $\tilde\alpha_1\subseteq\tilde\alpha_2\subseteq\tilde\alpha_3$. By Definition 4, $d(\tilde\alpha_1,\tilde\alpha_2)=0.05$, $d(\tilde\alpha_1,\tilde\alpha_3)=0.075$ and $d(\tilde\alpha_2,\tilde\alpha_3)=0.025$. Obviously, $d(\tilde\alpha_1,\tilde\alpha_3)>d(\tilde\alpha_1,\tilde\alpha_2)$ and $d(\tilde\alpha_1,\tilde\alpha_3)\ge d(\tilde\alpha_2,\tilde\alpha_3)$.
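    The Hamming distance of Eq (4) and the values in this example can be reproduced with a short sketch (a straightforward implementation of Definition 4 written for illustration, not taken from the paper):

```python
def hamming_distance(a1, a2):
    """Hamming distance between two ITrFNs per Eq (4)."""
    (t1, f1), (t2, f2) = a1, a2
    pairs = zip(t1 + f1, t2 + f2)                 # the eight component pairs
    return sum(abs(x - y) for x, y in pairs) / 8.0

alpha1 = ((0.3, 0.4, 0.5, 0.6), (0.0, 0.1, 0.2, 0.3))
alpha2 = ((0.5, 0.5, 0.6, 0.6), (0.0, 0.1, 0.2, 0.3))
alpha3 = ((0.5, 0.5, 0.7, 0.7), (0.0, 0.1, 0.2, 0.3))
print(round(hamming_distance(alpha1, alpha2), 3))  # 0.05
print(round(hamming_distance(alpha1, alpha3), 3))  # 0.075
print(round(hamming_distance(alpha2, alpha3), 3))  # 0.025
```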

    The entropy measure is worthy of investigation in the IF environment and is widely used in the field of decision making. Burillo and Bustince [10] discussed entropy on IFSs and interval-valued fuzzy sets. Szmidt and Kacprzyk [11] proposed an entropy measure from a geometric point of view. Chen and Li [12] conducted a comparative analysis of determining objective weights with intuitionistic fuzzy entropy measures. Joshi and Kumar [15] discussed the parametric (R, S)-norm IF entropy and applied it to MADM. Some researchers have recently used distance measures to derive fuzzy entropy by extending De Luca's axioms [14]. Liu [23] proposed some entropy measures for fuzzy sets (FSs) based on distances. Zhang and Zhang et al. [24] discussed the entropy of interval-valued FSs based on distance and its relationship with a similarity measure. Zhang and Xing et al. [13] introduced the relationships among distance measures, inclusion measures and fuzzy entropy of IVIFSs. To address completely unknown attribute weights in MADM problems, Garg [25] proposed some IF Hamacher aggregation operators based on an entropy function to aggregate the attribute values. Later, Garg [26] developed a generalized IF entropy for IVIFSs and applied it to solve MADM problems. This section combines the entropy concept in [13] and the TOPSIS method to develop a novel axiomatic definition of an entropy measure for ITrFSs.

    Definition 5. A real-valued function $E:\mathrm{ITrFS}(X)\to[0,1]$ is called an entropy on $\mathrm{ITrFS}(X)$ if it satisfies the following properties:

    (EP1) $E(A)=0$ iff $A$ is a crisp set;

    (EP2) $E(A)=1$ iff $d(A,A^+)=d(A,A^-)$ for all $A\in\mathrm{ITrFS}(X)$, where $d(A,A^+)$ is the distance from $A$ to $A^+$ and $d(A,A^-)$ is the distance from $A$ to $A^-$;

    (EP3) $E(A)=E(A^c)$ for all $A\in\mathrm{ITrFS}(X)$;

    (EP4) For all $A,B\in\mathrm{ITrFS}(X)$, if $\left|\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}-\dfrac12\right|\ge\left|\dfrac{d(B,B^-)}{d(B,B^-)+d(B,B^+)}-\dfrac12\right|$, then $E(A)\le E(B)$, where $d(B,B^+)$ is the distance from $B$ to $B^+$ and $d(B,B^-)$ is the distance from $B$ to $B^-$.

    Remark 1. A new axiomatic definition of distance-based entropy for ITrFSs is proposed in Definition 5 based on the idea of TOPSIS. Given different distance measures between two ITrFSs, we can define corresponding entropy measures for ITrFSs. The properties in Definition 5 reflect the following intuitions:

    (EP1) Crisp sets are not fuzzy;

    (EP2) If $d(A,A^+)=d(A,A^-)$, then $A$ is the fuzziest set;

    (EP3) The fuzziness of a generalized set is equal to that of its complement;

    (EP4) An ITrFS is fuzzier when its relative closeness is nearly 0.5.

    Theorem 2. Let $d$ be a distance measure on $\mathrm{ITrFS}(X)$. Then, for any $A\in\mathrm{ITrFS}(X)$,

    $$E(A)=1-2\left|\frac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}-\frac12\right| \tag{5}$$

    is an entropy on $\mathrm{ITrFS}(X)$ based on TOPSIS.

    Proof. We show that $E(A)$ satisfies properties (EP1)-(EP4).

    (EP1) If $A$ is a crisp set, that is, $A(x)=\langle(1,1,1,1),(0,0,0,0)\rangle$ or $A(x)=\langle(0,0,0,0),(1,1,1,1)\rangle$, then $\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}=1$ or $\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}=0$. Thus, by Eq (5), $E(A)=1-2\left|1-\dfrac12\right|=0$ or $E(A)=1-2\left|0-\dfrac12\right|=0$.

    (EP2) $E(A)=1$ if and only if $\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}=\dfrac12$, that is, $d(A,A^-)=d(A,A^+)$.

    (EP3) Given $d(A^c,A^-)=d(A,A^+)$ and $d(A^c,A^+)=d(A,A^-)$, we have $\left|\dfrac{d(A^c,A^-)}{d(A^c,A^-)+d(A^c,A^+)}-\dfrac12\right|=\left|\dfrac12-\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}\right|$. Thus, $E(A)=E(A^c)$.

    (EP4) If $\left|\dfrac{d(A,A^-)}{d(A,A^-)+d(A,A^+)}-\dfrac12\right|\ge\left|\dfrac{d(B,B^-)}{d(B,B^-)+d(B,B^+)}-\dfrac12\right|$, then $E(A)\le E(B)$ follows directly from Eq (5).

    Remark 2. Consider the distance measure $d_{IVIFN}(\cdot,\cdot)$ of IVIFNs. For an ITrFN $\tilde A=\langle(t_{lA},t_{m1A},t_{m2A},t_{hA}),(f_{lA},f_{m1A},f_{m2A},f_{hA})\rangle$, if $t_{lA}=t_{m1A}$, $t_{m2A}=t_{hA}$, $f_{lA}=f_{m1A}$ and $f_{m2A}=f_{hA}$, then $\tilde A$ degenerates to the IVIFN $\tilde A_{IVIFN}=\langle[t_{lA},t_{hA}],[f_{lA},f_{hA}]\rangle$; the largest IVIFN is $A^+_{IVIFN}=\langle[1,1],[0,0]\rangle$, the smallest IVIFN is $A^-_{IVIFN}=\langle[0,0],[1,1]\rangle$, and Eq (4) degenerates to the distance measure of IVIFNs $d_{IVIFN}(\tilde\alpha_1,\tilde\alpha_2)=\frac14\bigl(|t_{l1}-t_{l2}|+|t_{h1}-t_{h2}|+|f_{l1}-f_{l2}|+|f_{h1}-f_{h2}|\bigr)$. According to Eq (5), the entropy of an IVIFN $\tilde A_{IVIFN}$ can be calculated as

    $$E(\tilde A_{IVIFN})=1-2\left|\frac{d_{IVIFN}(\tilde A_{IVIFN},A^-_{IVIFN})}{d_{IVIFN}(\tilde A_{IVIFN},A^-_{IVIFN})+d_{IVIFN}(\tilde A_{IVIFN},A^+_{IVIFN})}-\frac12\right|. \tag{6}$$

    Obviously, $E(\tilde A_{IVIFN})$ satisfies properties (EP1)-(EP4). Thus the proposed entropy measure generalizes the entropy measure for IVIFSs.
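    The TOPSIS-based entropy of Eq (5) can be sketched as follows, using the extreme ITrFNs $\alpha^+$ and $\alpha^-$ as the reference points; the code is an illustration under the tuple representation assumed earlier, not the authors' implementation:

```python
ALPHA_PLUS  = ((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 0.0, 0.0))   # largest ITrFN
ALPHA_MINUS = ((0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0))   # smallest ITrFN

def hamming_distance(a1, a2):
    (t1, f1), (t2, f2) = a1, a2
    return sum(abs(x - y) for x, y in zip(t1 + f1, t2 + f2)) / 8.0

def topsis_entropy(alpha):
    """TOPSIS-based entropy of an ITrFN per Eq (5)."""
    d_minus = hamming_distance(alpha, ALPHA_MINUS)
    d_plus = hamming_distance(alpha, ALPHA_PLUS)
    return 1.0 - 2.0 * abs(d_minus / (d_minus + d_plus) - 0.5)

print(round(topsis_entropy(ALPHA_PLUS), 6))                                   # 0.0, a crisp ITrFN
print(round(topsis_entropy(((0.2, 0.4, 0.5, 0.5), (0.3, 0.3, 0.5, 0.5))), 6)) # 1.0, equidistant from both ideals
```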

    In this section, we provide a method for ITrFN MADM problems with unknown attribute weights by using the proposed entropy measure.

    For the MADM problem, the final decision should be derived from the assessments of all feasible alternatives on multiple attributes. For convenience, some symbols are introduced to characterize the MADM problem as follows.

    (1) The set of alternatives is $S_i\ (i\in M=\{1,2,\ldots,m\})$.

    (2) The set of attributes is $A_j\ (j\in N=\{1,2,\ldots,n\})$. The attribute weight vector is denoted by $w=(w_1,w_2,\ldots,w_n)$, where $w_j$ represents the weight of $A_j$ such that $w_j\in[0,1]$ $(j\in N)$ and $\sum_{j=1}^{n}w_j=1$.

    (3) The assessments of alternative $S_i$ on attribute $A_j$ are ITrFNs $\tilde\alpha_{ij}=\langle(t_{lij},t_{m1ij},t_{m2ij},t_{hij}),(f_{lij},f_{m1ij},f_{m2ij},f_{hij})\rangle$.

    (4) An ITrFN MADM problem can be described by the ITrFN decision matrix $\tilde D=(\tilde\alpha_{ij})_{m\times n}$.

    Attribute weights depend on the certainty and reliability of the assessments given by the decision maker. The more uncertain an evaluation value is, the smaller its objective weight should be. The fuzziness and uncertainty of attribute values can be measured by fuzzy entropy. According to the entropy-weighting method [9,13,26], we employ the proposed IF entropy measure to determine the weights of the attributes. The decision matrix $\tilde D=(\tilde\alpha_{ij})_{m\times n}$ can be transformed into the IF entropy matrix $\Gamma=(E_{ij})_{m\times n}$, where

    $$E_{ij}=1-2\left|\frac{d(\tilde\alpha_{ij},\alpha^-)}{d(\tilde\alpha_{ij},\alpha^-)+d(\tilde\alpha_{ij},\alpha^+)}-\frac12\right|. \tag{7}$$

    Here $\alpha^-=\langle(0,0,0,0),(1,1,1,1)\rangle$ and $\alpha^+=\langle(1,1,1,1),(0,0,0,0)\rangle$ are the negative ideal solution (NIS) and positive ideal solution (PIS), respectively.

    Then, the normalized entropy matrix $H=(h_{ij})_{m\times n}$ is obtained as

    $$h_{ij}=\frac{E_{ij}}{\max\{E_{i1},E_{i2},\ldots,E_{in}\}}. \tag{8}$$

    The objective attribute weights are determined by

    $$w_j=\frac{1-\sum_{i=1}^{m}E_{ij}}{\sum_{j=1}^{n}\left(1-\sum_{i=1}^{m}E_{ij}\right)},\qquad i=1,2,\ldots,m,\quad j=1,2,\ldots,n. \tag{9}$$

    Evidently, the attribute weight $w_j$ is inversely related to the sum of the entropy values of attribute $A_j$. In other words, if the values of an attribute are vaguer and less reliable, a lower weight is assigned; otherwise, a higher weight is attached.
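    A compact sketch of the weighting step in Eqs (7)-(9) is given below. The weight formula follows the reading of Eq (9) above (weights proportional to $1-\sum_{i}E_{ij}$ after normalization), so treat it as an illustrative reading of the weighting scheme rather than a definitive implementation; the entropy matrix used in the demo is the one reported for the example in Section 5.

```python
from typing import List

def entropy_weights(E: List[List[float]]) -> List[float]:
    """Objective attribute weights from an m x n entropy matrix, following Eq (9):
    each weight is proportional to (1 - sum_i E_ij), normalized so the weights sum to 1."""
    n = len(E[0])
    col_terms = [1.0 - sum(row[j] for row in E) for j in range(n)]
    total = sum(col_terms)
    return [term / total for term in col_terms]

# Entropy matrix of the numerical example in Section 5 (values as reported in the text).
GAMMA = [
    [0.875, 0.825, 0.775, 0.825, 1.000],
    [0.900, 0.900, 0.950, 0.850, 0.825],
    [0.850, 0.850, 0.875, 0.975, 0.975],
    [0.975, 0.750, 0.825, 1.000, 1.000],
]
print([round(w, 3) for w in entropy_weights(GAMMA)])
# ≈ [0.203, 0.182, 0.189, 0.207, 0.219]; the paper reports w = (0.20, 0.20, 0.19, 0.20, 0.21)
```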

    This section extends TOPSIS to aggregate ITrFNs and rank alternatives. Suppose that the PIS and NIS are $R^+=(\alpha_1^+,\alpha_2^+,\ldots,\alpha_n^+)$ and $R^-=(\alpha_1^-,\alpha_2^-,\ldots,\alpha_n^-)$, respectively, where $\alpha_j^+=\langle(1,1,1,1),(0,0,0,0)\rangle$ and $\alpha_j^-=\langle(0,0,0,0),(1,1,1,1)\rangle$ for benefit attributes, and $\alpha_j^+=\langle(0,0,0,0),(1,1,1,1)\rangle$ and $\alpha_j^-=\langle(1,1,1,1),(0,0,0,0)\rangle$ for cost attributes. For the decision matrix $\tilde D=(\tilde\alpha_{ij})_{m\times n}$, the separation measures from alternative $S_i$ to the PIS and NIS can be defined as follows.

    Definition 6. The weighted positive separation measure between alternative $S_i$ and the PIS is defined as

    $$G_i^+=\sum_{j=1}^{n}w_j\,d(\tilde\alpha_{ij},\alpha_j^+), \tag{10}$$

    where $d(\tilde\alpha_{ij},\alpha_j^+)$ is the distance from $\tilde\alpha_{ij}$ to $\alpha_j^+$ and $w_j$ is the weight of attribute $A_j$.

    Definition 7. The weighted negative separation measure between alternative $S_i$ and the NIS is defined as

    $$G_i^-=\sum_{j=1}^{n}w_j\,d(\tilde\alpha_{ij},\alpha_j^-), \tag{11}$$

    where $d(\tilde\alpha_{ij},\alpha_j^-)$ is the distance from $\tilde\alpha_{ij}$ to $\alpha_j^-$ and $w_j$ is the weight of attribute $A_j$.

    Then, the closeness coefficient of each alternative with respect to the PIS and NIS is calculated as

    $$RC_i=\frac{G_i^-}{G_i^-+G_i^+}. \tag{12}$$

    Evidently, alternative $S_i$ is better when $RC_i$ is larger.
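    The aggregation step of Eqs (10)-(12) can be sketched as follows (generic helper functions under the same ITrFN representation as before; the per-attribute ideals follow the benefit/cost convention stated above, and the code is illustrative rather than the authors' own):

```python
def hamming_distance(a1, a2):
    (t1, f1), (t2, f2) = a1, a2
    return sum(abs(x - y) for x, y in zip(t1 + f1, t2 + f2)) / 8.0

BENEFIT_PIS = ((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 0.0, 0.0))
BENEFIT_NIS = ((0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0))

def closeness(row, weights, is_benefit):
    """Closeness coefficient RC_i of one alternative per Eqs (10)-(12).

    row        : list of ITrFN ratings of the alternative on the n attributes
    weights    : attribute weights w_j
    is_benefit : list of booleans; cost attributes use the reversed ideals
    """
    g_plus = g_minus = 0.0
    for alpha, w, benefit in zip(row, weights, is_benefit):
        pis, nis = (BENEFIT_PIS, BENEFIT_NIS) if benefit else (BENEFIT_NIS, BENEFIT_PIS)
        g_plus += w * hamming_distance(alpha, pis)    # Eq (10)
        g_minus += w * hamming_distance(alpha, nis)   # Eq (11)
    return g_minus / (g_minus + g_plus)               # Eq (12)
```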

    This section presents a procedure for solving MADM problems with unknown attribute weights under an ITrFN environment; it can be summarized in the following steps:

    Step 1. Provide the decision matrix $\tilde D=(\tilde\alpha_{ij})_{m\times n}$.

    Step 2. Calculate the IF entropy matrix using Eq (7).

    Step 3. Determine the weight vector of attributes by Eq (9).

    Step 4. Identify the PIS and NIS and compute the separation measures from each alternative to PIS and NIS using Eqs (10) and (11), respectively.

    Step 5. Construct the closeness coefficient of alternatives according to Eq (12).

    Step 6. Rank the alternatives according to the closeness coefficient and select the best one.

    The detailed decision process of the proposed method is shown in Figure 1.

    Figure 1.  The decision process of the proposed method.

    Online service trading generally transpires between autonomous parties in an environment where the buyer often has insufficient information about the seller and goods. Many scholars believe that trust is a prerequisite for successful trading. Therefore, buyers must be able to identify the most trustworthy seller. Suppose that a consumer desires to select a reliable seller. After preliminary screening, four candidate sellers S1 , S2 , S3 and S4 remain to be further evaluated. Based on detailed seller ratings, the consumer assesses the four candidate sellers according to five trust factors, namely, product quality (A1), service attitude (A2), website usability (A3), response time (A4) and shipping speed (A5). The first three attributes are benefit attributes, whereas the last two are cost attributes. The decision maker provides the lower and upper limits and the most possible intervals to describe these attributes. The candidate sellers' ratings concerning the attributes can be represented as ITrFNs by using statistical methods, as shown in Table 1.

    Table 1.  The ITrFN decision matrix.
         A1 | A2 | A3 | A4 | A5
    S1 | <(0.3, 0.4, 0.5, 0.5), (0.2, 0.3, 0.3, 0.4)> | <(0.1, 0.4, 0.5, 0.6), (0.1, 0.2, 0.3, 0.3)> | <(0.4, 0.4, 0.5, 0.6), (0.1, 0.2, 0.3, 0.4)> | <(0.3, 0.4, 0.5, 0.5), (0.1, 0.2, 0.3, 0.5)> | <(0.2, 0.4, 0.5, 0.5), (0.3, 0.3, 0.5, 0.5)>
    S2 | <(0.1, 0.2, 0.3, 0.4), (0.2, 0.3, 0.4, 0.5)> | <(0.1, 0.2, 0.3, 0.3), (0.1, 0.3, 0.4, 0.5)> | <(0.2, 0.3, 0.4, 0.4), (0.2, 0.3, 0.4, 0.6)> | <(0.1, 0.2, 0.3, 0.3), (0.2, 0.3, 0.4, 0.6)> | <(0.2, 0.2, 0.3, 0.4), (0.3, 0.5, 0.5, 0.5)>
    S3 | <(0.3, 0.4, 0.5, 0.5), (0.1, 0.2, 0.3, 0.5)> | <(0.2, 0.3, 0.5, 0.6), (0.1, 0.2, 0.3, 0.4)> | <(0.0, 0.2, 0.3, 0.3), (0.1, 0.3, 0.4, 0.5)> | <(0.2, 0.3, 0.4, 0.4), (0.2, 0.2, 0.4, 0.4)> | <(0.3, 0.3, 0.4, 0.4), (0.1, 0.2, 0.5, 0.5)>
    S4 | <(0.1, 0.2, 0.4, 0.5), (0.1, 0.3, 0.4, 0.5)> | <(0.0, 0.1, 0.2, 0.3), (0.2, 0.3, 0.5, 0.6)> | <(0.0, 0.1, 0.3, 0.4), (0.2, 0.3, 0.4, 0.6)> | <(0.1, 0.2, 0.4, 0.4), (0.2, 0.2, 0.3, 0.4)> | <(0.1, 0.3, 0.4, 0.4), (0.1, 0.3, 0.4, 0.4)>


    Step 1. Form the decision matrix listed in Table 1.

    Step 2. Since A1, A2 and A3 are benefit attributes and A4 and A5 are cost attributes, the PIS and NIS are

    $R^+=(\langle(1,1,1,1),(0,0,0,0)\rangle,\ \langle(1,1,1,1),(0,0,0,0)\rangle,\ \langle(1,1,1,1),(0,0,0,0)\rangle,\ \langle(0,0,0,0),(1,1,1,1)\rangle,\ \langle(0,0,0,0),(1,1,1,1)\rangle)$,
    $R^-=(\langle(0,0,0,0),(1,1,1,1)\rangle,\ \langle(0,0,0,0),(1,1,1,1)\rangle,\ \langle(0,0,0,0),(1,1,1,1)\rangle,\ \langle(1,1,1,1),(0,0,0,0)\rangle,\ \langle(1,1,1,1),(0,0,0,0)\rangle)$.

    Using Eq (7), the decision matrix is transformed into the following IF entropy matrix:

    $$\Gamma=\begin{pmatrix}0.875 & 0.825 & 0.775 & 0.825 & 1.000\\ 0.900 & 0.900 & 0.950 & 0.850 & 0.825\\ 0.850 & 0.850 & 0.875 & 0.975 & 0.975\\ 0.975 & 0.750 & 0.825 & 1.000 & 1.000\end{pmatrix}.$$

    Step 3. Utilizing Eq (9), the attribute weight vector is determined as $w=(0.20,0.20,0.19,0.20,0.21)^T$.

    Step 4. By Eqs (10) and (11), the positive and negative weighted separation measures are obtained as $G^+=(0.465,0.492,0.487,0.545)$ and $G^-=(0.535,0.508,0.513,0.455)$.

    Step 5. Using Eq (12), the closeness coefficients of the sellers are calculated as $RC=(0.535,0.508,0.513,0.455)$.

    Step 6. Since $RC_1>RC_3>RC_2>RC_4$, the best seller is $S_1$.
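    For completeness, Steps 1-6 can be scripted end to end against the data of Table 1. The sketch below re-implements the pieces shown earlier (distance, entropy, the Eq (9) weights, closeness) in one self-contained script; it should recover the ranking S1 > S3 > S2 > S4, although intermediate values may differ slightly from those reported above because of rounding.

```python
# Self-contained sketch of Steps 1-6 for the Table 1 example (illustrative, not the authors' code).
D = {  # decision matrix: seller -> list of ITrFN ratings on (A1, ..., A5)
    "S1": [((0.3,0.4,0.5,0.5),(0.2,0.3,0.3,0.4)), ((0.1,0.4,0.5,0.6),(0.1,0.2,0.3,0.3)),
           ((0.4,0.4,0.5,0.6),(0.1,0.2,0.3,0.4)), ((0.3,0.4,0.5,0.5),(0.1,0.2,0.3,0.5)),
           ((0.2,0.4,0.5,0.5),(0.3,0.3,0.5,0.5))],
    "S2": [((0.1,0.2,0.3,0.4),(0.2,0.3,0.4,0.5)), ((0.1,0.2,0.3,0.3),(0.1,0.3,0.4,0.5)),
           ((0.2,0.3,0.4,0.4),(0.2,0.3,0.4,0.6)), ((0.1,0.2,0.3,0.3),(0.2,0.3,0.4,0.6)),
           ((0.2,0.2,0.3,0.4),(0.3,0.5,0.5,0.5))],
    "S3": [((0.3,0.4,0.5,0.5),(0.1,0.2,0.3,0.5)), ((0.2,0.3,0.5,0.6),(0.1,0.2,0.3,0.4)),
           ((0.0,0.2,0.3,0.3),(0.1,0.3,0.4,0.5)), ((0.2,0.3,0.4,0.4),(0.2,0.2,0.4,0.4)),
           ((0.3,0.3,0.4,0.4),(0.1,0.2,0.5,0.5))],
    "S4": [((0.1,0.2,0.4,0.5),(0.1,0.3,0.4,0.5)), ((0.0,0.1,0.2,0.3),(0.2,0.3,0.5,0.6)),
           ((0.0,0.1,0.3,0.4),(0.2,0.3,0.4,0.6)), ((0.1,0.2,0.4,0.4),(0.2,0.2,0.3,0.4)),
           ((0.1,0.3,0.4,0.4),(0.1,0.3,0.4,0.4))],
}
IS_BENEFIT = [True, True, True, False, False]      # A1-A3 benefit, A4-A5 cost

PLUS  = ((1.0,1.0,1.0,1.0),(0.0,0.0,0.0,0.0))
MINUS = ((0.0,0.0,0.0,0.0),(1.0,1.0,1.0,1.0))

def dist(a, b):                                    # Eq (4)
    (t1, f1), (t2, f2) = a, b
    return sum(abs(x - y) for x, y in zip(t1 + f1, t2 + f2)) / 8.0

def entropy(a):                                    # Eqs (5)/(7)
    dm, dp = dist(a, MINUS), dist(a, PLUS)
    return 1.0 - 2.0 * abs(dm / (dm + dp) - 0.5)

# Steps 2-3: entropy matrix and objective weights (Eq (9))
E = [[entropy(a) for a in row] for row in D.values()]
cols = [1.0 - sum(E[i][j] for i in range(len(E))) for j in range(5)]
w = [c / sum(cols) for c in cols]

# Steps 4-5: separation measures and closeness coefficients
rc = {}
for name, row in D.items():
    gp = sum(wj * dist(a, PLUS if b else MINUS) for a, wj, b in zip(row, w, IS_BENEFIT))
    gm = sum(wj * dist(a, MINUS if b else PLUS) for a, wj, b in zip(row, w, IS_BENEFIT))
    rc[name] = gm / (gm + gp)

# Step 6: ranking
print(sorted(rc, key=rc.get, reverse=True))        # expected: ['S1', 'S3', 'S2', 'S4']
```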

    Given that different attribute weights may produce different decision results, this section carries out a sensitivity analysis on the attribute weights to observe whether they change the ranking of the four trustworthy sellers. After expert discussion, the weights of product quality, service attitude and website usability are considered appropriate and kept unchanged, and we analyze the sellers' ranking when the remaining weights satisfy $w_4+w_5=0.41$. When $0\le w_4\le 0.24$, the ranking of the four sellers is $S_1>S_3>S_2>S_4$. If $w_4=0.25$, the ranking is $S_1>S_3=S_2>S_4$. When $0.26\le w_4\le 0.35$, the ranking is $S_1>S_2>S_3>S_4$. When $0.36\le w_4\le 0.41$, the ranking is $S_2>S_1>S_3>S_4$. These results reveal the importance of attribute weights in decision making.

    This section compares the proposed method with the MADM method based on the generalized IF entropy developed by Garg [26]. We apply the proposed method, with appropriate modifications, to the supplier selection problem given in [26], in which the attribute ratings are IVIFNs. Specifically, when $t_{lA}=t_{m1A}$, $t_{m2A}=t_{hA}$, $f_{lA}=f_{m1A}$ and $f_{m2A}=f_{hA}$, the ITrFN $\tilde\alpha$ reduces to an IVIFN, with $\alpha^+=\langle[1,1],[0,0]\rangle$ and $\alpha^-=\langle[0,0],[1,1]\rangle$, and the distance measure in Eq (4) reduces to $d(\tilde\alpha_1,\tilde\alpha_2)=\frac14\bigl(|t_{l1}-t_{l2}|+|t_{h1}-t_{h2}|+|f_{l1}-f_{l2}|+|f_{h1}-f_{h2}|\bigr)$. According to Eq (6), we obtain the entropy of each IVIFN. Using the proposed decision procedure, the ranking order of the suppliers is $A_4\succ A_5\succ A_3\succ A_2\succ A_1$, which is the same as that obtained by the method in [26]. Hence, the proposed method is also suitable for MADM problems with unknown attribute weights under an IVIF environment, whereas the method in [26] cannot address decision problems with ITrFNs. Moreover, the proposed method is more general in that it handles ITrFNs rather than only the IVIFNs employed in [26]. The proposed method also has shortcomings; for example, it is not suitable for MADM problems with incompletely known attribute weight information in an ITrFS environment. To solve this problem, we can define the cross-entropy of ITrFSs by learning from the cross-entropy of IFSs [27], and then construct programming models based on the cross-entropy of ITrFSs to obtain the attribute weights.

    This study presented a TOPSIS-based entropy method to solve MADM problems with ITrFNs and unknown attribute weight information. We applied ITrFNs to MADM problems to address the imprecise and vague decision data in actual MADM environments. We developed a distance measure for ITrFNs and discussed its properties. We put forward a TOPSIS-based entropy measure for ITrFNs, whose entropy axioms are easy to understand and compute because they only require identifying the largest and smallest elements. Further, we provided an objective attribute weighting method by using the proposed entropy measure. Then, by combining TOPSIS with the entropy-weighting approach, a MADM method was proposed to select the best alternative. Finally, an online trustworthy service evaluation example indicated that the proposed MADM method is practical and useful. Our future research will cover the following four aspects. (1) We will construct additional entropy measures for ITrFNs and study the relationship between the entropy and similarity measures of ITrFNs. (2) We will extend the proposed method to a decision environment with linguistic interval-valued Atanassov IFSs [28]. (3) The proposed method will be used for large group decision-making problems [29] by integrating the evaluation information into ITrFNs. (4) The proposed method will be applied to the evaluation of text classification [30] and financial risk analysis [31] in an ITrFS environment.

    This work was supported by the National Natural Science Foundation of China (Nos. 61602219 and 71662014), the Science and Technology Project of the Jiangxi Province Education Department of China (No. GJJ181482), and the Natural Science Fund Project of the Jiangxi Science and Technology Department (No. 20202BABL202027).

    The authors declare there is no conflict of interest.



    [1] B. Li, Y. Chen and S. Yu, Review of information extraction research, Comput. Eng. Appl., 10 (2003), 1–5+66. (in Chinese)
    [2] Y. Liu, R. Jin and J. Y. Chai, et al., A maximum coherence model for dictionary-based cross-language information retrieval, Proceedings of the 28th Annual International ACM SIGIR Conference; 2005 August 15–19; Salvador, Brazil. New York: ACM; 536–543.
    [3] A. L. Berger, V. J. D. Pietra and S. A. D. Pietra, A maximum entropy approach to natural language processing, Comput. Linguist., 22 (1996), 39–71.
    [4] W. Huang and Y. Sun, Chinese short text sentiment analysis based on maximum entropy, Comput. Eng. Des., 38 (2017), 138–143. (in Chinese)
    [5] Y. Lin, Y. Liu and S. Zhou, Text information extraction based on maximum entropy of hidden Markov model, Acta Electronica Sinic, 33 (2005), 236–240. (in Chinese)
    [6] K. Seymore, A. Mccallum and R. Rosenfeld, Learning hidden Markov model structure for information extraction, In Aaai'99 Workshop Machine Learning for Information Extraction, (1999), 37–42.
    [7] C. Chi and Y. Zhang, Information extraction from chinese papers based on hidden Markov model, Adv. Mater. Res., 846 (2014), 1291–1294.
    [8] Y. Liu, Y. Lin and Z. Chen, Text information extraction based on hidden Markov model, J. Syst. Simulat., 16 (2004), 507–510. (in Chinese)
    [9] S. Zhe, Research and application of hidden Markov model in web page information extraction, Ph.D thesis, East China Normal University, 2016. (in Chinese)
    [10] S. Zhou, Y. Lin and Y. Wang, et al., Text information extraction based on clustered hidden Markov model, J. Syst. Simulat., 19 (2007), 4926–4931.
    [11] Q. Du, H. Wang and Z. Shao, et al., Research on the extraction method of literature metadata based on hybrid HMM, Comput. and Digit. Eng., 45 (2017), 101–106. (in Chinese)
    [12] F. Ciravegna and A. Lavelli, Learning Pinocchio: adaptive information extraction for real world applications, J. Nat. Lang. Eng., 10 (2004), 145–165.
    [13] W. Yu, G. Guan and M. Zhou, et al., CV information extraction based on two-level cascade text classification, J. Chinese Inform. Process. 20 (2006), 59–66.
    [14] K. Yu, G. Guan and M. Zhou, Resume information extraction with cascaded hybrid model, Proceedings of the 43rd Annual Meeting of the ACL; 2005 June; Ann Arbor, Michigan. Association for Computational Linguistics; 499–506.
    [15] Q. Wang and F. Li, Wikipedia-based resume extraction of personal name information, Comput. Appl. Softw., 28 (2011), 170–174. (in Chinese)
    [16] N. Ren, Research on the extraction of character title information in large-scale real texts, Ph.D thesis, Beijing Language and Culture University, 2008.
    [17] N. Gu, W. Feng and X. Sun, et al., Chinese resume automatic analysis and recommendation algorithm, Comput. Eng. Appl., 53 (2017), 141–148+270. (in Chinese)
  • © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
