On Balancing between Optimal and Proportional categorical predictions

  • A bias-variance dilemma in categorical data mining and analysis is the fact that a prediction method can aim at either maximizing the overall point-hit accuracy without constraint or with the constraint of minimizing the distribution bias. However, one can hardly achieve both at the same time. A scheme to balance these two prediction objectives is proposed in this article. An experiment with a real data set is conducted to demonstrate some of the scheme's characteristics. Some basic properties of the scheme are also discussed.

    Citation: Wenxue Huang, Yuanyi Pan. On Balancing between Optimal and Proportional categorical predictions[J]. Big Data and Information Analytics, 2016, 1(1): 129-137. doi: 10.3934/bdia.2016.1.129



    Fibonacci polynomials and Lucas polynomials are important in various fields such as number theory, probability theory, numerical analysis, and physics. In addition, many well-known polynomials, such as Pell polynomials, Pell-Lucas polynomials, and Tribonacci polynomials, are generalizations of Fibonacci and Lucas polynomials. In this paper, we extend linear recursive polynomials to the nonlinear case; that is, we discuss some basic properties of the bi-periodic Fibonacci and Lucas polynomials.

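    The bi-periodic Fibonacci polynomials $\{f_n(t)\}$ and Lucas polynomials $\{l_n(t)\}$ are defined recursively by

    $$f_0(t)=0,\quad f_1(t)=1,\quad f_n(t)=\begin{cases} a t f_{n-1}(t)+f_{n-2}(t), & n\equiv 0 \pmod 2,\\ b t f_{n-1}(t)+f_{n-2}(t), & n\equiv 1 \pmod 2,\end{cases}\quad n\ge 2,$$

    and

    $$l_0(t)=2,\quad l_1(t)=at,\quad l_n(t)=\begin{cases} b t l_{n-1}(t)+l_{n-2}(t), & n\equiv 0 \pmod 2,\\ a t l_{n-1}(t)+l_{n-2}(t), & n\equiv 1 \pmod 2,\end{cases}\quad n\ge 2,$$

    where $a$ and $b$ are nonzero real numbers. For $t=1$, the bi-periodic Fibonacci and Lucas polynomials reduce to the well-known bi-periodic Fibonacci $\{f_n\}$ and Lucas $\{l_n\}$ sequences, respectively. We let

    $$\varsigma(n)=\begin{cases} 0, & n\equiv 0 \pmod 2,\\ 1, & n\equiv 1 \pmod 2.\end{cases}$$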
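    The two recurrences can be evaluated directly. The following is a minimal sketch rather than code from the paper: the function names `biperiodic_fibonacci` and `biperiodic_lucas` are chosen here for illustration, and the polynomials are evaluated at numeric values of a, b, and t instead of being handled symbolically.

```python
def biperiodic_fibonacci(n, a, b, t):
    """Evaluate the bi-periodic Fibonacci polynomial f_n at t."""
    prev, cur = 0, 1                      # f_0 and f_1
    if n == 0:
        return prev
    for i in range(2, n + 1):
        coef = a if i % 2 == 0 else b     # a for even index, b for odd index
        prev, cur = cur, coef * t * cur + prev
    return cur


def biperiodic_lucas(n, a, b, t):
    """Evaluate the bi-periodic Lucas polynomial l_n at t."""
    prev, cur = 2, a * t                  # l_0 and l_1
    if n == 0:
        return prev
    for i in range(2, n + 1):
        coef = b if i % 2 == 0 else a     # b for even index, a for odd index
        prev, cur = cur, coef * t * cur + prev
    return cur


# With a = b = t = 1 the recurrences reduce to the classical ones.
print([biperiodic_fibonacci(n, 1, 1, 1) for n in range(8)])
print([biperiodic_lucas(n, 1, 1, 1) for n in range(8)])
```

    Note that the roles of a and b are swapped between the two recurrences: the even-index step uses a for f_n and b for l_n.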

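    In [1], the authors give the Binet formulas of the bi-periodic Fibonacci and Lucas polynomials as follows:

    $$f_n(t)=\frac{a^{\varsigma(n+1)}}{(ab)^{\lfloor n/2\rfloor}}\left(\frac{\sigma^n(t)-\tau^n(t)}{\sigma(t)-\tau(t)}\right), \tag{1.1}$$

    and

    $$l_n(t)=\frac{a^{\varsigma(n)}}{(ab)^{\lfloor (n+1)/2\rfloor}}\left(\sigma^n(t)+\tau^n(t)\right), \tag{1.2}$$

    where $n\ge 0$, and $\sigma(t)$ and $\tau(t)$ are the zeros of $\lambda^2-abt\lambda-ab$, that is, $\sigma(t)=\frac{abt+\sqrt{a^2b^2t^2+4ab}}{2}$ and $\tau(t)=\frac{abt-\sqrt{a^2b^2t^2+4ab}}{2}$. We note the following algebraic properties of $\sigma(t)$ and $\tau(t)$:

    $$\sigma(t)+\tau(t)=abt,\quad \sigma(t)-\tau(t)=\sqrt{a^2b^2t^2+4ab},\quad \sigma(t)\tau(t)=-ab.$$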
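    As a sanity check, the Binet formulas can be compared numerically with the recurrences. The sketch below is illustrative only: the names `recurrence` and `binet` are introduced here, floating-point arithmetic is used, and ab > 0 is assumed so that the square root is real.

```python
import math


def recurrence(n, a, b, t):
    """Return (f_n(t), l_n(t)) computed from the defining recurrences."""
    f_prev, f_cur = 0, 1
    l_prev, l_cur = 2, a * t
    if n == 0:
        return f_prev, l_prev
    for i in range(2, n + 1):
        cf, cl = (a, b) if i % 2 == 0 else (b, a)
        f_prev, f_cur = f_cur, cf * t * f_cur + f_prev
        l_prev, l_cur = l_cur, cl * t * l_cur + l_prev
    return f_cur, l_cur


def binet(n, a, b, t):
    """Return (f_n(t), l_n(t)) from the Binet formulas; assumes ab > 0."""
    root = math.sqrt(a * a * b * b * t * t + 4 * a * b)   # sigma - tau
    sigma = (a * b * t + root) / 2
    tau = (a * b * t - root) / 2
    # Exponents: parity of n+1 and floor(n/2) for f_n, parity of n and floor((n+1)/2) for l_n.
    f_n = a ** ((n + 1) % 2) / (a * b) ** (n // 2) * (sigma ** n - tau ** n) / root
    l_n = a ** (n % 2) / (a * b) ** ((n + 1) // 2) * (sigma ** n + tau ** n)
    return f_n, l_n


for n in range(6):
    print(n, recurrence(n, 2, 3, 0.5), binet(n, 2, 3, 0.5))
```

    The two columns agree up to rounding error; an exact comparison would need symbolic or algebraic-number arithmetic, which this sketch does not attempt.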

    Many scholars have studied the properties of bi-periodic Fibonacci and Lucas polynomials; see [2,3,4,5,6]. In addition, many scholars have studied the power sums of second-order linear recurrences and their divisibility properties; see [7,8,9,10].

    Taking $a=b=1$ and $t=1$, we obtain the Fibonacci $\{F_n\}$ and Lucas $\{L_n\}$ sequences. Melham [11] proposed the following conjectures:

    Conjecture 1. Let $m\ge 1$ be an integer. Then the sum

    $$L_1L_3L_5\cdots L_{2m+1}\sum_{k=1}^{n}F_{2k}^{2m+1}$$

    can be represented as $(F_{2n+1}-1)^2R_{2m-1}(F_{2n+1})$, where $R_{2m-1}(t)$ is a polynomial with integer coefficients of degree $2m-1$.

    Conjecture 2. Let $m\ge 1$ be an integer. Then the sum

    $$L_1L_3L_5\cdots L_{2m+1}\sum_{k=1}^{n}L_{2k}^{2m+1}$$

    can be represented as $(L_{2n+1}-1)Q_{2m}(L_{2n+1})$, where $Q_{2m}(t)$ is a polynomial with integer coefficients of degree $2m$.
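    To make the two statements concrete, here is a small worked instance for m = 1 and n = 2. It is an illustration added here, not a computation from the paper, and it uses only the standard values F_2 = 1, F_4 = 3, F_5 = 5, L_1 = 1, L_2 = 3, L_3 = 4, L_4 = 7, L_5 = 11.

```latex
\begin{aligned}
L_1L_3\left(F_2^{3}+F_4^{3}\right) &= 1\cdot 4\cdot(1+27) = 112 = (F_5-1)^2\cdot 7 = 16\cdot 7,\\
L_1L_3\left(L_2^{3}+L_4^{3}\right) &= 1\cdot 4\cdot(27+343) = 1480 = (L_5-1)\cdot 148 = 10\cdot 148.
\end{aligned}
```

    Both products factor as the conjectures predict for this case, with integer cofactors 7 and 148.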

    In [12], the authors completely solved Conjecture 2 and discussed Conjecture 1. In this paper, we use the definition and properties of bi-periodic Fibonacci and Lucas polynomials to study their power sums and the divisibility properties of these sums. The results are as follows:

    Theorem 1. We get the identities

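    $$\sum_{k=1}^{n}f_{2k}^{2m+1}(t)=\frac{a^{2m+1}}{b\,(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m+1}{m-j}\left(\frac{f_{(2n+1)(2j+1)}(t)-f_{2j+1}(t)}{l_{2j+1}(t)}\right), \tag{1.3}$$
    $$\sum_{k=1}^{n}f_{2k+1}^{2m+1}(t)=\frac{(ab)^m}{(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}\binom{2m+1}{m-j}\left(\frac{f_{(2n+2)(2j+1)}(t)-f_{2(2j+1)}(t)}{l_{2j+1}(t)}\right), \tag{1.4}$$
    $$\sum_{k=1}^{n}l_{2k}^{2m+1}(t)=\sum_{j=0}^{m}\binom{2m+1}{m-j}\left(\frac{l_{(2n+1)(2j+1)}(t)-l_{2j+1}(t)}{l_{2j+1}(t)}\right), \tag{1.5}$$
    $$\sum_{k=1}^{n}l_{2k+1}^{2m+1}(t)=\frac{a^{m+1}}{b^{m+1}}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m+1}{m-j}\left(\frac{l_{(2n+2)(2j+1)}(t)-l_{2(2j+1)}(t)}{l_{2j+1}(t)}\right), \tag{1.6}$$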

    where n and m are positive integers.
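    The four identities of Theorem 1 can be spot-checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names `fib_bp`, `luc_bp`, and `theorem1_sides` are introduced here, the prefactor of (1.3) is read as a^(2m+1) / (b (a²b²t²+4ab)^m), and positive integer or `Fraction` inputs are assumed so that no denominator vanishes.

```python
from fractions import Fraction
from math import comb


def fib_bp(n, a, b, t):
    """Bi-periodic Fibonacci polynomial f_n evaluated at t."""
    prev, cur = 0, 1
    for i in range(2, n + 1):
        prev, cur = cur, (a if i % 2 == 0 else b) * t * cur + prev
    return cur if n else 0


def luc_bp(n, a, b, t):
    """Bi-periodic Lucas polynomial l_n evaluated at t."""
    prev, cur = 2, a * t
    for i in range(2, n + 1):
        prev, cur = cur, (b if i % 2 == 0 else a) * t * cur + prev
    return cur if n else 2


def theorem1_sides(n, m, a, b, t):
    """Return (lhs, rhs) pairs for identities (1.3)-(1.6)."""
    F = lambda i: fib_bp(i, a, b, t)
    L = lambda i: luc_bp(i, a, b, t)
    disc_m = (a * a * b * b * t * t + 4 * a * b) ** m
    w = lambda j: comb(2 * m + 1, m - j)      # binomial weight
    s = lambda j: (-1) ** (m - j)             # alternating sign
    js = range(m + 1)
    rhs13 = Fraction(a ** (2 * m + 1), b * disc_m) * sum(
        s(j) * w(j) * Fraction(F((2 * n + 1) * (2 * j + 1)) - F(2 * j + 1), L(2 * j + 1)) for j in js)
    rhs14 = Fraction((a * b) ** m, disc_m) * sum(
        w(j) * Fraction(F((2 * n + 2) * (2 * j + 1)) - F(2 * (2 * j + 1)), L(2 * j + 1)) for j in js)
    rhs15 = sum(
        w(j) * Fraction(L((2 * n + 1) * (2 * j + 1)) - L(2 * j + 1), L(2 * j + 1)) for j in js)
    rhs16 = Fraction(a ** (m + 1), b ** (m + 1)) * sum(
        s(j) * w(j) * Fraction(L((2 * n + 2) * (2 * j + 1)) - L(2 * (2 * j + 1)), L(2 * j + 1)) for j in js)
    lhs = lambda seq, shift: sum(seq(2 * k + shift) ** (2 * m + 1) for k in range(1, n + 1))
    return [(lhs(F, 0), rhs13), (lhs(F, 1), rhs14), (lhs(L, 0), rhs15), (lhs(L, 1), rhs16)]


for lhs_value, rhs_value in theorem1_sides(n=2, m=1, a=2, b=3, t=1):
    print(lhs_value, rhs_value, lhs_value == rhs_value)
```

    A numeric check of this kind does not replace the proof, but it is a quick way to catch a misplaced factor or sign when transcribing the formulas.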

    Theorem 2. We get the identities

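    $$\sum_{k=1}^{n}f_{2k}^{2m}(t)=\frac{a^{2m}}{(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m}{m-j}\frac{f_{2j(2n+1)}(t)}{f_{2j}(t)}-\frac{a^{2m}}{(a^2b^2t^2+4ab)^m}\binom{2m}{m}(-1)^m\left(n+\frac{1}{2}\right), \tag{1.7}$$
    $$\sum_{k=1}^{n}f_{2k+1}^{2m}(t)=\frac{(ab)^m}{(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}\binom{2m}{m-j}\left(\frac{f_{2j(2n+2)}(t)-f_{4j}(t)}{f_{2j}(t)}\right)-\frac{(ab)^m}{(a^2b^2t^2+4ab)^m}\binom{2m}{m}n, \tag{1.8}$$
    $$\sum_{k=1}^{n}l_{2k}^{2m}(t)=\sum_{j=0}^{m}\binom{2m}{m-j}\frac{f_{2j(2n+1)}(t)}{f_{2j}(t)}-2^{2m-1}-\binom{2m}{m}\left(n+\frac{1}{2}\right), \tag{1.9}$$
    $$\sum_{k=1}^{n}l_{2k+1}^{2m}(t)=\frac{a^m}{b^m}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m}{m-j}\left(\frac{f_{2j(2n+2)}(t)-f_{4j}(t)}{f_{2j}(t)}\right)-\frac{a^m}{b^m}\binom{2m}{m}(-1)^m n, \tag{1.10}$$

    where $n$ and $m$ are positive integers. The $j=0$ terms are formally $0/0$ and are read as their limiting values, namely $f_{2j(2n+1)}(t)/f_{2j}(t)\to 2n+1$ and $(f_{2j(2n+2)}(t)-f_{4j}(t))/f_{2j}(t)\to 2n$.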
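    The even-power identities of Theorem 2 can be spot-checked in the same way. The sketch below rests on two reading assumptions that should be kept in mind: the j = 0 terms, which are formally 0/0, are replaced by the limiting values 2n+1 and 2n, and the summand of (1.9) is taken to be f_{2j(2n+1)}(t)/f_{2j}(t). The helper names are introduced here for illustration, and positive integer inputs are assumed.

```python
from fractions import Fraction
from math import comb


def fib_bp(n, a, b, t):
    """Bi-periodic Fibonacci polynomial f_n evaluated at t."""
    prev, cur = 0, 1
    for i in range(2, n + 1):
        prev, cur = cur, (a if i % 2 == 0 else b) * t * cur + prev
    return cur if n else 0


def luc_bp(n, a, b, t):
    """Bi-periodic Lucas polynomial l_n evaluated at t."""
    prev, cur = 2, a * t
    for i in range(2, n + 1):
        prev, cur = cur, (b if i % 2 == 0 else a) * t * cur + prev
    return cur if n else 2


def theorem2_sides(n, m, a, b, t):
    """Return (lhs, rhs) pairs for identities (1.7)-(1.10)."""
    F = lambda i: fib_bp(i, a, b, t)
    L = lambda i: luc_bp(i, a, b, t)
    disc_m = (a * a * b * b * t * t + 4 * a * b) ** m
    c = comb(2 * m, m)
    w = lambda j: comb(2 * m, m - j)
    s = lambda j: (-1) ** (m - j)
    # Ratios appearing in the sums; the j = 0 values are the assumed limits.
    r1 = lambda j: Fraction(F(2 * j * (2 * n + 1)), F(2 * j)) if j else Fraction(2 * n + 1)
    r2 = lambda j: Fraction(F(2 * j * (2 * n + 2)) - F(4 * j), F(2 * j)) if j else Fraction(2 * n)
    half = Fraction(2 * n + 1, 2)             # n + 1/2
    A = Fraction(a ** (2 * m), disc_m)
    B = Fraction((a * b) ** m, disc_m)
    C = Fraction(a ** m, b ** m)
    js = range(m + 1)
    rhs17 = A * sum(s(j) * w(j) * r1(j) for j in js) - A * c * (-1) ** m * half
    rhs18 = B * sum(w(j) * r2(j) for j in js) - B * c * n
    rhs19 = sum(w(j) * r1(j) for j in js) - 2 ** (2 * m - 1) - c * half
    rhs110 = C * sum(s(j) * w(j) * r2(j) for j in js) - C * c * (-1) ** m * n
    lhs = lambda seq, shift: sum(seq(2 * k + shift) ** (2 * m) for k in range(1, n + 1))
    return [(lhs(F, 0), rhs17), (lhs(F, 1), rhs18), (lhs(L, 0), rhs19), (lhs(L, 1), rhs110)]


for lhs_value, rhs_value in theorem2_sides(n=1, m=1, a=2, b=3, t=1):
    print(lhs_value, rhs_value, lhs_value == rhs_value)
```

    For a = 2, b = 3, t = 1, n = m = 1 the left-hand sides are f_2² = 4, f_3² = 49, l_2² = 64, and l_3² = 324.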

    As an application of Theorem 1, we obtain the following:

    Corollary 1. We get the congruences:

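    $$b\,l_1(t)l_3(t)\cdots l_{2m+1}(t)\sum_{k=1}^{n}f_{2k}^{2m+1}(t)\equiv 0 \pmod{f_{2n+1}(t)-1}, \tag{1.11}$$

    and

    $$a\,l_1(t)l_3(t)\cdots l_{2m+1}(t)\sum_{k=1}^{n}l_{2k}^{2m+1}(t)\equiv 0 \pmod{l_{2n+1}(t)-at}, \tag{1.12}$$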

    where n and m are positive integers.

    Taking $t=1$ in Corollary 1, we have the following conclusions for the bi-periodic Fibonacci $\{f_n\}$ and Lucas $\{l_n\}$ sequences.

    Corollary 2. We get the congruences:

    $$b\,l_1l_3\cdots l_{2m+1}\sum_{k=1}^{n}f_{2k}^{2m+1}\equiv 0 \pmod{f_{2n+1}-1}, \tag{1.13}$$

    and

    $$a\,l_1l_3\cdots l_{2m+1}\sum_{k=1}^{n}l_{2k}^{2m+1}\equiv 0 \pmod{l_{2n+1}-a}, \tag{1.14}$$

    where $n$ and $m$ are positive integers.

    Taking $a=b=1$ and $t=1$ in Corollary 1, we have the following conclusions for the Fibonacci $\{F_n\}$ and Lucas $\{L_n\}$ sequences.

    Corollary 3. We get the congruences:

    $$L_1L_3\cdots L_{2m+1}\sum_{k=1}^{n}F_{2k}^{2m+1}\equiv 0 \pmod{F_{2n+1}-1}, \tag{1.15}$$

    and

    $$L_1L_3\cdots L_{2m+1}\sum_{k=1}^{n}L_{2k}^{2m+1}\equiv 0 \pmod{L_{2n+1}-1}, \tag{1.16}$$

    where $n$ and $m$ are positive integers.
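    The two congruences of Corollary 3 involve only integers, so they are easy to test for small cases. The sketch below is illustrative; the function names `fib`, `lucas`, and `corollary3_holds` are introduced here, and n and m are assumed to be positive integers.

```python
from math import prod


def fib(n):
    """Classical Fibonacci number F_n with F_0 = 0, F_1 = 1."""
    x, y = 0, 1
    for _ in range(n):
        x, y = y, x + y
    return x


def lucas(n):
    """Classical Lucas number L_n with L_0 = 2, L_1 = 1."""
    x, y = 2, 1
    for _ in range(n):
        x, y = y, x + y
    return x


def corollary3_holds(n, m):
    """Check congruences (1.15) and (1.16) for the given n and m."""
    lucas_product = prod(lucas(2 * i + 1) for i in range(m + 1))   # L_1 L_3 ... L_{2m+1}
    fib_sum = sum(fib(2 * k) ** (2 * m + 1) for k in range(1, n + 1))
    lucas_sum = sum(lucas(2 * k) ** (2 * m + 1) for k in range(1, n + 1))
    ok_fib = (lucas_product * fib_sum) % (fib(2 * n + 1) - 1) == 0
    ok_lucas = (lucas_product * lucas_sum) % (lucas(2 * n + 1) - 1) == 0
    return ok_fib and ok_lucas


print(all(corollary3_holds(n, m) for n in range(1, 7) for m in range(1, 4)))
```

    For n = 1 the first modulus is F_3 - 1 = 1, so (1.15) is trivially satisfied there; the informative cases start at n = 2.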

    We first give several lemmas that are needed to prove the theorems.

    Lemma 1. We get the congruence

    $$f_{(2n+1)(2j+1)}(t)-f_{2j+1}(t)\equiv 0 \pmod{f_{2n+1}(t)-1},$$

    where $n$ is a positive integer and $j$ is a nonnegative integer.

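    Proof. We prove it by complete induction on $j\ge 0$. The claim clearly holds when $j=0$. If $j=1$, we note that $ab\,f_{3(2n+1)}(t)=(a^2b^2t^2+4ab)f_{2n+1}^3(t)-3ab\,f_{2n+1}(t)$, and we obtain

    $$\begin{aligned}
    f_{3(2n+1)}(t)-f_3(t)&=(abt^2+4)f_{2n+1}^3(t)-3f_{2n+1}(t)-(abt^2+4)f_1^3(t)+3f_1(t)\\
    &=(abt^2+4)\left(f_{2n+1}(t)-f_1(t)\right)\left(f_{2n+1}^2(t)+f_{2n+1}(t)f_1(t)+f_1^2(t)\right)-3\left(f_{2n+1}(t)-f_1(t)\right)\\
    &=(abt^2+4)\left(f_{2n+1}(t)-1\right)\left(f_{2n+1}^2(t)+f_{2n+1}(t)f_1(t)+f_1^2(t)\right)-3\left(f_{2n+1}(t)-1\right)\\
    &\equiv 0 \pmod{f_{2n+1}(t)-1}.
    \end{aligned}$$

    Hence the claim holds for $j=1$. Assume that Lemma 1 holds for $j=1,2,\dots,k$, that is,

    $$f_{(2n+1)(2j+1)}(t)-f_{2j+1}(t)\equiv 0 \pmod{f_{2n+1}(t)-1}.$$

    For $j=k+1\ge 2$, we have

    $$l_{2(2n+1)}(t)f_{(2n+1)(2k+1)}(t)=f_{(2n+1)(2k+3)}(t)+f_{(2n+1)(2k-1)}(t),$$

    and

    $$ab\,l_{2(2n+1)}(t)=(a^2b^2t^2+4ab)f_{2n+1}^2(t)-2ab\equiv(a^2b^2t^2+4ab)f_1^2(t)-2ab \pmod{f_{2n+1}(t)-1}.$$

    We have

    $$\begin{aligned}
    f_{(2n+1)(2k+3)}(t)-f_{2k+3}(t)&=l_{2(2n+1)}(t)f_{(2n+1)(2k+1)}(t)-f_{(2n+1)(2k-1)}(t)-l_2(t)f_{2k+1}(t)+f_{2k-1}(t)\\
    &\equiv\left((abt^2+4)f_1^2(t)-2\right)f_{(2n+1)(2k+1)}(t)-f_{(2n+1)(2k-1)}(t)-\left((abt^2+4)f_1^2(t)-2\right)f_{2k+1}(t)+f_{2k-1}(t)\\
    &\equiv\left((abt^2+4)f_1^2(t)-2\right)\left(f_{(2n+1)(2k+1)}(t)-f_{2k+1}(t)\right)-\left(f_{(2n+1)(2k-1)}(t)-f_{2k-1}(t)\right)\\
    &\equiv 0 \pmod{f_{2n+1}(t)-1}.
    \end{aligned}$$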

    This completely proves Lemma 1.
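    For integer values of a, b, and t, Lemma 1 becomes a statement about integer divisibility, and the identities used in its proof can be checked exactly. The sketch below is an illustration with helper names introduced here; it writes the product identity as l_{2N} f_{N(2k+1)} = f_{N(2k+3)} + f_{N(2k-1)} with N = 2n+1, and it assumes positive integer inputs so that the modulus f_{2n+1} - 1 is nonzero.

```python
def fib_bp(n, a, b, t):
    """Bi-periodic Fibonacci polynomial f_n evaluated at t."""
    prev, cur = 0, 1
    for i in range(2, n + 1):
        prev, cur = cur, (a if i % 2 == 0 else b) * t * cur + prev
    return cur if n else 0


def luc_bp(n, a, b, t):
    """Bi-periodic Lucas polynomial l_n evaluated at t."""
    prev, cur = 2, a * t
    for i in range(2, n + 1):
        prev, cur = cur, (b if i % 2 == 0 else a) * t * cur + prev
    return cur if n else 2


def lemma1_residue(n, j, a, b, t):
    """Residue of f_{(2n+1)(2j+1)} - f_{2j+1} modulo f_{2n+1} - 1."""
    modulus = fib_bp(2 * n + 1, a, b, t) - 1
    diff = fib_bp((2 * n + 1) * (2 * j + 1), a, b, t) - fib_bp(2 * j + 1, a, b, t)
    return diff % modulus


def lemma1_proof_identities_hold(n, k, a, b, t):
    """Check the three identities used in the induction, with N = 2n + 1 and k >= 1."""
    N = 2 * n + 1
    f_N = fib_bp(N, a, b, t)
    q = a * b * t * t + 4
    tripling = fib_bp(3 * N, a, b, t) == q * f_N ** 3 - 3 * f_N
    doubling = luc_bp(2 * N, a, b, t) == q * f_N ** 2 - 2
    product = (luc_bp(2 * N, a, b, t) * fib_bp(N * (2 * k + 1), a, b, t)
               == fib_bp(N * (2 * k + 3), a, b, t) + fib_bp(N * (2 * k - 1), a, b, t))
    return tripling and doubling and product


print(lemma1_residue(1, 1, 2, 3, 1), lemma1_proof_identities_hold(1, 1, 2, 3, 1))
```

    With a = 2, b = 3, t = 1, n = j = 1 the difference is f_9 - f_3 = 3409 - 7 = 3402, which is divisible by f_3 - 1 = 6.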

    Lemma 2. We get the congruence

    $$a\,l_{(2n+1)(2j+1)}(t)-a\,l_{2j+1}(t)\equiv 0 \pmod{l_{2n+1}(t)-at},$$

    where $n$ is a positive integer and $j$ is a nonnegative integer.

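    Proof. We prove it by complete induction on $j\ge 0$. The claim clearly holds when $j=0$. If $j=1$, we note that $a\,l_{3(2n+1)}(t)=b\,l_{2n+1}^3(t)+3a\,l_{2n+1}(t)$, and we obtain

    $$\begin{aligned}
    a\,l_{3(2n+1)}(t)-a\,l_3(t)&=b\,l_{2n+1}^3(t)+3a\,l_{2n+1}(t)-b\,l_1^3(t)-3a\,l_1(t)\\
    &=\left(l_{2n+1}(t)-l_1(t)\right)\left(b\,l_{2n+1}^2(t)+b\,l_{2n+1}(t)l_1(t)+b\,l_1^2(t)\right)+3a\left(l_{2n+1}(t)-l_1(t)\right)\\
    &=\left(l_{2n+1}(t)-at\right)\left(b\,l_{2n+1}^2(t)+abt\,l_{2n+1}(t)+a^2b\,t^2\right)+3a\left(l_{2n+1}(t)-at\right)\\
    &\equiv 0 \pmod{l_{2n+1}(t)-at}.
    \end{aligned}$$

    Hence the claim holds for $j=1$. Assume that Lemma 2 holds for $j=1,2,\dots,k$, that is,

    $$a\,l_{(2n+1)(2j+1)}(t)-a\,l_{2j+1}(t)\equiv 0 \pmod{l_{2n+1}(t)-at}.$$

    For $j=k+1\ge 2$, we have

    $$l_{2(2n+1)}(t)\,l_{(2n+1)(2k+1)}(t)=l_{(2n+1)(2k+3)}(t)+l_{(2n+1)(2k-1)}(t),$$

    and

    $$a\,l_{2(2n+1)}(t)=b\,l_{2n+1}^2(t)+2a\equiv b\,l_1^2(t)+2a \pmod{l_{2n+1}(t)-at}.$$

    We have

    $$\begin{aligned}
    a\,l_{(2n+1)(2k+3)}(t)-a\,l_{2k+3}(t)&=a\left(l_{2(2n+1)}(t)\,l_{(2n+1)(2k+1)}(t)-l_{(2n+1)(2k-1)}(t)\right)-a\left(l_2(t)\,l_{2k+1}(t)-l_{2k-1}(t)\right)\\
    &\equiv\left(b\,l_1^2(t)+2a\right)l_{(2n+1)(2k+1)}(t)-a\,l_{(2n+1)(2k-1)}(t)-\left(b\,l_1^2(t)+2a\right)l_{2k+1}(t)+a\,l_{2k-1}(t)\\
    &\equiv(abt^2+2)\left(a\,l_{(2n+1)(2k+1)}(t)-a\,l_{2k+1}(t)\right)-\left(a\,l_{(2n+1)(2k-1)}(t)-a\,l_{2k-1}(t)\right)\\
    &\equiv 0 \pmod{l_{2n+1}(t)-at}.
    \end{aligned}$$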

    This completely proves Lemma 2.
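    Lemma 2 and the identities behind its induction step can be checked in the same spirit for integer a, b, and t. As before, this is only a sketch: the helper names are introduced here, N stands for 2n+1, and positive integer inputs are assumed so that the modulus l_{2n+1} - at is nonzero.

```python
def luc_bp(n, a, b, t):
    """Bi-periodic Lucas polynomial l_n evaluated at t."""
    prev, cur = 2, a * t
    for i in range(2, n + 1):
        prev, cur = cur, (b if i % 2 == 0 else a) * t * cur + prev
    return cur if n else 2


def lemma2_residue(n, j, a, b, t):
    """Residue of a*l_{(2n+1)(2j+1)} - a*l_{2j+1} modulo l_{2n+1} - a*t."""
    modulus = luc_bp(2 * n + 1, a, b, t) - a * t
    diff = a * (luc_bp((2 * n + 1) * (2 * j + 1), a, b, t) - luc_bp(2 * j + 1, a, b, t))
    return diff % modulus


def lemma2_proof_identities_hold(n, k, a, b, t):
    """Check the three identities used in the induction, with N = 2n + 1 and k >= 1."""
    N = 2 * n + 1
    l_N = luc_bp(N, a, b, t)
    tripling = a * luc_bp(3 * N, a, b, t) == b * l_N ** 3 + 3 * a * l_N
    doubling = a * luc_bp(2 * N, a, b, t) == b * l_N ** 2 + 2 * a
    product = (luc_bp(2 * N, a, b, t) * luc_bp(N * (2 * k + 1), a, b, t)
               == luc_bp(N * (2 * k + 3), a, b, t) + luc_bp(N * (2 * k - 1), a, b, t))
    return tripling and doubling and product


print(lemma2_residue(1, 1, 2, 3, 1), lemma2_proof_identities_hold(1, 1, 2, 3, 1))
```

    With a = 2, b = 3, t = 1, n = j = 1 the difference is 2(l_9 - l_3) = 2(8802 - 18) = 17568, which is divisible by l_3 - 2 = 16.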

    Proof of Theorem 1. We only prove (1.3), and the proofs for other identities are similar.

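    $$\begin{aligned}
    \sum_{k=1}^{n}f_{2k}^{2m+1}(t)
    &=\sum_{k=1}^{n}\left(\frac{a^{\varsigma(2k+1)}}{(ab)^{\lfloor 2k/2\rfloor}}\cdot\frac{\sigma^{2k}(t)-\tau^{2k}(t)}{\sigma(t)-\tau(t)}\right)^{2m+1}\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{k=1}^{n}\frac{\left(\sigma^{2k}(t)-\tau^{2k}(t)\right)^{2m+1}}{(ab)^{(2m+1)k}}\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{k=1}^{n}\sum_{j=0}^{2m+1}(-1)^j\binom{2m+1}{j}\frac{\sigma^{2k(2m+1-j)}(t)\,\tau^{2kj}(t)}{(ab)^{(2m+1)k}}\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{j=0}^{2m+1}(-1)^j\binom{2m+1}{j}\left(\frac{1-\dfrac{\sigma^{2n(2m+1-2j)}(t)}{(ab)^{(2m+1-2j)n}}}{\dfrac{(ab)^{2m+1-2j}}{\sigma^{2(2m+1-2j)}(t)}-1}\right)\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{j=0}^{m}(-1)^j\binom{2m+1}{j}\left(\frac{1-\dfrac{\sigma^{2n(2m+1-2j)}(t)}{(ab)^{(2m+1-2j)n}}}{\dfrac{(ab)^{2m+1-2j}}{\sigma^{2(2m+1-2j)}(t)}-1}-\frac{1-\dfrac{\sigma^{2n(2j-1-2m)}(t)}{(ab)^{(2j-1-2m)n}}}{\dfrac{(ab)^{2j-1-2m}}{\sigma^{2(2j-1-2m)}(t)}-1}\right)\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{j=0}^{m}(-1)^j\binom{2m+1}{j}\left(\frac{\dfrac{\sigma^{2(2m+1-2j)}(t)}{(ab)^{2m+1-2j}}-\dfrac{\sigma^{(2n+2)(2m+1-2j)}(t)}{(ab)^{(n+1)(2m+1-2j)}}+1-\dfrac{\sigma^{2n(2j-1-2m)}(t)}{(ab)^{(2j-1-2m)n}}}{1-\dfrac{\sigma^{2(2m+1-2j)}(t)}{(ab)^{2m+1-2j}}}\right)\\
    &=\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m+1}}\sum_{j=0}^{m}(-1)^j\binom{2m+1}{j}\left(\frac{\sigma^{2m+1-2j}(t)-\tau^{2m+1-2j}(t)-\dfrac{\sigma^{(2n+1)(2m+1-2j)}(t)}{(ab)^{(2m+1-2j)n}}+\dfrac{\tau^{(2n+1)(2m+1-2j)}(t)}{(ab)^{(2m+1-2j)n}}}{-\sigma^{2m+1-2j}(t)-\tau^{2m+1-2j}(t)}\right)\\
    &=\frac{a^{2m+1}}{b\,(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m+1}{m-j}\left(\frac{f_{(2n+1)(2j+1)}(t)-f_{2j+1}(t)}{l_{2j+1}(t)}\right).
    \end{aligned}$$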

    Proof of Theorem 2. We only prove (1.7), and the proofs for other identities are similar.

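    $$\begin{aligned}
    \sum_{k=1}^{n}f_{2k}^{2m}(t)
    &=\sum_{k=1}^{n}\left(\frac{a^{\varsigma(2k+1)}}{(ab)^{\lfloor 2k/2\rfloor}}\cdot\frac{\sigma^{2k}(t)-\tau^{2k}(t)}{\sigma(t)-\tau(t)}\right)^{2m}\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{k=1}^{n}\frac{\left(\sigma^{2k}(t)-\tau^{2k}(t)\right)^{2m}}{(ab)^{2mk}}\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{k=1}^{n}\sum_{j=0}^{2m}(-1)^j\binom{2m}{j}\frac{\sigma^{2k(2m-j)}(t)\,\tau^{2kj}(t)}{(ab)^{2mk}}\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{j=0}^{2m}(-1)^j\binom{2m}{j}\left(\frac{1-\dfrac{\sigma^{2n(2m-2j)}(t)}{(ab)^{(2m-2j)n}}}{\dfrac{(ab)^{2m-2j}}{\sigma^{2(2m-2j)}(t)}-1}\right)\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{j=0}^{m}(-1)^j\binom{2m}{j}\left(\frac{1-\dfrac{\sigma^{2n(2m-2j)}(t)}{(ab)^{(2m-2j)n}}}{\dfrac{(ab)^{2m-2j}}{\sigma^{2(2m-2j)}(t)}-1}+\frac{1-\dfrac{\sigma^{2n(2j-2m)}(t)}{(ab)^{(2j-2m)n}}}{\dfrac{(ab)^{2j-2m}}{\sigma^{2(2j-2m)}(t)}-1}\right)+\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}(-1)^{m+1}\binom{2m}{m}n\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{j=0}^{m}(-1)^j\binom{2m}{j}\left(\frac{\dfrac{\sigma^{2(2m-2j)}(t)}{(ab)^{2m-2j}}-\dfrac{\sigma^{(2n+2)(2m-2j)}(t)}{(ab)^{(n+1)(2m-2j)}}-1+\dfrac{\sigma^{2n(2j-2m)}(t)}{(ab)^{(2j-2m)n}}}{1-\dfrac{\sigma^{2(2m-2j)}(t)}{(ab)^{2m-2j}}}\right)+\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}(-1)^{m+1}\binom{2m}{m}n\\
    &=\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}\sum_{j=0}^{m}(-1)^j\binom{2m}{j}\left(\frac{\sigma^{2m-2j}(t)-\tau^{2m-2j}(t)-\dfrac{\sigma^{(2n+1)(2m-2j)}(t)}{(ab)^{n(2m-2j)}}+\dfrac{\tau^{(2n+1)(2m-2j)}(t)}{(ab)^{n(2m-2j)}}}{\tau^{2m-2j}(t)-\sigma^{2m-2j}(t)}\right)+\frac{a^{2m}}{(\sigma(t)-\tau(t))^{2m}}(-1)^{m+1}\binom{2m}{m}n\\
    &=\frac{a^{2m}}{(a^2b^2t^2+4ab)^m}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m}{m-j}\left(\frac{f_{2j(2n+1)}(t)-f_{2j}(t)}{f_{2j}(t)}\right)+\frac{a^{2m}}{(a^2b^2t^2+4ab)^m}(-1)^{m+1}\binom{2m}{m}n.
    \end{aligned}$$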

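    Proof of Corollary 1. First, from the definition of $f_n(t)$ and binomial expansion, we easily prove $(f_{2n+1}(t)-1,\ a^2b^2t^2+4ab)=1$. Therefore, $(f_{2n+1}(t)-1,\ (a^2b^2t^2+4ab)^m)=1$. Now, we prove (1.11) by Lemma 1 and (1.3):

    $$\begin{aligned}
    b\,l_1(t)l_3(t)\cdots l_{2m+1}(t)\sum_{k=1}^{n}f_{2k}^{2m+1}(t)&=l_1(t)l_3(t)\cdots l_{2m+1}(t)\left(\frac{a^{2m+1}}{(\sigma(t)-\tau(t))^{2m}}\sum_{j=0}^{m}(-1)^{m-j}\binom{2m+1}{m-j}\left(\frac{f_{(2n+1)(2j+1)}(t)-f_{2j+1}(t)}{l_{2j+1}(t)}\right)\right)\\
    &\equiv 0 \pmod{f_{2n+1}(t)-1}.
    \end{aligned}$$

    Now, we use Lemma 2 and (1.5) to prove (1.12):

    $$\begin{aligned}
    a\,l_1(t)l_3(t)\cdots l_{2m+1}(t)\sum_{k=1}^{n}l_{2k}^{2m+1}(t)&=l_1(t)l_3(t)\cdots l_{2m+1}(t)\left(\sum_{j=0}^{m}\binom{2m+1}{m-j}\left(\frac{a\,l_{(2n+1)(2j+1)}(t)-a\,l_{2j+1}(t)}{l_{2j+1}(t)}\right)\right)\\
    &\equiv 0 \pmod{l_{2n+1}(t)-at}.
    \end{aligned}$$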

    In this paper, we study the power sums of bi-periodic Fibonacci and Lucas polynomials by means of the Binet formulas. As corollaries of the theorems, we extend the divisibility properties of power sums of the linear Fibonacci and Lucas sequences to the nonlinear Fibonacci and Lucas polynomials. An open problem is whether the Melham conjectures can be extended to nonlinear Fibonacci and Lucas polynomials.

    The authors declare that they did not use Artificial Intelligence (AI) tools in the creation of this paper.

    The authors would like to thank the editor and referees for their helpful suggestions and comments, which greatly improved the presentation of this work. All authors contributed equally to the work, and they have read and approved this final manuscript. This work is supported by Natural Science Foundation of China (12126357).

    The authors declare that there are no conflicts of interest regarding the publication of this paper.

    [1] A. C. Acock, Working with missing values, Journal of Marriage and Family, 67(2005), 1012-1028.
    [2] E. Acuna and C. Rodriguez, The treatment of missing values and its effect in the classifier accuracy, In Classification, Clustering and Data Mining Applications, (2004), 639-647.
    [3] G. E. Batista and M. C. Monard, An analysis of four missing data treatment methods for supervised learning, Applied Artificial Intelligence, 17(2003), 519-533.
    [4] J. Doak, An Evaluation of Feature Selection Methods and Their Application to Computer Security, UC Davis Department of Computer Science, 1992.
    [5] P. Domingos, A unified bias-variance decomposition, In Proceedings of 17th International Conference on Machine Learning, Stanford, CA, Morgan Kaufmann, 2000, 231-238.
    [6] Survey of Family Expenditures-1996, STATCAN, 1998.
    [7] A. Farhangfar, L. Kurgan and J. Dy, Impact of imputation of missing values on classification error for discrete data, Pattern Recognition, 41(2008), 3692-3705.
    [8] J. H. Friedman, On bias, variance, 0/1-loss, and the curse-of-dimensionality, Data Mining and Knowledge Discovery, 1(1997), 55-77.
    [9] S. Geman, E. Bienenstock and R. Doursat, Neural networks and the bias/variance dilemma, Neural Computation, 4(1992), 1-58.
    [10] L. A. Goodman and W. H. Kruskal, Measures of association for cross classification, J. American Statistical Association, 49(1954), 732-764.
    [11] I. Guyon and A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res., 3(2003), 1157-1182.
    [12] L. Himmelspach and S. Conrad, Clustering approaches for data with missing values: Comparison and evaluation, In Digital Information Management (ICDIM), 2010 Fifth International Conference on, IEEE, 2010, 19-28.
    [13] P. T. V. Hippel, Regression with missing Ys: An improved strategy for analyzing multiply imputed data, Sociological Methodology, 37(2007), 83-117.
    [14] W. Huang, Y. Shi and X. Wang, A nominal association matrix with feature selection for categorical data, Communications in Statistics-Theory and Methods, to appear, 2015.
    [15] W. Huang, Y. Pan and J. Wu, Supervised Discretization for Optimal Prediction, Procedia Computer Science, 30(2014), 75-80.
    [16] G. James and T. Hastie, Generalizations of the Bias/Variance Decomposition for Prediction Error, Dept. Statistics, Stanford Univ., Stanford, CA, Tech. Rep, 1997.
    [17] S. Kullback and R. A. Leibler, On information and sufficiency, Annals of Mathematical Statistics, 22(1951), 79-86.
    [18] R. J. A. Little and D. B. Rubin, Statistical Analysis with Missing Data, John Wiley & Sons, Inc., New York, NY, USA, 1987.
    [19] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers, Norwell, MA, USA, 1998.
    [20] J. Luengo, S. García and F. Herrera, On the choice of the best imputation methods for missing values considering three groups of classification methods, Knowledge and Information Systems, 32(2012), 77-108.
    [21] Z. Mark and Y. Baram, The bias-variance dilemma of the Monte Carlo method, Artificial Neural Networks, ICANN, 2130(2001), 141-147.
    [22] R. Tibshirani, Bias, Variance and Prediction Error for Classification Rules, Citeseer, 1996.
    [23] I. Yaniv and D. P. Foster, Graininess of judgment under uncertainty: An accuracy-informativeness trade-off, Journal of Experimental Psychology: General, 124(1995), 424-432.
    [24] L. Yu, K. K. Lai, S. Wang and W. Huang, A bias-variance-complexity trade-off framework for complex system modeling, In Computational Science and Its Applications-ICCSA 2006, Springer, 3980(2006), 518-527.
    [25] T. Zhou, Z. Kuscsik, J. Liu, M. Medo, J. R. Wakeling and Y. Zhang, Solving the apparent diversity-accuracy dilemma of recommender systems, Proceedings of the National Academy of Sciences, 107(2010), 4511-4515.
  • © 2016 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)