Page 188 - Contributed Paper Session (CPS) - Volume 2

CPS1820 Shuichi S.
                  discriminated six microarrays and found that they are LSD (Fact3). This new
                  fact is crucial for Problem5, yet no previous paper has pointed out this vital
                  signal. Furthermore, Method2 decomposed each microarray into many linearly
                  separable gene spaces (SMs) and a noise gene subspace. In this paper, we
                  introduce cancer gene diagnosis, which analyzes all SMs with standard
                  statistical methods.

                  2.  Methodology
                     First, we introduce four problems [7][8] that we found through our
                  research on discriminant analysis. Next, we explain the new theory and Method2.

                  2.1 Four Problems of Discriminant Analysis
                  Problem1: The discrimination rule is straightforward. However, most
                  researchers believe in the wrong rule. The value of yi is 1 for class1 and -1
                  for class2. Let f(x) be an LDF and yi*f(xi) be the extended discriminant score
                  (DS) for xi. The following rule is correct.
                  1)  If yi*f(xi) > 0, xi is correctly classified into class1/class2.
                  2)  If yi*f(xi) < 0, xi is misclassified into the opposite class.
                  3)  If f(xi) = 0, xi lies on the discriminant hyperplane and cannot be
                     properly discriminated.
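
The three-way rule above can be sketched in a few lines of Python. This is only an illustration: the function names (`classify`, `linear_score`), the toy cases, and the coefficients are hypothetical and do not come from the paper.

```python
def linear_score(x, b, b0):
    """Evaluate a linear discriminant function f(x) = x.b + b0."""
    return sum(xj * bj for xj, bj in zip(x, b)) + b0


def classify(y, score):
    """Apply the three-way rule to one case.

    y is the class label (+1 for class1, -1 for class2); score is f(x).
    """
    extended = y * score  # extended discriminant score yi * f(xi)
    if extended > 0:
        return "correct"
    if extended < 0:
        return "misclassified"
    return "undetermined"  # the case lies on the hyperplane f(x) = 0


# Hypothetical toy data: (features, label) pairs with labels in {+1, -1}.
cases = [((2.0, 1.0), 1), ((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 0.5), 1)]
b, b0 = (1.0, 1.0), -1.0  # assumed coefficients, for illustration only

results = [classify(y, linear_score(x, b, b0)) for x, y in cases]
print(results)
```

The third toy case has f(x) = 0, so it falls under item 3) and is reported as neither correct nor misclassified.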
                  Problem2: Only H-SVM [18] and RIP can recognize LSD theoretically. Other
                  LDFs cannot discriminate LSD correctly, and their error rates are high.
                  Problem3: Problem3 is a defect of the generalized inverse matrix technique.
                  Problem4: Fisher never formulated equations for the standard errors (SEs) of
                  discriminant coefficients and error rates. We propose 100-fold cross-validation
                  for small samples (Method1). It provides the 95% CIs of error rates and
                  discriminant coefficients.

                  2.2 New Theory of Discriminant Analysis
                     We developed IP-OLDF and RIP using LINGO [6]. Only RIP and H-SVM are
                  essential in this research. Integer programming (IP) defines RIP in (1). The ei
                  are 0/1 decision variables: ei = 0 if case i is classified correctly, and
                  ei = 1 otherwise. Thus, the minimized objective function equals MNM. The n
                  constraints, one per case, define the feasible region.
                  \[
                    \min \sum_{i=1}^{n} e_i \quad \text{subject to} \quad
                    y_i \left( {}^{t}x_i b + b_0 \right) \ge 1 - M e_i, \quad i = 1, \dots, n
                    \tag{1}
                  \]
                  xi: n cases. b: p coefficients. b0: free decision variable.
                  ei: 0/1 decision variables. M: 10,000 (big-M constant).
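
To make the role of ei and the big-M constant in (1) concrete, the following minimal sketch evaluates the objective of (1) for a fixed candidate (b, b0). It does not solve the integer program. It is not the LINGO model used in the paper, and the function names and toy data are assumptions made for illustration.

```python
BIG_M = 10_000  # big-M constant, as given for (1)


def smallest_feasible_e(x, y, b, b0, big_m=BIG_M):
    """Return the smallest 0/1 value of ei that satisfies constraint (1).

    The constraint is y * (x.b + b0) >= 1 - big_m * e.
    """
    margin = y * (sum(xj * bj for xj, bj in zip(x, b)) + b0)
    if margin >= 1:
        return 0  # constraint holds without relaxation
    if margin >= 1 - big_m:
        return 1  # constraint holds only after relaxing by big_m
    raise ValueError("big_m is too small to relax this constraint")


def objective(cases, b, b0):
    """Sum of ei over all cases: the objective of (1) for a fixed (b, b0)."""
    return sum(smallest_feasible_e(x, y, b, b0) for x, y in cases)


# Hypothetical toy data: (features, label) pairs with labels in {+1, -1}.
cases = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((0.0, 0.0), -1), ((1.0, 1.0), -1)]

good = objective(cases, b=(1.0, 1.0), b0=-3.0)  # satisfies every constraint
poor = objective(cases, b=(0.0, 2.0), b0=-3.0)  # one case needs ei = 1
print(good, poor)
```

For a fixed (b, b0), a case with a margin between 0 and 1 also needs ei = 1 in this check, even though its sign is correct. An optimizer is free to rescale (b, b0), so that situation does not have to inflate the minimum.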


                  Quadratic programming (QP) defines H-SVM in (2). Although QP finds only
                  one minimum value on the whole domain, we restrict the domain to the same

                                                                     177 | ISI WSC 2019