Page 245 - Special Topic Session (STS) - Volume 1

STS426 Asis K.C.
            $$\log \frac{f_1(x,y)}{f_2(x,y)} > \log \frac{\pi_2\, C(1|2)}{\pi_1\, C(2|1)} \qquad \text{(Rule 1)}$$

            else, classify to group 2, where $\pi_i$ is the prior probability for the i-th
            group, i = 1, 2. If we assume $\pi_1 = \pi_2 = 1/2$, then Rule 1 classifies
            (x, y) to Group 1 if
            $$\log \frac{f_1(x,y)}{f_2(x,y)} > \log(C_1) - \log(C_2)$$
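                As an illustration, the decision in Rule 1 can be sketched in a few lines of
            Python. This is a minimal sketch under stated assumptions, not the authors'
            implementation: the function names are hypothetical, and the product of two
            independent Gamma densities is used only as a stand-in, since the three
            bivariate Gamma forms are not specified in this section.

```python
import math


def gamma_logpdf(x, shape, scale):
    """Log-density of a univariate Gamma(shape, scale) at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def make_independent_logpdf(shape_x, scale_x, shape_y, scale_y):
    """Hypothetical stand-in for a bivariate density: independent Gamma marginals.

    This is NOT one of the three bivariate Gamma forms compared in the paper.
    """
    def logpdf(x, y):
        return (gamma_logpdf(x, shape_x, scale_x)
                + gamma_logpdf(y, shape_y, scale_y))
    return logpdf


def classify(x, y, logf1, logf2, prior1=0.5, prior2=0.5,
             cost_1_given_2=1.0, cost_2_given_1=1.0):
    """Rule 1: assign (x, y) to group 1 if
    log f1(x,y) - log f2(x,y) > log(prior2 * C(1|2)) - log(prior1 * C(2|1)),
    otherwise assign it to group 2.
    """
    log_ratio = logf1(x, y) - logf2(x, y)
    threshold = (math.log(prior2 * cost_1_given_2)
                 - math.log(prior1 * cost_2_given_1))
    return 1 if log_ratio > threshold else 2


# Illustrative parameters only (not estimated from the NGC 5128 data).
logf1 = make_independent_logpdf(2.0, 1.0, 2.0, 1.0)
logf2 = make_independent_logpdf(9.0, 1.0, 9.0, 1.0)
print(classify(1.0, 1.0, logf1, logf2))
print(classify(8.0, 8.0, logf1, logf2))
```

            With equal priors the threshold reduces to the difference of the log costs, so
            raising the cost C(1|2) makes the rule more reluctant to assign a point to
            group 1.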
                For the estimation of unknown parameters involved in the classification
            rule, they have used the maximum likelihood estimates.
                The total cost of misclassification (TCM) under Rule 1, for three forms of
            the bivariate Gamma distribution and two choices of the classification costs,
            is shown in Table 1. The analysis was carried out on the above-mentioned data
            set on the galaxy NGC 5128, starting from two groups obtained by k-means
            clustering with k = 2.

                                    Table 1: TCM for NGC 5128 data set
             Gamma distribution    TCM ($C_1 = 0.9$, $C_2 = 0.3$)    TCM ($C_1 = 0.2$, $C_2 = 0.8$)
             First form            0.3409                            0.2657
             Second form           0.1365                            0.0182
             Third form            0.1501                            0.0265

                Under the above setup, the authors propose a classification rule that
            incorporates the proportion of missing observations into its construction,
            as described below.
                Assume that a few observations are missing in a data set containing n
            observations. If l is the cost per observation, the total cost is nl when no
            observation is missing. With m missing observations in the data, the total
            cost reduces to $(n - m)l$. Now $(n - m)l = nl\,(1 - m/n)$. If p is the
            proportion of missing observations in the data, then $p = m/n$. Hence,
            $(n - m)l = nl\,(1 - p) \propto (1 - p)$, which indicates that it is reasonable
            to take the loss as $(1 - p)$, $0 < p < 1$, in such a situation.
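                The scaling argument above can be checked numerically. The following is a
            small sketch with purely illustrative values of n, m and l (they are
            assumptions, not figures from the paper), and the helper names are
            hypothetical.

```python
import math


def total_cost(n, m, cost_per_obs):
    """Total cost when m of the n observations are missing: (n - m) * l."""
    return (n - m) * cost_per_obs


def missing_proportion(n, m):
    """Proportion of missing observations, p = m / n."""
    return m / n


# Illustrative values only.
n, m, l = 200, 30, 2.0
p = missing_proportion(n, m)

# (n - m) * l equals n * l * (1 - p), so for fixed n and l the cost
# is proportional to (1 - p).
lhs = total_cost(n, m, l)
rhs = n * l * (1.0 - p)
print(lhs, rhs, math.isclose(lhs, rhs))
```

            With no missing observations (m = 0) the factor (1 - p) equals 1 and the total
            cost is again nl.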
                Consequently, if we have two groups to classify, we redefine the loss due
            to misclassification as $C(1|2) = C_1 (1 - p_1)$ and $C(2|1) = C_2 (1 - p_2)$,
            where $C_1$, $C_2$ are known constants and $p_i$ is the proportion of missing
            observations in the i-th group, i = 1, 2. Note that when there is no missing
            value in the data, i.e., $p_1 = p_2 = 0$, then $C(1|2) = C_1$ and
            $C(2|1) = C_2$, so the proposed loss due to misclassification equals the loss
            without missing observations. Hence, one can apply the earlier Rule 1 after
            discarding the missing observations (i.e., marginalization) or after
            substituting them (i.e., imputation). But under marginalization there would
            be loss of information, and when the proportion of missing observations is quite
                                                               234 | ISI WSC 2019