Page 331 - Contributed Paper Session (CPS) - Volume 2

CPS1876 Sarbojit R. et al.
    In this article, we focus on the NN classifier based on Mean Absolute
Differences of Distances (MADD) proposed by Pal et al. (2016) and make
appropriate adjustments to the Euclidean distance function to discriminate
among populations under more general settings. A transformation of the
usual MADD is proposed in Section 2. Section 3 further generalizes the
proposed classifier to address cases where discriminative information comes
from the joint distribution of groups of component variables. Numerical
performance of the proposed classifiers on simulated datasets is evaluated in
Section 4.

            2.  Transformation of MADD
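    Let us denote the training data set
$\{(\mathbf{X}_1, Y_1), (\mathbf{X}_2, Y_2), \ldots, (\mathbf{X}_n, Y_n)\}$ by $\chi_n$ (a
random sample of size $n$). Here, the $\mathbf{X}_i$s are $d$-dimensional random vectors and
$Y_i \in \{1, \ldots, J\}$ denotes the class label associated with $\mathbf{X}_i$ for
$1 \le i \le n$. Let $n_j$ be the number of observations from the $j$-th class for
$1 \le j \le J$ and $n = \sum_{j=1}^{J} n_j$. For a fixed $i$, suppose that
$P[Y_i = j] = \pi_j$ with $0 < \pi_j < 1$, $\mathbf{X}_i \mid Y_i = j \sim F_j$ for
$1 \le j \le J$ and $\sum_{j=1}^{J} \pi_j = 1$. Our objective is to predict the
class label $Y_{\mathbf{Z}}$ of a new observation $\mathbf{Z} \in \mathbb{R}^d$.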
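    For a given training sample $\chi_n$ and test point $\mathbf{Z} \in \mathbb{R}^d$, Pal et al. (2016)
defined the dissimilarity index MADD between $\mathbf{Z}$ and $\mathbf{X}_{i_0}$ for a fixed index
$1 \le i_0 \le n$ as follows:

$$\rho_{1,i_0}(\mathbf{Z}, \mathbf{X}_{i_0}) = \frac{1}{n-1} \sum_{1 \le i \ne i_0 \le n} \left| d^{-1/2}\|\mathbf{Z} - \mathbf{X}_i\| - d^{-1/2}\|\mathbf{X}_{i_0} - \mathbf{X}_i\| \right|. \tag{1}$$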
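The index in (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `scaled_dist`, `madd` and `madd_1nn` are hypothetical, the training sample is assumed to be a list of equal-length lists of floats, and the 1-NN rule shown is the plain nearest neighbor rule with MADD used in place of the Euclidean distance.

```python
import math

def scaled_dist(u, v):
    """d^{-1/2} times the Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

def madd(z, X, i0):
    """MADD between test point z and training point X[i0], following Eq. (1).

    Averages |d^{-1/2}||z - X[i]|| - d^{-1/2}||X[i0] - X[i]||| over all i != i0.
    """
    n = len(X)
    total = sum(
        abs(scaled_dist(z, X[i]) - scaled_dist(X[i0], X[i]))
        for i in range(n)
        if i != i0
    )
    return total / (n - 1)

def madd_1nn(z, X, y):
    """Label of the training point with the smallest MADD to z (assumed 1-NN rule)."""
    best = min(range(len(X)), key=lambda i0: madd(z, X, i0))
    return y[best]

# Toy data (hypothetical): three training points in R^2 with labels 1, 2, 2.
X = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
y = [1, 2, 2]
z = [0.0, 0.0]
print(madd(z, X, 0), madd(z, X, 1), madd_1nn(z, X, y))
```

Note that the index compares $\mathbf{Z}$ and $\mathbf{X}_{i_0}$ only through their distances to the remaining training points, so it is zero whenever the two points are equidistant from every other observation.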

    MADD uses the usual Euclidean (also known as $\ell_2$) distance to compute the
dissimilarity between two vectors. However, one may use any other distance
function instead of the $\ell_2$-distance to define a measure of dissimilarity like
$\rho_{1,i_0}$. It is established that MADD performs well when $\nu_{j,j'}^2 > 0$ or
$\sigma_j^2 \ne \sigma_{j'}^2$ for $1 \le j \ne j' \le J$ (see Pal et al. (2016)), but fails
under more general conditions (see Figure 2). Observe that these limiting values
$\nu_{j,j'}^2$ and $\sigma_j^2$, $\sigma_{j'}^2$ come into the picture as a consequence of
using the Euclidean distance in $\rho_{1,i_0}$. We propose some adjustments to the
Euclidean norm that will lead to better performance compared to the usual MADD
under more general situations.
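A small simulation can illustrate why the Euclidean distance loses discriminative power when the two conditions above fail. The sketch assumes the usual high-dimensional reading of the limiting values, which are not defined in this passage: $\sigma_j^2$ as the limit of the average component variance of class $j$, and $\nu_{j,j'}^2$ as the limit of $d^{-1}$ times the squared distance between the class means. The two distributions below are hypothetical choices with i.i.d. components, equal means and equal variances, so that $\nu_{j,j'}^2 = 0$ and $\sigma_j^2 = \sigma_{j'}^2 = 1$ although the distributions differ.

```python
import math
import random

def scaled_dist(u, v):
    """d^{-1/2} times the Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

random.seed(0)
d = 20000            # dimension (illustrative choice)
a = math.sqrt(3.0)   # Uniform(-a, a) has mean 0 and variance 1

# Class 1: i.i.d. N(0, 1) components; class 2: i.i.d. Uniform(-sqrt(3), sqrt(3)).
x1 = [random.gauss(0.0, 1.0) for _ in range(d)]
x2 = [random.gauss(0.0, 1.0) for _ in range(d)]
y1 = [random.uniform(-a, a) for _ in range(d)]
y2 = [random.uniform(-a, a) for _ in range(d)]

within_1 = scaled_dist(x1, x2)   # both points from class 1
within_2 = scaled_dist(y1, y2)   # both points from class 2
between = scaled_dist(x1, y1)    # one point from each class
print(within_1, within_2, between)
```

By the law of large numbers, all three scaled distances should be close to $\sqrt{2}$ for large $d$, so differences of such distances carry almost no information about class membership in this setting.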
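    Let $\phi : [0, \infty) \mapsto [0, \infty)$ and $\gamma : [0, \infty) \mapsto [0, \infty)$ be two continuous,
monotonically increasing functions with $\phi(0) = \gamma(0) = 0$. For $\mathbf{u}, \mathbf{v} \in \mathbb{R}^d$,
define

$$h_d^{\phi,\gamma}(\mathbf{u}, \mathbf{v}) = \phi\left[ d^{-1} \sum_{q=1}^{d} \gamma\left(|u_q - v_q|^2\right) \right]. \tag{2}$$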
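A minimal sketch of the dissimilarity in (2), assuming it has the form of an outer function applied to the average of an inner function of the squared componentwise differences. The names `h_dissimilarity`, `phi` and `gamma` are hypothetical, and the bounded choice of the inner function is only an illustration of an admissible pair (continuous, increasing, zero at zero), not a choice attributed to the authors.

```python
import math

def h_dissimilarity(u, v, phi, gamma):
    """phi( d^{-1} * sum_q gamma(|u_q - v_q|^2) ) for equal-length vectors u, v."""
    d = len(u)
    return phi(sum(gamma(abs(a - b) ** 2) for a, b in zip(u, v)) / d)

u = [1.0, 2.0, 3.0, 4.0]
v = [0.0, 4.0, 3.0, 1.0]

# Outer square root and inner identity recover the scaled Euclidean distance
# d^{-1/2} * ||u - v|| that appears inside MADD.
h_euclid = h_dissimilarity(u, v, phi=math.sqrt, gamma=lambda t: t)

# Illustrative bounded inner function: gamma(t) = 1 - exp(-t), phi(t) = t.
h_bounded = h_dissimilarity(u, v, phi=lambda t: t, gamma=lambda t: 1.0 - math.exp(-t))

print(h_euclid, h_bounded)
```

With a bounded inner function, a single very large componentwise difference contributes at most $1/d$ to the average, which is one way such transformations can behave differently from the Euclidean norm.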

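    It is clear that by considering $\phi(t) = t^{1/p}$ and $\gamma(t) = t^{p/2}$ for $p \ge 1$, we get
$h_d^{\phi,\gamma}(\mathbf{u}, \mathbf{v}) = d^{-1/p}\|\mathbf{u} - \mathbf{v}\|_p$, i.e., the $\ell_p$ distance between $\mathbf{u}$ and $\mathbf{v}$ scaled by $d^{1/p}$. For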
320 | ISI WSC 2019