Page 330 - Contributed Paper Session (CPS) - Volume 2

CPS1876 Sarbojit R. et al.
literature that improve the performance of the NN classifier in HDLSS settings. Chan and Hall (2009) developed some scale adjustments to the usual 1NN classifier, and this method performs well under the weaker condition that $\nu_{12}^2 > 0$. A non-linear transformation of the covariate space followed by 1NN classification was proposed by Dutta and Ghosh (2016). Recently, Pal et al. (2016) used a new dissimilarity index instead of the usual Euclidean distance for 1NN classification. However, all these methods require either of the conditions $\nu_{12}^2 > 0$ or $\sigma_1^2 \neq \sigma_2^2$ to yield good results for high-dimensional data. So, they can deal with distributions that differ either in their locations or scales.
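
To see where these two conditions come from, it may help to recall the usual distance-concentration heuristic. The display below is a sketch, not a statement taken from this page. It assumes the common HDLSS conventions $\nu_{12}^2 = \lim_{d \to \infty} d^{-1} \lVert \boldsymbol{\mu}_d^{(1)} - \boldsymbol{\mu}_d^{(2)} \rVert^2$ and $\sigma_j^2 = \lim_{d \to \infty} d^{-1} \operatorname{tr}(\Sigma_d^{(j)})$, together with moment and weak-dependence conditions under which a weak law of large numbers applies across coordinates. The symbols $\mathbf{X}, \mathbf{X}'$ (independent draws from the first class) and $\mathbf{Y}, \mathbf{Y}'$ (independent draws from the second class) are introduced here for illustration only.

```latex
\begin{align*}
d^{-1/2}\,\lVert \mathbf{X} - \mathbf{X}' \rVert &\xrightarrow{P} \sigma_1 \sqrt{2}, \\
d^{-1/2}\,\lVert \mathbf{Y} - \mathbf{Y}' \rVert &\xrightarrow{P} \sigma_2 \sqrt{2}, \\
d^{-1/2}\,\lVert \mathbf{X} - \mathbf{Y} \rVert &\xrightarrow{P} \sqrt{\sigma_1^2 + \sigma_2^2 + \nu_{12}^2}.
\end{align*}
```

Under these assumptions, the three limits coincide exactly when $\nu_{12}^2 = 0$ and $\sigma_1^2 = \sigma_2^2$, in which case within-class and between-class Euclidean distances carry no separating information in the limit.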
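
Consider a two class classification problem (say, Example 1) with the competing distributions as $\mathbf{F}_1 \equiv N_d(\mathbf{0}_d, \Sigma_d^{(1)})$ and $\mathbf{F}_2 \equiv N_d(\mathbf{0}_d, \Sigma_d^{(2)})$. Here, $N_d$ denotes the $d$-dimensional Gaussian distribution, $\mathbf{0}_d$ is the $d$-dimensional vector of zeros, and $\Sigma_d^{(1)}$ and $\Sigma_d^{(2)}$ are described below:

$$
\Sigma_d^{(1)} =
\begin{bmatrix}
\mathbf{I}_{[d/2]} & \mathbf{0}_{[d/2] \times (d-[d/2])} \\
\mathbf{0}_{(d-[d/2]) \times [d/2]} & 0.5\,\mathbf{I}_{d-[d/2]}
\end{bmatrix}
\quad \text{and} \quad
\Sigma_d^{(2)} =
\begin{bmatrix}
0.5\,\mathbf{I}_{d-[d/2]} & \mathbf{0}_{(d-[d/2]) \times [d/2]} \\
\mathbf{0}_{[d/2] \times (d-[d/2])} & \mathbf{I}_{[d/2]}
\end{bmatrix},
$$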

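with $\mathbf{I}_d$ being the $d \times d$ identity matrix and $\mathbf{0}_{l \times m}$ being the $l \times m$ matrix of zeros. As Example 2, we consider $\mathbf{F}_1 \equiv N_d(\mathbf{0}_d, \frac{5}{3}\mathbf{I}_d)$ and $\mathbf{F}_2 \equiv \prod_{i=1}^{d} F_{2,i}$ with $F_{2,i} = t_5(0,1)$ for each $1 \le i \le d$. Throughout this article, we follow the convention that if we write $\mathbf{X} = (X_1, \ldots, X_d)^{\top} \sim \mathbf{F}_j$, then $X_i \sim F_{j,i}$ for all $1 \le i \le d$. It is easy to check that $\nu_{12}^2 = 0$ and $\sigma_1^2 = \sigma_2^2 = 0.75$ for Example 1. For Example 2, the parameters are $\boldsymbol{\mu}_d^{(1)} = \boldsymbol{\mu}_d^{(2)} = \mathbf{0}_d$ and $\Sigma_d^{(1)} = \Sigma_d^{(2)} = \frac{5}{3}\mathbf{I}_d$, hence we obtain $\nu_{12}^2 = 0$ and $\sigma_1^2 = \sigma_2^2 = \frac{5}{3}$. Despite having different populations, both these examples end up with $\nu_{12}^2 = 0$ and $\sigma_1^2 = \sigma_2^2$. The existing NN classifiers fail to yield promising results in high-dimensional spaces. Similar arguments hold for any $k$ in the $k$NN classifier as $d \to \infty$ (with a fixed value of $k \in \mathbb{N}$). In Figure 1, we report the empirical misclassification probability of the Bayes classifier and 1NN classifier.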
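
The parameter values quoted for the two examples can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function names, the sample size (20 per class), the dimension (2000) and the seed are all assumptions chosen for illustration. It builds the diagonals of the two Example 1 covariance matrices, draws from both examples, and compares the average scaled Euclidean distances within each class and between the classes.

```python
import numpy as np


def example1_variances(d):
    """Diagonals of the two (diagonal) covariance matrices in Example 1."""
    h = d // 2  # [d/2], the integer part of d/2
    diag1 = np.concatenate([np.ones(h), np.full(d - h, 0.5)])
    diag2 = np.concatenate([np.full(d - h, 0.5), np.ones(h)])
    return diag1, diag2


def sample_example1(n, d, rng):
    """Draw n points per class from the two Gaussians of Example 1."""
    diag1, diag2 = example1_variances(d)
    x = rng.standard_normal((n, d)) * np.sqrt(diag1)
    y = rng.standard_normal((n, d)) * np.sqrt(diag2)
    return x, y


def sample_example2(n, d, rng):
    """Draw n points per class: N(0, 5/3) versus t_5 coordinates (Example 2)."""
    x = rng.standard_normal((n, d)) * np.sqrt(5.0 / 3.0)
    y = rng.standard_t(5, size=(n, d))
    return x, y


def mean_scaled_distances(x, y):
    """Average d^{-1/2} Euclidean distance: within class 1, within class 2, between."""
    d = x.shape[1]

    def pairwise(a, b):
        sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
        return np.sqrt(np.maximum(sq, 0.0) / d)

    upper_x = np.triu_indices(x.shape[0], k=1)  # distinct pairs only
    upper_y = np.triu_indices(y.shape[0], k=1)
    return (pairwise(x, x)[upper_x].mean(),
            pairwise(y, y)[upper_y].mean(),
            pairwise(x, y).mean())


rng = np.random.default_rng(0)  # seed chosen arbitrarily
for name, sampler in [("Example 1", sample_example1), ("Example 2", sample_example2)]:
    within1, within2, between = mean_scaled_distances(*sampler(20, 2000, rng))
    print(f"{name}: within-1 {within1:.3f}, within-2 {within2:.3f}, between {between:.3f}")
```

In both examples the three averages should come out close to one another, which is the numerical face of $\nu_{12}^2 = 0$ and $\sigma_1^2 = \sigma_2^2$: Euclidean distances alone barely distinguish a same-class neighbor from an other-class neighbor.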

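[Figure 1 panels. (a) $\mathbf{F}_1 \equiv N_d(\mathbf{0}_d, \Sigma_d^{(1)})$ and $\mathbf{F}_2 \equiv N_d(\mathbf{0}_d, \Sigma_d^{(2)})$; (b) $\mathbf{F}_1 \equiv N_d(\mathbf{0}_d, \frac{5}{3}\mathbf{I}_d)$ and $\mathbf{F}_2 \equiv \prod_{i=1}^{d} t_5(0,1)$]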
                              Figure 1: Error rates of classifiers in Examples 1 and 2.
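
A comparison of this kind can be sketched for Example 1 as follows. This is an illustrative simulation under assumed settings, not a reproduction of the figure: the training size (10 per class), test size (200 per class), grid of dimensions and seed are hypothetical choices, and all function names are introduced here. The Bayes rule uses the fact that the two covariance matrices have equal determinants, so with equal priors it reduces to comparing two quadratic forms.

```python
import numpy as np


def example1_sds(d):
    """Coordinate-wise standard deviations of F_1 and F_2 in Example 1."""
    h = d // 2  # [d/2], the integer part of d/2
    sd1 = np.sqrt(np.concatenate([np.ones(h), np.full(d - h, 0.5)]))
    sd2 = np.sqrt(np.concatenate([np.full(d - h, 0.5), np.ones(h)]))
    return sd1, sd2


def sample_example1(n, d, rng):
    """Draw n points from each class; labels are 0 for F_1 and 1 for F_2."""
    sd1, sd2 = example1_sds(d)
    x = np.vstack([rng.standard_normal((n, d)) * sd1,
                   rng.standard_normal((n, d)) * sd2])
    return x, np.repeat([0, 1], n)


def one_nn_predict(train_x, train_labels, test_x):
    """Label each test point by its Euclidean nearest neighbor in the training set."""
    sq = ((test_x**2).sum(axis=1)[:, None] + (train_x**2).sum(axis=1)[None, :]
          - 2.0 * test_x @ train_x.T)
    return train_labels[np.argmin(sq, axis=1)]


def bayes_predict_example1(test_x):
    """Bayes rule with equal priors (assumed): the determinants are equal,
    so the class with the smaller quadratic form x' Sigma^{-1} x wins."""
    sd1, sd2 = example1_sds(test_x.shape[1])
    q1 = ((test_x / sd1) ** 2).sum(axis=1)
    q2 = ((test_x / sd2) ** 2).sum(axis=1)
    return (q2 < q1).astype(int)


def error_rates(d, n_train, n_test, rng):
    """Empirical misclassification rates (Bayes, 1NN) on one simulated data set."""
    train_x, train_labels = sample_example1(n_train, d, rng)
    test_x, test_labels = sample_example1(n_test, d, rng)
    bayes_err = np.mean(bayes_predict_example1(test_x) != test_labels)
    nn_err = np.mean(one_nn_predict(train_x, train_labels, test_x) != test_labels)
    return bayes_err, nn_err


rng = np.random.default_rng(0)  # seed chosen arbitrarily
for d in (10, 100, 1000):
    bayes_err, nn_err = error_rates(d, n_train=10, n_test=200, rng=rng)
    print(f"d = {d:4d}: Bayes {bayes_err:.3f}, 1NN {nn_err:.3f}")
```

A single run per dimension is noisy; averaging `error_rates` over many independent replications would give smoother curves.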
                                                                     319 | I S I   W S C   2 0 1 9