Page 334 - Contributed Paper Session (CPS) - Volume 2

CPS1876 Sarbojit R. et al.
for 1 ≤ k ≤ b and 1 ≤ j ≤ d_k. Analogous to the quantities ρ_{h,ψ}(x, x₀) and
ρ_{3,ψ}(x, x₀) for x, x₀ ∈ ℝ^d with a fixed 1 ≤ k ≤ b (see Section 2), we now
define the following two transformations:

    ρ_{h,ψ}(x, x₀) = h[ b⁻¹ ∑_{k=1}^{b} ψ( d_k⁻¹ ‖x^{(k)} − x₀^{(k)}‖² ) ],                  (4)

    ρ_{3,ψ}(z, x_{i₀}) = (n − 1)⁻¹ ∑_{1 ≤ i ≠ i₀ ≤ n} | ρ_{h,ψ}(z, x_i) − ρ_{h,ψ}(x_{i₀}, x_i) |.   (5)

The transformation ρ_{3,ψ} is the Mean Absolute Difference of the
generalized group-based Distances (ggMADD). We denote the NN
classifier based on ggMADD by δ_{3,ψ}. Under appropriate conditions,
we proved that the misclassification probability of δ_{3,ψ} converges to
zero as d → ∞.
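As an illustration, the group-based dissimilarity, the ggMADD transformation, and the resulting NN rule can be sketched as follows. This is a minimal sketch, not the paper's implementation: the choices h(t) = √t and the bounded ψ(t) = t/(1 + t) are placeholder assumptions, as is the representation of the grouping as a list of index lists.

```python
import numpy as np

def rho_h_psi(x, y, groups, h=np.sqrt, psi=lambda t: t / (1.0 + t)):
    """Group-based dissimilarity: h of the average, over the b groups,
    of psi applied to the length-scaled squared group-wise distance.
    h and psi here are illustrative placeholders, not the paper's choices."""
    vals = [psi(np.sum((x[g] - y[g]) ** 2) / len(g)) for g in groups]
    return h(np.mean(vals))

def ggmadd(z, i0, X, groups):
    """ggMADD: mean absolute difference between the group-based
    dissimilarities of z and of the training point X[i0] to the
    remaining n - 1 training points."""
    n = X.shape[0]
    diffs = [abs(rho_h_psi(z, X[i], groups) - rho_h_psi(X[i0], X[i], groups))
             for i in range(n) if i != i0]
    return np.mean(diffs)

def nn_ggmadd_classify(z, X, labels, groups):
    """1-NN rule based on ggMADD: z gets the label of the training
    observation nearest to it in the transformed distance."""
    d = [ggmadd(z, i0, X, groups) for i0 in range(X.shape[0])]
    return labels[int(np.argmin(d))]
```

Because ψ is bounded, a single wildly outlying coordinate cannot dominate the group average, which is what makes this construction usable for the heavy-tailed example discussed below.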

                  4.  Results from the Analysis of Simulated Data Sets
                       We analyze several high-dimensional simulated data sets to compare the
                   performance of the proposed classifiers. Recall the examples introduced in
                   Section 1 (Examples 1 and 2) and the example introduced later in Section 3
                   (Example 3). Along with these three data sets, we now consider one more
                   example in this section.
                       Recall the two scale matrices Σ₁ and Σ₂ defined in Example 3 (see Section
                   3). Let V₁, …, V_b be independent d_k × 1 random vectors, identically distributed as
                   the multivariate Cauchy distribution with location parameter 0_{d_k} and scale
                   matrix Σ₁ (say, C_{d_k}(0_{d_k}, Σ₁)). Consider a d × 1 random vector from the first
                   population X = (V₁⊤, …, V_b⊤)⊤. The distribution function of X is F₁(x) =
                   ∏_{k=1}^{b} F_{1,k}(x^{(k)}), where F_{1,k} ≡ C_{d_k}(0_{d_k}, Σ₁). Similarly, a d × 1 random vector
                   from the second population follows F₂(x) = ∏_{k=1}^{b} F_{2,k}(x^{(k)}), where F_{2,k} ≡
                   C_{d_k}(0_{d_k}, Σ₂). We consider this as Example 4. Since moments do not exist for
                   the multivariate Cauchy distribution, the constants ν₁₂², σ₁² and σ₂² do not exist
                   in this example. So, we cannot comment on the performances of the usual
                   NN or NNMADD as d → ∞. But the proposed classifier can deal with such
                   heavy-tailed distributions if we choose the ψ function to be bounded.
                   However, in Example 4, the one-dimensional marginals for both F₁ and F₂ are
                   all Cauchy distributions with location zero and scale one (i.e., C(0, 1)).
                   Therefore, gMADD cannot differentiate between F₁ and F₂. This pushes us to
                   capture the differences between F₁ and F₂ through the joint distributions of
                   the groups.
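An Example 4-style data set can be simulated by drawing each group from a multivariate Cauchy distribution, i.e. a multivariate t distribution with one degree of freedom. The particular Σ₁ and Σ₂ below are illustrative placeholders (Example 3's matrices are not reproduced here); any positive definite scale matrices with unit diagonals yield standard Cauchy C(0, 1) marginals, as in the text.

```python
import numpy as np

def rmv_cauchy(n, loc, scale, rng):
    """Draw n samples from the multivariate Cauchy C_d(loc, scale),
    i.e. a multivariate t distribution with one degree of freedom:
    loc + (normal with covariance `scale`) / sqrt(chi-square with 1 df)."""
    d = len(loc)
    L = np.linalg.cholesky(scale)
    z = rng.standard_normal((n, d)) @ L.T   # N(0, scale) draws
    w = rng.chisquare(1, size=n)            # chi-square with 1 df
    return loc + z / np.sqrt(w)[:, None]

def make_example4(n, b, dk, Sigma1, Sigma2, rng):
    """Stack b independent C_{dk}(0, Sigma_j) groups into d = b * dk
    dimensional observations; class j uses scale matrix Sigma_j."""
    X1 = np.hstack([rmv_cauchy(n, np.zeros(dk), Sigma1, rng) for _ in range(b)])
    X2 = np.hstack([rmv_cauchy(n, np.zeros(dk), Sigma2, rng) for _ in range(b)])
    return X1, X2
```

Since both classes share the same C(0, 1) marginals, only a classifier that looks at the group-wise joint behaviour, through the two scale matrices, can separate them.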
                      In  each  example,  we  generated 50 observations  from  each  of  the  two
                  classes to form the training sample, while a test set of size 500 (250 from each
                  class)  was  used.  This  procedure  was  repeated  100  times  to  compute  the
                                                                     323 | I S I   W S C   2 0 1 9