Page 331 - Contributed Paper Session (CPS) - Volume 2
CPS1876 Sarbojit R. et al.
In this article, we focus on the NN classifier based on Mean Absolute
Differences of Distances (MADD) proposed by Pal et al. (2016) and make
appropriate adjustments to the Euclidean distance function to discriminate
among populations under more general settings. A transformation of the
usual MADD is proposed in Section 2. Section 3 further generalizes the
proposed classifier to address cases where discriminative information comes
from the joint distribution of groups of component variables. Numerical
performance of the proposed classifiers on simulated datasets is evaluated in
Section 4.
2. Transformation of MADD
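Let us denote the training data set
$\{(\mathbf{X}_1, Y_1), (\mathbf{X}_2, Y_2), \ldots, (\mathbf{X}_n, Y_n)\}$ by
$\chi_n$ (a random sample of size $n$). Here, the $\mathbf{X}_i$s are
$d$-dimensional random vectors and $Y_i \in \{1, \ldots, J\}$ denotes the class
label associated with $\mathbf{X}_i$ for $1 \le i \le n$. Let $n_j$ be the
number of observations from the $j$-th class for $1 \le j \le J$ and
$n = \sum_{j=1}^{J} n_j$. For a fixed $i$, suppose that $P[Y_i = j] = \pi_j$
with $0 < \pi_j < 1$, $\mathbf{X}_i \mid Y_i = j \sim F_j$ for $1 \le j \le J$
and $\sum_{j=1}^{J} \pi_j = 1$. Our objective is to predict the class label of
a new observation $\mathbf{Z} \in \mathbb{R}^d$.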
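For a given training sample $\chi_n$ and test point
$\mathbf{Z} \in \mathbb{R}^d$, Pal et al. (2016) defined the dissimilarity
index MADD between $\mathbf{Z}$ and $\mathbf{X}_{i_0}$ for a fixed index
$1 \le i_0 \le n$ as follows:
$$\rho_{1,d}(\mathbf{Z}, \mathbf{X}_{i_0}) = \frac{1}{n-1} \sum_{1 \le i \ne i_0 \le n} \left| d^{-1/2}\|\mathbf{Z} - \mathbf{X}_i\| - d^{-1/2}\|\mathbf{X}_{i_0} - \mathbf{X}_i\| \right|. \tag{1}$$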
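The index in (1) can be computed directly from pairwise distances. The following is a minimal Python sketch, not the authors' implementation: the function names (`scaled_euclidean`, `madd`, `nn_madd_classify`), the choice of a 1-NN rule, and the toy data are assumptions made for illustration only.

```python
import math


def scaled_euclidean(u, v):
    """Euclidean distance between u and v divided by sqrt(d), d = len(u)."""
    d = len(u)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / d)


def madd(z, train, i0):
    """MADD between test point z and training point train[i0], as in (1).

    Averages |dist(z, x_i) - dist(x_{i0}, x_i)| over all training points
    x_i with i != i0.
    """
    n = len(train)
    total = 0.0
    for i, x in enumerate(train):
        if i == i0:
            continue  # the sum in (1) skips the index i0 itself
        total += abs(scaled_euclidean(z, x) - scaled_euclidean(train[i0], x))
    return total / (n - 1)


def nn_madd_classify(z, train, labels):
    """Assumed 1-NN rule: return the label of the training point whose MADD to z is smallest."""
    best = min(range(len(train)), key=lambda i: madd(z, train, i))
    return labels[best]


# Hypothetical toy data: two well-separated classes in two dimensions.
train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = [1, 1, 2, 2]
z = [0.2, 0.4]
predicted = nn_madd_classify(z, train, labels)
print(predicted)
```

Because every term compares distances to a common third point $\mathbf{X}_i$, the index is exactly zero when the test point coincides with $\mathbf{X}_{i_0}$.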
MADD uses the usual Euclidean (also known as $\ell_2$) distance to compute
dissimilarity between two vectors. However, one may use any other distance
function instead of the $\ell_2$-distance to define a measure of dissimilarity
like $\rho_{1,d}$. It is established that MADD performs well when
$\nu_{jj'}^2 > 0$ or $\sigma_j^2 \ne \sigma_{j'}^2$ for
$1 \le j \ne j' \le J$ (see Pal et al. (2016)), but fails under more general
conditions (see Figure 2). Observe that these limiting values $\nu_{jj'}^2$,
$\sigma_j^2$ and $\sigma_{j'}^2$ come into the picture as a consequence of
using the Euclidean distance in $\rho_{1,d}$. We propose some adjustments to
the Euclidean norm that will lead to better performance compared to usual MADD
under more general situations.
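Let $h : [0, \infty) \to [0, \infty)$ and $\psi : [0, \infty) \to [0, \infty)$
be two continuous, monotonically increasing functions with
$h(0) = \psi(0) = 0$. For $\mathbf{u}, \mathbf{v} \in \mathbb{R}^d$, define
$$\varphi_{h,\psi}(\mathbf{u}, \mathbf{v}) = h\left[ d^{-1} \sum_{q=1}^{d} \psi\left(|u_q - v_q|^2\right) \right]. \tag{2}$$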
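As a rough illustration of (2), the sketch below evaluates the transformed distance for user-supplied functions. The function name `generalized_distance` and both example choices of the two functions are assumptions for illustration; the bounded choice $1 - e^{-t}$ is simply one function that is continuous, increasing and zero at the origin, and is not claimed to be the authors' choice.

```python
import math


def generalized_distance(u, v, h, psi):
    """Evaluate h( (1/d) * sum_q psi(|u_q - v_q|^2) ), as in (2)."""
    d = len(u)
    return h(sum(psi(abs(a - b) ** 2) for a, b in zip(u, v)) / d)


u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]

# Choice 1: h(t) = sqrt(t), psi(t) = t gives the Euclidean distance divided by sqrt(d).
euclid_version = generalized_distance(u, v, math.sqrt, lambda t: t)
euclid_direct = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v))) / math.sqrt(len(u))

# Choice 2 (hypothetical): h(t) = t, psi(t) = 1 - exp(-t), a bounded transformation.
bounded_version = generalized_distance(u, v, lambda t: t, lambda t: 1.0 - math.exp(-t))

print(euclid_version, euclid_direct, bounded_version)
```

Passing the two functions as arguments keeps the coordinate-wise structure of (2) in one place, so different transformations can be compared on the same pair of vectors.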
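It is clear that by considering $h(t) = t^{1/p}$ and $\psi(t) = t^{p/2}$ for
$p \ge 1$, we get
$\varphi_{h,\psi}(\mathbf{u}, \mathbf{v}) = \frac{1}{\sqrt[p]{d}} \|\mathbf{u} - \mathbf{v}\|_p$,
i.e., the $\ell_p$ distance between $\mathbf{u}$ and $\mathbf{v}$ scaled by
$\sqrt[p]{d}$. For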
320 | ISI WSC 2019