Page 334 - Contributed Paper Session (CPS) - Volume 2
CPS1876 Sarbojit R. et al.
for 1 ≤ i ≤ n and 1 ≤ j ≤ b. Analogous to the quantities ρ_{h,ψ}(z, z₀) and
ρ_{3,h,ψ}(z, z₀) for z, z₀ ∈ ℝ^d with a fixed 1 ≤ i₀ ≤ n (see Section 2), we now
define the following two transformations:

ρ_{h,ψ}(z, z₀) = h[ b⁻¹ ∑_{j=1}^{b} ψ( d_j⁻¹ ‖z^{(j)} − z₀^{(j)}‖² ) ],   (4)

ρ_{3,h,ψ}(z, z_{i₀}) = (n − 1)⁻¹ ∑_{1 ≤ i ≠ i₀ ≤ n} |ρ_{h,ψ}(z, z_i) − ρ_{h,ψ}(z_{i₀}, z_i)|.   (5)
This transformation is the Mean Absolute Difference of the
generalized group-based Distances (ggMADD). We denote the NN
classifier based on ggMADD by δ_{3,h,ψ}. Under appropriate conditions,
we proved that the misclassification probability of δ_{3,h,ψ} converges to
zero as d → ∞.
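As a rough sketch (not the authors' implementation), the two transformations in (4) and (5) might be computed as follows. The choices h(t) = t and ψ(t) = 1 − exp(−t) are placeholders for a monotone h and a bounded ψ, and the group structure and toy data are illustrative.

```python
import numpy as np

def rho(z, z0, groups, h=lambda t: t, psi=lambda t: 1.0 - np.exp(-t)):
    """Group-based distance in (4): h of the average, over groups, of
    psi applied to the size-scaled squared group-wise distances."""
    vals = []
    for g in groups:                      # g = index array of one group
        diff = z[g] - z0[g]
        vals.append(psi(np.dot(diff, diff) / len(g)))
    return h(np.mean(vals))

def ggmadd(z, i0, Z, groups):
    """ggMADD transformation in (5): mean absolute difference between
    the distances from z and from Z[i0] to the other observations."""
    n = Z.shape[0]
    total = 0.0
    for i in range(n):
        if i == i0:
            continue
        total += abs(rho(z, Z[i], groups) - rho(Z[i0], Z[i], groups))
    return total / (n - 1)

# toy usage: n = 5 observations, d = 6 coordinates, b = 3 groups of size 2
rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 6))
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
print(ggmadd(Z[0], 1, Z, groups))
```

An NN classifier would then assign a test point z to the class of the training point i₀ minimizing ggmadd(z, i0, Z, groups).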
4. Results from the Analysis of Simulated Data Sets
We analyze several high-dimensional simulated data sets to compare the
performance of the proposed classifiers. Recall the examples introduced in
Section 1 (Examples 1, 2) and also the example introduced later in Section 3
(Example 3). Along with these three data sets, we now consider one more
example in this section.
Recall the two scale matrices Σ₁ and Σ₂ defined in Example 3 (see Section
3). Let U₁, …, U_b be independent d₀ × 1 random vectors, identically distributed as
the multivariate Cauchy distribution with location parameter 0 and scale
matrix Σ₁ (say, C_{d₀}(0, Σ₁)). Consider a d × 1 random vector X from the first
population, X = (U₁⊤, …, U_b⊤)⊤. The distribution function of X is F₁(x) =
∏_{j=1}^{b} F_{2,1}(x^{(j)}), where F_{2,1} ≡ C_{d₀}(0, Σ₁). Similarly, a d × 1 random vector Y
from the second population follows F₂(y) = ∏_{j=1}^{b} F_{2,2}(y^{(j)}), where F_{2,2} ≡
C_{d₀}(0, Σ₂). We consider this as Example 4. Since moments do not exist for
the multivariate Cauchy distribution, the constants σ₁², σ₂² and σ₁₂² do not exist
in this example. So, we cannot comment on the performances of the usual
NN or NNMADD classifiers as d → ∞. But the proposed classifier can deal with such
heavy-tailed distributions if we choose the function ψ to be bounded.
However, in Example 4, the one-dimensional marginals of both F₁ and F₂ are
all Cauchy distributions with location zero and scale one (i.e., C(0, 1)).
Therefore, gMADD cannot differentiate between F₁ and F₂. This pushes us to
capture the differences between F₁ and F₂ through the joint distributions of
the groups.
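To make this concrete, here is a hypothetical simulation in the spirit of Example 4. Each group is multivariate Cauchy, drawn as a multivariate t with one degree of freedom, with unit-diagonal scale matrices, so every coordinate is marginally C(0, 1) in both classes while the dependence within a group differs. The specific matrices S1 and S2 below are illustrative stand-ins, not the Σ₁ and Σ₂ of Example 3.

```python
import numpy as np

def mv_cauchy(n, scale, rng):
    """Draw n samples from multivariate Cauchy C(0, scale): a
    multivariate t with 1 df, i.e. N(0, scale) divided by an
    independent chi(1) variate."""
    d = scale.shape[0]
    L = np.linalg.cholesky(scale)
    z = rng.standard_normal((n, d)) @ L.T
    w = np.sqrt(rng.chisquare(1, size=(n, 1)))
    return z / w

rng = np.random.default_rng(0)
b, d0 = 4, 2                       # b groups, each of size d0
S1 = np.eye(d0)                    # class 1: independent coordinates
S2 = np.array([[1.0, 0.9],         # class 2: strong within-group dependence;
               [0.9, 1.0]])        # unit diagonals keep the C(0, 1) marginals
X = np.hstack([mv_cauchy(1000, S1, rng) for _ in range(b)])
Y = np.hstack([mv_cauchy(1000, S2, rng) for _ in range(b)])

# Every single coordinate is C(0, 1) in both classes (medians of the
# absolute values are near 1); only the joint behaviour within a group,
# e.g. the spread of coordinate differences, separates the classes.
print(np.median(np.abs(X[:, 0])), np.median(np.abs(Y[:, 0])))
print(np.median(np.abs(X[:, 0] - X[:, 1])), np.median(np.abs(Y[:, 0] - Y[:, 1])))
```

Coordinate-wise summaries agree across the two samples, while the within-group differences are visibly tighter for the dependent class, which is exactly the information ggMADD exploits.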
In each example, we generated 50 observations from each of the two
classes to form the training sample, while a test set of size 500 (250 from each
class) was used. This procedure was repeated 100 times to compute the
323 | ISI WSC 2019