Page 336 - Contributed Paper Session (CPS) - Volume 6

CPS1950 Paolo G. et al.
                  the associated weight is, the more the variable plays a relevant role. If the
                  weight of a variable is equal to zero, then the variable is discarded. There are
                  various ways to give weights to the variables. A common one is based on
                  Principal  Component  Analysis  (PCA),  i.e.,  by  considering  the  component
                  loadings. In this case, the principal components span a low-dimensional space
                  of dimension Q (< J) onto which the observation units are projected. The
                  partition is obtained by clustering the observation units in terms of their
                  coordinates in this low-dimensional space, i.e., in terms of the component
                  scores. For this reason, we refer to this approach as subspace clustering.
                      In the naïve approach to subspace clustering, the data reduction and the
                  clustering steps are carried out sequentially: first, PCA is applied to the
                  data; then, the clustering method is run on the resulting component scores.
                  Such an approach is usually known as tandem analysis (Arabie & Hubert,
                  1994). Although it is intuitive, its use is not recommended because the
                  principal components are not optimal in the clustering sense. In fact, as is
                  well known, they maximize the sum of squares accounted for, regardless of
                  the cluster structure, and may therefore lead to a low-dimensional
                  configuration of the observation units in which the taxonomy is obscured.
                  For more details, the interested reader may refer to, for instance,
                  De Sarbo et al. (1990) and De Soete & Carroll (1994).
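The two sequential steps of tandem analysis can be illustrated with a minimal sketch. It assumes NumPy, PCA computed through the singular value decomposition of the column-centered data, and a plain Lloyd-type K-means; the function names (`pca_scores`, `kmeans`, `tandem_analysis`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_scores(X, Q):
    """Return the I x Q component scores and the J x Q loading matrix."""
    Xc = X - X.mean(axis=0)                      # column-center the data
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    A = Vt[:Q].T                                 # J x Q, columnwise orthonormal
    return Xc @ A, A

def assign(Y, centroids):
    """Assign each row of Y to its closest centroid (squared Euclidean)."""
    d2 = ((Y[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(Y, K, n_iter=100, seed=0):
    """Plain Lloyd-type K-means on the rows of Y (illustrative only)."""
    rng = np.random.default_rng(seed)
    centroids = Y[rng.choice(len(Y), size=K, replace=False)]
    for _ in range(n_iter):
        labels = assign(Y, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            Y[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(K)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(Y, centroids), centroids

def tandem_analysis(X, Q, K, seed=0):
    """Step 1: PCA on X. Step 2: K-means on the component scores."""
    scores, A = pca_scores(X, Q)
    labels, _ = kmeans(scores, K, seed=seed)
    return labels, scores, A
```

Note that the loadings `A` are fixed before any cluster information is used, which is precisely the weakness of the sequential approach discussed above.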
                      In order to carry out clustering and dimension reduction simultaneously,
                  at least two proposals are available: the Reduced K-means (RKM) analysis
                  suggested by De Soete & Carroll (1994) and the Factorial K-means (FKM)
                  analysis suggested by Vichi & Kiers (2001). Both methods detect a partition
                  of the observation units into K clusters by assuming that the centroids lie
                  in a subspace of the variables. Although they are based on the same
                  assumption, as we shall see, they present distinctive features.
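To make the difference between the two criteria concrete, the following sketch evaluates both loss functions for a given hard partition and a given subspace. It assumes the formulations commonly attributed to the cited papers, namely ||X - U Ybar A'||^2 for RKM and ||XA - U Ybar||^2 for FKM, with U the I x K binary membership matrix, A a J x Q columnwise orthonormal loading matrix and Ybar the K x Q centroid matrix in the subspace. The notation and the function names are assumptions made here for illustration and may differ from those adopted in the next section.

```python
import numpy as np

def subspace_centroids(X, U, A):
    """K x Q centroids: cluster means of the scores XA (no empty clusters)."""
    return np.linalg.solve(U.T @ U, U.T @ X @ A)

def rkm_loss(X, U, A):
    """Assumed RKM criterion: distance of X from the centroids mapped
    back into the original J-dimensional space."""
    Ybar = subspace_centroids(X, U, A)
    return float(np.sum((X - U @ Ybar @ A.T) ** 2))

def fkm_loss(X, U, A):
    """Assumed FKM criterion: distance of the projected data XA from the
    centroids, measured inside the Q-dimensional subspace."""
    Ybar = subspace_centroids(X, U, A)
    return float(np.sum((X @ A - U @ Ybar) ** 2))
```

Under these assumptions and with A'A = I, the RKM loss equals the FKM loss plus ||X - XAA'||^2, the part of the data lying outside the subspace. RKM therefore also penalizes the information lost by the projection, whereas FKM only considers the within-cluster deviance in the subspace.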
                      In this paper, we propose a new clustering method in a reduced subspace
                  that exploits the potential of both RKM and FKM. For this purpose, a convex
                  linear combination of the RKM and FKM loss functions is used. Furthermore,
                  in order to broaden the applicability of our proposal, the clustering
                  problem is approached from the fuzzy point of view (Zadeh, 1965). In
                  contrast with the standard approach, where every observation unit is
                  assigned to one and only one cluster, the fuzzy approach assigns the
                  observation units to the clusters with so-called fuzzy membership degrees.
                  These range in the interval [0, 1], where 0 means complete non-membership
                  and 1 complete membership, and, for each observation unit, the fuzzy
                  membership degrees sum to one.
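The constraints on the fuzzy membership degrees can be illustrated with a short sketch. It assumes the membership update of the classical fuzzy K-means algorithm, in which the degrees are inversely related to the squared distances from the centroids; the fuzziness parameter `m` (> 1) and the function name are assumptions introduced here and are not stated in the text above.

```python
import numpy as np

def fuzzy_memberships(X, centroids, m=2.0, eps=1e-12):
    """Membership degrees of each row of X to each centroid.

    Assumed rule (classical fuzzy K-means): u_ik is proportional to
    d2_ik ** (-1 / (m - 1)), where d2_ik is the squared Euclidean distance
    between unit i and centroid k; eps avoids division by zero.
    """
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + eps
    w = d2 ** (-1.0 / (m - 1.0))
    # Normalizing by row makes each unit's degrees sum to one.
    return w / w.sum(axis=1, keepdims=True)
```

In this sketch, a unit coinciding with a centroid receives a degree close to 1 for that cluster, while a unit equidistant from all centroids receives equal degrees; larger values of `m` make the partition fuzzier.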
                      The paper is organized as follows. In the next section, RKM and FKM are
                  recalled and the new proposal is introduced. In Section 3 the results of the
                  application of the new clustering procedure to real data are reported. Some
                  final remarks in Section 4 conclude the paper.

                                                                     325 | ISI WSC 2019