Page 337 - Contributed Paper Session (CPS) - Volume 6

CPS1950 Paolo G. et al.
            2.  Methodology
                In this section we first review Reduced K-Means (RKM) and Factorial
            K-Means (FKM). The new clustering procedure is then illustrated in detail.
            a.  Reduced K-Means (RKM):
            Let X be the data matrix of order (I × J) containing the scores of I observation
            units with respect to J variables. The Reduced K-means (RKM) analysis (De
            Soete & Carroll, 1994) can be formulated as:
                  X = UFA + E                                                       (1)
            where U, of order (I × K), is the binary membership matrix: its elements are
            0 or 1 and assign each observation unit to exactly one of the K clusters.
            Note that U is row-stochastic, that is, each of its rows sums to 1.
            A is the component weight matrix of order (J × Q). It is column-wise
            orthonormal, i.e., $A'A = I_Q$, where $I_Q$ is the identity matrix of order Q, and
            every column contains the weights of the variables on the corresponding
            component. F is the centroid matrix of order (K × Q), in which every
            row refers to a cluster centroid. The centroids lie in the reduced subspace
            spanned by the columns of A. Finally, E is the residual matrix, which has the
            same order as X. The optimal parameter matrices U, F and A are obtained in the
            least squares sense by minimizing the residual sum of squares:
                  $f_{RKM} = \|E\|^2 = \|X - UFA'\|^2$,                              (2)
            where $\|\cdot\|$ denotes the Frobenius norm. Suitable Alternating Least Squares
            (ALS) algorithms can be adopted for the minimization of (2). The RKM solution
            is not unique: equally fitting solutions exist because of the rotational
            indeterminacy of the weights in A and the centroids in F, and because of cluster
            label switching. Given an orthonormal rotation matrix R of order (Q × Q) and a
            permutation matrix P of order (K × K), letting $A^* = AR$, $U^* = UP$ and
            $F^* = P'FR$, we have $U^*F^*A^{*\prime} = UFA'$.
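
            The invariance can be checked in one line. The following is a short worked
            derivation, not taken from the paper, that takes $F^* = P'FR$ (with $P'$ the
            transpose of P) and uses only $PP' = I_K$ for a permutation matrix and
            $RR' = I_Q$ for an orthonormal rotation matrix:

```latex
\begin{align*}
U^{*}F^{*}A^{*\prime}
  &= (UP)\,(P'FR)\,(AR)' \\
  &= U\,(PP')\,F\,(RR')\,A' \\
  &= UFA' .
\end{align*}
```

            Since the fitted part of the model is unchanged, the residual matrix E and
            hence the loss in (2) are unchanged as well.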
            When Q = J, i.e., when the variable space is not reduced through PCA ($A = I_J$,
            where $I_J$ is the identity matrix of order J), RKM reduces to the standard
            K-means (KM) algorithm.
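
            To make the alternating scheme for minimizing (2) concrete, the following is a
            minimal Python sketch, not the authors' implementation. It assumes the usual
            two-step ALS: for fixed U, A is taken as the leading Q right singular vectors of
            the size-weighted cluster means and F as the projected means; for fixed F and
            A, each unit is assigned to the closest centroid FA' in the observed space. The
            function name `rkm_als`, its parameters, the random start and the handling of
            empty clusters (left with a zero centroid) are illustrative assumptions.

```python
import numpy as np


def rkm_als(X, K, Q, n_iter=100, seed=0):
    """Sketch of an ALS scheme for the RKM loss ||X - U F A'||^2 (assumed, see text)."""
    rng = np.random.default_rng(seed)
    I, J = X.shape
    # Random initial partition; labels[i] = k means row i of U has its 1 in column k.
    labels = rng.integers(K, size=I)
    for _ in range(n_iter):
        # Cluster sizes and cluster means in the observed (J-dimensional) space.
        counts = np.bincount(labels, minlength=K)
        means = np.zeros((K, J))
        np.add.at(means, labels, X)
        nonempty = counts > 0
        means[nonempty] /= counts[nonempty][:, None]
        # Update A: leading Q right singular vectors of the size-weighted means,
        # i.e. leading eigenvectors of X' U (U'U)^{-1} U' X for non-empty clusters.
        _, _, Vt = np.linalg.svd(np.sqrt(counts)[:, None] * means, full_matrices=True)
        A = Vt[:Q].T                      # (J x Q), column-wise orthonormal
        # Update F: cluster means projected onto the columns of A.
        F = means @ A                     # (K x Q)
        # Update U: assign each unit to the nearest centroid F A' in the observed space.
        centroids = F @ A.T               # (K x J)
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                         # partition is stable
        labels = new_labels
    loss = float(np.sum((X - F[labels] @ A.T) ** 2))
    return labels, F, A, loss
```

            The membership matrix can be recovered as `U = np.eye(K)[labels]`. Like KM, a
            scheme of this kind can stop in a local minimum, so several random starts
            (different `seed` values) are advisable in practice. With `Q` equal to the
            number of columns of `X`, A is square and orthogonal, FA' equals the cluster
            means, and the sketch behaves like standard KM.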
            b.  Factorial K-Means (FKM):
                The RKM loss function in (2) is only a proxy for the within-cluster sum of
            squares in the reduced space. In fact, it is the sum of the squared distances
            between the observation units in the (J-dimensional) observed space and the
            centroids in the (Q-dimensional) reduced space. This is somewhat inconsistent,
            because it appears more reasonable to compute the within-cluster sum of
            squares in the reduced space by projecting not only the centroids but also
            the observation units onto the reduced space. This motivation leads to the
            so-called Factorial K-Means (FKM) procedure developed by Vichi & Kiers
            (2001). The FKM model is expressed as
                  XAA = UFA + E.                                                   (3)



                                                               326 | I S I   W S C   2 0 1 9