For regular models, the integrated likelihood can be approximated simply by the BIC (Bayesian Information Criterion). The BIC adds a penalty to the log-likelihood based on the number of parameters in the model, and it has shown good performance in a number of applications (?; ?; ?). The BIC can be calculated as follows:

$$
2 \log p(x \mid \mathcal{M}_k) \;\approx\; 2 \log p(x \mid \hat{\theta}_k^{*}, \mathcal{M}_k) \;-\; \nu_k \log(n) \;\equiv\; \mathrm{BIC}_k \qquad (13)
$$

where $\log p(x \mid \hat{\theta}_k^{*}, \mathcal{M}_k)$ is the maximized log-likelihood for the model and data $x$, $n$ is the number of observations, and $\nu_k$ is the number of independent parameters to be estimated for model $\mathcal{M}_k$ (?).
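As a quick numerical illustration of Eq. (13), the snippet below computes the BIC directly from a maximized log-likelihood; all numeric values are hypothetical placeholders, not results from this paper.

```python
# Hedged illustration of Eq. (13): BIC from a maximized log-likelihood.
# All numeric values below are hypothetical, for demonstration only.
import math

log_lik = -1234.5  # maximized log-likelihood log p(x | theta*, M_k) (hypothetical)
nu = 14            # number of independent parameters nu_k in M_k    (hypothetical)
n = 500            # number of observations                          (hypothetical)

bic = 2 * log_lik - nu * math.log(n)  # Eq. (13): larger BIC indicates a better model
print(f"BIC = {bic:.1f}")
```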
Finally, we can adopt the following strategy to combine all of the methods discussed so far to select the optimal model (a sketch in code follows the list):

1. Select a maximum number of components to consider for the mixture model; call it Gmax.
2. Estimate the parameters via EM with the MAP estimation method for each parameterization and each number of components up to Gmax.
3. Compute the BIC for the mixture likelihood using the parameter estimates from EM for up to Gmax clusters.
4. Select the model (parameterization/number of mixture components) having the maximum BIC.
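The following is a minimal sketch of this strategy, assuming scikit-learn's GaussianMixture as the mixture estimator (an assumption; the paper does not prescribe a library). Two caveats: scikit-learn fits by maximum-likelihood EM rather than the MAP variant discussed above, and its bic() method returns −2 log L + ν log n, so the best model minimizes that value rather than maximizing Eq. (13).

```python
# Sketch of the BIC-based model selection strategy (steps 1-4 above),
# using scikit-learn's GaussianMixture. The function name select_gmm,
# the Gmax default, and the covariance types searched are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def select_gmm(X, g_max=9, cov_types=("spherical", "diag", "tied", "full")):
    best = None
    for cov in cov_types:                 # each parameterization
        for g in range(1, g_max + 1):     # each number of components up to Gmax
            gmm = GaussianMixture(n_components=g, covariance_type=cov,
                                  n_init=5, random_state=0).fit(X)  # EM fit
            bic = gmm.bic(X)  # sklearn convention: -2 log L + nu log n (lower is better)
            if best is None or bic < best[0]:
                best = (bic, cov, g, gmm)
    return best

# Usage on synthetic data:
X = np.random.default_rng(0).normal(size=(300, 2))
bic, cov_type, g, model = select_gmm(X, g_max=5)
print(f"selected: covariance_type={cov_type}, G={g}, BIC={bic:.1f}")
```

The double loop mirrors the strategy directly: every parameterization/component-count pair is fit by EM and scored, and the single best pair is returned.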

               3.3  Dimension reduction for visualization
After performing cluster analysis on a set of data, it is usually desirable to check the distinctness of the clusters created. Popular measures of cluster validity, e.g., the silhouette width (?), utilize Euclidean distances or other standardized metrics to check the validity of K-means clustering, but the distance to be used for clusters arising from GMMBC and other clustering methods is not clearly delineated. ? proposed a methodology to reduce the dimensionality of the data so that it can be projected onto a subspace of 2 or 3 dimensions, giving a convenient visual representation of the clusters created from a finite mixture of Gaussian densities. Information on the dimension-reduced subspace is taken from group-specific quantities such as the group means and, depending on the estimated mixture model, the variation in the group covariances. The proposed method aims to reduce the dimensionality by identifying a set of linear combinations of the original features, called directions, ordered by importance as quantified by the associated eigenvalues, which capture most of the cluster structure contained in the data. Observations may then be projected onto the dimension-reduced subspace, which facilitates various summary plots that help us visualize the clustering structure. The method uses the Gaussianity of
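A hedged sketch of this projection idea is given below. It implements only a simplified special case, using the weighted between-group scatter of the fitted component means against the overall covariance (omitting the covariance-based contributions mentioned above); the function cluster_directions and all settings are illustrative, not the cited method's actual implementation.

```python
# Simplified sketch: directions that separate the fitted mixture components,
# obtained from a generalized eigenproblem of between-group scatter vs.
# overall covariance. An illustrative special case, not the cited method.
import numpy as np
from scipy.linalg import eigh
from sklearn.mixture import GaussianMixture

def cluster_directions(X, gmm, d=2):
    props = gmm.weights_               # mixing proportions (G,)
    means = gmm.means_                 # component means (G, p)
    mu = props @ means                 # overall weighted mean (p,)
    # weighted between-group scatter of the component means
    B = sum(w * np.outer(m - mu, m - mu) for w, m in zip(props, means))
    S = np.cov(X, rowvar=False)        # overall covariance of the data
    # solve B v = lambda S v; keep the d directions with largest eigenvalues
    evals, evecs = eigh(B, S)
    V = evecs[:, np.argsort(evals)[::-1][:d]]  # directions ordered by importance
    return X @ V                       # observations projected onto the subspace

# Usage: fit a mixture, then project to 2-D for summary plots.
X = np.random.default_rng(1).normal(size=(200, 5))
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
Z = cluster_directions(X, gmm, d=2)
```

Projecting onto the first two such directions yields the 2- or 3-dimensional coordinates used for the summary plots described above.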
