Page 42 - Contributed Paper Session (CPS) - Volume 3

CPS1941 Jang S.
                      The disturbance terms for the three groups are $\sigma_1 = 33.11$, $\sigma_2 = 54.18$
                  and $\sigma_3 = 78.85$ respectively. The dispersion is thus higher in the groups with
                  higher salaries than in those with lower salaries. This makes sense, since in the
                  low salary group a lot of employees just earn the minimum wage. Hence, many
                  of them have the same salary.
                      Moreover, this example illustrates the dependence of the trajectories on
                  Luxembourg’s GDP. We see that in the three groups, this influence is
                  nonlinear, since the coefficient with index 2 is always significantly different
                  from 0. The trajectory equations from Table 1 can now be used to predict the
                  future evolution of the salaries for men and women as a function of GDP.

                  5.   Model Selection in Finite Mixture Models
                      So far, there is no fully satisfactory model selection procedure for
                  determining the optimal number of classes in a family of finite mixture
                  models. Nagin (2005) emphasizes the need for an interplay of formal
                  statistical criteria and subjective judgment and proposes to use the
                  Bayesian Information Criterion (BIC), defined by

                                      $$\mathrm{BIC} = \log(L) - 0.5\, k \log(N),$$                  (6)


                  where $L$ denotes the maximized likelihood, $k$ the number of parameters in
                  the model and $N$ the sample size. The larger the BIC, the better the model
                  explains the data. The correction term $0.5\, k \log(N)$ is necessary because
                  the likelihood $L$ is an increasing function of the number of groups, so
                  using the likelihood alone as a criterion would always favor the model with
                  the most groups.
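
The role of the correction term in (6) can be illustrated with a minimal sketch. It assumes the notation L for the maximized likelihood, k for the number of parameters and N for the sample size; the function names `nagin_bic` and `select_by_bic` are hypothetical, and the log-likelihood values are made up for illustration only. They are not results from this paper.

```python
import math


def nagin_bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC in the form of equation (6): log(L) - 0.5 * k * log(N)."""
    return log_likelihood - 0.5 * n_params * math.log(n_obs)


def select_by_bic(candidates: dict, n_obs: int):
    """Return the number of groups with the largest BIC, plus all scores.

    `candidates` maps a number of groups r to a pair
    (maximized log-likelihood, number of parameters).
    """
    scores = {
        r: nagin_bic(log_lik, n_params, n_obs)
        for r, (log_lik, n_params) in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical fits (illustrative numbers only): the log-likelihood keeps
# increasing with the number of groups, but the gain eventually becomes
# smaller than the penalty 0.5 * k * log(N).
candidates = {
    1: (-5200.0, 4),
    2: (-5050.0, 9),
    3: (-5000.0, 14),
    4: (-4995.0, 19),
}

best, scores = select_by_bic(candidates, n_obs=1000)
best_by_likelihood = max(candidates, key=lambda r: candidates[r][0])

print("chosen by BIC:", best)
print("chosen by likelihood alone:", best_by_likelihood)
```

With these illustrative numbers, the likelihood alone would pick the largest model (4 groups), whereas the BIC picks 3 groups, because the small likelihood gain of the fourth group does not offset the penalty for its additional parameters.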
                      Nielsen et al. (2014) argue that existing software does not always
                  compute the BIC accurately and that, furthermore, the BIC on its own does
                  not always indicate a reasonable number of groups, even when computed
                  correctly. They propose to use the cross-validation error (CVE) instead:
                  a CVE is computed for each possible choice r of the number of latent
                  classes, indicating the extent to which the model fails to fit the data.
                  The final choice of r is then the one that minimizes the CVE. More
                  precisely, the CVE is defined by

                                                   
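                        $$\mathrm{CVE} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{T} \sum_{t=1}^{T} \left| y_{it} - \hat{y}_{it}^{[-i]} \right|,$$                 (7)

                  where $\hat{y}_{it}^{[-i]}$ denotes the estimation of $y_{it}$ obtained by running the model for
                  the whole dataset, except line $i$. Apart from the fact that the numerical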
                  examples in Nielsen et al. (2014) do not really seem convincing, this method


                                                                      31 | I S I   W S C   2 0 1 9