Page 43 - Contributed Paper Session (CPS) - Volume 3

CPS1941 Jang S.
            is computationally very expensive, since it requires running the model
            (N − 1) times, which can be quite problematic for large data
            samples.
                The criterion we propose also uses the data to test the predictive power of
            the model. We propose to use the posterior probability P(j/Y_i) of
            individual i's membership in group j, which can be computed with
            Bayes' theorem as


            $$P(j/Y_i) = \frac{\pi_j \, P(Y_i/j)}{\sum_{k=1}^{r} \pi_k \, P(Y_i/k)} \qquad (8)$$
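
            As an illustration of the Bayes step in equation (8), the following minimal
            sketch computes the posterior membership probabilities for a single individual.
            The function name, the argument names, and the numerical values are
            hypothetical; the sketch assumes that the likelihood of the individual's
            trajectory under each group and the group membership probabilities have
            already been estimated.

```python
def posterior_memberships(likelihoods, group_probs):
    """Posterior probability of membership in each group for one individual.

    likelihoods[j] : assumed likelihood of the individual's trajectory in group j
    group_probs[j] : assumed (prior) probability of belonging to group j
    Both inputs are illustrative placeholders, not output of the paper's model.
    """
    if len(likelihoods) != len(group_probs):
        raise ValueError("one likelihood and one group probability per group")
    # Numerator of Bayes' theorem for every group j
    joint = [lik * prob for lik, prob in zip(likelihoods, group_probs)]
    # Denominator: total probability of the trajectory over all groups
    total = sum(joint)
    if total <= 0:
        raise ValueError("the trajectory has zero likelihood in every group")
    return [value / total for value in joint]


# Hypothetical values for a model with r = 3 groups
posterior = posterior_memberships([0.02, 0.10, 0.01], [0.5, 0.3, 0.2])
most_likely_group = posterior.index(max(posterior))
```

            The returned probabilities sum to one, and the index of the largest entry
            gives the group to which the individual is most likely to belong.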


            To determine the optimal number of groups r, we maximize the posterior
            probability criterion (PPC), defined for r > 1 by

                                                                                    (9)


            This means that we choose the number of groups that best explains the
            predicted group membership of the people in the dataset. A PPC of N
            means that, for every person in the dataset, it is almost certain to which
            group they belong. The big advantage of this criterion is that it gives a clear
            result and requires only a minimal number of computations. Moreover, the
            criterion is always computed as a sum of N terms and hence does not depend
            on the number of parameters in the model. Therefore, there is no need for a
            correction term.
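
            The selection step can be sketched as follows. This is only one reading of
            the criterion that is consistent with the description above (a sum of N terms
            whose maximum value is N): the sketch assumes that each individual contributes
            the largest of their posterior membership probabilities. The function names
            and the posterior values are hypothetical and do not come from the salary
            data.

```python
def ppc(posteriors):
    """Assumed form of the criterion: sum over individuals of the largest
    posterior membership probability. posteriors[i][j] is the posterior
    probability that individual i belongs to group j."""
    return sum(max(row) for row in posteriors)


def best_number_of_groups(posteriors_by_r):
    """Return the number of groups r > 1 with the highest PPC, plus all scores.

    posteriors_by_r maps each candidate r to the N x r posterior matrix
    obtained from the fitted r-group model (hypothetical input).
    """
    scores = {r: ppc(p) for r, p in posteriors_by_r.items() if r > 1}
    if not scores:
        raise ValueError("at least one candidate with r > 1 is required")
    return max(scores, key=scores.get), scores


# Hypothetical posteriors for N = 3 individuals under two candidate models
candidates = {
    2: [[0.90, 0.10], [0.60, 0.40], [0.20, 0.80]],
    3: [[0.95, 0.03, 0.02], [0.05, 0.90, 0.05], [0.10, 0.10, 0.80]],
}
best_r, scores = best_number_of_groups(candidates)
```

            Under this assumption the score can never exceed N, and each score is a sum
            of exactly N terms whatever the number of model parameters.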

            6.   Conclusion
                In this article, we presented Nagin's finite mixture model and some of its
            generalizations and showed some inherent shortcomings for possible
            applications. We addressed these by proposing a new generalized finite
            mixture model. A key characteristic is its ability to model nearly all kinds of
            trajectories and to add covariates to the trajectories themselves in a nonlinear
            way.
                We illustrated these possibilities with a data example on salary
            trajectories. We showed how to add a classical group membership predictor
            variable to the trajectories, as well as a time series that does not depend on
            the subjects of the analysis but influences the shape of the trajectories in some
            of the groups.
                Finally,  we  proposed  a  new  methodology  for  determining  the  optimal
            number  of  groups  in  finite  mixture  models  by  introducing  the  posterior
            probability criterion.



                                                                32 | ISI WSC 2019