Page 43 - Contributed Paper Session (CPS) - Volume 3
CPS1941 Jang S.
is computationally very expensive, since it requires running the model
( − 1) times, which can be quite problematic for large data
samples.
The criterion we propose also uses the data to test the predictive power of the
model. We propose to use the posterior probability P(j/Y_i) of
individual i’s membership in group j, which can be computed with the help of
Bayes’ theorem as
(8)
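As a hedged sketch in standard finite mixture notation, the Bayes step generally takes the form below. The symbols Y_i (observed trajectory of individual i), P(Y_i/j) (likelihood of that trajectory under group j), and \pi_j (estimated share of group j in the population) are assumptions borrowed from the usual presentation of Nagin's model; they are not defined in this section.

```latex
P(j/Y_i) \;=\; \frac{P(Y_i/j)\,\pi_j}{\sum_{k=1}^{r} P(Y_i/k)\,\pi_k},
\qquad j = 1, \dots, r .
```

Under these assumptions, the r posterior probabilities of each individual are non-negative and sum to 1.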
To determine the optimal number of groups r, we maximize the posterior
probability criterion (PPC), defined for r > 1 by
(9)
This means that we choose the number of groups that best explains the
predicted group membership of the people in the dataset. A PPC of N means
that, for every person in the dataset, it is almost certain which group they
belong to. The big advantage of this criterion is that it gives a clear
result and requires only a minimal number of computations. Moreover, the
criterion is always computed as a sum of N terms and hence does not depend
on the number of parameters in the model. Therefore, there is no need for a
correction term.
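The following Python sketch illustrates how such a criterion might be evaluated once a model with r groups has been fitted. It rests on two labeled assumptions: the posterior probabilities follow the usual Bayes form for mixture models, and the PPC is taken to be the sum over individuals of their largest posterior probability. The second assumption is only one reading that is consistent with the properties stated above (a sum of N terms that equals N when every membership is certain); the exact definition is not recoverable from this section. The function names `posterior_probabilities` and `ppc`, and the toy numbers, are hypothetical.

```python
import numpy as np


def posterior_probabilities(log_lik, pi):
    """Posterior group-membership probabilities via Bayes' theorem.

    log_lik : (N, r) array, log-likelihood of individual i's data under group j
    pi      : (r,) array, estimated group shares (positive, summing to 1)
    """
    log_joint = np.asarray(log_lik, dtype=float) + np.log(np.asarray(pi, dtype=float))
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)


def ppc(posterior):
    """ASSUMED form of the criterion: sum of each individual's largest
    posterior probability. It is bounded above by N, the number of rows."""
    return float(np.asarray(posterior).max(axis=1).sum())


# Toy example: 3 individuals, 2 groups, equal group shares (made-up values).
lik = np.array([[0.9, 0.1],
                [0.2, 0.8],
                [0.5, 0.5]])
post = posterior_probabilities(np.log(lik), np.array([0.5, 0.5]))
score = ppc(post)
```

In practice one would compute this score for each candidate r > 1 and retain the value of r with the highest score; each evaluation needs only the already-fitted likelihoods and group shares, so no refitting is involved.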
6. Conclusion
In this article, we presented Nagin's finite mixture model and some of its
generalizations and showed some inherent shortcomings for possible
applications. We addressed these by proposing a new generalized finite
mixture model. A key characteristic is its ability to model nearly all kinds of
trajectories and to add covariates to the trajectories themselves in a nonlinear
way.
We illustrated these possibilities through a data example about salary
trajectories. We showed how to add a classical group membership predictor
variable to the trajectories as well as a time series that does not depend on
the subjects of the analysis but influences the shape of the trajectories in some
of the groups.
Finally, we proposed a new methodology for determining the optimal
number of groups in finite mixture models by introducing the posterior
probability criterion.
32 | ISI WSC 2019