Page 438 - Contributed Paper Session (CPS) - Volume 2

CPS1917 Trijya S.
exponential distributions. The two-component mixture of exponential
distributions is given by,

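$$f(x \mid p, \theta_1, \theta_2) = p \cdot \frac{1}{\theta_1} \cdot \exp\left(-\frac{x}{\theta_1}\right) + (1 - p) \cdot \frac{1}{\theta_2} \cdot \exp\left(-\frac{x}{\theta_2}\right), \tag{1}$$
where $x \geq 0$, $\theta_1, \theta_2 > 0$, $0 \leq p \leq 1$.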
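As an illustration of the density in (1), the following minimal Python sketch evaluates the mixture density and draws a sample from it. The function names `mixture_pdf` and `sample_mixture` are hypothetical and not from the paper. The sketch assumes the parametrization in which each component has mean $\theta_j$ (rate $1/\theta_j$) and $p$ is the weight of the first component.

```python
import math
import random


def mixture_pdf(x, p, theta1, theta2):
    """Density of the two-component exponential mixture, assuming each
    component is parametrized by its mean theta_j."""
    if x < 0:
        return 0.0
    first = (p / theta1) * math.exp(-x / theta1)
    second = ((1.0 - p) / theta2) * math.exp(-x / theta2)
    return first + second


def sample_mixture(n, p, theta1, theta2, rng):
    """Draw n values: choose component 1 with probability p, then draw an
    exponential variate whose mean is that component's theta."""
    draws = []
    for _ in range(n):
        theta = theta1 if rng.random() < p else theta2
        # random.expovariate takes the rate, i.e. 1 / mean.
        draws.append(rng.expovariate(1.0 / theta))
    return draws


if __name__ == "__main__":
    rng = random.Random(0)
    xs = sample_mixture(1000, 0.3, 1.0, 5.0, rng)
    print("sample mean:", sum(xs) / len(xs))
    print("density at x = 1:", mixture_pdf(1.0, 0.3, 1.0, 5.0))
```

Under this parametrization the mixture mean is $p\theta_1 + (1 - p)\theta_2$, so the printed sample mean should lie near that value for a large sample.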
With the availability of efficient optimization algorithms and easy access
to high-speed computers, the method of maximum likelihood has gained
popularity for estimating the parameters of mixture models.
Nevertheless, since parameters occur nonlinearly in the likelihood function,
several problems arise due to the rough surface of the likelihood and the
singularities therein. There could be instances where the likelihood function is
unbounded and maximum likelihood estimates (MLEs) may not exist. Kiefer
& Wolfowitz (1956) cited an example involving a mixture of two univariate
normal densities for which MLEs do not exist. Hosmer (1973, 1974) showed
through simulation that even for reasonable sample sizes and initial estimates,
the iterative sequence of MLEs does not converge to the particular values
associated with singularities. In the case of a mixture of two univariate normal
distributions, Hosmer (1973) asserted that if the sample size is small and
the component distributions are poorly separated, then maximum likelihood
estimates should be 'used with extreme caution or not at all'. Hosmer (1978)
demonstrated that estimates obtained by the method of moments and
the method of moment generating functions outperform maximum likelihood
estimates in such situations.
One of the iterative procedures popular for mixture models is the
non-derivative-based expectation-maximization (EM) algorithm.
The second category of iterative procedures includes derivative-based
algorithms such as the Newton-Raphson and Marquardt-Levenberg algorithms.
Whichever algorithm is used, it needs good initial estimates for fast
convergence to the global maximum. If the initial estimates are poor,
convergence may be slow, or the algorithm may converge to a local maximum.
It is also possible that, in some ill-conditioned situations, convergence
does not occur at all. Thus, we need at least two sets of good initial
estimates to ensure that convergence occurs to the same values, which
provide the global maximum.
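The role of initial estimates can be illustrated with a minimal EM sketch for the two-component exponential mixture, run from two different starting points. This is an illustrative implementation under stated assumptions, not the paper's code: the names `log_likelihood` and `em_fit`, the stopping rule, and the synthetic data are all hypothetical choices, and each component is assumed to be parametrized by its mean.

```python
import math
import random


def log_likelihood(xs, p, t1, t2):
    """Log-likelihood of the mixture with weight p and component means t1, t2."""
    return sum(
        math.log((p / t1) * math.exp(-x / t1) + ((1 - p) / t2) * math.exp(-x / t2))
        for x in xs
    )


def em_fit(xs, p, t1, t2, max_iter=500, tol=1e-8):
    """Run EM from the starting values (p, t1, t2).
    Returns the final (p, t1, t2, log-likelihood)."""
    prev = log_likelihood(xs, p, t1, t2)
    cur = prev
    n = len(xs)
    for _ in range(max_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in xs:
            a = (p / t1) * math.exp(-x / t1)
            b = ((1 - p) / t2) * math.exp(-x / t2)
            resp.append(a / (a + b))
        # M-step: weighted proportion and weighted means.
        s = sum(resp)
        p = s / n
        t1 = sum(r * x for r, x in zip(resp, xs)) / s
        t2 = sum((1 - r) * x for r, x in zip(resp, xs)) / (n - s)
        cur = log_likelihood(xs, p, t1, t2)
        # Stop when the log-likelihood gain falls below the tolerance.
        if cur - prev < tol:
            break
        prev = cur
    return p, t1, t2, cur


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data: weight 0.4 on mean 1, weight 0.6 on mean 10.
    data = [
        rng.expovariate(1.0) if rng.random() < 0.4 else rng.expovariate(0.1)
        for _ in range(500)
    ]
    # Two different sets of initial estimates; compare where each run ends.
    for start in [(0.5, 0.5, 5.0), (0.2, 2.0, 20.0)]:
        print(start, "->", em_fit(data, *start))
```

If the two runs end at clearly different log-likelihood values, at least one of them has probably stopped at a local maximum or has not yet converged. The sketch has no safeguards against degenerate cases such as a component weight collapsing to zero.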
For the model in (1), Rider (1961) proposed the method of moments for
estimating the parameters $p$, $\theta_1$ and $\theta_2$. Considering
$x_1, \ldots, x_n$ to be a random sample from the distribution in (1),
Rider equated the first three theoretical raw moments of (1) to the
corresponding sample raw moments $m_1$, $m_2$, and $m_3$, and after
extensive algebra, obtained the following quadratic equation,

$$6(2m_1^2 - m_2)\theta^2 + 2(m_3 - 3m_1 m_2)\theta + 3m_2^2 - 2m_1 m_3 = 0 \tag{2}$$
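The quadratic in (2) can be solved directly once the first three sample raw moments are available. The sketch below is an illustration only: the function names `sample_raw_moments` and `rider_roots_from_moments` are hypothetical, and the symbols `m1`, `m2`, `m3` stand for the sample raw moments. It returns `None` when no real roots exist.

```python
import math


def sample_raw_moments(xs):
    """First three sample raw moments: the averages of x, x^2 and x^3."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x ** 2 for x in xs) / n
    m3 = sum(x ** 3 for x in xs) / n
    return m1, m2, m3


def rider_roots_from_moments(m1, m2, m3):
    """Real roots of the quadratic in (2), in increasing order,
    or None if the quadratic is degenerate or has no real roots."""
    a = 6.0 * (2.0 * m1 ** 2 - m2)
    b = 2.0 * (m3 - 3.0 * m1 * m2)
    c = 3.0 * m2 ** 2 - 2.0 * m1 * m3
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return None
    root = math.sqrt(disc)
    return sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)))


if __name__ == "__main__":
    # Exact raw moments of the mixture with p = 0.3 and component means 1 and 5:
    # m1 = p*t1 + (1-p)*t2, m2 = 2*(p*t1^2 + (1-p)*t2^2), m3 = 6*(p*t1^3 + (1-p)*t2^3).
    print(rider_roots_from_moments(3.8, 35.6, 526.8))
```

With exact population moments the quadratic reduces to $\theta^2 - 6\theta + 5 = 0$ in this example, so the two roots recover the component means up to rounding. With sample moments the roots may be complex or negative, which is why the sketch checks the discriminant first.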
The two real roots $\hat{\theta}_1$ and $\hat{\theta}_2$ of equation (2), if they exist, will yield estimates
427 | ISI WSC 2019
