Page 157 - Contributed Paper Session (CPS) - Volume 4
CPS2164 Jonathan Hosking et al.
3. Model parameters and estimation
It is common for a state space model specification to involve unknown
parameters, particularly in the variance matrices Ht and Qt but also in the
transition matrices Zt, Tt and Rt. Estimation of these parameters is then required
for the model to be used in forecasting.
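A commonly used estimation method is maximum likelihood. The log-likelihood of an observed data set is (Durbin and Koopman, 2012, eq. (7.2))
\[ \log L = -\frac{np}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{n}\left(\log|F_t| + v_t' F_t^{-1} v_t\right), \tag{9} \]
where v_t is the one-step-ahead prediction error and F_t its variance matrix, both computed by the Kalman filter, n is the number of observations and p is the dimension of each observation.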
Estimation then requires the maximization of (9) with respect to the model
parameters. Numerical methods are generally necessary. Many numerical
optimization methods, including the popular BFGS method (Wikipedia, 2015)
and its variants, use information about the derivatives of the objective function.
Numerical derivatives, obtained by finite differences, can be used, but if
analytical expressions are available for the derivatives, the optimization can
often be greatly accelerated.
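As an illustration of the finite-difference approach, the following minimal sketch evaluates the log-likelihood for a scalar local level model (y_t = α_t + ε_t, α_{t+1} = α_t + η_t) and approximates its derivative with respect to the state disturbance variance by a central difference. The model choice, the function names, the data values and the non-diffuse initial conditions a1 = 0, P1 = 1 are assumptions made for this example only; they are not taken from the paper.

```python
import math


def local_level_loglik(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1.0):
    """Gaussian log-likelihood of a scalar local level model via the Kalman filter.

    sigma2_eps plays the role of H_t and sigma2_eta the role of Q_t;
    Z_t = T_t = R_t = 1. The initial state mean a1 and variance p1 are
    illustrative assumptions, not a diffuse initialisation.
    """
    a, p = a1, p1
    loglik = 0.0
    for obs in y:
        v = obs - a                       # one-step-ahead prediction error
        f = p + sigma2_eps                # prediction error variance
        k = p / f                         # Kalman gain
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        a = a + k * v                     # predicted state for the next period
        p = p * (1.0 - k) + sigma2_eta    # predicted state variance
    return loglik


def central_difference(fun, x, h=1e-6):
    """Central finite-difference approximation to the derivative of fun at x."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)


# Hypothetical data, used only to exercise the functions.
y = [4.4, 4.0, 3.5, 4.1, 4.6, 5.2, 4.9]
numerical_score = central_difference(
    lambda q: local_level_loglik(y, 1.0, q), 0.5
)
print(numerical_score)
```

Each central difference costs two complete passes of the filter per parameter, which is one reason analytical derivatives can be attractive when the number of parameters is large.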
Log-likelihood derivatives for state space models can be obtained in
several ways. For some classes of model, such as ARMA models in state space
form, derivatives can be computed using recursions adapted for the particular
model class (Ansley and Kohn, 1985; Melard, 1985). Consideration of a
complete-data likelihood, as though the states αt were observable, yields
explicit derivatives (Segal and Weinstein 1988, 1989; Koopman and Shephard,
1992), but in a simple form only for a restricted set of models for which the
matrices Ht, Qt, and Rt are invertible. Direct differentiation of (9) yields a
recursive procedure whose effect is to increase the number of quantities that
must be carried through the Kalman filter recursions. This approach is
described in more detail in Section 4.
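To make the idea of direct differentiation concrete, here is a minimal sketch for a scalar local level model in which the derivatives of the filter quantities with respect to the state disturbance variance are propagated alongside the quantities themselves. The model, the function name and the assumption that the initial conditions do not depend on the parameter are illustrative choices for this example; the sketch is not the general expression derived in the paper.

```python
import math


def loglik_and_score(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1.0):
    """Log-likelihood and its derivative with respect to sigma2_eta.

    Scalar local level model with Z_t = T_t = R_t = 1. The derivative
    recursions are obtained by differentiating each Kalman filter step.
    The initial conditions a1 and p1 are assumed not to depend on sigma2_eta.
    """
    a, p = a1, p1
    da, dp = 0.0, 0.0                     # derivatives of a_t and P_t
    loglik, score = 0.0, 0.0
    for obs in y:
        v = obs - a                       # prediction error
        f = p + sigma2_eps                # prediction error variance
        k = p / f                         # Kalman gain
        dv = -da                          # derivative of v_t
        df = dp                           # sigma2_eps is free of sigma2_eta
        dk = (dp * f - p * df) / (f * f)  # quotient rule for K_t = P_t / F_t
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        score -= 0.5 * (df / f + 2.0 * v * dv / f - v * v * df / (f * f))
        # Update the derivatives first, while a, p still hold period-t values.
        da = da + dk * v + k * dv
        dp = dp * (1.0 - k) - p * dk + 1.0
        a = a + k * v
        p = p * (1.0 - k) + sigma2_eta
    return loglik, score


# Hypothetical data, used only to exercise the function.
y = [4.4, 4.0, 3.5, 4.1, 4.6, 5.2, 4.9]
loglik, score = loglik_and_score(y, 1.0, 0.5)
print(loglik, score)
```

In this scalar case the filter carries two extra quantities (the derivatives of a_t and P_t) per parameter, which shows how the number of quantities carried through the recursions grows.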
What has hitherto been lacking is a single explicit expression for the log-likelihood
derivative that applies to all linear Gaussian state space models in
the form (1)–(3), without any restrictions on the system matrices. We have
derived such an expression, by extending previous authors’ treatments of the
direct log-likelihood derivative. Our result is included in Section 4 below, as eq.
(14).
4. Log-likelihood derivatives for the state space model
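Let θ be a scalar model parameter. We use the notation $\dot{X}$ to denote $\partial X/\partial\theta$.
For an invertible matrix X, we note that $\partial \log|X|/\partial\theta = \mathrm{tr}(X^{-1}\dot{X})$ and $\partial X^{-1}/\partial\theta = -X^{-1}\dot{X}X^{-1}$. Differentiating
(9) with respect to θ gives the derivative of the log-likelihood: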
146 | ISI WSC 2019