Page 157 - Contributed Paper Session (CPS) - Volume 4

CPS2164 Jonathan Hosking et al.
            3.  Model parameters and estimation
                It is common for a state space model specification to involve unknown
            parameters,  particularly  in  the  variance  matrices  Ht  and  Qt  but  also  in  the
            transition matrices Zt, Tt and Rt. Estimation of these parameters is then required
            for the model to be used in forecasting.
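                A commonly used estimation method is maximum likelihood. The
            log-likelihood of an observed data set y1, …, yn is (Durbin and Koopman,
            2012, eq. (7.2))

            \[
              \log L = -\frac{np}{2}\log 2\pi
                       - \frac{1}{2}\sum_{t=1}^{n}\left(\log|F_t| + v_t' F_t^{-1} v_t\right), \tag{9}
            \]

            where vt and Ft are the one-step-ahead prediction errors and their variance
            matrices from the Kalman filter, and p is the dimension of yt.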
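As an illustration of how a log-likelihood of this kind is evaluated, the following minimal sketch runs a scalar Kalman filter for a local level model (yt = αt + εt, αt+1 = αt + ηt) and accumulates the prediction error terms. It assumes that (9) has the standard prediction error decomposition form of Durbin and Koopman (2012, eq. (7.2)). The function name `local_level_loglik`, the fixed initial values `a1` and `P1`, and the data are hypothetical choices made for this example, not part of the paper.

```python
import math

def local_level_loglik(y, H, Q, a1=0.0, P1=1.0):
    """Gaussian log-likelihood of a scalar local level model.

    H and Q are the observation and state disturbance variances.
    a1 and P1 are the (assumed known) mean and variance of the initial state.
    """
    a, P = a1, P1
    loglik = 0.0
    for yt in y:
        v = yt - a                      # one-step-ahead prediction error
        F = P + H                       # prediction error variance
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(F) + v * v / F)
        K = P / F                       # Kalman gain
        a = a + K * v                   # predicted state mean for the next step
        P = P * (1.0 - K) + Q           # predicted state variance for the next step
    return loglik

# Hypothetical data, used only to show the call.
example_value = local_level_loglik([4.4, 4.0, 3.5, 3.7], H=0.5, Q=0.2)
```

For two observations the result can be checked against the bivariate normal density implied by the model, which is a convenient sanity test before moving to larger models.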
                 Estimation then requires the maximization of (9) with respect to the model
            parameters.  Numerical  methods  are  generally  necessary.  Many  numerical
            optimization methods, including the popular BFGS method (Wikipedia, 2015)
            and its variants, use information about the derivatives of the objective function.
            Numerical  derivatives,  obtained  by  finite  differences,  can  be  used,  but  if
            analytical expressions are available for the derivatives, the optimization can
            often be greatly accelerated.
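The cost difference can be made concrete with a small sketch. A central-difference gradient needs two evaluations of the objective per parameter, and for a state space model each evaluation is a full pass of the Kalman filter. In the code below, `toy_objective` is a hypothetical stand-in for a log-likelihood, chosen only because its exact gradient is known. The evaluation counter is likewise an illustrative device.

```python
def central_gradient(f, x, h=1e-5):
    """Central-difference gradient of f at x.

    Returns the gradient and the number of objective evaluations used.
    """
    grad = []
    n_evals = 0
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
        n_evals += 2                    # two objective evaluations per parameter
    return grad, n_evals

def toy_objective(x):
    # Hypothetical concave objective standing in for a log-likelihood.
    return -(x[0] - 1.0) ** 2 - 2.0 * (x[1] + 3.0) ** 2

def toy_gradient(x):
    # Exact gradient of toy_objective, available in closed form.
    return [-2.0 * (x[0] - 1.0), -4.0 * (x[1] + 3.0)]

numeric, n_evals = central_gradient(toy_objective, [0.0, 0.0])
analytic = toy_gradient([0.0, 0.0])
```

With k parameters the numerical gradient costs 2k objective evaluations at every iteration of the optimizer, whereas an analytical gradient is obtained alongside a single evaluation.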
                 Log-likelihood  derivatives  for  state  space  models  can  be  obtained  in
            several ways. For some classes of model, such as ARMA models in state space
            form, derivatives can be computed using recursions adapted for the particular
            model  class  (Ansley  and  Kohn,  1985;  Melard,  1985).  Consideration  of  a
            complete-data  likelihood,  as  though  the  states  αt  were  observable,  yields
            explicit derivatives (Segal and Weinstein, 1988, 1989; Koopman and Shephard,
            1992), but in a simple form only for a restricted set of models for which the
            matrices  Ht,  Qt,  and  Rt  are  invertible.  Direct  differentiation  of  (9)  yields  a
            recursive procedure whose effect is to increase the number of quantities that
            must  be  carried  through  the  Kalman  filter  recursions.  This  approach  is
            described in more detail in Section 4.
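To illustrate what carrying extra quantities through the recursions means in the simplest case, the sketch below differentiates a scalar local level filter (yt = αt + εt, αt+1 = αt + ηt) line by line. Each filter quantity (a, P, v, F, K) gains a companion derivative (da, dP, dv, dF, dK). It assumes the prediction error form of the log-likelihood from Durbin and Koopman (2012, eq. (7.2)) and a fixed, parameter-free initial state. The function name and data are hypothetical, and this is not the general expression (14) of the paper.

```python
import math

def loglik_and_derivative(y, H, Q, dH, dQ, a1=0.0, P1=1.0):
    """Log-likelihood and its derivative for a scalar local level model.

    dH and dQ are the derivatives of H and Q with respect to the parameter.
    a1 and P1 are treated as known constants (an assumption of this sketch).
    """
    a, P = a1, P1
    da, dP = 0.0, 0.0
    loglik, dloglik = 0.0, 0.0
    for yt in y:
        v = yt - a
        dv = -da
        F = P + H
        dF = dP + dH
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(F) + v * v / F)
        dloglik -= 0.5 * (dF / F + 2.0 * v * dv / F - v * v * dF / (F * F))
        K = P / F
        dK = (dP * F - P * dF) / (F * F)
        # Right-hand sides use the values from before the update.
        a, da = a + K * v, da + dK * v + K * dv
        P, dP = P * (1.0 - K) + Q, dP * (1.0 - K) - P * dK + dQ
    return loglik, dloglik

# Hypothetical data; derivative with respect to Q, checked by central differences.
y = [4.4, 4.0, 3.5, 3.7, 4.1, 4.6, 5.0, 4.8]
H, Q, h = 0.5, 0.2, 1e-5
_, analytic = loglik_and_derivative(y, H, Q, dH=0.0, dQ=1.0)
up, _ = loglik_and_derivative(y, H, Q + h, 0.0, 0.0)
down, _ = loglik_and_derivative(y, H, Q - h, 0.0, 0.0)
numeric = (up - down) / (2.0 * h)
```

The filter roughly doubles in size per parameter, which is the price of this approach. In return, the derivative comes from one pass rather than from repeated re-evaluation of the log-likelihood.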
                 What has hitherto been lacking is a single explicit expression for the log-
            likelihood derivative that applies to all linear Gaussian state space models in
            the  form  (1)–(3),  without  any  restrictions  on  the  system  matrices.  We  have
            derived such an expression, by extending previous authors’ treatments of the
            direct log-likelihood derivative. Our result is included in Section 4 below, as eq.
            (14).

            4.  Log-likelihood derivatives for the state space model
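                Let θ be a scalar model parameter. We use the notation $\dot{X}$ to denote
            $\partial X/\partial\theta$. For an invertible matrix X, we note that
            $\partial \log|X|/\partial\theta = \mathrm{tr}(X^{-1}\dot{X})$ and
            $\partial X^{-1}/\partial\theta = -X^{-1}\dot{X}X^{-1}$. Differentiating
            (9) with respect to θ gives the derivative of the log-likelihood: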
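As a sketch of this step, assume that (9) has the prediction error decomposition form of Durbin and Koopman (2012, eq. (7.2)), with prediction errors v_t and symmetric prediction error variance matrices F_t (notation assumed here). Applying the identities $\partial \log|X|/\partial\theta = \mathrm{tr}(X^{-1}\dot{X})$ and $\partial X^{-1}/\partial\theta = -X^{-1}\dot{X}X^{-1}$ term by term then gives:

```latex
\begin{align*}
\log L &= -\frac{np}{2}\log 2\pi
          - \frac{1}{2}\sum_{t=1}^{n}\left(\log|F_t| + v_t' F_t^{-1} v_t\right), \\
\frac{\partial}{\partial\theta}\log|F_t| &= \mathrm{tr}\left(F_t^{-1}\dot{F}_t\right), \\
\frac{\partial}{\partial\theta}\left(v_t' F_t^{-1} v_t\right)
       &= 2\,\dot{v}_t' F_t^{-1} v_t - v_t' F_t^{-1}\dot{F}_t F_t^{-1} v_t, \\
\frac{\partial \log L}{\partial\theta}
       &= -\frac{1}{2}\sum_{t=1}^{n}\left[\mathrm{tr}\left(F_t^{-1}\dot{F}_t\right)
          + 2\,\dot{v}_t' F_t^{-1} v_t
          - v_t' F_t^{-1}\dot{F}_t F_t^{-1} v_t\right].
\end{align*}
```

The constant term drops out, and the two terms in $\dot{v}_t$ combine because F_t is symmetric. The derivatives $\dot{v}_t$ and $\dot{F}_t$ themselves still have to be obtained from the filter recursions.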





                                                               146 | I S I   W S C   2 0 1 9