Page 142 - Invited Paper Session (IPS) - Volume 2

IPS192 Hukum C. et al.
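                  plug-in empirical predictor (EP) of the population count in area $i$ is obtained
                  by replacing $y_{ri}$ by its predicted value $\hat{\mu}_{ri} = E(y_{ri} \mid u_i)$ under the model (1) as

                      $\hat{y}_i^{EP} = y_{si} + \hat{\mu}_{ri} = y_{si} + (N_i - n_i)\,\hat{\pi}_i^{EP}$,                    (3)

                  with $\hat{\pi}_i^{EP} = \mathrm{expit}(x_i^T\hat{\beta} + q_i\hat{u})$ for binary data, where $q_i = (0, \dots, 1, \dots, 0)$ is a $1 \times D$
                  vector with 1 in the $i$-th position and $\hat{u} = (\hat{u}_1, \dots, \hat{u}_D)^T$. In SAE problems, the
                  sample size $n_i$ is often negligible relative to the population size $N_i$, in which case
                  $\hat{y}_i^{EP} \approx N_i\,\hat{\pi}_i^{EP}$. An estimate of the proportion in area $i$ is given by $\hat{\pi}_i^{EP}$.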
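As a rough numerical illustration of the plug-in predictor in (3) for binary data, the sketch below computes the predicted area counts from a sample count, the area sizes and a fitted linear predictor. The function names (`expit`, `plug_in_ep`) and all numbers are hypothetical and chosen only for illustration; in practice the fitted linear predictor would come from estimating model (1), which is not shown here.

```python
import numpy as np

def expit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))

def plug_in_ep(y_s, N, n, eta_hat):
    """Plug-in EP of the area counts, following the form of equation (3).

    y_s     : observed sample counts per area
    N, n    : population and sample sizes per area
    eta_hat : fitted linear predictor per area (fixed part plus
              predicted area effect), assumed to be already available
    """
    pi_hat = expit(eta_hat)              # predicted proportion per area
    y_hat = y_s + (N - n) * pi_hat       # sample count + predicted non-sample count
    return y_hat, pi_hat

# Hypothetical inputs for three areas (illustrative values only).
y_s = np.array([3.0, 10.0, 0.0])
n = np.array([20, 50, 10])
N = np.array([2000, 5000, 1500])
eta_hat = np.array([-1.5, -1.0, -2.5])

y_hat, pi_hat = plug_in_ep(y_s, N, n, eta_hat)

# When n is negligible relative to N, the predictor is close to N * pi_hat.
y_hat_approx = N * pi_hat
```

With sampling fractions of about 1% or less, as in these made-up inputs, the two versions of the predictor differ by well under 1% of the predicted count.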
                      We now introduce a spatially non-linear extension of an area level GLMM,
                  which we refer to as the spatially non-linear generalized linear mixed model
                  (SNLGLMM). We start by developing the nonparametric extension of the
                  GLMM, and then suggest a spatial extension of this model. Typically, the fixed
                  effect part of a GLMM is assumed to be linear. In practice, however, the
                  functional form of this relationship may be unknown or complicated.
                  Without loss of generality we restrict our
                  development to the case of a single covariate x and use nonparametric
                  regression modelling based on a P-spline approximation. The spatially
                  non-linear GLMM (SNLGLMM) is then of the form
                      $g(\mu_i) = \eta_i = m(x_i) + u_i$                                                  (4)

                  where $u_i \sim N(0, \sigma_u^2)$ is the area specific random effect and $\mu_i = E(y_i \mid x_i, u_i) =
                  h\{m(x_i) + u_i\}$. In particular, the spatially non-linear logistic-normal mixed
                  model and the spatially non-linear Poisson-normal mixed model for binary and
                  count data, respectively, are defined as $\mathrm{logit}(\pi_i) = \eta_i = m(x_i) + u_i$ with $\pi_i =
                  \mathrm{expit}\{m(x_i) + u_i\}$ and $\log(\mu_i) = \eta_i = m(x_i) + u_i$ with $\mu_i = \exp(\eta_i)$. The
                  function $m(x_i)$ in (4) is unknown, but can be approximated sufficiently well by
                  the P-spline approximation
                                                                   
                                                              
                                                                               
                      $m(x_i, \beta, \gamma) = \beta_0 + \beta_1 x_i + \dots + \beta_p x_i^p + \sum_{k=1}^{K} \gamma_k (x_i - \kappa_k)_+^p$                  (5)

                      Here $p$ is the degree of the spline, $(t)_+^p = t^p$ if $t > 0$ and 0 otherwise, $\kappa_k$
                  for $k = 1, \dots, K$ is a set of fixed constants called knots, $\beta = (\beta_0, \beta_1, \dots, \beta_p)^T$ is
                  the coefficient vector of the parametric portion of the model, $\gamma =
                  (\gamma_1, \dots, \gamma_K)^T$ is the vector of spline coefficients, $K$ is the number of spline knots,
                  and $\gamma_k \sim N(0, \sigma_\gamma^2)$; $k = 1, \dots, K$. Provided that the knot locations are sufficiently
                  spread out over the range of $x$ and $K$ is sufficiently large, the class of functions
                  defined by (5) can approximate most smooth functions. Ruppert et al. (2003,
                  chapter 5) suggest the use of a knot for every four observations, up to a
                  maximum of about 40 knots for a univariate application. This is usually done
                  by placing these knots at equally spaced quantiles of the distribution of the
                  covariate.
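The knot rule and the truncated power basis behind (5) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names (`choose_knots`, `truncated_power_basis`), the choice of quantile probabilities k/(K+1), and the coefficient values are all assumptions made for the example.

```python
import numpy as np

def choose_knots(x, max_knots=40):
    """One knot per four observations, capped at max_knots, placed at
    equally spaced quantiles of x (probabilities k/(K+1) are an assumption)."""
    x = np.asarray(x, dtype=float)
    K = max(1, min(len(x) // 4, max_knots))
    probs = np.arange(1, K + 1) / (K + 1)
    return np.quantile(x, probs)

def truncated_power_basis(x, knots, degree=1):
    """Return the two design matrices for a P-spline of the form (5).

    X holds the polynomial part (1, x, ..., x^p);
    Z holds the truncated power terms (x - knot)_+^p, one column per knot.
    """
    x = np.asarray(x, dtype=float)
    X = np.vander(x, N=degree + 1, increasing=True)
    diff = x[:, None] - np.asarray(knots, dtype=float)[None, :]
    Z = np.where(diff > 0, diff ** degree, 0.0)   # zero at or below each knot
    return X, Z

# Hypothetical covariate values for 20 areas.
x = np.linspace(0.0, 1.0, 20)
knots = choose_knots(x)                  # 20 // 4 = 5 knots
X, Z = truncated_power_basis(x, knots, degree=1)

# Hypothetical coefficients, used only to evaluate the spline once.
beta = np.array([0.5, 1.0])
gamma = np.array([0.2, -0.4, 0.3, -0.1, 0.2])
m = X @ beta + Z @ gamma                 # spline evaluated at each x
```

Writing the approximation as `X @ beta + Z @ gamma` makes the mixed-model reading visible: `beta` plays the role of fixed effects and `gamma` that of random spline coefficients.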
                      Note that the P-spline approximation consists of a linear combination of
                  appropriately chosen basis functions. For simplicity, the approximating
                                                                     129 | I S I   W S C   2 0 1 9