Page 141 - Invited Paper Session (IPS) - Volume 2

IPS192 Hukum C. et al.
            we use a subscript of $i$ to index those quantities associated with area $i$. In
            particular, $n_i$ and $N_i$ are used to represent the sample and population sizes in
            area $i$, respectively. We also assume that the underlying unit level variable of
            interest $y$ is discrete, and in particular is either a binary value or a
            non-negative integer, and the aim is to estimate the corresponding small area
            population proportions or population totals (i.e. counts). Let the total of $y$ in
            area $i$ be denoted $y_i$, and let $y_{si}$ and $y_{ri}$ denote the corresponding
            sample and non-sample counts for area $i$, respectively. We shall assume that area
            level auxiliary information from secondary data sources, e.g., Census and
            Administrative records, is available. Let $\mathbf{x}_i$ be the p-vector of these
            covariates for area $i$ from these sources. The area level version of the GLMM is
            then defined as $\Pr(y_i \mid u_i) \propto \mu_i$, where

            $$g(\mu_i) = \eta_i = \mathbf{x}_i^{T}\boldsymbol{\beta} + u_i \tag{1}$$

            where $g(\cdot)$ is a known function, called the link function,
            $\mu_i = g^{-1}(\eta_i)$, $\boldsymbol{\beta}$ is the p-vector of regression
            coefficients, often referred to as the fixed effect parameter of the GLMM, and
            $u_i \sim N(0, \sigma_u^2)$. The model (1) can be used to relate the area level
            direct survey estimates to area level covariates. This type of model is often
            referred to as an 'area-level' model in SAE (Fay and Herriot, 1979).
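As an illustration of how model (1) maps area level covariates and random effects to area level means, the following minimal sketch evaluates the linear predictor and inverts the link for a handful of simulated areas. It is not taken from the paper: the number of areas `m`, the covariate values, `beta`, `sigma_u` and the helper names are all hypothetical, and the logit link is assumed here only as one possible choice of $g(\cdot)$.

```python
import numpy as np

def linear_predictor(X, beta, u):
    """Return eta_i = x_i^T beta + u_i for every area i, as in model (1)."""
    return X @ beta + u

def expit(eta):
    """Inverse of the (assumed) logit link: exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

rng = np.random.default_rng(0)          # fixed seed so the sketch is reproducible
m, p = 5, 2                             # hypothetical number of areas and covariates
X = np.column_stack([np.ones(m), rng.normal(size=m)])  # made-up area level covariates
beta = np.array([-1.0, 0.5])            # made-up fixed effect parameter
sigma_u = 0.3                           # made-up random effect standard deviation
u = rng.normal(0.0, sigma_u, size=m)    # u_i ~ N(0, sigma_u^2)

eta = linear_predictor(X, beta, u)      # one linear predictor per area
mu = expit(eta)                         # mu_i = g^{-1}(eta_i) under the assumed logit link
```

Applying the link to `mu` recovers `eta`, which is the relation $g(\mu_i) = \eta_i$ stated in (1); a different link would only change the `expit` helper.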
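            Collecting the area level models (1), we can write the model (1) as

            $$g(\boldsymbol{\mu}) = \boldsymbol{\eta} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}, \tag{2}$$

            where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_m)^{T}$,
            $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_m)^{T}$ is an $m \times p$ matrix and
            $\mathbf{u} = (u_1, \ldots, u_m)^{T}$ is an $m \times 1$ vector of area random effects
            which is normally distributed with mean zero and variance
            $\Sigma_u = \sigma_u^2 \mathbf{I}_m$. Here, $\mathbf{I}_m$ is an identity matrix of
            order $m$. When the variable of interest $y$ is binary, and unit level values in
            area $i$ are independently and identically distributed, the sample count $y_{si}$
            in area $i$ has a Binomial distribution with parameters $n_i$ and $\pi_i$, denoted
            by $y_{si} \sim \mathrm{Binomial}(n_i, \pi_i)$, where $\pi_i$ is now the probability
            of occurrence of an event or probability of prevalence in area $i$, often referred
            to as the probability of a 'success'. Similarly, the non-sample count $y_{ri}$ in
            area $i$ is such that $y_{ri} \sim \mathrm{Binomial}(N_i - n_i, \pi_i)$. That is, the
            counts $y_{si}$ and $y_{ri}$ are independent Binomial variables with $\pi_i$ then
            corresponding to a common success probability. In this case, the link function
            $g(\cdot)$ is usually taken to be the logit of the probability $\pi_i$. The model (1)
            linking $\pi_i$ with the covariates $\mathbf{x}_i$ is then the GLMM with logistic
            link function given by
            $\mathrm{logit}(\pi_i) = \ln\{\pi_i (1 - \pi_i)^{-1}\} = \eta_i = \mathbf{x}_i^{T}\boldsymbol{\beta} + u_i$,
            with
            $\pi_i = \exp(\eta_i)\{1 + \exp(\eta_i)\}^{-1} = \mathrm{expit}(\eta_i) = \mathrm{expit}(\mathbf{x}_i^{T}\boldsymbol{\beta} + u_i)$
            and $u_i \sim N(0, \sigma_u^2)$. Here,
            $y_{si} \mid u_i \sim \mathrm{Binomial}(n_i, \mathrm{expit}(\mathbf{x}_i^{T}\boldsymbol{\beta} + u_i))$
            and
            $y_{ri} \mid u_i \sim \mathrm{Binomial}(N_i - n_i, \mathrm{expit}(\mathbf{x}_i^{T}\boldsymbol{\beta} + u_i))$.
            The expected values of $y_{si}$ and $y_{ri}$ given $u_i$ are then
            $\mu_{si} = E(y_{si} \mid u_i) = n_i\,\mathrm{expit}(\mathbf{x}_i^{T}\boldsymbol{\beta} + u_i)$
            and
            $\mu_{ri} = E(y_{ri} \mid u_i) = (N_i - n_i)\,\mathrm{expit}(\mathbf{x}_i^{T}\boldsymbol{\beta} + u_i)$.
            The population count in area $i$ can be expressed as $y_i = y_{si} + y_{ri}$, where
            the first term $y_{si}$, the sample count, is known whereas the second term
            $y_{ri}$, the non-sample count, is unknown. A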
                                                               128 | ISI WSC 2019