Page 322 - Special Topic Session (STS) - Volume 3

STS547 John D. et al.
                   We start by considering an SPD compiled from administrative sources. The
               SPD is compiled to the best of our abilities but is suspected of suffering from
               undercoverage as well as overcoverage. We now consider all relevant units
               (persons) $U$ as including persons in both the population and the SPD
               ($U_{11}$, of size $N_{11}$), persons in the SPD but not in the population
               ($U_{01}$, equating to overcoverage in the SPD of size $N_{01}$) and persons
               in the population but not in the SPD ($U_{10}$, or undercoverage of size
               $N_{10}$). $N_{11}$, $N_{01}$ and $N_{10}$ are unobserved, but
               $N_{+1} = N_{11} + N_{01}$, the size of the SPD, is observed. The objective
               is to estimate $N_{1+} = N_{10} + N_{11}$, the size of the target population.
                   First we consider every unit $i$ in the universe $U$ as a multinomial trial
               with probabilities $P(i \in U_{11}) = \theta_{11}$, $P(i \in U_{01}) = \theta_{01}$
               and $P(i \in U_{10}) = \theta_{10}$, with
               $\theta_{11} + \theta_{01} + \theta_{10} = 1$. Table 1 illustrates this
               relationship between the target population and the SPD.
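The multinomial view of the universe can be illustrated with a small simulation. This is only a sketch: the function name `simulate_universe`, the parameter names `theta_11`, `theta_01`, `theta_10` and the numeric values are hypothetical choices made for illustration, not quantities taken from the paper. Cell labels are read as (in population, in SPD).

```python
import random


def simulate_universe(n_units, theta_11, theta_01, theta_10, seed=0):
    """Assign each unit of U to cell 11, 01 or 10 by one multinomial trial.

    Cell labels are (in population, in SPD). All names and values here are
    illustrative assumptions, not taken from the source.
    """
    if abs(theta_11 + theta_01 + theta_10 - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    rng = random.Random(seed)
    cells = rng.choices(
        ["11", "01", "10"],
        weights=[theta_11, theta_01, theta_10],
        k=n_units,
    )
    return {label: cells.count(label) for label in ("11", "01", "10")}


# Hypothetical universe: 85% in both, 5% overcoverage, 10% undercoverage.
counts = simulate_universe(10_000, theta_11=0.85, theta_01=0.05, theta_10=0.10)

# Size of the SPD (observed in practice) and of the target population (unknown).
spd_size = counts["11"] + counts["01"]
pop_size = counts["11"] + counts["10"]
print(counts, spd_size, pop_size)
```

In a real application only `spd_size` would be available; the three cell counts and `pop_size` are visible here only because the universe is simulated.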
                   To estimate $N_{1+}$, the size of the target population, Graham and Lin
               (2019) propose sampling the target population with known sample inclusion
               probabilities and linking the sampled units to the administrative list in an
               error-free way. In practice, an area frame in conjunction with a well-maintained
               dwelling register will allow for sampling dwellings with a known inclusion
               probability. Known inclusion probabilities for individuals then require an
               assumption of no within-dwelling non-response. Various field procedures can
               be used to approximate this assumption as closely as possible. However, to
               simplify notation and explanation, we consider a simple random sample of
               individuals with a constant and known inclusion probability $\pi$. Table 2
               provides the corresponding cell probabilities for the relationship between the
               SPD and the sample in terms of $\pi$, which is assumed known, and the
               multinomial probabilities in Table 1. In practice, the underlying probability
               model (Table 2) is extended to include covariates such as age, sex, ethnicity
               and geography.
               We will use $n_{11}$, $n_{10}$, $n_{01}$, $n_{00}$ to denote the cell counts in
               the cross-tabulation of sample and list inclusion (i.e. the table of counts
               corresponding to Table 2), where $n_{11}$, $n_{10}$, $n_{01}$ are directly
               observed. We note that the count for the observed (0,1) cell in the sample-list
               union, $n_{01}$, contains a mix of people in the target population but not
               included in the sample and people genuinely not in the target population.
               Consequently, the inference is not a standard DSE problem, which deals only
               with undercoverage in the observed data.
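The mixed nature of the (0,1) cell can be made concrete with a short sketch. Table 2 is not reproduced in this section, so the cell probabilities below are an assumption derived from the stated setup (a simple random sample of the target population only, constant inclusion probability, error-free linkage); they are not copied from the paper. The names `sample_list_cell_probs`, `simulate_counts`, `theta_*` and `pi`, and all numeric values, are hypothetical. Sample-list cells are indexed (in sample, in list).

```python
import random


def sample_list_cell_probs(theta_11, theta_01, theta_10, pi):
    """Assumed cell probabilities for (in sample, in list).

    Derived under the assumption that only target-population units can be
    sampled, each with probability pi, and that linkage is error free.
    """
    if abs(theta_11 + theta_01 + theta_10 - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must be a probability")
    return {
        (1, 1): pi * theta_11,                    # sampled, on the list
        (1, 0): pi * theta_10,                    # sampled, missing from the list
        (0, 1): (1 - pi) * theta_11 + theta_01,   # on the list, not sampled
        (0, 0): (1 - pi) * theta_10,              # in neither source: unobserved
    }


def simulate_counts(n_units, theta_11, theta_01, theta_10, pi, seed=1):
    """Simulate the sample-list cross-tabulation for a hypothetical universe."""
    rng = random.Random(seed)
    counts = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    n01_in_population = 0  # part of the (0,1) cell that is truly in the population
    for _ in range(n_units):
        cell = rng.choices(
            ("11", "01", "10"), weights=(theta_11, theta_01, theta_10)
        )[0]
        in_population = cell[0] == "1"
        in_list = cell[1] == "1"
        in_sample = in_population and rng.random() < pi
        counts[(int(in_sample), int(in_list))] += 1
        if in_population and in_list and not in_sample:
            n01_in_population += 1
    return counts, n01_in_population


probs = sample_list_cell_probs(0.85, 0.05, 0.10, pi=0.02)
counts, n01_in_population = simulate_counts(10_000, 0.85, 0.05, 0.10, pi=0.02)

# The remainder of the (0,1) cell is overcoverage: on the list, not in the population.
n01_overcoverage = counts[(0, 1)] - n01_in_population
print(counts, n01_in_population, n01_overcoverage)
```

In observed data only the total `counts[(0, 1)]` is available; the split into `n01_in_population` and `n01_overcoverage` is known here only because the data are simulated, which is the reason the problem differs from standard DSE.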
                   Graham and Lin (2019) take a Bayesian approach to inference, which follows
               from the joint posterior distribution for
               $\phi = \theta_{10}/(\theta_{11} + \theta_{10})$ and $\theta_{01}$. The
               posterior distribution for the remaining cell probabilities can be easily
               obtained using $\theta_{11} = (1 - \theta_{01})(1 - \phi)$,
               $\theta_{10} = (1 - \theta_{01})\phi$. Given the posterior distribution for
               the cell probabilities, the posterior distribution for the total target
               population size can be obtained. Graham and Lin (2019) evaluate two methods
               for completing the target population unit record file. The first uses the
               estimated model probabilities and estimated $N_{1+}$ to impute

                                                                  311 | I S I   W S C   2 0 1 9