Page 70 - Special Topic Session (STS) - Volume 4

STS563 Patrick Graham et al.
                      Notice that the number of people in the (0,0) cell in Table 1, corresponding
                  to “not in the target population and not on the list”, is assumed to be 0. In fact,
                  most of the world’s population falls in this cell! However, we are not interested
                  in estimating the population of the world but that of some specific target
                  population, such as the usually resident population of New Zealand, and we
                  are seeking to use an administrative list for this purpose. For this problem, only
                  people in the target population, on the list, or in both are relevant. That is,
                  our conceptual starting point for estimation is the union of the target
                  population and the list (cf. Zhang (2015)). We let N_U denote the size of the
                  target-list union.
                      If a sample has been drawn from the target population with sample
                  inclusion probabilities λ(x), independently of list inclusion, and the sample is
                  linked to the list without error, cross-tabulation of sample and list produces a
                  2 x 2 table (at each setting of X) underpinned by the probabilities shown in
                  Table 3. For simplicity, we regard the λ(x) as known. In practice, λ(x) may need
                  to be estimated. From Table 3 it can be seen that the sampling process transfers
                  some people from the (1, 1) cell in the target-list union to the (0, 1) cell in the
                  sample-list union, and some people from the (1, 0) cell in the target-list union
                  to the (0, 0) cell in the sample-list joint distribution. The (0, 0) cell is, in
                  reality, not observable, and this needs to be accommodated in the analysis. An
                  important point is that Table 3 does not represent a traditional capture-recapture,
                  or dual-systems estimation (DSE), population estimation problem. Whereas the
                  latter involves two or more samplings from a target population, we have a single
                  sampling from the population, which is linked to a list that overlaps the target
                  population. The observed (0, 1) cell comprises a mix of people from the target
                  population who were not included in the sample and people genuinely not in the
                  target population. Traditional DSE methods cannot accommodate the latter group.
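
                  The transfer of people between cells described above can be sketched
                  numerically. This is a minimal illustration derived from the prose
                  description only (Table 3 itself is not reproduced here), not the authors'
                  code. The names pi11, pi10, pi01 and lam are assumptions introduced for the
                  example: they stand for the target-list union cell probabilities and the
                  known sample inclusion probability at one covariate setting. Cells are
                  indexed as (sample indicator, list indicator).

```python
def sample_list_cell_probs(pi11, pi10, pi01, lam):
    """Map target-list union cell probabilities to sample-list cell probabilities.

    Assumptions (hypothetical names, not from the paper):
      pi11: in target population and on list
      pi10: in target population, not on list
      pi01: not in target population, on list
      lam:  sample inclusion probability for target population members,
            independent of list inclusion; linkage is assumed error-free.
    """
    if abs(pi11 + pi10 + pi01 - 1.0) > 1e-9:
        raise ValueError("target-list union cell probabilities must sum to 1")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return {
        # Sampled and on the list: target-list (1, 1) members who were sampled.
        (1, 1): lam * pi11,
        # Sampled and not on the list: target-list (1, 0) members who were sampled.
        (1, 0): lam * pi10,
        # Not sampled but on the list: genuine non-members of the target
        # population plus unsampled (1, 1) members.
        (0, 1): pi01 + (1.0 - lam) * pi11,
        # Neither sampled nor on the list: unsampled (1, 0) members (unobservable).
        (0, 0): (1.0 - lam) * pi10,
    }


# Illustrative values only.
probs = sample_list_cell_probs(pi11=0.8, pi10=0.1, pi01=0.1, lam=0.05)
```

                  With these made-up inputs most of the probability mass sits in the (0, 1)
                  cell, which mixes unsampled target population members with people who are
                  not in the target population at all.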

                  3.  Inference.  We base inference on the posterior predictive distribution of a
                  corrected list from which individuals not in the target population have been
                  removed and to which the target population members missed by the list have
                  been added. If we can generate corrected lists from this distribution, then for
                  each draw we can obtain population counts for all cells of interest by simple
                  tabulation. The tabulations obtained by repeating this for each simulated
                  corrected list represent a sample from the joint posterior distribution of the
                  cell counts. Summaries of this distribution, such as the median, other quantiles,
                  and approximate credible intervals, can be obtained straightforwardly.
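
                  The tabulate-then-summarise step can be sketched as follows. This is a
                  hedged illustration, not the authors' implementation: it assumes each
                  simulated corrected list is simply a sequence of cell labels, one per
                  individual, and the function names, the labels "A" and "B", and the three
                  tiny lists are all made up for the example. How the corrected lists are
                  generated is outside the scope of the sketch.

```python
from collections import Counter
from statistics import median


def tabulate(corrected_list, cells):
    """Count how many individuals on one corrected list fall in each cell."""
    counts = Counter(corrected_list)
    return [counts[cell] for cell in cells]


def quantile(values, q):
    """Linearly interpolated sample quantile for 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def summarise(corrected_lists, cells, level=0.95):
    """Summarise cell counts across simulated corrected lists.

    Each tabulation is treated as one draw of the cell counts; the summary
    reports the median and an equal-tailed interval at the given level.
    """
    tables = [tabulate(cl, cells) for cl in corrected_lists]
    alpha = (1.0 - level) / 2.0
    summary = {}
    for j, cell in enumerate(cells):
        draws = [table[j] for table in tables]
        summary[cell] = {
            "median": median(draws),
            "lower": quantile(draws, alpha),
            "upper": quantile(draws, 1.0 - alpha),
        }
    return summary


# Three tiny, made-up corrected lists; real use would involve many draws.
simulated_lists = [["A", "A", "B"], ["A", "B", "B"], ["A", "A", "A"]]
result = summarise(simulated_lists, cells=["A", "B"])
```

                  Because every summary is computed from the same set of tabulations, the
                  same approach extends to any function of the cell counts, such as totals
                  over groups of cells.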
                      Introducing the notation $C$ to denote the cell location for an individual in
                  the target-list union and $\tilde{C}$ to denote the cell location in the sample-list
                  union, letting $\theta = (\beta, \phi)$, where $\beta$ denotes the vector of parameters
                  for the models for $\psi(x)$ and $\pi_{01}(x)$, letting
                  $\pi(x) = (\pi_{11}(x), \pi_{10}(x), \pi_{01}(x))$ denote the
                  vector of cell probabilities at covariate setting x, and assuming the covariate
                                                                      59 | I S I   W S C   2 0 1 9