Page 78 - Special Topic Session (STS) - Volume 4

STS563 Davide Di Cecco et al.
                  Our  proposal  relies  on  the  following  assumption:  all  possible  erroneous
                  captures  are  defined  as  random  classification  errors  under  a  binary
                  classification  model.  That  is,  we  hypothesize  two  subpopulations:  one
                  comprising the out-of-scope units, and the other the in-scope units. Then a
                  two-component latent class model would adequately describe our data. To
                  model possible dependencies among captures of the same individual in
                  different sources, we relax the classic conditional independence assumption
                  of latent class models and assume a general log-linear model for the joint
                  distribution. To address the problem of subpopulations that are uncatchable
                  for some sources, we treat the uncatchable units as missing information and
                  develop an inferential approach to deal with missing data. This model has
                  been proposed in Di Cecco et al. (2018). Here we present a Bayesian approach
                  to estimate the size of the population, addressing the challenges listed above.

                  2.  Methodology
                      Assume $k$ lists or capture occasions are available, and let $Y_i$ be the random
                  variable indicating whether a unit is included in the $i$-th list, $i = 1, \ldots, k$ (i.e., has
                  been captured on the $i$-th occasion):

                  \[
                    Y_i =
                    \begin{cases}
                      1 & \text{if the unit is included in the $i$-th list;} \\
                      0 & \text{otherwise.}
                    \end{cases}
                  \]

                  Let $Y = (Y_1, \ldots, Y_k)$ denote the capture profile of a unit, and let
                  $\{P(Y = y) = p_y\}_{y \in \{0,1\}^k}$ be the associated probability distribution.
                  Let $U^{(i)}$ be the set of units that are catchable by list $i$, and let
                  $U = \bigcup_i U^{(i)}$. Let $U_1$ be our target population, with $U_1 \subset U$.
                  The cardinality of $U$ is $N$, and that of $U_1$ is $N_1$. Let $X$ be the latent
                  variable identifying the units belonging to our target population:

                  \[
                    X =
                    \begin{cases}
                      1 & \text{if the unit belongs to $U_1$;} \\
                      0 & \text{otherwise.}
                    \end{cases}
                  \]
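
                  To make the notation concrete, the following minimal Python sketch builds
                  capture profiles from list membership and tallies the units per profile.
                  The toy lists, the unit identifiers, the latent labels and the helper
                  name capture_profiles are all hypothetical and serve only as an
                  illustration; in practice the latent class X and the units with the
                  all-zero profile are not observed.

```python
from collections import Counter


def capture_profiles(lists, units):
    """Map each unit to its profile (Y_1, ..., Y_k), with Y_i = 1 if the unit is in list i."""
    return {u: tuple(int(u in lst) for lst in lists) for u in units}


# Hypothetical toy data: k = 3 lists of unit identifiers.
lists = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"}]

# Hypothetical latent class X for every unit in U (1 = in U_1, 0 = not in U_1).
# Unit "f" belongs to U but appears in no list, so it is never observed.
latent = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 1, "f": 1}

profiles = capture_profiles(lists, latent)
counts = Counter(profiles.values())   # number of units per capture profile

zero = (0,) * len(lists)              # the all-zero (unobserved) profile
n_obs = sum(n for y, n in counts.items() if y != zero)
N = len(latent)                       # size of U (known here only because the data are simulated)
N1 = sum(latent.values())             # size of the target population U_1

print(n_obs, counts[zero], N, N1)
```

                  In this toy setting the size of U equals the number of observed units plus
                  the count of the all-zero profile; with real data that last count is
                  unknown, which is what makes the estimation problem non-trivial.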

                  Let $n_y$ be the number of units having capture profile $y$, of which $n_{x,y}$
                  belong to latent class $x$, so that $n_{0,y} + n_{1,y} = n_y$. The total number of
                  observed units is $n_{obs}$, while the units having capture history
                  $y = \mathbf{0} = (0, \ldots, 0)$ are unobserved, so that
                  $\sum_{y \neq \mathbf{0}} n_y = n_{obs}$ and $N = n_{obs} + n_{\mathbf{0}}$. Note that
                  $n_{1,\mathbf{0}}$ is the number of units in $U_1$ that are not captured, while
                  $n_{0,\mathbf{0}}$ is the number of uncaptured units that are in $U$ but not in $U_1$.
                  We are interested in estimating $N_1 = \sum_y n_{1,y}$. The latent class model
                  under the conditional independence assumption (CIA) can be equivalently
                  expressed as the mixture model








                                                                      67 | I S I   W S C   2 0 1 9