Page 78 - Special Topic Session (STS) - Volume 4
STS563 Davide Di Cecco et al.
Our proposal relies on the following assumption: all possible erroneous
captures are defined as random classification errors under a binary
classification model. That is, we hypothesize two subpopulations: one
comprising the out-of-scope units, and the other the in-scope units. Then a
two-component latent class model would adequately describe our data. To
model possible dependencies among captures of the same individual in
different sources, we relax the classic conditional independence assumption
of latent class models and assume a general log-linear model for the joint
distribution. To address the problem of subpopulations that are uncatchable
for some sources, we treat the uncatchable units as missing information and
develop an inferential approach to deal with missing data. This model has
been proposed in Di Cecco et al. (2018). Here we present a Bayesian approach
to estimate the size of the population, addressing the challenges listed above.
2. Methodology
Assume $k$ lists or capture occasions are available, and let $Y_i$ be the random
variable indicating whether a unit is included in the $i$-th list, $i = 1, \ldots, k$ (i.e., has
been captured on the $i$-th occasion):
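$$
Y_i = \begin{cases} 1 & \text{if the unit is captured in the } i\text{-th list;} \\ 0 & \text{otherwise.} \end{cases}
$$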
Let $Y = (Y_1, \ldots, Y_k)$ denote the capture profile of a unit, and let $\{P(Y = y) = p_y\}_{y \in \{0,1\}^k}$
be the associated probability distribution. Let $U^{(i)}$ be the set of units that are
catchable by list $i$, and let $U = \bigcup_i U^{(i)}$. Let $U_1$ be our target population, with
$U_1 \subset U$. The cardinality of $U$ is $N$, and that of $U_1$ is $N_1$. Let $X$ be the latent variable
identifying the units belonging to our target population:
$$
X = \begin{cases} 1 & \text{if the unit belongs to } U_1; \\ 0 & \text{otherwise.} \end{cases}
$$
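To make the notation concrete, the following is a minimal sketch that builds the capture profile of each observed unit from k = 3 lists and tabulates how many units share each profile, overall and by latent class. The unit identifiers, the contents of `lists`, the `latent` labels and the helper `capture_profile` are all hypothetical and introduced only for illustration. In real data the latent class X is unobserved, so the per-class counts could not be computed directly.

```python
from collections import Counter

# Hypothetical toy data: three lists (k = 3) of captured unit identifiers.
lists = [
    {"a", "b", "c", "d"},   # list 1
    {"b", "c", "e"},        # list 2
    {"c", "d", "e", "f"},   # list 3
]

# Hypothetical latent labels X (1 = unit belongs to the target population U_1).
# X is unobserved in practice; it is supplied here only to show the bookkeeping.
latent = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 1, "f": 0}


def capture_profile(unit, lists):
    """Return y = (y_1, ..., y_k), with y_i = 1 if `unit` appears in list i."""
    return tuple(int(unit in lst) for lst in lists)


# Units captured at least once; units with profile (0, ..., 0) never appear.
observed = set().union(*lists)
profiles = {u: capture_profile(u, lists) for u in observed}

# Count of observed units per capture profile y.
n_y = Counter(profiles.values())

# Count of observed units per (latent class x, capture profile y).
n_xy = Counter((latent[u], y) for u, y in profiles.items())

# Total number of observed units.
n_obs = sum(n_y.values())

# For every observed profile, the two latent classes partition its units.
assert all(n_xy[(0, y)] + n_xy[(1, y)] == n_y[y] for y in n_y)

print("n_obs =", n_obs)
for y in sorted(n_y):
    print(y, "total:", n_y[y], "in scope:", n_xy[(1, y)], "out of scope:", n_xy[(0, y)])
```

The all-zero profile never appears among the keys of `n_y`, which is why the number of uncaptured units, and hence the population size, has to be estimated rather than counted.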
Let $n_y$ be the number of units having capture profile $y$, of which $n_{x,y}$ belong to
the latent class $x$, so that $n_{0,y} + n_{1,y} = n_y$. The total number of observed units is
$n_{obs}$, while the units having capture history $y = \mathbf{0} = (0, \ldots, 0)$ are unobserved,
so that $\sum_{y \neq \mathbf{0}} n_y = n_{obs}$ and $N = n_{obs} + n_{\mathbf{0}}$. Note that $n_{1,\mathbf{0}}$ is the number of units in
$U_1$ that are not captured, while $n_{0,\mathbf{0}}$ is the number of uncaptured units that
are in $U$ but not in $U_1$. We are interested in estimating $N_1 = \sum_y n_{1,y}$. The latent
class model under the conditional independence assumption (CIA) can be
equivalently expressed as the mixture model
67 | ISI WSC 2019