Page 338 - Special Topic Session (STS) - Volume 3
STS547 Daan Zult et al.
Equation (4) allows covariates to be included in the same way as in a
regular log-linear Poisson regression: the estimated cell counts $\hat{n}$
are split further into groups (e.g. male/female), and the corresponding
categorical covariate is added to the regression equation. We refer to this
extension of the D&F model as the weighted CR (WCR) model. Why it is
called 'weighted' will become clear in the next section.
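As a minimal sketch of what splitting the counts by a categorical covariate means, the Python snippet below stratifies two-source cell counts by a hypothetical sex covariate and estimates the unobserved cell in each group. It assumes no linkage errors and independent sources within each group, so it uses the closed-form fit of an independence log-linear model rather than the D&F model of equation (4). All names and counts are illustrative.

```python
# Illustrative observed counts per group; keys are (in source 1, in source 2).
# The (0, 0) cell (missed by both sources) is unobserved and must be estimated.
counts = {
    "male":   {(1, 1): 60, (1, 0): 40, (0, 1): 30},
    "female": {(1, 1): 50, (1, 0): 20, (0, 1): 25},
}


def estimate_missed(cells):
    """Fitted (0, 0) cell under an independence log-linear model.

    With three observed cells and three parameters the model fits exactly,
    which gives n00 = n10 * n01 / n11 (assumption: no linkage error).
    """
    return cells[(1, 0)] * cells[(0, 1)] / cells[(1, 1)]


def population_estimates(counts_by_group):
    """Estimated population size per covariate group."""
    return {
        group: sum(cells.values()) + estimate_missed(cells)
        for group, cells in counts_by_group.items()
    }


estimates = population_estimates(counts)
```

Fitting each group separately corresponds to a log-linear model in which the covariate interacts with every source term. A model with fewer interactions would need an iterative Poisson fit instead of this closed form.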
2.2 The weighted multiple-recapture model
In section 2.1 we showed how the D&F model can be written as a log-linear
Poisson regression model and how (categorical) covariates can be added to
this equation by splitting $\hat{n}$ up into smaller groups. After this
procedure we have both an estimated and an observed count for each cell.
Each cell count consists of records, so for each record we can calculate
its weighted contribution to its estimated cell count, i.e.:
$$w_i = \frac{\hat{n}_i}{n_i} \qquad (5)$$
where $\hat{n}_i$ and $n_i$ refer to the estimated and observed count of the
cell to which record $i$ belongs. E.g., when we ignore covariates and record
$i$ is linked between sources 1 and 2, $\hat{n}_i = \hat{n}_{11}$ and
$n_i = n_{11}$. Now $w_i$ is a record-level weight, and the weights of the
records in a cell sum to the corresponding element of $\hat{n}$. Adding up
$w_i$ over $i$ is similar to the case of no linkage errors, where each
record has a weight of 1 and the records are added up to obtain the (true
and observed) cell counts. However, when we want to extend the model such
that it can deal with multiple sources, we can write $w_i$ as:
$$w_{i,s} = w_{i,s-1} \frac{\hat{n}_{i,s}}{n_{i,s}} \qquad (6),$$
with $w_{i,s=0} = 1$ and $n_{i,s} = \sum w_{i,s-1}$, the sum of the previous
weights over the records in the cell of record $i$. Under equation (6),
$w_{i,s}$ is updated after every linkage procedure $s$, which can be
repeated for each new source.
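The weight update in equations (5) and (6) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `update_weights`, the cell labels, and the estimated cell counts are hypothetical, and in practice the estimated counts would come from the fitted model rather than being typed in.

```python
from collections import defaultdict


def update_weights(prev_weights, cells, estimated):
    """One linkage step of the record-weight update.

    prev_weights: weight of each record before this step (1.0 at the start)
    cells:        cell label of each record after this linkage step
    estimated:    dict mapping cell label -> estimated cell count
    """
    # Observed (weighted) cell count: sum of the previous weights per cell.
    observed = defaultdict(float)
    for w, cell in zip(prev_weights, cells):
        observed[cell] += w
    # Scale each record by estimated / observed count of its own cell.
    return [w * estimated[cell] / observed[cell]
            for w, cell in zip(prev_weights, cells)]


# Step 1: six records start with weight 1; labels encode source membership.
w0 = [1.0] * 6
cells_1 = ["11", "11", "11", "11", "10", "10"]
est_1 = {"11": 5.0, "10": 1.0}  # made-up estimated counts, not model output
w1 = update_weights(w0, cells_1, est_1)

# Step 2: a third source is linked, so the cells are refined.
cells_2 = ["111", "111", "110", "110", "101", "100"]
est_2 = {"111": 3.0, "110": 2.0, "101": 0.5, "100": 0.5}
w2 = update_weights(w1, cells_2, est_2)
```

With all starting weights equal to 1, the first step reduces to the single-linkage weight of equation (5). After each step the weights within a cell sum to that cell's estimated count.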
After the update of the record weights $w_{i,s}$, the elements of the
estimated cell count vector $\hat{n}$ can be calculated by summing $w_{i,s}$
over the records that belong to each cell, where $\hat{n}$ does not only
distinguish between sources 1 and 2 but may distinguish between any number
of sources and categorical covariates. The weighted multiple-recapture (WMR)
model can then be written as:
$$E[\hat{n}] = f(X, \beta) \qquad (7),$$
where $\hat{n}$ is the estimated cell count vector that depends on $X$,
which consists of an element carried over from step $s-1$ and $Z$, a set of
categorical covariates, according to some function $f(X, \beta)$ with
$\beta$ a parameter vector.
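The summation step that precedes equation (7), in which record weights are added up into estimated cell counts, might look like the Python sketch below. The record layout (a tuple of source indicators, a `sex` covariate, and a `weight`) is a hypothetical choice for illustration, and the weights are made-up values.

```python
from collections import defaultdict


def estimated_cell_counts(records):
    """Sum record weights per cell; a cell is (source pattern, covariate)."""
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["sources"], rec["sex"])] += rec["weight"]
    return dict(totals)


# Illustrative records after linking three sources; weights are made up.
records = [
    {"sources": (1, 1, 1), "sex": "male",   "weight": 1.5},
    {"sources": (1, 1, 1), "sex": "male",   "weight": 1.5},
    {"sources": (1, 1, 0), "sex": "female", "weight": 1.0},
    {"sources": (1, 0, 1), "sex": "female", "weight": 0.5},
    {"sources": (1, 0, 1), "sex": "female", "weight": 0.5},
]

n_hat = estimated_cell_counts(records)
```

Keying the cells on the full source pattern plus the covariate lets the same function serve any number of sources and covariate levels.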
327 | ISI WSC 2019