Page 335 - Special Topic Session (STS) - Volume 3

STS547 Daan Zult et al.

A linkage error correction model for population
size estimation with multiple sources
Daan Zult¹, Peter-Paul de Wolf¹, Bart Bakker¹,², Peter van der Heijden³
¹ Statistics Netherlands (see footnote 1)
² VU University
³ Utrecht University and University of Southampton

Abstract
We describe a new method for population size estimation when the linkage
of sources is subject to error. Our model is derived from a linkage error
correction model introduced by Ding and Fienberg (1994). They show how
to use linkage probabilities to correct the capture-recapture estimator for
linkage errors, but only in the case of two sources and no covariates. We
propose a generalisation that incorporates the Ding and Fienberg model into
the standard log-linear modelling approach used in multiple-recapture
estimation. We show how the method performs in a simulation study with
data that resemble real data.

Keywords
Multiple-recapture estimation; population size estimation; capture-recapture;
record linkage; linkage errors

1. Introduction
This paper is a summary of Zult et al. (2019), to which we refer for a more
extensive discussion of this topic. The size of a partly observed
population is often estimated with the capture-recapture (CR, for two
sources) or multiple-recapture (MR, for multiple sources) method. An
important assumption of these models is that records in different sources can
be identified such that it is known whether they belong to the same
unit or not, i.e. records can be perfectly linked between sources. This
assumption of perfect linkage is of particular relevance if identification is not
obtained by some perfect identifier (like a tag or id-code) but by indirect
identifiers (like name and address). In that case records are usually linked with
probabilistic linkage (see Fellegi and Sunter, 1969; Winkler, 1988; or Jaro, 1989)
and the perfect linkage assumption is often violated, which generally leads to
a biased population size estimate (PSE) (Gerritse et al., 2017).
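
To illustrate why linkage errors bias the PSE, the following is a minimal
sketch of the classical two-source (Lincoln-Petersen) capture-recapture
estimator, N_hat = n1 * n2 / m, where m is the number of linked records. It
is not the correction model of this paper. The function name and all counts
are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-source capture-recapture estimate: N_hat = n1 * n2 / m.

    n1, n2: number of records in source 1 and source 2
    m:      number of records linked between the two sources
    """
    if m <= 0:
        raise ValueError("at least one linked pair is needed")
    return n1 * n2 / m


# Hypothetical counts (assumed for illustration, not taken from the paper).
n1, n2 = 600, 500
true_matches = 300

# Perfect linkage: every true match is found and no false links are made.
pse_perfect = lincoln_petersen(n1, n2, true_matches)

# Missed links: assume 30 true matches are not found, so m is too small.
pse_missed = lincoln_petersen(n1, n2, true_matches - 30)

# False links: assume 30 non-matching pairs are wrongly linked, so m is too large.
pse_false = lincoln_petersen(n1, n2, true_matches + 30)

print(pse_perfect, round(pse_missed, 1), round(pse_false, 1))
```

In this sketch missed links push the estimate above the perfect-linkage
value and false links push it below, which is the kind of bias that a linkage
error correction aims to remove.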
A solution to this problem was provided by Ding and Fienberg (1994) (DF),
Di Consiglio and Tuoto (2015) (DC&T_15) and De Wolf et al. (2018) (DW).
These authors show how to use linkage probabilities to correct the capture -

1 The authors would like to thank Jan van der Laan from Statistics Netherlands for his review of the
final version of this paper.

324 | ISI WSC 2019