Page 320 - Special Topic Session (STS) - Volume 3
STS547 John D. et al.
Every unit in the population has an equal chance of being captured in
list B.
Adding the further assumption that the event of a person being captured in
list B is independent of any other person being captured in that list, Zhang
and Dunne (2018) derive a variance estimator similar to that derived by Sekar
and Deming (1949) and presented in Bishop et al. (1975, page 233).
These assumptions are more relaxed than those presented by Wolter
(1986). This DSE model can be applied in many more scenarios where the
Wolter assumptions may not hold true. One scenario is where list A is derived
from administrative data sources.
The Irish PECADO project proposes a system in which the SPD, compiled
from the activity records in individual public administration systems, serves
as list A (size x), with heterogeneous capture rates, and a second
administrative list serves as list B (size n), satisfying the homogeneous
capture assumption. List B (DLD)
is composed of those persons applying for or renewing their driver licence in
a given year. In Ireland, drivers have to renew their licence at least every 10
years and are required to show that they are resident in the State. We assume
neither list has erroneous records and we also assume perfect linkage based
on official Identification Numbers. An erroneous record is a record that does
not relate to a person who should be included in the population. The
population estimate, $\hat{N}$, is compiled as $\hat{N} = nx/m$, where $m$ is
the size of the match between list A and list B. Post-stratification by single year
of age, gender and nationality group is also implemented to strengthen the
homogeneous capture assumption and provide population estimates by these
groups. An additional assumption of no undercoverage for those under 18
years of age in list A is also made as DLD has no coverage in this age group.
DLD is further validated as a suitable list B by swapping in a smaller list derived
from a survey (underpinned by homogeneous capture assumption) and
comparing results. TDSE methods are used to hunt for erroneous records.
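
The estimator described above can be sketched in a few lines of Python. This
is a minimal illustration only, assuming the estimate takes the form
n * x / m within each post-stratum and that the stratum estimates are summed
to give the total. The function names and all counts are hypothetical and are
not taken from the PECADO project.

```python
def dse_estimate(x, n, m):
    """Dual system estimate n * x / m.

    x: size of list A, n: size of list B, m: size of the match.
    """
    if m <= 0:
        raise ValueError("match size m must be positive")
    if m > min(x, n):
        raise ValueError("match size m cannot exceed either list size")
    return n * x / m


def post_stratified_dse(strata):
    """Apply the estimator within each post-stratum and sum the results.

    strata: dict mapping a stratum label to a tuple (x, n, m).
    Returns (per-stratum estimates, total estimate).
    """
    per_stratum = {label: dse_estimate(*counts) for label, counts in strata.items()}
    return per_stratum, sum(per_stratum.values())


# Hypothetical counts for two post-strata (illustrative only).
example_strata = {
    "stratum_1": (1000, 200, 160),
    "stratum_2": (800, 100, 80),
}
estimates, total = post_stratified_dse(example_strata)
print(estimates)  # {'stratum_1': 1250.0, 'stratum_2': 1000.0}
print(total)      # 2250.0
```

In practice the strata would be the single year of age, gender and
nationality group cells mentioned above; the two labels here are placeholders.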
The theory underpinning TDSE is based on the concept that if the
assumption of homogeneous capture holds, then when list A is trimmed of $k$
records to get a new (trimmed) list of size $x - k$, there should be no
significant difference between the untrimmed population estimate $\hat{N}$
and the population estimate after trimming, $\hat{N}_k$. The size of the match
between the trimmed list A and list B is $m - k_1$, where $k_1$ is the number
of records from the trimmed segment that now need to be removed from the
match between list A and list B. This provides the trimmed population
estimate, $\hat{N}_k = n(x - k)/(m - k_1)$.
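
As a rough illustration of the trimming step, the following Python sketch
assumes the trimmed estimate has the form n(x - k)/(m - k_1), with k records
trimmed from list A and k_1 of those records previously matched to list B.
The function name and the counts are hypothetical.

```python
def tdse_estimate(x, n, m, k, k1):
    """Trimmed dual system estimate n * (x - k) / (m - k1).

    x: size of list A, n: size of list B, m: size of the match,
    k: records trimmed from list A,
    k1: trimmed records that were in the match with list B.
    """
    if not 0 <= k1 <= k:
        raise ValueError("k1 must lie between 0 and k")
    if k > x:
        raise ValueError("cannot trim more records than list A contains")
    if k1 >= m:
        raise ValueError("trimmed match m - k1 must stay positive")
    return n * (x - k) / (m - k1)


# Hypothetical counts (illustrative only).
x, n, m = 1000, 200, 160
untrimmed = n * x / m                       # 1250.0

# Trimmed records match list B at the same rate as the rest of list A
# (8 / 50 = 160 / 1000), so the estimate is unchanged.
same_rate = tdse_estimate(x, n, m, k=50, k1=8)
print(untrimmed, same_rate)                 # 1250.0 1250.0

# None of the trimmed records matched list B, so the estimate drops.
no_matches = tdse_estimate(x, n, m, k=50, k1=0)
print(no_matches)                           # 1187.5
```

The first case corresponds to the situation described above, where trimming
makes no difference to the estimate. A clear shift after trimming, as in the
second case, is the kind of signal that would mark a segment of list A for
closer inspection.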
We use TDSE methods to evaluate suspect parts of list A for records that
are not part of the population. While in theory the SPD is designed to
exclude them, in practice errors in the processing of administrative data
sources may result in erroneous records being included in the SPD. To do
this we identify parts of the SPD where we suspect there may be erroneous
309 | ISI WSC 2019