Page 282 - Invited Paper Session (IPS) - Volume 1
IPS153 Christine B. et al.
Linking the census and administrative data
We link each 2018 Census respondent to their record in the IDI spine so
that we can remove them from the IDI-ERP, leaving only those who did not
respond. This requires both a high linkage rate and accurate links. Because New
Zealand does not have a common identifier, probabilistic linkage methods are
applied. The overall linkage rate of 97.7 percent is high in the NZ context.
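As an illustration of how probabilistic linkage can work in the absence of a common identifier, the following is a minimal sketch of Fellegi-Sunter style match weights, a standard formulation of probabilistic linkage. It is not necessarily the method used for the census-to-IDI linkage. The comparison fields, the m- and u-probabilities, and the two thresholds are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical (m, u) probabilities per comparison field; illustrative values only.
#   m = P(field agrees | the two records are the same person)
#   u = P(field agrees | the two records are different people)
FIELD_PARAMS = {
    "first_name": (0.95, 0.01),
    "last_name": (0.95, 0.005),
    "date_of_birth": (0.98, 0.001),
    "address": (0.80, 0.0005),
}


def match_weight(agreements):
    """Sum the log2 likelihood ratios over all compared fields.

    Agreement on a field adds log2(m / u); disagreement adds
    log2((1 - m) / (1 - u)), which is negative when m > u.
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = FIELD_PARAMS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Classify a record pair using two assumed thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


# Usage: names agree but date of birth and address do not.
pair = {"first_name": True, "last_name": True,
        "date_of_birth": False, "address": False}
print(classify(match_weight(pair)))
```

Raising the upper threshold reduces false positive links at the cost of more missed matches, which is the trade-off behind the two error rates reported for the linkage.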
Census respondents who have not been linked to the IDI spine are a mix of
those who:
• should have been matched to the IDI spine but were not (a missed or
‘false negative’ match)
• are not in the IDI spine (and therefore the non-match is correct)
The rate of missed matches is estimated at 1.4 percent. False
positive matches (where different people are incorrectly linked) are estimated
at less than 1 percent of the links made.
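The reported rates allow a rough split of the unlinked respondents into the two groups listed above. The text does not state the denominator of the missed-match rate, so the sketch below assumes that both the linkage rate and the missed-match rate are percentages of all census respondents. Under a different denominator the split would change.

```python
# Figures taken from the text.
linkage_rate = 97.7       # percent of census respondents linked to the IDI spine
missed_match_rate = 1.4   # percent; ASSUMED to share the same denominator

# Respondents not linked to the spine.
unlinked = round(100.0 - linkage_rate, 1)

# Unlinked respondents who are genuinely absent from the spine,
# under the same-denominator assumption.
correct_non_matches = round(unlinked - missed_match_rate, 1)

print(f"unlinked: {unlinked}%, of which missed: {missed_match_rate}%, "
      f"correct non-matches: {correct_non_matches}%")
```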
Admin enumerations in dwellings and households
The first and most demanding use of administrative data is the placement
of groups of people within a dwelling to form households. For private dwellings
where no census responses have been received, a statistical model has been
developed to predict which households constructed from admin records are
likely to have reliable data. The approach is based on methodology developed
by the US Bureau of the Census, which plans to use admin enumerations in the
non-response follow-up phase of its 2020 Census (US Bureau of the Census,
2017). The model produces a score that represents how reliably the
administrative data represent the entire household in a given dwelling.
Responding households from the 2018 Census are assumed to represent the truth
when training and assessing the model. The cut-off balances two criteria: a
strict one, under which the admin household must contain exactly the same
people as we observe in the census, and a looser one, which also includes
admin households that reflect similar adult-child patterns to the census, even
if we cannot guarantee that all household members are the same. Making the
trade-off in this way means we include relatively more large or complex
households than we would with a more conservative cut-off, and it makes some
allowance for errors in census responding households.
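The two criteria that the cut-off balances can be sketched as follows. This is an illustrative reading rather than the model itself: the `Person` record, the adult age threshold of 18, the interpretation of a "similar adult-child pattern" as equal counts of adults and children, and the `accept_admin_household` rule are all assumptions introduced for the example.

```python
from dataclasses import dataclass

ADULT_AGE = 18  # assumed threshold separating adults from children


@dataclass(frozen=True)
class Person:
    person_id: str  # hypothetical linked identifier
    age: int


def exact_match(admin, census):
    """Strict criterion: the admin household has exactly the same people."""
    return {p.person_id for p in admin} == {p.person_id for p in census}


def adult_child_pattern(household):
    """Return (number of adults, number of children) in a household."""
    adults = sum(1 for p in household if p.age >= ADULT_AGE)
    return adults, len(household) - adults


def pattern_match(admin, census):
    """Looser criterion: same adult-child composition, members may differ."""
    return adult_child_pattern(admin) == adult_child_pattern(census)


def accept_admin_household(score, cutoff):
    """Use the admin household only if its model score reaches the cut-off."""
    return score >= cutoff


# Usage: one adult differs between the two sources.
census = [Person("A", 40), Person("B", 38), Person("C", 10)]
admin = [Person("A", 40), Person("D", 37), Person("C", 10)]
print(exact_match(admin, census), pattern_match(admin, census))
```

In this example the admin household fails the strict criterion but satisfies the looser one. Treating such households as acceptable when assessing the model admits more large or complex households than the strict criterion would.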
While households where we have received some census responses may
still be missing people, we have not developed a model to predict when admin
records ostensibly for the same address should be placed within those
responding households.
271 | ISI WSC 2019