Page 283 - Invited Paper Session (IPS) - Volume 1

IPS153 Christine B. et al.
                Admin enumerations in meshblocks
                After deriving admin households for non-responding dwellings where
            possible, the next phase places admin people into meshblocks, the smallest
            New Zealand geographic areas. The admin address ID provides an x,y location
            coordinate, which is enough to assign a meshblock, but the person is not
            added to a census household.
                The remaining IDI-ERP population eligible for inclusion in meshblocks are
            those who have not been linked to a census respondent, and are not already
            included at a dwelling. We then explicitly adjust for potential coverage errors
            that could result from over-coverage in the IDI-ERP, and for duplication
            caused by missing linkages between the census and the IDI. We draw on new
            methods Stats NZ have developed to remove over-coverage and adjust for
            missed matches in the context of a new Dual System Estimation (DSE)
            population estimate based on the same linked census and IDI-ERP
            administrative population sources. DSE is a well-established methodology
            used in population estimation, often applied in the context of a full
            enumeration census and a coverage survey. In that setting, over-coverage is
            typically avoided through the design of the coverage sample and can be
            removed from the census, and extensive checks are made to meet the DSE
            assumption of no linkage error. Neither approach is possible in this situation.
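
For reference, the simplest two-list form of DSE can be sketched as follows. This is a minimal illustration of the standard Lincoln-Petersen estimator, not the Stats NZ implementation; the function name `dse_estimate` and all counts are hypothetical. It shows why both problems matter here: over-coverage inflates the admin count, missed links shrink the matched count, and either one pushes the estimate upwards.

```python
def dse_estimate(n_census: int, n_admin: int, n_matched: int) -> float:
    """Two-list dual system (Lincoln-Petersen) estimate of population size."""
    if n_matched <= 0:
        raise ValueError("n_matched must be positive")
    return n_census * n_admin / n_matched


# Hypothetical counts, chosen only to show the direction of each bias.
clean = dse_estimate(n_census=800, n_admin=900, n_matched=720)

# 50 extra admin records for people who are no longer resident (over-coverage).
with_over_coverage = dse_estimate(n_census=800, n_admin=950, n_matched=720)

# 20 true matches were not found during linking (missed links).
with_missed_links = dse_estimate(n_census=800, n_admin=900, n_matched=700)

print(clean, with_over_coverage, with_missed_links)
```

Both distorted scenarios give an estimate above the clean one, which is why the adjustments described next are applied before the admin records are used.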
                Removing over-coverage in the IDI-ERP: Our approach is designed to
            remove over-coverage caused by people selected in the IDI-ERP who are not
            NZ residents at the time of the census. As we cannot identify over-coverage
            directly, any rule that removes it will also remove some valid records. The
            goal is to remove as much over-coverage as possible while minimising the
            under-coverage introduced. We apply more rigorous selection criteria for
            inclusion: people must have activity in at least two data sources, namely
            tax or health plus one other. Tax and health datasets have high coverage of
            the population, and requiring an additional activity provides stronger evidence
            that people are in fact residents. This approach targets all age-sex groups,
            combines factors found to predict over-coverage in previous research, and is
            easy to apply and explain.
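
The inclusion rule described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the production rule: the source name "education" is a placeholder, the function name is hypothetical, and we assume that activity in both tax and health counts as two sources.

```python
PRIMARY_SOURCES = {"tax", "health"}  # the two sources named in the text


def passes_activity_rule(active_sources: set[str]) -> bool:
    """True if active in tax or health and in at least two sources overall."""
    has_primary = bool(active_sources & PRIMARY_SOURCES)
    return has_primary and len(active_sources) >= 2


# Hypothetical people mapped to the sources in which they show activity.
people = {
    "a": {"tax", "education"},   # primary source plus one other: kept
    "b": {"health"},             # only one source: removed
    "c": {"education"},          # no tax or health activity: removed
    "d": {"tax", "health"},      # assumed to count as two sources: kept
}

retained = sorted(pid for pid, src in people.items() if passes_activity_rule(src))
print(retained)
```

Person "b" illustrates the trade-off noted in the text: a genuine resident with activity in a single source would be removed along with the over-coverage.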
                Accounting for missed linkages: We also adjust for duplicates caused by
            missed matches when we link the census file and the IDI spine. If we have not
            linked a census record to the IDI spine when we should have, it will still be in
            the eligible IDI-ERP dataset, and would be counted again. We are matching
            around 4 million census records to an IDI spine consisting of nearly 10 million
            records. We do not at present have the time or resources for checking even a
            sample to estimate missed matches.
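
The duplication mechanism can be shown with a toy example. This is a sketch with made-up identifiers, not the actual linkage process: a census respondent whose link to the IDI spine is missed stays in the eligible admin pool and is counted a second time.

```python
def eligible_admin_ids(idi_erp_ids, linked_ids, placed_at_dwelling_ids):
    """IDI-ERP people with no census link who are not already placed at a dwelling."""
    return set(idi_erp_ids) - set(linked_ids) - set(placed_at_dwelling_ids)


def total_count(n_census, placed_at_dwelling_ids, eligible_ids):
    """Census respondents plus admin people added at dwellings and in meshblocks."""
    return n_census + len(placed_at_dwelling_ids) + len(eligible_ids)


# Hypothetical data: four residents, two of whom (p1, p2) responded to the census.
idi_erp = {"p1", "p2", "p3", "p4"}
placed = {"p3"}              # admin person already added to a dwelling
n_census = 2

all_links_found = {"p1", "p2"}
link_to_p2_missed = {"p1"}   # p2 responded, but the link was not made

correct = total_count(n_census, placed, eligible_admin_ids(idi_erp, all_links_found, placed))
inflated = total_count(n_census, placed, eligible_admin_ids(idi_erp, link_to_p2_missed, placed))
print(correct, inflated)
```

With every link found the count equals the true population of four; with one missed link, p2 is counted both as a census respondent and as an admin enumeration.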
                We estimate missed matches on the basis of responses to census questions
            that closely match the criteria for inclusion in the IDI spine. A subset of the
            census, M*, is created through applying strong requirements of membership in
            two or more of the datasets that make up the IDI spine (Choi, 2019). Those in

                                                               272 | I S I   W S C   2 0 1 9