Page 328 - Special Topic Session (STS) - Volume 3
STS547 Maarten C. et al.
Four data sources are available, the population census and three administrative
registers, each with an ethnicity variable. Here we focus on
Māori ethnicity in a summarised binary form so that we have two mutually
exclusive categories: Māori (with or without other ethnicities) and non-Māori
(everyone else). Details of these sources and the procedures used to link them
are described in Section 2; perfect linkage is an essential assumption for DSE.
We then build up the estimation problem in Section 3, starting with two
registers, then moving to four registers, and finally considering the three
administrative sources without the census. Some conclusions are presented in
Section 4.
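For orientation, the classical two-list dual system estimator (the Lincoln-Petersen form) can be sketched as below. This is the textbook estimator under perfect linkage and independence of the two lists, not necessarily the model developed in Section 3; the function name and the counts in the usage note are hypothetical.

```python
def dual_system_estimate(n_both: int, n_only_a: int, n_only_b: int) -> float:
    """Textbook two-list dual system (Lincoln-Petersen) estimate.

    n_both   -- people linked in both lists
    n_only_a -- people found only in list A
    n_only_b -- people found only in list B

    Assumes perfect linkage and that inclusion in list A is
    independent of inclusion in list B (illustrative assumptions).
    """
    if n_both <= 0:
        raise ValueError("the estimator is undefined when no one is in both lists")
    n_a = n_both + n_only_a  # total observed in list A
    n_b = n_both + n_only_b  # total observed in list B
    return n_a * n_b / n_both
```

With made-up counts of 80 people in both lists, 20 only in list A and 10 only in list B, `dual_system_estimate(80, 20, 10)` gives 100 * 90 / 80 = 112.5, so roughly 2.5 people are estimated to be missed by both lists.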
2. Methodology
Because a person’s reported ethnicity can change over time and with
context, a key question is how to combine ethnicity from multiple sources
when the information sometimes conflicts. Reid, Bycroft, and
Gleisner (2016) compared ethnicity data from the 2013 Census with the
ethnicity information collected by administrative sources, for a New Zealand
resident population derived from administrative sources. They found that
nearly everyone in this admin-based New Zealand resident population had
ethnicity recorded in at least one administrative data source, but that
consistency with census responses varied considerably by source and by
ethnic group. The method used to combine these sources has a major impact
on the result. Under the assumption that census responses provide the best
measure for official statistics purposes, a method that ranks sources based on
their consistency with the census has been applied. Using administrative data
alone was found to produce a time series that reflects expected patterns of
increasing ethnic diversity, with age structure and regional distribution of
ethnicity consistently in line with official measures (Stats NZ, 2018). The
approach has some limitations, however. For example, it does not allow for
reporting errors or conflicts in higher-ranked sources, which may be better
managed through a statistical model.
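The rank-based rule described above can be sketched as follows: for each person, take the ethnicity value from the highest-ranked source that has one. The source labels, their order, the record layout and the function name are all hypothetical; the ranking actually applied is not given in this section.

```python
from typing import Mapping, Optional, Sequence


def ranked_ethnicity(
    record: Mapping[str, Optional[bool]],
    ranking: Sequence[str],
) -> Optional[bool]:
    """Return the Maori indicator from the highest-ranked source with a value.

    record  -- maps a source label to True (Maori), False (non-Maori),
               or None (no ethnicity recorded in that source)
    ranking -- source labels ordered from most to least consistent
               with the census (hypothetical order)
    """
    for source in ranking:
        value = record.get(source)
        if value is not None:
            # Lower-ranked sources are never consulted once a value is found.
            return value
    return None  # no source holds an ethnicity for this person


# Hypothetical ranking and record, for illustration only.
example_ranking = ["DIA", "MOH", "MOE"]
example_record = {"DIA": None, "MOH": True, "MOE": False}
example_result = ranked_ethnicity(example_record, example_ranking)
```

In this made-up record the MOH value is used and the conflicting MOE value is ignored, which illustrates the limitation noted above: a reporting error in a higher-ranked source cannot be corrected by the sources below it.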
The population used here is the experimental administrative-based NZ
resident population known as the ‘IDI-ERP’ (Stats NZ, 2017). The data are
probabilistically linked in Stats NZ’s Integrated Data Infrastructure (IDI). The
IDI provides safe access to de-identified linked microdata for research and
statistics in the public interest.
We use ethnicity data from the 2013 population census and from three
administrative sources:
(i) Department of Internal Affairs (DIA) birth registrations data, which
includes the ethnicity of the child as reported at registration
(ii) Ministry of Education (MOE) tertiary education enrolment data, which
includes ethnicity for students
(iii) Ministry of Health (MOH) National Health Index
317 | ISI WSC 2019