Page 329 - Special Topic Session (STS) - Volume 3
STS547 Maarten C. et al.
system, a unified national person list - which includes ethnicity. For a more
detailed explanation of these sources, see Reid et al. (2016).
Each of the administrative sources relates to different parts of the
population. Birth registrations are for babies born in NZ since 1998, or those
up to age 14 in 2013; tertiary education enrolments are available from around
the late 1990s, and are mainly for those aged between 18 and 40 years in 2013;
both census and health data include all ages, and each has an ethnicity value
for around 90 percent of the IDI-ERP population. Overall, almost 99 percent of the
IDI-ERP population have ethnicity information from at least one of these
sources, and many people have information from more than one source.
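To make the distinction between per-source coverage and "at least one source" coverage concrete, the following minimal sketch computes both on a toy table. The five records, the source labels, and the function names are hypothetical and invented purely for illustration; they are not IDI data and do not reproduce the percentages quoted above.

```python
SOURCES = ("census", "health", "births", "tertiary")

# Hypothetical toy records (not IDI data). For each source, None means the
# source holds no ethnicity value for that person; 1 = Māori, 0 = non-Māori.
records = [
    {"census": 1, "health": 1, "births": None, "tertiary": None},
    {"census": 0, "health": None, "births": None, "tertiary": 0},
    {"census": None, "health": 0, "births": 0, "tertiary": None},
    {"census": None, "health": None, "births": None, "tertiary": None},
    {"census": 1, "health": 0, "births": None, "tertiary": None},
]


def n_sources_with_value(record):
    """Count the sources that hold an ethnicity value for one person."""
    return sum(record[s] is not None for s in SOURCES)


def coverage_summary(records):
    """Return per-source coverage and the shares of people with an
    ethnicity value in at least one source and in more than one source."""
    n = len(records)
    per_source = {
        s: sum(r[s] is not None for r in records) / n for s in SOURCES
    }
    at_least_one = sum(n_sources_with_value(r) >= 1 for r in records) / n
    more_than_one = sum(n_sources_with_value(r) > 1 for r in records) / n
    return per_source, at_least_one, more_than_one


per_source, at_least_one, more_than_one = coverage_summary(records)
```

In this toy table no single source covers more than three of the five people, yet four of the five have a value from at least one source, which mirrors how the combined coverage can exceed the coverage of any individual source.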
The aim of the following analyses is to produce aggregate estimates of
Māori and non-Māori ethnicity by combining these four independent sources:
the 2013 Census and the three administrative sources.
3. Results
3.1 Two registers
We first explain the methodology for two registers and then apply it to
four registers. We start by using the two sources with the widest coverage, the
Census and the MOH. Being in the Census is denoted by A (A = 1 for ‘yes’, A
= 0 for ‘no’), and similarly for MOH, denoted by C. The ethnicity variable in the
Census is denoted by a (a = 0 for non-Māori, a = 1 for Māori, a = ‘-’ for
individuals who are in A but did not fill in their ethnicity, and a = ‘x’ for
individuals that are not in A). The ethnicity variable in the MOH is denoted by
c and coded similarly to a. Compared with the methods employed by van
der Heijden et al. (2018), the presence of the ‘-’ level in variables a and c is
new, so we first extend these methods for the case of two registers.
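The coding of A, C, a and c described above can be sketched as follows. This is an illustrative sketch only: the function names, the input representation (a pair of "is the person on the list" and "recorded ethnicity, or None if not filled in"), and the five example individuals are assumptions made for this example and do not come from the Census or MOH data.

```python
from collections import Counter


def code_register(in_register, is_maori):
    """Code one register for one person.

    Returns (indicator, ethnicity code), where the ethnicity code is
    1 for Māori, 0 for non-Māori, '-' if the person is on the list but
    ethnicity was not filled in, and 'x' if the person is not on the list.
    """
    if not in_register:
        return 0, "x"
    if is_maori is None:
        return 1, "-"
    return 1, int(is_maori)


def code_person(census, moh):
    """Return the tuple (A, C, a, c) for one person."""
    A, a = code_register(*census)
    C, c = code_register(*moh)
    return A, C, a, c


# Hypothetical individuals: each entry is (census, moh), and each register
# is given as (in_register, is_maori), with None for a missing ethnicity.
people = [
    ((True, True), (True, True)),    # on both lists, Māori on both
    ((True, None), (False, None)),   # Census only, ethnicity not filled in
    ((False, None), (True, False)),  # MOH only, non-Māori
    ((True, False), (True, None)),   # both lists, MOH ethnicity missing
    ((True, True), (True, True)),    # on both lists, Māori on both
]

# Tabulate how often each observed (A, C, a, c) pattern occurs.
pattern_counts = Counter(code_person(census, moh) for census, moh in people)
```

Every observed individual is on at least one list, so no tabulated pattern can have A = 0 and C = 0; a person absent from a register always receives 'x' for that register's ethnicity variable.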
Figure 1 illustrates the form of the data when they are coded in a matrix of
individuals in the rows by variables in the columns. The middle two columns
depict A and C, which indicate whether individuals are in A but not in C ((A;
C) = (1; 0)), in both A and C ((A; C) = (1; 1)), or in C but not in A ((A; C) =
(0; 1)). At the bottom is a horizontal band of ‘Individuals missed by both lists’,
which refers to (A; C) = (0; 0). The size of this last group has to be estimated to arrive
at an estimate of the size of the total population of non-Māori and Māori. The
first column stands for ethnicity variable a. When individuals are only in A ((A;
C) = (1; 0)), there are three types of individuals, namely 0, non-Māori (light
grey); 1, Māori (blocks); and ‘-’, those who have a missing value for ethnicity
(raster). When individuals are in both A and C ((A; C) = (1; 1)), the same three
types are found. When individuals are in C but not in A, the ethnicity variable a
is automatically not measured and is denoted by ‘x’ (white area). The last column
stands for ethnicity variable c, whose levels are analogous to those of a. Notice that there
are three kinds of missing data: there is item missingness ‘-’ for those
individuals that are on a list but did not provide their ethnicity; there is item
318 | ISI WSC 2019