Page 329 - Special Topic Session (STS) - Volume 3

STS547 Maarten C. et al.
                   system, a unified national person list - which includes ethnicity. For a more
                   detailed explanation of these sources, see Reid et al. (2016).
                   Each of the administrative sources relates to a different part of the
               population. Birth registrations cover babies born in NZ since 1998, that is,
               those aged up to 14 in 2013; tertiary education enrolments are available from
               around the late 1990s, and are mainly for those aged between 18 and 40 years
               in 2013; both census and health data include all ages, and each has an
               ethnicity value for around 90 percent of the IDI-ERP population. Overall,
               almost 99 percent of the IDI-ERP population have ethnicity information from
               at least one of these sources, and many people have information from more
               than one source.
                   The aim of the following analyses is to produce aggregate estimates of
               Māori and non-Māori ethnicity by combining these four independent sources:
               the 2013 Census and the three administrative sources.

               3.  Results
               3.1 Two registers
                   We first explain the methodology for two registers and then apply it to
               four registers. We start by using the two sources with the widest coverage, the
               Census and the MOH. Being in the Census is denoted by A (A = 1 for ‘yes’, A
               = 0 for ‘no’), and similarly for MOH, denoted by C. The ethnicity variable in the
               Census is denoted by a (a = 0 for non-Māori, a = 1 for Māori, a = ‘-’ for
               individuals who are in A but did not fill in their ethnicity, and a = ‘x’ for
               individuals who are not in A). The ethnicity variable in the MOH is denoted by
               c and coded in the same way as a. Compared with the methods employed by van
               der Heijden et al. (2018), the presence of the ‘-’ level in variables a and c is
               new, so we first extend these methods for the case of two registers.
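
The coding scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name code_ethnicity and the five toy records are hypothetical and are not taken from the Census or MOH data. The sketch shows how A, C, a and c would be coded for each individual and how the coded rows could be tabulated.

```python
from collections import Counter

def code_ethnicity(on_list, reported):
    """Code the ethnicity variable for one list (hypothetical helper).

    on_list  -- True if the individual appears on the list
    reported -- "Māori", "non-Māori", or None when ethnicity was left blank
    """
    if not on_list:
        return "x"   # not on the list, so ethnicity is not measured
    if reported is None:
        return "-"   # on the list, but the ethnicity item is missing
    return 1 if reported == "Māori" else 0

# Hypothetical toy records, invented for illustration only:
# (in Census, Census ethnicity, in MOH, MOH ethnicity)
records = [
    (True, "Māori", True, "Māori"),
    (True, "non-Māori", False, None),
    (True, None, True, "non-Māori"),
    (False, None, True, "Māori"),
    (False, None, True, None),
]

coded = []
for in_a, eth_a, in_c, eth_c in records:
    A, C = int(in_a), int(in_c)          # list membership indicators
    a = code_ethnicity(in_a, eth_a)      # Census ethnicity: 0, 1, '-' or 'x'
    c = code_ethnicity(in_c, eth_c)      # MOH ethnicity, coded like a
    coded.append((A, C, a, c))

# Tabulate the observed (A, C, a, c) combinations.
counts = Counter(coded)
for cell, n in sorted(counts.items(), key=str):
    print(cell, n)
```

Individuals with (A; C) = (0; 0) never appear in such a table, because they are on neither list; their number is what has to be estimated.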
                   Figure 1 illustrates the form of the data when they are coded in a matrix of
               individuals in the rows by variables in the columns. The middle two columns
               depict A and C, which indicate whether individuals are in A but not in C
               ((A; C) = (1; 0)), in both A and C ((A; C) = (1; 1)), or in C but not in A
               ((A; C) = (0; 1)). At the bottom is a horizontal band labelled ‘Individuals
               missed by both lists’, which refers to (A; C) = (0; 0). This last number has
               to be estimated to arrive
               at an estimate of the size of the total population of non-Māori and Māori. The
               first column stands for ethnicity variable a. When individuals are only in A ((A;
               C) = (1; 0)), there are three types of individuals, namely 0, non-Māori (light
               grey); 1, Māori (blocks); and ‘-’, those who have a missing value for ethnicity
               (raster). When individuals are in both A and C ((A; C) = (1; 1)), all three areas
               are found. When individuals are in C but not in A, the ethnicity variable a
               is by definition not measured and is denoted by ‘x’ (white area). The last
               column stands for ethnicity variable c, which has the same levels as a. Notice
               that there are three kinds of missing data: there is item missingness ‘-’ for
               those individuals that are on a list but did not provide their ethnicity; there
               is item




                                                                  318 | ISI WSC 2019