Page 285 - Invited Paper Session (IPS) - Volume 1

IPS153 Christine B. et al.
                Until recently, estimates of migration flows were largely based on the
            responses provided by travellers on their passenger cards when crossing the
            border. The intention to stay in New Zealand (or leave New Zealand) for 12
            months or more, as reported by the traveller, was used to determine whether the
            border-crossing was a migrant crossing. This measure allowed New Zealand to
            produce some of the most timely migration statistics in the world. However,
            inter-censal discrepancies and analysis of total traveller in-flows and
            out-flows showed that the measure was generating inaccuracies in the migration
            estimates, particularly between 2001 and 2006, when net migration was likely
            under-estimated by approximately 50,000 over the five years (Stats NZ 2017b).
                This led to the development of an outcomes-based measure of migration,
            where, rather than a traveller’s stated intentions, the amount of time they spent
            in  or  out  of  New  Zealand  was  used  to  determine  migration  status.  Using
            passport  data  acquired  as  travellers  cross  the  New  Zealand  border,  a  travel
            history of almost every traveller is created. This gives us a longitudinal ‘register’
            of travel histories for individuals. By applying a classification rule to these travel
            histories, the migrant status of any given border-crossing can be classified. In
            New Zealand, the particular classification method is called the ‘12/16 month
            rule’ (Stats NZ 2017b). For example, an arrival who has been out of the country
            for at least 12 months in the 16 months prior to their border-crossing is
            classified as a migrant arrival if they spend 12 months or more in New Zealand
            over the 16 months following the border-crossing. This method estimates
            migration more accurately, as we are no longer reliant on travellers’
            self-reported intentions.
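
                As an illustration only, the arrival case of the 12/16 month rule described
            above can be sketched in Python. This is not the production method: the function
            names, the representation of a travel history as a date-sorted list of
            (date, direction) pairs, and the approximation of 12 and 16 months as 365 and 487
            days are all assumptions made for this sketch.

```python
from datetime import date, timedelta

# Assumptions for this sketch: months are approximated by day counts.
THRESHOLD_DAYS = 365  # roughly 12 months
WINDOW_DAYS = 487     # roughly 16 months


def days_in_nz(history, start, end):
    """Days spent in New Zealand during the half-open interval [start, end).

    `history` is a list of (date, direction) pairs sorted by date, where
    direction is "arrival" or "departure" (a hypothetical representation).
    """
    if not history:
        return 0
    # Before the first recorded crossing, a traveller whose first crossing is
    # a departure is assumed to have been in the country, and vice versa.
    in_nz = history[0][1] == "departure"
    total = 0
    cursor = start
    for when, direction in history:
        if when <= start:
            # Crossings at or before the window start only set the state.
            in_nz = direction == "arrival"
            continue
        if when >= end:
            break
        if in_nz:
            total += (when - cursor).days
        cursor = when
        in_nz = direction == "arrival"
    if in_nz:
        total += (end - cursor).days
    return total


def is_migrant_arrival(history, arrival_date):
    """Apply the 12/16 rule to one arrival in a traveller's history."""
    window = timedelta(days=WINDOW_DAYS)
    in_before = days_in_nz(history, arrival_date - window, arrival_date)
    out_before = WINDOW_DAYS - in_before
    in_after = days_in_nz(history, arrival_date, arrival_date + window)
    return out_before >= THRESHOLD_DAYS and in_after >= THRESHOLD_DAYS


# A hypothetical traveller: arrives, takes a one-month trip away, returns.
history = [
    (date(2015, 3, 1), "arrival"),
    (date(2015, 9, 1), "departure"),
    (date(2015, 10, 1), "arrival"),
]
first = is_migrant_arrival(history, date(2015, 3, 1))
second = is_migrant_arrival(history, date(2015, 10, 1))
```

                In this hypothetical history the first arrival satisfies both conditions,
            whereas the return from the one-month trip does not, because the traveller
            had not been out of the country for 12 of the preceding 16 months.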
                As with any administrative data, identity resolution is a challenge, especially
            given the number of border-crossings that we must resolve. Currently, our
            longitudinal administrative dataset extends back to 2013 and contains over 65
            million border-crossings that require identity resolution to identify which
            border-crossings were made by the same traveller. On average, 40,000
            border-crossings are added to this dataset each day. Within these
            border-crossings, we estimate that there are over 17 million unique travellers,
            based on data available on passport records since 2013. The resolution of these
            individuals provides the basis of a register of travellers.
                While the outcomes-based measure provides more accurate measures of
            migration, it requires a wait of 16 months before migration levels are known
            with  certainty.  Official  external  migration  estimates  have  been  published
            monthly, within a month of the reference date. To maintain timeliness with the
            12/16 method, we have recently developed a predictive classification model,
            which provides provisional estimates of migration for the latest 16 months.
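
                The distinction between final and provisional estimates can be sketched as
            follows. This is a minimal illustration, assuming that a border-crossing can be
            classified with certainty once 16 calendar months of follow-up data are
            available; the function names and the calendar-month arithmetic are assumptions
            of this sketch, not the published method.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month (an assumption
    of this sketch), so 31 January plus one month gives 28 or 29 February.
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def estimate_status(crossing_date, latest_data_date, follow_up_months=16):
    """Say whether a crossing can be classified from observed outcomes.

    "final": the full follow-up window has been observed, so the outcomes-based
    rule can be applied directly.
    "provisional": the window is incomplete, so a predictive model is needed.
    """
    if add_months(crossing_date, follow_up_months) <= latest_data_date:
        return "final"
    return "provisional"


latest = date(2018, 5, 15)  # hypothetical date of the most recent data
old_crossing = estimate_status(date(2017, 1, 15), latest)
recent_crossing = estimate_status(date(2018, 1, 15), latest)
```

                Here a crossing made 16 months before the latest data is treated as final,
            while a crossing made four months before it still requires a provisional,
            model-based classification.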
                This  is  a  difficult  prediction  problem  because  there  is  a  large  class
            imbalance between overall traveller numbers and migrants (who make up less
            than  2%  of  all  border-crossings).  We  have  employed  a  machine  learning
            approach,  learning  at  the  unit  record  level,  combined  with  a  multiple

                                                               274 | ISI WSC 2019