Page 171 - Contributed Paper Session (CPS) - Volume 3

CPS1983 Chong N. et al.
            policymakers for their use. The labour market indicators that track the state of
            the economy include, but are not limited to, employment, unemployment,
            income, job vacancies, labour turnover, retrenchment, employment conditions
            and hours worked.
                With increasing demands for more granular and timely statistics, MRSD
            has leveraged technology in recent years to implement various data
            analytics initiatives to meet those demands.
                The following sections explore these data analytics initiatives in
            greater detail and describe how they have helped MRSD to improve operational
            efficiency and data quality.

            2.   Methodology
                In this section, the methodology behind the data analytics initiatives is
            explained in greater detail.

            a)   Predictive Modelling
                With limited resources and increasing demands for large amounts of data
            at short notice, it is important to optimise resources to collect as much survey
            data as possible. Survey data is collected online or through phone or
            face-to-face interviews. Predictive modelling makes use of demographic
            information such as the age, gender and household composition of the
            respondents. A random forest model can be trained on past information to
            predict the optimal timings to contact the selected respondents.
                A random forest model is built by growing multiple decision trees, with
            each tree grown on a random sample of the training data drawn
            independently (James, Witten, Hastie & Tibshirani, 2013). At each split, only
            m randomly selected predictors are considered as candidates from the full
            set of p predictors (James et al., 2013). This prevents the trees from being
            highly correlated, as they would be if a few strong predictors dominated
            every tree (James et al., 2013). The final prediction is the average of the
            predictions of all trees, which reduces variance (James et al., 2013).
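
The mechanism described above (independent random samples, a random subset of m out of p predictors at each split, and averaging over trees) can be illustrated with a minimal sketch. It is not MRSD's implementation: the function names, the three predictors, the made-up outcome (1 = contact during working hours succeeded) and the deliberately clean toy data are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import random

def rss(ys):
    """Residual sum of squares around the mean (node impurity)."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, ys, candidates):
    """Best (feature, threshold) among the candidate predictors, or None."""
    best, best_score = None, rss(ys)
    for f in candidates:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if not left or not right:
                continue
            score = rss(left) + rss(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, ys, m, rng, depth=0, max_depth=3):
    """Grow one tree; a fresh random subset of m predictors is drawn per split."""
    split = None
    if depth < max_depth:
        candidates = rng.sample(range(len(rows[0])), m)  # m of the p predictors
        split = best_split(rows, ys, candidates)
    if split is None:
        return sum(ys) / len(ys)  # leaf: mean outcome of the node
    f, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left],
                      m, rng, depth + 1, max_depth),
            grow_tree([r for r, _ in right], [y for _, y in right],
                      m, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, ys, n_trees=25, m=2, seed=0):
    """Grow each tree on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx], [ys[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(t, row) for t in forest) / len(forest)

# Hypothetical data: (has_young_children, all_adults_working, age_of_head)
rows, ys = [], []
for i in range(40):
    has_children = i % 2
    rows.append((has_children, 1 - has_children, 25 + i))
    ys.append(has_children)  # made-up outcome: 1 = daytime contact succeeded

forest = fit_forest(rows, ys)
p_family = predict_forest(forest, (1, 0, 35))  # family with young children
p_couple = predict_forest(forest, (0, 1, 35))  # working couple, no children
print(p_family, p_couple)
```

With m = 2 and p = 3, each split sees only two of the three predictors, so different trees are pushed towards different splits. In practice an established statistical library would normally be used in place of hand-written trees.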
                Assuming each household has a responsible adult, the simplified decision
            tree shows the most appropriate timing of contact. For instance, if a household
            comprises a couple of working age, it is recommended to contact them only
            outside normal working hours. However, for a family with young
            children, establishing contact during working hours has a higher
            probability of success, as an adult is more likely to be at home to care for the
            children (Diagram 1).
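
The two cases in the paragraph above can be written as explicit rules. The sketch below is only an illustration: the function name and the two boolean inputs are hypothetical, and household types not mentioned in the text are deliberately left without a recommendation.

```python
def recommend_contact_window(has_young_children, is_working_age_couple):
    """Return the contact window suggested by the two examples in the text."""
    if has_young_children:
        # An adult is more likely to be at home caring for the children.
        return "during working hours"
    if is_working_age_couple:
        return "outside working hours"
    return None  # household type not covered by the two examples

print(recommend_contact_window(True, False))
print(recommend_contact_window(False, True))
```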







                                                               160 | ISI WSC 2019