Page 25 - Contributed Paper Session (CPS) - Volume 7

CPS2020 Honeylet T. S.
            2.2.  Estimation of the Model
                The goal of this study is to estimate a model for variable Y with variable
            Z as the predictor after matching. The model is a Poisson log-linear model
            characterized by the following:
                    $\mu_i = \exp(\beta_0 + \beta_1 z_i), \quad i = 1, 2, \ldots, n$          (2.1)
            where $\mu_i$ is the mean of response $y_i$, $\beta_0$ is the intercept, $\beta_1$ is the
            coefficient, $z_i$ is the $i$th observation of variable Z, and $n = n_A + n_B$.
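            For reference, the estimating equations behind model (2.1) can be written out
            explicitly. This is a sketch under the standard assumption that each response
            $y_i$ is Poisson distributed with mean $\mu_i$, which the text implies but does
            not state at this point; the intercept is written as $\beta_0$ and the
            coefficient as $\beta_1$ (symbol names assumed here).

```latex
% Log-likelihood of the Poisson log-linear model (2.1)
\ell(\beta_0, \beta_1)
  = \sum_{i=1}^{n} \left[ y_i (\beta_0 + \beta_1 z_i)
      - \exp(\beta_0 + \beta_1 z_i) - \log(y_i!) \right]

% Score equations solved by the maximum likelihood estimates
\sum_{i=1}^{n} (y_i - \mu_i) = 0, \qquad
\sum_{i=1}^{n} (y_i - \mu_i)\, z_i = 0,
\qquad \mu_i = \exp(\beta_0 + \beta_1 z_i)
```

            The log-likelihood is concave in $(\beta_0, \beta_1)$, so the score equations
            are typically solved by Newton-Raphson or iteratively reweighted least squares.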
                After imputation, model (2.1) will be fitted to each synthetic dataset to
            estimate the coefficients, with variable Y as the response and variable Z as
            the predictor. The coefficients will also be estimated using the two bootstrap
            procedures described below.

            Bootstrap within one synthetic dataset
              1)  Create a synthetic dataset using the matching methods described above.
              2)  Resample with replacement from the synthetic dataset created in Step 1,
                  drawing the same number of observations as it contains.
              3)  Fit model (2.1) to the resampled dataset from Step 2 and record the
                  coefficient estimates.
              4)  Repeat Steps 2 and 3 200 times.
              5)  Average the 200 coefficient estimates. The average serves as the
                  coefficient estimate of model (2.1).
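
            The within-dataset procedure above can be sketched in Python as follows. This
            is a minimal illustration, not the author's implementation: the names
            `fit_poisson` and `bootstrap_within` are hypothetical, the synthetic dataset
            from Step 1 is assumed to be already available as two equal-length lists `z`
            and `y`, and the model is fitted by a plain Newton-Raphson iteration, chosen
            here only to keep the example free of external libraries.

```python
import math
import random


def fit_poisson(z, y, iters=50, tol=1e-10):
    """Newton-Raphson fit of mu_i = exp(b0 + b1 * z_i); returns (b0, b1)."""
    # Start from the intercept-only fit (assumed starting point).
    b0, b1 = math.log(max(sum(y) / len(y), 1e-8)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * zi) for zi in z]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * zi for yi, mi, zi in zip(y, mu, z))
        # Entries of the 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(mi * zi for mi, zi in zip(mu, z))
        h11 = sum(mi * zi * zi for mi, zi in zip(mu, z))
        det = h00 * h11 - h01 * h01  # zero if all z values coincide
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def bootstrap_within(z, y, n_boot=200, seed=0):
    """Average the coefficient estimates over n_boot resamples of one dataset."""
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):  # Step 4: repeat Steps 2 and 3
        idx = [rng.randrange(n) for _ in range(n)]  # Step 2: resample rows
        zs = [z[i] for i in idx]
        ys = [y[i] for i in idx]
        estimates.append(fit_poisson(zs, ys))  # Step 3: fit model (2.1)
    # Step 5: average the estimates.
    b0 = sum(e[0] for e in estimates) / n_boot
    b1 = sum(e[1] for e in estimates) / n_boot
    return b0, b1
```

            Calling `bootstrap_within(z, y)` returns the averaged intercept and coefficient.
            The sketch does not guard against a resample in which every Z value is
            identical, which would make the information matrix singular; with datasets of
            realistic size this is unlikely, but a production version would need to handle it.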

            Bootstrap across synthetic datasets
              1)  Concatenate the original data sources: Data Source A, with missing Z
                  values, and Data Source B, with missing Y values.
              2)  Resample with replacement from the concatenated dataset in Step 1,
                  drawing the same number of observations as it contains.
              3)  In the resampled dataset from Step 2, group the observations with Z
                  missing to form the new Data Source A. Similarly, group the
                  observations with Y missing to form the new Data Source B.
              4)  Create a synthetic dataset from the new Data Sources A and B using the
                  matching methods.
              5)  Fit model (2.1) to the dataset from Step 4 and record the coefficient
                  estimates.
              6)  Repeat Steps 2 to 5 200 times.
              7)  Average the 200 coefficient estimates. The average serves as the
                  coefficient estimate of model (2.1).
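
            The across-dataset procedure differs in that the matching step is repeated
            inside every bootstrap replicate. A minimal Python sketch follows, under
            several assumptions that are not taken from the paper: records are
            dictionaries with keys "x", "y" and "z", where "x" is a hypothetical common
            variable observed in both sources and a missing value is stored as `None`;
            `hot_deck_match` is a simple stand-in for the matching methods (a random
            donor drawn within classes of "x"); and `fit_poisson` is a plain
            Newton-Raphson fitter used only to keep the example self-contained.

```python
import math
import random


def fit_poisson(z, y, iters=50, tol=1e-10):
    """Newton-Raphson fit of mu_i = exp(b0 + b1 * z_i); returns (b0, b1)."""
    b0, b1 = math.log(max(sum(y) / len(y), 1e-8)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * zi) for zi in z]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * zi for yi, mi, zi in zip(y, mu, z))
        h00 = sum(mu)
        h01 = sum(mi * zi for mi, zi in zip(mu, z))
        h11 = sum(mi * zi * zi for mi, zi in zip(mu, z))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def hot_deck_match(source_a, source_b, rng):
    """Hypothetical stand-in for Step 4: impute Z in A from random donors in B."""
    donors = {}
    for rec in source_b:
        donors.setdefault(rec["x"], []).append(rec["z"])
    all_z = [rec["z"] for rec in source_b]
    synthetic = []
    for rec in source_a:
        pool = donors.get(rec["x"], all_z)  # fall back to all donors
        synthetic.append({"x": rec["x"], "y": rec["y"], "z": rng.choice(pool)})
    return synthetic


def bootstrap_across(source_a, source_b, n_boot=200, seed=0):
    """Average the coefficient estimates over n_boot re-matched resamples."""
    rng = random.Random(seed)
    combined = source_a + source_b  # Step 1: concatenate A and B
    n = len(combined)
    estimates = []
    for _ in range(n_boot):  # Step 6: repeat Steps 2 to 5
        sample = [combined[rng.randrange(n)] for _ in range(n)]  # Step 2
        new_a = [r for r in sample if r["z"] is None]  # Step 3: Z missing
        new_b = [r for r in sample if r["y"] is None]  # Step 3: Y missing
        synthetic = hot_deck_match(new_a, new_b, rng)  # Step 4
        z = [r["z"] for r in synthetic]
        y = [r["y"] for r in synthetic]
        estimates.append(fit_poisson(z, y))  # Step 5
    # Step 7: average the estimates.
    b0 = sum(e[0] for e in estimates) / n_boot
    b1 = sum(e[1] for e in estimates) / n_boot
    return b0, b1
```

            Because records are resampled before matching, each replicate reflects the
            variability of the matching step as well as the sampling variability, which is
            the practical difference from resampling a single, fixed synthetic dataset.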
                Note that the synthetic datasets created using Poisson regression
            imputation and MCMC imputation are the concatenated file Data Source A ∪ B,
            while the synthetic datasets created using random hot deck imputation involve
            only Data Source A with Z values imputed from Data Source B. Model (2.1) is
            therefore estimated on whichever synthetic dataset each method produces.
                A total of nine model coefficients will be estimated. For each type of
            synthetic dataset, there are three coefficient estimates. These are: (1) Poisson

                                                                14 | ISI WSC 2019