Page 366 - Contributed Paper Session (CPS) - Volume 6

CPS1969 Janna M. De Veyra



                                  Testing for independence on statistically
                                        matched categorical variables
                                              Janna M. De Veyra
                                         University of the Philippines Diliman

                  Abstract
                  In most instances, conducting a new survey is not feasible because of time
                  constraints and limited resources. Matching data sources has been used as a
                  way to obtain a data set in which all the intended variables are available.
                  This paper proposes the use of MCMC and the inclusion of random error in
                  matching categorical variables, as well as the application of a bootstrap
                  procedure in testing for their independence. A simulation study indicates that
                  the test is most effective when all the proposed procedures are applied
                  together: the combination produces a correctly sized test with the highest
                  power among the combinations of procedures considered.

                  Keywords
                  random error; MCMC; bootstrap; size; power

                  1.  Introduction
                      Orazio et al. (2006) define statistical matching as a statistical procedure
                  that aims to integrate two or more data sets that share a set of common
                  variables, contain other variables that are not jointly observed, and are
                  made up of different units. The goal of this procedure is to derive a
                  synthetic data set and to estimate the joint distribution of the variables
                  that are not jointly observed in a single data set. The need for this type of
                  procedure increases when conducting a new survey is nearly impossible within
                  the available time frame and resources. This paper deals with matching
                  procedures for categorical data in order to test for the independence of two
                  variables that are not jointly observed. Seltman (2015) mentioned that the
                  usual statistical test in the case of a categorical outcome and a categorical
                  explanatory variable is a test of whether or not the two variables are
                  independent. The matching procedures used are regression imputation,
                  stochastic imputation, and an application of MCMC to each of those two
                  imputations. Independence under each of the four imputation procedures is
                  checked using the chi-square statistic. An application of the bootstrap
                  method under the four imputation procedures is also considered to determine
                  whether it produces a more reliable result in the test for independence.
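
                  As a rough illustration of the testing step described above, the following
                  Python sketch computes the Pearson chi-square statistic for two categorical
                  variables and obtains a bootstrap p-value by resampling each variable
                  separately, which imposes independence. It is a generic sketch only: the
                  function names, the resampling scheme, and the default of 999 replicates are
                  assumptions made here for illustration and are not taken from the procedure
                  proposed in this paper. The matching (imputation) step is not shown.

```python
import random
from collections import Counter


def chi_square_stat(pairs):
    """Pearson chi-square statistic for a list of (y, z) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(y for y, _ in pairs)
    col = Counter(z for _, z in pairs)
    stat = 0.0
    for y in row:
        for z in col:
            # Expected count under independence: row total * column total / n.
            expected = row[y] * col[z] / n
            stat += (joint[(y, z)] - expected) ** 2 / expected
    return stat


def bootstrap_independence_test(pairs, n_boot=999, seed=0):
    """Bootstrap p-value for H0: the two variables are independent.

    Hypothetical scheme (an assumption of this sketch): Y and Z are
    resampled separately with replacement, so the resampled pairs are
    independent by construction.
    """
    rng = random.Random(seed)
    ys = [y for y, _ in pairs]
    zs = [z for _, z in pairs]
    n = len(pairs)
    observed = chi_square_stat(pairs)
    exceed = 0
    for _ in range(n_boot):
        sample = list(zip(rng.choices(ys, k=n), rng.choices(zs, k=n)))
        if chi_square_stat(sample) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (exceed + 1) / (n_boot + 1)


# Usage on made-up data: a strongly associated pair of variables.
toy_pairs = [("a", "x")] * 50 + [("b", "y")] * 50
toy_stat, toy_p = bootstrap_independence_test(toy_pairs)
```

                  Resampling the two variables separately is only one way to generate the
                  reference distribution under independence; in practice the resampling
                  would be applied to the matched (synthetic) data set produced by the
                  chosen imputation procedure.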



                                                                     355 | ISI WSC 2019