Page 341 - Special Topic Session (STS) - Volume 3

STS547 Daan Zult et al.
               4.  Discussion and Conclusion
                   In this paper we derived and tested the WMR model for population size
               estimation corrected for linkage error. The model is derived from the D&F
               model and is a more general extension than the models developed by DC&T
               (2015, 2018) and De Wolf et al. (2018) because it can deal with three or more
               sources and covariates. Furthermore, the WMR model is incorporated in the
               more general family of log-linear regression models and therefore no longer
               has to be studied as an isolated issue in CR and MR models. Finally, the WMR
               model was tested and validated in a simulation study.
                   Although in theory the WMR model might be an improvement on the D&F
               model, both still require the availability of a rematch study (for D&F) or an
               audit study (for WMR). The advantage of the WMR model is that an audit study
               might be easier to obtain, because it has lower requirements (it needs to be
               constructed on the cell count level instead of the much more detailed record
               matching pair level). However, the incorporation of covariates and additional
               sources in the
               WMR model also puts additional constraints on the audit study, in the sense
               that the audit study should include these same covariates and additional
               sources. Given that the sample that underlies the audit study must be
               representative for $R_t$, this might be more difficult for increasing $t$.
                   Also, we should note that we paid little attention to the impact of the exact
               linkage procedure. In section 2 we developed the WMR model in the context
               of the common sequential linkage approach, in which first two sources are
               linked and a third source is linked to this combined source. However, it is also
               possible that sources are linked pairwise or simultaneously. These approaches
               are less common because they suffer from either computational problems (i.e.
               the number of potential matches between multiple sources increases
               exponentially) or methodological problems (e.g. what to do with inconsistent
               matching patterns like A → B, B → C, C ↛ A?). Furthermore, in the simulation study of
               section 3 we applied probabilistic linkage that uses techniques developed by
               Fellegi and Sunter (1969), Winkler (1988) and Jaro (1989) that aim to optimise
               the quality of matches on the matching pair level, while matching techniques
               that are designed to optimise the quality of the matches on the cell count level
               might already significantly reduce the problem of linkage errors in population
               size estimation.
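
               The two difficulties named above for pairwise or simultaneous linkage can
               be made concrete with a small sketch. It is illustrative only and is not
               part of the WMR model or the simulation study: the record identifiers, the
               dictionaries `ab`, `bc` and `ca`, and both function names are hypothetical.
               It assumes that each pairwise linkage yields at most one match per record,
               reports chains A → B → C that are not closed by the link from C back to A,
               and counts candidate record combinations as the product of the source sizes.

```python
from math import prod


def candidate_combinations(sizes):
    """Number of candidate record combinations (one record per source)
    when all sources are compared simultaneously."""
    return prod(sizes)


def inconsistent_chains(ab, bc, ca):
    """Return (a, b, c) triples where A -> B and B -> C are linked
    but C is not linked back to the same A record.

    ab, bc, ca are hypothetical outputs of three pairwise linkages:
    each maps a record id in the first source to its match in the second.
    """
    bad = []
    for a, b in ab.items():
        c = bc.get(b)
        if c is None:
            continue  # chain stops at B, so there is no triple to check
        if ca.get(c) != a:
            bad.append((a, b, c))
    return bad


# Toy data (assumed for illustration): c2 is not linked back to a2.
ab = {"a1": "b1", "a2": "b2"}
bc = {"b1": "c1", "b2": "c2"}
ca = {"c1": "a1"}

print(inconsistent_chains(ab, bc, ca))             # [('a2', 'b2', 'c2')]
print(candidate_combinations([1000, 1000]))        # 1000000
print(candidate_combinations([1000, 1000, 1000]))  # 1000000000
```

               With three sources of 1,000 records each, the candidate space is already a
               thousand times larger than for two sources, which is the computational
               problem referred to above; how an inconsistent triple should be resolved is
               the methodological question, and the sketch only detects it.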
                   Another point that deserves some discussion is the ‘individual starting
               weight of 1’. Lists or registers of individuals sometimes also contain individual
               sample weights, which indicate the size of the group that this individual
               represents as part of the total population. There is no reason why these sample
               weights cannot replace the starting weights of 1 in the WMR model.
               Furthermore, when additional sources also contain sample weights they can
               be used to calculate  ,   and   in a slightly different way, i.e. simply by
                                                 ∗
                                        ∗
                                     
               adding up sample weights instead of counting. This way we would get ‘linkage

                                                                  330 | I S I   W S C   2 0 1 9