Page 79 - Contributed Paper Session (CPS) - Volume 2
CPS1437 Thanyani M.
auxiliary variables the sample values are known, either exactly or
approximately. The calibration problem seeks to improve the initial weights by
finding new weights … , that incorporate the auxiliary information. In a
typical practical problem, the sample size is rather large (Davies, 2018). The
number of auxiliary variables can also be large, although it is usually much
smaller than the sample size.
Sample weight calibration in this paper is described as an optimisation
problem. Calibration is an important methodological instrument in the
production of statistics. Calibration estimation can be used to advantage
in a range of different survey conditions, including estimation for domains in
one‐phase sampling, estimation for two‐phase sampling, and estimation for
two‐stage sampling with integrated weighting. Typical of those situations is
complex auxiliary information, a term used for information made up of
several components.
An example occurs when a two‐stage sample survey has information both
for units and for clusters of units, or when estimation for domains relies on
information from different parts of the population. The problem of estimating
survey weights can indeed be formulated as a constrained optimisation
problem, where one is attempting to minimise the difference between the
weighted sample distributions and known population distributions across a
set of control variables at both the household and person levels (Bar-Gera et
al., 2009).
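One common way to write such a problem down is the distance-minimisation form below. The notation is an assumption introduced here for illustration and is not taken from the paper: d_i is the initial weight of sample unit i, w_i its calibrated weight, x_i its vector of auxiliary values, and t_x the vector of known population totals. The chi-square distance is only one possible choice of distance.

```latex
\min_{w_1,\dots,w_n} \; \sum_{i=1}^{n} \frac{(w_i - d_i)^2}{d_i}
\quad \text{subject to} \quad
\sum_{i=1}^{n} w_i \,\mathbf{x}_i = \mathbf{t}_x .

% Setting the derivative of the Lagrangian to zero gives the closed form
w_i = d_i \left( 1 + \mathbf{x}_i^{\top} \boldsymbol{\lambda} \right),
\qquad
\boldsymbol{\lambda} =
\Bigl( \sum_{i=1}^{n} d_i \,\mathbf{x}_i \mathbf{x}_i^{\top} \Bigr)^{-1}
\Bigl( \mathbf{t}_x - \sum_{i=1}^{n} d_i \,\mathbf{x}_i \Bigr).
```

Under this choice of distance the new weights stay as close as possible to the initial ones while reproducing the known totals exactly.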
2. Methodology
The methodology employed assumed that sample data were already
adjusted for unequal probability of selection as well as for non-response. A
condition was also set that the weights be equal within each household, so
that both household and person weights are estimated using one procedure. To
satisfy this condition, the integrated method of calculating estimates is
implemented. Several methods have been proposed for producing this single
weight, including generalised regression methods (Wu et al., 1997). Wallace
and Rust (1996) also compared post-stratification and raking using data from
the National Assessment of Educational Progress (NAEP).
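The idea of a single weight per household can be illustrated with a minimal Python sketch. It assumes one particular approach, which the paper does not spell out: person-level indicators are summed within each household, the household weights are calibrated linearly to both household and person totals, and every person then inherits the household weight. All data, totals and variable names below are hypothetical.

```python
import numpy as np

# Hypothetical sample: one entry per person.
hh_id = np.array([0, 0, 0, 1, 1, 2, 3, 3])       # household of each person
age_0_14 = np.array([1, 0, 0, 0, 0, 0, 1, 1])     # 1 if the person is aged 0-14
age_15p = 1 - age_0_14                            # 1 if the person is aged 15+
hh_design_w = np.array([50.0, 40.0, 60.0, 45.0])  # one initial weight per household

n_hh = hh_design_w.size
# Household-level auxiliaries: [household indicator, persons 0-14, persons 15+].
X = np.column_stack([
    np.ones(n_hh),
    np.bincount(hh_id, weights=age_0_14, minlength=n_hh),
    np.bincount(hh_id, weights=age_15p, minlength=n_hh),
])
# Made-up control totals: households, persons 0-14, persons 15+.
totals = np.array([200.0, 150.0, 260.0])

# Linear calibration at household level: w = d * (1 + X @ lam).
M = X.T @ (hh_design_w[:, None] * X)
lam = np.linalg.solve(M, totals - X.T @ hh_design_w)
w_hh = hh_design_w * (1.0 + X @ lam)

# Integrated weighting: every person carries the weight of their household.
w_person = w_hh[hh_id]
```

Because the person counts enter the calibration as household-level sums, the person-weighted totals match the person controls even though only household weights were adjusted.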
The final survey weights were constructed using regression estimation to
calibrate to the known population counts at the national level by
cross-classification of age, gender and race, and to the population counts for
the individual metropolitan and non-metropolitan areas within the provinces by
two age groups (0-14 years, and 15 years and over). The datasets used are the
South African Census 2011, the Community Survey 2016 and simulated survey
data. StatMx, a computer program developed by Statistics Canada, was used to
implement the calibration.
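As a rough illustration of calibration by regression estimation, the following Python sketch adjusts initial weights so that weighted counts in two age groups reproduce given control totals. It is not StatMx and does not reproduce the paper's procedure; the function name `linear_calibration`, the toy data and the control totals are all assumptions made for this example.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Return weights w = d * (1 + X @ lam) that satisfy X.T @ w == totals.

    d      : (n,) initial weights (assumed already adjusted for non-response)
    X      : (n, p) auxiliary variables for the sampled units
    totals : (p,) known population totals of the auxiliary variables
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Gap between the known totals and the estimates under the initial weights.
    gap = totals - X.T @ d
    # Solve (sum_i d_i x_i x_i^T) lam = gap for the adjustment coefficients.
    M = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(M, gap)
    return d * (1.0 + X @ lam)

# Hypothetical toy sample: six persons with indicators for ages 0-14 and 15+.
d = np.array([10.0, 10.0, 12.0, 12.0, 8.0, 8.0])
X = np.array([[1, 0], [0, 1], [0, 1], [1, 0], [0, 1], [0, 1]])
totals = np.array([25.0, 40.0])  # made-up population counts per age group

w = linear_calibration(d, X, totals)
```

With mutually exclusive indicator variables, as here, the linear adjustment reduces to scaling the weights in each group by the ratio of the control total to the initial estimate. Linear calibration can in general produce negative weights, which is why bounded or raking-type distances are often used in practice.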
68 | ISI WSC 2019