Page 208 - Contributed Paper Session (CPS) - Volume 2
CPS1824 Sanggi L.
region, age, sex, dwelling type, number of adults, number of children, marital
status, etc. (for HILDA). The regression model is estimated using wave 1
respondents only and is then applied at every new wave to estimate the wave 1
selection probabilities of the constituent households. In HILDA, the household
selection probabilities are then computed from these estimates as indicated in
equation (1). The SOEP takes a simpler approach and drops the joint
probabilities, so that equation (1) can be rewritten as
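$$P(H) = p_1 + p_2 + \cdots + p_k,$$
where $p_j$ denotes the wave 1 selection probability of the $j$-th constituent household of $H$.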
The SOEP approach is less complex to implement in practice.
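To illustrate what dropping the joint probabilities does, the following minimal sketch compares the simple sum with the probability that at least one constituent household was selected. It assumes, purely for illustration, that the constituent households were selected independently at wave 1; the function names and the example probabilities are hypothetical and are not taken from the HILDA or SOEP documentation.

```python
from math import prod


def union_probability(probs):
    """P(at least one constituent household selected).

    Assumption for this sketch: selections are independent.
    """
    return 1.0 - prod(1.0 - p for p in probs)


def sum_approximation(probs):
    """Joint terms dropped: simple sum of the constituent probabilities."""
    return sum(probs)


# Hypothetical wave 1 selection probabilities of two constituent households.
probs = [0.2, 0.05]
exact = union_probability(probs)
approx = sum_approximation(probs)
print(f"with joint terms: {exact:.4f}, sum only: {approx:.4f}")
```

When the individual probabilities are small, the two values are close, which is consistent with the simplification being workable in practice; the sum is never smaller than the union probability.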
3.4. Weighting methodology in the KPCLS
Unlike the two previous approaches, the KPCLS is based on
administrative data, so the wave 1 selection probabilities of all households and
individuals who had the opportunity to be sampled at wave 1 are known.
Equation (1) can therefore be calculated directly. Because the
registration census maintains unique identification numbers for all households and
individuals every year, each new entrant can be linked to a wave 1 region and
wave 1 household identification number. When the composition of
household H1 at wave t is as shown in Table 2 below, the wave 1
region and household numbers can be merged onto all individuals who were in the population
at wave 1. The fourth member was not in the population at wave 1 (for example, because of birth or
immigration). Household H1 thus draws on two wave 1 households, h1 in region A and h2 in
region B, and is selected if at least one of them was selected. Therefore,
$$P(H1) = 1 - (1 - 0.2)(1 - 0.05) = 0.24.$$
The initial weight is $w(H1) = 1/0.24 \approx 4.2$. On the other hand, the initial
weight under the weight sharing method is $w_{share}(H1) = (5 + 5)/3 \approx 3.3$, where
$5 = 1/0.2$ is the wave 1 weight of each of the two PSMs and 3 is the number of members
who were in the population at wave 1. With the weight sharing method, the initial weight
can also fall below 1 when the number of new entrants is large.
This problem does not occur when the selection probabilities are calculated
directly, because the inverse of a probability is never smaller than 1.
Table 2: Data structure
Wave t HH number   Wave t HH member number   ⋯   PSM   Wave 1 region   Wave 1 HH number
H1                 1                             Y     A               h1
H1                 2                             Y     A               h1
H1                 3                             N     B               h2
H1                 4                             N
∗ P(A) = 0.2, P(B) = 0.05
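The calculation for household H1 can be sketched as follows. The record layout mirrors Table 2, but the field names and helper functions are hypothetical and are not part of the KPCLS production system. Two assumptions are made for illustration: wave 1 households h1 and h2 were selected independently, and the weight sharing variant divides the summed wave 1 weights of the PSMs by the number of members who were in the wave 1 population.

```python
from math import prod

# Wave 1 selection probabilities by region (from the note under Table 2).
REGION_PROB = {"A": 0.2, "B": 0.05}

# One record per member of household H1 at wave t (hypothetical field names).
# Member 4 was not in the wave 1 population, so the wave 1 fields are None.
members = [
    {"member": 1, "psm": True, "region": "A", "hh1": "h1"},
    {"member": 2, "psm": True, "region": "A", "hh1": "h1"},
    {"member": 3, "psm": False, "region": "B", "hh1": "h2"},
    {"member": 4, "psm": False, "region": None, "hh1": None},
]


def direct_weight(members):
    """Inverse of the directly calculated household selection probability."""
    # Keep one probability per distinct wave 1 household.
    hh_prob = {
        m["hh1"]: REGION_PROB[m["region"]]
        for m in members
        if m["hh1"] is not None
    }
    # Assumption: wave 1 households were selected independently.
    p_household = 1.0 - prod(1.0 - p for p in hh_prob.values())
    return 1.0 / p_household


def weight_share(members):
    """Summed wave 1 weights of PSMs, shared over wave 1 population members."""
    in_wave1 = [m for m in members if m["hh1"] is not None]
    total = sum(1.0 / REGION_PROB[m["region"]] for m in in_wave1 if m["psm"])
    return total / len(in_wave1)


print(f"direct: {direct_weight(members):.2f}")        # direct: 4.17
print(f"weight share: {weight_share(members):.2f}")   # weight share: 3.33
```

Because the direct weight is the inverse of a probability, it cannot fall below 1, whereas the shared weight shrinks as more non-PSM members enter the denominator.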
197 | ISI WSC 2019