Page 352 - Special Topic Session (STS) - Volume 2
STS498 Wei-Yin L.
12 months. Each variable with missing values is associated with a “missing
value flag variable” that takes the values given in Table 2. Flag variables have
underscores in their names; e.g., INTRDVX_ is the flag variable associated with
INTRDVX. There are 587 X variables that are neither constant nor completely
missing and that may be used to estimate the population mean of INTRDVX. About
20% of these variables have missing values; 67 of them have more than 95%
missing values. No CU has complete responses on all 587 variables.
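The screening and missing-value summary described above can be sketched as follows. This is a minimal illustration in Python on invented toy data, not the procedure that produced the reported counts. The variable names X1 to X5, the helper names, and the convention that a missing entry counts as a distinct value when checking for constancy are all assumptions.

```python
# Toy data, invented for illustration: one list per variable, one entry per
# consumer unit (CU). None marks a missing value.
toy = {
    "X1": [1.0, 2.0, None, 4.0],
    "X2": [5.0, 5.0, 5.0, 5.0],        # constant, so it is screened out
    "X3": [None, None, None, None],    # completely missing, so it is screened out
    "X4": [None, None, None, 7.0],
    "X5": [3.0, 1.0, 2.0, 9.0],
}


def percent_missing(values):
    """Percentage of entries that are missing (None)."""
    return 100.0 * sum(v is None for v in values) / len(values)


def is_usable(values):
    """True unless every entry is identical.

    Assumption: None counts as a distinct value, so this rejects both
    constant and completely missing variables, but keeps a variable with a
    single observed value plus some missing entries.
    """
    return len(set(values)) > 1


def summarize(data, threshold=95.0):
    """Screen the variables and summarize their missingness."""
    usable = {name: vals for name, vals in data.items() if is_usable(vals)}
    pct = {name: percent_missing(vals) for name, vals in usable.items()}
    n_units = len(next(iter(data.values())))
    return {
        "usable": sorted(usable),
        "percent_missing": pct,
        "with_missing": sorted(n for n, p in pct.items() if p > 0),
        "mostly_missing": sorted(n for n, p in pct.items() if p > threshold),
        # Indices of CUs with an observed value on every usable variable.
        "complete_units": [
            i for i in range(n_units)
            if all(vals[i] is not None for vals in usable.values())
        ],
    }


# A 50% threshold is used here only because the toy data has four CUs.
summary = summarize(toy, threshold=50.0)
print(summary)
```

With the toy data, X2 and X3 are screened out, and only the last CU has observed values on all remaining variables.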
Table 1: Variables and percents of missing values in consumer expenditure data
Variable   Description                                               % missing
AGE_REF    Age of reference person                                   0
FFTAXOWE   Weighted estimate for federal tax liabilities             0
INTRDVX    Interest or dividend received past 12 mos.                0
PERINSPQ   Personal insurance and pensions last quarter
RENTEQVX   Monthly rent if home rented                               15.6
RETSURVX   Retirement, survivor or disability pensions past 12 mos.  0
RETS_RVX   Flag variable for RETSURVX                                0
STATE      State (39 categories)                                     11.1
STOCKX     Value of directly-held stocks, bonds, mutual funds        92.0
TOTXEST    Estimated total taxes paid                                0
Table 2: Codes and definitions of missing value flag variables
Code  Definition
A     valid nonresponse: a response is not anticipated
C     “don’t know”, refusal or other type of nonresponse
D     valid data value
T     topcoding applied to value
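As a small illustration of how the codes in Table 2 might be used in practice, the sketch below stores them in a lookup table. The grouping of codes A and C as "no value recorded" and of D and T as "value present" is an assumption made for this example, as are the names FLAG_CODES, MISSING_CODES and has_value.

```python
# Codes and definitions copied from Table 2.
FLAG_CODES = {
    "A": "valid nonresponse: a response is not anticipated",
    "C": "\"don't know\", refusal or other type of nonresponse",
    "D": "valid data value",
    "T": "topcoding applied to value",
}

# Assumption (not stated in the text): A and C mean that no value was
# recorded, while D and T mean that a value is present.
MISSING_CODES = {"A", "C"}


def has_value(flag):
    """Return True if the flag code indicates that a data value is present."""
    if flag not in FLAG_CODES:
        raise ValueError(f"unknown flag code: {flag!r}")
    return flag not in MISSING_CODES


for code in sorted(FLAG_CODES):
    print(code, has_value(code), FLAG_CODES[code])
```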
Figure 1 shows the GUIDE piecewise-constant regression tree for
estimating the mean of INTRDVX. A condition is printed on the left side of
each intermediate node of the tree. A respondent goes to the left branch if
and only if the condition is satisfied. The sample size and sample mean of
INTRDVX are printed below each terminal node. For example, at the root node,
the 803 respondents who are 57 years or younger go to the left subnode, which
has a mean INTRDVX of $803. The other respondents go to the right subnode.
The symbol “<∗” is an abbreviation for “< or missing.” For example, the right
node immediately below the root node is split on STOCKX. Respondents with
STOCKX < $191,160 or STOCKX = missing go to the left subnode. The node in
black shows a special case where respondents go to the left branch if and only
if RETSURVX < $11,342 or the flag variable RETS_RVX = A. See Loh et al. (2019)
for a deeper analysis of the data.
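The three kinds of split conditions described above can be written out as a short sketch. This is an illustration of the routing rules as stated in the text, not the GUIDE implementation. The function names are hypothetical, and a missing value is represented by None.

```python
from typing import Optional


def goes_left_at_root(age_ref: float) -> bool:
    """Root node: respondents who are 57 years or younger go left."""
    return age_ref <= 57


def less_than_or_missing(value: Optional[float], cutoff: float) -> bool:
    """The "<*" condition: satisfied if the value is missing or below the cutoff."""
    return value is None or value < cutoff


def goes_left_on_stockx(stockx: Optional[float]) -> bool:
    """Split on STOCKX: left if STOCKX < $191,160 or STOCKX is missing."""
    return less_than_or_missing(stockx, 191_160)


def goes_left_on_retsurvx(retsurvx: Optional[float], flag: str) -> bool:
    """Special case: left if and only if RETSURVX < $11,342 or RETS_RVX = A.

    A missing RETSURVX with any other flag code (e.g. C) goes right.
    """
    return (retsurvx is not None and retsurvx < 11_342) or flag == "A"


print(goes_left_at_root(57))                 # 57 years or younger
print(goes_left_on_stockx(None))             # missing STOCKX
print(goes_left_on_retsurvx(None, "A"))      # valid nonresponse
print(goes_left_on_retsurvx(None, "C"))      # other nonresponse
```

The last two calls show how the special case differs from “<∗”: only missing values flagged as valid nonresponses are sent left, whereas “<∗” sends every missing value left.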
341 | ISI WSC 2019