Page 220 - Contributed Paper Session (CPS) - Volume 7
CPS2068 Jan-Philipp Kolb et al.
Using predictive modelling to identify panel
nonresponse
Jan-Philipp Kolb¹, Bernd Weiß¹, Christoph Kern²
¹ GESIS Leibniz Institute for the Social Sciences
² University of Mannheim
Abstract
Panel surveys are a valuable source of data to investigate a wide range of
research questions. However, data quality can be negatively affected by
nonresponse. Unit nonresponse is most critical when it is due to selective
nonresponse patterns, which can lead to biased estimates. It is therefore
essential to identify panellists with a high risk of nonresponse. If these
panellists can be identified, interventions in an adaptive survey design could
motivate them to continue participating in the panel study. However,
identifying potential non-respondents is a challenging task given the wealth
of information typically available in panel studies. In this study, we aim to
utilize statistical learning methods with a diverse set of predictor variables to
tackle panel attrition from a prediction perspective. We study nonresponse in
the GESIS Panel, which is a bi-monthly probability-based mixed-mode access
panel of the German population (n ≈ 4,700). In addition to socio-demographic
and substantive variables, process-based para-data, as well as data from the
panel management, are used as predictors. Feeding this information to
supervised statistical learning methods offers a promising avenue for building
a useful nonresponse prediction model, as these methods can model
complex relationships across many features without the need to specify the
models’ functional form in advance.
Keywords
Machine Learning; Panel Survey; Nonresponse; Feature Selection; Ensemble
Methods
1. Introduction
Unit nonresponse can become a severe problem if it follows selective
patterns, that is, when it does not occur completely at random. Whether this
is the case needs to be investigated, and in panel surveys in particular, many
variables have to be tested. This is where statistical learning methods come
into play, since they are well suited to handling such a large number of
variables. In this paper, statistical learning methods are tested on unit
nonresponse in the GESIS Panel.
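The difference between completely random and selective unit nonresponse can be illustrated with a small simulation. The following Python sketch is a minimal illustration under stated assumptions: the survey variable (normally distributed with mean 50 and standard deviation 10), the response probabilities (0.7, 0.9 and 0.5) and all function names are hypothetical choices for this example and are not taken from the GESIS Panel.

```python
import random


def respondent_mean(values, responds):
    """Mean of the survey variable among units that responded."""
    observed = [v for v, r in zip(values, responds) if r]
    return sum(observed) / len(observed)


def simulate_nonresponse_bias(n=20000, seed=42):
    """Compare completely random and selective unit nonresponse.

    All quantities are illustrative assumptions, not GESIS Panel values.
    """
    rng = random.Random(seed)

    # Hypothetical survey variable (e.g. a score with mean 50, sd 10).
    y = [rng.gauss(50, 10) for _ in range(n)]
    full_mean = sum(y) / n

    # Completely random nonresponse: everyone responds with probability 0.7.
    random_response = [rng.random() < 0.7 for _ in y]

    # Selective nonresponse: units with high values respond more often
    # (0.9 above 50, 0.5 otherwise), so response depends on y itself.
    selective_response = [rng.random() < (0.9 if v > 50 else 0.5) for v in y]

    return {
        "full_mean": full_mean,
        "random_bias": respondent_mean(y, random_response) - full_mean,
        "selective_bias": respondent_mean(y, selective_response) - full_mean,
    }


if __name__ == "__main__":
    for name, value in simulate_nonresponse_bias().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the respondent mean stays close to the full-sample mean when nonresponse is completely random, whereas selective nonresponse shifts it upwards by a little more than two points in expectation. In a real panel the response mechanism is unknown, which is why it has to be investigated with the available predictors.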
The GESIS Panel is a probability-based mixed-mode access panel (Bosnjak
et al. 2018). Probability-based means that the panel participants were selected
207 | ISI WSC 2019