Page 54 - Special Topic Session (STS) - Volume 4
STS560 James Houran et al.
It follows from Eq. 2 that raw sums of observations (ratings coded as 0, 1, 2,
…) are minimally sufficient statistics to estimate respondents’ trait levels Tj as
well as the item parameters Di (Wright & Masters, 1982). Various approaches
to estimating the model parameters are described in this work as well. To
avoid introducing additional notation, the following makes no distinction
between the estimated and the true parameter values. Given Eq. 2 it is possible
to derive the expected value (Eij) of a person’s rating xij and its standard
deviation SDij (see Wright & Masters, 1982). Thus, we can define an
observation’s residual as:
$$r_{ij} = x_{ij} - E_{ij} \qquad (3)$$
and its standardized form zij as:
$$z_{ij} = \frac{x_{ij} - E_{ij}}{SD_{ij}} \qquad (4)$$
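As a minimal sketch of Eqs. 3 and 4, the function below computes the expected rating, its standard deviation, the residual, and the standardized residual for a single observation. It assumes that the category probabilities implied by Eq. 2 have already been computed elsewhere; the function name `standardized_residual`, the argument `category_probs`, and the example probabilities are hypothetical and serve only as illustration.

```python
from math import sqrt


def standardized_residual(x, category_probs):
    """Return (E, SD, residual, z) for one observed rating x.

    category_probs[k] is the model probability of rating k (k = 0, 1, 2, ...).
    These probabilities are assumed to come from Eq. 2; they are not derived here.
    """
    # Expected rating: probability-weighted mean of the category codes.
    expected = sum(k * p for k, p in enumerate(category_probs))
    # Model variance of the rating around its expectation.
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(category_probs))
    sd = sqrt(variance)
    residual = x - expected          # Eq. 3
    return expected, sd, residual, residual / sd  # last entry is Eq. 4


# Hypothetical probabilities for ratings 0, 1, 2 (illustrative values only).
E, SD, r, z = standardized_residual(2, [0.25, 0.5, 0.25])
```

With these illustrative probabilities the expected rating is 1, so an observed rating of 2 yields a positive residual, i.e., a rating higher than the model predicts.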
5. Misfit
It may be assumed that the zij follow an approximately normal distribution
with M = 0 and SD = 1. Thus, the summed squared values z²ij then follow a
χ² distribution, and a person index of fit can be obtained by aggregating the z²ij
over items answered by this person (see Wright & Masters, 1982). Similarly,
aggregating across persons will provide an index of item fit. When items’ z are
aggregated across different subgroups of respondents, they also serve to
identify Differential Item Functioning (DIF), also called item bias. For instance,
if an item’s zij has a higher mean for men than for women, then this item is biased
against women. Thus, the zij can be interpreted as indices of idiosyncratic
preferences and subjective biases.
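The aggregations described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the standardized residuals are assumed to be stored as a persons-by-items list of lists with `None` for unanswered items, the mean of the squared residuals is reported alongside the sum as one common way of aggregating, and all names and data values are hypothetical.

```python
def person_fit(z_row):
    """Sum and mean of squared standardized residuals over answered items.

    None marks an item the person did not answer (an assumed convention).
    """
    squares = [z * z for z in z_row if z is not None]
    total = sum(squares)
    return total, total / len(squares)


def item_fit(z_matrix, item):
    """Same aggregation as person_fit, taken across persons for one item."""
    return person_fit([row[item] for row in z_matrix])


def group_mean_z(z_matrix, groups, item, group):
    """Mean standardized residual of one item within one respondent subgroup."""
    values = [row[item] for row, g in zip(z_matrix, groups)
              if g == group and row[item] is not None]
    return sum(values) / len(values)


# Hypothetical residuals for 3 persons and 3 items (illustrative values only).
z = [[1.0, -2.0, None],
     [0.5, 1.0, -1.0],
     [-1.0, 2.0, 0.0]]
groups = ["m", "f", "m"]

fit_person_0 = person_fit(z[0])
fit_item_1 = item_fit(z, 1)
dif_gap = group_mean_z(z, groups, 1, "m") - group_mean_z(z, groups, 1, "f")
```

A clearly non-zero `dif_gap` for an item would be the signal of differential item functioning discussed above; how large a gap counts as meaningful is not specified here and would need its own criterion.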
6. Using Misfit
Traditionally, testing in HR, education, and psychology focuses exclusively
on estimating Tj, the respondent’s overall trait level, and the fit of a test taker’s
responses to individual items is typically ignored. Yet, an observation with large |zij| should be deemed
aberrant because it is implausible given the model parameters. It is the central
thesis of this paper that such aberrations are worthy of study in their own right.
Large zij (e.g., |zij| > 2) can be caused by a variety of factors: the question may
be ambiguous (e.g., due to poor wording), the test-taker may have been
distracted (e.g., by ambient noise), or the person may have lacked motivation. However,
when such factors can be excluded as the causes of misfit, the zij often reflect
a respondent’s idiosyncrasies and biases. As is illustrated below, misfit conveys
valuable information that goes beyond a person’s overall trait estimate (test
scores) and can be exploited for diagnostic purposes.
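To make the |zij| > 2 rule of thumb concrete, the short sketch below lists the items on which one respondent's ratings depart markedly from the model's expectation. The function name `flag_aberrant`, the item labels, and the residual values are hypothetical; the threshold of 2 is the example value mentioned in the text and is left adjustable.

```python
def flag_aberrant(z_row, item_labels, threshold=2.0):
    """Return (label, z) pairs for observations with |z| above the threshold.

    z_row holds one respondent's standardized residuals; None marks an
    unanswered item (an assumed convention).
    """
    return [(label, z) for label, z in zip(item_labels, z_row)
            if z is not None and abs(z) > threshold]


# Hypothetical residuals for one respondent on four items.
flags = flag_aberrant([0.3, -2.4, 2.0, 3.1], ["A", "B", "C", "D"])
```

The sign of each flagged residual is kept because it indicates direction: a negative value marks a rating lower than expected given the person's overall trait level, and a positive value a rating higher than expected.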
7. A Case Study: The 20|20 Skills™
The following describes how response residuals can be used to suggest to
43 | ISI WSC 2019