Page 54 - Special Topic Session (STS) - Volume 4

STS560 James Houran et al.
                  It follows from Eq. 2 that raw sums of observations (ratings coded as 0, 1, 2,
                  …) are minimally sufficient statistics to estimate respondents’ trait levels Tj as
                  well as the item parameters Di (Wright & Masters, 1982). Various approaches
                  to estimating the model parameters are described in this work as  well. To
                  avoid  introducing  additional  notation,  the  following  makes  no  distinction
                  between the estimated and the true parameter values. Given Eq. 2 it is possible
                  to derive the expected value (Eij) of a person’s rating xij, and its standard
                  deviation SDij (see Wright & Masters, 1982). Thus, we can define an
                  observation’s residual as:
                                    $y_{ij} = x_{ij} - E_{ij}$                                        (3)

                  and its standardized form zij as:

                                    $z_{ij} = \frac{y_{ij}}{SD_{ij}} = \frac{x_{ij} - E_{ij}}{SD_{ij}}$          (4)
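                  To make the residual definitions concrete, the sketch below computes Eij,
                  SDij, the raw residual, and zij for a single rating. Eq. 2 is not restated
                  in this section, so the code assumes a Rasch rating scale model with step
                  thresholds; the function names, the `thresholds` argument, and all numeric
                  values are illustrative assumptions rather than the authors' implementation.

```python
import math


def category_probs(t_j, d_i, thresholds):
    """Category probabilities for ratings 0..m.

    Assumption: a Rasch rating scale model in which the log-odds of
    category k versus k-1 equal t_j - d_i - thresholds[k-1].
    """
    # Cumulative logits: 0 for category 0, then running sums of the steps.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + t_j - d_i - tau)
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def standardized_residual(x_ij, t_j, d_i, thresholds):
    """Return (E_ij, SD_ij, raw residual, z_ij) for one observed rating."""
    probs = category_probs(t_j, d_i, thresholds)
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    sd = math.sqrt(variance)
    raw = x_ij - expected      # Eq. 3
    return expected, sd, raw, raw / sd  # last element is Eq. 4


# Illustrative values only: a three-category item (ratings 0, 1, 2).
e_ij, sd_ij, y_ij, z_ij = standardized_residual(
    x_ij=2, t_j=0.0, d_i=0.0, thresholds=[-1.0, 1.0]
)
print(e_ij, sd_ij, y_ij, z_ij)
```

                  A positive zij means the observed rating exceeded the model expectation,
                  and a negative zij means it fell short of it.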
                                        
                                              

                  5.  Misfit
                      It may be assumed that the zij follow an approximately normal distribution
                  with M=0 and SD=1. Thus, the summed squared values $z_{ij}^2$ then follow a $\chi^2$
                  distribution and a person index of fit can be obtained by aggregating the $z^2$
                  over items answered by this person (see Wright & Masters, 1982). Similarly,
                  aggregating across persons will provide an index of item fit. When items’ z are
                  aggregated  across  different  subgroups  of  respondents,  they  also  serve  to
                  identify Differential Item Functioning (DIF), also called item bias. For instance,
                  if an item’s zij has a higher mean for men than for women, then this item is biased
                  against  women.  Thus,  the  zij  can  be  interpreted  as  indices  of  idiosyncratic
                  preferences and subjective biases.
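
                  The aggregation described in this section can be sketched as follows. The
                  matrix of standardized residuals, the group labels, and the helper names
                  are made up for illustration. Dividing the summed squares by the number of
                  observations to obtain a mean-square is one conventional choice assumed
                  here, not necessarily the index used by the authors.

```python
def sum_of_squares(values):
    """Sum of squared standardized residuals (the chi-square-like total)."""
    return sum(v * v for v in values)


def mean_square(values):
    """Sum of squares divided by the number of observations."""
    return sum_of_squares(values) / len(values)


# z[j][i]: standardized residual of person j on item i.
# None marks an item the person did not answer. Illustrative data only.
z = [
    [0.3, -0.5, 2.4],
    [-1.1, 0.2, None],
    [0.4, 0.9, -0.6],
    [-0.2, -0.7, -1.8],
]
group = ["M", "F", "M", "F"]  # hypothetical subgroup label per person
n_items = len(z[0])

# Person fit: aggregate z^2 over the items each person answered.
person_fit = [mean_square([v for v in row if v is not None]) for row in z]

# Item fit: aggregate z^2 over the persons who answered each item.
item_fit = [
    mean_square([row[i] for row in z if row[i] is not None])
    for i in range(n_items)
]


def group_mean_z(item, label):
    """Mean signed residual on one item within one subgroup."""
    vals = [
        row[item]
        for row, g in zip(z, group)
        if g == label and row[item] is not None
    ]
    return sum(vals) / len(vals)


# DIF screen: difference in mean signed z between the two subgroups.
dif_contrast = [group_mean_z(i, "M") - group_mean_z(i, "F") for i in range(n_items)]
print(person_fit, item_fit, dif_contrast)
```

                  Note that the fit indices use squared residuals, whereas the DIF contrast
                  uses signed residuals, so that the direction of a group difference is
                  preserved.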

                  6.  Using Misfit
                      Traditionally, testing in HR, education, and psychology focuses exclusively
                  on estimating Tj, the respondent’s overall trait level, and the fit of a test
                  taker’s individual item responses is typically ignored. Yet, an observation
                  with large $|z_{ij}|$ should be deemed
                  aberrant because it is implausible given the model parameters. It is the central
                  thesis of this paper that such aberrations are worthy of study in their own right.
                  Large zij (e.g., $|z_{ij}| > 2$) can be caused by a variety of factors: the question may
                  be ambiguous (e.g., due to poor wording), the test-taker may have been
                  distracted (e.g., by ambient noise), or the person may have lacked motivation. However,
                  when such factors can be excluded as the causes of misfit, the zij often reflect
                  a respondent’s idiosyncrasies and biases. As is illustrated below, misfit conveys
                  valuable information that goes beyond a person’s overall trait estimate (test
                  scores) and can be exploited for diagnostic purposes.
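
                  As a minimal sketch of how aberrant observations might be screened, the
                  code below flags every rating whose absolute standardized residual exceeds
                  the example cutoff of 2 mentioned above. The person and item identifiers
                  and the residual values are hypothetical, and the cutoff is a parameter
                  rather than a fixed rule.

```python
CUTOFF = 2.0  # example threshold from the text; adjust as needed


def flag_aberrant(z_by_person, cutoff=CUTOFF):
    """Return (person, item, z) triples with |z| above the cutoff,
    ordered from the most to the least extreme residual."""
    flags = []
    for person, item_z in z_by_person.items():
        for item, z in item_z.items():
            if abs(z) > cutoff:
                flags.append((person, item, z))
    return sorted(flags, key=lambda f: -abs(f[2]))


# Hypothetical standardized residuals, keyed by person and then by item.
z_by_person = {
    "P01": {"Q1": 0.4, "Q2": -2.6, "Q3": 1.1},
    "P02": {"Q1": 2.1, "Q2": 0.3, "Q3": -0.2},
}

for person, item, z in flag_aberrant(z_by_person):
    direction = "above" if z > 0 else "below"
    print(f"{person} rated {item} {direction} expectation (z = {z:+.1f})")
```

                  Flagged observations are candidates for follow-up only; causes such as
                  ambiguous wording, distraction, or low motivation would need to be ruled
                  out before a residual is read as an idiosyncrasy or bias.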

                  7.  A Case Study: The 20|20 Skills™
                      The following describes how response residuals can be used to suggest to


                                                                      43 | ISI WSC 2019