Page 183 - Contributed Paper Session (CPS) - Volume 7

CPS2056 Nurul Hafizah Azizan et al.
               version of the Norwegian Function Assessment Scale (NFAS). The NFAS is a
               39‐item instrument used to evaluate the need for rehabilitation, the right
               to social security benefits, and the adjustment of work demands among
               sick‐listed persons. It was administered to two different groups of
               respondents with no major differences in demographic characteristics. The
               results of this study suggested that the 5‐point scale produced better data
               quality, in terms of both internal consistency and discriminant validity,
               than the original 4‐point scale. Although rating scales with an odd number
               of response categories, such as 5‐point and 7‐point scales, are the most
               frequently used, a previous study suggested that a 7‐point scale maximizes
               the variance and that adding scale points beyond seven does not increase
               it further (Eutsler & Lang, 2015). However, Revilla, Saris, and Krosnick
               (2014) reported that a 5‐point scale produced better data quality than
               7‐point and 11‐point scales.
                   Preston and Colman (2000) studied response scales with 2 to 11
               categories and found that reliability, validity and discriminating power
               were relatively poor for scales with four or fewer points, and significantly
               higher for five‐ to seven‐point scales. However, the study also revealed that
               test‐retest reliability tends to drop with more than 10 response categories,
               even though internal consistency does not differ significantly. These results
               are broadly similar to those of Lozano, García‐Cueto, and Muniz (2008).
               Using Monte Carlo simulated data for a 30‐item instrument, with the number
               of response alternatives ranging from two to nine and four different sample
               sizes (50, 100, 200 and 500), that study showed that both validity and
               reliability are better when the number of response alternatives is between
               four and seven, and decrease for scales with fewer than four points. By
               using three different samples of 50 participants each,
               Daher et al. (2015) administered the Malay Spiritual Well‐Being Scale (SWBS)
               with the original 6‐point scale and with modified 3‐point and 4‐point scales
               to study the impact of the number of rating scale categories on reliability
               and fit statistics under the Rasch model. The results showed that reliability
               and fit statistics were robust with the original 6‐point scale and became
               worse for both modified scales (3‐point and 4‐point). As with Osteras et al.
               (2008), the findings obtained in this
               study might also have been affected by the different groups of respondents
               involved, since using the same sample is preferable for obtaining more
               accurate results for comparison purposes. Through a comparison between
               7‐point and 11‐point rating scales used in a quality of life survey,
               Alwin (1997) reached the conclusion that questions with more categories are
               both more reliable and more valid.
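
               The kind of Monte Carlo comparison described above can be illustrated with
               a minimal sketch. It is not the procedure of Lozano et al. (2008) or of any
               other study cited here; it only shows one way to examine how internal
               consistency (Cronbach's alpha) changes with the number of response
               categories. The one‐factor model, the equal loading of 0.6, the equal‐width
               cut points on [-2, 2], and the names cronbach_alpha and simulate_alpha are
               all assumptions made for illustration.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1.0 - item_variances.sum() / total_variance)


def simulate_alpha(n_categories, n_respondents=200, n_items=30,
                   loading=0.6, n_replications=200, seed=0):
    """Mean alpha over Monte Carlo replications for a k-point rating scale.

    Assumptions (illustrative only): one-factor model with equal loadings,
    standard normal latent responses, equal-width cut points on [-2, 2].
    """
    rng = np.random.default_rng(seed)
    # k - 1 interior thresholds split the latent continuum into k categories.
    thresholds = np.linspace(-2.0, 2.0, n_categories + 1)[1:-1]
    alphas = []
    for _ in range(n_replications):
        trait = rng.standard_normal((n_respondents, 1))
        noise = rng.standard_normal((n_respondents, n_items))
        latent = loading * trait + np.sqrt(1.0 - loading ** 2) * noise
        # np.digitize maps each latent value to 0..k-1; shift to 1..k.
        observed = np.digitize(latent, thresholds) + 1
        alphas.append(cronbach_alpha(observed))
    return float(np.mean(alphas))


if __name__ == "__main__":
    # Compare scales with two to nine response categories.
    for k in range(2, 10):
        print(k, round(simulate_alpha(k), 3))
```

               Under these assumptions, the same latent responses are coarsened into
               different numbers of categories, so any difference in alpha comes from the
               rating scale alone. Validity, test‐retest reliability, and Rasch fit
               statistics would each need their own simulation design.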

               3.2 Labels and Rating Scale Format and Their Effect on Validity and
               Reliability
               Besides studying the influence of the number of response categories (ranging
               from 3 to 9) on the quality of a measurement instrument, Weng (2004) also put an

                                                                  170 | ISI WSC 2019