Page 183 - Contributed Paper Session (CPS) - Volume 7
CPS2056 Nurul Hafizah Azizan et al.
version of the Norwegian Function Assessment Scale (NFAS). The NFAS is a
39‐item instrument used to evaluate the need for rehabilitation, the right to
social security benefits and the adjustment of work demands among sick‐listed
persons. The comparison involved two different groups of respondents with no
major differences in demographic characteristics. The
results of this study suggested that the 5‐point scale produced better data
quality, in terms of both internal consistency and discriminant validity, than
the original 4‐point scale. Although rating scales with an odd number of
response categories, such as the 5‐point and 7‐point scales, are the most
frequently used, a previous study suggested that the 7‐point scale maximizes
the variance and that adding scale points beyond seven does not increase it
(Eutsler & Lang, 2015). However, Revilla, Saris, and Krosnick (2014) reported
that the 5‐point scale produced better data quality than the 7‐point and
11‐point scales.
A study conducted by Preston and Colman (2000) on response categories
ranging from 2 to 11 found that reliability, validity and discriminating power
were relatively poor for scales with four or fewer points, and significantly
higher for scales with five to seven points. However, the study also revealed
that test‐retest reliability tended to drop for scales with more than 10
response categories, even though internal consistency did not differ
significantly. These results are similar to those of a study carried out by
Lozano, García‐Cueto, and Muñiz (2008).
Based on Monte Carlo simulated data for 30 items, with the number of response
alternatives ranging from two to nine and four different sample sizes (50,
100, 200 and 500), the results showed that both validity and reliability were
better with between four and seven response alternatives and decreased with
fewer than four. Using three different samples of 50 participants each,
Daher et al. (2015) administered the Malay Spiritual Well‐Being Scale (SWBS)
with the original 6‐point scale and with modified 3‐point and 4‐point scales to
study the impact of rating scale categories on reliability and fit statistics
using the Rasch model. The results showed that reliability and fit statistics
were robust with the original 6‐point scale and deteriorated for both modified
scales (3‐point
and 4‐point). As with Osteras et al. (2008), the findings of this
study might also have been affected by the different groups of respondents
involved; using the same sample would be preferable for a more accurate
comparison. Comparing 7‐point and 11‐point rating scales used in a quality of
life survey, Alwin (1997) concluded that questions with more categories are
both more reliable and more valid.
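
To make the kind of Monte Carlo design described above more concrete, the following is a minimal illustrative sketch and not the procedure of Lozano et al. (2008) or of any other study cited here. It assumes a single normally distributed latent trait, a common item loading of 0.6, equal‐width category thresholds on the interval [-3, 3], and Cronbach's alpha as the reliability measure; all of these choices, and the function names, are hypothetical and serve only to show how the number of response categories and the sample size can be varied in a simulation.

```python
import random
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [statistics.variance([r[j] for r in rows]) for j in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def discretize(x, n_categories, lo=-3.0, hi=3.0):
    """Map a continuous score to 1..n_categories with equal-width bins on [lo, hi].

    The equal-width thresholds are an assumption made for illustration.
    """
    if x <= lo:
        return 1
    if x >= hi:
        return n_categories
    width = (hi - lo) / n_categories
    return min(int((x - lo) / width) + 1, n_categories)


def simulate_alpha(n_respondents, n_items, n_categories, loading, rng):
    """Simulate one data set and return its Cronbach's alpha.

    Each item response is a discretized mix of a shared latent trait and
    independent noise (hypothetical one-factor model with equal loadings).
    """
    noise_sd = (1 - loading ** 2) ** 0.5
    rows = []
    for _ in range(n_respondents):
        trait = rng.gauss(0, 1)
        row = [
            discretize(loading * trait + noise_sd * rng.gauss(0, 1), n_categories)
            for _ in range(n_items)
        ]
        rows.append(row)
    return cronbach_alpha(rows)


if __name__ == "__main__":
    rng = random.Random(2019)  # fixed seed so repeated runs give the same table
    for n_respondents in (50, 100, 200, 500):
        for n_categories in range(2, 10):
            # Three replications per cell keep the run short; a real study
            # would use many more.
            alphas = [
                simulate_alpha(n_respondents, 30, n_categories, 0.6, rng)
                for _ in range(3)
            ]
            print(n_respondents, n_categories, round(statistics.mean(alphas), 3))
```

Running the script prints the mean alpha for each combination of sample size and number of categories. Under these assumptions the output only illustrates the mechanics of such a design; it does not reproduce the findings of the studies reviewed here.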
3.2 Labels and Rating Scale Format and Their Effect on Validity and
Reliability
Besides studying the influence of the number of response categories (ranging
from 3 to 9) on the quality of a measurement instrument, Weng (2004) also put an
170 | ISI WSC 2019