Page 385 - Special Topic Session (STS) - Volume 3

STS551 Zamira Hasanah Zamzuri et al.
               Groundbreaking work in this field was produced by Maycock & Hall
            (1984) and Hauer & Lovell (1989), who related the accident rate to
            explanatory variables using generalized linear models. The
            most basic model used is the Poisson regression model (Miao et al. 1992; Miao
            & Lum 1993). However, traffic accident data are typically overdispersed,
            so attention shifted to the negative binomial regression model, as can be
            found in Miao (1994), Vogt (1999), Miao (2001) and Zeeger et al. (2001). The
            work by Chin & Quddus (2003) and Kweon & Kockelman (2003) shows that these
            univariate fixed effects models are inadequate because they cannot capture
            variation caused by unobserved covariates. To address this, random effects
            models that allow for unobserved heterogeneity have been introduced, as can
            be found in Anastasopoulos & Mannering (2009).
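
            The contrast between the Poisson and negative binomial models can be
            illustrated with a minimal sketch. It assumes the common mean/dispersion
            (NB2) form of the negative binomial, in which the variance is
            mu + alpha * mu^2; the function names, the value alpha = 2 and the counts
            are hypothetical and are not taken from any of the cited studies.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson count with mean mu (the variance is also mu)."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, alpha):
    """P(Y = y) for a negative binomial count in mean/dispersion (NB2) form.

    The mean is mu and the variance is mu + alpha * mu**2, so alpha > 0
    adds overdispersion relative to the Poisson model.
    """
    r = 1.0 / alpha      # shape of the gamma mixing distribution
    p = r / (r + mu)     # probability parameter of the negative binomial
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)

def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest overdispersion."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

# Hypothetical accident counts per road segment (made up for illustration).
counts = [0, 0, 1, 0, 3, 0, 7, 2, 0, 1, 0, 12, 0, 4, 1]
mu_hat = sum(counts) / len(counts)

print("dispersion index:", round(dispersion_index(counts), 2))
print("observed share of zeros:", round(counts.count(0) / len(counts), 3))
print("P(Y=0) under Poisson:", round(poisson_pmf(0, mu_hat), 3))
print("P(Y=0) under NB, alpha=2:", round(negbin_pmf(0, mu_hat, 2.0), 3))
```

            In this sketch alpha is fixed by hand rather than estimated. In a
            regression setting, mu would depend on the explanatory variables and both
            the coefficients and alpha would be estimated from the data.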
               Another stream of interest in traffic accident modelling is the handling
            of count data with extra zeros, which arise in part from underreporting.
            Underreporting, or failure to report road traffic accidents, has drawn
            increasing attention to the imprecision of the data and its impact on road
            safety policy-making and development. The World Health Report has
            highlighted the necessity for precise and comprehensive information and
            scientific methodologies with regard to the prevention and control of road
            traffic injuries (Peden, 2004). A previous study demonstrated that the exact
            number of road crashes is generally unknown, and that practically all studies
            of road crashes drawing on more than one form of data compare only two
            sources, namely police and hospital records (Elvick and Mysen, 1999). These
            datasets are therefore incomplete. Both sources fail to include those who go
            neither to the hospital nor to the police, resulting in an additional
            underestimation of underreporting. Evidence has indicated that
            community-based studies usually provide precise death and injury rates
            (Sethi et al., 2004).
               Zero-adjusted models are commonly used to handle these extra zeros.
            According to Winkelmann (2003), when the data contain extra zeros, a
            zero-inflated model may be used. Overdispersion in a zero-inflated model is
            caused by the occurrence of more zeros in the observed data than expected.
            The zero-inflated Poisson (ZIP) model is applied when the count component of
            data with extra zeros has equal mean and variance. For data with many zeros
            and long tails, the zero-inflated negative binomial (ZINB), zero-inflated
            double Poisson (ZIDP), and zero-inflated generalized Poisson (ZIGP) models
            are suggested (Phang & Loh 2013). A new distribution has very recently been
            introduced for analyzing data characterized by a large number of zeros. This
            mixed distribution, known as the NB–Lindley (NBL) distribution, is a
            mixture of the NB and Lindley distributions. This two-parameter distribution
            has interesting and sound theoretical properties in which the distribution is
            characterized by a single long-term mean that is never equal to zero and a

                                                               374 | ISI WSC 2019