Page 385 - Special Topic Session (STS) - Volume 3
STS551 Zamira Hasanah Zamzuri et al.
Groundbreaking work in this field was produced by Maycock & Hall
(1984) and Hauer & Lovell (1989), who related the accident rate to
explanatory variables using generalized linear models. The most basic model
used is the Poisson regression model (Miao et al. 1992; Miao & Lum 1993).
However, traffic accident data are typically overdispersed (the variance
exceeds the mean), so attention shifted to the negative binomial regression
model, as in Miao (1994), Vogt (1999), Miao (2001) and Zeeger et al. (2001).
Chin & Quddus (2003) and Kweon & Kockelman (2003) show that these
univariate fixed effects models are inadequate because they cannot capture
variation caused by unobserved covariates. To address this, random effects
models that allow for unobserved heterogeneity have been introduced, as in
Anastasopoulos & Mannering (2009).
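As a rough illustration of the overdispersion that motivates the move from Poisson to negative binomial regression, the sketch below compares the sample variance of a set of counts with its mean. The counts are hypothetical and are not taken from any study cited here. The function names are illustrative, and the variance function mu + alpha * mu^2 assumes the common NB2 parameterization, which the text does not specify.

```python
def dispersion_index(counts):
    """Sample variance divided by sample mean.

    A Poisson model implies a value close to 1; values well above 1
    indicate overdispersion.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_count = sum(counts) / n
    if mean_count == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    sample_var = sum((c - mean_count) ** 2 for c in counts) / (n - 1)
    return sample_var / mean_count


def nb2_variance(mu, alpha):
    """Variance under the NB2 parameterization (an assumption of this sketch).

    With alpha = 0 this reduces to the Poisson variance, mu.
    """
    if mu < 0 or alpha < 0:
        raise ValueError("mu and alpha must be non-negative")
    return mu + alpha * mu ** 2


# Hypothetical yearly accident counts for ten road segments (illustrative only).
example_counts = [0, 0, 1, 0, 3, 0, 7, 2, 0, 12]
example_index = dispersion_index(example_counts)
```

For these made-up counts the index is well above 1, which is the pattern that a Poisson model cannot reproduce and that the extra parameter alpha is meant to absorb.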
Another stream of interest in traffic accident modelling is the handling of
count data with extra zeros, to which underreporting contributes.
Underreporting, or the failure to report road traffic accidents, has drawn
increasing attention to the imprecision of data and its impact on road
safety policy-making and development. The World Health Report has
highlighted the necessity for precise and comprehensive information and
scientific methodologies with regard to the prevention and control of road
traffic injuries (Peden, 2004). A previous study demonstrated that the exact
number of road crashes is generally unknown, and that practically all studies
of road crashes drawing on more than one form of data compare only two
sources, namely police and hospital records (Elvick and Mysen, 1999). These
datasets are therefore incomplete. Both sources fail to include those who do
not go to the hospital or to the police, which leads to a further
underestimation of underreporting. Evidence has indicated that
community-based studies usually provide precise death and injury rates
(Sethi et al., 2004).
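The notion of "extra zeros" can be made concrete by comparing the observed share of zero counts with the share that a Poisson model with the same mean would predict, exp(-mean). The sketch below does this for hypothetical counts (not data from any cited study) and also gives the textbook zero-inflated Poisson probability mass function, in which a proportion pi of observations are structural zeros. The function names and the symbols lam and pi are assumptions made for illustration.

```python
import math


def zero_shares(counts):
    """Return (observed share of zeros, share expected under Poisson)."""
    n = len(counts)
    if n == 0:
        raise ValueError("counts must not be empty")
    mean_count = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean_count)  # Poisson P(Y = 0) with the same mean
    return observed, expected


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf: a zero with probability pi, else Poisson(lam)."""
    if k < 0 or lam <= 0 or not 0 <= pi <= 1:
        raise ValueError("require k >= 0, lam > 0 and 0 <= pi <= 1")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


# Hypothetical yearly accident counts for ten road segments (illustrative only).
example_counts = [0, 0, 1, 0, 3, 0, 7, 2, 0, 12]
observed_share, expected_share = zero_shares(example_counts)
```

In this made-up sample half of the counts are zero, far more than a Poisson model with the same mean would produce, which is the situation zero-adjusted models are designed for.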
Zero-adjusted models are commonly used to handle these extra zeros.
According to Winkelmann (2003), when the data contain extra zeros, a
zero-inflated model may be used. Overdispersion in a zero-inflated model is
caused by the occurrence of more zeros in the observed data than expected.
The zero-inflated Poisson (ZIP) model is applied when the count data, apart
from the extra zeros, exhibit equality of mean and variance. For data with
heavy zeros and long tails, the zero-inflated negative binomial (ZINB),
zero-inflated double Poisson (ZIDP), and zero-inflated generalized Poisson
(ZIGP) models are suggested (Phang & Loh 2013). More recently, a new
distribution has been introduced for analyzing data characterized by a large
number of zeros: the NB–Lindley (NBL) distribution, which is a mixture of the
NB and Lindley distributions. This two-parameter distribution has sound
theoretical properties, in that it is
characterized by a single long-term mean that is never equal to zero and a
374 | I S I W S C 2 0 1 9