Page 19 - Contributed Paper Session (CPS) - Volume 6
CPS1468 Takeshi Kurosawa et al.
expectation $E(Y \mid X)$. The RCC lies between 0 and 1 and, like the multiple
correlation coefficient $R$, judges a candidate model as performing well if it is close
to one. Recently, Takahashi and Kurosawa (2016) studied the performance of
the RCC in Poisson GLMs and derived explicit forms for it under the condition
that the vector of explanatory variables has a certain distribution.
Inspired by the work on RCC, Eshima (2004) and Eshima and Tabata (2007)
proposed an alternate entropy correlation coefficient (ECC) measure for GLMs,
based on the Kullback-Leibler divergence (which is equivalent to the
symmetric Kullback-Leibler distance) between the marginal distribution of the
response variable and the conditional distribution $f(y \mid x)$. They showed that
the form of ECC reduces to simply calculating the correlation between the
response variable and the canonical parameter in the exponential family.
Like the RCC and $R$, the ECC varies between 0 and 1. In this article, we study
the unstandardized version of the ECC for Poisson GLMs. We refer to this as
the measure of predictive power, or $m_{pp}$, and it is defined as the covariance (as
opposed to the correlation) between $Y$ and the canonical parameter in the
exponential family.
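For reference, the contrast between the standardized and unstandardized measures described above can be written out as follows. This is only a sketch in notation assumed here ($Y$ for the response, $\theta$ for the canonical parameter), not a formula quoted from the cited papers.

```latex
\[
\mathrm{ECC} = \frac{\mathrm{Cov}(Y,\theta)}{\sqrt{\mathrm{Var}(Y)}\,\sqrt{\mathrm{Var}(\theta)}},
\qquad
\text{unstandardized counterpart (Poisson case): } \mathrm{Cov}(Y,\theta).
\]
```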
The article is structured as follows. In Section 2, we provide formal definitions
of $m_{pp}$. We propose a new estimator of $m_{pp}$ in Section 3. Finally, we
conduct a real data analysis using the proposed estimator of $m_{pp}$ in Section 4.
2. Measure of Predictive Power $m_{pp}$
We introduce the form of $m_{pp}$ for GLMs more generally, before focusing
on the case of Poisson regression. In a GLM, conditional on a vector of
explanatory variables $X$, the responses $Y$ are assumed to be independent
observations from the exponential family of distributions. That is,
$$f(y \mid x) = \exp\left\{ a(\phi)^{-1} \{ y\theta - b(\theta) \} + c(y, \phi) \right\}$$
for known functions $a(\cdot)$, $b(\cdot)$, and $c(\cdot)$,
where $\theta$ is the canonical parameter and $\phi$ is a scale parameter which may be
known or require estimation. The conditional mean, $E(Y \mid X) = b'(\theta)$, is then
modeled as $g\{E(Y \mid X)\} = \eta = \alpha + \beta^{\top} X$ for some specified link function
$g(\cdot)$, where $\eta$ is the linear predictor, $\alpha$ is the intercept, and $\beta$ is the vector of
regression coefficients. We focus on the following goodness-of-fit measure.
Definition 2.1. The measure of predictive power for a GLM, denoted as $m_{pp}$,
is defined to be the covariance between the response and the canonical
parameter,
$$m_{pp}(Y, \theta) = \frac{\mathrm{Cov}(Y, \theta)}{a(\phi)},$$
where the value of $\theta$ is determined from the mean model, i.e., $b'(\theta) = g^{-1}(\eta)$.
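To make the definition concrete for the Poisson case, recall that a Poisson response has $b(\theta) = e^{\theta}$ and $a(\phi) = 1$, so under the log link the canonical parameter equals the linear predictor and the measure reduces to $\mathrm{Cov}(Y, \theta)$. The following is a minimal simulation sketch of a naive plug-in sample covariance. It is not the estimator proposed in this paper. The coefficient values, the standard normal covariate, the sample size, and all function names are hypothetical choices made for illustration, and the true $\theta$ is used in place of a fitted one.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_covariance(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def plug_in_mpp(y, theta, a_phi=1.0):
    """Naive plug-in value of Cov(Y, theta) / a(phi); a_phi = 1 for a Poisson response."""
    return sample_covariance(y, theta) / a_phi


# Hypothetical simulation design (assumption, not from the paper).
rng = random.Random(0)
alpha, beta = 0.5, 0.8
x = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Log link: the canonical parameter coincides with the linear predictor.
theta = [alpha + beta * xi for xi in x]
y = [poisson_draw(rng, math.exp(t)) for t in theta]

estimate = plug_in_mpp(y, theta)
print(f"plug-in Cov(Y, theta): {estimate:.3f}")
```

Under this assumed design with a standard normal covariate, the population covariance is $\beta^{2} \exp(\alpha + \beta^{2}/2) \approx 1.45$, so the printed value should fall near that figure up to sampling error.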
8 | ISI WSC 2019