Page 19 - Contributed Paper Session (CPS) - Volume 6

CPS1468 Takeshi Kurosawa et al.
            expectation $E(Y \mid X)$. The RCC lies between 0 and 1 and, like the multiple
            correlation coefficient $R$, judges a candidate model as performing well if it
            is close to one. Recently, Takahashi and Kurosawa (2016) studied the performance
            of the RCC in Poisson GLMs and derived explicit forms for it under the condition
            that the vector of explanatory variables $X$ has a certain distribution.
            Inspired by the work on RCC, Eshima (2004) and Eshima and Tabata (2007)
            proposed an alternative measure for GLMs, the entropy correlation coefficient
            (ECC), based on the Kullback-Leibler divergence (which is equivalent to the
            symmetric Kullback-Leibler distance) between the marginal distribution of the
            response variable $Y$ and the conditional distribution $f(y \mid X)$. They showed
            that the ECC reduces to the correlation between the response variable $Y$ and
            the canonical parameter $\theta$ of the exponential family. Like the RCC and
            $R$, the ECC varies between 0 and 1. In this article, we study the
            unstandardized version of the ECC for Poisson GLMs. We refer to this as the
            measure of predictive power, or $m_{pp}$, and it is defined as the covariance
            (as opposed to the correlation) between $Y$ and the canonical parameter
            $\theta$ of the exponential family.
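
            As a reading aid, the two quantities contrasted above can be written side by
            side. The symbols $Y$ and $\theta$ follow the text; Cov and Var are the usual
            covariance and variance, and writing the unstandardized quantity as a bare
            covariance assumes the Poisson case, where no extra scale factor appears.

```latex
\[
\mathrm{ECC}(Y,\theta)
  = \frac{\mathrm{Cov}(Y,\theta)}{\sqrt{\mathrm{Var}(Y)\,\mathrm{Var}(\theta)}},
\qquad
\text{unstandardized version: } \mathrm{Cov}(Y,\theta).
\]
```

            The second quantity omits the normalisation by the two standard deviations,
            so it need not be bounded above by one.
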
            The article is structured as follows. In Section 2, we provide formal definitions
            of $m_{pp}$. We propose a new estimator of $m_{pp}$ in Section 3. Finally, we
            conduct a real data analysis using the proposed estimator of $m_{pp}$ in Section 4.

            2. Measure of Predictive Power $m_{pp}$
               We introduce the form of $m_{pp}$ for GLMs more generally, before focusing
            on the case of Poisson regression. In a GLM, conditional on a vector of
            explanatory variables $X$, the responses are assumed to be independent
            observations from the exponential family of distributions. That is,
            $f(y \mid X) = \exp[a(\phi)^{-1}\{y\theta - b(\theta)\} + c(y, \phi)]$ for known
            functions $a(\cdot)$, $b(\cdot)$, and $c(\cdot)$, where $\theta$ is the canonical
            parameter and $\phi$ is a scale parameter which may be known or require
            estimation. The conditional mean, $E(Y \mid X) = b'(\theta)$, is then modeled as
            $g\{E(Y \mid X)\} = \eta = \alpha + \beta^\top X$ for some specified link function
            $g(\cdot)$, where $\eta$ is the linear predictor, $\alpha$ is the intercept, and
            $\beta$ is the vector of regression coefficients. We focus on the following
            goodness-of-fit measure.
               Definition 2.1. The measure of predictive power for a GLM, denoted as
            $m_{pp}$, is defined to be the covariance between the response $Y$ and the
            canonical parameter $\theta$,

            $$ m_{pp}(Y, \theta) = \frac{\mathrm{Cov}(Y, \theta)}{a(\phi)}, $$

            where the value of $\theta$ is determined from the mean model, i.e.,
            $b'(\theta) = g^{-1}(\eta)$.
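
            To make Definition 2.1 concrete, the following minimal sketch computes a naive
            plug-in version of the measure for a Poisson model with log link on simulated
            data. For the Poisson distribution the scale function equals one and the
            cumulant function is exp(θ), so under the canonical log link θ coincides with
            the linear predictor and the measure reduces to Cov(Y, θ). The function name
            `mpp_poisson_plugin`, the coefficient values, and the standard normal covariate
            are assumptions made for illustration only. The sample covariance used here is
            not the estimator proposed in this article, and the true θ is used instead of
            a fitted one. The closed-form comparison value follows from a standard Gaussian
            calculation that holds only under these simulation assumptions.

```python
import numpy as np


def mpp_poisson_plugin(y, theta):
    """Naive plug-in: sample covariance between responses and canonical parameters.

    Assumes a Poisson model, for which the scale function a(phi) equals 1,
    so no further division is needed. Hypothetical helper, not from the article.
    """
    y = np.asarray(y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # np.cov returns the 2x2 sample covariance matrix; take the off-diagonal entry.
    return float(np.cov(y, theta, ddof=1)[0, 1])


# Simulated data under assumed (hypothetical) coefficients.
rng = np.random.default_rng(0)
n = 200_000
alpha, beta = 0.5, 0.3            # assumed intercept and slope
x = rng.normal(size=n)            # assumed covariate distribution: N(0, 1)
theta = alpha + beta * x          # log link: canonical parameter = linear predictor
y = rng.poisson(np.exp(theta))    # Poisson responses with mean exp(theta)

estimate = mpp_poisson_plugin(y, theta)

# For X ~ N(0, 1): Cov(Y, theta) = beta^2 * exp(alpha + beta^2 / 2).
closed_form = beta**2 * np.exp(alpha + beta**2 / 2)

print(f"plug-in estimate: {estimate:.4f}")
print(f"closed form:      {closed_form:.4f}")
```

            In practice θ would be replaced by fitted values from a Poisson regression;
            the true values are used here only to keep the sketch free of a model-fitting
            dependency.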





                                                                 8 | ISI WSC 2019