Page 196 - Contributed Paper Session (CPS) - Volume 2
CPS1823 Ishapathik D. et al.
Regression for doubly inflated multivariate
Poisson distributions
Ishapathik Das¹, Sumen Sen², N Rao Chaganty³, Pooja Sengupta⁴
1 Department of Mathematics, Indian Institute of Technology Tirupati, Tirupati, India
2 Department of Mathematics and Statistics, Austin Peay State University, Clarksville, TN, USA
3 Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA, USA
4 International Management Institute, Kolkata, India
Abstract
Dependent multivariate count data occur in several research studies. These
data can be modeled by a multivariate Poisson or negative binomial
distribution constructed using copulas. However, when some of the counts are
inflated, that is, when the number of observations in some cells is much larger
than in other cells, the copula-based multivariate Poisson (or negative binomial)
distribution may not fit well and is not an appropriate statistical model for
the data. There is a need to modify or adjust the multivariate distribution to
account for the inflated frequencies. In this article, we consider the situation
where the frequencies of two cells are higher than those of the other cells, and
develop a doubly inflated multivariate Poisson distribution function using a
multivariate Gaussian copula. We also discuss procedures for regression on
covariates for the doubly inflated multivariate count data. To illustrate the
proposed methodology, we present a real data set containing bivariate count
observations with inflation in two cells. Several models and linear predictors
with log link functions are considered, and we discuss maximum likelihood
estimation of the unknown model parameters.
Keywords
Multivariate Poisson; Gaussian copula; inflated count
1. Introduction
Count data are ubiquitous in scientific investigations. They may be
univariate or multivariate. Data with multivariate count responses occur
in many contemporary applications, such as purchases of different products,
different types of faults in a manufacturing process, and sports data. In practice,
bivariate count data are encountered more often than multivariate count data,
and bivariate Poisson models are appropriate for these data to account for the
correlation between the pairs. Although the Poisson distribution has been
widely accepted as a primary modeling approach for the distribution of the
number of event occurrences, several researchers (see, for example, Lee et al.
(2009) and the references therein) have shown the existence of a correlation
between bivariate counts; this has been ignored in most modeling approaches
185 | I S I W S C 2 0 1 9