Page 220 - Special Topic Session (STS) - Volume 2
STS486 R. Ayesha A. et al.
Dirichlet-multinomial (DM) regression (Guimarães and Lindrooth, 2007) is a
multinomial logistic regression model developed in econometrics that
accounts for over-dispersion. Under a consumer-behaviour framework, we
assume that a pollinator faced with a number of choices (plant species in the
network) assigns a level of utility to each plant species and then selects the
one with the maximum utility (McFadden, 1974; Guimarães and Lindrooth,
2007). These utilities may be modelled as a function of plant species
attributes/traits, pollinator species characteristics/traits, or as interactions
between the two (e.g. linkage rules). Therefore, the probability of a specific
plant-pollinator pair interacting can also be modelled as a function of these
traits. Since modelling is done at the species level, this model corresponds to
a grouped DM regression (Mosimann, 1962).
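The utility-maximization argument above can be illustrated with a minimal sketch. It assumes the standard conditional-logit form (McFadden, 1974), in which each plant species' systematic utility is a linear function of covariates and the choice probabilities are the softmax of those utilities. The function names, the covariate matrix `X`, and the coefficients `beta` are hypothetical and are not taken from the authors' R routines.

```python
import math

def linear_utilities(X, beta):
    """Systematic utility of each plant species: a linear function of its covariates."""
    return [sum(x_k * b_k for x_k, b_k in zip(row, beta)) for row in X]

def choice_probabilities(utilities):
    """Conditional-logit choice probabilities: the softmax of the utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: 3 plant species described by 2 trait covariates.
X = [[1.0, 0.0],
     [0.5, 1.0],
     [0.0, 2.0]]
beta = [0.8, -0.3]  # assumed regression coefficients, for illustration only

probs = choice_probabilities(linear_utilities(X, beta))
print(probs)  # one interaction probability per plant species
```

In this sketch the species with the highest utility receives the highest interaction probability, and the probabilities sum to one across the choice set.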
Dispersion may be constant for all pollinator species or may be a function
of pollinator-specific covariates such that dispersion varies across pollinators.
Estimates of the regression coefficients quantify the relative contribution of
each trait/linkage rule to the interaction probabilities, while the dispersion
parameter accounts for heterogeneity in the data that is not explained by the
covariates.
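To make the role of the dispersion parameter concrete, the following sketch evaluates a Dirichlet-multinomial log-probability. It assumes the Dirichlet parameters are written as `alpha_j = precision * p_j`, where `p_j` are the mean interaction probabilities and `precision` is the sum of the Dirichlet parameters. This is one common parameterization and may differ from the one the authors adopt in Section 2. All names are hypothetical.

```python
import math
from itertools import product

def dm_log_pmf(counts, probs, precision):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    Assumed parameterization: alpha_j = precision * probs[j], with
    precision > 0 and all probs > 0. Smaller precision means more
    over-dispersion; as precision grows the model approaches a multinomial.
    """
    n = sum(counts)
    alphas = [precision * p for p in probs]
    log_p = math.lgamma(n + 1) - sum(math.lgamma(y + 1) for y in counts)
    log_p += math.lgamma(precision) - math.lgamma(n + precision)
    log_p += sum(math.lgamma(y + a) - math.lgamma(a)
                 for y, a in zip(counts, alphas))
    return log_p

def variance_inflation(n, precision):
    """Factor by which DM count variances exceed multinomial variances."""
    return (n + precision) / (1.0 + precision)

# Hypothetical example: one pollinator, 4 visits spread over 3 plant species.
probs = [0.5, 0.3, 0.2]
precision = 2.0
n = 4

# Sanity check: the probabilities of all possible count vectors sum to one.
total_mass = sum(
    math.exp(dm_log_pmf(list(c), probs, precision))
    for c in product(range(n + 1), repeat=3) if sum(c) == n
)
print(total_mass, variance_inflation(n, precision))
```

Under these assumptions, a pollinator that makes all four visits to the first plant species has probability 0.2, compared with 0.5^4 = 0.0625 under a plain multinomial with the same mean probabilities. That extra mass on extreme outcomes is the heterogeneity the dispersion parameter absorbs.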
However, it is only within the past few years that field ecologists have
collected detailed species data to better understand the mechanisms that
drive pollination. Investigators may face a large number of candidate
covariates for the model. Variable selection techniques can reduce the
size and complexity of the model, paring the features down to a subset that
predicts plant-pollinator interactions while avoiding overfitting and
improving interpretability. Here we present a new regularized grouped DM
regression model using standard and adaptive lasso methods. Tuning
parameters are selected using an information criterion, and optimization is
performed with the fast iterative shrinkage-thresholding algorithm (FISTA)
(Beck and Teboulle, 2009). All proposed methods are evaluated on simulated
and empirical data sets, and all implementations of the standard and
regularized grouped DM regression models are publicly available as routines
in R (R Core Team, 2017).
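The FISTA iteration with a lasso penalty can be sketched as follows. To keep the example short, it swaps in a least-squares loss for the grouped DM negative log-likelihood, so it illustrates the algorithm (Beck and Teboulle, 2009) rather than the authors' implementation. The function names and the toy data are hypothetical.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z towards zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def fista_lasso(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * ||b||_1 with FISTA.

    Stand-in objective: a least-squares loss replaces the DM likelihood.
    Assumes X has at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    # The squared Frobenius norm bounds the largest eigenvalue of X^T X,
    # so it is a valid (if conservative) Lipschitz constant for the gradient.
    lipschitz = sum(v * v for row in X for v in row)
    beta = [0.0] * p
    z = beta[:]          # extrapolated point
    t = 1.0              # momentum sequence
    for _ in range(n_iter):
        resid = [sum(X[i][j] * z[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step at z followed by soft-thresholding (proximal step).
        beta_new = [soft_threshold(z[j] - grad[j] / lipschitz, lam / lipschitz)
                    for j in range(p)]
        # Momentum update.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [beta_new[j] + ((t - 1.0) / t_new) * (beta_new[j] - beta[j])
             for j in range(p)]
        beta, t = beta_new, t_new
    return beta

# Toy check: with an identity design the lasso solution is the
# soft-thresholded response, here (2, 0, -1).
X = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [3.0, 0.5, -2.0]
beta_hat = fista_lasso(X, y, lam=1.0)
print(beta_hat)
```

An adaptive lasso variant would replace the single threshold `lam / lipschitz` with coefficient-specific thresholds built from weights, and the tuning parameter `lam` would be chosen over a grid by an information criterion.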
Section 2 briefly describes the DM parameterizations for both the
unpenalized and penalized grouped DM regression models. Section 3
investigates the performance of lasso-type methods for the regularized DM
regression model via simulation, and Section 4 presents an analysis of an
empirical plant-pollinator network using our proposed lasso approach.
Discussion and conclusions are provided in Section 5.
2. Methodology
DM regression for grouped data assumes that individuals within a group
share common characteristics and are faced with the same choice set; hence,
209 | I S I W S C 2 0 1 9