Page 220 - Special Topic Session (STS) - Volume 2

STS486 R. Ayesha A. et al.
                     Dirichlet-multinomial regression (Guimarães and Lindrooth, 2007) is a
                  multinomial logistic regression model developed in econometrics that
                  accounts for over-dispersion. Under a consumer-behaviour framework, we
                  assume that a pollinator faced with a number of choices (plant species in the
                  network) assigns a level of utility to each plant species and then selects the
                  one with the maximum utility (McFadden, 1974; Guimarães and Lindrooth,
                  2007). These utilities may be modelled as a function of plant species
                  attributes/traits, pollinator species characteristics/traits, or as interactions
                  between the two (e.g. linkage rules). Therefore, the probability of a specific
                  plant-pollinator pair interacting can also be modelled as a function of these
                  traits. Since modelling is done at the species level, this model corresponds to
                  a grouped DM regression (Mosimann, 1962).
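
                  The utility-maximisation step above can be illustrated with a minimal
                  sketch. It assumes a utility that is linear in the covariates and the
                  standard logit result that, with independent Gumbel-distributed utility
                  errors, choosing the maximum-utility option yields softmax choice
                  probabilities. The trait values, the coefficients and the function names
                  are hypothetical and are not taken from the paper.

```python
import math

def linear_utilities(trait_rows, beta):
    """Deterministic utility of each plant species: x_j' beta (assumed linear form)."""
    return [sum(x * b for x, b in zip(row, beta)) for row in trait_rows]

def choice_probabilities(utilities):
    """Logit (softmax) probabilities implied by maximum-utility choice."""
    m = max(utilities)                          # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical example: 3 plant species described by 2 trait covariates.
plant_traits = [[1.0, 0.2], [0.5, 1.0], [0.0, 0.0]]
beta = [0.8, -0.4]                              # hypothetical regression coefficients
probs = choice_probabilities(linear_utilities(plant_traits, beta))
```

                  In this toy example the second and third species have equal utility and
                  therefore equal interaction probability, while the first species is
                  preferred.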
                     Dispersion may be constant for all pollinator species or may be a function
                  of pollinator-specific covariates such that dispersion varies across pollinators.
                  Estimates of the regression coefficients quantify the relative contribution of
                  each trait/linkage rule to the interaction probabilities, while the dispersion
                  parameter accounts for heterogeneity in the data that is not explained by the
                  covariates.
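
                  To make the role of the dispersion parameter concrete, the following
                  sketch evaluates the Dirichlet-multinomial log-probability of one
                  pollinator's visit counts. It assumes one common parameterization,
                  alpha_j = a0 * p_j, where p_j are the interaction probabilities and the
                  precision a0 is modelled as exp(z' gamma) for pollinator covariates z.
                  This parameterization and all names are assumptions made for
                  illustration and may differ from the one used in Section 2.

```python
import math

def dm_alpha(probs, z, gamma):
    """DM parameters alpha_j = a0 * p_j with precision a0 = exp(z' gamma) (assumed form)."""
    a0 = math.exp(sum(zk * gk for zk, gk in zip(z, gamma)))
    return [a0 * p for p in probs]

def dm_logpmf(counts, alpha):
    """Log-probability of a count vector under the Dirichlet-multinomial."""
    n = sum(counts)
    a0 = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    for y, a in zip(counts, alpha):
        out += math.lgamma(y + a) - math.lgamma(a) - math.lgamma(y + 1)
    return out

def dm_variance_inflation(n, a0):
    """Ratio of DM variance to multinomial variance: (n + a0) / (1 + a0)."""
    return (n + a0) / (1.0 + a0)

# Intercept-only z gives a constant dispersion shared by all pollinators.
alpha = dm_alpha([0.5, 0.25, 0.25], z=[1.0], gamma=[math.log(4.0)])
loglik = dm_logpmf([1, 0, 0], alpha)
```

                  A small precision a0 implies strong over-dispersion, whereas the
                  variance ratio approaches 1 (the multinomial case) as a0 grows.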
                     However, it is only within the past few years that field ecologists have
                  collected detailed species data to better understand the mechanisms that
                  drive pollination. Investigators may therefore face a large number of candidate
                  covariates for the model. Variable selection techniques can reduce the
                  size and complexity of the model, paring the features down to a subset
                  predictive of plant-pollinator interaction while avoiding overfitting and
                  improving interpretability. Here we present a new regularized grouped DM
                  regression model using standard and adaptive lasso methods. Tuning
                  parameters are selected using an information criterion, while optimization is
                  achieved via the fast iterative shrinkage-thresholding algorithm (FISTA) (Beck
                  and Teboulle, 2009). All proposed methods are evaluated on simulated
                  and empirical data sets, and implementations of the standard and
                  regularized grouped DM regression models are publicly available as routines
                  in R (R Core Team, 2017).
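
                  The optimization step can be sketched as follows. This is a generic
                  FISTA loop with a weighted l1 (lasso) penalty and a fixed step size. For
                  brevity it is applied to a toy quadratic loss with a known solution
                  rather than to the penalized DM likelihood, and it is written in Python
                  rather than R. The function names, the toy loss and the choice of
                  adaptive weights (inverse absolute value of an initial estimate) are
                  assumptions made for illustration.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fista_lasso(grad, dim, lam, step, weights=None, n_iter=200):
    """Minimise f(b) + lam * sum_k w_k * |b_k| given the gradient of a smooth f.

    weights=None gives the standard lasso (all w_k = 1); coefficient-specific
    weights give an adaptive lasso. step should not exceed 1/L, where L is
    the Lipschitz constant of grad.
    """
    w = weights or [1.0] * dim
    beta = [0.0] * dim
    y = beta[:]                                  # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        g = grad(y)
        new_beta = [soft_threshold(y[k] - step * g[k], step * lam * w[k])
                    for k in range(dim)]
        new_t = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        mom = (t - 1.0) / new_t
        y = [new_beta[k] + mom * (new_beta[k] - beta[k]) for k in range(dim)]
        beta, t = new_beta, new_t
    return beta

# Toy smooth loss f(b) = 0.5 * ||b - c||^2, whose gradient is b - c (L = 1).
c = [3.0, 0.5, -2.0]
grad = lambda b: [bk - ck for bk, ck in zip(b, c)]

standard = fista_lasso(grad, dim=3, lam=1.0, step=1.0)
adaptive = fista_lasso(grad, dim=3, lam=1.0, step=1.0,
                       weights=[1.0 / abs(ck) for ck in c])
```

                  In this toy problem both penalties set the weak second coefficient
                  exactly to zero, while the adaptive weights shrink the large
                  coefficients less than the standard lasso does.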
                     Section 2 provides a brief description of the DM parameterizations for both
                  the unpenalized and penalized grouped DM regression models. Section 3
                  investigates the performance of lasso-type methods for the regularized DM
                  regression model via simulation, and Section 4 shows the results of the analysis
                  of an empirical plant-pollinator network using our proposed lasso approach.
                  Discussion and conclusions are provided in Section 5.

                  2.  Methodology
                     DM regression for grouped data assumes that individuals within a group
                  share common characteristics and are faced with the same choice set; hence,

                                                                     209 | ISI WSC 2019