Page 338 - Invited Paper Session (IPS) - Volume 2
IPS273 Tomoki Tokuda et al.
based on the following extensions of the co-clustering model. First, we consider
multiple views of co-clustering structure (Fig.1C), where a univariate
distribution is fitted to each cluster block (Shan & Banerjee, 2008). Second, for
each cluster block, the proposed method simultaneously handles an
ensemble of several distribution families, such as the Gaussian,
multinomial, and Poisson distributions. The first extension enables
our model to fit high-dimensional data, while the second enables it to fit data
that include different types of features (numerical, categorical, and integer).
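To make the idea of fitting a family-specific univariate distribution to each cluster block concrete, the following minimal sketch scores one block of values under a Gaussian, Poisson, or categorical (single-trial multinomial) model. It is an illustration only: it uses maximum-likelihood plug-in estimates rather than the Bayesian inference of the proposed method, and the function names and the `LOGLIK` lookup table are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def gaussian_loglik(xs):
    """Log-likelihood of numerical values under a Gaussian at its ML estimates."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # ML variance, assumed > 0
    # At the ML estimates the quadratic term sums to n / 2.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def poisson_loglik(xs):
    """Log-likelihood of non-negative integers under a Poisson at its ML rate."""
    lam = sum(xs) / len(xs)  # ML rate, assumed > 0
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)


def categorical_loglik(xs):
    """Log-likelihood of category labels under ML category frequencies."""
    n = len(xs)
    return sum(c * math.log(c / n) for c in Counter(xs).values())


# Hypothetical mapping from feature type to distribution family (assumption).
LOGLIK = {
    "numerical": gaussian_loglik,
    "integer": poisson_loglik,
    "categorical": categorical_loglik,
}


def block_loglik(values, kind):
    """Score one cluster block with the family matching its feature type."""
    return LOGLIK[kind](values)
```

For example, `block_loglik([2, 2], "integer")` scores a two-element integer block under a Poisson model, while `block_loglik(["a", "a", "b", "b"], "categorical")` scores a categorical block.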
2. Methodology
Our method is based on a Bayesian approach, which models the view and
co-clustering structures as well as the instances in each cluster block. We
outline the relevant parameters and the inference method in the following
sections (for further details, see Tokuda et al., 2017).
View and co-clustering structure: We denote the data matrix as $X$, with rows
corresponding to subjects (or objects) and columns to features. To infer the
multiple co-clustering structures shown in Fig.1C, we introduce latent variables
that label feature and subject memberships. First, for both the view and
co-clustering structures, we introduce a $V \times G$ latent matrix $Y_j$ for
feature $j$, where $V$ and $G$ are the numbers of views and feature clusters,
respectively. In this notation, each view membership and feature cluster
membership is coded as 0 (false) or 1 (true). For instance,
$Y_j = ((0, 0, 0, 0)^T, (0, 0, 1, 0)^T, (0, 0, 0, 0)^T)$ denotes that
feature $j$ belongs to view 2 and to feature cluster 3 in that view (the
superscript $T$ denotes the vector transpose; each vector gives the feature
cluster memberships for one view). As the definition of membership implies,
exactly one element of $Y_j$ is 1, and all the others are 0.
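The membership encoding described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the helper names `make_membership` and `read_membership` are hypothetical, and storing one row per view (so the array has shape views by feature clusters) is an orientation assumed here for convenience.

```python
import numpy as np


def make_membership(view, cluster, n_views, n_clusters):
    """Build a 0/1 membership array with a single 1 at (view, cluster).

    Indices are 1-based, matching the text; rows index views (assumption).
    """
    Y = np.zeros((n_views, n_clusters), dtype=int)
    Y[view - 1, cluster - 1] = 1
    return Y


def read_membership(Y):
    """Recover the 1-based (view, feature cluster) pair from a membership array."""
    if not np.isin(Y, (0, 1)).all() or Y.sum() != 1:
        raise ValueError("exactly one entry must be 1 and all others 0")
    v, g = np.argwhere(Y == 1)[0]
    return int(v) + 1, int(g) + 1


# The example from the text: 3 views, 4 feature clusters, view 2, cluster 3.
Y_j = make_membership(view=2, cluster=3, n_views=3, n_clusters=4)
```

Here `Y_j` has the rows `[0, 0, 0, 0]`, `[0, 0, 1, 0]`, `[0, 0, 0, 0]`, one per view, and `read_membership(Y_j)` returns the pair `(2, 3)`. The validity check enforces the constraint that exactly one element is 1.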
325 | I S I W S C 2 0 1 9