Page 340 - Invited Paper Session (IPS) - Volume 2
IPS273 Tomoki Tokuda et al.
distribution $p(Y, Z, \Theta)$ for these parameters and the likelihood
$p(X \mid Y, Z, \Theta)$ in Eq.(1), the posterior is given by
$$p(Y, Z, \Theta \mid X) \propto p(X \mid Y, Z, \Theta)\, p(Y, Z, \Theta). \tag{2}$$
Inferences: The MAP (maximum a posteriori) estimates of $Y$, $Z$, and $\Theta$,
which maximize the left-hand side of Eq.(2), provide an optimal solution for
estimating the multiple co-clustering structures and the probability density
distributions in each cluster block. However, direct maximization of
$p(Y, Z, \Theta \mid X)$ is rather difficult because of the complex
relationships among the parameters. Hence, we approximate it by means of
variational inference (Bishop, 2006), denoting the approximate density function
by $q(Y, Z, \Theta)$, in which we assume independence among these parameters.
In this framework, we aim to find the best approximation in terms of the
Kullback-Leibler (KL) divergence between $q(Y, Z, \Theta)$ and
$p(Y, Z, \Theta \mid X)$, which reduces the problem to optimizing the
hyperparameters of these distributions. The optimization algorithm proceeds as
follows. First, the data matrix is randomly partitioned into views in which
subject- and feature-cluster memberships are randomly allocated to subjects and
features. From this initial configuration, the hyperparameters are iteratively
updated so that the KL divergence decreases monotonically. When the first round
of updating is completed, the next round is carried out, and this is repeated
until the KL divergence converges. A visual demonstration using artificial data
is shown in Fig.2.
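The round-based optimization described above can be illustrated with a minimal
sketch. This is not the update scheme of the multiple co-clustering model, whose
update equations are not given in this section. It is a toy case of mean-field
variational inference under stated assumptions: a small joint table `p` over two
discrete variables stands in for the posterior, and it is approximated by a
factorized density `q_a(a) * q_b(b)`. All names (`mean_field`, `kl_divergence`,
`random_distribution`, `softmax`) are hypothetical. Starting from a random
configuration, each round updates one factor at a time, and the KL divergence is
recorded until it stops decreasing.

```python
import math
import random


def kl_divergence(q_a, q_b, p):
    """KL(q || p) for q(a, b) = q_a[a] * q_b[b] and a joint probability table p."""
    total = 0.0
    for i, qa in enumerate(q_a):
        for j, qb in enumerate(q_b):
            q = qa * qb
            if q > 0.0:
                total += q * (math.log(q) - math.log(p[i][j]))
    return total


def softmax(log_w):
    """Turn unnormalized log-weights into a probability vector."""
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    s = sum(w)
    return [v / s for v in w]


def random_distribution(n, rng):
    """Random starting point: positive weights normalized to sum to one."""
    w = [rng.random() + 0.05 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]


def mean_field(p, rng, tol=1e-12, max_rounds=200):
    """Repeat rounds of factor-wise updates until the KL divergence converges."""
    n_a, n_b = len(p), len(p[0])
    log_p = [[math.log(v) for v in row] for row in p]
    # Random initial configuration (toy stand-in for random membership allocation).
    q_a = random_distribution(n_a, rng)
    q_b = random_distribution(n_b, rng)
    trace = [kl_divergence(q_a, q_b, p)]
    for _ in range(max_rounds):
        # One round: update each factor in turn while the other is held fixed.
        # ln q_a(a) = E_{q_b}[ln p(a, b)] + const
        q_a = softmax([sum(q_b[j] * log_p[i][j] for j in range(n_b))
                       for i in range(n_a)])
        # ln q_b(b) = E_{q_a}[ln p(a, b)] + const
        q_b = softmax([sum(q_a[i] * log_p[i][j] for i in range(n_a))
                       for j in range(n_b)])
        trace.append(kl_divergence(q_a, q_b, p))
        if trace[-2] - trace[-1] < tol:
            break
    return q_a, q_b, trace


rng = random.Random(0)
raw = [[rng.random() + 0.05 for _ in range(4)] for _ in range(3)]
total = sum(map(sum, raw))
p = [[v / total for v in row] for row in raw]  # strictly positive, sums to one
q_a, q_b, trace = mean_field(p, rng)
```

Each factor update is the exact minimizer of the KL divergence with the other
factor held fixed, so the values in `trace` cannot increase from one round to
the next. The result depends on the random starting point, and in general only
a local optimum is reached.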
3. Results
As an application to real data, we present the results of an analysis of a
dataset on MDD (Major Depressive Disorder) from our recent study (Tokuda et al.,
2018). The objective of the study is to identify subtypes of MDD in a
data-driven manner.
Background: It is well known that MDD is heterogeneous in such
characteristics as clinical presentation, progression, treatment response,
genetics, and neurobiology. This heterogeneity hampers progress in
identifying the cause of MDD and its effective treatment. To overcome this
problem, several studies have so far been conducted to identify subtypes of
MDD in a data-driven manner, relying on clinical questionnaires. However, the
results of these studies either conflict with one another or simply identify
clusters related to depression severity, which does not provide conclusive
evidence for subtypes of depressive symptoms. Furthermore, these studies are
based on clinical questionnaires and do not consider biological substrates. We
address this problem by means of the multiple co-clustering method. We
apply this method to a combination of several dataset modalities, such as
functional connectivity (FC) data in resting state fMRI (i.e., a subject stays at
327 | ISI WSC 2019