
distribution p(Y, Z, Θ) for these parameters and the likelihood p(X|Y, Z, Θ) in Eq.(1), the posterior is given by

    p(Y, Z, Θ|X) ∝ p(X|Y, Z, Θ) p(Y, Z, Θ).    (2)
Inferences: The MAP (Maximum a posteriori) estimates of Y, Z, and Θ, which maximize the left-hand side of Eq.(2), provide an optimal solution for the estimation of multiple co-clustering structures and of the probability density distributions in each cluster block. However, direct maximization of p(Y, Z, Θ|X) is rather difficult because of the complexity of the relationships among the parameters. Hence, we approximate it by means of variational inference (Bishop, 2006), denoting the approximate density function as q(Y, Z, Θ), in which we assume independence among these parameters. In this framework, we aim to find the best approximation in terms of the Kullback-Leibler (KL) divergence between q(Y, Z, Θ) and p(Y, Z, Θ|X), which reduces the problem to the optimization of the hyperparameters of these distributions. The optimization algorithm proceeds as follows. First, the data matrix is randomly partitioned into views, within which subject- and feature-cluster memberships are randomly allocated to subjects and features. From this initial configuration, the hyperparameters are iteratively updated so that the KL divergence decreases monotonically. When the first round of updating is completed, the next round is carried out, and this process is repeated over a number of rounds until the KL divergence converges. A visual demonstration using artificial data is shown in Fig. 2.
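To make this iterative scheme concrete, the following Python sketch reproduces the same restart-and-update pattern in a much simpler setting: a single-view Gaussian block model with unit variance and point-estimate block means, rather than the full Bayesian multiple co-clustering model above. The function name fit_block_clusters, the cluster numbers, and the number of restarts are illustrative assumptions, not values from the paper; raising the variational bound computed at each step corresponds to lowering the KL divergence mentioned in the text.

import numpy as np

def softmax(logits):
    # Row-wise softmax, stabilised by subtracting the row maximum.
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def fit_block_clusters(X, K, L, n_restarts=5, n_iters=200, tol=1e-8, seed=0):
    # Mean-field style co-clustering of an n x d matrix X into K row clusters
    # and L column clusters; keeps the restart with the highest variational bound.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    best_bound, best = -np.inf, None
    for _ in range(n_restarts):
        # Random initial soft memberships (the "random partition" step in the text).
        r = rng.dirichlet(np.ones(K), size=n)      # n x K row responsibilities
        c = rng.dirichlet(np.ones(L), size=d)      # d x L column responsibilities
        bound_old = -np.inf
        for _ in range(n_iters):
            rk, cl = r.sum(axis=0), c.sum(axis=0)
            # Closed-form update of the block means given current memberships.
            mu = (r.T @ X @ c) / np.maximum(np.outer(rk, cl), 1e-12)
            # Update row responsibilities from the expected squared error per row cluster.
            sq_r = (X**2 @ c).sum(axis=1, keepdims=True) - 2 * (X @ c) @ mu.T + mu**2 @ cl
            r = softmax(-0.5 * sq_r)
            # Update column responsibilities symmetrically, using the new row memberships.
            rk = r.sum(axis=0)
            sq_c = ((X**2).T @ r).sum(axis=1, keepdims=True) - 2 * (X.T @ r) @ mu + (mu**2).T @ rk
            c = softmax(-0.5 * sq_c)
            # Variational bound (up to constants): expected log-likelihood plus entropies.
            bound = (-0.5 * np.sum(c * sq_c)
                     - np.sum(r * np.log(r + 1e-12))
                     - np.sum(c * np.log(c + 1e-12)))
            if bound - bound_old < tol:            # this round of updating has converged
                break
            bound_old = bound
        if bound > best_bound:                     # keep the best restart
            best_bound = bound
            best = (r.argmax(axis=1), c.argmax(axis=1), mu)
    return best, best_bound

# Toy usage: recover a 2 x 2 block structure from noisy data.
rng = np.random.default_rng(1)
means = np.array([[0.0, 3.0], [3.0, 0.0]])
rows_true, cols_true = np.repeat([0, 1], 20), np.repeat([0, 1], 15)
X = means[np.ix_(rows_true, cols_true)] + rng.normal(scale=0.5, size=(40, 30))
(rows, cols, mu), bound = fit_block_clusters(X, K=2, L=2)
print(rows, cols, sep="\n")

Across restarts only the run with the highest bound is retained, mirroring the repeated rounds of updating from random initial configurations described above; the full model in the paper additionally partitions features into views and places priors on all parameters.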

                  3. Results
As an application to real data, we present the results of an analysis of an MDD (Major Depressive Disorder) dataset from our recent study (Tokuda et al., 2018). The objective of the study is to identify subtypes of MDD in a data-driven manner.
Background: It is well known that MDD is heterogeneous in characteristics such as clinical presentation, progression, treatment response, genetics, and neurobiology. This heterogeneity hampers progress in identifying the cause of MDD and its effective treatment. To overcome this problem, several studies have so far been conducted to identify subtypes of MDD in a data-driven manner, relying on clinical questionnaires. However, the results of these studies either conflict or simply identify clusters related to depression severity, which does not provide conclusive evidence for subtypes of depressive symptoms. Furthermore, these studies are based on clinical questionnaires without considering biological substrates. We address this question by means of the multiple co-clustering method. We apply this method to a combination of several dataset modalities, such as functional connectivity (FC) data in resting-state fMRI (i.e., a subject stays at

