Page 349 - Invited Paper Session (IPS) - Volume 2
P. 349

IPS273 Charlotte L. et al.
                and columns marginal distributions. For  more details, see (Laclau et al,
                2017).

                3.2 From Optimal Transport to Matrix Factorization
                    Now,  let  us  further  consider  the  objective  function  related  to  the
                optimal  transportation  problem  with  Kullback-Leibler  regularization
                presented in above and defined as:

                                 ℒ(; ; , ) = 〈, 〉 + (‖ ).
                                                                   ⊤
                                                      

                    The minimization of ℒ is usually performed w.r.t. the coupling   with
                a and b being fixed vectors from size- and size- simplexes, respectively.
                However,  for  the  co-clustering  problem  introduced  above,  we  are
                interested in finding the scaling vectors u and v for a given data matrix A
                directly and thus propose to consider the following minimization problem:

                                     (, ) ≔ arg min ℒ (; ; , ),
                                                 (,)

                    where contrary to the original problem the minimization is performed
                over  u  and  v.  We  note  that  this  problem  simply  amounts  to  the
                minimization of KL(‖ ) with  =  that does not necessarily represent
                                        ⊤
                some joint distribution. Instead, this problem aims to find two vectors u
                and v that have sufficient entropy or small enough mutual information
                w.r.t. the matrix of interactions A. In the general case of a matrix A ∈ ℝ ×
                with  potentially  missing  values,  the  calculation  of  u  and  v  amounts  to
                solving the following matrix factorization problem:

                                          ,
                                                         ,
                                 min      ∑      ,  log (  ) −  ,  +   .
                                                                        
                                   
                             (,)∈ℝ ×ℝ              
                                                          
                                          ,=1
                                         ,  

                    The  connection  between  optimal  transport  and  matrix  factorization
                can  also  be  shown  using  a  different  point  of  view.  When  α  and  β  are
                defined  as  uniformly  weighted  sums  of  Diracs,  the  Kullback-Leibler
                divergence term in (3) becomes:

                                                           ⊤
                                     KL(‖ ) = KL(‖ /).
                                              ⊤

                    One may further note that as λ grows, the solution of the regularized
                optimal transport becomes closer to the uniform distribution and thus its
                factorization becomes:

                                            ⋆
                                ⊤
                               / ~  (, ) = diag() −M/  diag()
                                           
                                      →∞
                                                   
                                       (1/) (1/) ≃  −M/ .

                                                               336 | I S I   W S C   2 0 1 9
   344   345   346   347   348   349   350   351   352   353   354