Page 351 - Invited Paper Session (IPS) - Volume 2
P. 351

IPS273 Charlotte L. et al.













                   Fig. 1: Comparison of the vectors obtained with Matrix Factorization
                  (NMF), and optimal transport (OT) on a simulated squared data matrix
                                      with a 3 × 3 block structure.

                3.3 Detecting the number of clusters
                    In order to map coordinates of the obtained cluster-generating vectors
                to final clustering partitions, we concentrate our attention on a popular
                methods based on the Potts problem (Weinmann et al, 2015). In general,
                for  a  given  size−  vector  u,  this  method  aims  at  finding  a  piece-wise
                constant vector x by solving an optimization problem composed of two
                                                                   
                terms: (1)a fitting term taken as the ℓ norm ‖ − ‖  for some  ≥ 1; (2)
                                                    
                                                                   
                a  regularization  term,  controlled  by  an  hyper-parameter   reg  > 0, that
                penalizes  ∈ ℝ ×−1  such that [x] =1  = (x +1  − x ) for  = 1, … ,  − 1. In
                                                                 
                order to properly solve this problem, we propose to use the ℓ  -sorted-
                                                                             
                Potts  problem  which  penalizes  the  increments  of  the  sorted  vector.
                Denoting by  the sortin operation, the ℓ  -sorted-Potts problem writes:
                                                        

                                                     
                                  ⋆
                                  = arg min‖ − ‖ +   ‖(x)‖ .
                                                                     0
                                                     
                                          ∈ℝ 

                    We propose to apply this regularisation to the vectors resulting from
                both OT and MF.

            4.  Conclusion
                In this paper, we studied regularized optimal transport and its application
            in co-clustering. More precisely, we showed that this former can be expressed
            as  rank-one  matrix  factorization  problem  that  produces  cluster-generating
            vectors of a given data matrix. Finally, we proposed a regularization strategy
            that enhances the block structure of the obtained cluster-generating vectors.
            This work can be extended in several original research directions. In the near
            future we would like to exploit a long-known equivalence between optimal
            transport and k-means Caas and Rosasco (2012) to extend this study further
            to other families of learning methods.





                                                               338 | I S I   W S C   2 0 1 9
   346   347   348   349   350   351   352   353   354   355   356