Page 14 - Contributed Paper Session (CPS) - Volume 4

CPS2101 Bertail Patrice et al.
                  will assume that G has a density with respect to the Lebesgue measure,
                  denoted by $g$, belonging to some regular space $\mathcal{G}$ (we will specify later the
                  regularity assumption on $\mathcal{G}$ needed to ensure convergence of our estimators).
                  The distribution of the r.v. v is entirely determined by the law of the pair (W,G):
                  we denote it by $P_{W,g}$ and its density by $p_{W,g}$. The semiparametric model is thus
                  entirely described by the set


                                                                          .
                      In the semiparametric terminology W is the parameter of interest and g is
                  the nuisance parameter.
                  Remark: This model can be extended to noisy models. Although noise is
                  generally modelled additively in most applications, it should be noted that
                  multiplicative and Poisson noise models may also be used, with the advantage
                  that they lead to nonnegative data vectors in the context of NMF.
                  Hence, as highlighted in [6], uniqueness of the matrices (W,H) cannot be
                  guaranteed in the absence of further assumptions on the distribution of W and/or H.
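A minimal numerical sketch of this non-uniqueness, assuming only NumPy: rescaling and permuting the columns of W, with the inverse operation applied to H, leaves the product WH unchanged while keeping both factors nonnegative. The sizes, the random seed and the names `D`, `P`, `W2`, `H2` are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
F, K, n = 5, 3, 4            # illustrative sizes (assumptions)

W = rng.random((F, K))       # nonnegative F x K matrix
H = rng.random((K, n))       # nonnegative K x n matrix
V = W @ H

# Rescale the columns of W by a positive diagonal matrix D and
# permute them with a permutation matrix P; compensate in H.
D = np.diag([2.0, 0.5, 3.0])
P = np.eye(K)[:, [2, 0, 1]]
W2 = W @ D @ P
H2 = P.T @ np.linalg.inv(D) @ H

same_product = bool(np.allclose(V, W2 @ H2))       # W2 H2 equals W H
different_factors = not np.allclose(W, W2)         # yet W2 differs from W
nonnegative = bool((W2 >= 0).all() and (H2 >= 0).all())
```

The pairs (W, H) and (W2, H2) generate exactly the same data, which is why scale and ordering constraints on W are needed before identifiability can be discussed.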
                  We now state the hypotheses that are assumed throughout this paper and that
                  ensure the existence and uniqueness of the representation in a
                  semiparametric framework.
                   H1   The matrix W is of full rank K, 1 ≤ K ≤ F.
                   H2   The columns of the matrix W are of unit (Euclidean) norm: for all
                        $k \in \{1, \dots, K\}$, $\|W_{\cdot k}\|^2 = 1$.
                   H3   The columns of the matrix W are sorted by lexicographic order of the
                        vectors (α1,k(W), ..., αF,k(W)).
                   H4   The span of the support of the distribution of the $v_i$'s is denoted by
                                  .
                   H5   The distribution G(dh) is such that for any
                                                           , where        $\{\lambda w = (\lambda w_f) : \lambda > 0\}$
                        for any $w = (w_f) \in \mathbb{R}^F$ and supp(G) denotes the support of h's distribution.
                      We denote by $\mathcal{W}_+$ the set of matrices $W \in M_{F,K}(\mathbb{R}_+)$ fulfilling
                  assumptions H1-H3.
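As a sketch of how a factorization can be brought into this normalized form, the following NumPy function checks H1, rescales the columns as in H2 and then reorders them. The sort key of H3, the vectors (α1,k(W), ..., αF,k(W)), is not restated on this page, so the code sorts on the column entries themselves as a stand-in. The function name `canonicalize` and the compensation applied to H are assumptions made for illustration.

```python
import numpy as np

def canonicalize(W, H):
    """Return (W_c, H_c) with W_c @ H_c == W @ H and normalized, sorted columns."""
    W = np.asarray(W, dtype=float)
    H = np.asarray(H, dtype=float)
    F, K = W.shape
    # H1: W must have full column rank K (which requires K <= F).
    if np.linalg.matrix_rank(W) != K:
        raise ValueError("W is not of full rank K")
    # H2: rescale each column to unit Euclidean norm, compensating in H.
    norms = np.linalg.norm(W, axis=0)
    W_c = W / norms
    H_c = H * norms[:, None]
    # H3 (stand-in key): lexicographic order of the columns, row 0 compared first.
    # np.lexsort uses its LAST key as the primary one, hence the reversed rows.
    order = np.lexsort(W_c[::-1])
    return W_c[:, order], H_c[order, :]

# Usage on a small hypothetical example.
W = np.array([[3.0, 1.0], [4.0, 0.0], [0.0, 1.0]])
H = np.array([[1.0, 2.0], [3.0, 4.0]])
W_c, H_c = canonicalize(W, H)
```

Rescaling H by the column norms and permuting its rows with the same order keeps the product unchanged, so the normalization only selects one representative among equivalent factorizations.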
                      Theorem (Semiparametric identifiability in NMF models). Let $\mathcal{G}$ be a set of
                  probability distributions on  . Assume that all the distributions in $\mathcal{G}$ fulfill
                  assumptions H4-H5. The family of distributions

                                           is then identifiable.
                      Now, under assumption H1, the likelihood $p_{W,g}$ is given by [2]
                          $$p_{W,g}(v) = \mathrm{vol}(W^{-1})\, g(W^{-1} v). \tag{3}$$
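Formula (3) can be evaluated numerically along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: W^{-1} is taken to be the Moore-Penrose pseudo-inverse, vol(.) is read as the product of the nonzero singular values, and `g_exp` is a hypothetical nuisance density (independent Exp(1) coordinates) chosen only to make the example concrete.

```python
import numpy as np

def vol(A):
    """Product of the nonzero singular values of A (assumed reading of vol)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.prod(s[s > 1e-12]))

def density(v, W, g):
    """Evaluate vol(W^{-1}) * g(W^{-1} v), with W^{-1} the pseudo-inverse of W."""
    W_inv = np.linalg.pinv(W)            # K x F matrix
    return vol(W_inv) * g(W_inv @ v)

def g_exp(h):
    """Hypothetical density of h: independent Exp(1) coordinates on R_+^K."""
    h = np.asarray(h, dtype=float)
    # Small tolerance so that rounding in W_inv @ v does not reject h = 0.
    return float(np.exp(-h.sum())) if np.all(h >= -1e-12) else 0.0

# Square invertible case: vol(W^{-1}) reduces to 1 / |det W|.
W = np.diag([2.0, 4.0])
h = np.array([1.0, 1.0])
v = W @ h
p = density(v, W, g_exp)
```

For a square invertible W this coincides with the usual change-of-variables formula for v = Wh, which gives a convenient sanity check on the reading of vol adopted here.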
                      In that case, the likelihood of this semiparametric model based on a
                  sample $\mathbf{v}_n = (v_1, \dots, v_n)$ of n independent copies of the random variable v is
                  simply given by



                                                                       3 | ISI WSC 2019