Page 225 - Special Topic Session (STS) - Volume 4

STS582 Júlia M. P. S.
            multivariate data, let $Y_f$ be a vector for all p variables and all members
            of the f-th family, with covariance matrix given by
            $\Omega_f = 2\Phi_f \otimes \Sigma_g + I_f \otimes \Sigma_e$, where $2\Phi_f$
            is the kinship matrix for family f, $\Sigma_g$ and $\Sigma_e$ are the
            ($p \times p$) covariance matrices associated with the polygenic and error
            components, respectively, and $\otimes$ is the
            Kronecker product. Oualkacha et al. (2012) obtained MANOVA-based
            estimators for these covariance matrices. De Andrade et al. (2015) obtained
            principal components of heritability for the reduction of genomic datasets in
            terms of ancestry scores. Different scores can be obtained from family data by
            operating on the covariance components, i.e., $\Sigma_g$, $\Sigma_e$ and
            $\Sigma_e^{-1}\Sigma_g$, as well as $\Sigma_t = \Sigma_g + \Sigma_e$. Following
            this idea, Ribeiro and Soler (2018) proposed to learn polygenic, environmental
            and total graphical models from family datasets by exploiting $\Sigma_g$,
            $\Sigma_e$ and $\Sigma_t = \Sigma_g + \Sigma_e$. The authors also learn the
            multivariate relations among variables using a univariate polygenic mixed
            model framework.
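            The Kronecker structure of the family covariance matrix can be made concrete
            with a small numerical sketch. The kinship values, the two-trait covariance
            matrices and the function name `family_covariance` below are hypothetical
            illustrations, not estimates or code from the cited works; the data vector is
            assumed to be stacked member by member.

```python
import numpy as np

def family_covariance(two_phi, sigma_g, sigma_e):
    """Return Omega_f = 2*Phi_f (x) Sigma_g + I (x) Sigma_e for one family."""
    n_f = two_phi.shape[0]  # number of family members
    return np.kron(two_phi, sigma_g) + np.kron(np.eye(n_f), sigma_e)

# Hypothetical nuclear family: two unrelated parents and two full siblings.
# Entries are twice the kinship coefficient (1 on the diagonal, 0.5 for
# parent-offspring and full-sibling pairs, 0 for the unrelated parents).
two_phi = np.array([
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
])

# Hypothetical (p x p) polygenic and error covariance matrices, p = 2 traits.
sigma_g = np.array([[1.0, 0.3], [0.3, 0.5]])
sigma_e = np.array([[0.8, 0.1], [0.1, 0.6]])

omega = family_covariance(two_phi, sigma_g, sigma_e)
sigma_t = sigma_g + sigma_e  # total covariance of the traits within one member

print(omega.shape)  # (n_f * p, n_f * p)
```

            In this layout each diagonal (p x p) block of the result equals the total
            covariance, the block for a parent-offspring pair equals half the polygenic
            covariance, and the block for the two unrelated parents is zero.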
                For multi-omics integration in family data, we are extending the
            multivariate projection-based methods available for independent
            observations to include familial dependence. For quadratic solutions, in
            $\Re^{p \times p}$, the extension is based on the factorization of the polygenic
            and environmental components of the covariance matrix. In addition, for
            rectangular solutions, in $\Re^{n \times p}$, N-integration can be performed by
            structuring the data matrix through ANOVA-simultaneous component analysis
            (Smilde et al., 2005) and then building the reductions on the components of the
            data.
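            As a rough illustration of the ANOVA-simultaneous component analysis step, the
            sketch below splits a data matrix into a grand mean, a between-level effect
            matrix and a residual matrix, and then extracts components from the effect
            matrix. Using family membership as the single ANOVA factor is an assumption
            made here for illustration only; the data are simulated and the function
            names are hypothetical.

```python
import numpy as np

def asca_one_factor(x, labels):
    """Split X into grand mean, level-effect matrix and residual matrix."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    grand_mean = x.mean(axis=0)
    centered = x - grand_mean
    effect = np.zeros_like(centered)
    for level in np.unique(labels):
        rows = labels == level
        # Every row of a level is replaced by that level's mean profile.
        effect[rows] = centered[rows].mean(axis=0)
    residual = centered - effect
    return grand_mean, effect, residual

def principal_axes(matrix, k):
    """Return the first k loadings and scores from an SVD of the matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return vt[:k].T, u[:, :k] * s[:k]

# Simulated example: 3 hypothetical families of 4 members, 5 variables.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 4)
x = rng.normal(size=(12, 5)) + labels[:, None]

grand_mean, effect, residual = asca_one_factor(x, labels)
effect_loadings, effect_scores = principal_axes(effect, 2)
resid_loadings, resid_scores = principal_axes(residual, 2)
```

            With three levels the effect matrix has rank at most two, so two components
            reproduce it exactly, while the residual matrix carries the within-level
            variation that is analysed separately.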

            3.  Results
                Figure 2 shows two representations of n observations clustered in a family
            structure. In (a), independent observations are assumed and the principal
            components are extracted from the covariance matrix. In (b), familial
            dependence is taken into account and the principal components are
            extracted from the matrix $\Sigma_e^{-1}\Sigma_g$. Different colors are used to
            discriminate members of different families. The uni-omics dataset corresponds
            to genotype information obtained from an SNP platform (Affymetrix 6.0). A
            detailed description of the dataset is given in de Andrade et al. (2015). The
            figure illustrates the impact of modelling family structure on the reduction
            analysis. When familial dependence is taken into account, a more adaptive
            representation of the data is obtained, which allows members to be
            discriminated between the ancestry arms found in the analysis.
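            The contrast between the two panels can be sketched numerically: components
            taken from a single total covariance matrix versus components taken from the
            product of the inverse error covariance and the polygenic covariance. The
            matrices below are hypothetical, and the Cholesky-based route is one standard
            way to solve this eigenproblem, not necessarily the one used for Figure 2.

```python
import numpy as np

def heritability_components(sigma_g, sigma_e):
    """Eigen-decompose Sigma_e^{-1} Sigma_g through a symmetric reformulation."""
    # Sigma_e = L L^T; requires Sigma_e to be positive definite.
    chol = np.linalg.cholesky(sigma_e)
    # M = L^{-1} Sigma_g L^{-T} is symmetric and shares the eigenvalues.
    half = np.linalg.solve(chol, sigma_g)      # L^{-1} Sigma_g
    sym = np.linalg.solve(chol, half.T)        # L^{-1} Sigma_g L^{-T}
    eigvals, u = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, u = eigvals[order], u[:, order]
    loadings = np.linalg.solve(chol.T, u)      # v = L^{-T} u
    return eigvals, loadings

# Hypothetical covariance components for p = 3 variables.
sigma_g = np.array([[1.0, 0.4, 0.2],
                    [0.4, 0.8, 0.1],
                    [0.2, 0.1, 0.5]])
sigma_e = np.array([[0.9, 0.1, 0.0],
                    [0.1, 0.7, 0.2],
                    [0.0, 0.2, 1.1]])

# Analogue of panel (a): ordinary principal components of the total covariance.
total_vals, total_vecs = np.linalg.eigh(sigma_g + sigma_e)

# Analogue of panel (b): components of Sigma_e^{-1} Sigma_g.
eigvals, loadings = heritability_components(sigma_g, sigma_e)
```

            Each column v of `loadings` satisfies Sigma_g v = lambda Sigma_e v, so the
            leading direction maximises polygenic variance relative to error variance,
            whereas the ordinary components ignore that split.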
                Figure 3 shows probabilistic graphical models learned from family data
            considering multiple phenotypes extracted from the Baependi Heart Study
            (Oliveira  et  al.,  2006;  Egan  et  al.,  2016).  In  the  figure,  vertices  represent
            variables and the connections indicate partial correlations between variables.
            Important differences are found among the relations obtained from the
            polygenic, environmental and total covariance matrices.
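            Since the edges are read as partial correlations, a minimal sketch of how they
            follow from a covariance matrix may help. It uses the standard identity that
            partial correlations are the negated, rescaled off-diagonal entries of the
            inverse covariance matrix. The example matrix, the threshold and the function
            names are hypothetical, and the learning procedure behind Figure 3 may differ
            (for example by using a sparse estimator).

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix from a positive definite covariance matrix."""
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edges_from_partial_correlations(pcor, threshold):
    """List variable pairs whose absolute partial correlation exceeds threshold."""
    p = pcor.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i, j]) > threshold]

# Hypothetical chain structure: variable 0 and variable 2 are related
# only through variable 1, so their partial correlation is zero.
precision = np.array([[2.0, -1.0, 0.0],
                      [-1.0, 2.0, -1.0],
                      [0.0, -1.0, 2.0]])
cov = np.linalg.inv(precision)

pcor = partial_correlations(cov)
edges = edges_from_partial_correlations(pcor, threshold=0.1)
print(edges)  # [(0, 1), (1, 2)]
```

            Applying the same function to the polygenic, environmental and total
            covariance matrices in turn yields one graph per component, which is how the
            three graphs can come to differ.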


                                                               214 | I S I   W S C   2 0 1 9