Page 225 - Special Topic Session (STS) - Volume 4
STS582 Júlia M. P. S.
multivariate data, let $Y_f$ be a vector for all p variables and all members of the
f-th family, with covariance matrix given by $\Omega_f = 2\Phi_f \otimes \Sigma_g + I_f \otimes \Sigma_e$, where
$2\Phi_f$ is the kinship matrix for family f, $I_f$ is the identity matrix of the same order, $\Sigma_g$ and $\Sigma_e$ are the $(p \times p)$ covariance matrices
associated with the polygenic and error components, respectively, and $\otimes$ is the
Kronecker product. Oualkacha et al. (2012) obtained MANOVA-based
estimators for these covariance matrices. De Andrade et al. (2015) obtained
principal components of heritability to reduce a genomic dataset to
ancestry scores. Different scores can be obtained from family data by
operating on the covariance components, i.e., $\Sigma_g$, $\Sigma_e$, $\Sigma_e^{-1}\Sigma_g$, as well as $\Sigma = \Sigma_g + \Sigma_e$.
Following this idea, Ribeiro and Soler (2018) proposed to learn
polygenic, environmental and total graphical models from family datasets
by exploiting $\Sigma_g$, $\Sigma_e$ and $\Sigma = \Sigma_g + \Sigma_e$. The authors also learn the
multivariate relations among variables within a univariate polygenic mixed
model framework.
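The covariance structure described above can be illustrated with a minimal NumPy sketch. It builds the family covariance matrix from a kinship matrix and the two component matrices, then obtains heritability-type directions from the eigen-decomposition of $\Sigma_e^{-1}\Sigma_g$. The function names, the toy trio family and the numeric values are assumptions made for illustration only, and the component matrices are taken as given rather than estimated by the MANOVA-based estimators cited in the text.

```python
import numpy as np

def family_covariance(kinship2, sigma_g, sigma_e):
    """Return 2*Phi_f (x) Sigma_g + I_f (x) Sigma_e for one family.

    The implied vector is ordered member by member, with the p variables
    nested inside each member.
    """
    n_f = kinship2.shape[0]
    return np.kron(kinship2, sigma_g) + np.kron(np.eye(n_f), sigma_e)

def heritability_components(sigma_g, sigma_e):
    """Eigenvalues and eigenvectors of Sigma_e^{-1} Sigma_g.

    Whitening by the Cholesky factor of Sigma_e keeps the eigenproblem
    symmetric, so the results are real.
    """
    chol = np.linalg.cholesky(sigma_e)           # Sigma_e = chol @ chol.T
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ sigma_g @ chol_inv.T   # symmetric matrix
    vals, vecs = np.linalg.eigh(whitened)
    order = np.argsort(vals)[::-1]               # largest eigenvalue first
    vals = vals[order]
    directions = chol_inv.T @ vecs[:, order]     # back to the original scale
    return vals, directions

# Hypothetical trio: two unrelated parents and one child (entries of 2*Phi_f).
kinship2 = np.array([[1.0, 0.0, 0.5],
                     [0.0, 1.0, 0.5],
                     [0.5, 0.5, 1.0]])
# Hypothetical component matrices for p = 2 variables.
sigma_g = np.array([[1.0, 0.3],
                    [0.3, 0.5]])
sigma_e = np.array([[1.0, 0.1],
                    [0.1, 2.0]])

omega = family_covariance(kinship2, sigma_g, sigma_e)
vals, V = heritability_components(sigma_g, sigma_e)
```

Each (p x p) block of `omega` equals the kinship entry for that pair of members times `sigma_g`, with `sigma_e` added on the diagonal blocks. Scores for an observation with variable vector `y` would then be `y @ V`.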
For multi-omics integration in family data, we are extending the
multivariate projection-based methods available for independent
observations to include familial dependences. For quadratic solutions, in
$\mathbb{R}^{p \times p}$, the integration is performed through the factorization of the polygenic and
environmental components of the covariance matrix. For
rectangular solutions, in $\mathbb{R}^{n \times p}$, N-integration can be performed by structuring the
data matrix through ANOVA-simultaneous component analysis (Smilde et al.,
2005) and then building the reductions on the resulting components of the data.
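The rectangular route can be sketched as a one-factor ANOVA-simultaneous component analysis: the centred data matrix is split into an effect matrix and a residual matrix, and a component analysis (here a plain SVD) is run on each part. Treating family membership as the single factor is an assumption of this sketch, as are the function name `asca_one_factor` and the simulated data. The split uses family means only and does not use the kinship-based covariance model.

```python
import numpy as np

def _sca(matrix, n_components):
    """Component analysis of one matrix via SVD: returns (scores, loadings)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(n_components, s.size)
    return u[:, :k] * s[:k], vt[:k].T

def asca_one_factor(X, groups, n_components=2):
    """Split X into grand mean + group effect + residual, then reduce each part."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    centred = X - grand_mean
    effect = np.zeros_like(centred)
    for g in np.unique(groups):
        rows = groups == g
        effect[rows] = centred[rows].mean(axis=0)   # group mean repeated per member
    residual = centred - effect
    return {
        "grand_mean": grand_mean,
        "effect": effect,
        "residual": residual,
        "effect_sca": _sca(effect, n_components),
        "residual_sca": _sca(residual, n_components),
    }

# Simulated example: 9 individuals in 3 families of unequal size, p = 4 variables.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], [3, 4, 2])
X = rng.normal(size=(9, 4)) + groups[:, None]
parts = asca_one_factor(X, groups)
effect_scores, effect_loadings = parts["effect_sca"]
```

Because the effect and residual matrices are orthogonal, the total sum of squares of the centred data splits exactly into the two parts, so each reduction can be interpreted separately.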
3. Results
Figure 2 shows two representations of n observations clustered in family
structure. In (a), observations are assumed to be independent, and the principal
components are extracted from the covariance matrix $\Sigma$. In (b), familial
dependences are taken into account, and the principal components are
extracted from the matrix $\Sigma_e^{-1}\Sigma_g$. Different colors are used to discriminate
members of different families. The uni-omics dataset corresponds to
genotype information obtained from an SNP platform (Affymetrix 6.0). A
detailed description of the dataset is given in de Andrade et al. (2015). The figure
illustrates the impact of modelling family structure on the reduction analysis.
When familial dependence is taken into account, a more adaptive representation of the data
is obtained, allowing members to be discriminated between the ancestry arms found
in the analysis.
Figure 3 shows probabilistic graphical models learned from family data
considering multiple phenotypes extracted from the Baependi Heart Study
(Oliveira et al., 2006; Egan et al., 2016). In the figure, vertices represent
variables and the connections indicate partial correlations between variables.
Important differences are found among the relations obtained from the
polygenic, environmental and total covariance matrices.
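The link between a covariance matrix and the partial correlations drawn as edges can be sketched as follows: the covariance matrix is inverted and the off-diagonal entries of the inverse are rescaled. The toy matrices and the fixed threshold used to declare an edge are assumptions of this sketch and stand in for whatever structure-learning procedure is actually applied to the study data.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix obtained from the inverse of a covariance matrix."""
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edges(pcor, threshold=0.1):
    """Pairs of variables whose absolute partial correlation exceeds the threshold."""
    p = pcor.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i, j]) > threshold]

# Hypothetical component matrices for three variables.
sigma_g = np.array([[1.00, 0.50, 0.25],
                    [0.50, 1.00, 0.50],
                    [0.25, 0.50, 1.00]])   # chain-like dependence
sigma_e = np.array([[1.0, 0.0, 0.6],
                    [0.0, 1.0, 0.0],
                    [0.6, 0.0, 1.0]])      # only variables 0 and 2 related
sigma_total = sigma_g + sigma_e

graph_g = edges(partial_correlations(sigma_g))
graph_e = edges(partial_correlations(sigma_e))
graph_total = edges(partial_correlations(sigma_total))
```

In this toy setting the three graphs differ, which mirrors the kind of contrast between polygenic, environmental and total relations described for Figure 3.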
214 | I S I W S C 2 0 1 9