Page 188 - Contributed Paper Session (CPS) - Volume 7

CPS2057 Ana C. M. Ciconelle et al.
               to quantify and call CNVs from SNP platforms, to analyse such data
               considering family‐based designs, and to characterize the patterns of the CNVs
               detected in this population.

               2.  Methodology
                   Dataset
                   Due to multiple waves of immigration, Brazil has a highly admixed
               population, in which several traits can be driven by both genetic and
               environmental influences. The Baependi Heart Study has been conducted by the
               Heart Institute since 2005 as a longitudinal family‐based cohort study aimed at
               understanding the variation of cardiovascular risk factors within the Brazilian
               population and disentangling its genetic and environmental components. The
               data provide information on 105 families (1,666 individuals: 723 males and
               943 females) living in the village of Baependi, in the state of Minas Gerais,
               Brazil. Data from 631 nuclear families were available, with the number of
               offspring ranging from 1 to 14. The number of generations per family varied
               from 2 to 4 (54% of the families had 3 generations, and 45% had 2 generations).
               Only individuals aged 18 years or older were eligible to participate in the
               study. The mean age was 44 years, with a range of 18 to 100 years.
                   For each participant, a questionnaire was used to obtain information
               regarding family relationships, demographic characteristics, medical history
               and environmental risk factors. Anthropometric measurements, physical and
               clinical examinations and electrocardiograms of the participants were performed
               by trained medical students. Genomic DNA was extracted by standard procedures.
               The DNA samples were genotyped on the Affymetrix 6.0 SNP array platform,
               yielding 1,120 CEL files; each CEL file stores the probe intensity values of
               the array for a single sample, together with additional information.
               More details are described in Egan et al. (2016).
                   Overview
                   The methodology used in this work is summarized in Figure 1, which
               describes the pre‐processing of SNP data, the CNV calling and the CNV
               analysis. For the pre‐processing of SNP data and the CNV calling, the software
               Affymetrix Power Tools (APT) (Affymetrix, 2017), PennCNV (Wang et al.,
               2007) and packages from the R environment were used. Given the CEL files,
               APT normalizes the probe signal intensities through quantile
               normalization. Median polish is then applied to obtain the final cleaned
               intensity values for alleles A and B of each SNP. Individual genotype
               calls are also made, using the Birdseed algorithm. For each SNP in each sample,
               the genotype is coded as 0, 1 and 2 for AA, AB and BB, respectively, and as -1
               for missing values, with a corresponding confidence score. In addition, a final
               report gives the inferred sex of each sample. PennCNV generates canonical genotype
               clustering files based on the output files from APT. These files contain cluster
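
The two pre-processing steps named in the Overview, quantile normalization and median polish, can be illustrated with a minimal NumPy sketch. This is not the APT implementation: the function names (`quantile_normalize`, `median_polish`, `summarize_probeset`), the probes-by-samples matrix layout, the log2 transform and the stopping rule are all assumptions made for illustration, and ties are not averaged in the normalization step.

```python
import numpy as np


def quantile_normalize(x):
    """Give every column (sample) of a probes x samples matrix the same
    empirical distribution: the mean of the sorted columns.
    Illustrative only; tied values are not averaged."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                    # per-sample sort order
    sorted_x = np.take_along_axis(x, order, axis=0)  # each column sorted
    reference = sorted_x.mean(axis=1)                # mean value at each rank
    out = np.empty_like(x)
    # Write the reference value of rank r where the r-th smallest entry sat.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


def median_polish(x, n_iter=10, tol=1e-8):
    """Tukey median polish of a probes x samples matrix.
    Returns (overall, row, col, resid) such that
    x == overall + row[:, None] + col[None, :] + resid."""
    resid = np.array(x, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])   # probe effects
    col = np.zeros(resid.shape[1])   # sample effects
    for _ in range(n_iter):
        # Sweep row medians out of the residuals.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        shift = np.median(col)
        col -= shift
        overall += shift
        # Sweep column medians out of the residuals.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        shift = np.median(row)
        row -= shift
        overall += shift
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row, col, resid


def summarize_probeset(intensities):
    """Hypothetical helper: one summary value per sample for a single
    probe set (e.g. the allele A probes of one SNP), on the log2 scale
    (the log2 transform is an assumption of this sketch)."""
    overall, _, col, _ = median_polish(np.log2(intensities))
    return overall + col
```

In this sketch, `quantile_normalize` would be applied once to the full intensity matrix, and `summarize_probeset` once per allele of each SNP, so that every sample ends up with one cleaned intensity for allele A and one for allele B.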

                                                                  175 | ISI WSC 2019