Page 58 - Contributed Paper Session (CPS) - Volume 1

CPS877 Paula J.G. et al.
                  capacity allows analysts to avoid the complexity of evaluating each
                  table and variable separately by employing an integrated graphic
                  representation of the data collected on periodic occasions. The focus on the
                  relative position of the individuals provided by the STATIS analysis results from
                  the graphic displays that summarize the most important aspects of
                  large data sets involving multiple variables. Despite the loss of some
                  detail, the representations produced by a multidimensional
                  method (such as STATIS) are easy to interpret visually, which makes it possible
                  to unveil the main features of the data.
                      For a set of S data tables, the STATIS method represents each study by an
                  object $W_s$. Each study is defined by the triplet $(X_s, Q_s, D)$, where $D$
                  (the observation weights) is constant and $Q_s$ is equal to either $I_p$ or
                  $(\mathrm{diag}\, V)^{-1}$ (for normalized data). For a table $X_s$ of size
                  $n \times p$ (with $s = 1, \dots, S$), the representative object, of size
                  $n \times n$, is obtained by
                      $W_s = X_s Q_s X_s^{t}$.
                  For the object distances and the graphical representation of the tables, the
                  STATIS method uses the Hilbert-Schmidt inner product, which indicates the
                  degree of association between data tables:
                      $\langle W_s, W_{s'} \rangle_{HS} = \mathrm{Tr}(D W_s D W_{s'})$,
                  where Tr (trace) is the sum of the diagonal elements. The joint analysis of multiple
                  data tables allows the number of variables (STATIS, for object
                  relations) or of objects (Dual-STATIS, for variable relations) to vary over time, and
                  allows data to be collected with or without a defined periodicity.
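As an illustration of the two formulas above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each table is column-centred before use, that V is the variance-covariance matrix of the table (so that diag V holds the column variances), and that D gives uniform weights 1/n. The function names `representative_object` and `hs_inner_product` and the random tables are hypothetical.

```python
import numpy as np

def representative_object(X, normalize=True):
    """Return W_s = X_s Q_s X_s^t for one (n x p) data table."""
    Xc = X - X.mean(axis=0)                  # column-centring (assumption)
    p = Xc.shape[1]
    if normalize:
        # Q_s = (diag V)^{-1}, taking V as the covariance matrix (assumption)
        Q = np.diag(1.0 / Xc.var(axis=0))
    else:
        Q = np.eye(p)                        # Q_s = I_p
    return Xc @ Q @ Xc.T                     # n x n object

def hs_inner_product(W_a, W_b, D):
    """Hilbert-Schmidt inner product Tr(D W_a D W_b)."""
    return float(np.trace(D @ W_a @ D @ W_b))

# Two hypothetical tables on the same n objects with different numbers
# of variables, as STATIS allows.
rng = np.random.default_rng(0)
n = 6
X1 = rng.normal(size=(n, 4))
X2 = rng.normal(size=(n, 3))
D = np.eye(n) / n                            # uniform observation weights

W1 = representative_object(X1)
W2 = representative_object(X2)
print(W1.shape, hs_inner_product(W1, W2, D))
```

Because both objects are n x n regardless of the number of columns in each table, the inner product is defined even when the tables do not share the same variables.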
                      This method involves four stages: (i) a global analysis based on the
                  interstructure, which compares the data table structures using the
                  distances between them and their graphic representation; (ii) identification of a
                  compromise table W representing all the data tables, which avoids the
                  complexity of analyzing the various tables separately;
                  (iii) a detailed analysis based on the intrastructure, which makes it
                  possible to evaluate the similarities and differences between the tables based
                  on their compromise positions; and (iv) analysis of the trajectories followed
                  by each component (objects or variables) of the various data tables over time
                  to appraise their evolution.
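Stages (i) and (ii) can be sketched as follows. The text does not state how the compromise weights are chosen, so this sketch assumes one common formulation of STATIS: the interstructure is the S x S matrix of Hilbert-Schmidt inner products between the objects, and the compromise is a weighted sum of the objects with weights taken from the leading eigenvector of that matrix, rescaled to sum to one. The helper names and the random tables are hypothetical.

```python
import numpy as np

def interstructure(W_list, D):
    """S x S matrix of inner products Tr(D W_a D W_b) between tables."""
    S = len(W_list)
    C = np.empty((S, S))
    for a in range(S):
        for b in range(S):
            C[a, b] = np.trace(D @ W_list[a] @ D @ W_list[b])
    return C

def compromise(W_list, C):
    """Weighted sum of the objects; weights from the leading eigenvector of C."""
    eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    u = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    if u.sum() < 0:                          # eigenvector sign is arbitrary
        u = -u
    alpha = u / u.sum()                      # rescale weights to sum to 1 (assumption)
    W = sum(a * Wk for a, Wk in zip(alpha, W_list))
    return alpha, W

# Three hypothetical tables on the same n objects.
rng = np.random.default_rng(1)
n = 8
D = np.eye(n) / n
W_list = []
for p in (5, 4, 6):
    X = rng.normal(size=(n, p))
    Xc = X - X.mean(axis=0)                  # column-centring (assumption)
    W_list.append(Xc @ Xc.T)                 # Q_s = I_p

C = interstructure(W_list, D)
alpha, W = compromise(W_list, C)
print(alpha)
```

Tables that agree more closely with the others receive larger weights under this choice, so the compromise leans toward the structure the tables have in common.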

                  3.  Results
                      The OECD data related to the “How’s Life” program for the member and
                  associated countries (35 plus 6 countries in total) involved a varying number
                  of observations and variables during the period from 2009 to 2015. Accordingly,
                  it was decided to focus the study on 34 member countries (excluding Chile
                  and the associated countries due to their extensive data gaps) and to use the
                  data for the 15 most complete variables only. Although there were some
                  missing values (c. 5.5%, which were imputed through maximum likelihood
                  estimates or correlations), it was possible to produce a joint analysis of the



                                                                      47 | ISI WSC 2019