Page 58 - Contributed Paper Session (CPS) - Volume 1
CPS877 Paula J.G. et al.
capacity allows the analysts to avoid the complexity resulting from the
evaluation of each table and variable by employing an integrated graphic
representation of the data collected on periodic occasions. The focus on the
relative position of the individuals provided by the STATIS analysis results from
the graphic displays that summarize the most important aspects related to
large data sets involving multiple variables. Despite the loss of some
information detail, the representations resulting from a multidimensional
method (such as STATIS) are easy to interpret visually, which makes it
possible to unveil the main features of the data.
For a set of S data tables, the STATIS method represents each study by an
object $W_s$, and the study is defined by three elements $(X_s, Q_s, D)$, with
$D$ (observation weights) being constant and with $Q_s$ being equal to either
$I_p$ or $(\mathrm{diag}\,V)^{-1}$ (for normalized data). For a table $X_s$
($n \times p$, with $s = 1, \ldots, S$), the representative object is obtained
by $W_s = X_s Q_s X_s^{t}$ (size $n \times n$). For the object distances and
graphical representation of the tables, the STATIS method uses the
Hilbert-Schmidt inner product, which indicates the existing degree of
association between data tables:
$\langle W_s, W_{s'} \rangle_{HS} = \mathrm{Tr}(D W_s D W_{s'})$, where
Tr (trace) is the sum of the diagonal elements. The joint analysis of multiple
data tables allows the number of variables (STATIS, for object
relations) or objects (Dual-STATIS, for variable relations) to vary over time,
and the data to be collected with or without a defined periodicity.
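As an illustration of the definitions above, the following minimal NumPy sketch builds the representative objects for two small tables and evaluates their Hilbert-Schmidt inner product. The function names (`representative_object`, `inverse_variance_metric`, `hs_inner`), the random toy data, and the choice of uniform observation weights D = I/n are assumptions made for the example only; they are not taken from the study.

```python
import numpy as np

def inverse_variance_metric(X, D):
    """Q = (diag V)^{-1}, with V = X^t D X for a column-centred table X."""
    V = X.T @ D @ X
    return np.diag(1.0 / np.diag(V))

def representative_object(X, Q=None):
    """Return W = X Q X^t (n x n) for one table X (n x p)."""
    if Q is None:
        Q = np.eye(X.shape[1])  # Q = I_p (no normalization)
    return X @ Q @ X.T

def hs_inner(W_a, W_b, D):
    """Hilbert-Schmidt inner product Tr(D W_a D W_b)."""
    return float(np.trace(D @ W_a @ D @ W_b))

# Toy data (assumption): same n observations, different numbers of variables.
rng = np.random.default_rng(0)
n = 6
D = np.eye(n) / n                      # uniform observation weights (assumption)
X1 = rng.normal(size=(n, 3))
X2 = rng.normal(size=(n, 4))
X1 = X1 - X1.mean(axis=0)              # centre each column
X2 = X2 - X2.mean(axis=0)

W1 = representative_object(X1)                                   # Q = I_p
W2 = representative_object(X2, inverse_variance_metric(X2, D))   # normalized
association = hs_inner(W1, W2, D)
print(W1.shape, W2.shape, association)
```

Because each object is n x n regardless of the number of columns in its table, tables with different numbers of variables can still be compared, which is the property the text relies on for STATIS.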
This method involves four stages: (i) global analysis based on an
interstructure comparing the data table structures with the support of the
existing distances and graphic representation; (ii) identification of a
compromise table W representing all the data tables in order to avoid the
complexity of analyzing the various tables in an independent and separate
way; (iii) detailed analysis resulting from the study of the intrastructure, which
makes it possible to evaluate the similarities and differences between the
tables based on their compromise positions; and (iv) analysis of the
trajectories presented by each component (objects or variables) of the various
data tables over time to appraise the evolution.
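Stages (i) and (ii) can be sketched as follows. The text does not specify how the compromise is weighted, so this example assumes the usual STATIS convention: the weights are taken from the leading eigenvector of the interstructure matrix of Hilbert-Schmidt inner products and rescaled to sum to one. The function name `statis_compromise`, the toy data, and the uniform weights D = I/n are hypothetical.

```python
import numpy as np

def statis_compromise(W_list, D):
    """Interstructure matrix C, weights alpha, and compromise W (a sketch)."""
    S = len(W_list)
    # Stage (i): S x S matrix of Hilbert-Schmidt inner products Tr(D W_a D W_b).
    C = np.empty((S, S))
    for a in range(S):
        for b in range(S):
            C[a, b] = np.trace(D @ W_list[a] @ D @ W_list[b])
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(C)
    u = eigvecs[:, -1]
    u = u * np.sign(u.sum())           # fix the sign so the weights are positive
    alpha = u / u.sum()                # weights rescaled to sum to 1 (assumption)
    # Stage (ii): compromise as the weighted sum of the objects.
    W = sum(w * Wk for w, Wk in zip(alpha, W_list))
    return C, alpha, W

# Toy data (assumption): three occasions, same n objects, varying variables.
rng = np.random.default_rng(1)
n = 6
D = np.eye(n) / n
tables = [rng.normal(size=(n, p)) for p in (3, 4, 5)]
tables = [X - X.mean(axis=0) for X in tables]
W_list = [X @ X.T for X in tables]     # W_s = X_s X_s^t with Q_s = I_p

C, alpha, W = statis_compromise(W_list, D)
print(np.round(alpha, 3))
```

Stages (iii) and (iv) would then typically proceed from an eigen-decomposition of the compromise; they are left out here to keep the sketch short.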
3. Results
The OECD data related to the “How’s Life” program for the member and
associated countries (35 and 6 countries, respectively) involved a varying
number of observations and variables during the period from 2009 to 2015.
Therefore, it was decided to focus the study on 34 member countries (excluding
Chile and the associated countries due to their extensive data gaps) and to use
the data for the 15 most complete variables only. Although there were some
missing values (c. 5.5%, which were imputed through maximum likelihood
estimates or correlations), it was possible to produce a joint analysis of the
47 | ISI WSC 2019