Page 99 - Invited Paper Session (IPS) - Volume 2
IPS184 Celestino G. et al.
pairs (two different periods for 19 countries) represented by 4 indicators
corresponding to the sector net lending medians. This organisation of the
information allows us not only to cluster countries according to similarities in the
distribution of imbalances, but also to examine transitions from one cluster to
another as a result of the financial crisis.
We cluster these 38 country-period pairs using SOM, which yields a
simplified representation of the input space of 4-dimensional net lending
vectors corresponding to the pairs. The input vectors are mapped onto a
network of a few output vectors, also 4-dimensional (“neuron” weight
vectors), whose topological configuration is as close as possible to that of the
input space (the network is “trained” on the basis of the input data). SOM thus
creates a low-resolution view of the larger data set, which, apart from yielding
a clustering on the basis of similarities in the input data, is useful for the
purposes of visualization and general data analysis [3]. In our set-up we have
used a 2x3 hexagonal neural network to map the input space. We believe that
the resulting clustering into six neurons/categories strikes a balance between
delivering a manageable number of clusters and still allowing the detection of
“outliers”, that is, country-period pairs that present no clear analogies with others.
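The mapping described above can be illustrated with a minimal sketch of a SOM on a 2x3 hexagonal grid, written with NumPy only. It is not the implementation used in this paper: the function names (`hex_grid`, `train_som`, `assign_clusters`), the exponential decay of the learning rate and neighbourhood width, the number of iterations and the random input data are all assumptions made for illustration.

```python
import numpy as np

def hex_grid(rows, cols):
    """Neuron positions on a hexagonal lattice (odd rows shifted by half a cell)."""
    coords = [(c + 0.5 * (r % 2), r * np.sqrt(3) / 2)
              for r in range(rows) for c in range(cols)]
    return np.array(coords)

def train_som(data, rows=2, cols=3, n_iter=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM; returns one weight vector per neuron."""
    rng = np.random.default_rng(seed)
    grid = hex_grid(rows, cols)
    # Initialise the neuron weights with randomly chosen input vectors.
    start = rng.choice(len(data), size=rows * cols, replace=False)
    weights = data[start].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * np.exp(-3.0 * frac)        # assumed learning-rate schedule
        sigma = sigma0 * np.exp(-3.0 * frac)  # assumed neighbourhood schedule
        x = data[rng.integers(len(data))]
        # Best-matching unit: the neuron whose weights are closest to x.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighbourhood measured on the grid, not in input space.
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign_clusters(data, weights):
    """Index of the closest neuron for every input vector."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Synthetic placeholder for the 38 country-period pairs x 4 indicators.
rng = np.random.default_rng(42)
data = rng.normal(size=(38, 4))
weights = train_som(data)
labels = assign_clusters(data, weights)
```

Here `labels` assigns each of the 38 rows to one of the six neurons. With real data, a neuron that attracts a single country-period pair would correspond to the “outlier” case mentioned above.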
As a second step in our analysis, we additionally use PCA, which is applied
to the initial data to obtain a second data set to which SOM is then applied.
PCA transforms a set of observations of possibly correlated variables (in our
case net lending per sector) into other variables called “components”. This
transformation is defined in such a way that the components get ordered by
variance, the first “principal” component having the largest variance (that is,
it accounts for as much of the variability in the data as possible), and each
succeeding component in turn having the highest variance possible under the
constraint that it is orthogonal to the preceding components. The resulting
components (each being a linear combination of the original variables)
then form an uncorrelated orthogonal basis set [4].
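The properties listed above (ordering by variance, orthogonality, uncorrelated components) can be checked on a small example. The following is a minimal sketch of PCA via the singular value decomposition, not the procedure actually run for this paper. The function name `pca`, the choice to centre but not standardise the variables, and the synthetic input are assumptions.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA via SVD of the centred data matrix."""
    centred = data - data.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by decreasing singular value.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    # Each score column is a linear combination of the original variables.
    scores = centred @ components.T
    # Variance of each component, in decreasing order.
    explained_variance = s ** 2 / (len(data) - 1)
    return scores, components, explained_variance

# Synthetic placeholder: 38 observations of 4 correlated variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(38, 4)) @ rng.normal(size=(4, 4))
scores, components, var = pca(data, n_components=2)

# The first two columns of the score matrix are uncorrelated.
score_cov = np.cov(scores, rowvar=False)
```

Keeping only the first two columns of `scores` gives coordinates that can be plotted in a two-dimensional chart or passed on to a clustering step.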
PCA is a useful tool for identifying key characteristics of a large set of
indicators, based on their variance, and synthesising them into a reduced
number of “components” (those with the highest variance), thus reducing
dimensionality and helping in analysing and visualising the information. In this
paper we use PCA to obtain results that can be depicted in a two-dimensional
chart (see below).
3. Results
Figure 1 shows the outcome of the clustering exercise on the original data.
The country-period pairs are denoted with the ISO 3166-1 alpha-2 country
[3] Kohonen 1982, pp. 59-69.
[4] Pearson 1901, pp. 559-572.
86 | ISI WSC 2019