Page 99 - Invited Paper Session (IPS) - Volume 2

IPS184 Celestino G. et al.
            pairs (two different periods for 19 countries) represented by 4 indicators
            corresponding to the sector net lending medians. This organisation of the
            information allows us not only to cluster countries according to similarities
            in the distribution of imbalances, but also to examine transitions from one
            cluster to another as a result of the financial crisis.
                We cluster these 38 country-period pairs using SOM, which yields a
            simplified representation of the input space of 4-dimensional net lending
            vectors corresponding to the pairs. The input vectors are mapped onto a
            network of a few output vectors, also 4-dimensional (the “neuron” weight
            vectors), whose topological configuration is as close as possible to that of
            the input space (the network is “trained” on the basis of the input data).
            SOM thus creates a low-resolution view of the larger data set, which, apart
            from yielding a clustering on the basis of similarities in the input data, is
            useful for the purposes of visualisation and general data analysis.³ In our
            set-up we have used a 2x3 hexagonal neural network to map the input space.
            We believe that the resulting clustering into six neurons/categories strikes
            a balance between delivering a manageable number of clusters and still
            allowing the detection of “outliers”, that is, country-period pairs that do
            not present clear analogies with others.
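                As an illustration of the mapping described above, the following is a
            minimal sketch of a 2x3 hexagonal SOM trained on 4-dimensional vectors. It is
            not the implementation used in the paper: the function names (hex_grid,
            train_som, assign_clusters), the Gaussian neighbourhood, the decay schedules,
            the number of iterations and the random stand-in data are all assumptions made
            for the example.

```python
import numpy as np


def hex_grid(rows, cols):
    """Planar coordinates of the neurons on a hexagonal lattice."""
    coords = []
    for r in range(rows):
        for c in range(cols):
            # Odd rows are shifted by half a cell and rows lie sqrt(3)/2 apart,
            # so adjacent neurons are at unit distance from each other.
            coords.append((c + 0.5 * (r % 2), r * np.sqrt(3) / 2))
    return np.array(coords)


def train_som(data, rows=2, cols=3, n_iter=2000, lr0=0.5,
              sigma0=1.0, sigma_end=0.3, seed=0):
    """Train a small SOM; all hyperparameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    grid = hex_grid(rows, cols)
    # Initialise each neuron's weight vector with a randomly chosen input vector.
    start = rng.choice(len(data), size=rows * cols, replace=False)
    weights = data[start].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                          # linear decay
        sigma = sigma0 * (sigma_end / sigma0) ** frac    # exponential decay
        x = data[rng.integers(len(data))]
        # Best-matching unit: the neuron whose weights are closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood measured on the lattice, not in input space.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Pull the BMU and its lattice neighbours towards x.
        weights += lr * h[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Index of the closest neuron for every input vector."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Synthetic stand-in for the 38 country-period pairs with 4 indicators each.
rng = np.random.default_rng(42)
data = rng.normal(size=(38, 4))
weights = train_som(data)
labels = assign_clusters(data, weights)
```

            Here labels assigns each of the 38 rows to one of the six neurons, which
            play the role of the clusters. A neuron that attracts only one or two rows
            would correspond to the “outlier” situation mentioned above.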
                As a second step in our analysis, we additionally use PCA, which is applied
            to the initial data to obtain a second data set to which SOM is then applied.
            PCA transforms a set of observations of possibly correlated variables (in our
            case net lending per sector) into other variables called “components”. This
            transformation is defined in such a way that the components are ordered by
            variance, the first “principal” component having the largest variance (that is,
            it accounts for as much of the variability in the data as possible), and each
            succeeding component in turn having the highest variance possible under the
            constraint that it is orthogonal to the preceding components. The resulting
            components (each being a linear combination of the original variables) then
            form an uncorrelated orthogonal basis set.⁴
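                The properties listed above (ordering by variance, orthogonality and
            uncorrelated components) can be checked on a small example. The sketch below
            computes PCA through a singular value decomposition of the centred data. It is
            an illustration only: the function name pca and the synthetic data are
            assumptions, and the data are centred but not standardised, a choice the text
            does not specify.

```python
import numpy as np


def pca(data):
    """PCA via SVD of the centred data (no scaling to unit variance)."""
    centred = data - data.mean(axis=0)
    # Rows of vt are the principal axes, already ordered by singular value.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    # Sample variance of the data along each principal axis.
    explained_variance = s ** 2 / (len(data) - 1)
    # Coordinates of every observation in the component basis.
    scores = centred @ vt.T
    return scores, vt, explained_variance


# Synthetic stand-in: 38 observations of 4 correlated variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(38, 4)) @ rng.normal(size=(4, 4))

scores, components, variances = pca(data)
first_two = scores[:, :2]  # coordinates suitable for a two-dimensional chart
```

            The covariance matrix of scores is diagonal, with variances on the diagonal
            in decreasing order, and the rows of components are mutually orthogonal unit
            vectors.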
                PCA is a useful tool for identifying key characteristics of a large set of
            indicators, based on their variance, and synthesising them into a reduced
            number of “components” (those with the highest variance), thus reducing
            dimensionality and helping in analysing and visualising the information. In this
            paper we use PCA to obtain results that can be depicted in a two-dimensional
            chart (see below).

            3.  Results
                Figure 1 shows the outcome of the clustering exercise on the original data.
            The country-period pairs are denoted with the ISO 3166-1 alpha-2 country


            3  Kohonen 1982, pp. 59–69.
            4  Pearson 1901, pp. 559–572.
                                                                86 | ISI WSC 2019