Page 395 - Special Topic Session (STS) - Volume 4

STS2320 Ali S. H.



                         On the identification and handling of outliers in
                                       composite index data
                                             Ali S. Hadi
                         Department of Mathematics and Actuarial Science,
                 The American University in Cairo, Egypt. E-mail ahadi@aucegypt.edu

            Abstract
            Composite index data often need editing before the indices are computed.
            Variables may need transformation because of high skewness or kurtosis
            coefficients. Outliers are also common in composite index data, and they
            can drastically affect the resulting indices. Identifying outliers improves
            data quality and reliability, and hence the quality of the decisions drawn
            from the data and the analysis. Three important steps in constructing
            composite indices are (1) determining which variables need transformation,
            (2) identifying outliers when they exist in the index data, and (3) deciding
            what to do with the outliers once they are identified. We discuss these
            steps in constructing composite indices.

            Keywords
            BACON; Kurtosis; MCD; Mahalanobis distance; Min-Max Normalization;
            Outliers; Robust; Skewness


            1.  Introduction
                Numerous composite indices are computed on an annual basis. Examples
            include the Global Knowledge Index, the Corruption Perceptions Index (CPI),
            the Human Development Index (HDI), the Ibrahim Index of African Governance
            (IIAG), the Gender Inequality Index (GII), and the Climate Change
            Performance Index (CCPI), to mention only a few. Bandura (2008) surveys
            the composite indices in use around the world and, at that time, found 187
            indices.
                Most recently, the International Knowledge Index (IKI) was computed for
            the first time and published in 2017 by the United Nations Development
            Programme (UNDP). The IKI extended the Arab Knowledge Index (AKI), which
            was first computed in 2015 by the Al Maktoum Foundation
            (http://www.mbrf.ae/) to measure knowledge in the Arab countries.
                Composite index data are high-dimensional: the number of variables
            sometimes exceeds the number of observations. A composite index is a
            single-number summary for each observation in the data. The quality of a
            composite index cannot exceed the quality of the data used to construct
            it. Variables can be highly skewed and/or have severe kurtosis. The data
            may also contain univariate and multivariate outliers. These


                                                               ISI WSC 2019