Page 440 - Contributed Paper Session (CPS) - Volume 4

CPS2526 Holger Cevallos-Valdiviezo et al.
                  starting  values  random  orthogonal  matrices  are  generated  and  as  initial
                  location  estimate  we  use  the  spatial  median  of  the  data.  The  best  local
                  minimum that is reached is then the approximation for the global optimum
                  (see also Cevallos-Valdiviezo & Van Aelst (2019);
                  Boente & Salibian-Barrera (2015)).
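
                  The two ingredients of the multi-start scheme described above can be
                  sketched as follows. The text does not say how the random orthogonal
                  matrices are drawn or how the spatial median is computed, so this
                  sketch assumes Gram-Schmidt orthogonalisation of Gaussian vectors and
                  the Weiszfeld iteration. The function names `random_orthogonal` and
                  `spatial_median` are hypothetical.

```python
import math
import random


def random_orthogonal(d, rng):
    """Return a d x d matrix (list of rows) with orthonormal rows.

    Assumption: Gaussian vectors are orthogonalised by modified Gram-Schmidt.
    """
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for u in rows:
            # Remove the component of v along each previously accepted row.
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-10:  # discard numerically degenerate draws
            rows.append([a / norm for a in v])
    return rows


def spatial_median(points, tol=1e-10, max_iter=1000):
    """Approximate the spatial (L1) median with the Weiszfeld iteration."""
    d = len(points[0])
    # Start from the coordinatewise mean.
    m = [sum(p[k] for p in points) / len(points) for k in range(d)]
    for _ in range(max_iter):
        num = [0.0] * d
        den = 0.0
        for p in points:
            dist = math.sqrt(sum((p[k] - m[k]) ** 2 for k in range(d)))
            w = 1.0 / max(dist, 1e-12)  # guard against division by zero
            den += w
            for k in range(d):
                num[k] += w * p[k]
        new_m = [x / den for x in num]
        shift = math.sqrt(sum((new_m[k] - m[k]) ** 2 for k in range(d)))
        m = new_m
        if shift < tol:
            break
    return m


# Example: one random start and a location estimate for data with an outlier.
start = random_orthogonal(4, random.Random(0))
center = spatial_median([(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)])
```

                  In a full multi-start run, each such start would be iterated to a local
                  minimum and the solution with the smallest objective value retained.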

                  d.  Functional data

                      In most applications, curves are only partially observed at different design
                  points $t_{ij}$, $1 \le i \le n$, $1 \le j \le m_i$, i.e. $x_{ij} = X_i(t_{ij})$. To extend MVLTS and
                  CooLTS to the functional case we use smoothed robust principal components
                  by the Sieves method introduced in Bali et al. (2011). The Sieves smoothing
                  method uses B-splines as a smoothing tool. Hence, we first project the
                  functional  data  on  a  finite  dimensional  space  by  using  appropriate  basis
                  functions, then we estimate the principal components by MVLTS or CooLTS in
                  the finite dimensional space and finally we transform the solution back to the
                  original functional space.
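
                  The first step, projecting onto a finite-dimensional space, needs the
                  basis functions evaluated at the design points. The following is a
                  minimal sketch of that evaluation for a cubic B-spline basis using the
                  Cox-de Boor recursion. The clamped knot vector with equally spaced
                  interior knots, the grid of 100 equidistant points, the basis dimension
                  50, and the helper names `clamped_knots` and `bspline_basis` are
                  assumptions made for illustration, not details taken from the Sieves
                  method itself.

```python
def clamped_knots(n_basis, degree=3, a=0.0, b=1.0):
    """Knot vector on [a, b] with repeated end knots and equally spaced interior knots."""
    n_interior = n_basis - degree - 1
    step = (b - a) / (n_interior + 1)
    interior = [a + step * j for j in range(1, n_interior + 1)]
    return [a] * (degree + 1) + interior + [b] * (degree + 1)


def bspline_basis(t, knots, degree=3):
    """Evaluate all B-spline basis functions at t (Cox-de Boor recursion)."""
    n_basis = len(knots) - degree - 1
    if t >= knots[-1]:
        # Right endpoint: only the last basis function is nonzero.
        values = [0.0] * n_basis
        values[-1] = 1.0
        return values
    # Degree 0: indicator functions of the half-open knot intervals.
    N = [1.0 if knots[i] <= t < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        new_N = []
        for i in range(len(knots) - d - 1):
            left = 0.0
            den = knots[i + d] - knots[i]
            if den > 0:
                left = (t - knots[i]) / den * N[i]
            right = 0.0
            den = knots[i + d + 1] - knots[i + 1]
            if den > 0:
                right = (knots[i + d + 1] - t) / den * N[i + 1]
            new_N.append(left + right)
        N = new_N
    return N


# Design matrix: 100 equidistant points in [0, 1], 50 cubic basis functions.
knots = clamped_knots(50, degree=3)
grid = [i / 99 for i in range(100)]
design = [bspline_basis(t, knots, degree=3) for t in grid]
```

                  Basis coefficients for each curve could then be obtained by least
                  squares against `design`. The principal components are estimated on
                  those coefficients and mapped back through the same basis.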

                  3.  Result
                      To assess robustness of MVLTS and CooLTS with functional data we carried
                  out  the  experiments  in  Boente  &  Salibian-Barrera  (2015)  which  introduce
                  complicated patterns of contamination. We compare our methods to classical
                  PCA (LS) and other robust PCA techniques such as the coordinatewise S-estimator
                  (CooS) of Boente & Salibian-Barrera (2015), the multivariate S-estimator
                  (MVS) of Maronna (2005), and the sieve projection-pursuit approach
                  (PP) of Bali et al. (2011). We also included in the comparisons the best
                  $q$-dimensional linear space (True) according to the data generating process as a
                  benchmark for all methods. For the S-estimates we consider Tukey's
                  bisquare function for $\rho$ with constants $c = 1.54764$, $b = 0.50$ and
                  $c = 3$, $b = 0.2426$. For the LTS estimates we consider $\alpha = 0.5$.
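
                  The two pairs of constants quoted for Tukey's bisquare function
                  (1.54764 with 0.50, and 3 with 0.2426) can be checked numerically.
                  The sketch below assumes the usual S-estimation convention: the loss is
                  scaled to have maximum 1, the first constant of each pair is the tuning
                  constant, and the second is the expected loss under a standard normal
                  variable. The names `tukey_rho` and `expected_rho` are hypothetical.

```python
import math


def tukey_rho(u, c):
    """Tukey's bisquare loss, scaled so that its maximum value is 1."""
    if abs(u) >= c:
        return 1.0
    r = (u / c) ** 2
    return 1.0 - (1.0 - r) ** 3


def expected_rho(c, n=2000):
    """E[rho_c(Z)] for Z ~ N(0, 1).

    Simpson's rule on [0, c] (doubled by symmetry) plus the tail mass
    P(|Z| > c), where the loss equals 1. n must be even.
    """
    h = c / n

    def integrand(u):
        return tukey_rho(u, c) * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    s = integrand(0.0) + integrand(c)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * integrand(k * h)
    inner = 2.0 * s * h / 3.0
    tail = math.erfc(c / math.sqrt(2.0))  # P(|Z| > c)
    return inner + tail


b_first = expected_rho(1.54764)   # compare with the quoted 0.50
b_second = expected_rho(3.0)      # compare with the quoted 0.2426
```

                  Under this convention the second constant of each pair follows from the
                  first, so only the tuning constant is a free choice.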
                      We generate functional data from a model with a finite-rank process (Model
                  1) and from a model with an infinite-rank process (Model 2). For Model 1 we
                  estimated one-dimensional approximations, while for Model 2 we estimated
                  four-dimensional approximations since this choice explains 95% of the total
                  variance. In all cases $n = 70$ functional observations were generated, where
                  each curve was observed at $m = 100$ equidistant instants in the interval [0,1].
                  A fraction $\epsilon$ of the curves has been contaminated. Figure 1 shows an example
                  of data generated from Model 1 with $\epsilon = 0.30$ (left) and from Model 2 with
                  $\epsilon = 0.90$ (right). Regular curves are shown in blue while contaminated curves
                  are shown in red. A total of 500 replications was generated for each
                  setting. A cubic B-spline basis of dimension $p = 50$ was used to project the
                  functional data. To assess the performance of the estimators we examine their

                                                                     429 | I S I   W S C   2 0 1 9