Page 440 - Contributed Paper Session (CPS) - Volume 4
CPS2526 Holger Cevallos-Valdiviezo et al.
starting values random orthogonal matrices are generated and as initial
location estimate we use the spatial median of the data. The best local
minimum reached over these starts is then taken as the approximation of the global optimum
(see also Cevallos-Valdiviezo & Van Aelst (2019);
Boente & Salibian-Barrera (2015)).
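The multi-start strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of starts and the `local_search` callable (assumed to return a pair of objective value and solution) are hypothetical, and the text does not say whether the full orthogonal matrix or only some of its columns is used as a starting value. The spatial median is computed here by Weiszfeld iterations, which is one common choice.

```python
import math
import random

def spatial_median(points, tol=1e-10, max_iter=1000):
    """Weiszfeld iterations for the point minimising the sum of Euclidean distances."""
    d = len(points[0])
    m = [sum(p[k] for p in points) / len(points) for k in range(d)]  # start at the mean
    for _ in range(max_iter):
        num, den = [0.0] * d, 0.0
        for p in points:
            dist = math.sqrt(sum((p[k] - m[k]) ** 2 for k in range(d)))
            if dist < 1e-12:  # skip a point that coincides with the current iterate
                continue
            den += 1.0 / dist
            for k in range(d):
                num[k] += p[k] / dist
        if den == 0.0:
            break
        new = [v / den for v in num]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new, m)))
        m = new
        if shift < tol:
            break
    return m

def random_orthogonal(d, rng):
    """Random d x d orthogonal matrix: Gram-Schmidt applied to Gaussian vectors."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for u in rows:
            dot = sum(a * b for a, b in zip(u, v))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # discard the rare near-degenerate draw
            rows.append([a / norm for a in v])
    return rows

def multi_start(data, local_search, n_starts=10, seed=0):
    """Run a (hypothetical) local search from several random starts, keep the best."""
    rng = random.Random(seed)
    centre = spatial_median(data)  # initial location estimate
    d = len(data[0])
    best = None
    for _ in range(n_starts):
        value, solution = local_search(data, centre, random_orthogonal(d, rng))
        if best is None or value < best[0]:
            best = (value, solution)
    return best
```

Any iterative minimiser of the MVLTS or CooLTS objective could be passed as `local_search`; the sketch only shows how the starting values are produced and how the best local minimum is retained.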
d. Functional data
In most applications, curves are only partially observed at different design
points $t_{ij}$, $1 \le i \le n$, $1 \le j \le m_i$, i.e. $x_{ij} = X_i(t_{ij})$. To extend MVLTS and
CooLTS to the functional case we use smoothed robust principal components
by the Sieves method introduced in Bali et al. (2011). The Sieves smoothing
method uses B-splines as a smoothing tool. Hence, we first project the
functional data on a finite dimensional space by using appropriate basis
functions, then we estimate the principal components by MVLTS or CooLTS in
the finite dimensional space and finally we transform the solution back to the
original functional space.
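The first and last of these three steps (projection onto a finite-dimensional basis and transformation back to the original space) can be sketched as follows. This is an illustration under stated assumptions, not the Sieves procedure of Bali et al. (2011): it uses a clamped cubic B-spline basis with equally spaced knots on [0, 1], plain least squares for the projection, and hypothetical function names. The robust principal component step in coefficient space (MVLTS or CooLTS) is deliberately left out.

```python
def clamped_knots(dim, degree=3):
    """Equally spaced knots on [0, 1], repeated at both ends (clamped)."""
    n_inner = dim - degree - 1
    inner = [i / (n_inner + 1) for i in range(1, n_inner + 1)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)

def bspline(j, k, knots, t):
    """Value at t of the j-th B-spline of degree k (Cox-de Boor recursion)."""
    if k == 0:
        if knots[j] <= t < knots[j + 1]:
            return 1.0
        # Close the last non-empty interval at the right end of the domain.
        if t == knots[-1] and knots[j] < knots[j + 1] == t:
            return 1.0
        return 0.0
    left = right = 0.0
    if knots[j + k] > knots[j]:
        left = (t - knots[j]) / (knots[j + k] - knots[j]) * bspline(j, k - 1, knots, t)
    if knots[j + k + 1] > knots[j + 1]:
        right = ((knots[j + k + 1] - t) / (knots[j + k + 1] - knots[j + 1])
                 * bspline(j + 1, k - 1, knots, t))
    return left + right

def design_matrix(grid, dim, degree=3):
    """Rows: grid points; columns: basis functions evaluated at those points."""
    knots = clamped_knots(dim, degree)
    return [[bspline(j, degree, knots, t) for j in range(dim)] for t in grid]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def project(curve, basis):
    """Least-squares basis coefficients of one discretised curve (normal equations)."""
    dim = len(basis[0])
    gram = [[sum(row[j] * row[c] for row in basis) for c in range(dim)] for j in range(dim)]
    rhs = [sum(row[j] * y for row, y in zip(basis, curve)) for j in range(dim)]
    return solve(gram, rhs)

def reconstruct(coefs, basis):
    """Map basis coefficients back to function values on the grid."""
    return [sum(c * b for c, b in zip(coefs, row)) for row in basis]
```

In use, each observed curve would be passed through `project`, the principal components would be estimated from the resulting coefficient vectors, and `reconstruct` would carry the estimated directions back to the observation grid.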
3. Result
To assess the robustness of MVLTS and CooLTS with functional data, we carried
out the experiments of Boente & Salibian-Barrera (2015), which introduce
complicated patterns of contamination. We compare our methods to classical
PCA (LS) and other robust PCA techniques such as the coordinatewise
S-estimator (CooS) of Boente & Salibian-Barrera (2015), the multivariate
S-estimator (MVS) of Maronna (2005), and the sieve projection-pursuit approach
(PP) of Bali et al. (2011). We also included in the comparisons the best
$q$-dimensional linear space (True) according to the data generating process as a
benchmark for all methods. For the S-estimates we consider Tukey's
bisquare function for $\rho$ with constants $c = 1.54764$, $b = 0.50$ and $c = 3$,
$b = 0.2426$. For the LTS estimates we consider $\alpha = 0.5$.
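One way to read the two pairs of constants quoted above (1.54764 with 0.50, and 3 with 0.2426) is as a tuning constant of Tukey's bisquare function paired with the expected value of that function under a standard normal variable. The short check below works under that reading, which is an assumption: it takes the bisquare function scaled to a maximum of one, and the function names are hypothetical.

```python
import math

def rho_bisquare(u, c):
    """Tukey's bisquare rho, scaled so that rho(u) = 1 for |u| >= c."""
    if abs(u) >= c:
        return 1.0
    s = (u / c) ** 2
    return 1.0 - (1.0 - s) ** 3

def expected_rho(c, n_steps=2000):
    """E[rho_c(Z)] for Z ~ N(0, 1): Simpson's rule on [-c, c] plus the tail mass."""
    h = 2.0 * c / n_steps  # n_steps must be even for Simpson's rule
    total = 0.0
    for i in range(n_steps + 1):
        u = -c + i * h
        w = 1 if i in (0, n_steps) else (4 if i % 2 else 2)
        total += w * rho_bisquare(u, c) * math.exp(-0.5 * u * u)
    inside = total * h / 3.0 / math.sqrt(2.0 * math.pi)
    tail = math.erfc(c / math.sqrt(2.0))  # P(|Z| > c), where rho equals 1
    return inside + tail

for c in (1.54764, 3.0):
    print(c, round(expected_rho(c), 4))
```

Under these assumptions the computed expectations agree with the quoted second constants to about three decimal places, which supports reading each pair as a tuning constant together with its consistency constant at the normal model.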
We generated functional data from a model with a finite-rank process (Model
1) and from a model with an infinite-rank process (Model 2). For Model 1 we
estimated one-dimensional approximations, while for Model 2 we estimated
four-dimensional approximations since this choice explains 95% of the total
variance. In all cases $n = 70$ functional observations were generated, where
each curve was observed at $m = 100$ equidistant instants in the interval [0,1].
A fraction $\epsilon$ of the curves has been contaminated. Figure 1 shows an example
of data generated from Model 1 with $\epsilon = 0.30$ (left) and from Model 2 with
$\epsilon = 0.90$ (right). Regular curves are shown in blue while contaminated curves
are shown in red. A total of 500 replications was generated for each
setting. A cubic B-spline basis of dimension $p = 50$ was used to project the
functional data. To assess the performance of estimators we examine their
429 | ISI WSC 2019