Page 270 - Contributed Paper Session (CPS) - Volume 4

CPS2222 Abdullah M.R. et al.
misleading conclusions concerning the fit of a regression model. Pena and
Yohai (1995) noted that HLPs are mainly responsible for the masking and
swamping of outliers in regression models. It is now evident that HLPs are
also the prime source of collinearity-influential observations (Imon and Khan,
2003). Habshah et al. (2011) recognized that high leverage collinearity-
influential observations are those HLPs that can change the pattern of
multicollinearity. There are many good papers on the identification of high
leverage points in linear models (Hadi, 1993; Habshah et al., 2009; Lim et al.,
2016; Alguraibawi et al., 2015). However, not much work has focused on the
identification of high leverage points in high-dimensional data. High-
dimensional statistics refers to statistical inference when the number of
unknown parameters is of much larger order than the sample size (Bühlmann
and Van De Geer, 2011). In real-life applications, samples are always subject
to noise, or outliers.
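As background for the leverage diagnostics cited above, the sketch below computes the classical hat-matrix leverages h_ii = diag(X(X'X)^(-1)X') with a twice-the-mean cutoff. This is only an illustrative low-dimensional diagnostic (the function name and cutoff rule are our choices, not a method from the cited papers); note that X'X is singular when p exceeds n, which is precisely why alternative methods are needed in the high-dimensional setting.

```python
import numpy as np

def hat_leverages(X):
    """Leverage values h_ii = diag(X (X'X)^{-1} X') for design matrix X.

    Requires n > p; in the high-dimensional case X'X is singular and
    this classical diagnostic is no longer applicable.
    """
    # Add an intercept column, as in ordinary least-squares regression.
    Xc = np.column_stack([np.ones(len(X)), X])
    H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
    return np.diag(H)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[0] = [10.0, 10.0]                      # plant one high leverage point
h = hat_leverages(X)
cutoff = 2 * (X.shape[1] + 1) / len(X)   # twice-the-mean rule: 2(p+1)/n
print(h[0] > cutoff)                     # the planted point is flagged
```

The leverages always sum to p + 1 (the trace of the hat matrix), so the twice-the-mean rule flags points whose leverage is at least double the average value (p + 1)/n.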
    The support vector machine (SVM) is one of the most important
techniques for dealing with problems in high-dimensional data. Recently,
many techniques based on SVM have been developed. Dhhan et al. (2015)
and Rana et al. (2018) developed a method based on fixed-parameters
support vector regression (FP-SVR) to detect outliers and high leverage
points (HLPs) in high-dimensional data. Unfortunately, FP-SVR, which
employs Eps-SVR, is not very successful in identifying mild outliers and in
other contamination scenarios. To remedy this problem, we propose using
Nu-SVR to detect both extreme and mild outliers. Section 2 briefly
describes FP-SVR, the proposed Nu-SVR method is discussed in Section 3,
a numerical example and simulation study are presented in Section 4, and,
finally, concluding remarks are given in Section 5.
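To make the Eps-SVR versus Nu-SVR distinction concrete, the following minimal sketch fits both variants with scikit-learn on synthetic data with a few gross outliers; in Eps-SVR the tube width epsilon is fixed in advance, whereas in Nu-SVR the parameter nu bounds the fraction of errors and the tube width is determined from the data. The data, parameter values, and residual-based flagging here are illustrative choices, not the procedure of the cited papers.

```python
import numpy as np
from sklearn.svm import SVR, NuSVR

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)
y[:5] += 3.0                      # inject a few extreme outliers

# Eps-SVR: the tube width epsilon must be fixed in advance.
eps_model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)

# Nu-SVR: nu upper-bounds the fraction of errors and lower-bounds the
# fraction of support vectors; the tube width adapts to the data.
nu_model = NuSVR(kernel="rbf", C=1.0, nu=0.5).fit(X, y)

# Points with large absolute residuals are outlier candidates.
res = np.abs(y - nu_model.predict(X))
print(np.argsort(res)[-5:])       # indices of the 5 largest residuals
```

Because nu controls the error fraction directly, Nu-SVR removes the need to guess a suitable epsilon for a given noise level, which is the practical motivation for preferring it here.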

                  2.    Fixed Parameters Support Vector Regressions
    A practical procedure involving fixed-parameters ε-tube SV Regression
(SVR) has been suggested in order to improve the performance of the
standard SVR in detecting outliers. This method is attractive because it
consumes less time than the conventional methods and is able to detect
outliers without removing them from the data.
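A minimal sketch of this idea, under our own illustrative parameter choices rather than the exact FP-SVR settings of Dhhan et al.: fix a very small epsilon so that nearly every point becomes a support vector (the non-sparse solution), then flag points whose residuals lie far outside the narrow tube. The residual cutoff below is a hypothetical rule for demonstration only.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(80, 1))
y = 2 * X.ravel() + rng.normal(scale=0.2, size=80)
y[0] += 5.0                        # one gross outlier

# Very small fixed epsilon: almost all points fall outside the tube,
# so the solution is non-sparse and outliers are not masked.
model = SVR(kernel="linear", C=10.0, epsilon=0.01).fit(X, y)
res = np.abs(y - model.predict(X))
flagged = np.where(res > 2.5 * res.std())[0]   # illustrative cutoff
print(flagged)                     # contains the planted outlier, index 0
```

Because the outliers remain in the fit as support vectors rather than being deleted, the procedure detects them in a single pass, which is the time advantage noted above.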
   The non-sparseness of the ε-insensitive loss function is exploited in the
fixed-parameters ε-tube SVR. As shown in Ceperic et al. (2014) and Guo et
al. (2010), the SVR model will rely on most of the training data if the value
of the threshold ε is very small, hence giving a non-sparse solution. When
the ε parameter is larger than zero, it is likely that some of the outliers are
not considered support vectors (they fall inside the ε-zone), implying the
requirement for further iterations for



                                                                     259 | I S I   W S C   2 0 1 9