misleading conclusions concerning the fitting of a regression model. Peña and
Yohai (1995) noted that HLPs are mainly responsible for the masking and
swamping of outliers in regression models. It is now evident that HLPs are
also the prime source of collinearity-influential observations (Imon and Khan,
2003). Habshah et al. (2011) recognized that high leverage collinearity-
influential observations are those HLPs that can change the pattern of
multicollinearity. There are many good papers on the identification of high
leverage points in linear models (Hadi, 1993; Habshah et al., 2009; Lim et al.,
2016; Alguraibawi et al., 2015). However, not much work has focused on the
identification of high leverage points in high-dimensional data. High-
dimensional statistics refers to statistical inference when the number of
unknown parameters is of a much larger order than the sample size
(Bühlmann and Van de Geer, 2011). In real-life applications, samples are
always subject to noise or outliers.
The support vector machine (SVM) is one of the most important
techniques used to deal with problems in high-dimensional data. Recently,
many techniques that depend on the SVM have been developed. Dhhan et al.
(2015) and Rana et al. (2018) developed methods based on fixed-parameters
support vector regression (FP-SVR) to detect outliers and high leverage
points (HLPs) in high-dimensional data. Unfortunately, FP-SVR, which
employs Eps-SVR, is not very successful in the identification of mild outliers
and in other contamination scenarios. To remedy this problem, we propose
to use Nu-SVR to detect both extreme and mild outliers. Section 2 briefly
describes FP-SVR, the proposed Nu-SVR method is discussed in Section 3, a
numerical example and a simulation study are presented in Section 4, and,
finally, concluding remarks are given in Section 5.
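To illustrate the distinction between the two formulations, the following is a minimal sketch (assuming scikit-learn's SVR and NuSVR implementations; the data, kernel, and parameter values are illustrative assumptions, not the settings used in this paper). In Eps-SVR the tube width ε is fixed in advance, whereas in Nu-SVR the parameter ν bounds the fraction of observations allowed to fall outside the tube and the tube width is estimated from the data.

```python
# Minimal sketch (not this paper's exact procedure): contrast Eps-SVR,
# where the tube width epsilon is fixed a priori, with Nu-SVR, where
# nu bounds the fraction of errors and epsilon adapts to the data.
import numpy as np
from sklearn.svm import SVR, NuSVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)
y[:5] += 3.0  # inject a few extreme outliers (illustrative)

# Eps-SVR: points outside the fixed eps-tube become support vectors.
eps_svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)

# Nu-SVR: nu is an upper bound on the fraction of points outside the
# tube and a lower bound on the fraction of support vectors.
nu_svr = NuSVR(kernel="rbf", C=10.0, nu=0.1).fit(X, y)

print("Eps-SVR support vectors:", len(eps_svr.support_))
print("Nu-SVR support vectors: ", len(nu_svr.support_))
```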
2. Fixed Parameters Support Vector Regression
A practical procedure involving fixed-parameters ε-tube SV Regression
(SVR) has been suggested in order to improve the performance of the
standard SVR in detecting outliers. This method is attractive because it
consumes less time than the conventional methods and is able to detect
outliers without removing them from the data.
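To make this style of procedure concrete, the following is a minimal sketch of residual-based flagging with a fixed, small ε (a sketch of the general idea, not the exact FP-SVR algorithm of Dhhan et al.; the RBF kernel, the parameter values, and the MAD-based cutoff are illustrative assumptions): fit an SVR with a small fixed ε, then flag, rather than delete, observations with unusually large absolute residuals.

```python
# Minimal sketch of fixed-parameter eps-tube SVR outlier flagging
# (illustrative assumptions throughout; not the exact FP-SVR rules).
import numpy as np
from sklearn.svm import SVR

def flag_outliers(X, y, epsilon=0.001, C=10.0, cutoff=3.0):
    # A very small fixed epsilon makes the solution non-sparse:
    # nearly all points become support vectors, so the fit is
    # informed by most of the training data.
    svr = SVR(kernel="rbf", C=C, epsilon=epsilon).fit(X, y)
    resid = y - svr.predict(X)
    # Robust scale via MAD, so the cutoff itself is not distorted
    # by the very outliers being sought (an assumed, common choice).
    mad = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    flags = np.abs(resid) > cutoff * mad
    return flags, resid  # observations are flagged, not deleted
```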
The fixed-parameters ε-tube SVR exploits the non-sparseness of the
ε-insensitive loss function. As shown in Ceperic et al. (2014) and Guo et al.
(2010), the SVR model will rely on most of the training data if the value of
the threshold ε is very small, hence giving a non-sparse solution. When the ε
parameter is greater than zero, it is likely that a portion of the outliers are
not considered as support vectors (they fall inside the ε-zone), implying the
need for further iterations for