Page 172 - Contributed Paper Session (CPS) - Volume 4
CPS2166 Divo Dharma Silalahi et al.
Robust wavelength selection using input scaling
of filter-wrapper methods on near infrared
spectral data of oil palm fruit mesocarp
Divo Dharma Silalahi¹, Habshah Midi², Jayanthi Arasan², Mohd Shafie Mustafa²,
Jean-Pierre Caliman¹
¹ SMART Research Institute, PT. SMART TBK, Riau, Indonesia
² Institute of Mathematical Research, Universiti Putra Malaysia, Serdang, Malaysia
Abstract
In this study, a new robust wavelength selection method based on input
scaling is introduced. The method, called the Filter-Wrapper method,
combines the modified Variable Importance Projection (VIP) and the modified
Monte Carlo Uninformative Variable Elimination (MCUVE) to scale the
wavelength variables as input factors. The modified VIP uses the orthogonal
components of PLS to identify the informative variables in the model by
applying the amounts of variation in both X and y (SSX and SSY)
simultaneously. The VIP score is then calculated using the normalized
loading t obtained from the loading w. With the VIP score used to scale the
wavelength variables as input, the modified MCUVE uses a robust tolerance
interval to eliminate the j-th most uninformative variable in the scaled
input wavelengths. In experiments on simulated and real data, the proposed
method offered several advantages: improved model interpretability,
computational efficiency, and increased model accuracy.
Keywords
Partial Least Squares, Variable Selection, Variable Importance Projection,
Uninformative Variable Elimination
1. Introduction:
In practice, it is considered difficult to eliminate all the irrelevant
variables, and it is also noted that using a smaller number of X variables
in the calibration can lead to over- or under-fitting. To overcome this, a
new procedure for wavelength selection using an input scaling method based
on the combination of the Orthogonal Projections to Latent Structures
(OPLS)-VIP score and a modified UVE is proposed. The scaling method, called
mod-VIP-MCUVE (also denoted the Filter-Wrapper method), helps to ensure that
all the wavelengths have equal contribution in the model and improves the
convergence speed of the algorithm (Kim et al., 2015; Kim et al., 2016). For
Near Infrared Spectroscopy (NIRS) spectral data, the method highlights the
relevant wavelengths and downweights the influence of irrelevant
wavelengths in the Partial Least Squares Regression (PLSR) model. In recent
work, it has been found that only the auto-scaling method is mostly applied
161 | ISI WSC 2019