Page 172 - Contributed Paper Session (CPS) - Volume 4

CPS2166 Divo Dharma Silalahi et al.

                               Robust wavelength selection using input scaling
                                 of filter-wrapper methods on near infrared
                                   spectral data of oil palm fruit mesocarp
                  Divo Dharma Silalahi¹, Habshah Midi², Jayanthi Arasan², Mohd Shafie Mustafa²,
                                              Jean-Pierre Caliman¹
                                ¹ SMART Research Institute, PT. SMART TBK, Riau, Indonesia
                        ² Institute of Mathematical Research, Universiti Putra Malaysia, Serdang, Malaysia

                  Abstract
                  In this study, a new robust wavelength selection method based on input
                  scaling is introduced. The method, called the Filter-Wrapper method,
                  combines the modified Variable Importance in Projection (VIP) and the
                  modified Monte Carlo Uninformative Variable Elimination (MCUVE) to scale
                  the wavelength variables as input factors. The modified VIP uses the
                  orthogonal components of PLS to identify the informative variables in the
                  model by applying the amounts of variation in both X and y (SSX and SSY)
                  simultaneously. The VIP score is then calculated using the normalized
                  loading t from the obtained loading w. Using the VIP score to scale the
                  wavelength variables, the modified MCUVE uses a robust tolerance interval
                  to eliminate the j-th most uninformative variable in the scaled input
                  wavelengths. In experiments using simulated and real data, the proposed
                  method offered several advantages, such as improved model
                  interpretability, computational efficiency, and increased model accuracy.
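
The idea of scoring each wavelength with a VIP value and then using that score to scale the input can be illustrated with a small sketch. The Python code below computes the standard (unmodified) VIP scores from a plain NIPALS PLS1 fit and multiplies each column of X by its score. It is only an illustration under stated assumptions: it does not implement the orthogonal-component variant or the robust MCUVE elimination step of the proposed method, and the function name pls1_vip, the synthetic data, and the choice of two components are hypothetical.

```python
import math
import random


def pls1_vip(X, y, n_components):
    """Standard VIP scores from a NIPALS PLS1 fit (not the paper's modified VIP)."""
    n, p = len(X), len(X[0])
    # Mean-center working copies of X and y.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, ssy = [], []
    for _ in range(n_components):
        # Weight vector w is proportional to X'y, normalized to unit length.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y for X to explain
        w = [v / norm for v in w]
        # Scores t = Xw, the X-loadings, and the y-loading q.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        ssy.append(q * q * tt)  # y-variation (SSY) explained by this component

    total = sum(ssy)
    return [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(ssy, weights)) / total)
        for j in range(p)
    ]


# Synthetic example (assumed data): only the first two of six variables drive y.
rng = random.Random(0)
n_samples, n_vars = 200, 6
X = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[1] + 0.05 * rng.gauss(0.0, 1.0) for row in X]

vip = pls1_vip(X, y, n_components=2)
# Input scaling: weight each variable (wavelength) by its VIP score.
X_scaled = [[row[j] * vip[j] for j in range(n_vars)] for row in X]
print([round(v, 2) for v in vip])
```

Because each weight vector has unit length, the squared scores in this sketch average to one, so a score above one marks a variable of above-average importance. Scaling by the score therefore amplifies such variables and shrinks the rest before any later elimination step.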

                  Keywords
                  Partial Least Squares, Variable Selection, Variable Importance in Projection,
                  Uninformative Variable Elimination

                  1.  Introduction:
                      In practice, it is difficult to eliminate all the irrelevant variables,
                  and it is also noted that using a smaller number of X variables in the
                  calibration can result in over- or underfitting. To overcome this, a new
                  procedure for wavelength selection is proposed, using an input scaling
                  method based on the combination of the Orthogonal Projections to Latent
                  Structures (OPLS)-VIP score and the modified UVE. The scaling method,
                  called mod-VIP-MCUVE (also denoted the Filter-Wrapper method), helps
                  ensure that all the wavelengths contribute equally to the model and
                  improves the convergence speed of the algorithm (Kim et al., 2015; Kim
                  et al., 2016). With regard to Near Infrared Spectroscopy (NIRS) spectral
                  data, the method has the benefit of highlighting the relevant wavelengths
                  and downgrading the influence of irrelevant wavelengths in the Partial
                  Least Squares Regression (PLSR) model. In recent work, it has been noted
                  that only the auto-scaling method is mostly applied
                                                                     161 | ISI WSC 2019