Page 446 - Contributed Paper Session (CPS) - Volume 4

CPS2564 Tiffany Rizkika et al.
                      epoch, so that the number of epochs needed to reach the desired target
                      is far less.
                      b.  Non-volatile food price nowcasting
                         Nowcasting based on present data uses two methods: a time
                      series-based method (Nowcast Model) and a statistical filtering-based
                      method followed by cubic smoothing spline modelling (IQR-Spline Model,
                      KDE-Spline Model). Several data pre-processing techniques are applied
                      in this study, namely data cleaning, data integration, data reduction,
                      and data transformation. Because these preprocessing techniques are not
                      mutually exclusive, they do not have to be carried out separately and
                      can be done simultaneously (Han, Kamber, & Pei, 2012).
                      1.  Data Cleaning. In the statistical filtering-based models (IQR-Spline
                         Model, KDE-Spline Model), data cleaning eliminates missing values,
                         whereas the time series-based model (Nowcast Model) imputes missing
                         values using the formula included in the Nowcast Model (Kim, Cha,
                         & Lee, 2017). The statistical filtering-based models also eliminate
                         extreme data (outliers) before continuing with cubic smoothing
                         spline modelling. Two outlier detection methods are used in the
                         statistical filtering-based models, namely parametric (Interquartile
                         Range, IQR) and non-parametric (Kernel Density Estimation, KDE)
                         detection.
                      2.  Data Integration. Data integration is implemented to detect fraudulent
                         entries in the crowdsourced data. Adding market identity data shows
                         how the data are distributed across commodities, markets, days, and
                         times.
                      3.  Data Reduction. The data are restricted to the scope of the research
                         in terms of commodities, location, and time.
                      4.  Data Transformation. Units are standardized for each commodity.
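
The two outlier filters named in the data cleaning step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `iqr_filter` and `kde_filter`, the fence multiplier `k = 1.5`, the normal-reference bandwidth, and the 5% density cut-off are all assumptions that the paper does not state.

```python
import numpy as np

def iqr_filter(prices, k=1.5):
    """Keep prices inside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed fence."""
    prices = np.asarray(prices, dtype=float)
    prices = prices[~np.isnan(prices)]          # drop missing values first
    q1, q3 = np.percentile(prices, [25, 75])
    iqr = q3 - q1
    keep = (prices >= q1 - k * iqr) & (prices <= q3 + k * iqr)
    return prices[keep]

def kde_filter(prices, density_quantile=0.05):
    """Drop the points with the lowest Gaussian kernel density estimate."""
    prices = np.asarray(prices, dtype=float)
    prices = prices[~np.isnan(prices)]          # drop missing values first
    n = prices.size
    # Normal-reference rule-of-thumb bandwidth (an assumption, not from the paper).
    h = 1.06 * prices.std(ddof=1) * n ** (-1 / 5)
    # Gaussian KDE evaluated at every observation.
    z = (prices[:, None] - prices[None, :]) / h
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    # Points whose density falls at or below the chosen quantile are treated as outliers.
    threshold = np.quantile(density, density_quantile)
    return prices[density > threshold]

# Hypothetical crowdsourced prices with one missing value and one extreme report.
reports = [10, 11, 12, 11, 10, 12, 11, float("nan"), 100]
print(iqr_filter(reports))
print(kde_filter(reports))
```

Either filtered series would then be passed on to the cubic smoothing spline step.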
                  Here  is  the  cubic  smoothing  spline  formula  which  is  used  for  statistical
                  filtering-based modelling (Model 1 and Model 2):

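                   \[
                   \mathrm{PLS}(f) = \underbrace{\sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2}_{(a)} + \lambda \underbrace{\int \bigl[f''(x)\bigr]^2 \, dx}_{(b)}
                   \]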
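
As a rough numerical illustration of the penalized least squares criterion, the sketch below evaluates its two parts for a candidate polynomial f. It only evaluates the criterion and does not fit a spline. The function name `pls`, the sample data, and the choice to integrate over the observed range of x are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

def pls(x, y, f, lam):
    """Return (fit term, roughness term, total) of the criterion for polynomial f."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Part (a): sum of squared distances between the data and the estimate.
    fit_term = float(np.sum((y - f(x)) ** 2))
    # Part (b): integral of the squared second derivative, computed exactly
    # for a polynomial over [min(x), max(x)] (assumed integration limits).
    f2 = f.deriv(2)
    antiderivative = (f2 * f2).integ()
    roughness = float(antiderivative(x.max()) - antiderivative(x.min()))
    return fit_term, roughness, fit_term + lam * roughness

# Hypothetical data lying exactly on f(x) = x^3, so the fit term vanishes
# and only the roughness penalty contributes.
x = [0.0, 0.5, 1.0]
y = [0.0, 0.125, 1.0]
cubic = Polynomial([0, 0, 0, 1])
for lam in (0.1, 0.5, 0.9):
    print(lam, pls(x, y, cubic, lam))
```

A larger λ raises the cost of the same curvature, which is why minimizing the criterion with a larger λ favours smoother curves.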

                     In the PLS function, part (a) measures the distance between the data and
                  the estimate (the residual sum of squares), and part (b) is a roughness
                  penalty that measures the smoothness of the curve. Lambda (λ) is a smoothing
                  parameter that controls the balance between fidelity to the data and the
                  smoothness of the curve, with a value range of 0 < λ < 1. The greater the
                  lambda value, the greater the weight given to smoothness and the smaller the
                  variance produced. The value of f(x) used in the PLS function is obtained from
                  the polynomial P_i(x). The polynomial used is the third-order polynomial (k =

                                                                     435 | I S I   W S C   2 0 1 9