Page 446 - Contributed Paper Session (CPS) - Volume 4
CPS2564 Tiffany Rizkika et al.
epoch, so that the number of epochs needed to reach the desired target
is far less.
b. Non-volatile food price nowcasting
Nowcasting based on present data uses two methods: a time series-based
method (Nowcast Model) and a statistical filtering-based method followed by
cubic smoothing spline modelling (IQR-Spline Model, KDE-Spline Model).
Several data pre-processing techniques are applied in this study,
namely data cleaning, data integration, data reduction, and data
transformation. Because these techniques are not mutually exclusive,
they do not have to be carried out separately and can be applied
simultaneously (Han, Kamber, & Pei, 2012).
1. Data Cleaning. In the statistical filtering-based models (IQR-Spline
Model, KDE-Spline Model), data cleaning eliminates missing values,
whereas the time series-based model (Nowcast Model) imputes missing
values using the formula included in the Nowcast Model (Kim, Cha, &
Lee, 2017). The statistical filtering-based models also eliminate
extreme data (outliers) before continuing with cubic smoothing spline
modelling. Two outlier detection methods are used, namely parametric
(Interquartile Range, IQR) and non-parametric (Kernel Density
Estimation, KDE) detection.
2. Data Integration. Data integration is implemented to detect fraud in
the crowdsourced data. Adding market identity data shows how the data
are distributed across commodities, markets, days, and times.
3. Data Reduction. The data are restricted to the scope of the research
in terms of commodity, location, and time.
4. Data Transformation. Units are standardized for each commodity.
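The two outlier detection steps named under Data Cleaning can be sketched
as follows. This is a minimal illustration and not the authors'
implementation: the fence multiplier k = 1.5, the Gaussian kernel, the
rule-of-thumb bandwidth, the leave-one-out density, the 5% relative density
cutoff, and the sample prices are all assumptions introduced here.

```python
import math
import statistics


def iqr_filter(prices, k=1.5):
    """Keep prices inside the fences [Q1 - k*IQR, Q3 + k*IQR] (k is assumed)."""
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in prices if low <= p <= high]


def silverman_bandwidth(prices):
    """Rule-of-thumb bandwidth for a Gaussian kernel (an assumed choice)."""
    q1, _, q3 = statistics.quantiles(prices, n=4)
    sd = statistics.stdev(prices)
    iqr = q3 - q1
    scale = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * scale * len(prices) ** (-1 / 5)


def kde_filter(prices, rel_threshold=0.05):
    """Drop prices whose leave-one-out density is far below the maximum."""
    h = silverman_bandwidth(prices)
    if h == 0:  # all prices identical: nothing to flag
        return list(prices)

    def loo_density(i):
        # Gaussian kernel density at prices[i], estimated from the other points
        total = sum(
            math.exp(-0.5 * ((prices[i] - p) / h) ** 2)
            for j, p in enumerate(prices)
            if j != i
        )
        return total / ((len(prices) - 1) * h * math.sqrt(2 * math.pi))

    dens = [loo_density(i) for i in range(len(prices))]
    cutoff = rel_threshold * max(dens)  # assumed relative cutoff
    return [p for p, d in zip(prices, dens) if d >= cutoff]


# Hypothetical crowdsourced prices for one commodity, with one implausible report
prices = [10000, 10200, 9900, 10100, 10050, 9950, 10150, 50000]
iqr_clean = iqr_filter(prices)
kde_clean = kde_filter(prices)
```

For this sample, both filters drop the 50000 report and keep the other
seven prices. The leave-one-out density is used so that an isolated report
does not support itself through its own kernel.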
The cubic smoothing spline formula used for statistical filtering-based
modelling (Model 1 and Model 2) is:
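\[
\mathrm{PLS}(f) = \underbrace{\sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2}_{(a)}
  + \lambda \underbrace{\int \bigl[f''(x)\bigr]^2 \, dx}_{(b)}
\]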
In this function, part (a) is the sum of squared distances between the data
and the estimate, and part (b) is the roughness penalty, which measures the
smoothness of the curve. Lambda (λ) is a smoothing parameter that balances
the fit to the data against the smoothness of the curve and takes values in
the range 0 < λ < 1. The greater the lambda value, the greater the weight on
smoothness and the smaller the variance produced. The value of f(x) used in
the PLS function is obtained from the polynomial P_i(x). The polynomial used
is the third-order polynomial (k =
435 | ISI WSC 2019