Page 86 - Contributed Paper Session (CPS) - Volume 5
P. 86
CPS1060 Taik Guan T. et al.
The result is in line with the behavior of machine learning where large
training samples will improve machine’s performance. With the framework of
the machine learning technique being established and proven, the trained and
tested SVM is then successfully applied to classify real-life field data in the
context of geographic areas in Malaysia.
4. Discussion and Conclusion
SVM does not memorize the data given. Instead, it learns the pattern from
the training datasets and classifies the response for the testing datasets. The
two obvious evidence of machine learning are:
- Consistency and accuracy of the cross-validation and testing with WB
and GA data
- Consistency and accuracy of the testing results with real-life data for
states in Malaysia.
Both learning algorithm and hypothesis sets are the solution tools (or
components) in the machine learning process. And the solution tools worked
successfully in this research. Together, the learning algorithm and the
hypothesis sets are named as the learning model. The research results showed
that a learning model had been successfully established. The SVM is the
hypothesis H, whereas the kernels (linear, polynomial and RBF) are the sub-
sets {h} of the hypothesis which is used for pattern recognition. The quadratic
programming is the learning algorithm used in the research.
where,
o ℎ denotes the hyperplane that generalizes the data of
geographical features in the feature space
o X denotes data that are transformed from input space into
features space with higher dimensions.
o x1,x2,,,xN are the specific data points of X
o {+1,−1} is the binary class, where +1 denotes areas having
socioeconomic potential; and -1 denotes areas without
socioeconomic potential.
Through the experiment, it is found that larger training samples deliver
higher training accuracy. This result is in line with the machine learning theory
of “when N is small, the delta in error is high.” It is also observed that, in line
with another machine learning. “when N is big, both sample-in and sample-
out will have about the same error.” The sample-in are data used for machine
training whereas sample-out are unseen data for machine testing. It the case
of this paper, it is true to say the SVM is doing well within the data sets
provided, and the framework of machine learning technique works. Of the 19
75 | I S I W S C 2 0 1 9