Page 87 - Contributed Paper Session (CPS) - Volume 5
CPS1060 Taik Guan T. et al.
PESTEL features selected in this paper for experiments, 12 features
demonstrated some impact or high impact on broadband development. These
features are GDP per capita, fixed broadband penetration, wireless broadband
penetration, telephony penetration, life expectancy, length of roads, economic
activity, GDP, GNI, electricity access, population density, and birth rate. This
observation is in line with findings in past research. However, the impacts of
labour force and secondary education are found to be contrary to the
literature review. As broadband technology evolves, tertiary education
might be a better indicator than secondary education. Goldfarb (2006) found
that university education improved the diffusion of the Internet. The low
correlation coefficient for labour force might open up an area of new research
or further literature review on whether occupation is a better feature than
labour force. Land size, agricultural land size, population size, rainfall and
average temperature are found to have minimal impact on broadband
development. The literature reviewed does not reveal any correlation between
these five features and broadband development. The research experiment
shows that the percentage of agricultural land has a high impact compared
with land size or agricultural land size by itself.
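The feature screening discussed above can be illustrated with a minimal sketch, assuming the Pearson correlation coefficient is the measure of impact. The feature names, the six-region toy values, and the helper `rank_features` are hypothetical and are not taken from the paper's data sets.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, response):
    """Return (name, r) pairs sorted by |r|, strongest correlation first."""
    scores = [(name, pearson(values, response))
              for name, values in features.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data for six regions (NOT values from the paper).
broadband = [5.0, 12.0, 18.0, 25.0, 31.0, 40.0]
features = {
    "gdp_per_capita": [2.0, 4.5, 6.0, 9.0, 11.0, 15.0],
    "land_size": [300.0, 120.0, 310.0, 90.0, 280.0, 150.0],
}

ranking = rank_features(features, broadband)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

In this toy setting the GDP-like feature ranks first and the land-size feature shows only a weak correlation, which mirrors the qualitative pattern reported in the text.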
It is concluded that the machine learning technique is a feasible model for
use in the telecom industry to classify geographic areas according to their
socioeconomic potential. Training data are available from the World Bank
databank and the Department of Statistics Malaysia to initiate the process of
machine learning. Even though the data sets have shortcomings in
feature sets and sample size, the existing data are good enough to
serve as prototyping data for statistical modeling, which
yields the interdependencies (correlation coefficients)
among the features and the targeted response. The statistical modeling has been
successful in generalizing the data and screening for important factors to
establish the optimal product formulation, which is an equation that relates
the geographical features to the socioeconomic response. By
applying the equation to a genetic algorithm, a large number of virtual
samples has been generated for SVM training and testing. The high accuracy achieved
in cross-validation and testing is good evidence that the SVM has been
properly trained. Finally, when real-life field data for states in Malaysia are
provided to the SVM, the machine can successfully classify the states
according to their socioeconomic potential.
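The pipeline summarised above (equation, virtual samples, SVM training, cross-validation, field classification) might be sketched as follows. Everything specific here is an assumption made for illustration: the linear equation and its coefficients, the seed vectors, the crossover and mutation scheme, and the use of a linear SVM trained by sub-gradient descent on the hinge loss in place of whichever SVM implementation the authors used.

```python
import random

# Assumed stand-in for the fitted equation; coefficients are illustrative only.
EQ_WEIGHTS = [0.8, 0.6, -0.1]
MARGIN = 0.15  # virtual samples scoring closer than this to zero are discarded

# Hypothetical standardised feature vectors standing in for the real samples.
SEEDS = [
    [0.9, 0.7, -0.2], [0.4, 0.5, 0.3], [0.1, -0.2, 0.8],
    [-0.3, 0.2, -0.5], [-0.6, -0.4, 0.1], [-0.8, -0.9, -0.6],
]

def response(x):
    """Response predicted by the assumed linear equation."""
    return sum(w * v for w, v in zip(EQ_WEIGHTS, x))

def make_virtual_samples(seeds, n_samples, rng, mutation=0.2):
    """GA-style generation: uniform crossover of two parents plus Gaussian mutation."""
    samples = []
    while len(samples) < n_samples:
        parent_a, parent_b = rng.sample(seeds, 2)
        child = [rng.choice(genes) + rng.gauss(0.0, mutation)
                 for genes in zip(parent_a, parent_b)]
        child = [max(-1.0, min(1.0, v)) for v in child]  # keep features in [-1, 1]
        score = response(child)
        if abs(score) >= MARGIN:
            samples.append((child, 1 if score > 0 else -1))
    return samples

def decision(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(model, x):
    return 1 if decision(model, x) > 0 else -1

def train_linear_svm(data, rng, epochs=50, lr=0.02, lam=0.001):
    """Linear SVM fitted by stochastic sub-gradient descent on the hinge loss."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x, y in order:
            violated = y * decision((w, b), x) < 1
            # L2 shrinkage on every step; hinge-loss step only on violations.
            w = [wi - lr * lam * wi + (lr * y * xi if violated else 0.0)
                 for wi, xi in zip(w, x)]
            if violated:
                b += lr * y
    return w, b

def cross_validate(data, k, rng):
    """Mean held-out accuracy over k folds."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_linear_svm(train, rng)
        hits = sum(1 for x, y in folds[i] if predict(model, x) == y)
        scores.append(hits / len(folds[i]))
    return sum(scores) / k

rng = random.Random(0)
virtual = make_virtual_samples(SEEDS, 300, rng)
cv_accuracy = cross_validate(virtual, 5, rng)
model = train_linear_svm(virtual, rng)

# Hypothetical "field" vectors, chosen to lie far from the class boundary.
FIELD = [[0.7, 0.6, 0.0], [-0.7, -0.5, 0.2]]
field_classes = [predict(model, x) for x in FIELD]
print(f"5-fold CV accuracy: {cv_accuracy:.3f}", field_classes)
```

Because the virtual samples are labelled by the same equation that generates them and near-boundary samples are dropped, high cross-validation accuracy is expected in this sketch; it shows the mechanics of the workflow rather than reproducing the paper's results.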
The research results show that land size and population size have a low
correlation with GNI per capita and fixed broadband penetration;
thus, this machine learning model can be applied to classify countries, states,
and urban and rural areas. Using a machine learning technique (MLT) to classify
the socioeconomic potential of a geographic area according to its
geographical features is a novelty of this research. The MLT is relatively more
76 | ISI WSC 2019