Page 364 - Special Topic Session (STS) - Volume 2
STS500 Fauzana I. et al.
the use of big data. The objective is to provide a better view for monitoring
and analysing consumer prices. A further aim is to produce a price analysis of
a new basket, which will be used as value-added information for the Consumer
Price Index in Malaysia.
2. Methodology
The initiative is to create an internal portal for Price Intelligence (PI). It
involves modernising data collection tools to improve the quality of the
Consumer Price Index (CPI) in Malaysia. The modernisation of data collection
consists of adopting web scraping techniques to extract price data from
relevant websites for the CPI compilation. The idea is to crawl data from
hypermarket websites and collect it in the big data project. From there,
analysis, data visualisation, data mining, reports, dashboards and alerts can
be produced. Discussions and
consultations on the Price Intelligence Module are held from time to time
among the respective parties involved in the project. Meetings to discuss the
progress of the project are also held on a weekly basis. Within the Department
of Statistics Malaysia, the parties involved in the Price Intelligence Module
include the Prices, Income and Expenditure Statistics Division, the
Methodology and Research Division and the Information Management Division.
Among the challenges at the initial
stage of the Price Intelligence Module, at least for the industry personnel,
is understanding the nature and scope of work of the Department of Statistics
Malaysia, which involves the codes and classifications used, the items, the
price statistics and the very definition of the Consumer Price Index itself.
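The scraping step described above, extracting item names and prices from a hypermarket web page, can be illustrated with a minimal sketch. It assumes a hypothetical page layout in which each product is a list item containing `name` and `price` spans, with prices written as `RM <amount>`. The markup, the class names and the `scrape_prices` helper are illustrative assumptions, not the project's actual crawler, and real sites will need their own selectors.

```python
from html.parser import HTMLParser

# Hypothetical product-listing markup; real hypermarket pages will differ.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Rice 5kg</span><span class="price">RM 28.90</span></li>
  <li class="product"><span class="name">Cooking Oil 1kg</span><span class="price">RM 6.50</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collect name/price records from spans with class 'name' or 'price'."""

    def __init__(self):
        super().__init__()
        self._field = None    # which span we are currently inside, if any
        self._current = {}    # record being built for the current list item
        self.records = []     # completed records

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field is None:
            return  # text outside the spans of interest (e.g. whitespace)
        text = data.strip()
        if self._field == "price":
            # Drop the assumed "RM" currency prefix and convert to a number.
            self._current["price"] = float(text.replace("RM", "").strip())
        else:
            self._current["name"] = text

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            # End of a product item: store the record and start a new one.
            self.records.append(self._current)
            self._current = {}


def scrape_prices(html):
    """Return a list of {'name': ..., 'price': ...} dicts found in the HTML."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.records


records = scrape_prices(SAMPLE_HTML)
for record in records:
    print(record["name"], record["price"])
```

The sketch parses a fixed string so that it runs without network access. In practice the HTML would be downloaded from each selected website before being passed to a parser of this kind.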
The Price Intelligence Module includes a Data Management component, whose
objective is to classify raw online data under the corresponding
Classification of Individual Consumption According to Purpose (COICOP) code
and to provide a working platform for managing PI data. Data management thus
involves matching the data with COICOP. There is also a crawling component,
which covers monitoring of, and alerts on, the crawling process for the
selected websites. The Price Lake involves the data generator, the Big Data
concept
and the monitoring of data storage. The PI Module also has an Analytics &
Visualisation component, which involves analysis and visualisation using R
programming and Tableau. As for PI Data Processing, since the data are crawled
continuously, data processing cannot be conducted in a time-based manner.
This is to prevent the FTP folder from becoming crammed. That said, as the PI
module in Malaysia is making its maiden journey, there are many challenges and
issues related to it. Besides the fact that the price data crawled every day
keep changing, some of the issues that are inevitably encountered are price
phishing, prices that become too broad, and other issues.
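The Data Management task described earlier, matching raw online items to their COICOP classes, can be sketched with a simple keyword rule table. This is only an illustrative approach, not the matching method actually used in the PI module, which the text does not describe. The class codes follow COICOP 1999 labels, while the keyword lists and the `classify` helper are assumptions made for this example.

```python
# Illustrative keyword rules. The codes follow COICOP 1999 class labels,
# but the keyword lists are assumptions made for this sketch.
COICOP_RULES = {
    "01.1.1": ("rice", "bread", "flour", "noodle"),  # Bread and cereals
    "01.1.4": ("milk", "cheese", "egg"),             # Milk, cheese and eggs
    "01.1.5": ("oil", "margarine"),                  # Oils and fats
}


def classify(item_name):
    """Return the first COICOP class with a keyword that starts a word of the name.

    Returns None when no rule matches, so the item can be sent for manual review.
    """
    words = item_name.lower().split()
    for code, keywords in COICOP_RULES.items():
        if any(word.startswith(key) for word in words for key in keywords):
            return code
    return None


for name in ("Rice 5kg", "Cooking Oil 1kg", "Fresh Eggs Grade A", "Shampoo 400ml"):
    print(name, "->", classify(name))
```

Prefix matching on keywords is crude and will misclassify some items, so unmatched or ambiguous names would still need manual handling.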
353 | ISI WSC 2019