Page 357 - Special Topic Session (STS) - Volume 2
STS500 Neo S.K. et al.
Profiling the internet economy in Singapore
Neo Soo Khee, Jeremy Tan, Choo Kit Hoong
Department of Statistics Singapore
Abstract
Recognising the need to understand the effects the internet has on the
economy, the Singapore Department of Statistics has embarked on a pilot
project to make use of web-based data sources to profile the internet
economy in Singapore. This paper first defines the internet economy and then
discusses the work undertaken to use information from firms’ websites to
profile the type of internet usage of the firms. Specifically, it details the use of
web scraping tools to first extract firm website addresses, and then to extract
relevant features from the firms’ websites. The websites are then classified
into one of four internet categories using supervised machine learning. The
paper concludes with the experiences of the pilot project and future work to
expand the scope of the project.
Keywords
internet economy; web scraping; machine learning
1. Introduction
The internet permeates many aspects of our society, from the way people
interact to how companies and businesses operate. Over the last few decades,
the internet has provided growth and start-up opportunities for many companies.
Google and Temasek (2016) estimated that there would be 3.8 million new
internet users every month in Southeast Asia, making it the fastest growing
internet region between 2015 and 2020¹. Based on the Infocomm Media
Development Authority’s Business Infocomm Usage Survey, business
usage of the internet among enterprises in Singapore increased from 82% in 2014 to
91% in 2018². Consequently, the internet economy in Southeast Asia is
expected to grow exponentially. There is also a growing demand
for a better understanding of the nature and effects of the internet economy.
The Singapore Department of Statistics (DOS) embarked on a pilot project
to make use of web-based data sources to profile the internet economy in
Singapore to better understand this emerging trend. Several national statistical
offices have explored the use of web-based data sources. For instance,
Statistics Netherlands studied the use of data from the web to measure the
internet economy³, while the Italian National Institute of Statistics explored the
use of web-based data to update the business registry⁴.
ISI WSC 2019