Page 374 - Special Topic Session (STS) - Volume 2
STS500 Li J.
certain extent; compared with the current era of big data, the past can be
called the era of small data, in which some trends could not be revealed and
many conclusions were hard to reach because of the limited size of the data.
With the unprecedented increase in the data acquired, the information
contained in the data has also grown explosively. In the analysis of massive
data, the meaning of any single data point seems less important, but the
aggregated data as a whole shows a “1+1>2” effect and generates new
information beyond the original area.
(2) Reuse of historical data, meaning that the potential value of data can
be mined again with each new use, because data does not become worthless
after being used once. In traditional thinking, data is stored and filed away
once it has served its specific purpose, but the era of big data tells us that
data can be reused, and each round of mining can bring more useful
information.
(3) Reciprocal recombination of data in different areas, meaning that the
use of data is not limited to one area and new valuable information may be
generated when data from one area is applied in another. For example, by
combining flight information with weather information on the FlyOnTime
website and analyzing historical big data, it was concluded that, for flights
from Boston to LaGuardia Airport, New York, the delay due to heavy fog is
twice as long as that due to snow. Therefore, the likelihood of a flight delay
and the delay time can be estimated more accurately through data exchange.
(4) Reverse use of “useless” data, meaning that “useless” data is put to
work by means of reverse thinking so as to complement its forward use.
Google's spelling checker is widely regarded as among the best in the world.
Its powerful back-end database contains the spelling mistakes entered into
the search box in the 3 billion queries processed every day, and a feedback
loop informs the system of what users actually meant to enter so that it can
display the related correct spelling results¹. The misspelling that seems
“useless” is in fact highly related to the correct spelling. The database is
updated in real time through this reciprocating feedback loop.
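The feedback loop described in item (4) can be illustrated with a minimal sketch. It is not Google's actual system; it only assumes a hypothetical log of pairs (the query a user typed, the query the user finally settled on) and counts which rewrite is most frequent for each typed query. The class name `CorrectionLog` and its methods `record` and `suggest` are invented for illustration.

```python
from collections import Counter, defaultdict


class CorrectionLog:
    """Toy store of (typed query -> query the user settled on) counts.

    Hypothetical illustration only; real spelling systems are far richer.
    """

    def __init__(self):
        # typed query -> Counter of the queries users ended up with
        self._counts = defaultdict(Counter)

    def record(self, typed, settled):
        """Feed one observation back into the store (the feedback loop)."""
        typed, settled = typed.strip().lower(), settled.strip().lower()
        if typed and settled:
            self._counts[typed][settled] += 1

    def suggest(self, typed):
        """Return the most frequent differing rewrite of `typed`, or None."""
        typed = typed.strip().lower()
        candidates = self._counts.get(typed)
        if not candidates:
            return None
        best, _ = candidates.most_common(1)[0]
        return best if best != typed else None


log = CorrectionLog()
for typed, settled in [
    ("recieve", "receive"),
    ("recieve", "receive"),
    ("recieve", "recipe"),
    ("receive", "receive"),
]:
    log.record(typed, settled)

print(log.suggest("recieve"))  # receive
print(log.suggest("receive"))  # None
print(log.suggest("unseen"))   # None
```

Every misspelled query adds evidence rather than noise: the more often a mistake is typed and then corrected, the more confident the suggestion becomes, which is the sense in which the “useless” data is reused.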
We can take inspiration from the application of big data in enterprises: big
data is three-dimensional, and thus different information can be mined from
different dimensions. In terms of data size, trend information can be
revealed through the aggregation of massive data; in terms of the time
dimension, the reuse of historical data provides a reference for future
prediction; in terms of field interaction, more accurate
1 Big Data Era: Big Changes in Life, Work and Thinking, pp. 144-145, Zhejiang People's Publishing
House, December 2017.
363 | ISI WSC 2019