Page 71 - Invited Paper Session (IPS) - Volume 1
IPS57 Gerardo L. et al.
The mood of the twitterers in Mexico
Gerardo Leyva, Abel Coronado
National Institute of Statistics and Geography (INEGI)
Abstract
The mood of the twitterers in Mexico generates statistical information from
georeferenced tweets posted daily from anywhere in Mexican territory and
reports the "positivity ratio", the ratio of tweets with a positive emotional
load to tweets with a negative emotional load. The
process uses sentiment analysis techniques: human annotators label a subset
of tweets, and a machine learning model trained on those labels then
identifies the underlying emotional charge automatically, agreeing with the
human judgment more than 80% of the time. This yields the
highest-frequency time series published by
INEGI, since each day the system classifies about 100,000 tweets, covering
information up to the previous day. The system, freely available on the
Internet, can be queried at the national and regional level as annual,
quarterly, monthly, weekly, daily and even hourly series. It is also
possible to see the hashtags that dominated the scene and directly access
news sites that help to associate the variations in the series with specific
events. With a time series that begins in 2016, the system shows dynamics
consistent with what would be expected for a number of clearly identified
events, such as the results of the US presidential election of November 8,
2016 (negative), the increase in national petrol prices at the beginning of
2017 (negative), the two large earthquakes of September 2017 (both negative),
Mexico's soccer victory over Germany in June 2018 (positive) and the Best
Picture Oscar awarded to "The Shape of Water" in 2018 (positive). This is a
practical exercise in the generation of statistical information
on inferred aggregate subjective well-being from entirely non-conventional
data sources, and it can be refined to examine how the twitterers feel about
particular topics. This product is the result of
combined research efforts of INEGI and national and international academic
institutions and has been the basis for the development of other data science
projects at INEGI.
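
As an illustration of the two quantities mentioned above, the following is a
minimal sketch, not INEGI's implementation. It assumes each tweet has already
been classified and is available as a (timestamp, label) pair; the label
values "pos" and "neg", the function names positivity_ratio and
agreement_rate, and the toy records are all hypothetical. The positivity
ratio is taken here as the count of positive tweets divided by the count of
negative tweets within each period.

```python
from collections import Counter
from datetime import datetime

# Toy, made-up records for illustration only: (timestamp, classifier label).
# The labels "pos"/"neg" are an assumed encoding, not INEGI's schema.
tweets = [
    (datetime(2019, 1, 1, 13, 5), "neg"),
    (datetime(2019, 1, 1, 13, 40), "neg"),
    (datetime(2019, 1, 1, 14, 10), "pos"),
    (datetime(2019, 1, 2, 9, 0), "pos"),
    (datetime(2019, 1, 2, 9, 30), "pos"),
    (datetime(2019, 1, 2, 18, 15), "neg"),
]


def positivity_ratio(tweets, period_format):
    """Return {period: positive count / negative count}.

    period_format is a strftime pattern that sets the aggregation level,
    e.g. "%Y-%m-%d" for a daily series or "%Y-%m-%d %H" for an hourly one.
    Periods without negative tweets map to None (the ratio is undefined).
    """
    counts = {}
    for timestamp, label in tweets:
        period = timestamp.strftime(period_format)
        counts.setdefault(period, Counter())[label] += 1
    return {
        period: (c["pos"] / c["neg"] if c["neg"] else None)
        for period, c in sorted(counts.items())
    }


def agreement_rate(human_labels, model_labels):
    """Share of tweets where the model label equals the human label."""
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)


daily = positivity_ratio(tweets, "%Y-%m-%d")
hourly = positivity_ratio(tweets, "%Y-%m-%d %H")
print(daily)
print(hourly)
print(agreement_rate(["pos", "neg", "pos", "neg", "pos"],
                     ["pos", "neg", "neg", "neg", "pos"]))
```

Changing only the strftime pattern switches between daily and hourly
aggregation, which mirrors the idea of consulting the same classified tweets
at several frequencies. A ratio above 1 means positive tweets outnumber
negative ones in that period.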
Keywords
Georeferenced tweets; sentiment analysis techniques; machine learning
process; positivity quotient; big data
60 | ISI WSC 2019