Page 132 - Invited Paper Session (IPS) - Volume 2

IPS188 Bruno Tissot
                  of “big data analytics” – broadly referring to the general analysis of large data
                  sets – and “artificial intelligence” (AI); cf IFC (2019). Modern computing tools
                  can now be used to collect data, correct them, improve coverage (eg web-
                  scraping),  process  textual  information  (text-mining),  match  different  data
                  sources  (eg  fuzzy-matching),  extract  relevant  information  (eg  machine
                  learning)  and  communicate  or  display  pertinent  indicators  (eg  interactive
                  dashboards). All these elements can help to address the resource issues posed
                  by  the  compilation  of  official  price  statistics,  especially  in  developing
                  economies where statistical systems are still in their infancy and staff skills are
                  limited. One example is the Billion Prices Project at the Massachusetts Institute
                  of Technology (MIT), which allows inflation indices to be constructed for
                  countries that lack an official and/or comprehensive index. These indices can
                  also be used to enhance international comparisons of price indices across
                  countries and to deal with measurement biases and distortions in
                  international relative prices (see www.thebillionpricesproject.com and Cavallo
                  and Rigobon (2016)). Similarly, a number of central banks in emerging market
                  economies  have  compiled  quick  price  estimates  for  selected  goods  and
                  properties, by directly scraping the information displayed on the web, instead
                  of setting up specific surveys that can be quite time- and resource-intensive.
                  A notable case is that of large developing economies such as India,
                  where collecting internet-based data is seen as a potentially useful alternative
                  to  the  organisation  of  large  surveys  that  would  have  to  cover  millions  of
                  reporters. Yet, as indeed noted by Hill (2018)⁵ in the case of the United States,
                  and in contrast to what is observed in the research and academic community,
                  the  use  of  big  data  in  more  mature  statistical  systems  has  been  relatively
                  incremental and limited. It is often targeted at methodological improvements
                  (for instance quality adjustment) and at reducing reporting lags.
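                  To make the idea of compiling quick price estimates from scraped quotes
                  concrete, the following is a minimal sketch, not the method of the Billion
                  Prices Project or of any statistical office. It assumes an unweighted Jevons
                  index (the geometric mean of price relatives) computed over products observed
                  in both periods. The function name `jevons_index`, the product identifiers and
                  the prices are all hypothetical and serve only as illustration.

```python
import math


def jevons_index(base_prices, current_prices):
    """Unweighted Jevons index (base = 100) over products seen in both periods.

    Both arguments map a product identifier to its observed price.
    Products missing from either period are ignored (matched-model approach).
    """
    matched = [k for k in base_prices if k in current_prices]
    if not matched:
        raise ValueError("no product is observed in both periods")
    # Average the log price relatives, then exponentiate: a geometric mean.
    log_sum = sum(math.log(current_prices[k] / base_prices[k]) for k in matched)
    return 100.0 * math.exp(log_sum / len(matched))


# Hypothetical scraped quotes for two consecutive days (illustrative values).
day0 = {"rice_1kg": 2.00, "milk_1l": 1.00, "bread": 1.50}
day1 = {"rice_1kg": 2.20, "milk_1l": 1.00, "soap": 0.80}

# Only rice and milk are matched; bread and soap drop out of the comparison.
index = jevons_index(day0, day1)
print(round(index, 2))
```

                  Restricting the comparison to matched products is one simple way of handling
                  items that appear or disappear on a website; an actual compilation would also
                  need expenditure weights and quality adjustment, which this sketch leaves out.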
                      Turning to the measurement challenges posed by rapid innovation, the
                  high velocity of big data sources can be particularly useful when prices change
                  rapidly. For instance, direct web-scraping allows retailers’ prices to be extracted
                  from online advertisements almost in real time. This can support a timelier
                  publication of official data, by bridging the time lags before official statistics
                  become  available  –  ie  through  the  compilation  of  advance  estimates  or
                  “nowcasting exercises”. Beyond shortening lags, the information provided
                  by the wide range of web and electronic devices is often available at a higher
                  frequency; price developments can thus be tracked more promptly than with
                  official CPI numbers, which are usually available on a monthly basis.
                  This can be particularly useful when analysing early warning indicators and



                  5  “… while nearly 15% of the price quotes in the Consumer Price Index are now collected online
                  (…)  the  size  of  the  CPI  sample  has  not  increased  to  reflect  the  lower  cost  of  online  data
                  collection”.
                                                                     119 | ISI WSC 2019