Page 330 - Invited Paper Session (IPS) - Volume 1

IPS155 Laura B.
                  other hand, when this initial check is passed, BIRD runs the program and a
                  dedicated Banca d’Italia employee examines the output of the program in
                  order to further verify that confidentiality is not breached by the submitted
                  computations.⁴ If the output does not violate any confidentiality
                  restriction, i.e. it does not reveal information referring to any single firm or
                  restricted group of firms (or banks, in the future), it can be released
                  to the researcher, who then receives an email with the cleared output. If, on the
                  contrary, the manual check detects a confidentiality violation, the
                  researcher receives an email explaining the reason for the rejection. It may
                  also happen that the program contains errors and the BIRD run ends in error. In
                  that case too, the researcher receives an email reporting the error in the
                  program. In order to reduce the number of these occurrences and speed up the
                  process, since 2016 a dataset in semicolon-delimited ASCII format that
                  replicates the internal structure of the original data from the Survey of
                  Industrial and Service Firms, but contains randomly generated figures, has been
                  available on the Bank of Italy’s website, so that researchers can test
                  their code before submitting it to BIRD.
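As an illustration of how a researcher might try out code on such a test file, the following minimal Python sketch parses semicolon-delimited text and computes an aggregate statistic. The column names (`firm_id`, `sector`, `employees`, `turnover`) and the sample values are hypothetical and do not reflect the actual layout of the file published by the Bank of Italy; programs actually submitted to BIRD would be written in whatever language the platform accepts.

```python
import csv
import io

# Hypothetical excerpt of a semicolon-delimited test file. The column
# names and values are invented for illustration, not the real layout.
SAMPLE = """firm_id;sector;employees;turnover
1;manufacturing;120;3500.5
2;services;45;980.0
3;manufacturing;300;12000.0
"""


def load_test_rows(text):
    """Parse semicolon-delimited text into a list of typed records."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for rec in reader:
        rows.append({
            "firm_id": int(rec["firm_id"]),
            "sector": rec["sector"],
            "employees": int(rec["employees"]),
            "turnover": float(rec["turnover"]),
        })
    return rows


def mean_by_sector(rows, field):
    """Average a numeric field within each sector (aggregate output only)."""
    totals = {}
    for r in rows:
        running_sum, count = totals.get(r["sector"], (0.0, 0))
        totals[r["sector"]] = (running_sum + r[field], count + 1)
    return {sector: s / n for sector, (s, n) in totals.items()}


rows = load_test_rows(SAMPLE)
means = mean_by_sector(rows, "employees")
```

Because the test figures are randomly generated, results obtained this way only confirm that the code runs without errors; they carry no economic meaning.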
                      In order to prevent any violation of confidentiality restrictions, there are
                  several firewalls at three different levels: user (users are identified, qualified
                  and registered; registered mailboxes are whitelisted; outputs are monitored
                  and archived; a code of conduct, privacy law and specific penalties apply); data
                  (identifying variables are expunged from the datasets used for remote
                  processing; extreme values are censored; stratification variables are collapsed);
                  processing (displaying individual data is forbidden; a keyword parser with a
                  blacklist and a greylist is in place; particularly long and/or complex programs are
                  always reviewed manually; all submissions are reviewed manually).
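The automated part of the processing-level checks can be pictured with a short Python sketch. This is only an assumption-laden illustration, not BIRD's actual parser: the keyword lists, the length threshold `MAX_LINES` and the function `screen_program` are all hypothetical, and the source does not define how greylisted keywords are treated (here they are assumed to flag a program for closer review). Whatever the outcome of such a pre-check, the output is still reviewed manually, as described above.

```python
import re

# Hypothetical keyword lists; the real lists are not given in the source.
BLACKLIST = {"list", "print", "browse"}  # assumed: commands that display individual records
GREYLIST = {"tabulate", "export"}        # assumed: commands that call for closer review
MAX_LINES = 200                          # assumed threshold for a "particularly long" program


def screen_program(source):
    """Return 'reject', 'flag' or 'pass' for the text of a submitted program."""
    # Compare whole words only, so that e.g. 'checklist' does not match 'list'.
    tokens = set(re.findall(r"[A-Za-z_]+", source.lower()))
    if tokens & BLACKLIST:
        return "reject"
    if tokens & GREYLIST or len(source.splitlines()) > MAX_LINES:
        return "flag"
    return "pass"


status_display = screen_program("list firm_id turnover")
status_table = screen_program("tabulate sector")
status_model = screen_program("regress turnover employees")
status_long = screen_program("regress turnover employees\n" * 201)
```

A longer blacklist would reject more programs automatically and lighten the manual review, at the cost of restricting what researchers can compute.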
                      Each institution that allows remote processing of granular data provides
                  a similar set of controls. Remote execution platforms are therefore considered
                  reasonably safe and useful, and thus remain an important tool for the
                  dissemination of granular data for many data providers around the globe.

                  4.  Households’ survey data
                      The Survey on Household Income and Wealth (SHIW) was launched in the
                  1960s to gather data on the incomes and savings of Italian households. From
                  the beginning to the publication of Banca d’Italia’s website, dataset



                  4   There is a clear trade-off between the length of the list of commands that are
                  automatically forbidden and the flexibility allowed to researchers in running
                  computations and regressions. The greater the flexibility granted to the researcher,
                  the greater the role played by the Banca d’Italia employee in manually checking the
                  output of the submitted job. For the moment, the list of forbidden keywords is very
                  limited, but should the number of users and submitted jobs increase significantly
                  without a matching increase in dedicated staff, the list of forbidden keywords could
                  be enlarged.
                                                                     319 | ISI WSC 2019