Page 332 - Invited Paper Session (IPS) - Volume 1
IPS155 Laura B.
dataset on their own before running the statistical computations. In addition,
in the historical database some identifying variables (i.e. province of
residency), previously expunged, have been added back for the period 1977-
1986. We also deliver the Italian component of the Household Finance and
Consumption Survey (coordinated by the ECB), adding some variables that are
not included in the original SHIW dataset (for example, gross income). For
all datasets, the information needed for data usage (questionnaires of the
latest waves, variable names, instructions for data usage, etc.) is available in
PDF format, while the data are available in several formats (SAS, Stata, CSV).
5. Banca d’Italia’s experience so far and the way forward
As noted, because we usually do not receive, and hence cannot count and
archive, all research papers written by external researchers using household
and firm survey data, we have no direct measure of the usefulness of Banca
d’Italia’s granular data dissemination. On the other hand, we can count the
number of people submitting jobs through BIRD for business survey data and
downloading the file with household survey data. As for the first figure, we
observe high volatility in the number of users and jobs submitted: on average
7 jobs are run every week, even though the number of researchers is rather
limited (around 10 new researchers per year apply to use BIRD). We believe
that these figures could easily increase as accessing data becomes more
user-friendly.
To this end, Banca d’Italia has included in its three-year strategic plan the
goal of enriching the offer of granular data to the general public. We have
therefore started working on a brand new Research Data Center (RDC) to make
it easier for internal and external users to access microdata, while also
increasing data availability and improving methodology. It will be a single
entry point, notwithstanding differences in the permissions of use and
access points allowed for each dataset, owing to the already discussed
differences in household and firm survey data.⁵ The RDC will therefore
include all datasets already available as Public Use Files (hence the
household survey data, and more) and in BIRD, but it will also provide two
important new tools. The first one is a web tabulator, an easy-to-use tool,
available online, for performing tabulations from microdata in a
personalized way, i.e. the researcher can choose categories and breakdowns
to a certain extent. The web tabulator will have some built-in safeguards to
prevent the identification of a single respondent or of a small number of
respondents. In fact, a simple and commonly used way to prevent
re-identification is imposing a minimum number of respondents
5 This approach is common to other data providers. The Data without Boundaries programme
for European data, which ended in 2015, notably supported this approach, going as far as
listing a set of principles for access to elementary data (Schiller and Welpton, 2013).
321 | ISI WSC 2019