Page 467 - Invited Paper Session (IPS) - Volume 1
IPS177 F. Ricciato et al.
scenarios for SMPC and SDC integration in the future “confidentiality
engineering” setup of modern official statistics.
Keywords
Privacy; Confidentiality; Security; Statistical Disclosure Control; Secure
Multiparty Computation
1. Introduction and motivations
Modern society is undergoing a process of massive datafication [1].
The availability of new digital data sources represents an opportunity for
Statistical Offices (SO) to complement traditional statistics and/or deliver novel
statistical products with improved timeliness and relevance, so as to meet the
increasing demands of users. However, such opportunities come with
important challenges in almost every aspect: methodology, business
models, data governance, regulation, organization and others. The new
scenario also calls for SO to evolve their modus operandi with
respect to privacy and data confidentiality. We propose here a discussion
framework focused on the prospective combination of advanced (dynamic)
Statistical Disclosure Control (SDC) methods with Secure Multi-Party
Computation (SMPC) techniques.
For decades, the data business has been a natural monopoly centered
around SO: no other entity had the technical and legal capability to collect and
process large-scale data across individuals and organizations. In the traditional
operation model, illustrated in Fig. 1, the SO ingests internally all source
(micro-)data that were collected either directly from the data subjects, via
surveys and censuses, or indirectly through administrative registers. The input
source data collected in the back-end are then processed centrally to deliver
two types of front-end output data: (i) official statistics for the general
public; and (ii) more detailed data for further processing by expert users and
researchers downstream in the data flow.
The legal mandate of SO includes two important obligations that can be
summarized as ‘closed input and open output’. On the input side (back-end),
SO must preserve the confidentiality of the (micro-)data in order to protect
the privacy of data subjects. On the output side (front-end), SO are committed
to openly publishing the processed statistics (and, in general, any output data), so
as to ensure that all potential users get the same information and do so at the
same time. The motivations and implications of both obligations are intimately
connected to the democratic role of official statistics in modern society.
In real-world applications, however, there is an unavoidable conflict
between these two goals: by definition, the output data carry non-zero
information about the input data (otherwise they would be useless), i.e., they
always reveal something about the input. On the front-end, SO must
456 | ISI WSC 2019