Page 470 - Invited Paper Session (IPS) - Volume 1
IPS177 F. Ricciato et al.
respectively, to the entities holding the input data and those interested in
learning the output result. For example, when processing confidential data held
by private business companies for official statistics, such companies take the
role of IP, while the OP role is taken by SO. In a different scenario, citizen statistics,
each individual respondent (data subject) can be an independent IP, and again
the OP maps to SO.
Given this abstract setting, with IP and OP roles, we identify two distinct
confidentiality challenges:
· Output privacy problem: Given that the computation results will be made
available in some way to the OP, how to prevent the OP from inferring too
much about the input data held by the IP?
· Input privacy problem: Given that the input data are confidential and
cannot be disclosed outside their respective IP, how to enable the OP to
learn the computation results?
In the particular case where a single IP holds all input data, the input
privacy problem admits a very simple solution: the whole computation can be
executed internally to the IP, and only the final output is passed to the OP.
This has indeed been the case in official statistics for decades, with the
statistical office playing the role of IP on the front-end, while the external
users (including researchers and the general public) play the role of OP. In this
setting, exemplified in Figure 1, the input privacy problem is inherently solved
and only the output privacy problem has to be addressed.
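The single-IP setting described above can be illustrated with a minimal sketch. It is not taken from the paper: the record schema, the class name InputParty, and the minimum cell size rule used as a stand-in for output privacy protection are all assumptions made for illustration. The point is only that the whole computation runs inside the IP and that the OP receives nothing but the final aggregates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One confidential input record held by the IP (hypothetical schema)."""
    region: str
    income: float


class InputParty:
    """Holds the confidential records and runs the whole computation internally."""

    def __init__(self, records: list[Record], min_cell_size: int = 3) -> None:
        self._records = records              # input data never leave the IP
        self._min_cell_size = min_cell_size  # assumed disclosure-control threshold

    def mean_income_by_region(self) -> dict[str, Optional[float]]:
        """Return aggregates only; cells with too few records are suppressed."""
        sums: dict[str, float] = {}
        counts: dict[str, int] = {}
        for r in self._records:
            sums[r.region] = sums.get(r.region, 0.0) + r.income
            counts[r.region] = counts.get(r.region, 0) + 1
        return {
            region: (sums[region] / counts[region]
                     if counts[region] >= self._min_cell_size else None)
            for region in sums
        }


# The OP only ever sees the returned dictionary, never the records.
ip = InputParty([
    Record("north", 30.0), Record("north", 40.0), Record("north", 50.0),
    Record("south", 90.0),
])
output_for_op = ip.mean_income_by_region()
print(output_for_op)  # {'north': 40.0, 'south': None}
```

In this sketch the input privacy problem is solved by construction, since the records stay inside the InputParty object, while the suppression of small cells is one simple (and by itself generally insufficient) way to limit what the OP can infer from the output.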
Instead, in the new scenario depicted in Figure 2, we foresee the
possibility for the statistical office to compute statistics based on confidential
data held by other entities (e.g., private companies, other public institutions,
or individuals) that we cannot or do not want to move into the statistical office
domain. In this case, external data holders play the role of IP in the back-end,
while the statistical office plays the role of OP. Furthermore, on the front-end,
we may want to let our users compute statistics based on the fusion of
confidential data held by the statistical office with other external input data. In
this case, the input privacy and output privacy problems occur jointly on the
front-end.
In the new scenario, we must cope with the input privacy problem in
addition to (not in place of) the output privacy problem. Again, if the desired
statistics can be computed from the input data held by a single data holder
(as IP) in isolation from other data holders, the most natural approach is to let
the IP execute the computation and then pass the (final or intermediate)
non-confidential data to the statistical office (as OP). Standard technical and
non-technical means can be adopted to ensure that the program that is executed
on the IP premises does not deviate from what was approved (or developed)
by the OP. This is particularly relevant on the back-end, where OP maps to the
statistical office: we highlight that outsourcing the mere execution of a
459 | ISI WSC 2019