Page 470 - Invited Paper Session (IPS) - Volume 1

IPS177 F. Ricciato et al.
                  respectively, to the entities holding the input data and those interested in
                  learning the output result. For example, when processing confidential data held
                  by private business companies for official statistics, such companies take the
                  role of IP, while the OP role is taken by the SO. In another citizen statistics scenario,
                  each individual respondent (data subject) can be an independent IP, and again
                  the OP maps to the SO.
                      Given this abstract setting, with IP and OP roles, we identify two distinct
                  confidentiality challenges:
                  ·   Output privacy problem: Given that the computation results will be made
                      available in some way to the OP, how to prevent the OP from inferring too
                      much about the input data held by the IP?
                  ·   Input privacy problem: Given that the input data are confidential and
                      cannot be disclosed outside their respective IP, how to enable the OP to
                      learn the computation results?
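
To make the output privacy problem concrete, the following minimal Python sketch shows how an OP that receives two exact totals could recover the confidential value of one contributor by simple differencing. The dictionary, the function name and all figures are hypothetical and serve illustration only; they are not taken from this paper.

```python
# Hypothetical confidential inputs held by the IP (made-up values).
turnover = {"company_a": 120, "company_b": 75, "company_c": 310}


def published_total(data, exclude=None):
    """Exact total released to the OP, optionally leaving one unit out."""
    return sum(value for name, value in data.items() if name != exclude)


# Two outputs that each look harmless on their own.
total_all = published_total(turnover)
total_without_c = published_total(turnover, exclude="company_c")

# The OP never sees individual records, yet the difference of the two
# outputs equals the confidential value held for company_c.
inferred_c = total_all - total_without_c
```

The input data never leave the IP in this sketch, so the leak arises purely from what the outputs reveal. This is why the two problems are treated as distinct.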
                      In the particular case where a single IP holds all input data, the input
                  privacy problem admits a very simple solution: the whole computation can be
                  executed internally to the IP, and only the final output is passed to the OP.
                  This has indeed been the case in official statistics for decades, with the
                  statistical office playing the role of IP on the front-end, while the external
                  users (including researchers and the general public) play the role of OP. In this
                  setting, exemplified in Figure 1, the input privacy problem is inherently solved
                  and only the output privacy problem had to be addressed.
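
The single-IP setting can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not an implementation described in this paper: the class `InputParty`, the function `average_income` and the sample records are all hypothetical.

```python
from statistics import mean


class InputParty:
    """Hypothetical IP: holds confidential microdata and never exports it."""

    def __init__(self, records):
        self._records = list(records)  # stays inside the IP

    def run(self, computation):
        """Execute a computation internally and return only its output."""
        return computation(self._records)


def average_income(records):
    """Example statistic: the only value that is passed on to the OP."""
    return round(mean(r["income"] for r in records), 2)


# Made-up microdata held by the single IP.
ip = InputParty([{"income": 1000}, {"income": 2000}, {"income": 3000}])

# The OP receives the final output, never the records themselves.
output_for_op = ip.run(average_income)
```

Keeping the records inside the IP addresses input privacy only. Whether `output_for_op` reveals too much about individual records remains the separate output privacy problem.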
                      Instead, in the new scenario depicted in Figure 2, we foresee the
                  possibility for the statistical office to compute statistics based on confidential
                  data held by other entities (e.g., private companies, other public institutions,
                  or individuals) that we cannot or do not want to move into the statistical office
                  domain. In this case, external data holders play the role of IP on the back-end,
                  while the statistical office plays the role of OP. Furthermore, on the front-end,
                  we may want to let our users compute statistics based on the fusion of
                  confidential data held by the statistical office with other external input data. In
                  this case, the input privacy and output privacy problems occur jointly on the
                  front-end.
                      In the new scenario, we must cope with the input privacy problem in
                  addition to (not in place of) the output privacy problem. Again, if the desired
                  statistics can be computed from the input data held by a single data holder
                  (as IP) in isolation from other data holders, the most natural approach is to let
                  the IP execute the computation and then pass the (final or intermediate) non-
                  confidential data to the statistical office (as OP). Standard technical and non-
                  technical means can be adopted to ensure that the program that is executed
                  on the IP premises does not deviate from what was approved (or developed)
                  by the OP. This is particularly relevant on the back-end, where the OP maps to the
                  statistical office: we highlight that outsourcing the mere execution of a

                                                                     459 | I S I   W S C   2 0 1 9