Page 471 - Invited Paper Session (IPS) - Volume 1
IPS177 F. Ricciato et al.
computation program to the IP does not imply loss of control by the OP over
what program is executed.
The input privacy problem is more challenging when the required output
results are based on the contribution of several input data sets held by
multiple IPs that cannot disclose their data. In some cases, the computation
program can be factorized into separate computation instances that are run
independently by the multiple IPs, either sequentially or in parallel. However,
very often the desired output results do not allow for computation
factorization. For example, this is the case when output results must be
computed on the records in the intersection of different IPs' data sets, or when
a regression must be run over variables that are held by different IPs. In these
cases, we may resort to Secure Multi-Party Computation (SMC).
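To give a concrete flavour of SMC, the following is a minimal sketch of additive secret sharing, one of the simplest SMC building blocks, used here to compute a sum over values held by several IPs. It is an illustration only, not the protocol considered in this paper: it handles a plain sum (far simpler than the intersection or regression cases mentioned above), it simulates all parties in a single process, and it assumes honest-but-curious parties. The names `MODULUS`, `make_shares` and `secure_sum` are hypothetical.

```python
import secrets

# Public modulus agreed on by all parties (an assumption of this sketch).
# Inputs must be non-negative and the true total must stay below MODULUS.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a secure sum among len(private_values) parties."""
    n = len(private_values)
    # Each IP i splits its private value; share j would be sent to party j.
    distributed = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds up the shares it received, one from each IP.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are published; together they reveal the total.
    return sum(partial_sums) % modulus


total = secure_sum([120, 45, 300])
```

Each individual share is uniformly random, so no party learns another party's input from the shares it receives; only the final total is revealed.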
3. Solution approaches to the Output Privacy problem
The output privacy problem is traditionally addressed by so-called
Statistical Disclosure Control (SDC) techniques, possibly in combination with
Access Control (AC). SDC aims at restricting what is disclosed, while AC
restricts to whom it is disclosed. Generally speaking, there is a
trade-off between the two: the weaker the SDC, the stronger the AC required,
and vice versa.
AC methods rely on a combination of requirements concerning the nature
of the potential users, their experience with holding confidential data and
the legitimate use of the data. Potential users must provide evidence that they
fulfil the AC requirements, which is scrutinized by the data owners. Users deemed
trustworthy are entrusted with more detailed data and better access facilities.
SDC methods rely on a combination of suppression, perturbation, randomization
and aggregation of data.
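As an illustration of two of these SDC operations, the following minimal sketch applies suppression and aggregation to a frequency table. The minimum cell count of 3, the field name `region` and the function names are assumptions made for the example, not values or rules taken from any specific statistical office.

```python
from collections import Counter


def frequency_table(records, key):
    """Count records per category of `key`."""
    return Counter(r[key] for r in records)


def suppress_small_cells(table, threshold=3):
    """Suppression: hide (None) every cell whose count is below `threshold`."""
    return {cell: (n if n >= threshold else None) for cell, n in table.items()}


def aggregate(records, key, mapping):
    """Aggregation: recode detailed categories into broader ones, then count."""
    return Counter(mapping.get(r[key], r[key]) for r in records)


# Hypothetical microdata: 5 records in A1, 2 in A2, 4 in B1.
records = (
    [{"region": "A1"}] * 5 + [{"region": "A2"}] * 2 + [{"region": "B1"}] * 4
)

detailed = frequency_table(records, "region")
suppressed = suppress_small_cells(detailed, threshold=3)
coarse = aggregate(records, "region", {"A1": "A", "A2": "A", "B1": "B"})
```

Here the cell `A2` (count 2) is suppressed in the detailed table, while the aggregated table needs no suppression because both broader cells reach the threshold. Note that suppressing a single cell is not sufficient if totals that allow it to be recomputed are also published.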
Historically, SDC was performed manually by dedicated experts, following
practices and criteria that were developed through the years in the official
statistics community. SO managed output control successfully because the
statistics released were pre-defined, so that SO could consistently
suppress primary and secondary confidential cells. The current trend is
towards “on-the-fly SDC”. Nowadays many users want to calculate their own
tailor-made statistics on the basis of the detailed data sources. In response to
these needs, SO make available dynamic data querying systems that
implement modern SDC approaches, addressing in particular the problem of
differential disclosure (disclosure obtained by differencing two or more
released statistics). These new SDC approaches require that every output remain
safe, even when combined with any other statistics derived from the same
source. The random noise protection method developed by the Australian
Bureau of Statistics (ABS) is an example of a modern SDC approach [11, 12]. The
ABS method consists in applying small perturbations (controlled noise), drawn
from a predefined probability distribution, to the data. A specific pseudo-random
460 | ISI WSC 2019