Page 469 - Invited Paper Session (IPS) - Volume 1
IPS177 F. Ricciato et al.
processing of different kinds of data about citizens, companies, goods, etc. For
SO, this change has implications on both sides. On the back-end, there are new
potential data sources to be accessed, in addition to traditional survey/census
and administrative data, but the peculiarities of such new sources might
require access models other than direct ingestion of raw input data
[4]. On the front-end, expert users downstream of the processing flow now
have increased possibilities to combine the data obtained by SO with other
external data, an aspect that exacerbates the challenge for SDC.
2. Input privacy vs. Output privacy
Hereafter we provide an abstract view of the relations between input
data and output data and then present the notions of ‘input privacy’ and
‘output privacy’ as introduced in the literature (see e.g. [5,10]). Finally, we
elaborate on how these categories map to the new scenario described
above.
Figure 3 – Input Privacy vs. Output Privacy problems.
We call computation the task of extracting the desired output information
(or results) from a set of input data. We call inference the task of extracting
some (partial) information about one of the input components based on
knowledge of the output (and possibly other external data). It is clear from
Figure 3 that computation and inference flow logically in opposite directions.
In this discussion the output can take any arbitrary form, including for example
a summary indicator, a set of regression coefficients or a whole frequency
table, to name some concrete examples. We focus here on the case where the
computation function (that may be called algorithm, methodology, procedure,
program, etc. in different contexts) is well defined before execution. In other
words, our focus is on the stage of statistics production, not on the (logically
antecedent) phase of data exploration and methodological development.
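The distinction between computation and inference drawn above can be illustrated with a minimal sketch. It is not taken from the paper: the choice of the mean as summary indicator, the function names `computation` and `inference`, and the input values are all hypothetical, and the inference step assumes the observer also holds, as external data, the same statistic computed without the target unit.

```python
from statistics import mean


def computation(inputs):
    """Computation: derive the output (here a summary indicator) from the inputs."""
    return mean(inputs)


def inference(output_all, output_rest, n):
    """Inference: recover one input component from knowledge of the outputs.

    Assumes the observer knows the mean over all n units and, as external
    data, the mean over the other n - 1 units (a simple differencing attack).
    """
    return n * output_all - (n - 1) * output_rest


# Hypothetical input data: one value per unit (made-up numbers).
inputs = [30_000, 42_000, 51_000, 38_000]

output_all = computation(inputs)        # result released to the output side
output_rest = computation(inputs[:-1])  # assumed external knowledge

# Running the logic "backwards" recovers the last unit's input value.
recovered = inference(output_all, output_rest, len(inputs))
print(recovered == inputs[-1])
```

Here `computation` maps input data to an output, while `inference` runs in the opposite direction, from outputs (plus external data) back to one input component.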
What is relevant for our discussion is the multi-party scenario where the entity
or entities (organizations, institutions, individuals, etc.) holding the input data
differ from the entity/entities interested in obtaining the output results. We shall use
the terms ‘input party’ (IP for short) and ‘output party’ (OP for short) to refer,
458 | ISI WSC 2019