Page 469 - Invited Paper Session (IPS) - Volume 1

IPS177 F. Ricciato et al.
            processing of different kinds of data about citizens, companies, goods, etc. For
            SO this change has implications on both sides. In the back-end, there are new
            potential data sources to be accessed, in addition to traditional survey/census
            and administrative data, but the peculiarities of such new sources might
            require access models other than direct ingestion of raw input data
            [4]. On the front-end, the expert users downstream of the processing flow now
            have more possibilities to combine the data obtained by SO with other
            external data, an aspect that exacerbates the challenge for SDC.

            2.  Input privacy vs. Output privacy
                We first provide an abstract view of the relation between input data
            and output data, and then present the notions of ‘input privacy’ and
            ‘output privacy’ as introduced in the literature (see e.g. [5,10]). Finally, we
            elaborate on how these categories map to the new scenario described
            above.











                        Figure 3 – Input Privacy vs. Output Privacy problems.

                 We call computation the task of extracting the desired output information
            (or results) from a set of input data. We call inference the task of extracting
            some (partial) information about one of the input components based on
            knowledge of the output (and possibly other external data). It is clear from
            Figure 3 that computation and inference flow logically in opposite directions.
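
            The opposite directions of computation and inference can be illustrated
            with a minimal sketch. It assumes a hypothetical computation function (a
            plain average) and an attacker whose external data consist of all input
            components but one. The function names and the salary figures are
            illustrative assumptions, not taken from the paper.

```python
def computation(inputs):
    """Computation: extract the output (here, a mean) from the input data."""
    return sum(inputs) / len(inputs)


def inference(output, n, known_inputs):
    """Inference: recover one unknown input component from the output,
    given external knowledge of the other n - 1 components."""
    return output * n - sum(known_inputs)


# Hypothetical input data: one salary per individual.
salaries = [30000, 42000, 51000, 38000]

# Forward direction: input data -> output.
output = computation(salaries)

# Backward direction: output + external data -> one input component.
external_data = salaries[:-1]  # the attacker knows all but the last value
recovered = inference(output, len(salaries), external_data)
```

            In this toy case the published average, combined with the external data,
            reveals the remaining input exactly. With less external knowledge, an
            inference would typically recover only partial information.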
            In this discussion the output can take any form, for example a summary
            indicator, a set of regression coefficients or a whole frequency table.
            We focus here on the case where the computation function (which may be called
            algorithm, methodology, procedure, program, etc. in different contexts) is
            well defined before execution. In other words, our focus is on the stage of
            statistics production, not on the (logically antecedent) phase of data
            exploration and methodological development. What is relevant for our
            discussion is the multi-party scenario where the entity or entities
            (organizations, institutions, individuals, etc.) holding the input data
            differ from the entity or entities interested in obtaining the output results.
            We shall use the terms ‘input party’ (IP for short) and ‘output party’ (OP
            for short) to refer,

                                                               458 | ISI WSC 2019