Page 35 - Invited Paper Session (IPS) - Volume 1

IPS35 Dinov I.D. et al.
            S(f), such that for all $D_1, D_2$ that differ on at most one element,
            $\|f(D_1) - f(D_2)\|_1 \le S(f)$. There are many differentially private
            algorithms, e.g., random forests, decision trees, k-means clustering, etc.
            For instance, if $f : D = DB \to \mathbb{R}^m$, the algorithm outputting
            $f(D) + (\eta_1, \eta_2, \ldots, \eta_m)$, with independent
            $\eta_i \sim \mathrm{Laplace}(0, S(f)/\varepsilon)$, $\forall i$, is
            $\varepsilon$-differentially private.
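            The Laplace mechanism described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names (`laplace_noise`, `laplace_mechanism`, `count_over_40`) and the toy age data are hypothetical, and the noise scale S(f)/ε is the standard choice for the Laplace mechanism, assumed here. Only the standard library is used.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) variate.

    The difference of two i.i.d. exponential variates with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(f_value, sensitivity, epsilon, rng=None):
    """Return f(D) + (eta_1, ..., eta_m), eta_i ~ Laplace(0, sensitivity/epsilon).

    f_value     : list of m floats, the exact query answer f(D)
    sensitivity : S(f), the L1 sensitivity of f (assumed known, > 0)
    epsilon     : privacy budget (> 0); smaller means more noise
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in f_value]


def count_over_40(ages):
    """Hypothetical query f: number of records with age > 40 (m = 1).

    Changing a single record changes the count by at most 1, so S(f) = 1.
    """
    return [float(sum(1 for a in ages if a > 40))]


if __name__ == "__main__":
    ages = [23, 35, 41, 58, 62, 29, 47]  # illustrative data only
    noisy = laplace_mechanism(count_over_40(ages), sensitivity=1.0, epsilon=0.5)
    print("exact count:", count_over_40(ages)[0])
    print("released (noisy) count:", noisy[0])
```

            The released value changes from run to run; decreasing `epsilon` widens the noise and strengthens the privacy guarantee at the cost of accuracy.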

            2.2 Fully-Homomorphic Encryption (FHE)
                FHE security is based on preprocessing the data by encryption to allow
            subsequent program execution and data-driven inference using the encrypted
            information (Gentry 2009). As a result, the process outputs are encrypted, and
            their interpretation requires the ability to decrypt the information following
            the data analytics. FHE represents an elegant and powerful mathematical
            framework for bijective (encoding/decoding) processing and analytics. Although
            it is very fast, FHE has some limitations, e.g., deriving f′ (the commutative
            analytic evaluators) is never a trivial task and requires close cooperation
            between the data governor and the data user. Figure 2 shows schematically the
            process of data analytics using fully-homomorphic encryption.

                     Figure 2: Data analytics via fully-homomorphic encryption.
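            The encrypt, compute, decrypt workflow can be illustrated with a toy example. The sketch below is not FHE: it uses a textbook Paillier scheme, which is only additively homomorphic, as a stand-in to show how an evaluator f′ acting on ciphertexts commutes with a plaintext analytic f (here, a sum). The primes are deliberately tiny and insecure, and all names (`encrypt`, `decrypt`, `encrypted_sum`) are hypothetical.

```python
import math
import secrets

# Toy key material: tiny primes chosen only for illustration (insecure).
P, Q = 1009, 1013
N = P * Q                      # public modulus
N_SQ = N * N
G = N + 1                      # standard simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1), private
MU = pow(LAM, -1, N)           # modular inverse (Python 3.8+), private


def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, G)."""
    if not 0 <= m < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1      # random r in 1..N-1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Decrypt a ciphertext with the private key (LAM, MU)."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N


def encrypted_sum(ciphertexts):
    """Evaluator f': computes an encryption of the sum without any key.

    Multiplying Paillier ciphertexts modulo N^2 adds the plaintexts modulo N.
    """
    total = encrypt(0)
    for c in ciphertexts:
        total = (total * c) % N_SQ
    return total


if __name__ == "__main__":
    values = [12, 7, 30, 51]                      # held by the data governor
    ciphertexts = [encrypt(v) for v in values]    # released to the data user
    enc_result = encrypted_sum(ciphertexts)       # computed on encrypted data
    assert decrypt(enc_result) == sum(values)     # f'(E(x)) decrypts to f(x)
    print("decrypted sum matches plaintext sum")
```

            The data user never sees the plaintext values or the private key; only the data governor can interpret the encrypted result. A fully-homomorphic scheme would additionally support multiplication of plaintexts, and hence arbitrary evaluators, which this toy scheme does not.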

            2.3 DataSifter Statistical Obfuscation
                The process of data-masking using statistical obfuscation is the core of the
            DataSifter technique. It combines artificial random missingness with partial
            information alterations using data swapping within subjects’ neighbourhoods.
            These operations have minimal impact on the joint distribution of the
            obfuscated (sifted) output data as the controlled rate of missingness is
            introduced completely at random and nearest neighbourhoods tend to have
            consistent distributions. The DataSifter algorithm preserves the exact data


                                                               24 | ISI WSC 2019