Page 309 - Special Topic Session (STS) - Volume 1

STS441 David B. et al.
                On the basis of expert domain knowledge, we combine the results of the
            supervised approach (decision tree) and the unsupervised approach
            (association rules). The combined ruleset comprises ten rules for isolating
            subsamples that allow for an accurate matching of SHS with MiFID data.
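
As an illustration only, the combination step can be sketched as pooling two lists of rules and keeping every record that satisfies at least one of them. The field names (`sector`, `n_trades`, `instrument`, `custody`), the thresholds, and the two example rules are hypothetical; they are not the ten rules of the paper, and the "any rule fires" logic is an assumption about how a combined ruleset might be applied.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rule as it might be read off a decision-tree path.
TREE_RULES: List[Rule] = [
    lambda r: r["sector"] == "bank" and r["n_trades"] <= 5,
]

# Hypothetical rule as it might be derived from an association rule.
ASSOCIATION_RULES: List[Rule] = [
    lambda r: r["instrument"] == "bond" and r["custody"] == "domestic",
]


def combine_rulesets(*rulesets: Iterable[Rule]) -> List[Rule]:
    """Pool the rules from all approaches into one flat ruleset."""
    combined: List[Rule] = []
    for ruleset in rulesets:
        combined.extend(ruleset)
    return combined


def isolate(records: Iterable[Record], rules: List[Rule]) -> List[Record]:
    """Keep the records that satisfy at least one rule (assumed logic)."""
    return [r for r in records if any(rule(r) for rule in rules)]
```

In practice the pooled rules would be reviewed with domain knowledge before use, as described above; the sketch only shows the mechanical part of applying them.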

            5.  Discussion and Conclusion
                To develop a more complete picture of exposures to risks and the
            interconnectedness of market participants, we integrate an official statistic on
            securities holdings and regulatory transaction-by-transaction securities
            trading data. We develop a simple and transparent set of rules that allows us
            to integrate the datasets despite the stark conceptual differences between the
            data sources, by combining supervised and unsupervised machine learning
            algorithms. We find that, in combination with domain knowledge, this
            data-driven approach allows for a successful isolation of subsamples that can
            accurately be matched between the datasets. One benefit of integrating the
            datasets is that, for the successfully integrated subsamples, we are able to
            analyse portfolio risks at any point in time, rather than only on a monthly basis.
            With the implementation of the new Markets in Financial Instruments
            Regulation (MiFIR) and Directive (MiFID II) at the beginning of 2018, we will
            be able to further improve the matching. This will allow for an even more
            accurate integration of both data sources.
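
The point-in-time analysis mentioned above can be illustrated by rolling a month-end holdings position forward with the signed transactions recorded after it. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `holdings_on`, the tuple layout of a trade, and the sign convention (positive for purchases, negative for sales) are all hypothetical.

```python
from datetime import date
from typing import Dict, Iterable, Tuple

# Assumed trade layout: (trade date, security identifier, signed quantity).
Trade = Tuple[date, str, float]


def holdings_on(
    day: date,
    month_end: date,
    month_end_holdings: Dict[str, float],
    trades: Iterable[Trade],
) -> Dict[str, float]:
    """Roll month-end holdings forward to `day` using signed trade quantities."""
    position = dict(month_end_holdings)  # copy, so the input is not modified
    for trade_date, security, signed_qty in trades:
        # Only trades after the reporting date and up to `day` change the position.
        if month_end < trade_date <= day:
            position[security] = position.get(security, 0.0) + signed_qty
    return position
```

Such a roll-forward is only meaningful for the subsamples in which holdings and transactions have been matched accurately.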







































                                                               298 | ISI WSC 2019