Page 309 - Special Topic Session (STS) - Volume 1
STS441 David B. et al.
On the basis of expert domain knowledge, we combine the results of the
supervised approach (decision tree) and the unsupervised approach
(association rules). The combined ruleset comprises ten rules for isolating
subsamples that allow for an accurate matching of SHS with MiFID data.
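As an illustration of how rules from the two approaches might be pooled and applied, the following minimal Python sketch treats each rule as a predicate on a record and keeps every record that satisfies at least one rule. The field names, thresholds and the two example rules are hypothetical and are not the ten rules of the combined ruleset; the "any rule matches" logic is likewise an assumption made for the sketch.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rule read off a decision tree path (supervised approach).
tree_rules: List[Rule] = [
    lambda r: r["sector"] == "bank" and r["n_trades"] <= 5,
]

# Hypothetical rule derived from an association rule (unsupervised approach).
assoc_rules: List[Rule] = [
    lambda r: r["instrument"] == "bond" and r["own_account"],
]


def combine(*rule_sets: List[Rule]) -> List[Rule]:
    """Pool the rules from several sources into one ruleset."""
    combined: List[Rule] = []
    for rule_set in rule_sets:
        combined.extend(rule_set)
    return combined


def isolate(records: List[Record], rules: List[Rule]) -> List[Record]:
    """Keep the records that satisfy at least one rule (assumed logic)."""
    return [r for r in records if any(rule(r) for rule in rules)]


# Illustrative records only; not real SHS or MiFID data.
records: List[Record] = [
    {"id": 1, "sector": "bank", "n_trades": 3, "instrument": "share", "own_account": False},
    {"id": 2, "sector": "fund", "n_trades": 40, "instrument": "bond", "own_account": True},
    {"id": 3, "sector": "fund", "n_trades": 40, "instrument": "share", "own_account": True},
]

ruleset = combine(tree_rules, assoc_rules)
matchable = isolate(records, ruleset)
matchable_ids = [r["id"] for r in matchable]
```

In practice, expert review would decide which candidate rules enter the pooled list; the sketch only shows the mechanical step of applying an already agreed ruleset.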
5. Discussion and Conclusion
To develop a more complete picture of exposures to risks and the
interconnectedness of market participants, we integrate an official statistic on
securities holdings with regulatory transaction-by-transaction securities
trading data. By combining supervised and unsupervised machine learning
algorithms, we develop a simple and transparent set of rules that allows us to
integrate the datasets despite the stark conceptual differences between the
data sources. We find that, in combination with domain knowledge, this
data-driven approach successfully isolates subsamples that can be matched
accurately between the datasets. One benefit of integrating the
datasets is that, for the successfully integrated subsamples, we are able to
analyse portfolio risks at any point in time, rather than only on a monthly basis.
With the implementation of the new Markets in Financial Instruments
Regulation and Directive (MiFIR and MiFID II) at the beginning of 2018, we will be able to further
improve the matching. This will allow for an even more accurate integration of
both data sources.
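The point-in-time benefit mentioned above can be illustrated with a minimal Python sketch that rolls a month-end holding forward using signed trade quantities. It assumes that every trade of the holder in the security has been matched, that quantities are in the same units as the holding, and that no redemptions or corporate actions occur; the function name, dates and figures are hypothetical.

```python
from datetime import date
from typing import List, Tuple

Trade = Tuple[date, float]  # (trade date, signed quantity: buy > 0, sell < 0)


def position_on(day: date, month_end: date, holding: float,
                trades: List[Trade]) -> float:
    """Holding at the close of `day`: the month-end stock plus the net
    quantity traded after `month_end` up to and including `day`."""
    net = sum(qty for traded_on, qty in trades if month_end < traded_on <= day)
    return holding + net


# Illustrative figures only; not real SHS or MiFID data.
MONTH_END = date(2017, 6, 30)
HOLDING = 1000.0
TRADES: List[Trade] = [
    (date(2017, 7, 3), 200.0),
    (date(2017, 7, 10), -500.0),
    (date(2017, 7, 20), 100.0),
]

mid_month = position_on(date(2017, 7, 12), MONTH_END, HOLDING, TRADES)
next_month_end = position_on(date(2017, 7, 31), MONTH_END, HOLDING, TRADES)
```

Comparing the rolled-forward value at the next month end with the reported holding for that month would be one simple way to check whether a subsample has been matched accurately.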
298 | ISI WSC 2019