Page 212 - Special Topic Session (STS) - Volume 4

STS580 Vassilis P. P.



                       Healthcare fraud detection using machine learning approaches
                                             Vassilis P. Plagianakos
                   Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
                         Hellenic National Organization for the Provision of Health Services (E.O.P.Y.Y.)

                  Abstract
                  Biomedicine is undergoing a revolution driven by the explosion of biomedical
                  data. As a result, Big Data has shifted biomedical informatics research from
                  case-based to data-driven studies. Data from hospitals and clinics form
                  a very large data set (Big Data) because they are submitted monthly, posing
                  several challenges for Big Data mining and analysis. In
                  parallel, fraud detection in the healthcare domain is an important issue, since
                  fraud has considerably inflated losses for individuals, entities, and governments.
                  Hence, there is an imperative need for new computational tools able to
                  effectively detect fraud by exploiting the potential of Big Data. Machine
                  Learning (ML) approaches can shed more light on healthcare fraud since they
                  can cope with these challenges. In this study, we utilized clinical data from the
                  National Organization for the Provision of Health Services of Greece, focusing
                  on investigating the clinics' behavior with respect to their hospital expenditure.
                  The core of our analysis is based on t-SNE and Density Peak, two
                  well-established ML tools for data visualization and clustering, respectively. Our
                  results show that ML approaches can contribute to healthcare fraud detection
                  and interpretation.

                  Keywords
                  Fraud Detection; Machine Learning; Visualization; Big Data

                  1.  Introduction
                      We live in the “Big Data” era, where there is great potential for
                  revolutionizing the entire healthcare domain [1]. Biomedical data generation
                  is increasing constantly through recent advancements in the field of biomedicine,
                  creating a large pool of heterogeneous data that grows at an exponentially
                  increasing rate. This data volume poses several challenges for
                  Big Data analysis and visualization. Given that these
                  data have ultra-high dimensionality and complexity, computational tools
                  are needed to cope with these challenges. Machine Learning (ML)
                  techniques are among the best approaches to tackle these limitations [2,3].
                  Nevertheless, data generated in the health domain are too big and too complex,
                  and their production rate is too fast, for healthcare providers to process and
                  interpret with the existing tools. Hence, there is an imperative and urgent need


                                                                     201 | I S I   W S C   2 0 1 9