Page 213 - Special Topic Session (STS) - Volume 4

STS580 Vassilis P. P.
            for more accurate, fast and intelligent methodological frameworks from the
            ML perspective. Furthermore, clinical data belong to the Big Data era and
            are highly diverse, since they come from different subareas [4]. Also, the
            majority of clinical data are high-dimensional, because the number of
            patients is limited relative to the number of measured features
            (small/large n, large p). However, most computational tools are designed
            for data with large n and small p, since high-dimensional data suffer
            from the "curse of dimensionality" [5].
                Also, the integration of big biomedical data and advanced computational
            tools can contribute to healthcare fraud detection. Healthcare fraud is
            classified into three main pillars: health insurance fraud, drug fraud and
            medical fraud. It has a high impact, since many parties suffer financially.
            Indicative examples include the insurance holder who has to pay higher
            expenses while receiving reduced coverage, the business that pays
            increasing amounts for employer healthcare and thus faces a higher cost of
            doing business, and clinics that charge patients for their services or
            bill services that should be covered by the state. Indicatively, the World
            Health Organization (WHO) has recently estimated that about 7.3% of annual
            healthcare expenditure (around $470 billion) is lost to healthcare fraud
            every year [6]. In this study, we utilized clinical data from the National
            Organization for the Provision of Health Services of Greece, focusing on
            the behavior of clinics with respect to their hospital expenditure. Our
            analysis is based on t-SNE and Density Peak, two well-established ML tools
            for data visualization and clustering, respectively.
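
As a rough illustration of the clustering step, the sketch below implements the core Density Peak idea in plain Python. Each point gets a local density and a distance to its nearest denser point; points where both are large are taken as cluster centers, and every other point inherits the label of its nearest denser neighbor. The function name, the Gaussian density kernel, the cutoff `d_c`, the fixed number of clusters and the toy 2-D points (standing in for a t-SNE embedding) are assumptions made for this example, not the configuration used in the study.

```python
import math


def density_peak_clustering(points, d_c, n_clusters):
    """Illustrative Density Peak clustering (hypothetical helper, not the study's code)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: Gaussian-weighted count of neighbors within roughly d_c.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n    # distance to the nearest denser point
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centers: points with the largest gamma = rho * delta.
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Remaining points take the label of their nearest denser neighbor,
    # which has already been labeled because we go in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy 2-D points forming two well-separated groups (made-up data).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05),
]
labels, centers = density_peak_clustering(points, d_c=0.5, n_clusters=2)
print(labels, centers)
```

In practice the number of centers is usually chosen by inspecting a plot of density against distance rather than fixed in advance; it is passed as a parameter here only to keep the sketch short.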

            2.  Machine Learning approaches in Healthcare Fraud detection
                Machine learning (ML) approaches can tackle part of the complexity of
            fraud detection, since the digitalization of healthcare information offers
            more data and thus enables robust Data Mining and ML frameworks [7]. These
            methods are classified into three categories: supervised, unsupervised and
            reinforcement learning. Briefly, in supervised learning the algorithm
            constructs a function that maps given inputs (the training set) to known
            desired outputs, with the ultimate goal of generalizing this function to
            inputs with unknown outputs. It is used in real-world problems related to
            classification, prediction and data interpretation. Unsupervised learning
            is the process where the algorithm constructs a model for a set of inputs
            in the form of observations without knowing the desired outputs. Since the
            true labels of the data are unknown, the model's output cannot be checked
            against them, as it can in the supervised case. It is used in real-world
            problems related to data clustering and association analysis. The last
            category, reinforcement learning, concerns methods which learn a strategy
            of actions through direct interaction with the environment. It is used in
            planning problems such as robot motion control or function optimization
            in industry.
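
To make the contrast between the first two categories concrete, the following minimal sketch applies a supervised method (a nearest-neighbor classifier) and an unsupervised one (k-means) to the same toy inputs. The function names, the "typical"/"atypical" labels and the data values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: use labeled training examples to predict the label of x."""
    i = min(range(len(train_x)), key=lambda k: math.dist(train_x[k], x))
    return train_y[i]


def k_means(points, centroids, iters=10):
    """Unsupervised: group points around centroids without using any labels."""
    def nearest(p):
        return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return [nearest(p) for p in points]


# Made-up 2-D inputs; the labels exist only in the supervised setting.
train_x = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (4.1, 3.9)]
train_y = ["typical", "typical", "atypical", "atypical"]

predicted = nearest_neighbor_predict(train_x, train_y, (3.8, 4.0))
clusters = k_means(train_x, centroids=[train_x[0], train_x[-1]])
print(predicted, clusters)
```

The classifier can be scored against held-out labels, whereas the cluster indices returned by `k_means` carry no meaning beyond grouping, which is the evaluation difficulty noted above for unsupervised learning.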



                                                               202 | I S I   W S C   2 0 1 9