Page 218 - Special Topic Session (STS) - Volume 4

STS580 Vassilis P. P.













                  Figure 5: 2-dimensional visualization using t-SNE together with the clusters retrieved by
                  k-means for k = 2 (left) and k = 6 (center), and 2-dimensional visualization using t-SNE
                  together with the clusters retrieved by Density Peak when applied to the 2-dimensional
                  data (right)

                      In the last part of our experimental analysis, in an attempt to make the
                  evaluation more accessible (assuming that we trust the visualization procedure
                  enough), we apply a clustering methodology directly to the 2-dimensional
                  mapping retrieved by t-SNE. This guarantees that the clustering result is
                  displayed faithfully in the visualization. For this purpose, we employ the
                  Density Peak algorithm [20], which can also automatically estimate the number
                  of existing clusters. The results are reported in Figure 5 (right). Clear
                  clusters are evident in our dataset. With this evidence one can claim that
                  Clinics belonging to different clusters could have different control limits and
                  bounds. The preliminary results reported here can be a very helpful tool when
                  designing Health Policies.
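
                  As an illustration of the procedure described above, the following is a
                  minimal pure-Python sketch of the density-peak idea (a local density rho
                  and a distance delta to the nearest denser point) applied to 2-dimensional
                  points. It is not the implementation used in this study: the toy grid data
                  stand in for a t-SNE embedding, and the cutoff `d_c` and the thresholds
                  `rho_min` and `delta_min` used to select cluster centers automatically are
                  assumptions chosen for this example.

```python
import math


def density_peak(points, d_c, rho_min, delta_min):
    """Cluster 2-D points with a simplified density-peak procedure.

    d_c, rho_min and delta_min are illustrative parameters (assumptions),
    not values taken from the study.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Indices in descending density; ties are broken by original index.
    order = sorted(range(n), key=lambda i: -rho[i])

    delta = [0.0] * n    # distance to the nearest point earlier in `order`
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers are dense points that lie far from any denser point; every
    # other point inherits the label of its nearest denser neighbour.
    labels = [-1] * n
    n_clusters = 0
    for rank, i in enumerate(order):
        if rank == 0 or (rho[i] >= rho_min and delta[i] >= delta_min):
            labels[i] = n_clusters
            n_clusters += 1
        else:
            labels[i] = labels[parent[i]]
    return labels, n_clusters


# Toy stand-in for a 2-D embedding: two well-separated 5x5 grids of points.
points = [(0.3 * x, 0.3 * y) for x in range(5) for y in range(5)]
points += [(10 + 0.3 * x, 10 + 0.3 * y) for x in range(5) for y in range(5)]

labels, n_clusters = density_peak(points, d_c=0.5, rho_min=5, delta_min=5.0)
print(n_clusters, labels)
```

                  In this sketch the number of clusters is not supplied by the user; it
                  follows from how many points pass both thresholds. In practice the
                  thresholds would have to be chosen by inspecting the rho and delta values
                  of the actual embedding.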

                  4.  Conclusions
                      Big Data offers a great opportunity in the healthcare domain to advance
                  research fields such as healthcare fraud detection. Fraud in the health
                  domain constitutes an important issue for both states and citizens, since it
                  accounts for a significant percentage of annual healthcare expenditure
                  globally. Now, in the Big Data era and with the recent advent of ML
                  approaches, there is potential for new computational tools able to
                  handle the challenges of fraud detection in healthcare.
                      Our analysis is based on clinical data from EOPYY and focuses on
                  investigating the Clinics' behavior with respect to their hospital
                  expenditure. The outcomes indicate that there are clear patterns among the
                  Clinics found in our dataset. We now have enough evidence to claim that
                  Clinics that belong to different clusters should be examined under different
                  circumstances. For example, control limits and bounds for Clinics could be
                  scaled according to the cluster they belong to.
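
                  One possible reading of cluster-scaled control limits is sketched below.
                  It assumes Shewhart-style limits (cluster mean plus or minus k standard
                  deviations), which is our illustrative choice and is not specified in the
                  analysis above; the function name `cluster_control_limits`, the multiplier
                  `k` and the expenditure values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cluster_control_limits(expenditures, labels, k=3.0):
    """Return {cluster label: (lower limit, upper limit)}.

    Limits are mean +/- k * population standard deviation, computed
    separately within each cluster (an assumed, illustrative rule).
    """
    groups = defaultdict(list)
    for value, label in zip(expenditures, labels):
        groups[label].append(value)

    limits = {}
    for label, values in groups.items():
        center = mean(values)
        spread = pstdev(values)
        limits[label] = (center - k * spread, center + k * spread)
    return limits


# Hypothetical expenditure values for six Clinics in two clusters.
expenditures = [10.0, 12.0, 11.0, 100.0, 110.0, 90.0]
labels = [0, 0, 0, 1, 1, 1]

limits = cluster_control_limits(expenditures, labels)
for label, (lower, upper) in sorted(limits.items()):
    print(label, round(lower, 2), round(upper, 2))
```

                  With limits computed per cluster, a Clinic is compared only against
                  Clinics with a similar expenditure profile rather than against a single
                  global bound.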

                  5.  Acknowledgment
                      This project is funded by the International Research Project, “Collective
                  wisdom driving public health policies - CrowdHEALTH”, in terms of the


                                                                     207 | ISI WSC 2019