Page 130 - Special Topic Session (STS) - Volume 2

STS466 Md. Khadzir S.A. et al.
                  data. Unstructured data can take the form of free text, visual, audio and
                  machine-generated data. It does not have predetermined values and is not
                  stored in an organized manner, so it cannot be analysed by a conventional
                  data warehouse. Therefore, other techniques need to be applied. MyHarmony
                  aims to address this gap and to be included as part of MyHDW.
                      There were three (3) major deliverables in the conceptual stage. The first
                  was the development and implementation of health terminology standards,
                  namely SNOMED CT, which serves as the knowledge base for MyHarmony. The
                  second was the harmonization of medical terminology to SNOMED CT terms by
                  way of mapping. The last was the development and implementation of
                  MyHarmony itself, to show that the application can codify relevant terms in
                  free text using Natural Language Processing (NLP) techniques. The SNOMED CT
                  codified data can then be analysed to generate information.
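
                  To make the idea of codifying free text concrete, the following is a
                  minimal sketch of dictionary-based term matching, assuming a small
                  in-memory lexicon. It is not MyHarmony's implementation, which the paper
                  does not detail. The lexicon, the function name codify and the sample
                  sentence are hypothetical, and the concept IDs are illustrative and should
                  be verified against an official SNOMED CT release.

```python
import re

# Hypothetical mini-lexicon: lowercase term or abbreviation -> SNOMED CT concept ID.
# IDs are illustrative only; verify them against an official release before use.
LEXICON = {
    "myocardial infarction": "22298006",
    "mi": "22298006",
    "chest pain": "29857009",
    "atrial fibrillation": "49436004",
}


def codify(text, lexicon=LEXICON):
    """Return (matched term, concept ID) pairs using greedy longest match."""
    # Lowercase and split into alphanumeric tokens, dropping punctuation.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Longest lexicon entry, measured in words, bounds the window size.
    max_len = max(len(term.split()) for term in lexicon)
    results = []
    i = 0
    while i < len(tokens):
        # Try the longest window first so multi-word terms win over sub-terms.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                results.append((candidate, lexicon[candidate]))
                i += n
                break
        else:
            # No lexicon entry starts at this token; move on.
            i += 1
    return results


if __name__ == "__main__":
    note = "Patient presented with chest pain; ECG confirmed MI."
    for term, concept_id in codify(note):
        print(term, concept_id)
```

                  A production system would also need to handle negation, misspellings and
                  context, which this sketch deliberately ignores.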

                  2.  Methodology
                     Development started in 2014 with the Cardiology Refset, which served as
                  the terminology reference for the MyHarmony engine during the
                  harmonisation/mapping and codification process. Cardiology Refset (version
                  1.0) was completed and released in 2014. It is a simple reference set [1]
                  containing about 600 terms related to the Cardiology specialty, including
                  signs and symptoms, diagnoses, procedures, body structures, medical devices
                  and medications. It was delivered in time to be tested on the MyHarmony
                  standalone system to generate National Cardiovascular Disease (NCVD)
                  registries.
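
                  As an illustration of what a simple reference set looks like in practice,
                  the sketch below reads a tab-delimited file in the SNOMED CT Release Format
                  2 (RF2) simple refset layout and collects the active member concepts. The
                  column names follow the RF2 layout; the sample rows, the refset ID, the
                  effective dates and the function name active_members are hypothetical and
                  do not come from the actual Cardiology Refset.

```python
import csv
import io

# Hypothetical sample in the RF2 simple refset layout (tab-delimited).
# Row identifiers, refsetId and effectiveTime values are placeholders.
SAMPLE = (
    "id\teffectiveTime\tactive\tmoduleId\trefsetId\treferencedComponentId\n"
    "00000000-0000-0000-0000-000000000001\t20140101\t1\t900000000000207008\t999000001\t22298006\n"
    "00000000-0000-0000-0000-000000000002\t20140101\t1\t900000000000207008\t999000001\t29857009\n"
    "00000000-0000-0000-0000-000000000003\t20140101\t0\t900000000000207008\t999000001\t49436004\n"
)


def active_members(rf2_text):
    """Return the set of concept IDs that are active members of the refset."""
    reader = csv.DictReader(
        io.StringIO(rf2_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    # 'active' is "1" for current members and "0" for retired ones.
    return {row["referencedComponentId"] for row in reader if row["active"] == "1"}


if __name__ == "__main__":
    print(sorted(active_members(SAMPLE)))
```
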
                     The draft Refset and method were presented at IHTSDO meetings and the
                  Expo, in succession, in September 2013, October 2013 and April 2014 to gain
                  feedback from experts in the international community. The finalized method
                  was presented at the SNOMED CT Expo in October 2014 [2].
                     The Cardiology Refset was then expanded to include all cardiology-related
                  terms, and Cardiology Refset v1.1, containing more than 6,000 concepts, was
                  completed in July 2016. First, more than 300,000 SNOMED CT concepts (Fully
                  Specified Names) were extracted and reviewed by PIK by visual inspection
                  (the "eyeballing" technique). About 12,000 concepts believed to be related
                  to the Cardiology specialty were given to the clinicians for review. The
                  clinicians reduced the number of concepts to about 6,000. Additionally, the
                  Refset included local terms and common abbreviations, which were mapped to
                  existing concepts.
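
                  The mapping of local terms and abbreviations to existing concepts can be
                  pictured as a lookup table whose targets must all be members of the Refset.
                  The sketch below checks that constraint. The names REFSET_IDS, LOCAL_TERMS
                  and unmapped_terms, and the example entries, are hypothetical; the concept
                  IDs are illustrative and should be verified against an official SNOMED CT
                  release.

```python
# Hypothetical refset membership: a set of SNOMED CT concept IDs.
REFSET_IDS = {"22298006", "29857009", "49436004"}

# Hypothetical local terms and abbreviations mapped to concept IDs.
LOCAL_TERMS = {
    "heart attack": "22298006",
    "MI": "22298006",
    "AF": "49436004",
    "HPT": "38341003",  # target deliberately left out of REFSET_IDS above
}


def unmapped_terms(refset_ids, local_terms):
    """Return local terms whose target concept is not a refset member."""
    return sorted(
        term for term, concept_id in local_terms.items()
        if concept_id not in refset_ids
    )


if __name__ == "__main__":
    # Lists terms that still point outside the refset and need review.
    print(unmapped_terms(REFSET_IDS, LOCAL_TERMS))
```

                  Several local terms may point to the same concept, so the table is
                  many-to-one by design.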

                     Next, the team utilised MyHarmony to generate the analysis. There were
                  four (4) main functions in MyHarmony:



                                                                     119 | I S I   W S C   2 0 1 9