Page 123 - Contributed Paper Session (CPS) - Volume 6

CPS1847 Shariful I.
            Big Data refers to a collection of data sets so large and complex that it becomes
            difficult to process using on-hand database management tools or traditional
            data processing applications. Big Data analytics and data processing offer a platform
            to globalize research by establishing a dialogue between industries and
            academic organizations and by transferring knowledge from research to industry. Big
            data can be characterized by three Vs: the extreme volume of data, the wide variety
            of types of data and the velocity at which the data must be processed.
            Although the term big data doesn’t refer to any specific quantity, it is often
            used when speaking about petabytes and exabytes of data, much of which
            cannot be integrated easily. Because big data takes too much time and costs
            too much money to load into a traditional relational database for analysis, new
            approaches to storing and analyzing data have emerged that rely less on
            data schema and data quality. Although the demand for big data analytics is
            high, there is currently a shortage of data scientists and other analysts who
            have experience working with big data in a distributed, open source
            environment. In the enterprise, vendors have responded to this shortage by
            creating Hadoop appliances to help companies take advantage of the
            semi-structured and unstructured data they own. Big data can be contrasted with
            small data, another evolving term that’s often used to describe data whose
            volume and format can be easily used for self-service analytics. A commonly
            quoted axiom is that “big data is for machines; small data is for people”.

            2. Methodology
               Industry press is enamoured of the 4 V’s of Big Data. These are Volume,
            Velocity, Variety and Veracity. Volume refers to the size of the data.
            Velocity refers to the speed at which data is collected and consumed.
            Variety refers to the different kinds of data consumed, from structured data
            to unstructured data and sensor data. Veracity refers to the trustworthiness
            of the data.
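               As an illustration only, the four V’s can be given rough numerical
            proxies for a small batch of records. The sketch below is not part of the
            paper’s method: the Record class, its field names and the four_v_summary
            function are hypothetical, and the `valid` flag assumes that some upstream
            quality check has already been run on each record.

```python
import json
from dataclasses import dataclass


@dataclass
class Record:
    """One incoming record; every field name here is hypothetical."""
    timestamp: float  # seconds since the start of collection
    kind: str         # e.g. "structured", "unstructured", "sensor"
    payload: dict     # the record's content
    valid: bool       # outcome of an assumed upstream quality check


def four_v_summary(records):
    """Return rough proxies for Volume, Velocity, Variety and Veracity."""
    if not records:
        return {"volume_bytes": 0, "velocity_per_s": None,
                "variety": 0, "veracity": None}

    # Volume: total size of the JSON-serialized payloads, in bytes.
    volume = sum(len(json.dumps(r.payload).encode("utf-8")) for r in records)

    # Velocity: records per second over the observed time span.
    # It is undefined (None) when all records share one timestamp.
    times = [r.timestamp for r in records]
    span = max(times) - min(times)
    velocity = len(records) / span if span > 0 else None

    # Variety: number of distinct kinds of data in the batch.
    variety = len({r.kind for r in records})

    # Veracity: share of records that passed the quality check.
    veracity = sum(1 for r in records if r.valid) / len(records)

    return {"volume_bytes": volume, "velocity_per_s": velocity,
            "variety": variety, "veracity": veracity}


# Made-up sample batch, for demonstration only.
records = [
    Record(0.0, "structured", {"id": 1}, True),
    Record(1.0, "sensor", {"temp": 21.5}, True),
    Record(2.0, "unstructured", {"text": "hello"}, False),
    Record(4.0, "sensor", {"temp": 22.0}, True),
]
summary = four_v_summary(records)
print(summary)
```

               These proxies are deliberately crude: in practice each V is a
            qualitative property of a data source, and any such measure depends on how
            the data is serialized, sampled and validated.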
               The methods of Big Data can be described by the following characteristics:
                 1.  Volume - The quantity of data that is generated is very important in this
                    context. It is the size of the data which determines the value and
                    potential of the data under consideration.
                 2.  Variety - The next aspect of Big Data is its variety. This means that the
                    category to which Big Data belongs is also an essential fact that
                    needs to be known by the data analysts.
                 3.  Velocity - The term ‘velocity’ in this context refers to the speed of
                    generation of data, or how fast the data is generated and processed to
                    meet the demands and the challenges which lie ahead in the path of
                    growth and development.
                 4.  Variability - This is a factor which can be a problem for those who
                    analyse the data. This refers to the inconsistency which can be shown

                                                               112 | ISI WSC 2019