Page 125 - Contributed Paper Session (CPS) - Volume 6

CPS1847 Shariful I.















            4.2 Open Source Big Data Analysis Platforms and Tools
            1 Hadoop
                 No discussion of big data is complete without Hadoop. The Apache
            distributed data processing software is so pervasive that the terms ‘Hadoop’
            and ‘big data’ are often used synonymously. The Apache Software Foundation
            also sponsors a number of related projects that extend the capabilities of
            Hadoop, and many of them are mentioned below. In addition, numerous vendors
            offer supported versions of Hadoop and related technologies. Operating
            System: Windows, Linux, OS X.
            2 MapReduce
                 MapReduce was originally developed by Google. The MapReduce website
            describes it as “a programming model and software framework for writing
            applications that rapidly process vast amounts of data in parallel on large
            clusters of compute nodes”. It is used by Hadoop, as well as by many other
            data processing applications. Operating System: OS Independent.
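
The programming model described above can be illustrated with a minimal single-process sketch of the classic word count example. It is not Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps in parallel across cluster nodes, whereas this sketch runs them sequentially in one Python process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially (illustrative only)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` depends only on its own key, both steps could in principle be distributed over many machines, which is the property the quoted description refers to.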
            3 GridGain
                 GridGain offers an alternative to Hadoop’s MapReduce that is compatible
            with the Hadoop Distributed File System. It offers in-memory processing for
            fast analysis of real-time data. The open source version can be downloaded
            from GitHub, and a commercially supported version is available for purchase.
            Operating System: Windows, Linux, OS X.
            4 HPCC
                 Developed by LexisNexis Risk Solutions, HPCC is short for "high
            performance computing cluster." It claims to offer superior performance to
            Hadoop. Both free community versions and paid enterprise versions are
            available. Operating System: Linux.
            5 Storm
                 Owned by Twitter, Storm offers distributed real-time computation
            capabilities and is often described as the "Hadoop of real-time." It's highly




                                                               114 | I S I   W S C   2 0 1 9