Page 54 - Contributed Paper Session (CPS) - Volume 6

CPS1490 Nehall Ahmed Farouk Mohamed
                  sample of the available big data set makes it quite easy, as it shrinks the size
                  of the data to be trained on; using decentralized storage also helps.

                  4. Discussion and Conclusion
                     A great deal of work has been done on big data analytics, especially
                  predictive analytics. Several problems have prevented ML from being used
                  to its full extent in big data predictive analysis. Recently, some of these
                  problems have been solved, and deep learning now performs better in this
                  setting than before. There are still open issues to be studied in the
                  future, such as: (a) Training on a sample of the data requires defining an
                  adequate sample design. (b) The granularity issue needs to be studied over
                  different categories and classifications to see what effect it might have.
                  (c) The issue of the continuous streaming of data.
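
                  Open issues (a) and (c) meet when a training sample has to be drawn from
                  data that arrives as a continuous stream of unknown length. The sketch below
                  illustrates one standard option, reservoir sampling (Algorithm R). It is not
                  a method proposed in this paper, and it only gives a simple uniform sample,
                  which may or may not be an adequate sample design for a given application.
                  The function name `reservoir_sample` and its parameters `k` and `seed` are
                  assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Draw a uniform random sample of up to k items from a stream.

    The stream is read once and only k items are held in memory, so the
    total length does not need to be known in advance (Algorithm R).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)  # hypothetical seed parameter, for repeatability
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1) by drawing a
            # slot in [0, i] and replacing only if it falls in the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Illustrative use: keep 1,000 records out of a simulated stream of 1,000,000.
training_sample = reservoir_sample(range(1_000_000), k=1000, seed=0)
```

                  The sample size stays fixed however long the stream runs, which is what
                  shrinks the data to be trained on. A stratified or weighted design would
                  need a different scheme.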




                                                                      43 | ISI WSC 2019