analytics requires knowledge of its four main techniques: data stream learning, deep learning, granular computing, and incremental and ensemble learning. The reader is referred to (Ahmed Oussous et al. 2018).
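As an illustration of the incremental, stream-oriented style of learning named above, the following is a minimal sketch using scikit-learn's SGDClassifier, whose partial_fit method updates the model one mini-batch at a time; the synthetic data, feature count, and batch sizes are assumptions made only for the example.

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.default_rng(0)
    clf = SGDClassifier()                       # linear model trained by stochastic gradient descent
    classes = np.array([0, 1])                  # all labels must be declared on the first partial_fit

    for chunk in range(100):                    # each iteration stands in for one chunk of a data stream
        X = rng.normal(size=(256, 20))          # 256 records with 20 features arrive
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        clf.partial_fit(X, y, classes=classes)  # model is updated in place; memory use stays constant

The same pattern applies when the chunks come from a file, a message queue, or any other stream, since the full data set never has to be held in memory at once.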
It should be noted, however, that deep learning is particularly effective when working with big data because of several features: it can operate in an environment containing a very large volume of data, it learns through a hierarchical structure, and it transforms the raw data into a feature vector in which the classifier can detect the patterns of the input.
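This hierarchical, feature-vector view can be made concrete with a toy forward pass in plain numpy; the layer sizes and the random, untrained weights below are assumptions chosen only to show the structure, not a trained model.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=784)                                    # raw input, e.g. a flattened image

    W1, b1 = rng.normal(size=(256, 784)) * 0.01, np.zeros(256)  # low-level feature detectors
    W2, b2 = rng.normal(size=(64, 256)) * 0.01, np.zeros(64)    # higher-level features
    W3, b3 = rng.normal(size=(10, 64)) * 0.01, np.zeros(10)     # classifier head

    h1 = np.maximum(0, W1 @ x + b1)     # layer 1: detects simple patterns in the raw data
    h2 = np.maximum(0, W2 @ h1 + b2)    # layer 2: combines them into a compact feature vector
    scores = W3 @ h2 + b3               # classifier reads the feature vector
    predicted_class = int(np.argmax(scores))

In practice the weights are learned by backpropagation; the point here is only that each layer builds on the representation produced by the one before it.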
Pursuing effective big data predictive analytics therefore requires investigating the problems that deep learning encounters in this setting. The challenges of deep learning emerge from the challenges of big data predictive analytics, since these affect the way deep learning is carried out. Deep learning therefore has to contend with the following issues: dealing with continuous data streams, data incompleteness, running time and model complexity, the inability to train the data on a single central processor or storage unit, and the difficulty of parallelizing algorithms. Overcoming these challenges requires intensive study of each of them in order to improve deep learning for predictive analytics. (Eric P. Xing et al. 2015) discussed data parallelism and model parallelism for ML big data analytics: "In data-parallel ML, the data D is partitioned and assigned to computational workers (indexed by p = 1..P); we denote the data partition by D_p. We assume that the function Δ( ) can be applied to each of these data subsets independently, yielding a data-parallel update equation: A(t) = F( A(t-1), Σ_{p=1..P} Δ(A(t-1), D_p) )" (Eric P. Xing et al. 2015), where A(t) denotes the model state after iteration t; see figure (2). This formulation addresses the problem of data and model parallelism, and a minimal sketch of the data-parallel update is given below.
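The quoted update can be written directly in code. The sketch below is a single-machine illustration of the data-parallel pattern, assuming a least-squares gradient as Δ( ) and a plain gradient step as F; it is not the system described by Xing et al., and in a real deployment each partition D_p would be processed by its own worker in parallel.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 5))                       # the full data set D
    w_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
    y = X @ w_true + 0.01 * rng.normal(size=10_000)

    P = 4                                                  # number of workers / partitions D_p
    parts = np.array_split(np.arange(len(X)), P)

    def delta(A, idx):
        """Δ(A, D_p): gradient of the squared error on one data partition."""
        Xp, yp = X[idx], y[idx]
        return Xp.T @ (Xp @ A - yp) / len(X)

    A = np.zeros(5)                                        # model state A(0)
    for t in range(200):
        update = sum(delta(A, idx) for idx in parts)       # aggregate the P partial updates
        A = A - 0.1 * update                               # F: apply one gradient step to A(t-1)

Because each call to delta touches only its own partition, the P calls are independent and can run on separate machines; only the small aggregated update has to be communicated.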
To address the deep learning challenges mentioned above, the following are suggested: (a) label the data and handle missing values; (b) use a decentralized system to train the data; (c) train on a sample of the data; (d) apply data and model parallelism. A brief sketch of handling missing values, suggestion (a), is given below.
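A minimal way to realise suggestion (a) is per-column mean imputation, so that downstream models can be trained on complete records; the toy numbers below are illustrative only, and real projects typically use richer imputation or model the missingness directly.

    import numpy as np

    data = np.array([[36.6,   120.0, np.nan],
                     [38.2,  np.nan,    5.4],
                     [np.nan, 135.0,    4.9]])

    col_means = np.nanmean(data, axis=0)      # column means, ignoring missing entries
    rows, cols = np.where(np.isnan(data))     # locations of the missing values
    data[rows, cols] = col_means[cols]        # impute each gap with its column mean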
Investigation found that some of these suggestions have already been taken into consideration in a real project in China: a set of modified prediction models built over real-life hospital data collected from central China in 2013-2015 (see Min Chen et al. 2017). That project overcame the problems of incomplete data and of parallelizing the algorithms, and it proposed a new convolutional neural network (CNN)-based multimodal disease risk prediction algorithm whose accuracy reaches 94.8%. A generic sketch of a CNN classifier of this kind is given below.
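To give an idea of the kind of model meant by a CNN-based risk predictor, the following is a generic Keras sketch for a single binary risk output; the input shape (a sequence of 100 encoded values per patient record), the layer sizes, and the training setup are assumptions, not the architecture of Min Chen et al. (2017).

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu",
                               input_shape=(100, 1)),            # one encoded patient record
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),          # predicted disease risk
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])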
The following table (1) shows the different categories of data that were used.