Page 239 - Contributed Paper Session (CPS) - Volume 2
CPS1845 Devni P.S. et al.
> dag <- set.arc(dag, from = "F", to = "L")
> dag <- set.arc(dag, from = "S", to = "L")
The direct dependencies of each variable are listed after a bar (|) and
separated by colons (:). For example, [L|S:F] means F → L and S → L, while
[F] and [S] mean that no arc points to F or to S. This representation of
the graph structure mirrors the product of conditional probabilities, and
it can be obtained with the modelstring function.
> modelstring(dag)
[1] "[C][E][S][F][P|E][L|S:F][D|C:P:L:F]"
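The model string can also be read in the other direction. The following is a minimal sketch, assuming the package in use is bnlearn (which provides set.arc and modelstring) and that its model2network, parents and root.nodes functions are available; the object name dag2 is introduced here only for illustration.

```r
library(bnlearn)

# Rebuild the same graph structure directly from the model string.
dag2 <- model2network("[C][E][S][F][P|E][L|S:F][D|C:P:L:F]")

# Parents of L, as encoded by the term [L|S:F].
parents(dag2, "L")

# Nodes written without a bar, such as [F] and [S], have no incoming arcs.
root.nodes(dag2)

# The rebuilt graph has the same seven arcs as the one built with set.arc.
nrow(arcs(dag2))
```

Writing the structure as a single string is convenient when the whole graph is known in advance, whereas set.arc is better suited to adding arcs one at a time.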
The two basic components of the graph, its nodes (vertices) and its arcs,
can be retrieved with the nodes and arcs functions.
> nodes(dag)
[1] "C" "P" "E" "L" "S" "F" "D"
> arcs(dag)
     from to
[1,] "F"  "L"
[2,] "S"  "L"
[3,] "F"  "D"
[4,] "C"  "D"
[5,] "E"  "P"
[6,] "L"  "D"
[7,] "P"  "D"
To complete the BN model, we specify the joint probability distribution of
the variables. Each discrete variable is defined over a set of states
(called levels in R). In the example below, we retrieve the states of
variable C.
> C.lv <- levels(Data[, "C"])
> C.lv
[1] "1" "2" "3"
In the context of BNs, this joint distribution is called the global
distribution. Using the global distribution directly is difficult because
the number of parameters is very high. In this case, based on the
combinations of the levels of all variables, the parameter set comprises
647 probabilities. The advantage of the DAG is that it lets us simplify the
global distribution into a set of smaller local distributions, one for each
variable. Variables that are not connected by an arc are conditionally
independent. We can therefore factor the global distribution as follows:
$$\Pr(C, E, S, F, P, L, D) = \Pr(D \mid C, P, L, F)\,\Pr(P \mid E)\,\Pr(E)\,\Pr(L \mid S, F)\,\Pr(S)\,\Pr(F)\,\Pr(C)$$
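The saving obtained from this factorization can be illustrated with a short base R calculation. This is only a sketch: the text confirms three levels for C, while the remaining level counts below are hypothetical values chosen so that the global distribution has the 647 probabilities reported above. The names n.levels and parent.sets are likewise introduced for illustration; the parent sets are read off the model string.

```r
# Hypothetical level counts: only the three levels of C come from the text.
n.levels <- c(C = 3, E = 2, S = 2, F = 2, P = 3, L = 3, D = 3)

# Parent sets taken from the model string [C][E][S][F][P|E][L|S:F][D|C:P:L:F].
parent.sets <- list(C = character(0), E = character(0), S = character(0),
                    F = character(0), P = "E", L = c("S", "F"),
                    D = c("C", "P", "L", "F"))

# Global distribution: one probability per joint configuration, minus one
# because the probabilities must sum to 1.
global.params <- prod(n.levels) - 1

# Local distributions: (levels - 1) probabilities for each configuration of
# the parents; a node without parents has a single (empty) configuration.
local.params <- sapply(names(n.levels), function(node) {
  (n.levels[[node]] - 1) * prod(n.levels[parent.sets[[node]]])
})

global.params      # 647
sum(local.params)  # 125 under the assumed level counts
```

Under these assumed level counts, most of the remaining parameters belong to the local distribution of D, which has four parents.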
In this case, the parameters to estimate are the conditional probabilities
in the local distributions. These local probabilities can be estimated by
the corresponding empirical frequencies in the data set, for example
228 | ISI WSC 2019