Page 174 - Contributed Paper Session (CPS) - Volume 7
CPS2055 Asanao S. et al.
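$$
\hat{p}_{ij} = \Pr(T_i < T_j \mid T_i > t_i,\ T_j > t_j) =
\begin{cases}
\dfrac{\hat{S}(t_i)}{2\hat{S}(t_j)}, & t_i > t_j \\[2ex]
1 - \dfrac{\hat{S}(t_j)}{2\hat{S}(t_i)}, & t_i < t_j
\end{cases}.
$$
$\hat{S}(\cdot)$ is the Kaplan-Meier estimator for the failure distribution based on the samples included in the node.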
(iv) The criterion based on Korn and Simon’s approach
$$
\hat{C} = \frac{2}{n(n-1)} \left\{ \sum_{i \in L} \sum_{j \in R} \hat{\pi}_{ij} + \frac{n_L(n_L - 1) + n_R(n_R - 1)}{4} \right\},
$$
where
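$$
\hat{\pi}_{ij} = \widehat{\Pr}(T_i < T_j) = \sum_{t \in \mathcal{D},\ t < t^{*}} [1 - \hat{S}_L(t-)][\hat{S}_R(t-) - \hat{S}_R(t)] + [1 - \hat{S}_L(t^{*}-)]\,\hat{S}_R(t^{*}-).
$$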
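The conditional probability for a pair of censored observations, given just before criterion (iv), can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `kaplan_meier` and `prob_i_before_j` are hypothetical, both observations are assumed to share one Kaplan-Meier survival curve, and the value 0.5 returned when the denominator is zero (or the two times are equal) is a convention chosen here.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return a step function S(t): the Kaplan-Meier survival estimate."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])  # sorted distinct failure times
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                   # subjects still under observation
        deaths = np.sum((times == t) & (events == 1))  # failures exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    surv = np.array(surv)

    def S(t):
        # Number of failure times <= t gives the position on the step function.
        k = np.searchsorted(event_times, t, side="right")
        return 1.0 if k == 0 else float(surv[k - 1])

    return S

def prob_i_before_j(S, t_i, t_j):
    """Estimate Pr(T_i < T_j | T_i > t_i, T_j > t_j) for two censored times."""
    if t_i >= t_j:
        denom = S(t_j)
        # Assumption: fall back to 0.5 when the estimate is undefined.
        return 0.5 * S(t_i) / denom if denom > 0 else 0.5
    denom = S(t_i)
    return 1.0 - 0.5 * S(t_j) / denom if denom > 0 else 0.5
```

By construction, `prob_i_before_j(S, a, b) + prob_i_before_j(S, b, a)` equals 1 whenever the denominator is positive, which is a quick sanity check on the two cases.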
2.3 Pruning and Selection
The maximum-size tree $T_0$ is obtained by recursive splitting in the splitting step, after which an optimal-size tree is constructed from $T_0$ in the pruning and selection steps. In the pruning step, the nested subtrees $T_H \prec T_{H-1} \prec \cdots \prec T_1 \prec T_0$ are obtained by recursively removing nodes from $T_0$. $T_H$ is the tree which has the root node only. For this purpose, we propose the concordance-complexity measure:
$$
CC_{\alpha}(T_h) = \hat{C}(T_h) - \alpha |\tilde{T}_h|,
$$
where $\hat{C}(T_h)$ is the concordance measure for the sub-tree $T_h$ $(h = 0, 1, \cdots, H)$.
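For example, if we use the $\hat{C}$ for the splitting step, then $\hat{C}(T_h)$ is given by
$$
\hat{C}(T_h) = \frac{\sum_{(a,b) \in \tilde{T}_h} \max\left\{ \sum_{i \in a} \sum_{j \in b} I(t_i < t_j),\ \sum_{i \in a} \sum_{j \in b} I(t_i > t_j) \right\} + 0.5 \left\{ \sum_{a \in \tilde{T}_h} \sum_{i,j \in a} I(t_i < t_j) \right\}}{\sum_{i,j} I(t_i < t_j)},
$$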
where $\tilde{T}_h$ represents the set of terminal nodes in $T_h$, and $|\tilde{T}_h|$ is the number of terminal nodes in $T_h$. $\sum_{(a,b) \in \tilde{T}_h}$ represents the sum over all pairs of terminal nodes in $T_h$.
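To make the tree-level concordance measure concrete, the following Python sketch evaluates it from observed times and terminal-node labels. It is only an illustration under stated assumptions: the function name `tree_concordance` and its arguments are hypothetical, the indicator is taken as a plain comparison of observed times (any censoring adjustment in the original criterion is not modelled here), and 0.5 is returned when no pair is comparable.

```python
from itertools import combinations
import numpy as np

def tree_concordance(times, leaf_ids):
    """Concordance of a tree whose terminal nodes are given by leaf_ids."""
    times = np.asarray(times, dtype=float)
    leaf_ids = np.asarray(leaf_ids)
    leaves = np.unique(leaf_ids)

    # less[i, j] = I(t_i < t_j) for every ordered pair of observations.
    less = times[:, None] < times[None, :]
    total = less.sum()
    if total == 0:
        return 0.5  # assumption: no comparable pairs, so no information

    # Pairs from two different terminal nodes: count the better ordering.
    between = 0
    for a, b in combinations(leaves, 2):
        in_a = leaf_ids == a
        in_b = leaf_ids == b
        a_first = less[np.ix_(in_a, in_b)].sum()  # t_i < t_j with i in a, j in b
        b_first = less[np.ix_(in_b, in_a)].sum()  # t_i > t_j with i in a, j in b
        between += max(a_first, b_first)

    # Pairs inside one terminal node share a prediction and count one half.
    within = sum(
        less[np.ix_(leaf_ids == a, leaf_ids == a)].sum() for a in leaves
    )
    return (between + 0.5 * within) / total
```

With this sketch a root-only tree scores 0.5, and a tree that places every observation in its own terminal node scores 1, which matches the intuition that larger trees tend to have higher apparent concordance and therefore need a complexity penalty.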
The optimal tree for an arbitrary $\alpha$ is defined as the subtree that maximizes $CC_{\alpha}(T_h)$. If the value of $\alpha$ is 0, then the optimal subtree is $T_0$. On the other hand, if $\alpha$ approaches $\infty$, then a model that is not considered to be a tree structure is selected as the optimal subtree. Therefore, by gradually increasing $\alpha$ from 0, we can obtain a set of optimal subtrees.
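The effect of the complexity parameter can be sketched as below: for each value of the penalty, pick the subtree with the largest penalized concordance. The function names, the grid of penalty values, and the concordance and terminal-node counts are all hypothetical and serve only to illustrate the trade-off; they are not results from the paper.

```python
def best_subtree_index(concordances, n_leaves, alpha):
    """Index h of the subtree maximizing C(T_h) - alpha * (number of leaves)."""
    scores = [c - alpha * k for c, k in zip(concordances, n_leaves)]
    # max() returns the first maximizer, so ties favour the larger tree.
    return max(range(len(scores)), key=lambda h: scores[h])

def optimal_subtrees(concordances, n_leaves, alphas):
    """Map each penalty value in a grid to the index of its optimal subtree."""
    return {a: best_subtree_index(concordances, n_leaves, a) for a in alphas}

# Hypothetical nested subtrees: T_0 (largest), T_1, T_2 (root node only).
concordances = [0.80, 0.78, 0.50]
n_leaves = [5, 3, 1]
selected = optimal_subtrees(concordances, n_leaves, [0.0, 0.02, 1.0])
```

In this toy example the largest tree is selected with no penalty, the intermediate tree with a small penalty, and the root-only tree with a large penalty, mirroring the behaviour described in the text as the penalty grows from 0.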
In the selection step, we select a sub-tree from $T_0, T_1, \ldots$ based on V-fold cross validation. First, we construct $T_1, T_2, \cdots$ based on $\mathcal{L} - \mathcal{L}^{(v)}$ for each $v$ $(v = 1, \cdots, V)$. Then, we select the best subtree $T(\alpha')$ from $T_1, T_2, \cdots$, which maximizes the value of $\hat{C}(T(\alpha'))$ for each $\alpha' = \sqrt{\alpha_h \alpha_{h+1}}$. $\hat{C}(T(\alpha'))$ is the concordance measure obtained from test samples $\mathcal{L}^{(v)}$ for
161 | ISI WSC 2019