Page 173 - Contributed Paper Session (CPS) - Volume 7
CPS2055 Asanao S. et al.
used splitting rule consists of a single covariate $x_j$ ($j = 1, \dots, p$). If
$x_j$ is a quantitative variable, then the rule becomes ``$x_j \le c$?'', where $c$
is a threshold. If $x_j$ is a categorical variable with the set of possible values
$\mathcal{F}$, then the rule becomes ``$x_j \in F$?'', where $F \subset \mathcal{F}$.
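The two kinds of single-covariate rule described above (a threshold for a quantitative covariate, a subset of levels for a categorical one) can be enumerated as in the following minimal sketch. The function name `candidate_splits`, the use of midpoints between distinct observed values as thresholds, and the convention of keeping the first level out of the subset (so that a subset and its complement are not both listed) are illustrative assumptions, not details taken from the paper.

```python
from itertools import combinations


def candidate_splits(values, categorical=False):
    """Enumerate candidate splitting rules for one covariate (hypothetical helper).

    Quantitative covariate: returns thresholds c for the rule "x <= c",
    taken here as midpoints between consecutive distinct observed values.
    Categorical covariate: returns subsets F for the rule "x in F".
    """
    distinct = sorted(set(values))
    if not categorical:
        return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    # Keep the first level out of F so each two-way partition appears once.
    rest = distinct[1:]
    return [
        frozenset(subset)
        for size in range(1, len(rest) + 1)
        for subset in combinations(rest, size)
    ]


thresholds = candidate_splits([1, 2, 2, 4])
subsets = candidate_splits(["a", "b", "c", "a"], categorical=True)
print(thresholds)
print(subsets)
```

With k distinct levels, this enumeration yields 2^(k-1) - 1 candidate subsets, which grows quickly, so it is only practical for covariates with few levels.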
The CART algorithm for constructing the tree-structured model comprises
splitting, pruning, and selection. In the splitting step, the covariate space is
recursively divided based on the optimal splitting rules, and the maximum-size
tree is constructed. To determine the optimal splitting rule of a node $h$ into
$h_L$ and $h_R$, we evaluate all the possible splitting rules for $h$. In order to
build the model with measures for concordance probability, we make the following
assumptions when dichotomizing the node $h$: $h_L$ has higher risk than $h_R$; we
evaluate the concordance probabilities from $t = 0$ to
$t = \max\{T_i ; \delta_i = 1, i \in h\}$; and a pair $(i, j)$ whose two samples
fall in the same child node contributes 0.5 to the estimate of $C$.
Under these assumptions, the splitting criteria based on the measures
$\hat{C}_H$, $\hat{C}_U$, and $\hat{C}_B$ are given as follows:
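(i) The criterion based on Harrell’s C
\[
\hat{C}_H = \frac{\sum_{i \in h_L} \sum_{j \in h_R} \delta_i I(T_i < T_j)
  + 0.5 \left\{ \sum_{i,j \in h_L} \delta_i I(T_i < T_j)
  + \sum_{i,j \in h_R} \delta_i I(T_i < T_j) \right\}}
  {\sum_{i,j \in h} \delta_i I(T_i < T_j)}
\]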
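As a rough illustration of criterion (i), the sketch below counts usable pairs (the earlier time is an observed event) in a node and scores them 1 when the earlier event lies in the child assumed to have higher risk and the later time lies in the other child, 0.5 when both samples lie in the same child, and 0 otherwise. The function name and the argument names `times`, `events`, and `in_left` are hypothetical, and tied times are simply treated as unusable pairs, which is an assumption rather than something the text specifies.

```python
def harrell_split_criterion(times, events, in_left):
    """Harrell-type concordance for a two-way split (illustrative sketch).

    times   : observed times of the samples in the node
    events  : 1 if the time is an event, 0 if censored
    in_left : True if the sample goes to the child assumed to have higher risk
    """
    numerator = 0.0
    denominator = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time of a usable pair must be an event
        for j in range(n):
            if i == j or not times[i] < times[j]:
                continue
            denominator += 1.0
            if in_left[i] and not in_left[j]:
                numerator += 1.0  # concordant: higher-risk child fails first
            elif in_left[i] == in_left[j]:
                numerator += 0.5  # same child: tied predicted risk
    return numerator / denominator if denominator > 0 else float("nan")


value = harrell_split_criterion(
    times=[1, 2, 3, 4],
    events=[1, 1, 1, 1],
    in_left=[True, True, False, False],
)
print(value)
```

In a splitting step, one would evaluate such a function for every candidate rule and keep the rule with the largest value; swapping the roles of the two children changes the value, so the labelling of the higher-risk child matters.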
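(ii) The criterion based on Uno’s approach
\[
\hat{C}_U = \frac{\sum_{i \in h_L} \sum_{j \in h_R} \delta_i \{\hat{G}(T_i)\}^{-2} I(T_i < T_j)
  + 0.5 \left\{ \sum_{i,j \in h_L} \delta_i \{\hat{G}(T_i)\}^{-2} I(T_i < T_j)
  + \sum_{i,j \in h_R} \delta_i \{\hat{G}(T_i)\}^{-2} I(T_i < T_j) \right\}}
  {\sum_{i,j \in h} \delta_i \{\hat{G}(T_i)\}^{-2} I(T_i < T_j)}
\]
where $\hat{G}(\cdot)$ is the Kaplan-Meier estimator for the censoring distribution
based on the samples included in $h$.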
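Criterion (ii) can be sketched in the same way by weighting each usable pair with the inverse squared Kaplan-Meier estimate of the censoring distribution at the earlier event time. The helper names `km_censoring` and `uno_split_criterion` and their arguments are hypothetical. The sketch evaluates the estimator at the event time itself and assumes that no censoring time coincides with an event time; how such ties should be handled is not stated in the text.

```python
def km_censoring(times, events):
    """Kaplan-Meier estimate of the censoring survival function (sketch).

    Censored observations (events == 0) play the role of "events" here.
    Returns a function G(t) giving the estimate at time t.
    """
    censor_times = sorted({t for t, d in zip(times, events) if d == 0})
    steps = []
    surv = 1.0
    for u in censor_times:
        at_risk = sum(1 for t in times if t >= u)
        censored = sum(1 for t, d in zip(times, events) if t == u and d == 0)
        surv *= 1.0 - censored / at_risk
        steps.append((u, surv))

    def G(t):
        value = 1.0
        for u, s in steps:
            if u > t:
                break
            value = s
        return value

    return G


def uno_split_criterion(times, events, in_left):
    """Uno-type weighted concordance for a two-way split (illustrative sketch)."""
    G = km_censoring(times, events)
    numerator = 0.0
    denominator = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if i == j or not times[i] < times[j]:
                continue
            weight = G(times[i]) ** -2  # inverse probability of censoring weight
            denominator += weight
            if in_left[i] and not in_left[j]:
                numerator += weight
            elif in_left[i] == in_left[j]:
                numerator += 0.5 * weight
    return numerator / denominator if denominator > 0 else float("nan")


value = uno_split_criterion(
    times=[1, 2, 3, 4],
    events=[1, 0, 1, 1],
    in_left=[True, True, False, False],
)
print(value)
```

When no observation is censored, every weight equals 1 and the value reduces to the unweighted pair count used in the Harrell-type criterion.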
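(iii) The criterion based on Begg’s approach
\[
\hat{C}_B = \frac{2}{n_h (n_h - 1)} \left\{ \sum_{i \in h_L} \sum_{j \in h_R} q_{ij}
  + \frac{n_{h_L} (n_{h_L} - 1) + n_{h_R} (n_{h_R} - 1)}{4} \right\},
\]
where $n_h$ is the number of samples included in the node $h$. $q_{ij}$ is defined as
follows: If $\delta_i = \delta_j = 1$, then
\[
q_{ij} = \begin{cases} 0, & T_i > T_j \\ 1, & T_i < T_j. \end{cases}
\]
If $\delta_i = 0$, $\delta_j = 1$, then
If $\delta_i = 1$, $\delta_j = 0$, then
If $\delta_i = \delta_j = 0$, then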
160 | ISI WSC 2019