Page 170 - Contributed Paper Session (CPS) - Volume 7

CPS2055 Asanao S. et al.
               size tree is constructed. Various authors have proposed splitting criteria,
               which essentially fall into two types. One is the minimization of the risk
               within a node, and the other is the maximization of the degree of separation
               between nodes. For example, the log-rank test statistic is widely used
               (LeBlanc and Crowley (1993)). The maximum size tree obtained in the splitting
               step suffers from overfitting. To handle this problem, a set of nested
               subtrees is produced from the maximum size tree in the pruning step. In the
               selection step, the optimal size tree is selected by cross-validation or the
               bootstrap method.
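
As an illustration of a separation-type criterion, the following minimal sketch computes the two-sample log-rank statistic for a candidate split and scans the cut points of one covariate for the largest value. The function names (`logrank_statistic`, `best_split`) and the toy data are hypothetical and are not taken from the cited works; this is a sketch under those assumptions, not the authors' implementation.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for a candidate split.

    times   : observed times
    events  : 1 if the event was observed, 0 if censored
    in_left : True if the subject falls in the left child node
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk just before t (overall and in the left node).
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, in_left) if x >= t and g)
        # Numbers of events at t (overall and in the left node).
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, in_left)
                 if x == t and e == 1 and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_split(times, events, covariate):
    """Return (cut point, statistic) maximizing the log-rank statistic."""
    best_cut, best_stat = None, 0.0
    # The largest value is skipped because it would leave the right node empty.
    for cut in sorted(set(covariate))[:-1]:
        in_left = [z <= cut for z in covariate]
        stat = logrank_statistic(times, events, in_left)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy data (hypothetical): six subjects, all with observed events.
toy_times = [1, 2, 3, 10, 11, 12]
toy_events = [1, 1, 1, 1, 1, 1]
toy_covariate = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
cut, stat = best_split(toy_times, toy_events, toy_covariate)
print(cut, stat)
```

The exhaustive scan recomputes the risk sets for every cut point, which is adequate for illustration but would normally be replaced by a sorted, incremental update on larger data.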
                   In this study, we consider concordance probability-based splitting
               criteria for constructing a survival tree. The area under the receiver
               operating characteristic curve is widely used to evaluate the prediction
               accuracy of a model for a binary outcome, and it is related to Kendall's tau
               and the Mann-Whitney U test statistic. For survival data, this idea is
               carried over by the concordance probability, which is used to evaluate the
               prediction accuracy of a model. We use four measures of the concordance
               probability as splitting criteria: Harrell’s C (Harrell et al. (1996)),
               Uno’s approach (Uno et al. (2011)), Begg’s approach (Begg et al. (2000)),
               and Korn and Simon’s approach (Korn and Simon (1990)).
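
To make the concordance idea concrete, the sketch below computes Harrell's C in its usual pairwise form: among usable pairs (the subject with the shorter observed time has an observed event), it counts the fraction in which that subject also has the higher predicted risk. The function name `harrell_c`, the convention of counting tied risk scores as one half, and the skipping of pairs with tied observed times are assumptions made for illustration; the other three measures are not shown.

```python
def harrell_c(times, events, risk_scores):
    """Pairwise estimate of Harrell's C.

    times       : observed times
    events      : 1 if the event was observed, 0 if censored
    risk_scores : predicted risk (higher means earlier expected failure)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # the earlier subject of a usable pair must have an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # assumed convention for tied scores
    return concordant / usable if usable else float("nan")


# Toy data (hypothetical): the third subject is censored.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
print(harrell_c(toy_times, toy_events, [4, 3, 2, 1]))
print(harrell_c(toy_times, toy_events, [1, 2, 3, 4]))
```

When used as a splitting criterion, the risk score could be taken as a node-membership indicator for a candidate split, so that a split with C far from 0.5 separates the two child nodes well; that usage is a reading of the text rather than a confirmed detail of the method.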
                   Schmid et al. (2016) proposed using Harrell’s C as the splitting criterion
               to construct a random forest. In their approach, maximum size trees are
               constructed using Harrell’s C from bootstrap samples, and the trees are then
               aggregated into a forest. In this research, we propose pruning and selection
               methods to construct a tree model based on the concordance measures. We study
               the splitting performance of the criteria based on these measures, and
               compare the survival trees constructed by these criteria with those
               constructed by conventional criteria through simulations.
                   The remainder of this paper is organized as follows. In Section 2, we
               introduce the method to construct a survival tree based on the measures for
               concordance probabilities. In Section 3, the results of the simulation studies
               are described. Finally, in Section 4, we present the conclusions of this paper.

               2.  Methodology
               1.  Concordance probability
                   Let $T_i$ and $C_i$ be the true failure and censoring times for subject $i$,
               respectively. Then, we can observe the time $X_i = \min(T_i, C_i)$. Let
               $\delta_i = I(X_i = T_i)$ be the event indicator for subject $i$, which is 1 if
               the observation experiences an event and 0 if the observation is censored. Let
               $Z_i = (Z_{i1}, \cdots, Z_{ip})$ denote the $p$-dimensional covariate vector for
               subject $i$. Then, an observed sample is represented by
               $\mathcal{L} = \{(X_i, \delta_i, Z_i);\ i = 1, \cdots, n\}$.
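
The construction of the observed sample can be sketched as follows: a latent failure time and a latent censoring time are drawn for each subject, and only their minimum, the event indicator, and the covariate vector are kept. The function name `make_observed_sample`, the exponential failure and censoring distributions, and the standard normal covariates are all hypothetical choices for illustration and are not the simulation design of this paper.

```python
import random


def make_observed_sample(n, p, seed=0):
    """Generate n right-censored observations with p covariates each.

    Each element is (observed time, event indicator, covariate vector).
    Distributions below are illustrative assumptions only.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # covariate vector
        t = rng.expovariate(1.0)   # latent failure time
        c = rng.expovariate(0.5)   # latent censoring time
        x = min(t, c)              # observed time
        delta = 1 if t <= c else 0 # 1 = event observed, 0 = censored
        sample.append((x, delta, z))
    return sample


sample = make_observed_sample(n=5, p=2)
for x, delta, z in sample:
    print(round(x, 3), delta, [round(v, 3) for v in z])
```

Note that the latent times are discarded after the minimum is taken, which mirrors the fact that only the observed time and the indicator are available to the tree-building procedure.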
                                                  
                                              
                                                     
                                                                  157 | I S I   W S C   2 0 1 9