Page 174 - Contributed Paper Session (CPS) - Volume 7

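CPS2055 Asanao S. et al.

$$
\hat{p}_{ij} = \Pr(T_i < T_j \mid T_i > x_i,\ T_j > x_j) =
\begin{cases}
\dfrac{\hat{S}_t(x_i)}{2\hat{S}_t(x_j)}, & x_i > x_j, \\[2ex]
1 - \dfrac{\hat{S}_t(x_j)}{2\hat{S}_t(x_i)}, & x_i < x_j.
\end{cases}
$$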
$\hat{S}_t(\cdot)$ is the Kaplan-Meier estimator for the failure distribution based on the samples included in node $t$.
               (iv) The criterion based on Korn and Simon’s approach
$$
\hat{C}_{KS} = \frac{2}{N(N-1)} \left\{ \sum_{i \in t_L} \sum_{j \in t_R} p_{ij} + \frac{N_L(N_L-1) + N_R(N_R-1)}{4} \right\},
$$
               where
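$$
p_{ij} = \widehat{\Pr}(T_i < T_j) = \sum_{k \in D} \left[1 - \hat{S}_{t_L}(t_k^{*-})\right]\left[\hat{S}_{t_R}(t_k^{*-}) - \hat{S}_{t_R}(t_k^{*})\right] + \left[1 - \hat{S}_{t_L}(\tau^{-})\right]\hat{S}_{t_R}(\tau).
$$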
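The pairwise probabilities and their combination into a split-level measure can be illustrated numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes that `surv` is any callable survival function shared by both subjects (a Kaplan-Meier step function could be substituted), that both subjects in `censored_pair_prob` are censored at distinct times `x_i` and `x_j`, and that the left-by-right pair probabilities are supplied to `split_concordance` as a nested list `p`. All function names, argument names, and the exponential survival function are illustrative assumptions.

```python
import math


def censored_pair_prob(surv, x_i, x_j):
    """Pr(T_i < T_j | T_i > x_i, T_j > x_j) when both times follow `surv`.

    Assumes x_i != x_j; ties are not handled in this sketch.
    """
    if x_i > x_j:
        return surv(x_i) / (2.0 * surv(x_j))
    return 1.0 - surv(x_j) / (2.0 * surv(x_i))


def split_concordance(p):
    """Combine left-by-right pair probabilities into a split-level measure.

    p[i][j] is the probability for subject i in the left node and
    subject j in the right node; each within-node pair contributes 0.5.
    """
    n_left, n_right = len(p), len(p[0])
    n = n_left + n_right
    between = sum(sum(row) for row in p)
    within = (n_left * (n_left - 1) + n_right * (n_right - 1)) / 4.0
    return 2.0 / (n * (n - 1)) * (between + within)


# Hypothetical exponential survival function, used only for illustration.
def surv(t):
    return math.exp(-t)


p_ij = censored_pair_prob(surv, 2.0, 1.0)
p_ji = censored_pair_prob(surv, 1.0, 2.0)
```

In this sketch `p_ij + p_ji` equals 1, and a split whose cross-node probabilities are all 0.5 yields a measure of 0.5, which is the value expected for an uninformative split.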
               2.3 Pruning and Selection
The maximum-size tree $T_0$ is obtained by recursive splitting in the splitting step, after which an optimal-size tree is constructed from $T_0$ in the pruning and selection steps. In the pruning step, the nested subtrees $T_H \prec T_{H-1} \prec \cdots \prec T_1 \prec T_0$ are obtained by recursively removing nodes from $T_0$, where $T_H$ is the tree that has the root node only. For this purpose, we propose the concordance-complexity measure

$$
C_\alpha(T_h) = \hat{C}(T_h) - \alpha |\tilde{T}_h|,
$$

where $\hat{C}(T_h)$ is the concordance measure for the subtree $T_h$ ($h = 0, 1, \cdots, H$).
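For example, for one of the criteria $\hat{C}$ used in the splitting step, $\hat{C}(T_h)$ is given by

$$
\hat{C}(T_h) = \frac{\sum_{(t_a, t_b) \in \tilde{T}_h} \max\left\{\sum_{i \in t_a}\sum_{j \in t_b} I(x_i < x_j),\ \sum_{i \in t_a}\sum_{j \in t_b} I(x_i > x_j)\right\} + 0.5\left\{\sum_{t \in \tilde{T}_h}\sum_{i,j \in t} I(x_i < x_j)\right\}}{\sum_{i,j} I(x_i < x_j)},
$$

where $\tilde{T}_h$ represents the set of terminal nodes in $T_h$, and $|\tilde{T}_h|$ is the number of terminal nodes in $T_h$. $\sum_{(t_a, t_b) \in \tilde{T}_h}$ denotes the sum over all pairs of terminal nodes in $T_h$.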
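The tree-level concordance described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions rather than the authors' code: every observed time is treated as an uncensored failure time (censoring indicators are ignored), terminal-node membership is given as a label per subject, and the names `tree_concordance`, `times`, and `leaves` are hypothetical.

```python
from itertools import combinations


def tree_concordance(times, leaves):
    """Concordance of a tree from observed times and terminal-node labels.

    Assumption for this sketch: all times are uncensored, and at least
    one pair of subjects has distinct times (otherwise the denominator
    is zero).
    """
    nodes = {}
    for x, leaf in zip(times, leaves):
        nodes.setdefault(leaf, []).append(x)

    # Between-node term: for each unordered pair of terminal nodes,
    # take the more favourable ordering of the two nodes.
    between = 0
    for a, b in combinations(nodes, 2):
        less = sum(1 for xi in nodes[a] for xj in nodes[b] if xi < xj)
        greater = sum(1 for xi in nodes[a] for xj in nodes[b] if xi > xj)
        between += max(less, greater)

    # Within-node term: ordered pairs inside one node, weighted by 0.5 below.
    within = sum(
        1 for xs in nodes.values() for xi in xs for xj in xs if xi < xj
    )

    # Denominator: all ordered pairs (i, j) with x_i < x_j.
    total = sum(1 for xi in times for xj in times if xi < xj)
    return (between + 0.5 * within) / total


times = [1.0, 2.0, 3.0, 4.0]
c_root = tree_concordance(times, [0, 0, 0, 0])   # root-only tree
c_split = tree_concordance(times, [0, 0, 1, 1])  # one informative split
c_full = tree_concordance(times, [0, 1, 2, 3])   # one subject per node
```

With these toy inputs the root-only tree scores 0.5, and the score rises as the terminal nodes separate the ordered times more finely, which is why a penalty on the number of terminal nodes is needed when comparing subtrees.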
The optimal tree for an arbitrary $\alpha$ is defined as the subtree that maximizes $C_\alpha(T_h)$. If $\alpha = 0$, the optimal subtree is $T_0$. On the other hand, as $\alpha$ approaches $\infty$, a model with no tree structure (the root node only) is selected as the optimal subtree. Therefore, by gradually increasing $\alpha$ from 0, we can obtain a set of optimal subtrees.
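The effect of the penalty can be seen in a small sketch. The Python snippet below is an illustration only: the concordance values and terminal-node counts are hypothetical numbers for a nested sequence of four subtrees, and `best_subtree` is an illustrative name. It picks the index h that maximizes the concordance minus the penalty times the number of terminal nodes.

```python
def best_subtree(concordances, n_terminal, alpha):
    """Return the index h maximising C_hat(T_h) - alpha * |terminal nodes of T_h|.

    Ties are resolved in favour of the smallest index (the larger tree).
    """
    scores = [c - alpha * k for c, k in zip(concordances, n_terminal)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical nested sequence: index 0 is the largest tree, index 3 is root only.
c_hat = [0.80, 0.78, 0.70, 0.50]
n_leaves = [8, 4, 2, 1]

# Increasing the penalty moves the optimum towards smaller subtrees.
path = [best_subtree(c_hat, n_leaves, a) for a in (0.0, 0.01, 0.05, 0.5)]
```

For these made-up values the selected index moves from the largest tree at a penalty of 0 to the root-only tree at a penalty of 0.5, passing through the intermediate subtrees in order.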
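In the selection step, we select a subtree from $T_0, T_1, \ldots, T_H$ based on V-fold cross-validation. First, we construct $T^{(v)}_1, T^{(v)}_2, \cdots$ based on $\mathcal{L} - \mathcal{L}^{(v)}$ for each $v$ ($v = 1, \cdots, V$). Then, for each $\alpha'_h = \sqrt{\alpha_h \alpha_{h+1}}$, we select from $T^{(v)}_1, T^{(v)}_2, \cdots$ the best subtree $T^{(v)}(\alpha'_h)$, which maximizes the value of $\hat{C}(T^{(v)}(\alpha'_h))$. $\hat{C}(T^{(v)}(\alpha'_h))$ is the concordance measure obtained from test samples $\mathcal{L}^{(v)}$ for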
161 | ISI WSC 2019