Page 323 - Contributed Paper Session (CPS) - Volume 2

CPS1874 Yiyao Chen et al.
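                Assumption 1. Let $\mathbf{Z} = (Z_1, Z_2, \ldots, Z_i, \ldots, Z_n)^\top$ and
            $\mathbf{Z}' = (Z'_1, Z'_2, \ldots, Z'_i, \ldots, Z'_n)^\top$ be two vectors of group
            assignments. If $Z_i = Z'_i$, then $B_i(\mathbf{Z}) = B_i(\mathbf{Z}')$ and
            $D_i(\mathbf{Z}) = D_i(\mathbf{Z}')$.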
                We now drop the subscript $i$ from the biopsy and outcome indicators to
            avoid redundant notation and write $B(z)$ and $D(z)$ for the biopsy and outcome
            indicators under group assignment $z$, with $z = 0$ for the training set and
            $z = 1$ for the test set. We focus on the joint potential biopsy outcomes
            $(B(0), B(1))$ for each patient, of which only one can ever be observed, and
            which stratify patients into four categories. The first is the always biopsy
            stratum $(1,1)$ of patients who would have undergone biopsy no matter which
            cohort, training or test, they belonged to. Patients in this stratum are of
            interest for assessing reproducibility of the risk tool developed on the
            training set. The next stratum of interest, $(1,0)$, represents patients who
            would be biopsied in the training set but not in the test set, and who would
            thus be useful for assessing the generalizability of the risk tool developed
            on the training set. The remaining two strata, $(0,1)$ and $(0,0)$, comprise
            patients who would not have been biopsied in the training set and so would
            not have contributed to the risk tool whose validation is of interest; they
            are not considered here. For this report, we focus on the always biopsy
            stratum to assess reproducibility of the tool.
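
            The stratification above can be illustrated with a minimal Python sketch. The
            labels for the strata, the helper names and the example patients are
            hypothetical and chosen only for illustration; `b0` and `b1` stand for the
            potential biopsy indicators under training and test assignment, and `z` for
            the cohort actually observed.

```python
from collections import Counter

# Joint potential biopsy outcomes (b0, b1):
#   b0 = biopsy indicator if the patient were in the training cohort (z = 0)
#   b1 = biopsy indicator if the patient were in the test cohort (z = 1)
# The label strings are illustrative, not taken from the paper.
STRATA = {
    (1, 1): "always biopsy",
    (1, 0): "training-only biopsy",
    (0, 1): "test-only biopsy",
    (0, 0): "never biopsy",
}


def stratum(b0, b1):
    """Return the stratum label for a pair of potential biopsy outcomes."""
    return STRATA[(b0, b1)]


def observed_biopsy(b0, b1, z):
    """Only the potential outcome matching the actual cohort z is observed."""
    return b1 if z == 1 else b0


# Hypothetical patients given as (b0, b1, z); in real data b0 and b1 are
# never both known for the same patient.
patients = [(1, 1, 1), (1, 0, 0), (0, 1, 1), (0, 0, 0), (1, 1, 0)]

counts = Counter(stratum(b0, b1) for b0, b1, _ in patients)
observed = [observed_biopsy(b0, b1, z) for b0, b1, z in patients]
```

            Patients from different strata can show the same observed biopsy indicator,
            which is why stratum membership cannot be read off the observed data.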
                With $R$ denoting $R(X)$ to simplify notation, the true positive rate at
            threshold $t \in (0,1)$ evaluated on the test set is given by
            $$\mathrm{TPR}_{\text{test}}(t) = P(R > t \mid B(1) = 1, D(1) = 1, Z = 1).$$
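
            As an illustration, the test-set true positive rate can be estimated by the
            proportion of biopsied cases whose risk exceeds the threshold. The sketch
            below assumes plain Python lists of risks, biopsy indicators and outcomes for
            test-set patients; the function name and the data are hypothetical.

```python
def empirical_tpr(risks, biopsied, outcomes, threshold):
    """Proportion of biopsied test-set cases (outcome = 1) with risk > threshold.

    Patients without biopsy have no verified outcome and are skipped.
    """
    case_risks = [
        r for r, b, d in zip(risks, biopsied, outcomes) if b == 1 and d == 1
    ]
    if not case_risks:
        raise ValueError("no biopsied cases available in the test set")
    return sum(r > threshold for r in case_risks) / len(case_risks)


# Hypothetical test-set data; the outcome is None when no biopsy was done.
risks = [0.10, 0.35, 0.60, 0.80, 0.25, 0.70]
biopsied = [1, 1, 1, 1, 0, 1]
outcomes = [0, 1, 1, 1, None, 0]

tpr = empirical_tpr(risks, biopsied, outcomes, threshold=0.5)
```

            Here three biopsied patients are cases and two of them have a risk above 0.5,
            so the estimate is 2/3.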
                The $\mathrm{TPR}_{\text{test}}(t)$ is fully observable as it is measured on the
            biopsied participants in the test set. The test set, however, comprises a
            mixture, in an unknown ratio, of patients exchangeable with those in the
            training set and patients divergent from the training set. It is this lack
            of clarity that leads to the differences in operating characteristics of
            single risk tools across populations and to subsequent confusion in the
            literature. Therefore, we recommend, as an additional pure estimate of
            reproducibility, the TPR on the always biopsied stratum:
            $$\mathrm{TPR}_{(1,1)}(t) = P(R > t \mid B(1) = 1, B(0) = 1, D(1) = 1, Z = 1).$$
                The $\mathrm{TPR}_{(1,1)}(t)$ is not estimable from the observed data and not
            identifiable without additional restrictions. In the following we propose the
            simplest of such restrictions as well as sensitivity analyses over plausible
            violations of the restrictions. The conditional probability representing
            $\mathrm{TPR}_{(1,1)}(t)$ can be written as a fraction with numerator and
            denominator equal to $P(R \in A, B(1) = 1, B(0) = 1, D(1) = 1, Z = 1)$ for
            $A = (t, 1)$, that is $R > t$, and $A = (0, 1)$, respectively. Each of these
            probabilities can be decomposed as
            $P(B(0) = 1 \mid R \in A, B(1) = 1, D(1) = 1, Z = 1) \times P(R \in A \mid B(1) = 1, D(1) = 1, Z = 1) \times P(B(1) = 1, D(1) = 1, Z = 1)$,
            where the last term cancels from the numerator and denominator and the middle
            term equals one for $A = (0, 1)$, leaving
            $$\mathrm{TPR}_{(1,1)}(t) = \frac{P(B(0) = 1 \mid R > t, B(1) = 1, D(1) = 1, Z = 1)}{P(B(0) = 1 \mid B(1) = 1, D(1) = 1, Z = 1)} \, \mathrm{TPR}_{\text{test}}(t).$$
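
            The ratio identity can be checked numerically. The following is a minimal
            sketch under the assumption that the potential training-set biopsy indicator
            were known for every biopsied case in the test set, which is never true in
            practice; the function name and the data are hypothetical.

```python
def always_biopsy_tpr(tpr_test, p_b0_given_high_risk, p_b0_overall):
    """TPR on the always biopsy stratum via the ratio identity.

    p_b0_given_high_risk: P(B(0) = 1 | R > t, B(1) = 1, D(1) = 1, Z = 1)
    p_b0_overall:         P(B(0) = 1 | B(1) = 1, D(1) = 1, Z = 1)
    """
    if not 0.0 < p_b0_overall <= 1.0:
        raise ValueError("p_b0_overall must lie in (0, 1]")
    return p_b0_given_high_risk / p_b0_overall * tpr_test


# Hypothetical biopsied cases in the test set as (risk, b0), where b0 is the
# normally unobservable biopsy indicator under training-set assignment.
cases = [(0.9, 1), (0.8, 1), (0.7, 1), (0.4, 1), (0.3, 0), (0.2, 0)]
t = 0.5

high_risk_b0 = [b0 for r, b0 in cases if r > t]
tpr_test = len(high_risk_b0) / len(cases)
p_b0_given_high_risk = sum(high_risk_b0) / len(high_risk_b0)
p_b0_overall = sum(b0 for _, b0 in cases) / len(cases)

via_identity = always_biopsy_tpr(tpr_test, p_b0_given_high_risk, p_b0_overall)

# Direct computation restricted to the always biopsy stratum (b0 = 1).
always_risks = [r for r, b0 in cases if b0 == 1]
direct = sum(r > t for r in always_risks) / len(always_risks)
```

            In this example the test-set TPR is 0.5, while both the identity and the
            direct computation give 0.75 for the always biopsy stratum. With real data
            the two probabilities involving the training-set biopsy indicator are
            unknown, so they would have to be fixed by a restriction or varied in a
            sensitivity analysis.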


                                                               312 | I S I   W S C   2 0 1 9