Page 324 - Contributed Paper Session (CPS) - Volume 2

CPS1874 Yiyao Chen et al.
The fraction before $\mathrm{TPR}(c)$ is unidentifiable from the observed test set data. For $c \in [0, 1]$, we let $\alpha_1(c)$ be the numerator of the unidentifiable fraction, $P(B(0) = 1 \mid R > c, B(1) = 1, D(1) = 1, Z = 1)$, which represents the probability of being biopsied in the training set for participants with risks larger than $c$ who would be biopsied in the test set and diagnosed with cancer. Similarly, we let $\alpha_2(c)$ denote $P(B(0) = 1 \mid R \le c, B(1) = 1, D(1) = 1, Z = 1)$. The $\mathrm{TPR}_{AB}(c)$ becomes

\[
\mathrm{TPR}_{AB}(c) = \frac{\alpha_1(c)\,\mathrm{TPR}(c)}{\alpha_1(c)\,\mathrm{TPR}(c) + \alpha_2(c)\,\bigl(1 - \mathrm{TPR}(c)\bigr)} .
\]
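The formula above can be explored numerically. The following is a minimal sketch, not the authors' code: the function name `tpr_ab`, its argument names, and the input values are all hypothetical and chosen only for illustration. It maps an observed true positive rate and the two training-set biopsy probabilities (risk above the threshold, and risk at or below it) to the rate on the always biopsied stratum.

```python
def tpr_ab(tpr: float, alpha1: float, alpha2: float) -> float:
    """Always-biopsied TPR from the observed TPR and two biopsy probabilities.

    alpha1: assumed P(biopsied in training | risk > c, ...)
    alpha2: assumed P(biopsied in training | risk <= c, ...)
    """
    for name, value in (("tpr", tpr), ("alpha1", alpha1), ("alpha2", alpha2)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    denom = alpha1 * tpr + alpha2 * (1.0 - tpr)
    if denom == 0.0:
        raise ValueError("denominator is zero; the rate is undefined")
    return alpha1 * tpr / denom


# Made-up inputs, purely illustrative.
observed_tpr = 0.6
print(tpr_ab(observed_tpr, 0.8, 0.8))  # equal probabilities: observed TPR is recovered
print(tpr_ab(observed_tpr, 0.8, 0.4))  # alpha1 > alpha2: adjusted rate exceeds observed
print(tpr_ab(observed_tpr, 0.8, 0.0))  # alpha2 = 0: adjusted rate reaches its maximum
```

Because the two probabilities are not identifiable from the test set, a function like this is best read as a sensitivity-analysis tool: one supplies plausible values and inspects how far the adjusted rate moves.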
                      We assume that regardless of cohort or strata membership, patients with
                  higher risks should have a greater chance of being biopsied than patients with
                  lower risks.
Assumption 2 For any $c \in [0,1]$, $\alpha_1(c) \ge \alpha_2(c)$.
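A short algebraic check (not part of the original derivation) shows how Assumption 2 orders the observed and the always-biopsied true positive rates. The shorthand is an assumption of this sketch: $T$ stands for the observed true positive rate at threshold $c$, $T_{AB}$ for its always-biopsied counterpart, and $a_1$, $a_2$ for the two training-set biopsy probabilities (risk above $c$, and risk at or below $c$), with the denominator assumed positive.

```latex
\begin{align*}
T_{AB} - T
  &= \frac{a_1 T}{a_1 T + a_2 (1 - T)} - T \\
  &= \frac{T\,\bigl(a_1 - a_1 T - a_2 (1 - T)\bigr)}{a_1 T + a_2 (1 - T)} \\
  &= \frac{T\,(1 - T)\,(a_1 - a_2)}{a_1 T + a_2 (1 - T)} \;\ge\; 0
  \quad \text{whenever } a_1 \ge a_2 .
\end{align*}
```

The difference vanishes when $T = 0$, $T = 1$, or $a_1 = a_2$, and is strictly positive when $0 < T < 1$ and $a_1 > a_2$.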
Note that as $c$ ranges from 0 to 1, $\mathrm{TPR}(c)$ and $\mathrm{TPR}_{AB}(c)$ range from 1 to 0, with $\mathrm{TPR}_{AB}(c) \ge \mathrm{TPR}(c)$ in the middle of the range of $c$; $\mathrm{TPR}_{AB}(c)$ increases as $\alpha_2(c)$ decreases, attaining its maximum value of 1 at $\alpha_2(c) = 0$. When $\alpha_1(c) = \alpha_2(c)$, i.e. the probabilities of being biopsied in training are the same for biopsied participants with risk larger than the threshold or less than or equal to the threshold, $\mathrm{TPR}_{AB}(c)$ equals $\mathrm{TPR}(c)$. For evaluation at multiple thresholds, we further assume the following.
Assumption 3 For any $c_1, c_2 \in [0,1]$ with $c_1 \ge c_2$, $\alpha_1(c_1) \ge \alpha_1(c_2)$ and $\alpha_2(c_1) \ge \alpha_2(c_2)$.
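In addition to true positive rates, validation also considers the false positive rate per threshold $c$, $\mathrm{FPR}(c)$, which evaluates the proportion of patients without cancer in the test set that tested positive: $\mathrm{FPR}(c) = P(R > c \mid B(1) = 1, D(1) = 0, Z = 1)$. Similar to the derivation for the $\mathrm{TPR}_{AB}(c)$, we set $\alpha_3(c) = P(B(0) = 1 \mid R > c, D(1) = 0, B(1) = 1, Z = 1)$ and $\alpha_4(c) = P(B(0) = 1 \mid R \le c, D(1) = 0, B(1) = 1, Z = 1)$ and define this rate on the always biopsied stratum as:

\[
\mathrm{FPR}_{AB}(c) = \frac{\alpha_3(c)\,\mathrm{FPR}(c)}{\alpha_3(c)\,\mathrm{FPR}(c) + \alpha_4(c)\,\bigl(1 - \mathrm{FPR}(c)\bigr)}, \qquad \alpha_3(c),\ \alpha_4(c),\ c \in [0,1].
\]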

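The area under the receiver operating characteristic curve (AUC) is calculated by integrating the curve of TPR on the y-axis versus FPR on the x-axis:

\[
\mathrm{AUC} = \int_0^1 \mathrm{TPR}\bigl(\mathrm{FPR}^{-1}(t)\bigr)\,dt = -\int_0^1 \mathrm{TPR}(c)\,\mathrm{FPR}'(c)\,dc,
\]

where $\mathrm{FPR}^{-1}(\cdot)$ is the inverse function of $\mathrm{FPR}$ and $\mathrm{FPR}'(\cdot)$ is the first derivative of $\mathrm{FPR}$; the minus sign appears because $\mathrm{FPR}(c)$ is non-increasing in $c$.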
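In practice the integral is usually approximated from a finite set of thresholds. The sketch below is one common way to do this, assuming empirical rates and the trapezoidal rule; it is not taken from the paper, and the function names and risk values are hypothetical.

```python
def roc_points(risks_cancer, risks_no_cancer, thresholds):
    """Empirical (FPR, TPR) pairs using the rule 'test positive if risk > c'."""
    points = []
    for c in thresholds:
        tpr = sum(r > c for r in risks_cancer) / len(risks_cancer)
        fpr = sum(r > c for r in risks_no_cancer) / len(risks_no_cancer)
        points.append((fpr, tpr))
    return points


def auc_trapezoid(points):
    """Trapezoidal area under TPR (y-axis) versus FPR (x-axis)."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up risks for biopsied patients with and without cancer.
risks_cancer = [0.9, 0.8, 0.6, 0.4]
risks_no_cancer = [0.7, 0.5, 0.3, 0.1]
grid = [i / 100 for i in range(101)]  # thresholds c = 0.00, 0.01, ..., 1.00

print(auc_trapezoid(roc_points(risks_cancer, risks_no_cancer, grid)))
```

The same two functions apply unchanged to stratum-adjusted rates: one would pass adjusted (FPR, TPR) pairs to `auc_trapezoid` instead of the empirical ones.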
                                          
                      95% confidence intervals for the observed as well as the principal stratum
                  measured TPRs and FPRs were calculated using asymptotic approximations,
                  and for all the AUCs using the bootstrap with 2000 samples.
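The text does not say which bootstrap variant was used. As a minimal sketch, the code below assumes a percentile interval with cases and non-cases resampled separately, applied to the plain rank-based AUC rather than the stratum-adjusted one; all names and data are hypothetical.

```python
import random


def auc_rank(risks_cancer, risks_no_cancer):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 1/2."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in risks_cancer for n in risks_no_cancer)
    return wins / (len(risks_cancer) * len(risks_no_cancer))


def bootstrap_auc_ci(risks_cancer, risks_no_cancer, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (assumed variant, see lead-in)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    stats = []
    for _ in range(n_boot):
        # Resample with replacement within each outcome group.
        boot_cancer = rng.choices(risks_cancer, k=len(risks_cancer))
        boot_no_cancer = rng.choices(risks_no_cancer, k=len(risks_no_cancer))
        stats.append(auc_rank(boot_cancer, boot_no_cancer))
    stats.sort()
    tail = (1.0 - level) / 2.0
    lo_idx = max(0, round(tail * n_boot))
    hi_idx = min(n_boot - 1, round((1.0 - tail) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Made-up risks, purely illustrative.
risks_cancer = [0.9, 0.8, 0.6, 0.4]
risks_no_cancer = [0.7, 0.5, 0.3, 0.1]
print(auc_rank(risks_cancer, risks_no_cancer))
print(bootstrap_auc_ci(risks_cancer, risks_no_cancer))
```

With samples this small the interval is very wide; the example is meant only to show the mechanics of drawing 2000 resamples and reading off the 2.5% and 97.5% quantiles.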

                                                                     313 | I S I   W S C   2 0 1 9