Page 44 - Special Topic Session (STS) - Volume 1

STS353 H. Zhao et al.
                  center may share a similar medical environment, so their failure times
                  may be correlated, with each center serving as a cluster. Furthermore,
                  the cluster size, that is, the number of subjects from a center, can differ
                  from one center to another and may carry relevant information about the
                  failure time of interest. Similar data can arise in a dental study that
                  records all teeth of an individual (Zhang and Sun, 2010); such an example
                  is discussed in detail below.
                     For the analysis of clustered failure time data, a commonly used approach
                  is the marginal model approach, in which estimation is usually carried out
                  with generalized estimating equation (GEE)-based procedures. One major
                  advantage of these methods is their robustness against misspecification
                  of the correlation structure; they are also relatively easy to use, since one
                  can leave the association structure arbitrary (Williamson et al., 2003). On
                  the other hand, such methods can be less efficient and, more importantly,
                  they cannot easily take informative cluster size into account. To address
                  these issues, we present a within-cluster resampling (WCR) method for the
                  case where the failure time of interest follows a class of linear
                  transformation models (Fine et al., 1998; Zhang et al., 2005). One advantage
                  of these models is their flexibility, as they include many commonly used
                  models, such as the proportional hazards model and the proportional odds
                  model, as special cases. The WCR method is a cluster-based approach that
                  uses a single observation to represent each cluster (Hoffman et al., 2001;
                  Cong et al., 2007; Chen et al., 2016; Chen et al., 2017). Like the GEE-based
                  methods, the new method is easy to implement and leaves the correlation
                  structure arbitrary, while remaining valid when the cluster size is
                  informative.
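
                  The resampling idea can be illustrated with a minimal sketch. It is not
                  the estimation procedure of this paper: it only shows how one observation
                  is drawn from each cluster and how estimates from repeated draws are
                  averaged. The function names, the record layout, and the `fit` callable
                  are hypothetical and introduced purely for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def draw_one_per_cluster(records, rng):
    """Return one randomly chosen observation from each cluster.

    `records` is a list of (cluster_id, observation) pairs (assumed layout).
    """
    by_cluster = defaultdict(list)
    for cluster_id, obs in records:
        by_cluster[cluster_id].append(obs)
    # Every cluster contributes exactly one observation, so large clusters
    # do not dominate the resampled data set.
    return [rng.choice(members) for members in by_cluster.values()]


def wcr_estimate(records, fit, n_resamples=200, seed=0):
    """Average a scalar estimate over repeated within-cluster resamples.

    `fit` is a hypothetical user-supplied function that maps a list of
    independent observations to a single number.
    """
    rng = random.Random(seed)
    estimates = [fit(draw_one_per_cluster(records, rng))
                 for _ in range(n_resamples)]
    return mean(estimates)


if __name__ == "__main__":
    # Toy data: cluster "a" has three subjects, cluster "b" has one.
    toy = [("a", 2.1), ("a", 1.7), ("a", 2.4), ("b", 5.0)]
    print(wcr_estimate(toy, fit=lambda xs: sum(xs) / len(xs)))
```

                  In a real analysis, `fit` would be replaced by an estimator for
                  independent failure time data, applied to each resampled data set.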

                  2.  Methodology
                     Consider a failure time study consisting of $n$ clusters and $n_i$ subjects
                  within cluster $i$. For subject $j$ in cluster $i$, let $T_{ij}$ denote the
                  failure time of interest and suppose that there exists a $p$-dimensional
                  vector of categorical covariates denoted by $Z_{ij}$, $j = 1, \ldots, n_i$,
                  $i = 1, \ldots, n$. Some comments on the covariates will be given below. Let
                  $N = n_1 + \cdots + n_n$ and assume that $T_{ij}$ follows the linear
                  transformation model given by

                       $$ H_0(T_{ij}) = \beta_0^{\top} Z_{ij} + \varepsilon_{ij} \qquad (1) $$

                  In the above, $H_0(\cdot)$ denotes an unknown strictly increasing function,
                  $\beta_0$ is a vector of unknown regression parameters, and
                  $\varepsilon_{ij}$ denotes a random error assumed to have a completely known
                  distribution function $F$. An advantage of the model above is its
                  flexibility, as it includes some commonly used models as special cases. For
                  example, it gives the Cox model if $F(t) = 1 -$
                                                                      33 | ISI WSC 2019