Page 104 - Contributed Paper Session (CPS) - Volume 3

CPS1954 Vincent C. et al.





                  for each child $i$, where $\tau_{i,0} = 0$ and $\tau_{i,K+1} = T$ for
                  convenience, and $I_E$ is an indicator function which takes the value 1 if
                  the event $E$ occurs and 0 otherwise. The first term in (2.6) is the density
                  of the even-numbered order statistics from $2K + 1$ points uniformly
                  distributed on $[0, T]$, used in Green (1995) to ensure probabilistically
                  that adequate spacing between internal knots is achieved. Although it
                  penalises short subintervals, the knots $\tau_i$ may still be concentrated
                  in regions where there is an abundance of data. We thus impose a hard
                  constraint via the second term in (2.6) so that there is an internal knot
                  within each of the $K$ subintervals of equal length on $[0, T]$.
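
To make the two terms of the prior concrete, the following is a minimal sketch of a log-density evaluation, not the authors' implementation. It assumes the notation $\tau$, $K$ and $T$ for the knots, their number and the upper end of the time range, assumes open subintervals for the hard constraint, and uses the normalising constant of the unconstrained order-statistics density from Green (1995). The function name `log_prior_knots` is hypothetical.

```python
import math


def log_prior_knots(tau, T):
    """Log of the knot prior for one child, under the assumptions stated above.

    tau : list of K internal knot locations on [0, T]
    T   : upper end of the time range
    """
    K = len(tau)
    # Hard constraint: the k-th knot must fall in the k-th of K subintervals
    # of equal length, otherwise the prior density is zero.
    for k, t in enumerate(tau, start=1):
        if not ((k - 1) * T / K < t < k * T / K):
            return -math.inf
    # Spacing term: even-numbered order statistics of 2K + 1 uniform points
    # on [0, T] have density proportional to the product of the K + 1 gaps,
    # with the boundary knots 0 and T included.
    padded = [0.0, *tau, T]
    log_gaps = sum(math.log(b - a) for a, b in zip(padded, padded[1:]))
    # Constant (2K + 1)! / T^(2K + 1); the truncation imposed by the hard
    # constraint is not renormalised in this sketch.
    log_const = math.lgamma(2 * K + 2) - (2 * K + 1) * math.log(T)
    return log_const + log_gaps
```

Short gaps drive the log-density down smoothly, while a knot outside its assigned subinterval gives a log-density of minus infinity, which is how the soft and hard parts of the prior differ in practice.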
                      We update $\tau_i$ one component at a time using the Metropolis-Hastings
                  algorithm with an independent proposal distribution within the Markov chain
                  Monte Carlo (MCMC) sampling scheme. The acceptance probability is



                  where $\tilde{\tau}_i^{(k)} = (\tau_{i,1}, \ldots, \tau_{i,k-1}, \tilde{\tau}_{i,k}, \tau_{i,k+1}, \ldots, \tau_{i,K})$
                  is a proposal vector of knot locations with $\tilde{\tau}_{i,k}$ uniformly
                  sampled from the subinterval $((k - 1)T/K, kT/K)$, and $\ell_i$ is the
                  log-likelihood for child $i$.
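
The component-wise update described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the notation $\tau$, $K$, $T$ is assumed, `log_lik` is a hypothetical callable standing in for the child's log-likelihood with all other model parameters held fixed, and the acceptance ratio is derived here from a uniform proposal (whose density cancels) combined with the gap-product form of the prior, so only the two gaps next to the updated knot enter the ratio.

```python
import math
import random


def mh_sweep(tau, T, log_lik, rng=random):
    """One component-wise Metropolis-Hastings sweep over one child's knots.

    `log_lik` is a hypothetical callable returning the log-likelihood for a
    given knot vector; `tau` must already satisfy the prior constraints.
    """
    tau = list(tau)
    K = len(tau)
    accepted = 0
    current_ll = log_lik(tau)
    for k in range(1, K + 1):
        old = tau[k - 1]
        left = tau[k - 2] if k > 1 else 0.0   # neighbouring knot or boundary 0
        right = tau[k] if k < K else T        # neighbouring knot or boundary T
        # Independent proposal: uniform on the k-th equal-length subinterval.
        new = rng.uniform((k - 1) * T / K, k * T / K)
        if not (left < new < right):
            continue  # zero prior density (floating-point edge case): reject
        proposal = tau[:k - 1] + [new] + tau[k:]
        proposal_ll = log_lik(proposal)
        # Log acceptance ratio: likelihood difference plus the change in the
        # two prior gaps adjacent to knot k. The proposal density is constant.
        log_ratio = (proposal_ll - current_ll
                     + math.log(right - new) + math.log(new - left)
                     - math.log(right - old) - math.log(old - left))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            tau, current_ll = proposal, proposal_ll
            accepted += 1
    return tau, accepted
```

Because each proposal is drawn from the knot's own subinterval, the hard constraint holds automatically for every proposed vector, and the indicator term never needs to be evaluated inside the loop.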
                       

                  3.   Result
                      Our  application  example  of  classifying  growth  curves  is  based  on  a
                  longitudinal study from the HBGDki project which analyses the prevalence of
                  rotavirus infections in a birth cohort in Vellore, India (Paul et al., 2014). The
                  sample population of 373 children is followed up for three years from birth
                  and anthropometric measurements are recorded. For the purpose of our
                  analysis, we focus only on HAZ up to one year of age, after removing
                  outliers (HAZ < -6 or HAZ > 6) based on WHO recommendations. There are 5
                  to 15 observations for each child, and the first measurement is taken
                  between day 1 and day 225. We convert the time scale to age in years and set
                  the number of random change points to $K = 3$. More sophisticated models can
                  be formulated by allowing $K$ to vary, for example by using the reversible
                  jump algorithm introduced in Green (1995). However, we fix the value of $K$
                  here as the number of measurements taken for each child is relatively small.
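
The data preparation steps above can be sketched in a few lines. This is a hedged illustration only: the record layout `(child_id, age_days, haz)`, the function name `prepare_haz`, and the conversion factor of 365.25 days per year are assumptions, not details taken from the study.

```python
from collections import defaultdict

DAYS_PER_YEAR = 365.25  # assumed conversion factor from days to years


def prepare_haz(records, max_age_years=1.0, haz_limit=6.0):
    """Group HAZ measurements by child, keeping the first year of life only.

    records : iterable of (child_id, age_days, haz) tuples (assumed layout)
    Returns a dict mapping child_id to a time-ordered list of (age_years, haz).
    """
    by_child = defaultdict(list)
    for child_id, age_days, haz in records:
        age_years = age_days / DAYS_PER_YEAR
        if age_years > max_age_years:
            continue  # outside the one-year analysis window
        if haz < -haz_limit or haz > haz_limit:
            continue  # outlier under the HAZ < -6 or HAZ > 6 rule
        by_child[child_id].append((age_years, haz))
    for observations in by_child.values():
        observations.sort()  # order each child's measurements by age
    return dict(by_child)
```

For example, a record at day 400 or with HAZ of -6.5 would be dropped, while a record at day 10 with HAZ of -1.2 would be kept with its age expressed in years.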
                      Figure 3.1a shows a random sample of raw trajectories in each subgroup
                  obtained from the classification model, while their respective posterior mean
                  curves  are  given  in  Figure  3.1b.  Eight  different  subgroups  of  children  are
                  identified in the dataset. The largest subgroup, which accounts for more than
                  half of the child population, shows a constant faltering pattern throughout
                  the first year of the observational period. Subgroups 2 and 6 experience severe

                                                                      93 | ISI WSC 2019