Page 104 - Contributed Paper Session (CPS) - Volume 3
CPS1954 Vincent C. et al.
for $i = 1, \ldots, n$, where $\xi_{i,0} = 0$ and $\xi_{i,K+1} = T$ for convenience and $I(E)$ is an
indicator function which takes value 1 if the event $E$ occurs and 0 otherwise.
The first term in (2.6) is the density of the even-numbered order statistics from $2K + 1$
points uniformly distributed on $[0, T]$, used in Green (1995) to ensure
that adequate spacing between internal knots is achieved probabilistically.
Although this term penalises short subintervals, the knots might still be
concentrated in regions where there is an abundance of data. We thus impose
a hard constraint via the second term in (2.6) so that there is an internal knot
within each subinterval of equal length on $[0, T]$.
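The two terms of the prior can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `log_knot_prior`, `knots` and `T` are hypothetical, the first term uses the standard density of the even-numbered order statistics of 2K + 1 uniform points from Green (1995), and the hard constraint is read as requiring the k-th knot to lie in the k-th of K equal subintervals of [0, T]. Once the constraint is applied the value is an unnormalised log-density.

```python
import math


def log_knot_prior(knots, T):
    """Unnormalised log prior for K internal knots on [0, T] (illustrative sketch)."""
    K = len(knots)
    # Hard constraint (assumed form): knot k must lie in the k-th of K
    # equal-length subintervals, ((k - 1) T / K, k T / K).
    for k, xi in enumerate(knots, start=1):
        if not ((k - 1) * T / K < xi < k * T / K):
            return -math.inf
    # Even-numbered order statistics of 2K + 1 uniforms on [0, T]:
    # (2K + 1)! / T^(2K + 1) * prod_{k=0}^{K} (xi_{k+1} - xi_k),
    # with xi_0 = 0 and xi_{K+1} = T.
    padded = [0.0, *knots, T]
    log_p = math.lgamma(2 * K + 2) - (2 * K + 1) * math.log(T)
    for lo, hi in zip(padded, padded[1:]):
        log_p += math.log(hi - lo)
    return log_p
```

Under this sketch, evenly spread knots receive a higher prior value than knots bunched together, and any configuration that leaves a subinterval empty is given zero prior mass.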
We update one component at a time using the Metropolis-Hastings
algorithm with an independent proposal distribution within the Markov chain
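Monte Carlo (MCMC) sampling scheme. The acceptance probability is
$$\alpha = \min\left\{1,\ \exp\left\{\ell_i\big(\tilde{\xi}_i^{(k)}\big) - \ell_i(\xi_i)\right\} \frac{p\big(\tilde{\xi}_i^{(k)}\big)}{p(\xi_i)}\right\},$$
where $\tilde{\xi}_i^{(k)} = (\xi_{i,1}, \ldots, \xi_{i,k-1}, \tilde{\xi}_{i,k}, \xi_{i,k+1}, \ldots, \xi_{i,K})$ is a proposal vector of knot
locations with $\tilde{\xi}_{i,k}$ uniformly sampled from the subinterval $((k - 1)T/K,\ kT/K)$,
$p(\cdot)$ is the knot prior in (2.6), and $\ell_i$ is the log-likelihood for child $i$.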
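A minimal Python sketch of one such component-wise update is given here for illustration. It is not the authors' implementation: the function names, the argument `loglik` (a user-supplied log-likelihood for a single child, viewed as a function of the knot vector) and the symbols K and T are assumptions. Because each knot is proposed uniformly on its own subinterval, the proposal densities cancel and the hard constraint is satisfied automatically, so only the likelihood ratio and the spacing term of the prior enter the acceptance ratio. The starting knots are assumed to satisfy the constraint.

```python
import math
import random


def log_spacing_term(knots, T):
    """Log of prod_k (xi_{k+1} - xi_k), with xi_0 = 0 and xi_{K+1} = T."""
    padded = [0.0, *knots, T]
    total = 0.0
    for lo, hi in zip(padded, padded[1:]):
        if hi <= lo:  # degenerate spacing has zero prior mass
            return -math.inf
        total += math.log(hi - lo)
    return total


def mh_sweep(knots, T, loglik, rng):
    """Update each knot in turn with an independence Metropolis-Hastings step."""
    K = len(knots)
    current = list(knots)
    for k in range(K):
        proposal = list(current)
        # Uniform draw on the subinterval of this knot (0-based index k).
        proposal[k] = rng.uniform(k * T / K, (k + 1) * T / K)
        log_ratio = (
            loglik(proposal) - loglik(current)
            + log_spacing_term(proposal, T) - log_spacing_term(current, T)
        )
        # Accept with probability min(1, exp(log_ratio)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
    return current
```

Calling `mh_sweep` repeatedly with a seeded `random.Random` instance gives a reproducible chain for the knots of one child; in a full sampler this step would be interleaved with updates of the remaining model parameters.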
3. Result
Our application example of classifying growth curves is based on a
longitudinal study from the HBGDki project which analyses the prevalence of
rotavirus infections in a birth cohort in Vellore, India (Paul et al., 2014). The
sample of 373 children is followed up for three years from birth
and their anthropometric measurements are recorded. For the purpose of
our analysis, we focus only on HAZ up to one year of age, after removing
outliers (HAZ < -6 or HAZ > 6) based on WHO recommendations. There are 5
to 15 observations for each child, and the first measurement is taken between
day 1 and day 225. We convert the time scale to age in years and set the number
of random change points to $K = 3$. More sophisticated models can be
formulated by allowing $K$ to vary, for example by using the reversible jump
algorithm introduced in Green (1995). However, we fix the value of $K$ here as
the number of measurements taken for each child is relatively small.
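The preprocessing described above can be sketched as follows. The record layout `(child_id, age_days, haz)`, the function name `prepare_haz` and the conversion factor of 365.25 days per year are assumptions made for illustration; the paper does not state how days are converted to years.

```python
DAYS_PER_YEAR = 365.25  # assumed conversion factor, not stated in the paper


def prepare_haz(records):
    """Group HAZ measurements by child, keeping ages up to one year.

    `records` is an iterable of (child_id, age_days, haz) tuples
    (a hypothetical layout). Measurements with HAZ < -6 or HAZ > 6
    are dropped as outliers.
    """
    cleaned = {}
    for child_id, age_days, haz in records:
        if haz < -6 or haz > 6:
            continue  # outlier by the stated HAZ thresholds
        age_years = age_days / DAYS_PER_YEAR
        if age_years > 1:
            continue  # keep only the first year of life
        cleaned.setdefault(child_id, []).append((age_years, haz))
    return cleaned
```

Each child then contributes a short list of (age in years, HAZ) pairs, which is the form assumed by a per-child log-likelihood.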
Figure 3.1a shows a random sample of raw trajectories in each subgroup
obtained from the classification model, while their respective posterior mean
curves are given in Figure 3.1b. Eight different subgroups of children are
identified in the dataset. The largest subgroup, which accounts for more than
half of the children, shows a constant faltering pattern throughout the
first year of the observational period. Subgroups 2 and 6 experience severe
93 | ISI WSC 2019