Figure 6 displays the (normalized quantile) residuals, de¯ned by (9),

from the chosen ¯tted model BCPE(10; 6; 6; 2; 0:35). Panels (a) and (b)

plot the residuals against the ¯tted values of ¹ and against age respectively,

while panels (c) and (d) provide a kernel density estimate and normal QQ

plot for them respectively. The residuals appear random, although the QQ

plot shows two possible outliers in the upper tail and a slightly longer ex-

treme lower tail than that of the Box-Cox power exponential distribution for

y. Nevertheless the Box-Cox power exponential distribution (LMSP model)

provides a reasonable ¯t to the data, clearly better than the power exponen-

tial distribution and preferable to the Box-Cox normal distribution (LMS

model).

Figure 7 displays detailed diagnostic plots for the residuals using a worm

plot developed by van Buuren and Fredriks [12]. In this plot the range of

age is split into sixteen non-overlapping intervals with roughly equal num-

bers of cases. The sixteen age ranges are listed in Table 3 and displayed

in horizontal steps in the chart above the worm plot in Figure 7. A total

of 20 observations lay outside the vertical axis deviation range of the worm

plots. A de-trended normal QQ plot of the residuals in each interval is then

16

displayed, for the lowest age range in the bottom left hand plot, in rows to

the highest age range in the top right hand plot in Figure 7. The worm plot

allows detection of inadequacies in the model ¯t within speci¯c age ranges.

From Figure 7, the de-trended QQ plots show adequate ¯ts to the data

within most of the 16 age ranges, with only occasional minor inadequacies.

van Buuren and Fredriks [12] proposed ¯tting cubic models to each of the

de-trended QQ plots, with the resulting constant, linear, quadratic and cu-

bic coe±cients, ^b0;^b1;^b2 and ^b3 respectively, indicating di®erences between

the theoretical model and empirical mean, variance, skewness and kurtosis

respectively, within the age range in the QQ plot. They summarize their

interpretations in their Table II. For model diagnosis, they categorize abso-

lute values of ^b0;^b1;^b2 and ^b3 in excess of threshold values, 0.10, 0.10, 0.05

and 0.03 respectively, as mis¯ts. There was a single marginal mis¯t (with

^b

1 < ¡0:10 indicating low residual variance in age group 15-16 years) in the

worm coe±cients from the ¯tted model BCPE(10; 6; 6; 2; 0:35), indicating

overall a reasonable ¯t to the data within age ranges.

The ¯t within age groups is further investigated by calculating Q statis-

tics for testing normality of the residuals within age groups, Royston and

Wright [15].

Let G be the number of age groups and let frgi; i = 1; 2; ::; nig be the

residuals in age group g, with mean ¹rg and standard deviation sg, for

g = 1; 2; ::;G. The following statistics Zg1; Zg2; Zg3; Zg4 are calculated from

the residuals in group g to test whether the residuals in group g have pop-

ulation mean 0, variance 1, skewness 0 and kurtosis 3, where Zg1 = n1=2

g ¹rg,

Zg2 =

n

s2=3

g ¡ [1 ¡ 2=(9ng ¡ 9)]

o

= f2=(9ng ¡ 9)g1=2 and Zg3 and Zg4 are test

statistics for skewness and kurtosis given by D’Agostino et al. [16], in their

equations (13) and (19) respectively.

17

The Q statistics of Royston and Wright [15] are then calculated by

Qj =

PG

g=1 Z2

gj for j = 1; 2; 3; 4. Royston and Wright [15] discuss approx-

imate distributions for the Q statistics under the null hypothesis that the

true residuals are normally distributed (although their simulation study was

mainly for normal error models) and suggest Chi-squared distributions with

adjusted degrees of freedom G¡df¹, G¡[df¾ +1]=2 and G¡dfº for Q1;Q2

and Q3 respectively. By analogy we suggest degrees of freedom G ¡ df¿ for

Q4. The resulting signi¯cance levels should be regarded as providing a guide

to model inadequacy, rather than exact formal test results.

Signi¯cant Q1;Q2;Q3 or Q4 statistics indicate possible inadequacies in

the models for parameters ¹; ¾; º and ¿ respectively, which may be overcome

by increasing the degrees of freedom in the model for the particular param-

eter. Table 2 gives the Q statistics Q1;Q2;Q3 or Q4 for testing the mean,

variance, skewness and kurtosis respectively of the residuals (within the 16

age groups listed in Table 3) from the chosen ¯tted BCPE, BC and PE

models, together with their approximate test p-values. The BCPE model

provides a ¯t with acceptable mean, skewness and kurtosis in the residu-

als, while the residual variance is unacceptable. The BC model provides a

¯t with acceptable mean and skewness in the residuals, but unacceptable

variance and kurtosis. The PE provided a completely unacceptable ¯t.

The Zgj statistic when squared provides the contribution from age group

g to the statistic Qj , and hence helps identify which age groups are causing

the Qj statistic to be signi¯cant and therefore in which age groups the model

is unacceptable.

Provided the number of groups G is su±ciently large relative to the de-

grees of freedom in the model for the parameter, then the Zgj values should

have approximately standard normal distributions under the null hypothe-

18

sis that the true residuals are standard normally distributed. We suggest

as a rough guide values of jZgj j greater than 2 be considered as indicative

of signi¯cant inadequacies in the model. Note that signi¯cant positive (or

negative) values Zgj > 2 (or Zgj < 2) for j = 1; 2; 3 or 4 indicate respectively

that the residuals in age group g have a higher (or lower) mean, variance,

skewness or kurtosis than the assumed standard normal distribution. The

model for parameter ¹; ¾; º or ¿ may need more degrees of freedom to over-

come this. For example if the residual mean in an age group is too high,

the model for ¹ may need more degrees of freedom in order for the ¯tted ¹

from the model to increase within the age group.

Table 3 gives the values of Zgj obtained from the chosen ¯tted BCPE

model. There are no signi¯cant values of Zg1 or Zg3 indicating that the mean

and skewness of the residuals within age groups are acceptable. The only

signi¯cant values of Zg2 are Z5;2 and Z13;2 indicating that the residual vari-

ance is too high in age range 2-3 years and too low in age range 15-16 years.

The only signi¯cant value of Zg4 is Z5;4 indicating that the residual kurtosis

is too high (or equivalently the model kurtosis is too low) in age range 2-3

years. The equivalent Zgj values for the BC model had no signi¯cant values

of Zg1 or Zg3, the same two signi¯cant values of Zg2 as BCPE, but had four

signi¯cant values of Zg4 in age ranges 0-0.25, 0.7-1.2, 2-3 and 15-16 years.

Clearly the BC model does not model kurtosis adequately.

Figure 8 provides nine ¯tted model centile curves, de¯ned by (6), for

body mass index for model BCPE(10; 6; 6; 2; 0:35) with centiles 100® =

0.4, 2, 10, 25, 50, 75, 90, 98, 99.6. [These particular centiles correspond to

standard normal ‘z-scores’ spaced at intervals of 2=3]. For each ® the centile

curve is obtained by ¯nding the ¯tted values (^¹; ^¾; ^º; ^¿ ) for each x (over a

range of values of x) and substituting the value of ^¿ into (7) and then ^¹; ^¾; ^º

19

and z® into (6). The resulting centiles are plotted for age range from 0 to 2

years in Figure 8(a) and for age range from 2 to 21 years in Figure 8(b), for

clarity of presentation.

The equivalent ¯gure for the chosen Box-Cox (BC) distribution model

(LMS), denoted by BCPE(10,6,8,0,0.35), is given in Figure 9. Comparing

Figures 8 and 9 shows the di®erence between modelling both skewness and

kurtosis (Figure and modelling just skewness (Figure 9).

Since it cannot model kurtosis, the LMS model tries to use its skewness

parameter as a proxy for modelling the leptokurtosis in the data. Leptokur-

tosis leads to outliers in the upper and lower tails of the distribution of Y

occurring randomly over the age range of the leptokurtosis. Usually the out-

liers resulting from leptokurtosis lie in both tails, but sometimes by chance,

in a small age range, a group of a few outliers all happen to be in the upper

tail, then the LMS tries to ¯t them using positive skewness, i.e. º < 1. This

happens in Figure 8 in the age range 7 to 11 years, where the LMS model

tries to ¯t a few upper tail outliers by using a positively skew distribution,

while the LMSP model views these outliers as representing kurtosis (since

at a slightly higher age range there are also lower tail outliers) and models

these data using a kurtosis model rather than a skewness model. Similarly

if, in a small age range, a few outliers all lie in the lower tail, the LMS model

may try to ¯t them using negative skewness, i.e. º > 1. In an extreme case

the LMS ¯tted model parameter º could oscillate back and forth between

º < 1 and º > 1 in order to ¯t small groups of upper and lower tail outliers.

The conclusion is that the ¯tted skewness parameter º in the BC distri-

bution (LMS model) is highly sensitive to outliers (and therefore to leptokur-

tosis), while in the BCPE distribution (LMSP model) the ¯tted º is more

robust to outliers (and in the BCT distribution, Rigby and Stasinopoulos

20

[6], the ¯tted º is the most robust to extreme outliers or high leptokurtosis).

As a ¯nal model diagnostic, Table 4 compares the sample percent lying

below each centile curve with the nominal model 100® percent, for each of

the chosen BCPE, BC and PE models. The model and sample percents

agree very well for the BCPE model. For the BC model the ¯t is reasonable

but not quite as good in the centre of the distribution (10% to 90%) and

in the extreme tails (0.4% and 99.6%), while the PE model provides an

extremely poor ¯t in the tails.