Appendix

Human infections

This case study provides an opportunity to demonstrate the method of logistic regression. Let us pretend that N_infections represents not the number of times infections were detected but, in an environment where schistosomiasis is highly prevalent, the number of individuals found to be infected. We assume that the data follow a binomial distribution and fit a logistic model to the data for sites 2-5 with a single parameter for age. We do this by using Stats → Regression Analysis → Generalized Linear Models... and choosing the model for modelling binomial proportions, as shown in the dialog box.
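For readers who want to see what such a fit involves outside the menu system, the following is a minimal sketch in Python of a logistic model with a single parameter for age, fitted to grouped (r,n) data by Newton-Raphson. It is not the routine used by the statistical package, and the ages and counts are hypothetical values invented purely for illustration; they are not the case study data. The function name fit_binomial_logit and all variable names are likewise assumptions.

```python
import math

def fit_binomial_logit(x, r, n, iterations=25):
    """Fit logit(p) = b0 + b1*x to grouped binomial data
    (r infected out of n examined) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the score vector (g0, g1) and the
        # information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri, ni in zip(x, r, n):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)          # binomial variance weight
            g0 += ri - ni * p
            g1 += (ri - ni * p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system H * delta = g directly.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:       # converged
            break
    return b0, b1

# HYPOTHETICAL data for illustration only (not the case study values):
# mean age of each group, number found infected, number examined.
age = [5, 10, 15, 20, 30, 40]
infected = [4, 9, 14, 16, 19, 22]
examined = [30, 30, 30, 30, 30, 30]

b0, b1 = fit_binomial_logit(age, infected, examined)
print("intercept:", round(b0, 4))
print("slope for age (logit scale):", round(b1, 4))
print("odds ratio per year of age:", round(math.exp(b1), 4))
```

The slope is on the logit scale, so its exponential is the multiplicative change in the odds of infection for each additional year of age. A general-purpose statistics library would normally be preferred to hand-written code; the loop is spelled out here only to show what the fitting involves.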

By clicking the Options... button and ticking 'Accumulated' we obtain the analysis of deviance shown here. The residual deviance indicates how well the data conform to the assumed binomial distribution. If the residual mean deviance (the residual deviance divided by its degrees of freedom) is approximately 1, as can be seen here, the binomial assumption is reasonable; the change in deviance due to the inclusion of age can then be assumed to follow a chi-squared distribution, and is shown here to be significant (P<0.01). If the mean deviance is much greater than 1 then the data are said to be overdispersed and an F test is required. (Note that this interpretation does not apply when the data comprise (0,1) values, i.e. not (r,n) values as used in this example.) The reader is referred to the guide on Statistical Modelling for further discussion on this topic.
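The quantities in an accumulated analysis of deviance can be sketched in the same way. The Python below computes the deviance of the null model, the residual deviance after fitting age, the change in deviance on 1 degree of freedom, and the residual mean deviance. It again uses hypothetical data invented for illustration (not the case study values) and assumed function names. The chi-squared probability for 1 degree of freedom is obtained from the complementary error function. For the overdispersed case only the variance ratio is computed, since its probability requires an F distribution function that the Python standard library does not provide.

```python
import math

def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_age_model(x, r, n, iterations=50):
    """Newton-Raphson fit of logit(p) = b0 + b1*x; returns fitted proportions."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri, ni in zip(x, r, n):
            p = logistic(b0 + b1 * xi)
            w = ni * p * (1.0 - p)
            g0 += ri - ni * p
            g1 += (ri - ni * p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return [logistic(b0 + b1 * xi) for xi in x]

def binomial_deviance(r, n, p):
    """2 * sum[ r*log(r/fit) + (n-r)*log((n-r)/(n-fit)) ], with fit = n*p."""
    total = 0.0
    for ri, ni, pi in zip(r, n, p):
        fit = ni * pi
        if ri > 0:
            total += ri * math.log(ri / fit)
        if ri < ni:
            total += (ni - ri) * math.log((ni - ri) / (ni - fit))
    return 2.0 * total

# HYPOTHETICAL (r,n) data for illustration only.
age = [5, 10, 15, 20, 30, 40]
infected = [4, 9, 14, 16, 19, 22]
examined = [30, 30, 30, 30, 30, 30]

# Null model: one common proportion for every group.
p_null = sum(infected) / sum(examined)
null_dev = binomial_deviance(infected, examined, [p_null] * len(age))

# Model with age: residual deviance on (number of groups - 2) d.f.
residual_dev = binomial_deviance(infected, examined,
                                 fit_age_model(age, infected, examined))
residual_df = len(age) - 2

change = null_dev - residual_dev             # deviance due to age, 1 d.f.
mean_deviance = residual_dev / residual_df   # near 1 suggests no overdispersion

# Chi-squared upper tail probability for 1 d.f.: P(X > c) = erfc(sqrt(c/2)).
p_value = math.erfc(math.sqrt(change / 2.0))

# If the data were overdispersed, this ratio would instead be referred
# to an F distribution on 1 and residual_df degrees of freedom.
variance_ratio = change / mean_deviance

print("change in deviance due to age:", round(change, 3))
print("residual deviance:", round(residual_dev, 3), "on", residual_df, "d.f.")
print("residual mean deviance:", round(mean_deviance, 3))
print("chi-squared probability:", p_value)
print("variance ratio:", round(variance_ratio, 3))
```

As noted above, judging the fit from the mean deviance is only meaningful for grouped (r,n) data such as these, not for individual (0,1) responses.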