Health Sciences M.Sc. Programme
Applied Biostatistics

Week 5: Standard Error and Confidence Intervals

Sampling

Most research data come from subjects we think of as samples drawn from a larger population. The sample tells us something about the population. The notion of sampling is a familiar one in health care. For example, if I want to measure a subject's blood glucose, I do not take all the blood. I draw a sample. One drop of blood is then used to represent all the blood in the body. I did this three times, from the same subject (myself), and got three measurements: 6.0, 5.9, and 5.8 mmol/L. Which of these was correct? The answer is that none of them was; they were all estimates of the same quantity. We do not know which of them was actually closest.

In research, we collect data on our research subjects so that we can draw conclusions about some larger population. For example, in a randomised controlled trial comparing two obstetric regimes, the proportion of women in the active management of labour group who had a Caesarean section was 0.97 times the proportion of women in the routine management group who had sections (Sadler et al., 2000). (We call this ratio the relative risk.) This trial was carried out in one obstetric unit in New Zealand, but we are not specifically interested in this unit or in these patients. We are interested in what they can tell us about what would happen if we treated future patients with active management of labour rather than routine management. We want to know, not the relative risk for these particular women, but the relative risk for all women. The trial subjects form a sample, which we use to draw some conclusions about the population of such patients in other clinical centres, in New Zealand and other countries, now and in the future.

The observed relative risk of Caesarean section, 0.97, provides an estimate of the relative risk we would expect to see in this wider population. If we were to repeat the trial, we would not get exactly the same point estimate. Other similar trials cited by Sadler et al. (2000) have reported different relative risks: 0.75, 1.01, and 0.64. Each of these trials represents a different sample of patients and clinicians, and there is bound to be some variation between samples. Hence we cannot conclude that the relative risk in the population will be the same as that found in our particular trial sample. The relative risk which we get in any particular sample would be compatible with a range of possible values in the population.

When we draw a sample from a population, it is just one of the many samples we could take. If we calculate a statistic from the sample, such as a mean or proportion, this will vary from sample to sample. The means or proportions from all the possible samples form the sampling distribution. To illustrate this with a simple example, we could put lots numbered 1 to 9 into a hat and sample by drawing one out, replacing it, drawing another out, and so on. Each number would have the same chance of being chosen each time, and the sampling distribution would be as in Figure 1(a). Now we change the procedure: we draw out two lots at a time and calculate their average. There are 36 possible pairs, and some pairs will have the same average (e.g. 1 and 9, and 4 and 6, both have average 5.0). The sampling distribution of this average is shown in Figure 1(b).
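To make the hat example concrete, here is a short Python sketch (an illustration added here, not part of the original notes) that enumerates all 36 possible pairs, tabulates the sampling distribution of their mean as plotted in Figure 1(b), and shows that the mean of this distribution stays at 5 while its spread is smaller than that of the single digits.

from itertools import combinations
from collections import Counter
from statistics import mean, pstdev

digits = range(1, 10)

# All 36 ways of drawing two different lots together from the digits 1 to 9.
pair_means = [(a + b) / 2 for a, b in combinations(digits, 2)]

# Tabulate the sampling distribution of the mean of two digits (Figure 1(b)).
dist = Counter(pair_means)
for m in sorted(dist):
    print(f"mean {m:3.1f}: {dist[m]:2d}/36 = {dist[m] / 36:.3f}")

# The mean stays at 5, but the spread of the sampling distribution is smaller
# than that of the parent distribution of single digits (Figure 1(a)).
print(f"parent mean {mean(digits)}, SD {pstdev(digits):.3f}")
print(f"sampling-distribution mean {mean(pair_means)}, SD {pstdev(pair_means):.3f}")

Running this shows the distribution peaking at 5.0 (4/36, about 0.11, matching the tallest bar in Figure 1(b)), with the standard deviation falling from about 2.58 for single digits to about 1.71 for the mean of two.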
[Figure 1. Sampling distribution for a single digit drawn at random and for the mean of two digits drawn together: (a) single digit; (b) mean of two digits. Bar charts of relative frequency (0 to 0.12) against values 1 to 9.]

There are three things which we should notice about Figure 1(b):

1. The mean of the distribution remains the same, 5.

2. The sampling distribution of the mean is not so widely spread as the parent distribution. It has a smaller variance and standard deviation.

3. It has a different shape to Figure 1(a). The sampling distribution of a statistic does not necessarily have the same shape as the distribution of the observations themselves, which we call the parent distribution. In this case, as so often, it looks closer to a Normal distribution than does the distribution of the observations themselves.

If we know the sampling distribution, it can help us draw conclusions about the population from the sample, using confidence intervals and significance tests. We often use our sample statistic as an estimate of the corresponding value in the population, for example using the sample mean to estimate the population mean. The sampling distribution tells us how far from the population value the sample statistic is likely to be. Any statistic which is calculated from a sample, such as a mean, proportion, median, or standard deviation, will have a sampling distribution.

Standard error

If the sample statistic is used as an estimate, we call the standard deviation of the sampling distribution the standard error. Rather confusingly, we use this term both for the unknown standard deviation of the sampling distribution and for the estimate of this standard deviation found from the data.

[...] 5 of the 20 confidence intervals. Thus, for 95% of confidence intervals, it will be true to say that the population value lies within the interval. We just don't know which 95%. We expect to see 5% of the intervals having the population value outside the interval and 95% having the population value inside the interval. This is not the same as saying that 95% of further samples will have estimates within the interval. For example, if we look at the first interval in Figure 2, we can see that samples 8, 10, and 20 all have point estimates outside this interval. In fact, we expect about 83% of further samples to have their point estimates within a 95% confidence interval chosen at random.

The confidence interval need not have a probability of 95%. For example, we can also calculate 99% confidence limits. The upper 0.5% point of the Standard Normal distribution is 2.58, so the probability of a Standard Normal deviate being above 2.58 or below −2.58 is 1%, and the probability of being within these limits is 99%. The 99% confidence limits for the mean FEV1 are therefore 4.062 − 2.58 × 0.089 and 4.062 + 2.58 × 0.089, i.e. 3.8 and 4.3 litres. These give a wider interval than the 95% limits, as we would expect, since we are more confident that the mean will be included. We could also calculate a 90% confidence interval, which is 3.916 to 4.208, narrower than the 95% confidence interval. However, only 90% of such intervals will include the population value; 10% will not. The probability we choose for a confidence interval is thus a compromise between the desire to include the estimated population value and the desire to avoid parts of the scale where there is a low probability that the mean will be found.
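As a quick check on this arithmetic, the sketch below (again an added illustration, assuming SciPy is available) computes the 90%, 95%, and 99% confidence limits from the mean FEV1 of 4.062 litres and standard error of 0.089 quoted above.

from scipy.stats import norm

mean_fev1, se = 4.062, 0.089  # mean FEV1 (litres) and its standard error

for level in (0.90, 0.95, 0.99):
    # Upper (1 - level)/2 point of the Standard Normal distribution,
    # e.g. 1.96 for 95% and 2.58 for 99%.
    z = norm.ppf(1 - (1 - level) / 2)
    low, high = mean_fev1 - z * se, mean_fev1 + z * se
    print(f"{level:.0%} CI: {low:.3f} to {high:.3f} litres (z = {z:.2f})")

This prints 3.916 to 4.208 for the 90% interval and 3.833 to 4.291 for the 99% interval, agreeing with the 3.8 and 4.3 litres given in the text after rounding.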
For most purposes, 95% confidence intervals have been found to be satisfactory, and this is what is usually quoted in health research. For the trial comparing active management of labour with routine management (Sadler et al., 2000), the relative risk for Caesarean section was 0.97. Sadler et al. quoted the 95% confidence interval for the relative risk as 0.60 to 1.56. Hence we estimate that, in the population which these subjects represent, the proportion of women who undergo Caesarean section with active management of labour is between 0.60 and 1.56 times the proportion who would have a Caesarean section with routine management.

Significance tests and confidence intervals

Significance tests and confidence intervals often involve similar calculations. For example, we can test the null hypothesis that two groups have the same mean, and we can find a confidence interval for the difference between the means. If the 95% confidence interval for the difference does not include the null hypothesis value, the difference is significant at the 5% level. If the 95% confidence interval for the difference includes the null hypothesis value, the difference is not significant at the 5% level.

For example, in a study of respiratory disease in schoolchildren, children were followed at ages 5 and 14. We looked at the proportions of children with bronchitis in infancy, and with no such history, who were reported to have respiratory symptoms in later life (Holland et al., 1978). We had 273 children with a history of bronchitis before age 5 years, 26 of whom were reported to have day or night cough at age 14. We had 1046 children with no bronchitis before age 5 years, 44 of whom were reported to have day or night cough at age 14. We shall test the null hypothesis that the prevalence of the symptom is the same in both populations, against the alternative that it is not. We shall use a test called the large sample Normal or z test for the difference between two proportions. This test uses a standard error, like others we shall come across in this course. It follows the structure described above, and for this lecture we shall not go into the details of the method. It works like this:

1. The null hypothesis is that the prevalence of the symptom is the same in both populations. The alternative is that it is not.

2. The assumptions of the test are that the observations are all independent, which they are because these are all different, unrelated children, and that the sample is large enough, which we shall accept as being met here.

3. The test statistic is the difference between the two proportions divided by the standard error it would have if the proportions were actually the same. The two proportions of children reported to have cough are 26/273 = 0.09524 for children with a history of bronchitis and 44/1046 = 0.04207 for those with no bronchitis. The difference between these proportions is 0.09524 − 0.04207 = 0.05317. The standard error for this difference, if the two proportions are actually the same, is estimated to be 0.01524. The test statistic is therefore 0.05317/0.01524 = 3.49.

4. If the null hypothesis were true, this would be an observation from the Standard Normal distribution. This is because the sample is large, so both proportions follow approximately Normal distributions; the distribution of differences has mean zero if the null hypothesis is true, and dividing by the standard error gives this distribution standard deviation 1.0.

5. The probability of the test statistic having a value as far from zero as 3.49 is quite small, 0.0005.

6. We therefore conclude that the data are not consistent with the null hypothesis, and we have strong evidence that children with a history of bronchitis are more likely than others to be reported to have cough during the day or at night at the age of 14.

What about the confidence interval? For this we use a different standard error: the standard error when the proportions may not be equal. This is SE = 0.0188. The 95% confidence interval for the difference is 0.05317 − 1.96 × 0.0188 to 0.05317 + 1.96 × 0.0188, i.e. 0.016 to 0.090. Although the difference is not very well estimated, it is well away from zero and gives us clear evidence that children with bronchitis reported in infancy are more likely than others to be reported to have respiratory symptoms in later life. The null hypothesis value of the difference is zero, and this is not included in the 95% confidence interval: we do not include zero as a value for the difference which is consistent with the data.
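The following Python sketch (added here, not part of the original notes; it assumes SciPy is available) reproduces this worked example: the pooled standard error and z statistic for the test, and the unpooled standard error for the 95% confidence interval. It deliberately computes two different standard errors, for the reason explained in the note that follows.

from math import sqrt
from scipy.stats import norm

# Cough at age 14 by history of bronchitis before age 5 (Holland et al., 1978).
x1, n1 = 26, 273    # with a history of bronchitis
x2, n2 = 44, 1046   # with no such history

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2                                       # 0.05317

# Significance test: standard error assuming the proportions are equal,
# estimated from the pooled proportion.
p = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))    # 0.01524
z = diff / se_pooled                                 # 3.49
p_value = 2 * (1 - norm.cdf(abs(z)))                 # about 0.0005

# Confidence interval: standard error without assuming equal proportions.
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # 0.0188
low, high = diff - 1.96 * se_unpooled, diff + 1.96 * se_unpooled

print(f"difference = {diff:.5f}, z = {z:.2f}, p = {p_value:.4f}")
print(f"95% CI: {low:.3f} to {high:.3f}")

The output matches the values in the text: z = 3.49, p ≈ 0.0005, and a 95% confidence interval of 0.016 to 0.090.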
Note that the standard error used when the null hypothesis is true is different from that used for the confidence interval, when of course we do not assume that there is no difference. The null hypothesis may contain information about the standard error: in the comparison of two proportions, the standard error for the difference depends on the proportions themselves, and if the null hypothesis is true we need only one estimate of the proportion, which alters the standard error for the difference. As a result, 95% confidence intervals and 5% significance tests sometimes give different answers near the cut-off point.

J. M. Bland
14 August 2006

References

Bland M. (2000) An Introduction to Medical Statistics, 3rd Edition. Oxford University Press.

Sadler LC, Davison T, McCowan LM. (2000) A randomised controlled trial and meta-analysis of active management of labour. British Journal of Obstetrics and Gynaecology 107, 909-915.