Tests of hypotheses

Confidence interval: Form an interval (on the basis of data) of plausible values for a population parameter.

Test of hypothesis: Answer a yes-or-no question regarding a population parameter.

Examples:
• Do the two strains have the same average response?
• Is the concentration of substance X in the water supply above the safe limit?
• Does the treatment have an effect?

Example

We have a quantitative assay for the concentration of antibodies against a certain virus in blood from a mouse. We apply our assay to a set of ten mice before and after the injection of a vaccine. (This is called a "paired" experiment.)

Let X_i denote the difference between the measurements ("after" minus "before") for mouse i. We imagine that the X_i are independent and identically distributed normal(µ, σ).

Does the vaccine have an effect? In other words: Is µ ≠ 0?

The data

[Figure: scatterplot of "after" vs. "before" measurements for the ten mice, and a dotplot of the differences (after − before).]

Hypothesis testing

We consider two hypotheses:

Null hypothesis, H0: µ = 0
Alt. hypothesis, Ha: µ ≠ 0

Type I error: Reject H0 when it is true. (false positive)
Type II error: Fail to reject H0 when it is false. (false negative)

We set things up so that a Type I error is the worse error (and so that we are seeking to prove the alternative hypothesis). We want to control the rate (the significance level, α) of such errors.

Test statistic: T = (X̄ − 0) / (s/√10)

We reject H0 if |T| > t*, where t* is chosen so that

Pr(Reject H0 | H0 is true) = Pr(|T| > t* | µ = 0) = α   (generally α = 5%)

P-values

P-value: the smallest significance level (α) at which you would reject H0 with the observed data. Equivalently: the probability, if H0 were true, of obtaining data at least as extreme as what was observed.

X_1, …, X_10 ∼ iid normal(µ, σ)
H0: µ = 0; Ha: µ ≠ 0.
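The rejection rule and two-sided P-value described above can be sketched in code. This is a minimal illustration, not part of the notes (the notes use R's pt/qt; scipy's t distribution plays the same role here), using the summary statistics from the vaccine example (X̄ = 1.93, s = 2.24, n = 10):

```python
import math
from scipy import stats

# Summary statistics from the paired vaccine example
xbar, s, n = 1.93, 2.24, 10
mu0 = 0.0

T = (xbar - mu0) / (s / math.sqrt(n))          # observed test statistic
t_star = stats.t.ppf(0.975, df=n - 1)          # critical value for alpha = 5% (qt(0.975, 9))
p_value = 2 * stats.t.cdf(-abs(T), df=n - 1)   # two-sided P-value (2*pt(-|T|, 9))

print(round(T, 2))        # ~ 2.72
print(round(p_value, 3))  # ~ 0.024, i.e. 2.4%
print(abs(T) > t_star)    # True -> reject H0 at alpha = 5%
```

Note that rejecting at α = 5% and obtaining a P-value below 5% are the same decision, which is the sense in which the P-value is the smallest α at which you would reject.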
Observe: X̄ = 1.93; s = 2.24, so
Tobs = (1.93 − 0) / (2.24/√10) = 2.72
P-value = Pr(|T| > Tobs) = 2*pt(-2.72, 9) = 2.4%.

[Figure: t(df=9) density with the two tails beyond ±Tobs shaded, 1.2% in each tail.]

Another example

X_1, …, X_4 ∼ iid normal(µ, σ)
H0: µ ≥ 6; Ha: µ < 6.

Observe: X̄ = 5.51; s = 0.43
Tobs = (5.51 − 6) / (0.43/√4) = −2.28
P-value = Pr(T < Tobs | µ = 6) = pt(-2.28, 3) = 5.4%.

[Figure: t(df=3) density with the lower tail below Tobs shaded, 5.4%.]

The P-value is (roughly) a measure of evidence against the null hypothesis. Recall: We want to prove the alternative hypothesis (i.e., reject H0; i.e., obtain a small P-value).

Hypothesis tests and confidence intervals

The 95% confidence interval for µ is the set of values µ0 such that the null hypothesis H0: µ = µ0 would not be rejected (by a two-sided test with α = 5%).

The 95% CI for µ is the set of plausible values of µ. If a value of µ is plausible, then as a null hypothesis it would not be rejected.

For example: 9.98  9.87  10.05  10.08  9.99  9.90   (assumed iid normal(µ, σ))

X̄ = 9.98; s = 0.082; n = 6
qt(0.975, 5) = 2.57
95% CI for µ = 9.98 ± 2.57 · 0.082/√6 = 9.98 ± 0.086 = (9.89, 10.06)

Power

The power of a test = Pr(reject H0 | H0 is false).

[Figure: null and alternative sampling distributions centered at µ0 and µa, with critical values ±C; the area of the alternative distribution beyond the critical values is the power.]

The power depends on:
• The null hypothesis and test statistic
• The sample size
• The true value of µ
• The true value of σ

Why "fail to reject"?

If the data are insufficient to reject H0, we say, "The data are insufficient to reject H0." We shouldn't say, "We have proven H0."

Why? We may have very low power to detect alternatives close to the null, and perhaps low power to detect anything but extreme differences. We control the rate of Type I errors ("false positives") at 5% (or whatever), but we have little or no control over the rate of Type II errors.

The effect of sample size

Let X_1, …, X_n be iid normal(µ, σ). We wish to test H0: µ = µ0 vs Ha: µ ≠ µ0. Imagine µ = µa.

[Figure: null and alternative sampling distributions for n = 4 and n = 16; with the larger sample size the distributions are narrower, so more of the alternative distribution lies beyond the critical values and the power is greater.]
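The effect of sample size on power can also be checked numerically. The sketch below is not from the notes; it assumes a made-up effect size of one standard deviation (µa − µ0 = σ) purely for illustration, and computes the exact power of the two-sided one-sample t-test via scipy's noncentral t distribution:

```python
from math import sqrt
from scipy import stats

def power(n, effect, alpha=0.05):
    """Power of the two-sided one-sample t-test when the true mean
    lies `effect` standard deviations away from mu_0 (hypothetical
    effect size chosen for illustration)."""
    df = n - 1
    t_star = stats.t.ppf(1 - alpha / 2, df)   # critical value, as in the notes
    ncp = effect * sqrt(n)                    # noncentrality parameter
    # Pr(|T| > t_star) when T follows a noncentral t distribution
    return stats.nct.sf(t_star, df, ncp) + stats.nct.cdf(-t_star, df, ncp)

p4 = power(4, 1.0)
p16 = power(16, 1.0)
print(round(p4, 2), round(p16, 2))   # power grows with n
```

With n = 16 the sampling distribution of X̄ is half as wide as with n = 4, so far more of the alternative distribution falls beyond the critical values, matching the picture above.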