Lecture 7: Confidence Intervals
Lecture 7 Outline

Lecture 7: Confidence Intervals
- Example – Birth Weight
- Theory – Confidence Interval
- Confidence Interval for µ
- Example – Paint Primer Thickness
- Example – Concrete Tensile Strength
- Example – Clinton vs Trump Polls
- Approximation CI for proportion
- Conservative CI for proportion
- Justifying the conservative CI for proportion
- Formulae for test statistic & confidence intervals

Revision – Probability

What is the green area?

    qnorm(0.975)
    ## [1] 1.959964

[Figure: standard normal density with a shaded green area; z-axis marked at −1.96, 0.00 and 1.96]

Theory – Confidence Interval

- The confidence interval is different from sample to sample.
- $100(1-\alpha)\%$ confidence interval: a sequence of intervals which contain the unknown parameter $100(1-\alpha)\%$ of the time, where $1-\alpha$ is the confidence level; often $\alpha = 0.05$.
- Equivalently, a set of possible hypotheses $\{H_0 : \mu = \mu_0\}$ which will be retained if $\mu_0 \in \text{CI}$.
- A $100(1-\alpha)\%$ confidence interval does not mean that the true parameter lies within one particular computed interval with probability $1-\alpha$. The parameter is fixed; it is the interval that varies from sample to sample.

Confidence Interval for µ

- If the sampling population is $N(\mu, \sigma^2)$ with known $\sigma$, a $100(1-\alpha)\%$ confidence interval for $\mu$ is
  $\bar{x} \pm z_{1-\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
  where $z_{1-\alpha/2} = z^*$ is such that $P(Z < z^*) = 1 - \frac{\alpha}{2}$.
- In practice $\sigma$ is unknown, so we use
  $\bar{x} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$
  where $t_{1-\alpha/2,\,n-1} = t^*$ is such that $P(t_{n-1} < t^*) = 1 - \frac{\alpha}{2}$.
- $z_{1-\alpha/2}$ and $t_{1-\alpha/2,\,n-1}$ are called critical values.
- $\frac{\sigma}{\sqrt{n}}$ and $\frac{s}{\sqrt{n}}$ are standard errors of the estimate of $\mu$.
- Remember $t_{n-1} \to N(0, 1)$ as $n \to \infty$!
- What if the distribution of the sampling population is unknown? Remember the CLT!

Example – Paint Primer Thickness

Assume that paint primer thickness can be modelled by $X \sim N(\mu, \sigma^2)$. In an ongoing process of quality control in an industrial system, the following first sample of values was obtained:

    x = c(1.3, 1.1, 1.2, 1.25, 1.05, 0.95, 1.1, 1.16, 1.37, 0.98)

(i) What is a 95% confidence interval for the primer thickness?
(ii) What is a 99% confidence interval for the primer thickness?
(iii) The company advertises that the primer thickness is 1.25. What would you conclude?

Theory – Size of confidence interval

- What happens if we increase the sample size $n$?
  As we increase $n$, the confidence interval gets narrower, and so $\bar{x}$ is a better estimate of $\mu$.
- What happens if we increase the confidence level $1-\alpha$?
  As we increase the confidence level, the confidence interval gets wider.

Example – Paint Primer Thickness

We have $X \sim N(\mu, \sigma^2)$ and the data:

    x = c(1.3, 1.1, 1.2, 1.25, 1.05, 0.95, 1.1, 1.16, 1.37, 0.98)
    mean(x)
    ## [1] 1.146
    sd(x)
    ## [1] 0.1363166

Example – Concrete Tensile Strength

We are interested in the influence of the size of test specimens of concrete on the tensile strength. Eight concrete mixes were made, and from each mix two test specimens were prepared and tested, resulting in the following strengths (in kN/m²):

    small = c(4404, 4236, 3788, 3475, 3418, 2262, 7415, 6993)
    large = c(4140, 3984, 3842, 3053, 3145, 1813, 6867, 7091)
    diff = large - small
    diff
    ## [1] -264 -252 54 -422 -273 -449 -548 98

(i) Find a 95% CI for the mean tensile strength of small specimens, assuming that the strengths can be modelled by $N(\mu, 1000^2)$.
(ii) Find a 90% CI for the mean difference in tensile strengths, assuming that the differences can be modelled by $N(\mu, \sigma^2)$.

Example – Concrete Tensile Strength

    small = c(4404, 4326, 3788, 3475, 3418, 2262, 7415, 6993)
    mean(small)
    ## [1] 4510.125
    sd(small)
    ## [1] 1792.36

(i) Assuming that the strengths can be modelled by $N(\mu, 1000^2)$, a 95% CI for the mean tensile strength of small specimens is
$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}}$,
which is
$4510.125 \pm 1.96 \cdot \frac{1000}{\sqrt{8}}$,
which gives (3817, 5203).
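The interval formulas for µ used in the paint primer and concrete examples can be sketched in R as follows. This is a minimal sketch, not the lecture's own solution: the helper names `z_ci` and `t_ci` are illustrative assumptions, the `small` vector is the one listed beside `mean(small)`, and the differences are copied as printed on the slide.

```r
# Illustrative helpers (names are assumptions, not from the lecture).

# z-interval: normal population with known sigma.
z_ci <- function(x, sigma, level = 0.95) {
  alpha <- 1 - level
  mean(x) + c(-1, 1) * qnorm(1 - alpha / 2) * sigma / sqrt(length(x))
}

# t-interval: normal population with unknown sigma, estimated by sd(x).
t_ci <- function(x, level = 0.95) {
  alpha <- 1 - level
  n <- length(x)
  mean(x) + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)
}

# Paint primer thickness: 95% and 99% t-intervals (sigma unknown).
x <- c(1.3, 1.1, 1.2, 1.25, 1.05, 0.95, 1.1, 1.16, 1.37, 0.98)
primer_95 <- t_ci(x, level = 0.95)
primer_99 <- t_ci(x, level = 0.99)

# Concrete, part (i): 95% z-interval with sigma = 1000 as stated in the example.
small <- c(4404, 4326, 3788, 3475, 3418, 2262, 7415, 6993)
small_95 <- z_ci(small, sigma = 1000, level = 0.95)

# Concrete, part (ii): 90% t-interval for the mean paired difference.
d <- c(-264, -252, 54, -422, -273, -449, -548, 98)
diff_90 <- t_ci(d, level = 0.90)

print(primer_95)
print(primer_99)
print(round(small_95))  # 3817 5203, matching the interval on the slide
print(diff_90)
```

With these data, the advertised value 1.25 lies just above the 95% interval for the primer thickness but inside the 99% interval, which illustrates why a higher confidence level gives a wider interval. Base R's `t.test(x, conf.level = 0.95)$conf.int` gives the same t-interval and can be used as a cross-check.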
“The NBC/WSJ/Marist poll Florida poll surveyed 700 likely voters between October 3-5 with a margin of error of plus or minus 3.7 percentage points.”

“The Pennsylvania poll surveyed 709 likely voters between October 3-6 with a margin of error of plus or minus 3.7 percentage points.”

A random survey of 2000 voters found that 1165 were going to vote for Hillary Clinton.

(i) Find a 95% CI for the proportion of voters $p$ that will vote for Hillary.
(ii) What is the margin of error?
(iii) What is the smallest sample size needed to give a 95% CI for $p$ with width at most ±0.03?

Example – Clinton vs Trump Polls

(i) Suppose that $X$ is the number of people voting for Clinton out of the $n = 2000$ sampled voters. Assume that $X \sim \mathrm{Bin}(2000, p)$. An estimate of the proportion is $\hat{p} = 1165/2000 = 0.5825$. Now

$E\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{1}{n} \cdot np = p$

$\mathrm{Var}\left(\frac{X}{n}\right) = \frac{1}{n^2} \mathrm{Var}(X) = \frac{1}{n^2} \cdot np(1-p) = \frac{p(1-p)}{n}$.

If $n$ is large enough, by the CLT,

$\hat{p} = \frac{X}{n} \overset{\text{approx}}{\sim} N\left(p, \frac{p(1-p)}{n}\right)$.

Approximation CI for proportion

A 95% approximate CI for the proportion of voters $p$ that will vote for Hillary is

$\hat{p} \pm z_{0.975} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.

We get

$0.5825 \pm 1.96 \cdot \sqrt{\frac{0.5825 \cdot (1 - 0.5825)}{2000}}$,

which gives (0.56, 0.60).

(ii) Using the conservative CI, the margin of error is

$1.96 \cdot \frac{1}{2\sqrt{2000}} \approx 0.02191347$,

or approximately 2%.

(iii) To give a 95% CI for $p$ with width ±0.03 (i.e. margin of error 3%), we solve

$1.96 \cdot \frac{1}{2\sqrt{n}} \le 0.03$,

which gives

$n \ge \left(\frac{1.96}{2 \cdot 0.03}\right)^2 \approx 1067.1$.

Therefore a sample size of at least 1068 is needed.

Formulae for test statistic & confidence intervals

Context: Proportion. Parameter: $p$
  T:  $\frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \sim N(0, 1)$
  CI (approximate):  $\hat{p} \pm z_{1-\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
  CI (conservative):  $\hat{p} \pm z_{1-\alpha/2} \cdot \frac{1}{2\sqrt{n}}$

Context: known $\sigma$. Parameter: $\mu$
  T:  $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
  CI:  $\bar{x} \pm z_{1-\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

Context: unknown $\sigma$. Parameter: $\mu$
  T:  $\frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$
  CI:  $\bar{x} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$

Context: 2 sample with common variance. Parameter: $\mu_1 - \mu_2$
  T:  $\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1+n_2-2}$
  CI:  $\bar{x}_1 - \bar{x}_2 \pm t_{1-\alpha/2,\,n_1+n_2-2} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
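The two proportion intervals in the table can be checked against the poll example with a short R sketch. The variable names below are illustrative assumptions, and `qnorm(0.975)` is used in place of the rounded 1.96, so the last digits may differ slightly from the hand calculations. The conservative form replaces $\hat{p}(1-\hat{p})$ by its largest possible value, 1/4.

```r
# Poll example: 1165 of 2000 sampled voters (figures from the example above).
votes <- 1165
n <- 2000
p_hat <- votes / n

# Critical value for a 95% interval.
z_star <- qnorm(0.975)

# Approximate CI: plug p_hat into the standard error.
se_approx <- sqrt(p_hat * (1 - p_hat) / n)
ci_approx <- p_hat + c(-1, 1) * z_star * se_approx

# Conservative CI: p(1 - p) <= 1/4, so the standard error is at most 1 / (2 * sqrt(n)).
moe_conservative <- z_star / (2 * sqrt(n))
ci_conservative <- p_hat + c(-1, 1) * moe_conservative

# Smallest n whose conservative margin of error is at most 0.03.
target_moe <- 0.03
n_min <- ceiling((z_star / (2 * target_moe))^2)

print(round(ci_approx, 2))  # 0.56 0.60, as in part (i)
print(moe_conservative)
print(ci_conservative)
print(n_min)                # 1068, as in part (iii)
```

Because the conservative margin of error does not depend on $\hat{p}$, it can be computed before any data are collected, which is what makes the sample-size calculation in part (iii) possible.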