Confidence Intervals and Margin of Error in Statistical Sampling, Assignments of Probability and Statistics

This document introduces confidence intervals and margin of error in statistical sampling. It covers the calculation of confidence intervals for population proportions and means, and discusses the use of z and t scores for different confidence levels. It also explains how to calculate confidence intervals for large samples and for samples drawn without replacement.

Typology: Assignments

Pre 2010

Uploaded on 07/23/2009

koofers-user-34t

Partial preview of the text

STT315 201-202
Homework due Friday, 8-4-06
Next exam date is Friday, 8-11-06

Consult Chapter 6 as indicated in a message sent to the class. This material concerns a “bread and butter” application of statistics going by the name “confidence interval for a population mean or proportion.” Confidence intervals help fill an important need to quantify the accuracy of information.

CI for a proportion, n large. One might read that the percentage of smokers in a population is estimated by the poll data to be “22.5% with a margin of error of plus or minus 6%.” Typically this means that a random sample of some target population was selected and that the fraction 0.225 of those sampled were scored as smokers. The margin of error +/- 0.06 is usually, but not always, obtained as

+/- 1.96 root(0.225 x 0.775 / n),

where n is the number of persons sampled. What is being claimed is that, around 95% of the times such data are collected, the interval calculated this way will cover the true fraction “p” of smokers in the population. Technically, for every z > 0,

P( p lies within interval pHAT +/- z root(pHAT qHAT / n) ) --> P(-z < Z < z) as sample size n --> infinity,

where p is the true population proportion having some characteristic, pHAT is the fraction of the sample having that characteristic, qHAT = 1 - pHAT, Z is a standard normal random variable, and n is the sample size.

CI for a mean, n large. One might read that the average income “mu” of a population is estimated by a sample to be “$46,112 with a margin of error of plus or minus $5,596.” Typically this means that a random sample of some target population was selected and the sample average income xBAR of the persons in the sample was $46,112.
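As a rough numerical check of the proportion interval described earlier in this section, here is a minimal Python sketch. The handout does not state the poll's sample size, so n = 186 is an assumption, chosen only because it gives a margin close to the quoted 6%. The function names proportion_ci and coverage are illustrative, and the simulation merely estimates how often the interval covers the true p.

```python
import math
import random

def proportion_ci(p_hat, n, z=1.96):
    """Large-n interval pHAT +/- z * root(pHAT * qHAT / n)."""
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(p, n, trials, seed=0, z=1.96):
    """Fraction of simulated samples whose interval covers the true p."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        low, high = proportion_ci(successes / n, n, z)
        if low <= p <= high:
            hits += 1
    return hits / trials

# n = 186 is a hypothetical sample size (not given in the handout).
low, high = proportion_ci(0.225, 186)
print(f"margin of error: {(high - low) / 2:.4f}")   # about 0.06
print(f"estimated coverage: {coverage(0.225, 186, trials=2000):.3f}")
```

The estimated coverage should land in the neighborhood of 95%, though not exactly on it, since the claim is a large-n approximation.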
The margin of error +/- $5,596 is usually, but not always, obtained as

+/- 1.96 s / root(n),

where s denotes the sample standard deviation defined by

s = root(n / (n-1)) root(mean of squares - square of mean) = root(n / (n-1)) root( (x^2)BAR - (xBAR)^2 ).

See Chapter 1. Technically, for every z > 0,

P( mu lies within interval xBAR +/- z s / root(n) ) --> P(-z < Z < z) as sample size n --> infinity,

where mu is the population mean, xBAR is the sample mean, and s is the sample standard deviation.

CI for a mean, any n > 1, NORMAL POPULATION. This is known as Student’s-t method. Do not overlook the requirement that the population distribution be normal. It is another dividend of “being in control” that we may quote a margin of error even for a sample size as small as n = 2. Technically, for every t > 0,

P( mu lies within interval xBAR +/- t s / root(n) ) = P(-t < T < t) for every n > 1,

where mu is the population mean, xBAR is the sample mean, and s is the sample standard deviation. The score t (replacing z of the large n method) is determined from the t-Table (inside front cover of your book) and the “confidence level” (e.g. 95%). For example, if n = 7 and we desire 95% confidence we look in the t-Table for “C.I. = 95%” and run up that column to the row indicated at 6 “degrees of freedom.” There we find t = 2.447.

Degrees of freedom
n-1 = 6             2.447
infinity            1.96
C.I.                95%

So if we have selected a sample of n = 7 from a normal population and find that the sample mean income is xBAR = $54,226 with sample standard deviation s = $3,386, we are entitled to quote a margin of error

+/- t s / root(n) = +/- 2.447 ($3,386) / root(7) = +/- $3,131.64.

If, instead, the same “xBAR and s” had arisen from a sample of n = 500 we would quote the (large n) margin of error

+/- z s / root(n) = +/- 1.96 ($3,386) / root(500) = +/- $296.80.

This could be used even if the population were not normal.
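The two margins of error just computed, and the formula for s given earlier, can be checked with a short Python sketch. The function names and the four-value data list are illustrative assumptions. The t and z scores are the table values quoted in the text and are typed in by hand rather than computed.

```python
import math
import statistics

def margin_of_error(s, n, score):
    """score * s / root(n); score is a z or t value read from a table."""
    return score * s / math.sqrt(n)

def sample_sd(data):
    """s = root(n / (n-1)) * root(mean of squares - square of mean)."""
    n = len(data)
    mean = sum(data) / n
    mean_of_squares = sum(x * x for x in data) / n
    return math.sqrt(n / (n - 1)) * math.sqrt(mean_of_squares - mean ** 2)

s = 3386.0
print(f"{margin_of_error(s, 7, 2.447):.2f}")    # 3131.64 (t score, 6 df)
print(f"{margin_of_error(s, 500, 1.96):.2f}")   # 296.80 (z score, large n)

# Hypothetical data, used only to compare the formula for s with the
# standard library's n-1 sample standard deviation.
data = [2, 4, 4, 5]
print(sample_sd(data), statistics.stdev(data))
```

The last line should print two values that agree up to rounding error, since both compute the same n-1 standard deviation.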
Around 95% of the time, with either method, the true population mean mu will be within +/- margin of error of xBAR.

[...] scores calculate the 95% C.I. for mu = 4(4.5) = 18. We pretend, as before, that we do not know mu. Also, calculate s for these ten scores. Form the 95% t C.I. (degrees of freedom = 10 - 1 = 9)

xBAR +/- t s / root(10).

This interval should have a 95% chance of covering the true mu = 18. Does it cover 18?

I get 10 block sums {20, 23, 20, 14, 21, 26, 7, 26, 16, 9} having sample mean 91/5 = 18.2 (close to the theoretical population mean mu = 4(4.5) = 18) and sample standard deviation s = 6.5963 (only relatively close to the theoretical population standard deviation 2 (2.8723) = 5.7446). The appropriate t score for 95% confidence is (for degrees of freedom 10 - 1 = 9) given by the t-table as t = 2.262. So the 95% t-based CI for the mean of this approximately NORMAL population is [13.4816, 22.9184]. It does indeed cover the true population mean of 18.

8. Repeat (7) for 90% C.I. but only using the first 5 of your ten block sums.

Here df = 5 - 1 = 4 and t = 2.132. The 90% CI is [16.3949, 22.8051].

9. Repeat (7) for your specific row of the table (1 + day of month of your birth).

Real application.

10. Apply method of (1) to a random sample of a real population of your choice.

11. Apply method of (4) to a random sample of a real population of your choice.

12. Apply method (7) to a random sample of a real population of your choice.

Without replacement correction. In each of the two large-n methods we can make a simple adjustment if the sample is instead drawn without replacement. Simply multiply the +/- term by the finite population correction

FPC = root( (N-n) / (N-1) ),

where N denotes the size of the population.
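The t-based intervals quoted in the solutions to 7 and 8 can be reproduced with a minimal Python sketch. The function name t_interval is illustrative, and the t scores 2.262 and 2.132 are the table values from the text, typed in by hand rather than computed.

```python
import math
import statistics

def t_interval(data, t_score):
    """xBAR +/- t * s / root(n), with t read from a t-table at n - 1 df."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)            # sample sd, divides by n - 1
    margin = t_score * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

block_sums = [20, 23, 20, 14, 21, 26, 7, 26, 16, 9]

low95, high95 = t_interval(block_sums, 2.262)       # 9 df, 95%
low90, high90 = t_interval(block_sums[:5], 2.132)   # 4 df, 90%

print(f"[{low95:.4f}, {high95:.4f}]")   # [13.4816, 22.9184]
print(f"[{low90:.4f}, {high90:.4f}]")   # [16.3949, 22.8051]
```

Both intervals cover the theoretical mean mu = 18, in agreement with the solutions.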
It is required that both n and N-n be “large enough.” For example, if we sample n = 100 without replacement from a population of N = 1000 then FPC = root(900 / 999) = 0.949, so the margin of error would be reduced by around 5% due to sampling without replacement. All C.I. will likewise be narrowed by 5%, indicating more precision. On the other hand, if N = 10000 and n = 100 we have FPC = 0.995, which only represents around 1/2 of one percent reduction in margin of error. It is another dividend of using normal approximations that such a simple correction accounts for the complexities of sampling without replacement.

13. Consider row 1 of the table of random numbers to represent 20 consecutive two digit numbers: 15 59 90 68 etc. From them we will derive a without replacement sample of two digit numbers by simply dropping duplicates as we go through. So one of the two 03 entries will be dropped. What is the population mean mu (average of 00 through 99)? Form the large-n 95% C.I. for mu taking into account the fact that this is a without replacement sample. Do you think it has made much of a difference from sampling with replacement?

Here is the data with all two digit numbers from row one:

{15, 59, 90, 68, 92, 90, 83, 03, 85, 08, 89, 54, 10, 51, 66, 77, 64, 15, 03, 42}

Here is the data with duplicates removed:

{15, 59, 90, 68, 92, 83, 03, 85, 08, 89, 54, 10, 51, 66, 77, 64, 42}

We’ll form a 95% large-n CI from these 17 values, which constitute a sample of n = 17 drawn without replacement. The sample mean is xBAR = 56.2353 (fairly close to the population mean 99/2 = 49.5) and the sample sd is s = 30.6217. The FPC = root((N-n)/(N-1)) = root(83/99) = 0.9156, since N = 100 is the size of the population of two digit numbers {00, 01, …, 99} and the sample size without replacement is n = 17. So the FPC makes around 8.5% difference in the width of the CI.
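Here is the 95% CI with the FPC, with n = 17 treated as a large-n sample without replacement:

xBAR +/- 1.96 (s / root(n)) FPC = 56.2353 +/- 1.96 (30.6217 / root(17)) (0.9156) = 56.2353 +/- 13.33 = [42.91, 69.56].

This interval covers the population mean 49.5.

NOTE: THE SOLUTION TO 13 ABOVE IS CORRECTED FROM THE VERSION EMAILED TO YOU.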
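The without-replacement calculation in problem 13 can be recomputed from the listed row of random numbers with a short Python sketch. The function names fpc and large_n_interval_fpc are illustrative. Duplicates are dropped by keeping the first occurrence of each value, which is an assumption about how "dropping duplicates as we go through" is meant.

```python
import math
import statistics

def fpc(N, n):
    """Finite population correction root((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def large_n_interval_fpc(data, N, z=1.96):
    """xBAR +/- z * (s / root(n)) * FPC for a without-replacement sample."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)            # sample sd, divides by n - 1
    margin = z * s / math.sqrt(n) * fpc(N, n)
    return x_bar - margin, x_bar + margin

# The FPC values quoted in the text.
print(f"{fpc(1000, 100):.3f}")    # 0.949
print(f"{fpc(10000, 100):.3f}")   # 0.995
print(f"{fpc(100, 17):.4f}")      # 0.9156

row = [15, 59, 90, 68, 92, 90, 83, 3, 85, 8,
       89, 54, 10, 51, 66, 77, 64, 15, 3, 42]
sample = list(dict.fromkeys(row))   # drop duplicates, keep first occurrences

low, high = large_n_interval_fpc(sample, N=100)
print(len(sample), round(statistics.mean(sample), 4),
      round(statistics.stdev(sample), 4))
print(f"[{low:.4f}, {high:.4f}]")
```

Calling fpc(100, 17) separately from the interval function makes it easy to see how much of the final width is due to the correction alone.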