Confidence Intervals and Hypothesis Testing for Sample Proportions and Means, Exams of Political Science

This document from a university economics lecture covers the concepts of confidence intervals and hypothesis testing for sample proportions and means. It explains how to calculate confidence intervals for population proportions and means using normal and t distributions, and discusses the importance of sample size and level of confidence. The document also covers hypothesis testing, including formulating hypotheses, calculating test statistics, and determining significance levels.

Typology: Exams

Pre 2010

Uploaded on 08/30/2009

Oct. 14, 2008 LEC #6 ECON 240A-1 L. Phillips
Interval Estimation and Hypothesis Testing

I. Introduction

From a simple random sample of voters we obtain a sample proportion of the voters supporting a candidate, but we do not know the proportion for the entire population of voters. Can we say that this population proportion lies in a specified interval with some likelihood? Of course, the candidate hopes this interval is above 0.5 and that the likelihood is nearly certain.

Recall that our series of monthly rates of return for the UC stock index consisted of values that were independent of one another. From these data we can calculate a sample mean, but we do not know the mean rate of return for the underlying population generating these monthly observations. Once again, can we say the population mean lies in some interval and, if we used this procedure many times, what fraction of the time would we be right?

II. Confidence Intervals for Sample Proportions and Population Proportions

Suppose a simple random sample of 1000 California Democratic likely voters produces a sample proportion of 0.52 who support Obama. Does the interval for the population proportion lie above 0.50? What fraction of the time, if we used this procedure again and again, would this interval be correct? We know that the number of supporters in the sample is distributed binomially and, since this is a large sample, we can approximate the distribution of the sample proportion with the normal distribution. We also know that for the normal distribution, approximately 68 percent of the observations lie within one standard deviation of the mean, and 95 percent lie within plus or minus 1.96 standard deviations of the mean, i.e.

$$P[-1.96 \le (\hat{p} - p)/\sigma(\hat{p}) \le 1.96] \approx 0.95, \qquad (1)$$

where $\sigma(\hat{p}) = \sqrt{p(1-p)/n}$.
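Equation (1) can be checked numerically. The sketch below is not part of the original lecture: it compares the normal probability of falling within plus or minus 1.96 standard deviations with the exact binomial probability that the sample proportion falls within $1.96\,\sigma(\hat{p})$ of $p$. The value p = 0.5 and all function names are assumptions chosen for illustration; n = 1000 matches the sample size used above.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sigma_p_hat(p, n):
    """Standard deviation of the sample proportion, sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1.0 - p) / n)

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def exact_coverage(p, n, z=1.96):
    """Exact P[-z <= (p_hat - p)/sigma(p_hat) <= z] for a binomial count."""
    half_width = z * sigma_p_hat(p, n)
    return sum(binomial_pmf(k, n, p)
               for k in range(n + 1)
               if abs(k / n - p) <= half_width)

n = 1000   # sample size from the text
p = 0.5    # assumed population proportion, for illustration only

normal_prob = normal_cdf(1.96) - normal_cdf(-1.96)
binomial_prob = exact_coverage(p, n)

print(f"sigma(p_hat)         = {sigma_p_hat(p, n):.4f}")
print(f"normal probability   = {normal_prob:.4f}")
print(f"binomial probability = {binomial_prob:.4f}")
```

Because the binomial count is discrete, the exact probability is close to, but not exactly, 0.95, which is why Eq. (1) is stated as an approximation.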
With some manipulation (multiply the inequality by $\sigma(\hat{p})$ and add $p$ to each part) we can restate this as:

$$P[p - 1.96\,\sigma(\hat{p}) < \hat{p} < p + 1.96\,\sigma(\hat{p})] \approx 0.95. \qquad (2)$$

This means that this sampling procedure, if used repeatedly, would produce sample proportions that lie within plus or minus 1.96 standard deviations of the population proportion 95% of the time. Alternatively, Eq. (1) can be expressed as:

$$P[\hat{p} - 1.96\,\sigma(\hat{p}) < p < \hat{p} + 1.96\,\sigma(\hat{p})] \approx 0.95, \qquad (3)$$

(which can be obtained by multiplying the inequality in Eq. (1) by $\sigma(\hat{p})$, subtracting $\hat{p}$, and multiplying by minus one, which reverses the inequality). The expression in Eq. (3) provides an interval that, under repeated sampling, will contain the true population proportion 95% of the time. However, since we do not know the population proportion $p$, we use the sample proportion, $\hat{p}$, to calculate the standard deviation:

$$s(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}, \qquad (4)$$

for use in Eq. (3):

$$P[\hat{p} - 1.96\,s(\hat{p}) < p < \hat{p} + 1.96\,s(\hat{p})] \approx 0.95. \qquad (5)$$

As a numerical example, take the sample proportion of 0.52 for a sample of 1000. The standard deviation is:

$$s(\hat{p}) = \sqrt{0.52(1-0.52)/1000} = 0.0158. \qquad (6)$$

The 95% confidence interval for the population proportion of California voters supporting Obama is:

[...]

critical level, i.e. a willingness to bear the burden of a probability of rejecting the null hypothesis, even if it were true, as high as 5%, is designated $\alpha$.

An example of a hypothesis test can be formulated from our virtual sample of one thousand California voters, 520 of whom support Mr. Obama. The null hypothesis is that the true population proportion is 0.5, i.e.

$$H_0: p = 0.5, \qquad (13)$$

while the alternative hypothesis is that the true population proportion, $p$, lies above 0.5,

$$H_a: p > 0.5. \qquad (14)$$

The test statistic, $z$, is the sample proportion, $\hat{p}$, minus its expected value under the null hypothesis, i.e.
the population proportion, $p$, divided by the standard deviation of the sample proportion, $\sigma(\hat{p}) = \sqrt{p(1-p)/n}$:

$$z = (\hat{p} - p)/\sigma(\hat{p}) = (0.52 - 0.5)/\sqrt{(0.5)(0.5)/1000} = 0.02/0.0158 = 1.27. \qquad (15)$$

Using the normal distribution, the probability of getting a value of $z$ greater than 1.27 is 0.102. For a significance criterion $\alpha$ of 5%, this probability of 0.102 is larger, so we would accept the null hypothesis of $p = 0.5$, and the camp supporting Obama would be nervous unless the sample proportion goes to 0.53 in the next poll. Then the probability of getting a value of $z$ greater than 1.90 would be 0.029, or about 3%, less than our critical level of 5%, and we would reject the null hypothesis in favor of the alternative that the true population proportion is greater than 0.5, which would indicate that Obama will win.

V. Hypothesis Test for a Sample Mean

Although we know that the interval for the mean monthly rate of return for the population includes zero for our sample of 12 monthly returns for the UC stock index fund, for practice we can test the hypothesis that the population mean is zero, i.e.

$$H_0: \mu = 0, \qquad (16)$$

against the alternative that the mean differs from zero,

$$H_a: \mu \ne 0. \qquad (17)$$

The test statistic is the t-value, equal to the sample mean, $\bar{r}$, minus the population mean under the null hypothesis, $\mu$, divided by the estimate of the standard deviation of the sample mean, $s/\sqrt{n}$:

$$t = (\bar{r} - \mu)/(s/\sqrt{n}) = (1.61 - 0)/(4.04/\sqrt{12}) = 1.61/1.166 = 1.38. \qquad (18)$$

If we choose a critical level $\alpha$ of 5%, for 11 degrees of freedom, $t_{0.025} = 2.2$. Since our observed t statistic is only 1.38, we cannot reject the null hypothesis that the mean monthly rate of return is zero. This does not jibe with our expectation that you can make money with stock index funds. Perhaps a larger sample would decrease the standard deviation of $\bar{r}$ and provide more precision for this test.

VI. Decision Theory

Life is full of tradeoffs and hypothesis testing is another example.
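Before weighing those tradeoffs, the two tests worked above, Eq. (15) for the proportion and Eq. (18) for the mean, can be reproduced with a short sketch. It is an illustration rather than part of the lecture: the function names are assumptions, the proportion test is treated as an upper-tail test (the probability of a z value greater than the one observed), and the t critical value of 2.2 is simply the figure quoted in the text, not computed.

```python
import math

def upper_tail(z):
    """P[Z > z] for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_statistic(p_hat, p0, n):
    """Eq. (15): z for a sample proportion under the null p = p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)

def t_statistic(sample_mean, mu0, s, n):
    """Eq. (18): t for a sample mean under the null mu = mu0."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

alpha = 0.05  # critical level used in the text

# Proportion test: 1000 voters, sample proportions 0.52 and 0.53.
z_52 = z_statistic(0.52, 0.5, 1000)
z_53 = z_statistic(0.53, 0.5, 1000)
p_value_52 = upper_tail(z_52)
p_value_53 = upper_tail(z_53)

print(f"p_hat = 0.52: z = {z_52:.2f}, P[Z > z] = {p_value_52:.3f}, "
      f"reject null: {p_value_52 < alpha}")
print(f"p_hat = 0.53: z = {z_53:.2f}, P[Z > z] = {p_value_53:.3f}, "
      f"reject null: {p_value_53 < alpha}")

# Mean test: 12 monthly returns, sample mean 1.61, sample std. dev. 4.04.
t_value = t_statistic(1.61, 0.0, 4.04, 12)
t_critical = 2.2  # t_0.025 with 11 degrees of freedom, as quoted in the text
reject_mean_null = abs(t_value) > t_critical

print(f"t = {t_value:.2f}, critical value = {t_critical}, "
      f"reject null: {reject_mean_null}")
```

Unrounded arithmetic gives z of about 1.26 for the first poll rather than 1.27, because the text rounds the standard deviation to 0.0158 before dividing; the conclusion is unchanged.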
Consider the null hypothesis that the proportion of the population of voters supporting Hillary is 0.5 (more than 0.5 would have an equivalent consequence for the supporters' aspirations to carry California with a majority for this candidate) versus the alternative hypothesis that this proportion is less than 0.5. We do not know the true state of affairs before the election is held, which is what we are trying to guess, but consider two possibilities. One possibility is that the proportion is 0.5 (or more). The second possibility is that the proportion is less than 0.5. Table 1 cross-classifies four possible outcomes depicting our decision to accept or reject the null hypothesis versus the true (but unknown to us) state of affairs (state of nature).

Our decision has two choices: accept or reject the null hypothesis. If we accept the null and it happens the null is true (a fact as yet unknown to us), then no problem. If we accept the null and the true state of affairs is that the null is false, then we make what is called a type II error. So there are two possible errors: accept the null when it is false, or reject the null when it is true. This latter error is called a type I error.

Recall that step three of our hypothesis test procedure answered the question, “if the null were true, then what is the probability of observing a test statistic at least as extreme as the one observed?” The fourth step was to choose a critical significance level $\alpha$, such as one percent, to compare to this probability from step three. So steps three and four focus on a type I error, rejecting the null when it is true. We reject the null only if the test statistic, for example a z value or a t value, would have such an extreme value by chance only one percent of the time or less.

Table 1. Decision Theory and Two Types of Error
---------------------------------------------------------------------------------
                        Decision
True State      Accept null       Reject null
p = 0.5         No Error          Type I Error
p < 0.5         Type II Error     No Error
---------------------------------------------------------------------------------

[...]

$$z = (0.536 - p)/\sqrt{p(1-p)/1000} = (0.536 - 0.540)/\sqrt{(0.54)(0.46)/1000},$$

$$z = -0.004/0.0158 = -0.253. \qquad (21)$$

Using the cumulative distribution function of the normal distribution,

$$F(z) = F(-0.253) = 0.40, \qquad (22)$$

i.e. the probability of a type II error, $\beta$, is 40% if $p = 0.54$. The values of $\beta$ calculated for various what-if scenarios of the true proportion of California voters supporting this proposition, $p$, are listed in Table 2. A plot of $\beta$ versus $p$ is called the operating characteristic curve and is shown in Figure 3. The corresponding plot of $1 - \beta$ versus $p$ is called the power function and is shown in Figure 4.

[Figure 2: The Probability of a Type II Error = 40%. Densities of the number of voters supporting Davis for p = 0.50 and p = 0.54. Decision rule: reject the null if voters > 536; alpha = 1%, beta = 40%.]

Table 2: Probability of Type II Error Versus Population Proportion

True Population
Proportion, p      z value*     F(z) = beta     1 - beta
0.51                1.64        0.950           0.050
0.52                1.01        0.844           0.156
0.53                0.38        0.648           0.352
0.54               -0.25        0.400           0.600
0.55               -0.89        0.187           0.813
0.56               -1.53        0.063           0.937
0.57               -2.17        0.015           0.985
0.58               -2.82        0.002           0.998
0.59               -3.47        0.000           1.000
0.60               -4.13        0.000           1.000

* z = (0.536 - p)/sqrt[p*(1-p)/1000]

[Figure 3: Operating Characteristic Curve. Beta versus presumed population proportion, p.]

[Figure 4: Power Function of the Test. 1 - beta versus supposed population proportion, p, with the ideal power function shown for comparison.]
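The entries of Table 2, and hence the points plotted in Figures 3 and 4, can be recomputed from the footnote formula. The sketch below is an illustration under the assumptions stated in the text: a rejection cutoff of 0.536 (536 of 1000 voters), a sample of 1000, and beta taken as the normal cumulative probability F(z). The function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function, F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def type_ii_error(p_true, cutoff=0.536, n=1000):
    """Return (z, beta) with z = (cutoff - p)/sqrt(p(1-p)/n) and beta = F(z)."""
    z = (cutoff - p_true) / math.sqrt(p_true * (1.0 - p_true) / n)
    return z, normal_cdf(z)

# What-if values of the true proportion, 0.51 through 0.60 as in Table 2.
rows = []
for hundredths in range(51, 61):
    p_true = hundredths / 100
    z, beta = type_ii_error(p_true)
    rows.append((p_true, z, beta, 1.0 - beta))

print(f"{'p':>5} {'z':>7} {'beta':>7} {'1-beta':>7}")
for p_true, z, beta, power in rows:
    print(f"{p_true:5.2f} {z:7.2f} {beta:7.3f} {power:7.3f}")
```

Plotting the third column against p gives the operating characteristic curve, and plotting the fourth column against p gives the power function.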