Estimating Political Approval Ratings: Understanding Percentages and Confidence Intervals, Study notes of Statistics

This document, from a UC Berkeley statistics lecture delivered by David Shilane on April 10, 2007, discusses how to estimate political approval ratings as percentages. The lecture covers why the accuracy of an estimated percentage matters, what a confidence interval is, and how to calculate one. It uses Tarja Halonen, the president of Finland, as a running example.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

Partial preview of the text

The Accuracy of Percentages

David Shilane
Lecture 17, Statistics 20
University of California, Berkeley
Tuesday, April 10th, 2007

We’re often interested in estimating a percentage. Some examples include:

• Baseball batting averages
• The proportion of customers who buy an item during a sale
• The risk of obtaining a disease
• Political approval rates

The usual statistical technique for estimating a percentage is to sample data and calculate the proportion of outcomes we’re interested in. However, because data are random, and we can usually collect only a small amount of it, the question is: how accurate are our estimates?

Estimating Tarja’s Approval Rating

We can define a politician’s approval rating to be the proportion of constituents who approve of the politician. Thought of another way, the approval rating is the probability that a randomly selected constituent will approve of the politician.

Sometimes approval is measured in different ways: it can be the proportion of people who intend to vote for the politician, who approve of actions relating to an issue, or even just whether people like the person or not.

Unfortunately, people respond differently depending upon what question is asked and even how it’s delivered. Therefore, it is important to remember that the results obtained from a poll are with respect to a particular question, and we should be hesitant to generalize results for one question to answer another.

Types of Surveys

Do you approve of Tarja Halonen?
Yes   No

How strongly do you approve of Tarja Halonen?
1   2   3   4   5   6

For the latter survey, we might say people approve of Tarja if they responded with at least 4, and disapprove otherwise.

The Data

In the simplest case, we would draw $n$ names out of a hat with replacement and ask each person to complete the survey. Then our data are:

$$X_i = \begin{cases} 1 & \text{if person } i \text{ approves} \\ 0 & \text{if person } i \text{ disapproves} \end{cases} \qquad 1 \le i \le n$$

Our quantity of interest is the approval rating $P(X_i = 1) = p$, where $p$ is an unknown number we wish to estimate. We can do so by taking the empirical proportion $\hat{p}$ of people who approve, which is also the sample mean of $n$ Binomial$(1, p)$ random variables:

$$\hat{p} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

Expected Value and Variance for a Single Coin Flip

• $E[X] = 1 \cdot p + 0 \cdot (1 - p) = p$
• $\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \left(1^2 \cdot p + 0^2 \cdot (1 - p)\right) - p^2 = p - p^2 = p(1 - p)$
• $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{p(1 - p)}$

Expected Value and Variance of the Sample Mean

• $E[\bar{X}] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} E[X_1 + \cdots + X_n] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n} [p + \cdots + p] = \frac{np}{n} = p$
• $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2} [p(1 - p) + \cdots + p(1 - p)] = \frac{n}{n^2} p(1 - p) = \frac{p(1 - p)}{n}$
• $\mathrm{SD}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sqrt{\frac{p(1 - p)}{n}}$

Note: $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$ always, but $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$ only when $X_1, \ldots, X_n$ are uncorrelated (which they are by independence).

Mean and Variance for Tarja’s Estimated Approval Rating

• Mean: $\hat{p} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
• Variance: $\frac{\hat{p}(1 - \hat{p})}{n}$
• SD: $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

Finding an Interval

In order to find a 95% confidence interval, we need to backsolve the standard units equation:

$$z_{0.95} = \frac{X - \bar{X}}{\mathrm{SD}(X)} \;\Rightarrow\; z_{0.95}\,\mathrm{SD}(X) = X - \bar{X} \;\Rightarrow\; X_{\text{right}} = \bar{X} + z_{0.95}\,\mathrm{SD}(X)$$

This gives the right endpoint.
Then we just plug in $-z_{0.95}$ to find the left endpoint:

$$X_{\text{left}} = \bar{X} - z_{0.95}\,\mathrm{SD}(X)$$

Then, for any Normal random variable $X$, a 95% confidence interval is given by:

$$(X_{\text{left}}, X_{\text{right}}) = \bar{X} \pm z_{0.95}\,\mathrm{SD}(X)$$

A 95% Confidence Interval for the Sample Mean

Remember that $E[\hat{p}] = p \approx \hat{p}$. We would prefer to fill in the true expected value, but since we don’t know $p$, the best we can do is fill in our estimate $\hat{p}$ in its place. Likewise,

$$\mathrm{SD}(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}} \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Therefore, we plug in these values for the mean and SD to find a 95% confidence interval for the sample mean as:

$$(X_{\text{left}}, X_{\text{right}}) = \hat{p} \pm z_{0.95} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \bar{X} \pm 1.96 \sqrt{\frac{\bar{X}(1 - \bar{X})}{n}}$$

Our Old Friend the Box Model

If you’re following along in Chapter 21 of the Freedman-Pisani-Purves text, then it’s perfectly equivalent to use the following Box Model to produce your results:

1. Start with a box containing tickets labeled with 0’s and 1’s. The proportion of 1’s is $p$, and the proportion of 0’s is $1 - p$.
2. Calculate the mean and SD of the box.
3. Find the standard deviation of the approval rating by dividing the SD of the box by $\sqrt{n}$.
4. Then, since we don’t actually know $p$, estimate these numbers using the empirical mean $\hat{p}$ and SD $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.
5. Construct a 95% confidence interval using the formula $\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.

Repeating the Experiment

Now let’s pretend that we know Tarja Halonen’s approval rating is exactly 0.53. If we conduct a large number of polls, what proportion of 95% confidence intervals will contain her true approval?

I performed this experiment a total of 10000 times by simulating random numbers on a computer, which took about 4 seconds to run.
I ultimately found that 9473 of the experiments generated confidence intervals containing the truth, so a proportion of 0.9473 of all 95% confidence intervals contained the true value. Is this a reasonable proportion? Let’s make another confidence interval...

We now have $n = 10000$ experiments, and on each one the confidence interval we generated either contained the value 0.53 or it didn’t. For $1 \le i \le n$, the data are of the form

$$Y_i = \begin{cases} 1 & \text{if experiment } i\text{’s CI contains } 0.53 \\ 0 & \text{otherwise} \end{cases}$$

Because we were generating 95% confidence intervals, our assumption is that $P(Y_i = 1) = p = 0.95$. We can validate this assumption if the 95% CI for $\hat{p}$ contains 0.95. This confidence interval is:

$$\hat{p} \pm z_{0.95} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.9473 \pm 1.96 \sqrt{\frac{0.9473(1 - 0.9473)}{10000}} = (0.9429, 0.9517)$$

Therefore, the experiment produced a reasonable result that appears to validate the notion that roughly 95% of all confidence intervals will contain the true value.
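
The repeated-polling experiment can be sketched in a few lines of Python. This is only an illustration, not the lecturer's own code: the function names `proportion_ci` and `coverage`, the seed, and the per-poll sample size of 400 are all assumptions, since the preview does not state how many people each simulated poll sampled. The exact coverage fraction will therefore differ from the 0.9473 reported above.

```python
import math
import random

Z_95 = 1.96  # multiplier the lecture uses for a 95% interval


def proportion_ci(successes, n, z=Z_95):
    """Return (left, right) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_p, n_per_poll, num_polls, seed=0):
    """Simulate polls; return the fraction of intervals that contain true_p."""
    rng = random.Random(seed)  # hypothetical seed, only for repeatability
    hits = 0
    for _ in range(num_polls):
        # Each respondent approves independently with probability true_p.
        approvals = sum(rng.random() < true_p for _ in range(n_per_poll))
        left, right = proportion_ci(approvals, n_per_poll)
        if left <= true_p <= right:
            hits += 1
    return hits / num_polls


if __name__ == "__main__":
    # n_per_poll=400 is an assumed value; the source does not give it.
    frac = coverage(true_p=0.53, n_per_poll=400, num_polls=10000)
    print(f"fraction of intervals containing 0.53: {frac:.4f}")

    # Second-level check from the lecture: an interval for the coverage itself,
    # using the reported 9473 successes out of 10000 experiments.
    left, right = proportion_ci(9473, 10000)
    print(f"interval for the coverage: ({left:.4f}, {right:.4f})")
```

The same `proportion_ci` helper serves both levels of the argument: once per simulated poll, and once more on the count of intervals that captured the truth, which reproduces the (0.9429, 0.9517) interval computed above.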