A Brief History of Statistical Thought - Lecture Slides | MATH 243, Study notes of Probability and Statistics

Material Type: Notes; Class: + Dis >4; Subject: Mathematics; University: University of Oregon; Term: Unknown 2007;


Uploaded on 09/17/2009

koofers-user-f5q-1
A brief history of statistical thought:

There is something we want to measure over a whole population (an average or a proportion). We can’t do an exact measurement (too costly), so we sample to make a guess. Once upon a time, that was all we knew how to do. Then, as the field of statistics developed, we realized that we can theoretically predict, with some level of certainty, how good such a guess is. It is surprisingly good, independent of how large the original population is.

Of course, this theory does not allow us to account for things such as sample bias. (There are more advanced theories which can help, but one’s energy is usually best spent in taking care to get a good sample.)

The last manifestation of these ideas we are developing for this class is that of comparing proportions sampled from two populations. The logic of confidence intervals and hypothesis testing is the same as in previous cases (estimating an average value, the difference between two average values, or a single proportion). As seasoned statisticians, you now just say “tell me the formula for deviation/error/variance.” OK:

$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}.$$

(I will assume we can recall or reconstruct what $n_1$, $n_2$, $\hat{p}_1$ and $\hat{p}_2$ are.)

Example: A poll of 700 people conducted in 2007 showed that 39% of Americans favored universal health care. Recently a poll of 400 people found that 50% favored universal health care. Determine, based on this, whether more people now favor universal health care.

Assuming that the poll in 2007 was perfectly accurate, in the last class we calculated a P-value (that is, the probability of observing 50% or more in a poll of 400 people if the true proportion were 39%) of 0.00000324. But now we can account for uncertainty in both polls simultaneously.

You are working for Senator Snort’s opponent’s campaign.
You choose an SRS of 250 registered voters. Of these, 149 say they intend to vote for Senator Snort. Then you run a series of negative ads featuring Senator Snort’s recent conviction for drunken driving. Now you take a new poll, with an independent sample of size 300, and find that 151 people in this new sample say they intend to vote for Senator Snort. Is Senator Snort’s support now smaller?

Let $p_1$ be Senator Snort’s support before the ads, and let $p_2$ be Senator Snort’s support after the ads.

$H_0$: $p_1 = p_2$. $H_a$: $p_1 > p_2$.

We will use a standardized version of $\hat{p}_1 - \hat{p}_2$. (Important note: There is no “plus four” method here.)

What about the standard error? Recall that everything we calculate in a hypothesis test assumes that the null hypothesis is true. In this case, that means that both samples are effectively from the same population. Let $p$ be the true proportion in this combined population. The null hypothesis says that $p_1 = p_2 = p$. In this case, we get a better estimate of $p$ by combining both samples. Accordingly, let

$$\hat{p} = \frac{\text{number of successes in both samples combined}}{\text{number of individuals in both samples combined}}.$$

This is called the pooled sample proportion. (Do not confuse it with the pooled sample method for comparing two means!)

Since we are assuming that the null hypothesis is true, we use $\hat{p}$ in place of both $\hat{p}_1$ and $\hat{p}_2$ in the formula for the standard error. This gives

$$\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.$$

Accordingly, our test statistic is

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}.$$
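As a minimal sketch, the pooled test statistic above can be computed in a few lines of Python. The function name `two_proportion_z` is made up for this illustration; the counts are the ones from the Senator Snort polls, and the P-value is the upper tail of the standard normal distribution because the alternative is $p_1 > p_2$.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two proportion z statistic for H0: p1 = p2.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    (Hypothetical helper, written only for this illustration.)
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled sample proportion: all successes over all individuals.
    p_hat = (x1 + x2) / (n1 + n2)
    # Standard error under the null hypothesis, using the pooled proportion.
    se = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Senator Snort: 149 of 250 before the ads, 151 of 300 after.
z = two_proportion_z(149, 250, 151, 300)
# One-sided P-value for Ha: p1 > p2 (upper tail of the standard normal).
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

With these counts the sketch gives a z of roughly 2.17 and a one-sided P-value of about 0.015.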
The distributions are sufficiently close to normal that we can use this test statistic whenever both samples have at least 5 successes and at least 5 failures.

Before your ads, 149 out of a sample of 250 said they would vote for Senator Snort. After your ads, 151 out of a sample of 300 said they would vote for Senator Snort.

$H_0$: $p_1 = p_2$. $H_a$: $p_1 > p_2$.

Let’s test at the significance level $\alpha = 0.10$. Both samples have at least 5 successes and at least 5 failures. We discussed the other conditions previously. We have

$$\hat{p}_1 = \frac{149}{250} = 0.596, \qquad \hat{p}_2 = \frac{151}{300} \approx 0.503333,$$

and

$$\hat{p} = \frac{149 + 151}{250 + 300} = \frac{300}{550} \approx 0.545455.$$

Example: How many families have pets?
For unknown reasons, you want to determine what fraction of families in Eugene has at least one dog or cat. You choose an SRS of families in Eugene, and for each one determine whether it has at least one dog or cat. Which procedure do you use?

This problem asks about one proportion, so use a one proportion z procedure. Note that this problem calls for a confidence interval.

Example: Math 243 review sessions
Do Math 243 students who attend the review session do better on the final exam? (We won’t worry for now about why their performances differ; there could be many reasons, such as differences in motivation, differences in how well the students know the material, and the effects of the review session itself.)
We choose an SRS of Math 243 students who attended the review session, and an SRS of Math 243 students who didn’t attend the review session, and compare the mean final exam scores of the two groups. Which procedure do you use?

This problem asks you to compare two means, but no information about the standard deviations is given, so use a two sample t procedure. Note that this problem calls for a hypothesis test. Let $\mu_1$ be the mean final exam score of all Math 243 students who attended the review session, and let $\mu_2$ be the mean final exam score of all Math 243 students who didn’t attend the review session.

$H_0$: $\mu_1 = \mu_2$. $H_a$: $\mu_1 > \mu_2$.

Example: Repeating the Math SAT, version 1
Do people taking the Math SAT a second time do better? We choose an SRS of people taking the Math SAT a second time. We assume that the standard deviation of scores of second time takers is the same as for all people taking the Math SAT, namely 114. We compute the mean Math SAT score on their second time of the people in the sample, and compare it with the mean score of all Math SAT test takers, which is 518. Which procedure do you use?

This problem asks you to examine one mean, with a supposedly known standard deviation, so use a one sample z procedure. Note that this problem calls for a hypothesis test. Let $\mu$ be the mean score on the second attempt of all people taking the Math SAT a second time.

$H_0$: $\mu = 518$. $H_a$: $\mu > 518$.

Note: This is not a good way to investigate the original question!
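To make the one sample z procedure concrete, here is a small sketch in Python. The values 518 and 114 come from the example; the sample size and sample mean are hypothetical, since the slides give no data, and the helper name `one_sample_z` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x_bar, mu0, sigma, n):
    """z statistic for H0: mu = mu0 when sigma is treated as known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# mu0 = 518 and sigma = 114 come from the example.
# The sample size and sample mean are hypothetical, for illustration only.
n = 100
x_bar = 540.0
z = one_sample_z(x_bar, mu0=518, sigma=114, n=n)
# Upper tail of the standard normal, because Ha is mu > 518.
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

The same function serves any one sample z test; only the direction of the tail changes with the alternative hypothesis.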
Example: Repeating the Math SAT, version 2
Do people taking the Math SAT a second time do better? We choose an SRS of people taking the Math SAT a second time. Unlike the previous example, we make no assumption about the standard deviation of scores of second time takers of the Math SAT. We compute the mean Math SAT score on their second time of the people in the sample, and compare it with the mean score of all Math SAT test takers, which is 518. Which procedure do you use?

This problem asks you to examine one mean, with unknown standard deviation, so use a one sample t procedure. Note that this problem calls for a hypothesis test. Let $\mu$ be the mean score on the second attempt of all people taking the Math SAT a second time.

$H_0$: $\mu = 518$. $H_a$: $\mu > 518$.

Note: This is not a good way to investigate the original question!

Example: Repeating the Math SAT, version 3
Do people taking the Math SAT a second time do better? We choose an SRS of people taking the Math SAT a second time. We compare their Math SAT scores on the second attempt with their Math SAT scores on the first attempt. Which procedure do you use?

This problem involves the comparison of two means, but we do not have two independent samples.
Instead, for each individual, we subtract the Math SAT score on the first attempt from the Math SAT score on the second attempt, and do a one sample t procedure on the differences. Taken together, this is a matched pairs t procedure. This problem calls for a hypothesis test. Let $\mu$ be the mean improvement in the Math SAT score the second time it is taken.

$H_0$: $\mu = 0$. $H_a$: $\mu > 0$.

Note: This is the right way to investigate the original question.

Example: Playing computer games
A researcher chooses an SRS of 12 year old children, and finds what fraction of them have played the computer game “Mars Invaders”. Meanwhile, on Mars, a different researcher chooses an SRS of 12 year old Martian children, and finds what fraction of them have played the computer game “Earth Invaders”. By means which we won’t go into, you have obtained both sets of data. You want to know if there is a difference in the popularity of the two games. Which procedure do you use?

This problem involves the comparison of two proportions, so we use the two proportion z procedure. Note that this problem calls for a hypothesis test. Let $p_1$ be the proportion of 12 year old Earth children who have played “Mars Invaders”, and let $p_2$ be the proportion of 12 year old Martian children who have played “Earth Invaders”.

$H_0$: $p_1 = p_2$. $H_a$: $p_1 \neq p_2$.
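Looking back at version 3 of the Math SAT example, the matched pairs t procedure can be sketched as follows. The scores are hypothetical (the slides give no data) and the helper name `matched_pairs_t` is made up for this illustration. The sketch only computes the t statistic and its degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def matched_pairs_t(first, second):
    """Return (t, degrees of freedom) for H0: mean difference = 0."""
    # Difference for each individual: second attempt minus first attempt.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # One sample t statistic computed on the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for five test takers (not from the slides).
first_attempt = [500, 520, 480, 610, 550]
second_attempt = [510, 540, 470, 630, 570]
t, df = matched_pairs_t(first_attempt, second_attempt)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The P-value for $H_a$: $\mu > 0$ would then be the upper tail of the t distribution with that many degrees of freedom, read from a t table or, if SciPy is available, computed with `scipy.stats.t.sf(t, df)`.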