Notes on Confidence Intervals and Hypothesis Tests | STA 281, Study notes of Statistics

Material Type: Notes; Class: PROB&STAT INTER COMP TEC; Subject: Statistics; University: University of Kentucky; Term: Fall 2002;

Uploaded on 10/01/2009 by koofers-user-1us
Confidence Intervals and Hypothesis Tests
STA 281 Fall 2002

1 Background

The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes. In particular, if $X_1, \dots, X_n$ are independent and identically distributed (iid) with mean $E[X_i] = \mu$ and variance $V[X_i] = \sigma^2$ AND $n$ is large, then

$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$

For the remainder of this handout, all sample sizes should be assumed to be greater than 30 so this result holds. A special case of this result is that if $X_1, \dots, X_n \sim \mathrm{Bern}(p)$, then

$$\hat{p} \sim N\!\left(p, \frac{p(1-p)}{n}\right)$$

Results for two samples may be obtained using the formulas for linear combinations of normal distributions. Thus, if $X_1, \dots, X_{n_X}$ are iid with mean $\mu_X$ and variance $\sigma_X^2$ while $Y_1, \dots, Y_{n_Y}$ are iid with mean $\mu_Y$ and variance $\sigma_Y^2$ (and of course the $X$ and $Y$ samples are independent), then

$$\bar{X} - \bar{Y} \sim N\!\left(\mu_X - \mu_Y, \frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}\right)$$

This also has a special case for proportions. If $X_1, \dots, X_{n_X} \sim \mathrm{Bern}(p_X)$ and $Y_1, \dots, Y_{n_Y} \sim \mathrm{Bern}(p_Y)$, then

$$\hat{p}_X - \hat{p}_Y \sim N\!\left(p_X - p_Y, \frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}\right)$$

These formulas are the four fundamental results that motivate all of the confidence interval and hypothesis testing theory we will investigate in this course.

2 What are confidence intervals and hypothesis tests?

Inference is the use of data to draw conclusions about population parameters. Probability theory assumes we have $X_1, \dots, X_n \sim \mathrm{Bern}(0.4)$ and then specifies the likelihood of generating 0 through $n$ successes. Thus probability theory assumes we know the parameter $p$ and specifies how our data should appear. Inference is concerned with the reverse problem. We already have $X_1, \dots, X_n \sim \mathrm{Bern}(p)$, but we don't know $p$. Our goal is to use the data to determine $p$. The first thing to note is that we will NEVER be able to determine $p$ exactly with only a finite amount of data.
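The first of the four fundamental results can be checked empirically. The sketch below (plain Python, standard library only; the sample size, replication count, and seed are illustrative choices, not from the handout) draws many samples of $n = 1000$ Bernoulli(0.4) trials and compares the simulated distribution of $\hat{p}$ to the CLT prediction $N(0.4,\ 0.4 \cdot 0.6 / 1000)$:

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

p, n, reps = 0.4, 1000, 2000  # true proportion, sample size, replications

# Draw `reps` samples of size n and record the sample proportion of each.
p_hats = []
for _ in range(reps):
    successes = sum(1 for _ in range(n) if random.random() < p)
    p_hats.append(successes / n)

# The CLT says p_hat ~ N(p, p(1-p)/n), so the simulated p_hats should be
# centered near p with standard deviation near sqrt(p(1-p)/n) ~ 0.0155.
print(statistics.mean(p_hats))     # should be close to 0.4
print(statistics.stdev(p_hats))    # should be close to the CLT prediction
print(math.sqrt(p * (1 - p) / n))  # the CLT prediction itself
```

Increasing `n` shrinks the spread of the simulated $\hat{p}$ values, matching the $\sigma^2/n$ variance in the theorem.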
Suppose $n = 1000$ and we observe $X_1, \dots, X_n \sim \mathrm{Bern}(p)$ has 800 successes. What is $p$? Unfortunately, no value of $p$ in $(0, 1)$ can be completely excluded based on this data. It is possible to see the observed data when $p = 0.01$ (not likely, but possible). For any value of $p$ in $(0, 1)$, the observed data is possible. Thus, we are forced to make probabilistic statements about $p$. Intuitively, while $p = 0.01$ cannot be excluded, our observed data (800 successes in 1000 trials) is so unlikely when $p = 0.01$ that for all practical purposes we can exclude $p = 0.01$. These are the kinds of inferences we will pursue.

We will focus on making two kinds of inferences, confidence intervals and hypothesis tests, in several scenarios. Not coincidentally, these scenarios correspond to the situations where we applied the central limit theorem in section 1. Specifically, we will make inferences on means for one or two random samples, and on proportions for one or two random samples. The two kinds of inferences correspond to two common questions asked in scientific experiments. The first, confidence intervals, answers the question "I have no idea what $\mu$ ($p$) is, how do I use the data to estimate it?" The second, hypothesis tests, answers the question "I have a specific value of $\mu$ ($p$) in mind. Is the data consistent with that particular value of $\mu$ ($p$)?"

3 Point Estimates (our best guesses)

Fundamental to answering both these questions is the notion of a point estimate. A point estimate takes the observed data and produces a single value (or guess) of the parameter. Returning to our example where we had 1000 Bernoulli trials and observed 800 successes, we have already established we are not pleased with $p = 0.01$. If we had to guess a single number, what would we guess? The most common choice is $\hat{p}$, which in this example is $800/1000 = 0.8$. This guess is justified by the central limit theorem, which states that the expected value of $\hat{p}$ is $p$.
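The claim that 800 successes are possible but absurdly unlikely under $p = 0.01$ can be made concrete by computing the binomial probability of exactly 800 successes in 1000 trials. A small sketch (standard-library Python; the log-gamma trick for the binomial coefficient is a numerical device of this example, not something the handout discusses):

```python
import math

def log10_binom_pmf(n, k, p):
    """Base-10 log of P(exactly k successes in n Bernoulli(p) trials)."""
    # log C(n, k) via log-gamma, since C(1000, 800) overflows an ordinary float
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_pmf = log_choose + k * math.log(p) + (n - k) * math.log(1 - p)
    return log_pmf / math.log(10)

# 800 successes in 1000 trials is possible under p = 0.01, but its probability
# is on the order of 10^-1385: nonzero, yet excludable for practical purposes.
print(log10_binom_pmf(1000, 800, 0.01))

# Under p = 0.8 the same outcome is entirely ordinary (probability ~0.03).
print(log10_binom_pmf(1000, 800, 0.8))
```

The gap between the two numbers is the intuition behind the inferences the handout pursues: no value of $p$ is impossible, but some are incomparably less plausible than others.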
While $\hat{p}$ may not be equal to $p$ in any particular situation, $\hat{p}$ has a distribution that is centered around the true value. Thus, if we are estimating a proportion $p$, we estimate it with $\hat{p}$. For a mean $\mu$, use $\bar{X}$. These extend to the two sample case, so the difference of two proportions $p_X - p_Y$ is estimated by $\hat{p}_X - \hat{p}_Y$ and the difference of two means $\mu_X - \mu_Y$ is estimated by $\bar{X} - \bar{Y}$. Not coincidentally, the center of the distributions of all these guesses is the quantity we are trying to guess.

4 Confidence Intervals

Our best guess is a good start for inference, but it isn't ideal. Specifically, our best guess is basically guaranteed to be wrong. If $X_1, \dots, X_n \sim N(0, 1)$, then $\bar{X} \sim N(0, 1/n)$, a continuous distribution. Although the distribution of $\bar{X}$ is centered at $\mu = 0$, the probability that $\bar{X}$ will exactly equal 0 is 0. OK, that doesn't sound great, but it's not so terrible. While $\bar{X}$ might not be exactly correct, its key advantage is that it should be close to $\mu$, and the larger the sample size, the closer to $\mu$ it should be (this can be observed by noting the variance of $\bar{X}$, $\sigma^2/n$, tends to 0 as $n$ increases). In fact, the central limit theorem allows us to quantify just how close our point estimates should be to the correct answers. In general, a confidence interval is

$$\text{best guess} \pm z^*_{\alpha/2} \,(\text{std dev of best guess})$$

Thus, for each situation, the only thing to do is find the best guess, and then use the central limit theorem to compute the standard deviation of that best guess.
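The general recipe can be written as a few lines of code. The sketch below (plain Python; the 95% critical value 1.96 and the plug-in of $\hat{p}$ for $p$ inside the standard deviation are standard choices of this example, not steps spelled out in the text above) applies "best guess $\pm z^*_{\alpha/2}$ (std dev of best guess)" to the running 800-out-of-1000 example:

```python
import math

def proportion_ci(successes, n, z_star=1.96):
    """best guess +/- z* (std dev of best guess), for a single proportion.

    z_star = 1.96 corresponds to a 95% confidence level.
    """
    p_hat = successes / n  # the point estimate (best guess)
    # CLT: p_hat ~ N(p, p(1-p)/n); plug in p_hat for the unknown p.
    std_dev = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * std_dev, p_hat + z_star * std_dev

# The running example: 800 successes in 1000 Bernoulli trials.
lo, hi = proportion_ci(800, 1000)
print(round(lo, 4), round(hi, 4))  # roughly 0.7752 to 0.8248
```

The same pattern covers the other three scenarios from section 1: only the best guess and its standard deviation formula change.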