INFERENCE FOR SIMPLE OLS

Model Assumptions ("The" Simple Linear Regression Model, Version IV):

(We consider $x_1, \dots, x_n$ as fixed.)

1. $E(Y|x) = \eta_0 + \eta_1 x$ (linear mean function)
2. $\mathrm{Var}(Y|x) = \sigma^2$ (equivalently, $\mathrm{Var}(e|x) = \sigma^2$) (constant variance)
3. $y_1, \dots, y_n$ are independent observations. (independence)
4. (NEW) $Y|x$ is normal for each $x$. (normality)

(1) + (2) + (4) can be summarized as: $Y|x \sim N(\eta_0 + \eta_1 x, \sigma^2)$

Recall: $e|x = Y|x - E(Y|x)$
So: $e|x \sim N(0, \sigma^2)$
i.e., all errors have the same distribution, so we just say $e$ instead of $e|x$.

Since $\hat\eta_0$ and $\hat\eta_1$ are linear combinations of the $Y|x_i$'s, (3) + (4) imply that $\hat\eta_0$ and $\hat\eta_1$ (that is, their sampling distributions) are normally distributed.

Recalling that

$$E(\hat\eta_1) = \eta_1, \qquad \mathrm{Var}(\hat\eta_1) = \frac{\sigma^2}{SXX}$$

$$E(\hat\eta_0) = \eta_0, \qquad \mathrm{Var}(\hat\eta_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right),$$

we have

$$\hat\eta_1 \sim N\left(\eta_1, \frac{\sigma^2}{SXX}\right), \qquad \hat\eta_0 \sim N\left(\eta_0, \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)\right)$$

Look more at $\hat\eta_1$: We can standardize to get

$$\frac{\hat\eta_1 - \eta_1}{\sqrt{\sigma^2 / SXX}} \sim N(0,1)$$

But we don't know $\sigma^2$, so we need to approximate it by $\hat\sigma^2$. In other words, we approximate $\mathrm{Var}(\hat\eta_1)$ by

$$\widehat{\mathrm{Var}}(\hat\eta_1) = [\mathrm{s.e.}(\hat\eta_1)]^2 = \frac{\hat\sigma^2}{SXX}.$$

Thus we want to use

$$\frac{\hat\eta_1 - \eta_1}{\sqrt{\hat\sigma^2 / SXX}}.$$

But we can't expect this to be normal, too. However,

$$\frac{\hat\eta_1 - \eta_1}{\sqrt{\hat\sigma^2 / SXX}} = \frac{\dfrac{\hat\eta_1 - \eta_1}{\sqrt{\sigma^2 / SXX}}}{\sqrt{\hat\sigma^2 / \sigma^2}} \qquad (*)$$

The numerator of the last fraction is normal (in fact, standard normal), as noted above.

Facts: (Proofs omitted)

a. $(n-2)\,\dfrac{\hat\sigma^2}{\sigma^2}$ has a $\chi^2$ distribution with $n-2$ degrees of freedom.
   Notation: $(n-2)\,\dfrac{\hat\sigma^2}{\sigma^2} \sim \chi^2(n-2)$
b. $(n-2)\,\dfrac{\hat\sigma^2}{\sigma^2}$ is independent of $\hat\eta_1 - \eta_1$ (hence independent of the numerator in (*)).

Comments on distributions:

1. A $\chi^2(k)$ distribution is defined as the distribution of a random variable which is a sum of squares of $k$ independent standard normal random variables.
   [Comment: Recall that $\hat\sigma^2 = \dfrac{1}{n-2}\,RSS$, so $(n-2)\,\dfrac{\hat\sigma^2}{\sigma^2} = \dfrac{RSS}{\sigma^2} = \sum \left(\dfrac{\hat e_i}{\sigma}\right)^2$ is a sum of $n$ squares; the fact quoted above says that it can also be expressed as a sum of $n-2$ squares of independent standard normal random variables.]
2. A t-distribution with $k$ degrees of freedom is defined as the distribution of a random variable of the form $\dfrac{Z}{\sqrt{U/k}}$ where
   • $Z \sim N(0,1)$
   • $U \sim \chi^2(k)$
   • $Z$ and $U$ are independent.

In the fraction (*) above, take

$$U = (n-2)\,\frac{\hat\sigma^2}{\sigma^2} \sim \chi^2(n-2), \qquad Z = \frac{\hat\eta_1 - \eta_1}{\sqrt{\sigma^2 / SXX}} \sim N(0,1)$$
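The facts above are stated without proof, so a rough numerical check may help. The following is a minimal simulation sketch, not part of the original notes, using only the Python standard library. The function name `simulate_slope_statistics`, the x values, the parameter values, and the number of replications are all illustrative assumptions. For each simulated sample it computes Z, U, and the left side of (*). It then compares the average of U with n - 2 (the mean of a χ2(n-2) variable), the variance of Z with 1, and the variance of the left side of (*) with (n-2)/(n-4), using the fact that a t-distribution with k > 2 degrees of freedom has variance k/(k-2).

```python
import math
import random
import statistics


def simulate_slope_statistics(x, eta0, eta1, sigma, reps, seed=0):
    """Simulate Z, U and the studentized slope for fixed x under normal errors."""
    rng = random.Random(seed)
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    z_vals, u_vals, t_vals = [], [], []
    for _ in range(reps):
        # Draw Y|x_i ~ N(eta0 + eta1 * x_i, sigma^2), independently.
        y = [eta0 + eta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        ybar = sum(y) / n

        # Least-squares estimates of the slope and intercept.
        sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        eta1_hat = sxy / sxx
        eta0_hat = ybar - eta1_hat * xbar

        # sigma_hat^2 = RSS / (n - 2).
        rss = sum((yi - eta0_hat - eta1_hat * xi) ** 2 for xi, yi in zip(x, y))
        sigma2_hat = rss / (n - 2)

        # Z uses the true sigma^2; the studentized slope uses sigma_hat^2.
        z_vals.append((eta1_hat - eta1) / math.sqrt(sigma ** 2 / sxx))
        u_vals.append((n - 2) * sigma2_hat / sigma ** 2)
        t_vals.append((eta1_hat - eta1) / math.sqrt(sigma2_hat / sxx))
    return z_vals, u_vals, t_vals


# Hypothetical design and parameter values, chosen only for illustration.
x = [float(i) for i in range(1, 11)]  # n = 10, so n - 2 = 8
n = len(x)
z_vals, u_vals, t_vals = simulate_slope_statistics(
    x, eta0=2.0, eta1=0.5, sigma=1.5, reps=20000, seed=1
)

print("mean of U:", statistics.fmean(u_vals), "vs n - 2 =", n - 2)
print("variance of Z:", statistics.variance(z_vals), "vs 1")
print("variance of studentized slope:", statistics.variance(t_vals),
      "vs (n - 2)/(n - 4) =", (n - 2) / (n - 4))
```

Agreement of a few moments is only a sanity check, not a proof of the distributional claims. In each replication the studentized slope equals Z divided by the square root of U/(n-2), which is the algebraic identity behind (*).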