Confidence Intervals, Hypothesis Testing - Regression Analysis (STA 108)

Information: chapter 2 (Sec 2.1-2.6)

The observations are $(X_1, Y_1), \ldots, (X_n, Y_n)$ and the model is $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_1, \ldots, \varepsilon_n$ are independent $N(0, \sigma^2)$. We will use the following simplifying notation:

$$S_{XX} = \sum (X_i - \bar{X})^2, \qquad S_{YY} = \sum (Y_i - \bar{Y})^2, \qquad S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y}).$$

With this notation, the estimates of $\beta_1$ and $\beta_0$ are

$$b_1 = \frac{S_{XY}}{S_{XX}}, \qquad b_0 = \bar{Y} - b_1 \bar{X}.$$

The following fact is very useful.

Fact
a) $E(b_0) = \beta_0$,
b) $E(b_1) = \beta_1$,
c) $Var(b_0) = \left[ \dfrac{1}{n} + \dfrac{\bar{X}^2}{S_{XX}} \right] \sigma^2$,
d) $Var(b_1) = \dfrac{\sigma^2}{S_{XX}}$.

Notation in the text: $Var(b_0)$ is denoted by $\sigma^2(b_0)$, and $Var(b_1)$ is denoted by $\sigma^2(b_1)$.

Since $\sigma^2$ is estimated by MSE, we can obtain estimates of $Var(b_0)$ and $Var(b_1)$. The estimate of $Var(b_0)$ is

$$s^2(b_0) = \left[ \frac{1}{n} + \frac{\bar{X}^2}{S_{XX}} \right] MSE,$$

and the estimate of $Var(b_1)$ is

$$s^2(b_1) = \frac{MSE}{S_{XX}}.$$

The following fact will be used to construct confidence intervals for $\beta_0$ and $\beta_1$.

Fact
a) The random variable $\dfrac{b_0 - \beta_0}{s(b_0)}$ has a t-distribution with $n-2$ degrees of freedom (df).
b) The random variable $\dfrac{b_1 - \beta_1}{s(b_1)}$ has a t-distribution with $n-2$ df.

Confidence intervals: A $(1-\alpha)100\%$ confidence interval for $\beta_1$ is given by

$$b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1),$$

where the area to the left of $t(1-\alpha/2;\, n-2)$ under the t-curve with $n-2$ df is $1-\alpha/2$.

For the Housing data, $n = 19$, $b_1 = 2.941$ and $s(b_1) = 0.5412$. From the t-table we get $t(.975;\, 19-2) = t(.975;\, 17) = 2.110$. Hence a 95% confidence interval for $\beta_1$ is $2.941 \pm (2.110)(.5412)$, i.e., $2.941 \pm 1.142$, i.e., $(1.80, 4.08)$.

Similarly, we can obtain a 90% confidence interval for $\beta_0$. For the Housing data, $b_0 = 28.981$ and $s(b_0) = 8.544$. From the t-table, we have $t(.95;\, 19-2) = t(.95;\, 17) = 1.740$. So a 90% confidence interval for $\beta_0$ is $28.981 \pm (1.740)(8.544)$, i.e., $28.98 \pm 14.86$, i.e., $(14.12, 43.84)$.

Hypothesis testing

Example 1: Suppose we wish to test $H_0: \beta_0 = 20$ against $H_1: \beta_0 > 20$ at a level of significance $\alpha = 0.1$.
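Before working through the test, the two Housing-data confidence intervals above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `t_interval` is hypothetical, and the critical values are simply copied from the t-table values quoted above rather than computed. Because the notes round the half-width to 14.86 before adding and subtracting, the unrounded β0 interval comes out as (14.11, 43.85), which differs from (14.12, 43.84) only in the last digit.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate ± t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Values quoted in the notes for the Housing data (n = 19, so df = 17).
b1, s_b1 = 2.941, 0.5412
b0, s_b0 = 28.981, 8.544

# Critical values read from the t-table in the notes (not computed here).
t_975_17 = 2.110  # t(.975; 17), used for a 95% interval
t_95_17 = 1.740   # t(.95; 17), used for a 90% interval

ci_b1 = t_interval(b1, s_b1, t_975_17)
ci_b0 = t_interval(b0, s_b0, t_95_17)

print(f"95% CI for beta1: ({ci_b1[0]:.2f}, {ci_b1[1]:.2f})")
print(f"90% CI for beta0: ({ci_b0[0]:.2f}, {ci_b0[1]:.2f})")
```

The same helper applies to any other confidence level once the matching t-table value with n − 2 df is supplied.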
Step i) Compute the t-statistic: $t^* = \dfrac{b_0 - 20}{s(b_0)} = \dfrac{28.981 - 20}{8.544} = 1.051$.

Step ii) Decision rule: reject $H_0$ if $t^* > t(.9;\, n-2) = t(.9;\, 17)$. Here $t(.9;\, 17) = 1.333$.

Step iii) Conclusion: Since the calculated value of the t-statistic is smaller than $t(.9;\, 17) = 1.333$, we cannot reject $H_0$.

[Calculation of the p-value in Example 1. The p-value is the area to the right of $t^* = 1.051$ under the t-curve with $n - 2 = 17$ df. Using MINITAB we find that the p-value is 0.1540. If we use the t-table, we can usually find only bounds on the p-value. From the t-table, $t(.80;\, 17) = 0.863$ and $t(.85;\, 17) = 1.069$. Since $t^*$ lies between these two values, the p-value is between 0.15 and 0.20. The p-value here is much closer to 0.15 than to 0.20.]

Example 2: Suppose we want to test $H_0: \beta_1 = 0$ against $H_1: \beta_1 > 0$ at a level of significance $\alpha = 0.01$.

Step i) Calculate the t-statistic: $t^* = \dfrac{b_1 - 0}{s(b_1)} = \dfrac{2.941 - 0}{0.5412} = 5.434$.

Step ii) Decision rule: reject $H_0$ if $t^* > t(.99;\, n-2) = t(.99;\, 17)$.

Step iii) Conclusion: Since the calculated value of the t-statistic is larger than $t(.99;\, 17) = 2.567$, we reject $H_0$.

[The p-value in this example is given by the area to the right of $t^* = 5.434$ under the t-curve with $n - 2 = 17$ df. From MINITAB the p-value $\approx 0.000$. If we use the t-table, we can obtain a bound for the p-value. From the t-table we find that $t(.9995;\, 17) = 3.965$. So the p-value is smaller than 0.0005.]

Example 3: Suppose we wish to test $H_0: \beta_0 = 20$ against $H_1: \beta_0 \neq 20$ at a level of significance $\alpha = 0.1$.

Step i) Same as before: $t^* = 1.051$.

Step ii) Decision rule: reject $H_0$ if $|t^*| > t(.95;\, n-2)$.

Step iii) Conclusion: Since $|t^*|$ is smaller than $t(.95;\, 17) = 1.740$, we cannot reject $H_0$.

[Comment 1: In this case, the p-value is the sum of the area to the left of $-1.051$ and the area to the right of $1.051$ under the t-curve with 17 df. Since the t-curve is symmetric about zero, the p-value is 2 times the area to the right of 1.051. Using MINITAB, the p-value is $(2)(.1540) = 0.3080$.
If we use the t-table, then the p-value is between $(2)(.15) = 0.3$ and $(2)(.20) = 0.4$.

Comment 2: In Example 3, the alternative is two-sided, and hence it is possible to carry out this test using a confidence interval approach. If the 90% confidence interval for $\beta_0$ does not include 2