Regression Analysis: Understanding Parameter Estimates and Confidence Intervals, Study notes of Statistics

Lecture notes on regression analysis, focusing on parameter estimates and confidence intervals. They cover regression equations, histograms, error in prediction, and statistical inference, with examples and formulas for calculating parameter estimates and confidence intervals.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

koofers-user-mlx

From last time . . .
[Figure: scatter plot of father's height (inches) against father's span (inches), both axes 60 to 80; corr = 0.78]

The equations

Regression of y on x (for predicting y from x)

Slope = $r \, \mathrm{SD}(y) / \mathrm{SD}(x)$

Goes through the point $(\bar{x}, \bar{y})$

$$\hat{y} - \bar{y} = r \, \frac{\mathrm{SD}(y)}{\mathrm{SD}(x)} \, (x - \bar{x}) \quad\longrightarrow\quad \hat{y} = \hat\beta_0 + \hat\beta_1 x$$

where $\hat\beta_1 = r \, \mathrm{SD}(y) / \mathrm{SD}(x)$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$

Regression of x on y (for predicting x from y)

Slope = $r \, \mathrm{SD}(x) / \mathrm{SD}(y)$

Goes through the point $(\bar{y}, \bar{x})$

$$\hat{x} - \bar{x} = r \, \frac{\mathrm{SD}(x)}{\mathrm{SD}(y)} \, (y - \bar{y}) \quad\longrightarrow\quad \hat{x} = \hat\beta_0^\star + \hat\beta_1^\star y$$

where $\hat\beta_1^\star = r \, \mathrm{SD}(x) / \mathrm{SD}(y)$ and $\hat\beta_0^\star = \bar{x} - \hat\beta_1^\star \bar{y}$

Histograms

[Figure: histogram of spans, span (inches) from 60 to 80; mean = 68.7, SD = 3.2]

[Figure: histogram of heights, height (inches) from 60 to 80; mean = 67.7, SD = 2.7]

Error in prediction

Having no information about x,
  Predict y as $\bar{y}$
  Typical prediction error: $\mathrm{SD}(y)$
  For predicting height, $\mathrm{SD}(y) \approx 2.73$

Having been told about x,
  Predict y using the regression line: $\hat{y} = \hat\beta_0 + \hat\beta_1 x$
  Typical prediction error: $\mathrm{SD}(y) \sqrt{1 - r^2}$
  For predicting height from span, $\mathrm{SD}(y) \sqrt{1 - r^2} \approx 1.71$

[Figure, shown on two slides: OD against H2O2; H2O2 axis from 0 to 50, OD axis from 0.15 to 0.35]
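The regression equations and the two prediction errors above can be checked with a few lines of Python. This is only a sketch built from the summary statistics quoted in the notes (r = 0.78, the means, and the SDs, taking SD(y) = 2.73 as on the prediction-error slide); the function name `regression_line` is a hypothetical helper, not something from the lecture.

```python
import math

# Summary statistics quoted in the notes (x = span, y = height, in inches).
r = 0.78
mean_x, sd_x = 68.7, 3.2
mean_y, sd_y = 67.7, 2.73


def regression_line(r, mean_in, sd_in, mean_out, sd_out):
    """Return (intercept, slope) for predicting the 'out' variable from the 'in' variable."""
    slope = r * sd_out / sd_in
    intercept = mean_out - slope * mean_in
    return intercept, slope


# Regression of y on x (predict height from span).
b0, b1 = regression_line(r, mean_x, sd_x, mean_y, sd_y)

# Regression of x on y (predict span from height): a different line.
a0, a1 = regression_line(r, mean_y, sd_y, mean_x, sd_x)

# Typical prediction errors without and with knowledge of x.
error_without_x = sd_y
error_with_x = sd_y * math.sqrt(1 - r**2)

print(f"y on x: intercept = {b0:.2f}, slope = {b1:.3f}")
print(f"x on y: intercept = {a0:.2f}, slope = {a1:.3f}")
print(f"typical error without x: {error_without_x:.2f}")
print(f"typical error with x:    {error_with_x:.2f}")
```

The product of the two slopes equals r squared, so the two regression lines coincide only when r is plus or minus 1.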
[Figure: scatter plot of slope against y-intercept; y-intercept axis from 0.340 to 0.370, slope axis from -0.0044 to -0.0032]

Confidence intervals

We know that

$$\hat\beta_0 \sim N\!\left(\beta_0,\; \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)\right)$$

$$\hat\beta_1 \sim N\!\left(\beta_1,\; \frac{\sigma^2}{SXX}\right)$$

We can use those distributions for hypothesis testing and to construct confidence intervals!

Statistical inference

We want to test: $H_0: \beta_1 = \beta_1^\star$ versus $H_a: \beta_1 \neq \beta_1^\star$

Generally, $\beta_1^\star$ is 0.

We use

$$t = \frac{\hat\beta_1 - \beta_1^\star}{\mathrm{se}(\hat\beta_1)} \sim t_{n-2} \qquad \text{where} \qquad \mathrm{se}(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{SXX}}$$

Also,

$$\left[\, \hat\beta_1 - t_{(1 - \alpha/2),\, n-2} \times \mathrm{se}(\hat\beta_1) \;,\; \hat\beta_1 + t_{(1 - \alpha/2),\, n-2} \times \mathrm{se}(\hat\beta_1) \,\right]$$

is a $(1 - \alpha) \times 100\%$ confidence interval for $\beta_1$.

Results

The calculations in the test $H_0: \beta_0 = \beta_0^\star$ versus $H_a: \beta_0 \neq \beta_0^\star$ are analogous, except that we have to use

$$\mathrm{se}(\hat\beta_0) = \sqrt{\hat\sigma^2 \times \left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)}$$

For the pf3d7 data we get the 95% confidence intervals
  (0.342, 0.364) for the intercept
  (-0.0043, -0.0035) for the slope

Testing whether the intercept (slope) is equal to zero, we obtain 70.7 (-22.0) as test statistic. This corresponds to a p-value of $7.8 \times 10^{-15}$ ($8.4 \times 10^{-10}$).
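The standard errors, t statistics, and confidence intervals above can be sketched in plain Python. The pf3d7 measurements are not reproduced in these notes, so the four data points below are made-up illustrative numbers, not the lecture data. The sketch also assumes the usual estimate $\hat\sigma^2 = RSS/(n-2)$, and it takes the t quantile as an argument (here 4.303, the 0.975 quantile of a t distribution with 2 degrees of freedom) rather than computing it. The function name `simple_linear_inference` is hypothetical.

```python
import math


def simple_linear_inference(x, y, t_crit):
    """Least-squares fit of y = b0 + b1*x with standard errors, t statistics
    for H0: coefficient = 0, and confidence intervals using the supplied t quantile."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Assumption: sigma^2 is estimated by RSS / (n - 2).
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)

    se_b1 = math.sqrt(sigma2_hat / sxx)
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar**2 / sxx))

    return {
        "b0": b0, "se_b0": se_b0, "t_b0": b0 / se_b0,
        "ci_b0": (b0 - t_crit * se_b0, b0 + t_crit * se_b0),
        "b1": b1, "se_b1": se_b1, "t_b1": b1 / se_b1,
        "ci_b1": (b1 - t_crit * se_b1, b1 + t_crit * se_b1),
    }


# Made-up illustrative data (NOT the pf3d7 data from the lecture).
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 2.0, 5.0]

fit = simple_linear_inference(x_demo, y_demo, t_crit=4.303)
for name, value in fit.items():
    print(name, value)
```

Passing the quantile in keeps the sketch free of third-party dependencies; in practice one would look it up in a t table or compute it with a statistics library.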
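Coefficient of determination

In the previous lecture we wrote

$$SS_{reg} = SYY - RSS = \frac{(SXY)^2}{SXX}$$

Define

$$R^2 = \frac{SS_{reg}}{SYY} = 1 - \frac{RSS}{SYY}$$

$R^2$ is often called the coefficient of determination.

Notice that

$$R^2 = \frac{SS_{reg}}{SYY} = \frac{(SXY)^2}{SXX \times SYY} = r_{XY}^2$$

Back to the heme data

The scientist was actually interested in the slopes when one re-scales the y-axis so that the y-intercept is at 1.

$y = \beta_0 + \beta_1 x + \epsilon$ becomes $y/\beta_0 = 1 + (\beta_1/\beta_0)\, x + \epsilon'$

So we're really interested in $\beta_1/\beta_0$.

We'd estimate that by $\hat\beta_1/\hat\beta_0$, but what about its standard error?

First-order Taylor expansion

Consider $f(x, y) = x/y$.

A first-order Taylor expansion to approximate the function would be

$$f(x, y) \approx f(x_0, y_0) + (x - x_0) \left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)} + (y - y_0) \left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)}$$

Since $\partial f/\partial x = 1/y$ and $\partial f/\partial y = -x/y^2$, we obtain the following:

$$x/y \approx x_0/y_0 + (x - x_0)/y_0 - (y - y_0)\, x_0/y_0^2 = (x_0/y_0)\,[\,1 + (x - x_0)/x_0 - (y - y_0)/y_0\,]$$

How do we use this? We use the first-order Taylor expansion of $\hat\beta_1/\hat\beta_0$ around $\beta_1$ and $\beta_0$.

Variance of a ratio

Remember that $\beta_1$ and $\beta_0$ are fixed, while $\hat\beta_1$ and $\hat\beta_0$ are random.

Add the fact that $\mathrm{var}(X - Y) = \mathrm{var}(X) + \mathrm{var}(Y) - 2\, \mathrm{cov}(X, Y)$

$$\mathrm{var}\{\hat\beta_1/\hat\beta_0\} \approx \mathrm{var}\{(\beta_1/\beta_0)\,[\,1 + (\hat\beta_1 - \beta_1)/\beta_1 - (\hat\beta_0 - \beta_0)/\beta_0\,]\}$$

$$= (\beta_1/\beta_0)^2 \,\{\mathrm{var}(\hat\beta_1)/\beta_1^2 + \mathrm{var}(\hat\beta_0)/\beta_0^2 - 2\, \mathrm{cov}(\hat\beta_1, \hat\beta_0)/(\beta_1 \beta_0)\}$$

We then replace $\beta_1$ and $\beta_0$ in this formula with our estimates of them, $\hat\beta_1$ and $\hat\beta_0$. Further, we replace the variances and covariance with our estimates.

$$\widehat{\mathrm{var}}\{\hat\beta_1/\hat\beta_0\} = (\hat\beta_1/\hat\beta_0)^2 \,\{\widehat{\mathrm{var}}(\hat\beta_1)/\hat\beta_1^2 + \widehat{\mathrm{var}}(\hat\beta_0)/\hat\beta_0^2 - 2\, \widehat{\mathrm{cov}}(\hat\beta_1, \hat\beta_0)/(\hat\beta_1 \hat\beta_0)\}$$

The estimated SE is then

$$\widehat{\mathrm{SE}}\{\hat\beta_1/\hat\beta_0\} = |\hat\beta_1/\hat\beta_0| \, \sqrt{[\widehat{\mathrm{SE}}(\hat\beta_1)/\hat\beta_1]^2 + [\widehat{\mathrm{SE}}(\hat\beta_0)/\hat\beta_0]^2 - 2\, \widehat{\mathrm{cov}}(\hat\beta_1, \hat\beta_0)/(\hat\beta_1 \hat\beta_0)}$$

Results

pf3d7: $\hat\beta_0 = 0.353\ (0.005)$, $\hat\beta_1 = -0.0039\ (0.0002)$, $\widehat{\mathrm{cov}}(\hat\beta_1, \hat\beta_0) = -6.6 \times 10^{-7}$

$\hat\beta_1/\hat\beta_0 \times 100 = -1.10$ (SE = 0.07).

            estimate    SE
bhem          -2.04    0.32
pgalnoel      -2.02    0.35
pgal          -1.88    0.17
pyoelii       -1.33    0.09
pf3d7         -1.10    0.07
pviv          -0.86    0.26
pknow         -0.79    0.14
pov           -0.70    0.07
pbr           -0.67    0.08
pfhz          -0.31    0.17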
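The first-order (delta-method) standard error of a ratio can be sketched as a small Python function. The function name `ratio_se` and its argument names are hypothetical. The covariance term enters with a minus sign because $\partial f/\partial y = -x/y^2$ for $f(x, y) = x/y$. The usage lines plug in the rounded pf3d7 values printed in the notes, reading the covariance as $-6.6 \times 10^{-7}$; since those inputs are rounded, the output is not expected to reproduce the tabulated SE exactly.

```python
import math


def ratio_se(b1, se_b1, b0, se_b0, cov_b1_b0):
    """Return (ratio, SE) for b1 / b0 using a first-order Taylor expansion.

    The relative variance is
        (se_b1 / b1)^2 + (se_b0 / b0)^2 - 2 * cov(b1, b0) / (b1 * b0),
    where the minus sign follows from d(x/y)/dy = -x / y^2.
    """
    ratio = b1 / b0
    rel_var = (se_b1 / b1) ** 2 + (se_b0 / b0) ** 2 - 2 * cov_b1_b0 / (b1 * b0)
    return ratio, abs(ratio) * math.sqrt(rel_var)


# Rounded pf3d7 values as printed in the notes (estimates with SEs in parentheses).
ratio, se = ratio_se(b1=-0.0039, se_b1=0.0002, b0=0.353, se_b0=0.005,
                     cov_b1_b0=-6.6e-7)

# The notes report the ratio multiplied by 100.
print(f"ratio x 100 = {100 * ratio:.2f}")
print(f"SE x 100    = {100 * se:.3f}")

# With zero covariance the formula reduces to adding squared relative SEs.
ratio_uncorr, se_uncorr = ratio_se(-0.0039, 0.0002, 0.353, 0.005, 0.0)
```

Because the formula divides by both estimates, it is only sensible when neither estimate is close to zero relative to its standard error.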