STA3032 (Section 7393) FINAL EXAM SOLUTION

1. What is the right interpretation of the given fitted vs residual plot?

[Figure: Fitted vs Residual plot; horizontal axis: fit, vertical axis: res]

• (a) It shows non-constant variance: the residual variance is increasing, so a square-root transformation of Y might stabilize the variance.

2. In simple linear regression, what is the distribution of the slope $\hat{\beta}_1$?

• (d) Normal distribution with mean $\beta_1$ and variance $\sigma^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2$

3. To monitor the manufacturing process of rubber support bearings used between the superstructure and foundation pads of a nuclear power plant, a quality control engineer samples 100 bearings from the production line each day over a 15-day period. The summary data is $\sum_{i=1}^{n} \hat{p}_i = \sum_{i=1}^{n} x_i / m = 0.48$. What are the Lower Control Limit and the Upper Control Limit for such a mean proportion?

• (a) (0, 0.0848)

$n = 15$, $m = 100$, $\bar{p} = \sum_{i=1}^{n} \hat{p}_i / n = 0.48 / 15 = 0.032$

Lower Limit: $\bar{p} - 3\sqrt{\bar{p}(1 - \bar{p})/m} = 0.032 - 3\sqrt{0.032(1 - 0.032)/100} = -0.0208 \Rightarrow$ Lower Limit $= 0$ (a proportion cannot be negative)

Upper Limit: $\bar{p} + 3\sqrt{\bar{p}(1 - \bar{p})/m} = 0.032 + 3\sqrt{0.032(1 - 0.032)/100} = 0.0848$

4. The number of noticeable defects found by quality control inspectors in a randomly selected 1-square-meter specimen of woolen fabric from a certain loom is recorded each hour for a period of 20 hours. The summary data is $\sum_{i=1}^{n} c_i = 151$. What are the Lower Control Limit and the Upper Control Limit for the number of defects?

• (b) (0, 15.7932)

$n = 20$, $\bar{c} = \sum_{i=1}^{n} c_i / n = 151 / 20 = 7.55$

Lower Limit: $\bar{c} - 3\sqrt{\bar{c}} = 7.55 - 3\sqrt{7.55} = -0.693179 \Rightarrow$ Lower Limit $= 0$ (a count cannot be negative)

Upper Limit: $\bar{c} + 3\sqrt{\bar{c}} = 7.55 + 3\sqrt{7.55} = 15.79318$

5. Suppose that, for a process manufacturing electrical shafts, the lower and upper control limits of the $\bar{x}$-chart are (116.149, 127.851).
However, sample number 7 (the 7th sample) has $\bar{x} = 129$, sample number 14 has $\bar{x} = 128$, sample number 21 has $\bar{x} = 132$, sample number 28 has $\bar{x} = 128$ and sample number 35 has $\bar{x} = 127$. Which of the statements is appropriate?

• (b) There is a pattern in this process, thus the process is out of control: Every 7th sample is outside the control limits. Thus there is a pattern, and therefore the process is out of control.

11-15. An investigator wants to find out the relationship between traffic flow X (1000's of cars per 24 hours) and lead content Y of bark on trees near the highway (µg/g dry wt). He collected 11 samples. Following is the (incomplete) output for fitting a simple linear regression of y on x: $y = \beta_0 + \beta_1 x + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$.

               Estimate  Std. Error  t-value  p-value
    Intercept   -12.842      72.143   -0.178    0.863
    X            36.184       3.693               0.00

    Residual standard error: 92.19 on ? degrees of freedom
    Multiple R-squared: 0.9143, Adjusted R-squared: 0.9048
    F-statistic: 96.01 on 1 and ? DF, p-value: 0.000

11. What is the least-squares regression equation?

• (c) $\hat{y} = -12.842 + 36.184x$

12. What are the degrees of freedom of the error?

• (a) 9: $df = n - 2 = 11 - 2 = 9$

13. What is the t-test statistic of $\beta_1$?

• (d) 9.798: $t = \hat{\beta}_1 / s_{\hat{\beta}_1} = 36.184 / 3.693 = 9.798$

14. What is the alternative hypothesis ($H_1$) of the ANOVA p-value (F-statistic: 96.01 on 1 and ? DF, p-value: 0.000)?

• (d) $H_1 : \beta_1 \neq 0$

15. What is the estimate of $\sigma^2$?

• (b) 8498.996: $\hat{\sigma}^2 = s^2 = (92.19)^2 = 8498.996$

16. To predict the average miles per gallon (MPG) Y, an engineer considers a multiple regression model with the given explanatory variables.

• VOL: Cubic feet of cab space
• HP: Engine horsepower
• SP: Top speed (mph)
• WT: Vehicle weight (100 lb)

From a previous paper, he found that applying a log transformation to engine horsepower (HP) and adding a second-order term of vehicle weight (WT) predicts the log-transformed MPG better.
Thus, he fits the multiple regression model

$\log(y) = \beta_0 + \beta_1 VOL + \beta_2 \log(HP) + \beta_3 SP + \beta_4 WT + \beta_5 WT^2 + \epsilon$

and he uses the step() function in R for model selection. The output is given below (LMPG $= \log(y)$, LHP $= \log(HP)$ and WT2 $= WT^2$).

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  5.788e+00  2.668e-01  21.696  < 2e-16 ***
    LHP         -6.102e-01  1.104e-01  -5.528 4.15e-07 ***
    SP           6.971e-03  2.178e-03   3.201  0.00198 **
    WT2         -2.408e-04  4.508e-05  -5.341 8.85e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.08208 on 78 degrees of freedom
    Multiple R-Squared: 0.93, Adjusted R-squared: 0.9273
    F-statistic: 345.3 on 3 and 78 DF, p-value: < 2.2e-16

The best model that R chooses based on AIC is

$\log(\hat{y}) = 5.788 - 0.6102 \log(HP) + 0.00697 SP - 0.00024 WT^2$

Is this the final model that we can suggest? Explain the reason for your answer.

No. $WT^2$ is included in this model; thus, even though R did not choose the WT variable for the best model, we need to include the WT term, because a model with a second-order term should also contain the corresponding first-order term. Therefore, the final model is

$\log(y) = \beta_0 + \beta_1 \log(HP) + \beta_2 SP + \beta_3 WT + \beta_4 WT^2 + \epsilon$.

17-25. We consider the problem of predicting gasoline mileage Y (in mpg). The independent variables are fuel octane rating X1, average speed X2 (mph), the load X3 carried during each test run and stops per mile X4.

Table 1: Coefficient table

               Estimate  Std. Error  t-value  p-value
    Intercept   -56.526       6.140   -9.207    0.000
    X1            1.176       0.076   15.401    0.000
    X2           -0.281       0.036   -7.766    0.000
    X3           -0.009       0.001   -8.975    0.000
    X4           -0.955       0.364        ?    0.019

    Residual standard error: 0.6281 on 15 degrees of freedom
    Multiple R-Squared: 0.9587, Adjusted R-squared: 0.9477
    F-statistic: 87.03 on ? and 15 DF, p-value: 0.000

Table 2: Analysis of Variance table

               Df   Sum Sq  Mean Sq   F-value  p-value
    X1          1   78.476   78.476  198.9374  0.000
    X2          1   19.306   19.306   48.9405  0.000
    X3          1   36.815   36.815   93.3265  0.000
    X4          1    2.724    2.724    6.9063  0.01901
    Residuals  15        ?    0.394
    Total      19  143.238

17. What is the sample size?
• (d) 20: $df = n - p - 1 \Rightarrow 15 = n - 4 - 1 \Rightarrow n = 20$

18. Controlling for X2, X3 and X4, the predicted mean change in Y when X1 is increased from 5 to 10 is which of the following?

• (d) 5.88: $1.176 \times (10 - 5) = 5.88$

19. What is the numerator degrees of freedom of the F-statistic?

• (b) 4: $p$ = number of explanatory variables $= 4$
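
The arithmetic in Questions 3, 4, 13, 15 and 17-19 can be re-checked numerically. The following is a minimal Python sketch, not part of the original exam; the function names `p_chart_limits` and `c_chart_limits` are hypothetical helpers introduced here for illustration. It assumes the usual 3-sigma limits with a negative lower limit replaced by 0, as in the worked answers.

```python
import math


def p_chart_limits(p_bar, m):
    """3-sigma limits for a proportion chart with m items per sample.

    Hypothetical helper: a negative lower limit is replaced by 0,
    matching the treatment in Question 3.
    """
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / m)
    return max(0.0, p_bar - half_width), p_bar + half_width


def c_chart_limits(c_bar):
    """3-sigma limits for a count-of-defects chart (Question 4)."""
    half_width = 3 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar + half_width


# Question 3: 15 daily samples of 100 bearings, sum of proportions 0.48
p_bar = 0.48 / 15
p_lcl, p_ucl = p_chart_limits(p_bar, m=100)

# Question 4: 151 defects recorded over 20 hourly specimens
c_bar = 151 / 20
c_lcl, c_ucl = c_chart_limits(c_bar)

# Question 13: t statistic = estimate / standard error
t_slope = 36.184 / 3.693

# Question 15: estimate of sigma^2 = (residual standard error)^2
sigma2_hat = 92.19 ** 2

# Question 17: error df = n - p - 1, so n = df + p + 1
error_df, p = 15, 4
n = error_df + p + 1

# Question 18: change in predicted mean = coefficient * change in X1
mean_change = 1.176 * (10 - 5)

# Question 19: numerator df of the overall F test = number of predictors
numerator_df = p

print(f"p-chart limits: ({p_lcl:.4f}, {p_ucl:.4f})")
print(f"c-chart limits: ({c_lcl:.4f}, {c_ucl:.4f})")
print(f"t = {t_slope:.3f}, sigma^2 estimate = {sigma2_hat:.3f}")
print(f"n = {n}, mean change = {mean_change:.2f}, numerator df = {numerator_df}")
```

Clamping the lower limit at 0 inside the helpers keeps the reported interval meaningful for proportions and counts, while the unclamped value is still easy to recover from the half-width if needed.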