Regression Analysis: Choosing the Right Model and Interpreting Results - Prof. William Q.

Statistics 328    Examination 2    Name
Fall 1999

****You must show all of your work****

When asked to explain something, or to provide an interpretation for a quantity, provide an explanation that could be understood by someone who does not have formal training in statistical methods. Keep your explanations brief.

1. Choosing how many and which model terms to put on the right-hand side of a regression equation is a somewhat subjective process. Choosing the model with the largest $R^2$ is not a good idea because adding a variable to a model will never decrease $R^2$. Why is it not always a good idea to choose the model with the smallest value of $S$?

2. Consider the antique clock auction data discussed in class. In this example the highest bid price for the clocks (the response) was to be modeled as a function of the clock’s age and the number of bidders (the Xs). In a two-variable problem like this we might be concerned with both “correlation between the Xs” and “interaction between the Xs.” They are not the same.

(a) Draw a simple graph or plot to illustrate the meaning of “correlation between the Xs” in the above problem. Make sure to label your axes.

(b) Draw a simple graph or plot to illustrate the meaning of “interaction between the Xs” in the above problem. Make sure to label your axes.

3. In some applications it is possible to control the Xs. This allows the use of “designed experiments.” A popular method is to use a rectangular array (e.g., the square in the point of sale/media advertising example). What are some advantages of such a design?

4. A study was conducted to build a model that can be used to predict $Y$, the number of paid visitors to a pool-water park in a resort district of New Hampshire.
Let

$X_1 = 1$ for a weekday, $0$ otherwise
$X_2 = 1$ for a sunny day, $0$ otherwise
$X_3 =$ temperature in degrees F

The following three models were fit to the data from 30 consecutive days in the summer of 1992:

Model 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \epsilon$
Model 2: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1 X_3 + \epsilon$
Model 3: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_5 X_1 X_2 + \epsilon$

giving the following results:

Model    SS_YY      SSE
1        1762646    77212
2        1762646    76727
3        1762646    76292

(a) Briefly explain why the $SS_{YY}$ values are the same for all three of these models and why the SSE values differ.

(b) In Model 2, what is the expected increase in paid visitors for an additional degree F in temperature?

(c) Test the null hypothesis that $\beta_4 = 0$ in Model 2, using $\alpha = .05$. In practical terms, what is your conclusion?

(d) Briefly explain the practical interpretation of Model 2, relative to Model 1.

8. Consider the following two alternative models relating sales to media and point of sale advertising, $X_1$ and $X_2$, respectively. All variables are in units of thousands of dollars.

Model 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \epsilon$
Model 2: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_2^2 + \epsilon$

(a) In Model 1, what is the expected increase in $Y$ for an additional dollar of media advertising?

(b) In Model 2, what is the expected increase in $Y$ for an additional dollar of media advertising?

(c) Briefly explain, and give an equation, to show how you could use regression analysis computer output to test for the need for the more complicated Model 2.

(d) Suppose that the evidence in support of Model 2 was significant at the 10% level of significance, but not at the 5% level of significance. What would be a good argument for presenting the results of the study in terms of Model 1?
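
For a nested-model comparison such as the one in 4(c), one common approach is a partial F test built from the SSE of the smaller and the larger model. The sketch below is only an illustration under stated assumptions: it takes n = 30 observations and five coefficients in Model 2 of question 4 (β0 through β4), and the function name `partial_f` and its argument names are hypothetical. The exam may instead expect a t test on β4 read from computer output, which is not reproduced here.

```python
def partial_f(sse_reduced, sse_full, n_obs, n_params_full, n_extra):
    """Partial F statistic for testing n_extra added terms in nested models.

    sse_reduced   -- SSE of the smaller model
    sse_full      -- SSE of the larger model
    n_obs         -- number of observations
    n_params_full -- number of coefficients in the larger model (with intercept)
    n_extra       -- number of terms the larger model adds
    """
    df_error = n_obs - n_params_full            # error degrees of freedom, full model
    numerator = (sse_reduced - sse_full) / n_extra
    denominator = sse_full / df_error           # mean squared error of the full model
    return numerator / denominator, df_error


# Question 4: Model 1 (SSE = 77212) versus Model 2 (SSE = 76727).
# Assumed here: n = 30 days, 5 coefficients in Model 2, 1 added term.
f_stat, df_error = partial_f(77212, 76727, n_obs=30, n_params_full=5, n_extra=1)
print(f"F = {f_stat:.3f} on 1 and {df_error} degrees of freedom")
# Compare F with the tabulated 5% critical value for 1 and 25 degrees
# of freedom (about 4.24 in standard F tables).
```

The same function could be applied to 8(c) by passing the SSE of each model with `n_extra=2`, since Model 2 there adds the two squared terms.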