Name:

Practice First Midterm Exam
Statistics 1000, Spring 2007 (Pfenning)

This is a closed book exam worth 150 points. You are allowed to use a calculator and a two-sided sheet of notes. There are 9 problems, with point values as shown. If you want to receive partial credit for wrong answers, show your work. Don’t spend too much time on any one problem.

1. (5 pts.) Suppose we are interested in finding out if students tend to sleep less, the older they are.
   (a) What would be an appropriate display?
       (i) bar graph (ii) histogram (iii) side-by-side boxplots (iv) scatterplot
   (b) Which of these would provide the best summary?
       (i) compare percentages (ii) compare means and standard deviations (iii) compare Five Number Summaries (iv) report the correlation

2. (5 pts.) Suppose we are interested in finding out if smokers exercise less than non-smokers. Data values for exercise times include some high outliers.
   (a) What would be an appropriate display?
       (i) bar graph (ii) histogram (iii) side-by-side boxplots (iv) scatterplot
   (b) Which of these would provide the best summary?
       (i) compare percentages (ii) compare means and standard deviations (iii) compare Five Number Summaries (iv) report the correlation

3. (5 pts.) Suppose we are interested in finding out if males are just as likely as females to prefer the color black.
   (a) What would be an appropriate display?
       (i) bar graph (ii) histogram (iii) side-by-side boxplots (iv) scatterplot
   (b) Which of these would provide the best summary?
       (i) compare percentages (ii) compare means and standard deviations (iii) compare Five Number Summaries (iv) report the correlation

4. (10 pts.) Words per minute typed by experienced typists follows a normal distribution with mean 60 and standard deviation 15.
   (a) According to the 68-95-99.7 Rule, 95% of experienced typists type between ____ and ____ words per minute.
   (b) Suppose an experienced typist can type 78 words per minute. What is his standard (z) score?

5. (20 pts.) Two banks each have three tellers helping customers. One bank requires customers to stand in separate lines for the three tellers, the other has customers stand in a single line and be called to the next available teller. Below are a back-to-back stemplot and side-by-side boxplot for waiting times (stems are minutes) of 10 customers at the bank with separate lines and 11 customers at the bank with a single line.

   Separate Single
   1 4 6 2 5 7 2 6 6 7 7 9 7 7 7 1 2 3 4 7 8 8 5 8 3 9 10 0 11

   (a) Judging from the looks of the stemplot, which arrangement seems to be faster?
       (i) separate lines (ii) single line (iii) both about the same
   (b) For which arrangement do the waiting times have more spread?
       (i) separate lines (ii) single line (iii) both about the same
   (c) One fourth of the customers in the bank with separate lines waited ____ minutes or less. (Find Q1.)
   (d) The boxplots indicate that both distributions are
       (i) very left-skewed (ii) fairly symmetric (iii) very right-skewed

7. (30 pts.) Cereal manufacturers looked at the relationship between number of days x that 14 cereal boxes spent on the supermarket shelf, and moisture content y. Scatterplot and regression output are given below.
   (a) What is the response variable?
   (b) Sitting on the shelf tends to make cereal
       (i) dryer (ii) soggier (iii) neither
   (c) Which of the following is the best guess for r?
       (i) -.95 (ii) -.55 (iii) -.15 (iv) .15 (v) .65 (vi) .95
   (d) If we switched the roles of x and y, then which of the following would change?
       (i) the value of r (ii) the equation of the regression line (iii) both (iv) neither
   (e) Predict the moisture content of a cereal box that sat on the shelf for 10 days.
   (f) What is the residual for a shelf time of 10 days, if the actual moisture content was 3.40?
   (g) Suppose a supermarket accidentally kept a cereal box on the shelf for 100 days. What can we say about its moisture content?
       i. It should equal 7.29.
       ii. It should be very close to the predicted value because of the high x value.
       iii. It could be far from the predicted value because of extrapolation.
   (h) The box which spent 20 days on the shelf is an
       (i) outlier (ii) influential observation (iii) both (iv) neither
   (i) Taste tests indicated that the cereal is unacceptably soggy when the moisture content exceeds 4.1. Judging from the scatterplot, what would be a good time to remove unsold cereal from the shelf? After
       (i) a day (ii) a week (iii) a month (iv) a year

   Regression Analysis: moisture versus days

   The regression equation is
   moisture = 2.79 + 0.045 days

   Predictor      Coef   SE Coef      T      P
   Constant    2.78551   0.09485  29.37  0.000
   days       0.044620  0.004113  10.85  0.000

   S = 0.1962   R-Sq = 90.7%   R-Sq(adj) = 90.0%

   Unusual Observations
   Obs  days  moisture     Fit  SE Fit  Residual  St Resid
     8  20.0    3.1000  3.6779  0.0525   -0.5779    -3.06R

   R denotes an observation with a large standardized residual

   [Scatterplot: moisture (vertical axis, 3 to 5) versus days (horizontal axis, 0 to 40)]

8. (20 pts.) 350 students at 18 Seattle schools in high-crime areas participated in a study during the 1980’s. About half of the students took part in a program throughout elementary school which trained them how to earn good grades and get along with others; the other half did not take part in the program. The pregnancy rate for young women in the program, by the time they reached the age of 21, was only 38 percent, compared with 56 percent for the women who had gotten no training.
   (a) What kind of study was this?
       (i) observational study (ii) experiment (iii) anecdotal evidence (iv) multistage sample
   (b) Which of these best describes the intended population of interest?
       i. 350 students at 18 Seattle schools in high crime areas
       ii. all students at Seattle schools
       iii. all students at schools in high crime areas
   (c) The treatment group’s pregnancy rate was how much lower than the rate for the control group?
   (d) Which of the following could be a possible lurking (confounding) variable?
       i. if students in one group had a different Health and Sex Ed teacher than those in the other group
       ii. if students in one group were trained to get along with others and students in the other group were not
       iii. if female students in one group tended to get pregnant and those in the other group did not
   (e) What would be the best way for researchers to assign some students to attend the program, others not?
       (i) put males in one group and females in the other (ii) ask for volunteers (iii) make a random assignment
   (f) This problem involves
       (i) one quantitative and one categorical variable (ii) two quantitative variables (iii) two categorical variables
   (g) To summarize differences, we
       (i) compare percentages (ii) compare means (iii) report the correlation r
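
The arithmetic behind the computational parts of Problems 4 and 7 can be sketched in Python. This is only an illustrative check and not part of the original exam: the function names are hypothetical, the numbers are copied from Problem 4 and from the regression output in Problem 7, and the sketch assumes the unrounded coefficients (2.78551 and 0.044620) rather than the rounded equation.

```python
# Illustrative sketch only; all function names are hypothetical.

# Problem 4: normal model for typing speed (values stated in the exam).
MEAN_WPM = 60.0
SD_WPM = 15.0


def z_score(value, mean, sd):
    """Standardized score: how many standard deviations value lies from the mean."""
    return (value - mean) / sd


def middle_95_interval(mean, sd):
    """Interval mean +/- 2 sd, as given by the 68-95-99.7 Rule."""
    return (mean - 2 * sd, mean + 2 * sd)


# Problem 7: coefficients copied from the regression output (unrounded).
INTERCEPT = 2.78551
SLOPE = 0.044620


def predict_moisture(days):
    """Fitted moisture for a given shelf time, using the least-squares line."""
    return INTERCEPT + SLOPE * days


def residual(observed_moisture, days):
    """Residual = observed value minus fitted value."""
    return observed_moisture - predict_moisture(days)


if __name__ == "__main__":
    print("Problem 4(a):", middle_95_interval(MEAN_WPM, SD_WPM))
    print("Problem 4(b):", round(z_score(78, MEAN_WPM, SD_WPM), 2))
    print("Problem 7(e):", round(predict_moisture(10), 4))
    print("Problem 7(f):", round(residual(3.40, 10), 4))
    # Sanity check against observation 8 in the regression output.
    print("Obs 8 fit:", round(predict_moisture(20), 4))
    print("Obs 8 residual:", round(residual(3.10, 20), 4))
```

As a sanity check, evaluating the line at 20 days with an observed moisture of 3.10 reproduces the Fit (3.6779) and Residual (-0.5779) listed for observation 8 under Unusual Observations.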