INTRO TO STATISTICS Q & A 2023.Qualified Virginia State University, Exams of Nursing


Typology: Exams

2022/2023

Available from 07/17/2023

Collet


Partial preview of the text

ISYE 6501 MIDTERM QUIZ 1 WITH CORRECT ANSWERS UPDATED 2022.Qualified Virginia State University

Midterm Quiz 1 - GT Students (Launch Proctortrack first before taking the Midterm Quiz 1)

95 Minute Time Limit

Instructions

Work alone. Do not collaborate with or copy from anyone else.
You may use any of the following resources: one sheet (both sides) of handwritten (not photocopied or scanned) notes.
If any question seems ambiguous, use the most reasonable interpretation (i.e. don't be like Calvin).

Good Luck!

Question 0 -- Practice with Drag & Drop
0 points possible (ungraded)

Drag each model or method to a category of question it is commonly used for. For models/methods that have more than one correct category, choose any one correct category; for models/methods that have no correct category listed, do not drag them.

Models and methods: CUSUM, Principal component analysis, Support vector machine, k-means, ARIMA, CART, Exponential smoothing, k-nearest-neighbor, Linear regression, Logistic regression, Random forest, Cross validation, GARCH

Feedback: Correctly placed 8 items. Misplaced 1 item. Did not place 3 required items.
Final attempt was used, highest score is 9.0

Question 2
2.19/3.0 points (graded)

Select all of the following models that are designed for use with time series data:
- k-nearest-neighbor
- Principal component analysis
- ARIMA
- k-means
- CUSUM
- Logistic regression
- GARCH
- Exponential smoothing

- Figure A's classifier is based only on the value of x2.
- Figure A has fewer classification errors in the training data.
- Figure A's classifier has a wider margin in the training data.
- Figure A's classifier incorrectly classifies exactly 4 white points in the training data.
- Figure A shows that the black point (7.2,1.4) is an outlier.

Question 3b
2.25/3.0 points (graded)

3b. Select all of the following statements that are true.
- Figure B's classifier has a narrower margin in the training data.
- Figure B's classifier is more likely to be over-fit.
- Figure B's classifier incorrectly classifies exactly 5 white points in the training data.
- Figure B shows that the black point (7.2,1.4) should be white.

Question 3c
1.5/3.0 points (graded)

3c. Select all of the following statements that are true.
- A new point at (3,3) would be classified as white by Figure A's classifier.

Question 3e
0.99/3.0 points (graded)

3e. In the hard classification SVM model, it might be desirable to not put the classifier in a location that has equal margin on both sides... (select all correct answers):
- ...because moving the classifier will usually result in fewer classification errors in the validation data.
- ...because moving the classifier will usually result in fewer classification errors in the test data.
- ...when the costs of misclassifying the two types of points are significantly different.
Information for Questions 4a, 4b, 4c

Seven different regression models have been fitted, using different sets of variables. The figure below shows the resulting adjusted R-squared value for various models, as measured by cross-validation.

Question 4a
0.0/3.0 points (graded)

Which of the models would you expect to perform worst on a test data set?
- Model 6, because it has a slightly lower Adjusted R^2 than Model 5 and uses one more predictor.
- Model 2, because it's the simplest of those with a high Adjusted R^2.
- Model 5, because it has the highest Adjusted R^2.

Additional Information for Question 4c

The table below shows the Akaike Information Criterion (AIC), Corrected AIC, and Bayesian Information Criterion (BIC) for each of the models.

Model    AIC      Corrected AIC    BIC
1        -5.58    -5.32             2.07
2        -5.67    -5.15             3.89
3        -6.51    -5.62             4.96
4        -4.77    -3.41             8.61
5        -2.80    -0.85            12.49
6        -1.31     1.35            15.90
7         0.19     3.71            19.31

Question 4c
0.75/3.0 points (graded)

Based on the table above and the figure shown for Question 4a, select all of the following statements that are correct.
- Adjusted R^2 (see figure above 4a) and BIC (see table above 4c) give qualitatively opposite evaluations of Model 7.
- Among Models 2 and 4, AIC suggests that Model 2 is $e^{(-5.67-(-4.77))/2} = 63.8\%$ as likely as Model 4 to be better.
- Among Models 2 and 4, AIC suggests that Model 4 is $e^{(-5.67-(-4.77))/2} = 63.8\%$ as likely as Model 2 to be better.
- BIC suggests that Model 7 is very likely to be better than Model 5.
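As a hedged illustration of the arithmetic in the Question 4c options, the sketch below evaluates the relative-likelihood expression exp((AIC_low - AIC_high)/2) for Models 2 and 4, using the AIC values from the table above. The function name `relative_likelihood` is an assumption made for this example and does not come from the quiz. Under the usual reading of this quantity, the result describes the higher-AIC model relative to the lower-AIC one.

```python
import math


def relative_likelihood(aic_low: float, aic_high: float) -> float:
    """Return exp((aic_low - aic_high) / 2).

    Hypothetical helper for illustration: aic_low is the smaller (better)
    AIC value and aic_high is the larger one, so the result lies in (0, 1].
    """
    return math.exp((aic_low - aic_high) / 2)


# AIC values for Models 2 and 4, copied from the table in the quiz.
aic = {2: -5.67, 4: -4.77}

ratio = relative_likelihood(aic[2], aic[4])
print(f"Relative likelihood: {ratio:.1%}")
```

The exponent is (-5.67 + 4.77)/2 = -0.45, so the printed value agrees with the 63.8% quoted in the options.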
Information for all parts of Question 5

Atlanta’s main library has collected the following day-by-day data over the past six years (more than 2000 data points):
x1 = Number of books borrowed from the library on that day
x2 = Day of the week
x3 = Temperature
x4 = Amount of rainfall
x5 = Whether the library was closed that day
x6 = Whether public schools were open that day

- Negative, because higher values of x7 decrease the response (books borrowed today)
- Positive, because higher values of x7 increase the response (books borrowed today)

c. Does x7 make the model autoregressive?
- No, because the model does not use previous response data to predict the day t response.
- Yes, because the model uses day t − 1 data to predict day t circulation.
- Yes, because the model uses both day t − 1 and day t temperature data as predictors.

Information for Question 5d

The library believes that as the temperature gets either too cold or too hot, more people come indoors to the library to borrow books. They have fit the data to a quadratic function (see the figure below).

Question 5d
0.0/4.0 points (graded)

How would you incorporate the new information above into the library's regression model?
- Add a (temperature)^2 variable to the model.
- Replace the temperature variable with a (temperature)^2 variable in the model.
- Change the model to estimate the square root of the books borrowed, as a function of temperature, day of the week, inches of rainfall, whether the day is a holiday, and whether schools were open.

Question 5e-iii
3.0/3.0 points (graded)

iii. Aside from seasonal and trend effects, the library believes that the random variation in books borrowed each day is small. Should they expect the best value of α (the baseline smoothing constant) to be:
- α < 0
- 0 < α < 1/2
- 1/2 < α < 1
- α > 1

Information for Questions 5f, 5g, 5h

The library would like to compare the regression and exponential smoothing models to determine which is a better predictor, using the mean absolute error $\sum |(\text{books borrowed}) - (\text{model's estimate})| / n$ as a measure of prediction quality.
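The mean absolute error described above can be sketched in a few lines. This is a minimal illustration only: the function name `mean_absolute_error` and the daily figures are hypothetical values chosen for the example, not data from the library or the quiz.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all n observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily counts of books borrowed and a model's estimates.
borrowed = [120, 95, 143, 80]
estimate = [110, 100, 150, 90]

mae = mean_absolute_error(borrowed, estimate)
print(mae)  # absolute errors are 10, 5, 7, 10, so the mean is 8.0
```

Because every error enters as an absolute value, days where the model overestimates and days where it underestimates cannot cancel each other out.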
Question 5f
0.0/4.0 points (graded)

Select the best of the following four options for splitting the data:
- 70% for training, 15% for validation, 15% for test
- 15% for training, 70% for validation, 15% for test
- 15% for training, 15% for validation, 70% for test
- 55% for training, 15% for cross-validation, 15% for validation, 15% for test

                               Mean absolute error    Mean absolute error
                               (training set)         (validation set)
Regression model               117                    152
Exponential smoothing model    148                    153

Select all of the reasonable suggestions below:
- To choose between the models, we should see which one does better on the training set.
- The regression model is clearly better, because it does better on the training set and about the same on the validation set.
- The exponential smoothing model is probably fit too much to random patterns (i.e., it is overfit), because it performs much worse than the regression model on the training set.
- If there had been 20 models, the one that performed best on the validation set would probably not perform as well on the test set as it did on the validation set.

Question 5i
2.01/3.0 points (graded)

Fewer books are borrowed on Fridays than any other day. The library would like to determine whether there has been a change in the Friday effect on borrowing, over the past forty years (for this part only, assume there are forty years of data available).

Select all of the approaches that might reasonably be correct.
- Use CUSUM on the number of additional books borrowed on the average Friday compared to the average other day over the past forty years.
- Use exponential smoothing (with L = 7) to find the seasonal multiplier values C_t for each Friday, and use CUSUM on those values.
- Build a regression model for each of the forty years, and use CUSUM on the coefficients of the Friday variable.

Information for Questions 6a, 6b

A logistic regression model was built to model the probability that a retailer’s inventory of a popular product will run out before the next delivery from the manufacturer, based on a number of factors (amount of current inventory, past demand, promotions, etc.). If the logistic regression’s output is greater than a threshold value p, the retailer pays an additional amount D for a quick delivery to avoid running out. There are three confusion matrices below, for three different threshold values of p:

Question 7
8/8 points (graded)

The figures below each show a data set that will be used in k-means clustering algorithms (where distance between values is important). Each data set has two attributes. For each data set, drag to it the data preparations that are needed for k-means to work well on the data set.
- Only outlier removal
- Neither outlier removal nor scaling
- Only scaling
- First outlier removal and then scaling
- First scaling and then outlier removal

Feedback: Correctly placed 4 items.

Final attempt was used, highest score is 8.0

Information for Questions 8a, 8b

Question 8b
2.01/3.0 points (graded)

A random forest model was built for the same purpose, using the same 7 covariates. Which of the following statements are true?
- The random forest model uses many trees, but returns a single tree solution that can be analyzed.
- The random forest model uses a single tree solution.
- The random forest model can report the relative importance of each variable.
Information for Question 8c

A data scientist has run principal component analysis on the 7 covariates, with the following results:

Component    Eigenvalue
1            2.20
2            0.12
3            0.10
4            0.09
5            0.08
6            0.06
7            0.05

Question 8c
2.0/4.0 points (graded)

Select all of the following statements that are correct:
- It is likely that the first principal component has much more predictive power than principal components 2-7.
- It is likely that the first original covariate has much more predictive power than covariates 2-7.
- It is likely that the last original covariate has much less predictive power than covariates 1-6.
- The first principal component cannot contain information from all 7 original covariates.
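To put the eigenvalues from the Question 8c table on a common scale, the sketch below converts each one into its share of the total. The variable names are assumptions made for this example. The shares describe how much of the variance in the covariates each component captures, which is not by itself a measure of predictive power for a response.

```python
# Eigenvalues for components 1-7, copied from the table in the quiz.
eigenvalues = [2.20, 0.12, 0.10, 0.09, 0.08, 0.06, 0.05]

total = sum(eigenvalues)

# Share of total variance attributed to each principal component.
shares = [value / total for value in eigenvalues]

for component, share in enumerate(shares, start=1):
    print(f"Component {component}: {share:.1%}")
```

The eigenvalues sum to 2.70, so the first component accounts for 2.20/2.70, or roughly 81% of the total, and each remaining component accounts for under 5%.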