Docsity

Solutions to PSTAT 120C Practice Midterm: Hypothesis Testing and Sign Test

Solutions to the practice midterm questions for the PSTAT 120C course, focusing on hypothesis testing and the sign test. It includes calculations for various statistical tests, such as the one-sample t-test, signed-rank test, and runs test.

Typology: Exams

Pre 2010

Uploaded on 09/17/2009

koofers-user-tzg

Partial preview of the text

PSTAT 120C: Solutions to Practice Midterm Questions
April 29, 2009

1. 15.4

(a) There are 7 out of 10 pairs of twins in which student A did better than student B. Under the null hypothesis, the probability of a result at least this extreme is

$$P\{M \ge 7\} = 2^{-10}\left[\binom{10}{7} + \binom{10}{8} + \binom{10}{9} + \binom{10}{10}\right] = 2^{-10}(120 + 45 + 10 + 1) = \frac{176}{1024},$$

so the P-value is $2\left(\frac{176}{1024}\right) = 0.34375$. This is not significant at the $\alpha = 0.05$ level because the P-value is bigger than $\alpha$.

(b) For the one-sided alternative, the P-value is

$$P = \frac{176}{1024} = 0.171875.$$

15.13 The ranks of the differences are

Pair  Difference  Rank  Sign
1     28          10    +
2     5           4     +
3     4           3     −
4     15          9     +
5     8           6     +
6     2           1     −
7     7           5     +
8     9           7     +
9     3           2     −
10    13          8     +

The sum of the ranks of the negative observations is $T^- = 3 + 1 + 2 = 6$. The critical value for $\alpha = 0.05$ from Table 9 is 8 for a two-sided test and 11 for a one-sided test. Since $T^- = 6$ is below both critical values, we get a significant difference between the two samples in either the one-sided or the two-sided test. Therefore, this test is more significant than the sign test. This is often true because we are able to use more information in the data (from the ranks) to get a more powerful test.

2. (a) The estimate of the variance of the difference is

$$s^2 = \frac{7}{6}\left(0.44464 - (-0.45)^2\right) = 0.2825.$$

The critical values for a 90% confidence interval when there are 6 degrees of freedom are $\chi^2_{6,.95} = 1.63539$ and $\chi^2_{6,.05} = 12.5916$. The confidence interval is therefore

$$\left[\frac{(n-1)s^2}{\chi^2_{6,.05}},\ \frac{(n-1)s^2}{\chi^2_{6,.95}}\right] = \left[\frac{6(0.2825)}{12.5916},\ \frac{6(0.2825)}{1.63539}\right] = [0.135,\ 1.04].$$

(b) The t statistic is

$$t = \frac{\bar d}{\sqrt{s^2/n}} = \frac{-0.45}{\sqrt{0.2825/7}} = -2.24.$$

The critical value with 6 degrees of freedom is 1.943. Since $|t| = 2.24$ exceeds it, there is a significant difference between the prices.

(c) The sign test notes that $M = 1$ of the stores has sugar more expensive than eggs.
The probability of this happening by chance is

$$P\{M \le 1\} = P\{M = 0\} + P\{M = 1\} = 2^{-7}(1 + 7) = 0.0625,$$

giving a P-value of 0.125.

(d) The signed-rank test first ranks the seven stores by the magnitudes of the differences between the observations.

Store  Sugar, x ($)  Eggs, y ($)  |x − y| ($)  Rank
1      1.40          2.00         0.60         4
2      1.55          1.70         0.15         2
3      1.85          1.45         0.40         3
4      0.95          1.00         0.05         1
5      0.85          1.90         1.05         7
6      1.00          1.75         0.75         5
7      1.15          2.10         0.95         6

Only store 3 has sugar more expensive than eggs, so the rank sum of the positive differences is $T^+ = 3$. From Table 9 in the text, the critical value for $\alpha = 0.1$ is 4. Therefore, the signed-rank test shows a significant difference between the prices in the two samples.

3. (a) $n = 239$, $\bar x = 118.5$ and $\bar y = 113.7$ with $s_d = 38.37$. The test statistic is

$$Z = \frac{118.5 - 113.7}{38.37/\sqrt{239}} = 1.934.$$

This is greater than 1.645 and therefore it is significant.

(b) The test rejects when the test statistic exceeds 1.645, so the power when the true mean difference is 10 is

$$P\left\{\frac{\bar x - \bar y}{38.37/\sqrt{239}} > 1.645\right\} = P\left\{\frac{\bar x - \bar y - 10}{38.37/\sqrt{239}} > 1.645 - \frac{10}{38.37/\sqrt{239}}\right\} = P\{Z > 1.645 - 4.029\} = P\{Z > -2.38\} = 1 - 0.0087 = 0.9913.$$

4. The following data on the number of incident reports filed at two police stations was collected over a number of weeks.

(a) In 186 of the 437 plots, variety A produced a greater yield of lysine than variety B. The normal approximation takes

$$Z = \frac{186 - 437\left(\frac{1}{2}\right)}{\sqrt{437/4}} = \frac{-32.5}{\sqrt{109.25}} = -3.10937.$$

Thus, the probability of being smaller than this is $P\{Z \le -3.11\} = 0.00094$, so that we get a P-value of 0.00188.

(b) We only need to assume that the yields in separate plots are independent and that all the pairs have the same distribution, so that the probability that variety A has a higher yield is the same for each plot.

(c) Suppose we want to perform a Wilcoxon signed-rank test and calculate $T^- = 38{,}025$. The expected value of $T^-$ is $n(n+1)/4 = 437(438)/4 = 47{,}851.5$, and the variance of the test statistic is $n(n+1)(2n+1)/24 = (437)(438)(875)/24 = 6{,}978{,}343.75$.
Thus,

$$Z = \frac{38{,}025 - 47{,}851.5}{\sqrt{6{,}978{,}343.75}} = \frac{-9{,}826.5}{2{,}641.66} = -3.72,$$

which yields a P-value of $2P\{Z \le -3.72\} = 2(0.0001) = 0.0002$.

Note: I'm sorry these Z values were not on the table in the book. I got the results above using computer software. If I were just using the book, I would simply bound the P-value using the values given at the bottom of the table. For part (a), the answer would be $P = 2P\{Z < -3.11\} < 2P\{Z < -3\} = 0.0027$, and for part (c), you could have used $P = 2P\{Z \le -3.72\} < 2P\{Z < -3.5\} = 0.000466$.

8. A number of fish were collected from two lakes and the length of each fish was recorded. The X column refers to fish from Xavier Lake, and the Y column refers to Yearling Lake.

#    X     Rank  Y     Rank
1    8.5   4     7.3   1
2    9.2   8     7.6   3
3    10.4  11    10.0  10
4    9.1   7     7.47  2
5    8.6   5     8.7   6
6    9.4   9
Sum        44          22

(a) The appropriate rank-sum statistic is $W = 22$.

(b) The U statistic is $U = 22 - 5(6)/2 = 22 - 15 = 7$. This is less than the expected value of 15. From the table on pages 862–867 of the textbook, $P\{U \le 7\} = 0.0887$, which makes the P-value 0.1774.

(c) The advantage of this nonparametric test over the two-sample t test is that we do not need to assume that the data are normally distributed. It is likely that the size of each fish drawn from the lake has a distribution that is not exactly normal, and this test is still appropriate under those conditions. Unfortunately, the nonparametric test has somewhat less power.

9. (a) The generalized likelihood ratio test uses the ratio

$$\Lambda = \frac{\max_{\lambda \in \Theta_0} L(\lambda; X)}{\max_{\lambda} L(\lambda; X)}.$$

The likelihood is

$$L(\lambda; X) = \frac{\lambda^X e^{-\lambda}}{X!}.$$

In our case, the null hypothesis consists of the single value $\lambda = 10$. The MLE is given as $\hat\lambda = X$. In the denominator of our fraction we want to maximize over all $\lambda \ge 10$ because that is the extent of the parameter space being considered in the one-sided test. Therefore,

$$\max_{\lambda} L(\lambda; X) = \begin{cases} L(X; X) & \text{if } X > 10 \\ L(10; X) & \text{if } X \le 10. \end{cases}$$

This leads to the ratio

$$\Lambda = \frac{L(10)}{L(X)} = \left(\frac{10^X e^{-10}}{X!}\right)\left(\frac{X!}{X^X e^{-X}}\right) = \left(\frac{10}{X}\right)^X e^{X-10}$$

when $X \ge 10$, and $\Lambda = 1$ when $X < 10$.
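The ratio in 9(a) can be checked numerically. The following is only a minimal sketch: the helper names `poisson_likelihood`, `glr`, and `glr_closed_form` are made up for illustration and are not from the course materials. It computes the ratio directly from the Poisson likelihood, with the denominator maximized over $\lambda \ge 10$, and compares it with the closed form $(10/X)^X e^{X-10}$.

```python
import math


def poisson_likelihood(lam: float, x: int) -> float:
    """Poisson likelihood L(lam; x) = lam^x e^(-lam) / x!, computed on the log scale."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def glr(x: int, lam0: float = 10.0) -> float:
    """Likelihood ratio for H0: lam = lam0 when the parameter space is lam >= lam0.

    Hypothetical helper for illustration only.
    """
    # The unrestricted MLE is x, but the denominator only allows lam >= lam0,
    # so the maximizer is x when x > lam0 and lam0 otherwise.
    lam_hat = max(float(x), lam0)
    return poisson_likelihood(lam0, x) / poisson_likelihood(lam_hat, x)


def glr_closed_form(x: int, lam0: float = 10.0) -> float:
    """Closed form: (lam0/x)^x e^(x - lam0) for x > lam0, and 1 otherwise."""
    if x <= lam0:
        return 1.0
    return (lam0 / x) ** x * math.exp(x - lam0)


# Print the direct ratio next to the closed form for a few observed counts.
for x in (3, 10, 11, 15, 20):
    print(x, round(glr(x), 6), round(glr_closed_form(x), 6))
```

Working on the log scale avoids overflow in $X!$ for large counts. The printed values should equal 1 for $X \le 10$ and fall below 1 as $X$ grows past 10.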
(b) The size of this test is the probability of the rejection region $\{X \le 3\}$ under the null hypothesis:

$$P\{X \le 3 \mid \lambda = 10\} = \sum_{k=0}^{3} P\{X = k \mid \lambda = 10\} = e^{-10} + 10e^{-10} + \frac{100}{2}e^{-10} + \frac{1000}{6}e^{-10} = e^{-10}(1 + 10 + 50 + 166.6667) = 0.01034.$$

(c) The power of the test when $\lambda = 5$ is

$$P\{X \le 3 \mid \lambda = 5\} = \sum_{k=0}^{3} P\{X = k \mid \lambda = 5\} = e^{-5} + 5e^{-5} + \frac{25}{2}e^{-5} + \frac{125}{6}e^{-5} = e^{-5}(1 + 5 + 12.5 + 20.83333) = 0.26503.$$

(d) This is not the GLRT for this test. The likelihood ratio is (for $X > 10$)

$$\Lambda = \left(\frac{10}{X}\right)^X e^{X-10} = \exp\left[X \log 10 - X \log X + X - 10\right].$$

The derivative of this function with respect to $X$ is

$$\frac{d\Lambda}{dX} = (\log 10 - \log X)\exp\left[X \log 10 - X \log X + X - 10\right],$$

which is negative for $X > 10$. Therefore, $\Lambda$ is a decreasing function of $X$, which implies that the GLRT is of the form $\{X \ge k\}$ for some constant $k$. Our set $\{X \le 3\}$ is not of this form and therefore cannot be a GLRT. In fact, $\Lambda = 1$ for $X \le 3$, and $\Lambda < 1$ for $X > 10$ (e.g. $\Lambda(11) = 0.952741$). Therefore there is no $k$ such that the set $\{\Lambda < k\} = \{X \le 3\}$.
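The size and power in 9(b) and 9(c) are both lower-tail Poisson probabilities for the rejection region $\{X \le 3\}$, so they can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; `poisson_cdf` is an illustrative helper name, not something from the course.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P{X <= k} for X ~ Poisson(lam), summed term by term."""
    return sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))


# Size: probability of the rejection region {X <= 3} under the null, lambda = 10.
size = poisson_cdf(3, 10.0)

# Power: probability of the same region under the alternative value lambda = 5.
power = poisson_cdf(3, 5.0)

print(f"size  = {size:.5f}")
print(f"power = {power:.5f}")
```

Rounded to five decimals, these should agree with the hand calculations 0.01034 and 0.26503 given in parts (b) and (c).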



Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved