Hypothesis Testing in Statistics - Prof. Saeed Maghsoodloo, Study notes of Probability and Statistics

These notes cover the concept of statistical hypothesis testing, including the null and alternative hypotheses, the sampling distribution of the point estimator, and the critical region of a null hypothesis. They also cover the power of a statistical test, Type I and Type II errors, and the operating characteristic (OC) curve.

Typology: Study notes

Pre 2010

Uploaded on 08/18/2009

koofers-user-846

Reference: Chapter 8 of J. L. Devore's 7th Edition (Maghsoodloo)

TESTING A STATISTICAL HYPOTHESIS

A statistical hypothesis is an assumption about the frequency function(s) (i.e., PDF or pdf) of one or more random variables. Stated in simpler terms, a statistical hypothesis is an assumption about the parameter(s) of one or more population(s), such as µ, σ², µx − µy, p, p1 − p2, and σ1²/σ2².

Examples.
(a) H0: µ = 100, H1: µ ≠ 100;
(b) H0: µ = 5000 psi, H1: µ > 5000 psi;
(c) H0: σ² = 0.05, H1: σ² < 0.05;
(d) H0: µx − µy = 0, H1: µx − µy ≠ 0;
(e) H0: σx²/σy² = 1, H1: σx²/σy² ≠ 1;
(f) H0: µ = 55 dB, H1: µ < 55 dB.

In the above examples, H0 is called the null hypothesis while H1 is called the alternative hypothesis (note that Devore uses the notation Ha for the alternative hypothesis, but his notation is not as prevalent in the statistical literature as H1). Further, the above examples indicate that, just like confidence intervals (CIs), there are two types of hypotheses: two-sided and one-sided. The alternative H1 (or Ha) always determines whether a hypothesis is one-sided or 2-sided. If the statement under H1 involves ≠, then the hypothesis is 2-sided; otherwise, the statement under H1 will include either < for a left-tailed test or > for a right-tailed test. Therefore, Examples (a), (d) and (e) above constitute 2-sided hypotheses, while Example (b) formulates a right-tailed test, and (c) and (f) are left-tailed tests. Finally, bear in mind that the equality sign "=" generally will be used only in the statement under H0 (almost never under H1).

In the remainder of this chapter, we will study hypothesis testing about a single normal population parameter such as µ, σ², and p. Hypothesis testing about parameters of two populations will be discussed in Chapter 9 of Devore.
Just as in the case of CIs (confidence intervals), the sampling distribution (SMD) of the point estimator of the parameter under H0 must be used to conduct a parametric test of hypothesis. That is to say, the null hypothesis H0: µ = µ0 must be tested using the SMD of the point estimator x̄ when σ² is known. The null hypothesis H0: σ = σ0 must be tested using the SMD of (n − 1)S²/σ², which is χ²ₙ₋₁ (chi-square with n − 1 degrees of freedom). The null hypothesis H0: µ = µ0, when σ² is unknown, must be tested using the SMD of (x̄ − µ)√n/S, which is that of Student's t with ν = n − 1 degrees of freedom (df). The critical (or rejection) region of a null hypothesis is that part of the range space of the test statistic that corresponds to the rejection of H0.

In testing a statistical hypothesis, any one of the following four circumstances may occur.

| Decision | H0 is true | H0 is false |
| Reject H0 | Type I Error, or False Positive; occurrence Pr = α = LOS | True Positive (or Sensitivity), a correct decision; occurrence Pr = 1 − β |
| Do not reject H0 (or Accept H0) | True Negative (or Specificity of the test), a correct decision; occurrence Pr = 1 − α | Type II Error, or False Negative; occurrence Pr = β |

Note that once a decision about H0 is made based on sample data, the above 4 circumstances reduce to only two possibilities. For example, if a data set provides sufficient evidence to reject H0, then the experimenter has either committed a Type I error or has made a correct decision [cells (1, 1) and (1, 2), respectively], and vice versa when the sample data do not provide sufficient evidence to reject H0. Further, the Pr of rejecting H0 given that H0 is false is called the power of the statistical test (or the sensitivity of the test), which, from cell (1, 2) of the above table, is clearly equal to 1 − β.

Finally, the question arises as to when an experimenter should conduct a 2-sided test, and when a one-sided test is warranted.
This author recommends a 2- …

… is always less than the corresponding pre-assigned value of α; further, the smaller the P-value = α̂ is, the stronger the evidence is against the validity of H0.

Before leaving this section, we must state the fact that all CIs are tests of hypotheses in disguise for all possible values of the parameter under H0. To illustrate this fact, recall the results of Example 36 (pp. 96–98 of STAT 3600), where we obtained the 95% lower one-sided CI for the mean breaking strength to be 1251.7750 < µ < ∞. The rejection of our null hypothesis H0: µ = 1250 vs H1: µ > 1250 at α = 0.05 is consistent with this CI because the hypothesized mean value µ0 ≡ 1250 is outside the 95% CI (1251.7750, ∞). It must then be clear that the sample mean x̄ = 1260 does not provide sufficient evidence at α = 0.05 to reject H0: µ = 1252 vs H1: µ > 1252, because µ0 ≡ 1252 is inside the 95% CI 1251.7750 < µ < ∞. Note that the upper one-sided 95% CI for µ is given by (−∞, x̄ + Z0.05×SE(x̄)) = (−∞, 1268.225), which includes the hypothesized value of 1250 and hence contradicts the decision to reject H0: µ = 1250 at the 5% level; for a right-tailed test it is the lower one-sided CI that must be compared with the test.

Exercise 65. Determine if the 99% CI of Exercise 55(a) on page 97 of my STAT 3600 notes is consistent with the test result of Exercise 64.

COMPUTING A TYPE II ERROR PROBABILITY β

Since β is the Pr of accepting H0 if H0 is false, then when testing H0: µ = µ0, the Pr statement for β of Example 38 is given by β = Pr(x̄ falls within the acceptance interval | H0 is false, i.e., µ > 1250). To illustrate the computation of β, again consider Example 38, where H0: µ = 1250 and H1: µ > 1250 psi. Suppose the true mean value is µ = 1255, i.e., H0 is now false. Then, what is the Pr of accepting H0: µ = 1250 with a random sample of size 25, before the sample is drawn, given that the population mean µ = 1255 ≠ µ0? The SMD of x̄ given that H1 is true (or H0 is false, i.e., µ ≠ 1250) is depicted in Figure 14 below.
From Figure 14 we deduce that the value of β(at µ = 1255) = Pr(−∞ < x̄ < 1258.225 | µ = 1255) = Pr((x̄ − 1255)/5 ≤ 0.645) = Pr(Z ≤ 0.645) = Φ(0.645) = 0.74054.

[Figure 14. The Sampling Distribution of x̄ Given that H0 is false. The figure marks SE(x̄) = 5, µ = 1255, AU = 1258.2250, β, and 1 − β.]

Figure 14 and the above computation of β(at µ = 1255) show that as the true value of µ changes, so does the value of β, i.e., the Type II error Pr is a function of the parameter under H0 (in this case β is a function of µ). For example, if µ = 1258.2250, then the Type II error Pr is β = 0.50; if µ > 1258.225, then β < 0.50; if µ = 1270, β = 0.009261; if µ = 1280, β = 0.0000066534; and if µ = 1300, then β is practically 0. Note that the farther the true value of µ lies above 1250, i.e., the falser H0 becomes, the smaller is the Pr of committing a Type II error. For example, if µ = 1300, then Pr(−∞ < x̄ < 1258.2250 | µ = 1300) = Pr(Z ≤ −8.355) ≈ 3.3 × 10⁻¹⁷, which is far less than 1 in a billion.

The graph of β as a function of the parameter under H0 is called an operating characteristic (OC) curve. The OC curve for a 2-sided H0: µ = µ0 is symmetrical about µ0 with a maximum ordinate of 1 − α, which occurs at µ = µ0. For one-sided tests, β = 1 − α at µ = µ0, but for right-tailed tests β > 1 − α if µ < µ0, and for left-tailed tests β > 1 − α if µ > µ0.

Exercise 66. (a) For the null hypothesis H0: µ = 1250 of the above Example 38, draw the OC curve by computing and tabulating the values of β at µ = 1245, 1250, 1253, 1258.225, 1260, 1263, and 1265. (b) Repeat part (a) using a LOS α = 0.01.

It must be clear by now that if H0 is 2-sided, then the critical region of the test is divided equally between the left and right tails of the distribution of the test statistic; for a left-tailed test the entire value of α is placed on the distribution's left tail, and vice versa for a right-tailed test.
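The β values quoted above for the right-tailed test of H0: µ = 1250 can be reproduced with a short script. This is only a sketch under the assumptions of the running example (SE(x̄) = 5 and the rounded critical value Z0.05 = 1.645, so that the upper acceptance limit is 1258.225); the function names are illustrative and not part of the notes.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, N(0, 1)


def beta_right_tailed(mu_true, mu0=1250.0, se=5.0, z_alpha=1.645):
    """Pr(accept H0 | mu = mu_true) for H0: mu = mu0 vs H1: mu > mu0.

    H0 is accepted when x-bar falls below the upper acceptance limit
    mu0 + z_alpha * se (1258.225 with the defaults from the notes).
    z_alpha = 1.645 is the rounded table value used in the notes.
    """
    upper_limit = mu0 + z_alpha * se
    return STD_NORMAL.cdf((upper_limit - mu_true) / se)


def oc_curve(mu_values, **kwargs):
    """Tabulate (mu, beta) pairs, i.e. points on the OC curve."""
    return [(mu, beta_right_tailed(mu, **kwargs)) for mu in mu_values]


if __name__ == "__main__":
    for mu, beta in oc_curve([1250, 1255, 1258.225, 1270, 1280, 1300]):
        print(f"mu = {mu:>9}: beta = {beta:.6g}, power = {1 - beta:.6g}")
```

Passing a different `z_alpha` (for instance the table value for α = 0.01) changes the acceptance limit and therefore the whole OC curve, which is the comparison Exercise 66(b) asks for.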
Further, always bear in mind that β gives the Pr over the AI (when H0 is false), while α is the Pr of falling outside the AI when H0 is true, and for almost all statistical tests 1 − β ≥ α. A test for which 1 − β ≥ α is said to be unbiased, and a statistical test for which the value of 1 − β → 1 as n → ∞ is said to be consistent.

Exercise 67. (a) Work Exercises 8-10, 8-12 (pp. 293-4), 8-18, 8-19, 8-20, and 8-25 on pp. 304-5 of the 7th edition of Devore. (b) For Exercise 8-10(d), derive the minimum value of n if α = 0.05 and β = 0.01 at µ = 1350. Compute the critical levels, or P-values, of your tests in all cases.

TEST OF HYPOTHESIS ABOUT THE MEAN OF A NORMAL UNIVERSE WITH UNKNOWN VARIANCE σ²

Since the population variance σ² is unknown, we may obtain an unbiased point estimate of σ² using the sample statistic S². Clearly, (x̄ − µ)/σ_x̄ = (x̄ − µ)√n/σ has a standardized normal distribution, but σ is unknown and has to be estimated by S. However, (x̄ − µ)√n/S is not normally distributed but its …

… 71.42) as depicted in Figure 15 below. The value of our test statistic is χ0² = 50(0.31)²/(0.25)² = 76.88, which is outside the AI = (32.36, 71.42), and hence we have sufficient evidence to reject H0 in favor of H1: σ ≠ 0.25. Since the test is 2-tailed, the critical level of the test is α̂ = P-value = 2×Pr(χ²₅₀ > 76.88) = 2(0.0086273) = 0.017255, which, as expected, is less than the prescribed LOS α = 0.05.

[Figure 15. The Sampling Distribution of (n − 1)S²/σ0² under H0]

Although S is not a linear sum of independent rvs, its SMD very slowly approaches the N(σ, σ²/2n) pdf as n → ∞, as depicted in Figure 16. For practical applications the normal approximation is fairly accurate for n > 60. From Figure 16, α̂ ≅ 2×Pr(S > 0.31) = 2Pr(Z > 2.424) = 2×Φ(−2.424) = 0.01536, as compared to the exact value 0.017255. Observe that even a sample of size n = 51 is not sufficiently large for an adequate normal approximation.

We now use the above example to illustrate the computation of β when H0 is false.
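The χ² computations for this example can be checked without statistical tables. The sketch below is an illustration rather than the method used in the notes (which rely on Excel): it uses the exact identity Pr(χ² with 2k df > x) = Σ e^(−x/2)(x/2)^j/j! for j = 0, …, k − 1, which holds only for even df and so applies here because df = 50. The function names are hypothetical, and the acceptance limits 32.36 and 71.42 are simply the values quoted above.

```python
import math
from statistics import NormalDist


def chi2_sf_even_df(x, df):
    """Exact Pr(chi-square with `df` degrees of freedom > x), even df only.

    Uses the Poisson-sum identity for even df:
    Pr(chi2_{2k} > x) = sum_{j=0}^{k-1} exp(-x/2) * (x/2)**j / j!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = math.exp(-half)          # j = 0 term
    total = term
    for j in range(1, df // 2):
        term *= half / j            # builds (x/2)**j / j! incrementally
        total += term
    return total


def two_sided_sigma_test(n, s, sigma0):
    """Return (chi2_0, exact P-value, normal-approximation P-value)."""
    df = n - 1
    chi2_0 = df * s**2 / sigma0**2
    upper = chi2_sf_even_df(chi2_0, df)
    p_exact = 2.0 * min(upper, 1.0 - upper)        # double the smaller tail
    # Large-sample approximation S ~ N(sigma0, sigma0^2 / (2n)) under H0
    z0 = (s - sigma0) / (sigma0 / math.sqrt(2 * n))
    p_normal = 2.0 * NormalDist().cdf(-abs(z0))
    return chi2_0, p_exact, p_normal


def beta_at_sigma(sigma_true, n=51, sigma0=0.25, ai=(32.36, 71.42)):
    """Pr(test statistic falls inside the AI | true sigma = sigma_true)."""
    df = n - 1
    scale = sigma0**2 / sigma_true**2   # rescales the AI to a chi2_df variable
    low, high = ai[0] * scale, ai[1] * scale
    return chi2_sf_even_df(low, df) - chi2_sf_even_df(high, df)


if __name__ == "__main__":
    chi2_0, p_exact, p_normal = two_sided_sigma_test(n=51, s=0.31, sigma0=0.25)
    print(f"chi2_0 = {chi2_0:.2f}, exact P = {p_exact:.6f}, "
          f"normal approx P = {p_normal:.5f}")
```

`beta_at_sigma(sigma_true)` evaluates the Type II error probability for any assumed true σ. For odd df this closed form does not apply, and a library routine for the χ² cdf would be needed instead.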
To this end, assume that the true value of σ = 0.30. Then

β(at σ = 0.30) = Pr(χ0² is inside the AI | H0 is false)
= Pr(32.36 ≤ (n − 1)S²/σ0² ≤ 71.42 | σ = 0.30)
= Pr(32.36 σ0²/σ² ≤ (n − 1)S²/σ² ≤ 71.42 σ0²/σ² | σ = 0.30)
= Pr(32.36 × 0.69444 ≤ χ²₅₀ ≤ 71.42 × 0.69444)
= Pr(22.47222 ≤ χ²₅₀ ≤ 49.59722)
= (the cdf of χ²₅₀ at 49.59722) − (the cdf of χ²₅₀ at 22.47222)
= 0.510526 − 0.000271 = 0.510255.

The above exact Pr of Type II error could also be approximated from the OC curves if Devore had provided them. The abscissa of such OC curves would be λ = σ/σ0 = 0.30/0.25 = 1.20. I used chart (i) on page A-16 of Montgomery and Runger (1994, Applied Statistics and Probability for Engineers) to obtain the approximate ordinate value of β ≅ 0.50. As far as I have been able to ascertain, Minitab does not provide power values for a χ² test either. Therefore, Excel has to be used for all χ² test computations.

[Figure 16. The Approximate SMD of S. The figure marks the center 0.25, SE = 0.25/√102 = 0.02475, the observed value 0.31, and the tail area α̂/2.]

Exercise 69. (a) For the above example, use MS Excel to compute the Pr of a Type II error for testing H0: σ = 0.25 (with n = 51) if the true value of σ were equal to 0.33. (b) Obtain a 95% CI for σ, compare your CI with the test result of Example 40, and draw conclusions. (c) For Example 40, derive the necessary sample size n such that the Pr of Type II error at λ = σ/σ0 = 0.30/0.25 = 1.20 is reduced from 0.5103 to β = 0.20.

TEST OF HYPOTHESIS ABOUT A POPULATION PROPORTION

To illustrate the procedure for testing H0: p = p0 vs one of the 3 alternatives H1: p < p0, H1: p ≠ p0, or H1: p > p0, we consider Devore's Example 8.11 on pp. 335-6 of his 5th edition, which gives the results of a survey of 102 doctors, only 47 of whom knew the generic name for the drug Methadone.
The objective was to determine if this sample of size n = 102 provided sufficient evidence to conclude, at the 5% LOS, that less than half of all doctors know the generic name for Methadone. Before proceeding, however, you will need to review your STAT 3600 notes, pp. 113−115. As stated at the beginning of this chapter, the "? statement" must be placed under H1. In this example, n = 102 doctors were randomly surveyed, only 47 of whom knew the generic name for Methadone, i.e., there were 47 successes in n = 102 Bernoulli trials. Therefore, an unbiased point estimate of the population proportion (of all doctors), p, is p̂ = 47/102 = 0.46078. The question is: does this sample provide sufficient evidence at the LOS α = 0.05 to warrant the rejection of H0: p = 0.50 in favor of H1: p < 0.50?

The approximate large-sample SMD of p̂ is depicted in Figure 17. Under H0 the standard error is SE(p̂) = √(p0(1 − p0)/n) = √(0.25/102) = 0.04951, and Figure 17 shows that AL = 0.50 − 1.645×0.04951 = 0.4186, so that the AI = (0.4186, 1). Since the sample test statistic p̂ = 0.461 is well inside this AI, the null hypothesis cannot be rejected at the 5% LOS. The critical level of this test is given by P-value ≅ Pr(p̂ ≤ 0.46078) = Pr(Z ≤ −0.79221) = 0.21412,
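The large-sample proportion test in this example can be sketched in a few lines. The code assumes the normal approximation to the SMD of p̂ under H0, as the notes do; the function name and the returned dictionary keys are illustrative only, and the critical value is computed from the normal inverse cdf rather than the rounded table value 1.645, so the last digits may differ slightly from those quoted above.

```python
import math
from statistics import NormalDist


def left_tailed_proportion_test(successes, n, p0, alpha=0.05):
    """Large-sample test of H0: p = p0 vs H1: p < p0 (normal approximation)."""
    std_normal = NormalDist()
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1.0 - p0) / n)       # SE of p-hat computed under H0
    z0 = (p_hat - p0) / se0                    # standardized test statistic
    lower_limit = p0 + std_normal.inv_cdf(alpha) * se0   # AL; inv_cdf(alpha) < 0
    p_value = std_normal.cdf(z0)               # left-tail area below z0
    return {
        "p_hat": p_hat,
        "se0": se0,
        "z0": z0,
        "lower_limit": lower_limit,
        "p_value": p_value,
        "reject_h0": p_hat < lower_limit,
    }


if __name__ == "__main__":
    result = left_tailed_proportion_test(successes=47, n=102, p0=0.50)
    for name, value in result.items():
        print(f"{name}: {value}")
```

For the survey data (47 successes in 102 trials, p0 = 0.50) the reported `p_hat` lies above `lower_limit`, so `reject_h0` is false, in agreement with the decision reached in the text.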