Statistics - DADE
◼ Topic 6
◼ Hypothesis Testing: Two Populations
◼ Professor Dr. Carlos Alberto Lastras Rodríguez

Chapter Goals
After completing this chapter, you should be able to:
◼ Test hypotheses for the difference between two population means
  ◼ Two means, matched pairs
  ◼ Independent populations, population variances known
  ◼ Independent populations, population variances unknown but equal
◼ Complete a hypothesis test for the difference between two proportions (large samples)
◼ Use the F table to find critical F values
◼ Complete an F test for the equality of two variances

Test Statistic: Dependent Samples
Population means, dependent samples. For tests of the following form:
H0: μx – μy ≥ 0
H0: μx – μy ≤ 0
H0: μx – μy = 0
the test statistic for the mean difference is a t value with n – 1 degrees of freedom:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

where
$\bar{d} = \frac{\sum d_i}{n}$ = sample mean of the differences
$s_d$ = sample standard deviation of the differences
$n$ = the sample size (number of pairs)

Decision Rules: Matched Pairs
Matched or paired samples
Lower-tail test: H0: μx – μy ≥ 0, H1: μx – μy < 0. Reject H0 if $t < -t_{n-1,\alpha}$
Upper-tail test: H0: μx – μy ≤ 0, H1: μx – μy > 0. Reject H0 if $t > t_{n-1,\alpha}$
Two-tail test: H0: μx – μy = 0, H1: μx – μy ≠ 0. Reject H0 if $t < -t_{n-1,\alpha/2}$ or $t > t_{n-1,\alpha/2}$
where $t = \bar{d} / (s_d / \sqrt{n})$ has n – 1 d.f.

Matched Pairs Example
◼ Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints
Salesperson   Before (1)   After (2)   Difference d_i = (2) - (1)
C.B.               6            4           -2
T.F.              20            6          -14
M.H.               3            2           -1
R.K.               0            0            0
M.O.               4            0           -4
Sum                                        -21

$$\bar{d} = \frac{\sum d_i}{n} = \frac{-21}{5} = -4.2$$

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67$$

Difference Between Two Means
Population means, independent samples
◼ σx² and σy² known: the test statistic is a z value
◼ σx² and σy² unknown (assumed equal or assumed unequal): the test statistic is a value from the Student’s t distribution

σx² and σy² Known
Assumptions:
▪ Samples are randomly and independently drawn
▪ Both population distributions are normal
▪ Population variances are known

When σx² and σy² are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is

$$\sigma^2_{\bar{X}-\bar{Y}} = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$

…and the random variable

$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$$

has a standard normal distribution.

Decision Rules
Two Population Means, Independent Samples, Variances Known
Lower-tail test: H0: μx – μy ≥ 0, H1: μx – μy < 0. Reject H0 if $z < -z_{\alpha}$
Upper-tail test: H0: μx – μy ≤ 0, H1: μx – μy > 0. Reject H0 if $z > z_{\alpha}$
Two-tail test: H0: μx – μy = 0, H1: μx – μy ≠ 0. Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

σx² and σy² Unknown, Assumed Equal
Assumptions:
▪ Samples are randomly and independently drawn
▪ Populations are normally distributed
▪ Population variances are unknown but assumed equal
Because the population variances are assumed equal:
▪ Use the two sample standard deviations and pool them to estimate σ
▪ Use a t value with (nx + ny – 2) degrees of freedom

Pooled Variance t Test: Example
You are a financial analyst for a brokerage firm.
Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number             21       25
Sample mean      3.27     2.53
Sample std dev   1.30     1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?

Calculating the Test Statistic
H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)\,1.30^2 + (25 - 1)\,1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$

The test statistic is:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$$

Solution
α = 0.05
df = 21 + 25 − 2 = 44
Critical values: t = ±2.0154 (area .025 in each tail)
Test statistic: t = 2.040
Decision: Reject H0 at α = 0.05, since 2.040 > 2.0154
Conclusion: There is evidence of a difference in means.

Test Statistic, σx² and σy² Unknown, Unequal
The test statistic for H0: μx – μy = 0 is:

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

where t has degrees of freedom:

$$v = \frac{\left( \dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y} \right)^2}{\left( \dfrac{s_x^2}{n_x} \right)^2 / (n_x - 1) + \left( \dfrac{s_y^2}{n_y} \right)^2 / (n_y - 1)}$$

Two Population Proportions
10.3 Tests of the Difference Between Two Population Proportions (Large Samples)
Goal: Test hypotheses for the difference between two population proportions, Px – Py
Assumptions: Both sample sizes are large, nP(1 – P) > 5
◼ The random variable

$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\dfrac{P_x(1 - P_x)}{n_x} + \dfrac{P_y(1 - P_y)}{n_y}}}$$

has a standard normal distribution.

Example: Two Population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
◼ In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes
◼ Test at the .05 level of significance
◼ The hypothesis test is:
  H0: PM – PW = 0 (the two proportions are equal)
  H1: PM – PW ≠ 0 (there is a significant difference between proportions)
◼ The sample proportions are:
  ◼ Men: $\hat{p}_M$ = 36/72 = .50
  ◼ Women: $\hat{p}_W$ = 31/50 = .62
▪ The estimate for the common overall proportion is:

$$\hat{p}_0 = \frac{n_M \hat{p}_M + n_W \hat{p}_W}{n_M + n_W} = \frac{72(36/72) + 50(31/50)}{72 + 50} = \frac{67}{122} = .549$$

The test statistic for PM – PW = 0 is:

$$z = \frac{\hat{p}_M - \hat{p}_W}{\sqrt{\dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_M} + \dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_W}}} = \frac{.50 - .62}{\sqrt{\dfrac{.549(1 - .549)}{72} + \dfrac{.549(1 - .549)}{50}}} = -1.31$$

Critical values = ±1.96 for α = .05 (area .025 in each tail)
Decision: Do not reject H0, since −1.96 < −1.31 < 1.96
Conclusion: There is no significant evidence of a difference between men and women in the proportions who will vote Yes.

Tests for Two Population Variances
The F test statistic for a hypothesis test about two population variances is

$$F = \frac{s_x^2}{s_y^2}$$

where F has (nx – 1) numerator degrees of freedom and (ny – 1) denominator degrees of freedom.

Decision Rules: Two Variances
Use sx² to denote the larger variance.
◼ Upper-tail test: H0: σx² ≤ σy², H1: σx² > σy². Reject H0 if $F > F_{n_x-1,\,n_y-1,\,\alpha}$
◼ The rejection region for a two-tail test is: H0: σx² = σy², H1: σx² ≠ σy². Reject H0 if $F > F_{n_x-1,\,n_y-1,\,\alpha/2}$, where sx² is the larger of the two sample variances

Example: F Test
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE and NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number       21       25
Mean       3.27     2.53
Std dev    1.30     1.16

Is there a difference in the variances between the NYSE and NASDAQ at the α = 0.10 level?
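The slides pose this question without showing the calculation, so the following is only a minimal Python sketch of how the two-tail decision rule above could be applied to the table. The critical value 2.03 is an assumption read from a standard F table for (20, 24) degrees of freedom and upper-tail area α/2 = 0.05; it does not appear in the slides, and the variable names are illustrative.

```python
# Sample sizes and standard deviations from the F test example table.
n_nyse, s_nyse = 21, 1.30
n_nasdaq, s_nasdaq = 25, 1.16

# The decision rule puts the larger sample variance in the numerator (s_x^2).
(var_x, n_x), (var_y, n_y) = sorted(
    [(s_nyse ** 2, n_nyse), (s_nasdaq ** 2, n_nasdaq)], reverse=True
)
f_stat = var_x / var_y
df_num, df_den = n_x - 1, n_y - 1

# ASSUMPTION: critical value taken from a standard F table,
# F(20, 24) with upper-tail area alpha/2 = 0.05, approximately 2.03.
F_CRIT = 2.03
reject_h0 = f_stat > F_CRIT

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) d.f.; reject H0: {reject_h0}")
```

Under that assumed critical value, F is about 1.256 and falls below 2.03, so this sketch would not reject H0 at α = 0.10.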
Some Comments on Hypothesis Testing
◼ A test with low power can result from:
  ◼ Small sample size
  ◼ Large variances in the underlying populations
  ◼ Poor measurement procedures
◼ If sample sizes are large, it is possible to find significant differences that are not practically important
◼ Researchers should select the appropriate level of significance before computing p-values

Two-Sample Tests in EXCEL
For paired samples (t test):
◼ Data | data analysis | t-test: paired two sample for means
For independent samples:
◼ Independent sample z test with variances known:
  ◼ Data | data analysis | z-test: two sample for means
For variances:
◼ F test for two variances:
  ◼ Data | data analysis | F-test: two sample for variances

Chapter Summary
◼ Compared two dependent samples (paired samples)
  ◼ Performed paired sample t test for the mean difference
◼ Compared two independent samples
  ◼ Performed z test for the differences in two means
  ◼ Performed pooled variance t test for the differences in two means
◼ Compared two population proportions
  ◼ Performed z-test for two population proportions
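As a numerical check on the three test statistics summarized above, here is a minimal Python sketch that recomputes them from the figures in this topic's examples. The function names (`paired_t`, `pooled_t`, `two_proportion_z`) are hypothetical helpers introduced only for illustration; they are not part of the slides or of Excel.

```python
import math
import statistics


def paired_t(before, after):
    """t = d_bar / (s_d / sqrt(n)), with differences taken as after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # statistics.stdev uses the n - 1 denominator, matching s_d in the slides.
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def pooled_t(n1, mean1, s1, n2, mean2, s2):
    """Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: P1 - P2 = 0, using the pooled proportion p0."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))


# Matched pairs example: complaints before and after the workshop.
t_paired = paired_t(before=[6, 20, 3, 0, 4], after=[4, 6, 2, 0, 0])

# Pooled variance example: NYSE vs. NASDAQ (the slides report t = 2.040).
t_pooled = pooled_t(21, 3.27, 1.30, 25, 2.53, 1.16)

# Proposition A example: men vs. women (the slides report z = -1.31).
z_prop = two_proportion_z(36, 72, 31, 50)

print(f"paired t = {t_paired:.3f}")
print(f"pooled t = {t_pooled:.3f}")
print(f"two-proportion z = {z_prop:.2f}")
```

Each statistic would still need to be compared with the appropriate critical value from the t or z table, as in the decision rules earlier in the topic.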