Comparing Parameters in Bernoulli & Poisson Distributions: Statistical Hypothesis Testing (Study Notes, Mathematical Statistics)

An in-depth analysis of statistical hypothesis testing, focusing on comparing parameters in Bernoulli and Poisson distributions. It covers the concepts of null and alternative hypotheses, errors of Type I and II, size and significance level, power, and uniformly most powerful tests. The document also includes examples of tests for normal, Bernoulli, and Poisson distributions, as well as exercises to reinforce the concepts learned.


Mathematical Statistics I, Notes 3
Prepared by Professor Jenny Baglivo
© Copyright 2004 by Jenny A. Baglivo. All Rights Reserved.

Contents

8  Hypothesis testing
   8.1  Definitions: hypothesis, simple and compound hypotheses; test
        8.1.1  Neyman-Pearson framework: null and alternative hypotheses
        8.1.2  Test statistic, rejection region (RR), acceptance region (AR)
        8.1.3  Equivalent tests
   8.2  Properties of tests
        8.2.1  Errors of type I and II; size, significance level; power
        8.2.2  Observed significance level, p value
        8.2.3  Power function; uniformly more powerful; UMPT
   8.3  Example: Normal distribution
        8.3.1  Tests of µ = µo
        8.3.2  Tests of σ² = σo²
        8.3.3  Some Mathematica commands
   8.4  Example: Bernoulli/binomial distribution
        8.4.1  Small sample tests of p = po
        8.4.2  Large sample tests of p = po
   8.5  Example: Poisson distribution
        8.5.1  Small sample tests of λ = λo
        8.5.2  Large sample tests of λ = λo
   8.6  Likelihood ratio tests
        8.6.1  Likelihood ratio statistic; Neyman-Pearson lemma
        8.6.2  Generalized likelihood ratio tests
        8.6.3  Large sample theory; approximate tests
        8.6.4  Example: comparing Bernoulli parameters
        8.6.5  Example: comparing Poisson parameters
        8.6.6  Example: multinomial goodness-of-fit

8  Hypothesis testing

8.1  Definitions: hypothesis, simple and compound hypotheses; test

A hypothesis is an assertion about the distribution of a random variable. A simple hypothesis specifies the distribution of X completely. For example:

   H: X is a Bernoulli random variable with parameter p = 0.45.
   H: X is an exponential random variable with parameter λ = 1/5.

A compound hypothesis does not specify the distribution of X completely. For example:

   H: X is a Bernoulli random variable with parameter p ≥ 0.45.
   H: X is an exponential random variable.
   H: X is not an exponential random variable.

A test is a decision rule allowing the user to choose between competing assertions.

8.1.1  Neyman-Pearson framework: null and alternative hypotheses

In the Neyman-Pearson framework of hypothesis testing, there are two competing assertions:

1. The null hypothesis, Ho, and
2. The alternative hypothesis, Ha.
The null hypothesis is accepted as true unless sufficient evidence is provided to the contrary; then the null hypothesis is rejected in favor of the alternative hypothesis.

Example 1. Suppose that the standard treatment for a given medical condition is effective in 45% of patients. A new treatment promises to be effective in more than 45% of patients. In testing the efficacy of the new treatment, the hypotheses could be set up as follows:

   Ho: The new treatment is no more effective than the standard treatment.
   Ha: The new treatment is more effective than the standard treatment.

If p is the proportion of patients for whom the new treatment would be effective, then the hypotheses could be set up as follows:

   Ho: p = 0.45  versus  Ha: p > 0.45.

Example 4. Let X1, X2, ..., X16 be a random sample from a normal distribution with mean µ and standard deviation 10, and let X̄ be the sample mean. Consider the following decision rule for a test of the null hypothesis that µ = 85 versus the alternative hypothesis that µ ≠ 85:

   Reject µ = 85 in favor of µ ≠ 85 when X̄ ≤ 80.3 or X̄ ≥ 89.7.

• If the null hypothesis is true, then
   P(X̄ ≤ 80.3 or X̄ ≥ 89.7 when µ = 85)
      = P(X̄ ≤ 80.3 when µ = 85) + P(X̄ ≥ 89.7 when µ = 85) ≈ 0.03 + 0.03 = 0.06.

• If the actual mean is µ = 78, then
   P(X̄ ≤ 80.3 or X̄ ≥ 89.7 when µ = 78)
      = P(X̄ ≤ 80.3 when µ = 78) + P(X̄ ≥ 89.7 when µ = 78) ≈ 0.821 + 0.000 = 0.821.

[Figure: density of X̄ under the null hypothesis µ = 85 (in gray) and under the alternative when the actual mean is µ = 78 (in black); vertical dashed lines are drawn at x = 80.3 and x = 89.7.]

Two tailed tests. The test above is an example of a two tailed test. In a two tailed test, the null hypothesis is rejected if the test statistic is either in the upper tail or the lower tail of distributions satisfying the null hypothesis. A two tailed test is also called a two sided test.

8.1.3  Equivalent tests

Consider two tests, each based on a random sample of size n:

1. A test based on statistic T with rejection region RR_T, and
2. A test based on statistic W with rejection region RR_W.

The tests are said to be equivalent if

   T ∈ RR_T  ⟺  W ∈ RR_W.

That is, given the same information, either both tests accept the null hypothesis or both reject the null hypothesis. Equivalent tests have the same properties.

For example, if X1, X2, ..., X16 is a random sample from a normal distribution with mean µ and standard deviation 10, X̄ is the sample mean, and

   Z = (X̄ − 85) / (10/√16)

is the standardized mean when µ = 85, then the test with decision rule

   Reject µ = 85 in favor of µ ≠ 85 when X̄ ≤ 80.3 or X̄ ≥ 89.7

is equivalent to the test with decision rule

   Reject µ = 85 in favor of µ ≠ 85 when |Z| ≥ 1.88.
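The probabilities in Example 4 and the equivalence above are easy to check numerically. Here is a minimal sketch in Python with scipy.stats (the tooling is my choice; the notes themselves mention Mathematica commands for this purpose):

```python
from scipy.stats import norm

# X-bar is normal with mean mu and standard error 10/sqrt(16) = 2.5.
se = 10 / 16**0.5

# Size: probability of landing in the rejection region when mu = 85.
size = norm.cdf(80.3, loc=85, scale=se) + norm.sf(89.7, loc=85, scale=se)
print(size)    # about 0.06

# Power at mu = 78.
power = norm.cdf(80.3, loc=78, scale=se) + norm.sf(89.7, loc=78, scale=se)
print(power)   # about 0.821

# Equivalent z-form: |Z| >= z(0.03), and 85 +/- z(0.03)*se recovers the cutoffs.
z = norm.isf(0.03)
print(z, 85 - z*se, 85 + z*se)   # about 1.88, 80.3, 89.7
```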
8.2  Properties of tests

Let X1, X2, ..., Xn be a random sample from a distribution with parameter θ. Let Ω be the set of parameter values under consideration, and assume that the null and alternative hypotheses are set up as follows:

   Ho: θ ∈ ωo  versus  Ha: θ ∈ Ω − ωo

where ωo is a subset of Ω (ωo ⊆ Ω). For example, if X is a Bernoulli random variable with success probability p, and we are interested in testing the null hypothesis p ≤ 0.30 versus the alternative hypothesis p > 0.30, then

   Ω = {p | 0 ≤ p ≤ 1}  and  ωo = {p | 0 ≤ p ≤ 0.30}.

8.2.1  Errors of type I and II; size, significance level; power

When carrying out a test, two types of errors can occur:

                   Accept Ho        Reject Ho
   Ho is true      No error         Type I error
   Ho is false     Type II error    No error

• An error of type I occurs when a true null hypothesis is rejected.
• An error of type II occurs when a false null hypothesis is accepted.

Size, significance level. The size or significance level of the test with decision rule

   Reject θ ∈ ωo in favor of θ ∈ Ω − ωo when T ∈ RR

is defined as follows:

   α = sup_{θ∈ωo} P(T ∈ RR when the true parameter is θ).

A test with size α is called a "100α% test." The size or significance level is the maximum type I error probability (or the least upper bound of the type I error probabilities, if a maximum does not exist).

If the significance level is α and the observed data lead to rejecting the null hypothesis, then the result is said to be statistically significant at level α. If the observed data do not lead to rejecting the null hypothesis, then the result is not statistically significant at level α.

Power. The power of the test with decision rule

   Reject θ ∈ ωo in favor of θ ∈ Ω − ωo when T ∈ RR

at θ ∈ Ω is defined as follows:

   Power at θ = P(T ∈ RR when the true parameter is θ).

If θ ∈ ωo, then the power at θ is the same as the type I error probability at θ. If θ ∈ Ω − ωo, then the power corresponds to the test's ability to correctly reject the null hypothesis in favor of the alternative hypothesis.

Exercise 6. Let X1, X2, X3, X4, X5 be a random sample from a Poisson distribution with parameter λ, and let Y be the sample sum.

(1) Find the rejection region for a 2% test (or as close as possible) of the null hypothesis λ = 2.0 versus the alternative hypothesis λ < 2.0 using Y as test statistic. State the exact size of the test.

Note: For convenience, the values of

   P(Y = y when λ = 2.0) = e^(−10.0) (10.0)^y / y!,   y = 0, 1, ..., 26,

are given in the following table:

   y:   0       1       2       3       4       5       6       7       8
   P:   0.0000  0.0005  0.0023  0.0076  0.0189  0.0378  0.0631  0.0901  0.1126
   y:   9       10      11      12      13      14      15      16      17
   P:   0.1251  0.1251  0.1137  0.0948  0.0729  0.0521  0.0347  0.0217  0.0128
   y:   18      19      20      21      22      23      24      25      26
   P:   0.0071  0.0037  0.0019  0.0009  0.0004  0.0002  0.0001  0.0000  0.0000

(2) For the test developed in step (1), find
• the type II error probability when λ = 1.5 and when λ = 0.75;
• the power when λ = 1.0 and when λ = 0.5.

8.2.2  Observed significance level, p value

The observed significance level or p value is the minimum significance level for which the null hypothesis would be rejected. The p value measures the strength of the evidence against the null hypothesis.

Example 7. Let X1, X2, ..., X12 be a random sample from a Bernoulli distribution with success probability p, and let Y be the sample sum. Assume we are interested in testing p = 0.45 versus p > 0.45. If 8 successes are observed, then the p value is

   P(Y ≥ 8 when p = 0.45) = Σ_{y=8}^{12} C(12, y) (0.45)^y (0.55)^(12−y) ≈ 0.1118.

Example 8. Let X1, X2, X3, X4, X5 be a random sample from a Poisson distribution with parameter λ, and let Y be the sample sum. Assume we are interested in testing λ = 2.0 versus λ < 2.0. If 2 events are observed, then the p value is

   P(Y ≤ 2 when λ = 2.0) = Σ_{y=0}^{2} e^(−10.0) (10.0)^y / y! ≈ 0.0028.
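Both tail probabilities can be reproduced in a couple of lines; a sketch in the same spirit as the check above (scipy assumed):

```python
from scipy.stats import binom, poisson

# Example 7: P(Y >= 8) for Y ~ Binomial(12, 0.45); sf(7) = P(Y > 7) = P(Y >= 8).
print(binom.sf(7, 12, 0.45))   # about 0.1118

# Example 8: P(Y <= 2) for Y ~ Poisson(10), since n*lambda = 5 * 2.0 = 10.
print(poisson.cdf(2, 10))      # about 0.0028
```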
Example 9. Let X1, X2, ..., X16 be a random sample from a normal distribution with mean µ and standard deviation 10, and let X̄ be the sample mean. Assume we are interested in testing µ = 85 versus µ ≠ 85.

• If a sample mean of 79.55 is observed, then the p value is 2 P(X̄ ≤ 79.55 when µ = 85) ≈ 0.0292575.
• If a sample mean of 87.23 is observed, then the p value is 2 P(X̄ ≥ 87.23 when µ = 85) ≈ 0.372393.

For a size α test: If the observed significance level is > α, then the null hypothesis is accepted. If the observed significance level is ≤ α, then the null hypothesis is rejected in favor of the alternative hypothesis.

If the value of σ² is estimated from the data, then the approximate standardization of the sample mean when µ = µo,

   T = (X̄ − µo) / √(S²/n),

can be used as test statistic. The following table gives the rejection regions for one sided and two sided 100α% tests:

   Alternative Hypothesis    Rejection Region
   µ < µo                    T ≤ −t_{n−1}(α)
   µ > µo                    T ≥ t_{n−1}(α)
   µ ≠ µo                    |T| ≥ t_{n−1}(α/2)

where t_{n−1}(p) is the 100(1 − p)% point of the Student t distribution with (n − 1) degrees of freedom. These are examples of t tests. A t test is a test based on a statistic with a Student t distribution under the null hypothesis.

8.3.2  Tests of σ² = σo²

If the value of µ is known, then the sum of squared deviations from µ divided by the hypothesized variance,

   V = Σ_{i=1}^{n} (Xi − µ)² / σo²,

can be used as test statistic. The following table gives the rejection regions for one sided and two sided 100α% tests:

   Alternative Hypothesis    Rejection Region
   σ² < σo²                  V ≤ χ²_n(1 − α)
   σ² > σo²                  V ≥ χ²_n(α)
   σ² ≠ σo²                  V ≤ χ²_n(1 − α/2) or V ≥ χ²_n(α/2)

where χ²_n(p) is the 100(1 − p)% point of the chi-square distribution with n df.

If the value of µ is estimated from the data, then the sum of squared deviations from the sample mean divided by the hypothesized variance,

   V = Σ_{i=1}^{n} (Xi − X̄)² / σo²,

can be used as test statistic. The following table gives the rejection regions for one sided and two sided 100α% tests:

   Alternative Hypothesis    Rejection Region
   σ² < σo²                  V ≤ χ²_{n−1}(1 − α)
   σ² > σo²                  V ≥ χ²_{n−1}(α)
   σ² ≠ σo²                  V ≤ χ²_{n−1}(1 − α/2) or V ≥ χ²_{n−1}(α/2)

where χ²_{n−1}(p) is the 100(1 − p)% point of the chi-square distribution with (n − 1) df. These are examples of chi-square tests. A chi-square test is a test based on a statistic with a chi-square distribution under the null hypothesis.
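A sketch of how these rejection regions translate into code, on hypothetical data (the sample, the hypothesized values, and all names below are illustrative; scipy assumed). Note that scipy indexes quantiles by lower-tail probability, while the notes index them by upper-tail probability, so t_{n−1}(α/2) corresponds to t.ppf(1 − α/2, df=n−1):

```python
import numpy as np
from scipy.stats import t, chi2

rng = np.random.default_rng(0)
x = rng.normal(loc=52, scale=4, size=20)   # hypothetical sample, n = 20
n, alpha = len(x), 0.05

# Two sided t test of mu = mu_o with sigma^2 estimated (section 8.3.1).
mu_o = 50
T = (x.mean() - mu_o) / np.sqrt(x.var(ddof=1) / n)
reject_mu = abs(T) >= t.ppf(1 - alpha/2, df=n-1)   # |T| >= t_{n-1}(alpha/2)

# Two sided chi-square test of sigma^2 = sigma_o^2 with mu estimated (8.3.2);
# (n-1)*S^2/sigma_o^2 equals the sum of squared deviations over sigma_o^2.
var_o = 16
V = (n - 1) * x.var(ddof=1) / var_o
reject_var = (V <= chi2.ppf(alpha/2, df=n-1)) or (V >= chi2.ppf(1 - alpha/2, df=n-1))

print(T, reject_mu, V, reject_var)
```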
Exercise 10. The mean yield of corn in the United States is about 120 bushels per acre. A survey of 40 farmers this year gives a sample mean yield of 123.8 bushels per acre. We want to know whether this evidence is sufficient to say that the national mean has changed. Assume the information above summarizes the values of a random sample from a normal distribution with mean µ and standard deviation 10. Test the null hypothesis µ = 120 versus the two-sided alternative µ ≠ 120 at the 5% significance level. Report the conclusion and the observed significance level (p value).

Exercise 11. (Hand et al., Chapman & Hall, 1994, p. 229) The table below gives information on a before-and-after experiment of a standard treatment for anorexia.

   Before   After   After−Before  |  Before   After   After−Before
   70.5     81.8     11.3         |  72.3     88.2     15.9
   74.0     86.3     12.3         |  75.1     86.7     11.6
   77.3     77.3      0.0         |  77.5     81.2      3.7
   77.6     77.4     −0.2         |  78.1     76.1     −2.0
   78.1     81.4      3.3         |  78.4     84.6      6.2
   79.6     81.4      1.8         |  79.7     73.0     −6.7
   80.6     73.5     −7.1         |  80.7     80.2     −0.5
   81.3     89.6      8.3         |  84.1     79.5     −4.6
   84.4     84.7      0.3         |  85.2     84.2     −1.0
   85.5     88.3      2.8         |  86.0     75.4    −10.6
   87.3     75.1    −12.2         |  88.3     78.1    −10.2
   88.7     79.5     −9.2         |  89.0     78.8    −10.2
   89.4     80.1     −9.3         |  91.8     86.4     −5.4

   After−Before summaries: n = 26, x̄ = −0.45, s² = 63.8194.

Twenty-six young women suffering from anorexia were enrolled in the study. The table gives their weights (in pounds) before treatment began and at the end of the fixed treatment period. We analyze the differences data (after − before). Assume the differences data are the values of a random sample from a normal distribution with mean µ and standard deviation σ. Use these data to test the null hypothesis µ = 0 (the treatment has not affected mean weight) versus the two-sided alternative µ ≠ 0 (the treatment has changed the mean, for better or for worse) at the 5% significance level. Report your conclusion, and comment on your analysis.

8.4.2  Large sample tests of p = po

The standardized sample sum when p = po,

   Z = (Y − n po) / √(n po (1 − po)),

can be used as test statistic. Since, by the central limit theorem, the distribution of Z is approximately standard normal when n is large, rejection regions for approximate one sided and two sided 100α% tests are as follows:

   Alternative Hypothesis    Rejection Region
   p < po                    Z ≤ −z(α)
   p > po                    Z ≥ z(α)
   p ≠ po                    |Z| ≥ z(α/2)

where z(p) is the 100(1 − p)% point of the standard normal distribution.

Exercise 13. (Approximate analysis) In a test of p = 0.45 versus p > 0.45 using 250 independent trials of a Bernoulli experiment, 51.2% (128/250) of the trials ended in success. Use an approximate analysis to determine if the results are significant at the 0.01 level. Clearly state the conclusion and report the observed significance level (p value).

8.5  Example: Poisson distribution

Let Y be the sample sum of a random sample of size n from a Poisson distribution with parameter λ. Y is a Poisson random variable with parameter nλ.

8.5.1  Small sample tests of λ = λo

Rejection regions for one sided and two sided 100α% tests are as follows:

   Alternative Hypothesis    Rejection Region
   λ < λo                    Y ≤ c, where c is chosen so that α = P(Y ≤ c when λ = λo)
   λ > λo                    Y ≥ c, where c is chosen so that α = P(Y ≥ c when λ = λo)
   λ ≠ λo                    Y ≤ c1 or Y ≥ c2, where c1 and c2 are chosen so that
                             α = P(Y ≤ c1 when λ = λo) + P(Y ≥ c2 when λ = λo)
                             (and the two probabilities are approximately equal)

8.5.2  Large sample tests of λ = λo

The standardized sample sum when λ = λo,

   Z = (Y − n λo) / √(n λo),

can be used as test statistic. Since, by the central limit theorem, the distribution of Z is approximately standard normal when n is large, rejection regions for approximate one sided and two sided 100α% tests are as follows:

   Alternative Hypothesis    Rejection Region
   λ < λo                    Z ≤ −z(α)
   λ > λo                    Z ≥ z(α)
   λ ≠ λo                    |Z| ≥ z(α/2)

where z(p) is the 100(1 − p)% point of the standard normal distribution.
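The two large-sample recipes share the same structure: standardize the sample sum under the null hypothesis, then compare against normal quantiles. A small generic helper, applied to hypothetical data (not the values in the exercises that follow; scipy assumed):

```python
from math import sqrt
from scipy.stats import norm

def large_sample_z(y, null_mean, null_sd):
    """Standardized sample sum Z and the two-sided approximate p value."""
    z = (y - null_mean) / null_sd
    return z, 2 * norm.sf(abs(z))

# Bernoulli: hypothetical y = 60 successes in n = 100 trials, testing p = 0.5.
p0, n = 0.5, 100
z, p = large_sample_z(60, n*p0, sqrt(n*p0*(1 - p0)))
print(z, p)   # z = 2.0, p about 0.046: significant at 5%, not at 1%

# Poisson: hypothetical y = 230 events over n = 100 observations, testing lambda = 2.0.
lam0, n = 2.0, 100
z, p = large_sample_z(230, n*lam0, sqrt(n*lam0))
print(z, p)   # z about 2.12, p about 0.034
```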
Exercise 14. (Approximate analysis) You decide to test λ = 2.0 versus λ ≠ 2.0 using 80 independent observations from a Poisson distribution.

a. Design an approximate 5% test of λ = 2.0 versus λ ≠ 2.0.
b. Using the test from part a, find the approximate power when λ = 1.60.
c. The results are now in: a total of 189 events were recorded. Is the null hypothesis accepted or rejected at the 5% level? What is the observed significance level (p value)?

Exercise 17. (Bernoulli distribution, lower tail) Let Y be the sample sum of a random sample of size n from a Bernoulli distribution. Use the Neyman-Pearson lemma (and the example above) to demonstrate that the test with decision rule

   Reject p = po in favor of p < po when Y ≤ k

is a uniformly most powerful size α test of the null hypothesis p = po versus the one-sided alternative p < po, where k is chosen so that P(Y ≤ k when p = po) = α.

Exercise 18. (Exponential distribution, upper tail) Let Y be the sample sum of a random sample of size n from an exponential distribution with parameter 1/θ and PDF

   f(x) = (1/θ) e^(−x/θ) when x > 0, and 0 otherwise.

Use the Neyman-Pearson lemma to demonstrate that the test with decision rule

   Reject θ = θo in favor of θ > θo when Y ≥ k

is a uniformly most powerful size α test of the null hypothesis θ = θo versus the one-sided alternative θ > θo, where k is chosen so that P(Y ≥ k when θ = θo) = α.

8.6.2  Generalized likelihood ratio tests

The methods in this section generalize the approach above to compound hypotheses and to multiple parameter families. Generalized likelihood ratio tests are not guaranteed to be uniformly most powerful. In fact, in many situations (e.g., two tailed tests) uniformly most powerful tests do not exist.

Let X be a random variable whose distribution has parameter θ, where θ is a single parameter or a k-tuple of parameters. Assume that the null and alternative hypotheses can be stated in terms of values of θ as follows:

   Ho: θ ∈ ωo  versus  Ha: θ ∈ Ω − ωo

where Ω represents the full set of parameter values under consideration, and ωo ⊆ Ω. For example, if X is a normal random variable with unknown mean and variance, the null hypothesis is µ = 120, and the alternative hypothesis is µ ≠ 120, then θ = (µ, σ²),

   Ω = {(µ, σ²) | −∞ < µ < ∞, σ² > 0}  and  ωo = {(120, σ²) | σ² > 0}.

Let X1, X2, ..., Xn be a random sample from a distribution with parameter θ, and let Lik(θ) be the likelihood function based on this sample. The generalized likelihood ratio statistic, Λ, is the ratio of the maximum value of the likelihood function for models satisfying the null hypothesis to the maximum value of the likelihood function for all models under consideration:

   Λ = max_{θ∈ωo} Lik(θ) / max_{θ∈Ω} Lik(θ),

and a generalized likelihood ratio test based on this statistic is a test whose decision rule has the following form:

   Reject θ ∈ ωo in favor of θ ∈ Ω − ωo when Λ ≤ c.

Note that the value in the denominator of Λ is the value of the likelihood at the maximum likelihood estimator, and the value in the numerator is less than or equal to the value in the denominator. Thus, Λ ≤ 1. Further, if the null hypothesis is true, then the numerator and denominator values will be close (and Λ will be close to 1); otherwise, the numerator is likely to be much smaller than the denominator (and Λ will be close to 0). Thus, it is reasonable to "reject when Λ is small."

In general, a likelihood ratio test is not implemented as shown above. Instead, an equivalent test (with a simpler statistic and rejection region) is used. We often drop the word "generalized" and call these tests "likelihood ratio tests" (LRTs).

8.6.4  Example: comparing Bernoulli parameters

Let Yi be the sample sum of a random sample of size ni from a Bernoulli distribution with parameter pi, for i = 1, 2, ..., k. Consider testing the null hypothesis that the k Bernoulli parameters are equal versus the alternative that not all parameters are equal. Under the null hypothesis, the combined sample is a random sample from a Bernoulli distribution with parameter p = p1 = p2 = ··· = pk. The parameter sets for this test are

   Ω = {(p1, p2, ..., pk) : 0 ≤ p1, p2, ..., pk ≤ 1}  and  ωo = {(p, p, ..., p) : 0 ≤ p ≤ 1}.

There are k free parameters in Ω and 1 free parameter in ωo. The statistic −2 log(Λ) simplifies to

   −2 log(Λ) = Σ_{i=1}^{k} [ 2 Yi log( Yi / (ni p̂) ) + 2 (ni − Yi) log( (ni − Yi) / (ni (1 − p̂)) ) ],

where p̂ is the estimate of the common parameter under the null hypothesis,

   p̂ = (Y1 + Y2 + ··· + Yk) / (n1 + n2 + ··· + nk),

and log() is the natural logarithm function. If each ni is large, then −2 log(Λ) has an approximate chi-square distribution with (k − 1) df.
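The simplified statistic is straightforward to compute. A sketch (the function name is mine; scipy assumed; the code takes the formula above at face value, so it assumes 0 < Yi < ni for every group to avoid log(0) terms):

```python
import numpy as np
from scipy.stats import chi2

def neg2_log_lambda_bernoulli(y, n):
    """-2 log(Lambda) for testing equality of k Bernoulli parameters,
    given sample sums y[i] and sample sizes n[i]; assumes 0 < y[i] < n[i]."""
    y, n = np.asarray(y, float), np.asarray(n, float)
    p_hat = y.sum() / n.sum()                 # common estimate under Ho
    components = (2*y*np.log(y/(n*p_hat))
                  + 2*(n - y)*np.log((n - y)/(n*(1 - p_hat))))
    stat = components.sum()
    p_value = chi2.sf(stat, df=len(y) - 1)    # approximate, each n[i] large
    return stat, p_value
```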
Example 22. (Hand et al., Chapman & Hall, 1994, p. 237) As part of a study on depression in adolescents (ages 12 through 18), researchers collected information on 465 individuals (both men and women) who were seriously emotionally disturbed (SED) or learning disabled (LD). The following table summarizes one aspect of the study:

   Group            # Severely Depressed   # Individuals   % Severely Depressed
   1. LD, Male            41                    219              18.72%
   2. LD, Female          26                    102              25.49%
   3. SED, Male           13                     95              13.68%
   4. SED, Female         17                     49              34.69%

To determine if the level of severe depression is the same in the four groups, we conduct a test of equality of parameters p1 = p2 = p3 = p4, where pi equals the probability that an adolescent in the i-th group is severely depressed, using the generalized likelihood ratio test and the 5% significance level.

(1) The estimate of the common proportion is p̂ = 97/465 = 0.2086.

(2) The following table lists the components of −2 log(Λ):

   Group            Component of −2 log(Λ)
   1. LD, Male           0.623
   2. LD, Female         1.260
   3. SED, Male          3.273
   4. SED, Female        5.000

The sum of the components is 10.156.

(3) The rejection region for the test is −2 log(Λ) ≥ χ²_3(0.05) = 7.815. Since the observed value of the test statistic is in the rejection region, we reject the hypothesis of equal probabilities in favor of the alternative that not all probabilities are equal.

• The p value for the example above is P(−2 log(Λ) ≥ 10.156) ≈ 0.0173.
• Any comments on the analysis above?
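Using the sketch above, Example 22 reduces to a single call (the data are the sample sums and group sizes from the table):

```python
stat, p = neg2_log_lambda_bernoulli(y=[41, 26, 13, 17], n=[219, 102, 95, 49])
print(stat, p)   # about 10.156 and 0.0173, matching the worked example
```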
8.6.5  Example: comparing Poisson parameters

Let Yi be the sample sum of a random sample of size ni from a Poisson distribution with parameter λi, for i = 1, 2, ..., k. Consider testing the null hypothesis that the k Poisson parameters are equal versus the alternative that not all parameters are equal. Under the null hypothesis, the combined sample is a random sample from a Poisson distribution with parameter λ = λ1 = λ2 = ··· = λk. The parameter sets for this test are

   Ω = {(λ1, λ2, ..., λk) : λ1, λ2, ..., λk ≥ 0}  and  ωo = {(λ, λ, ..., λ) : λ ≥ 0}.

There are k free parameters in Ω and 1 free parameter in ωo. The statistic −2 log(Λ) simplifies to

   −2 log(Λ) = Σ_{i=1}^{k} 2 Yi log( Yi / (ni λ̂) ),

where λ̂ is the estimate of the common parameter under the null hypothesis,

   λ̂ = (Y1 + Y2 + ··· + Yk) / (n1 + n2 + ··· + nk),

and log() is the natural logarithm function. If each mean (E(Yi) = ni λ) is large, then −2 log(Λ) has an approximate chi-square distribution with (k − 1) df.

Example 24. (Plackett, Macmillan, 1974, p. 134) In a study of spatial dispersion of houses in a Japanese village, the number of homes in each of 1200 squares of side 100 meters was recorded. There were a total of 911 homes. The following table gives the numbers of squares with 0, 1, 2, and 3 or more homes:

   Number of homes    0     1     2     3+
   Frequency          584   398   168   50

To determine if a Poisson model is reasonable for these data, we conduct a generalized likelihood ratio test at the 5% significance level.

(1) The estimated parameter of the Poisson model is 911/1200 = 0.7592, and the estimated probabilities for the multinomial model (a grouped Poisson model) are:

   Number of homes         0       1       2       3+
   Estimated probability   0.468   0.355   0.135   0.042

(2) The following table lists the components of −2 log(Λ):

   Number of homes            0     1     2     3+
   Component of −2 log(Λ)     __    __    __    __

The sum of the components is ____.

(3) The rejection region for the test is −2 log(Λ) ≥ χ²_2(0.05) = 5.99. Since ... (please complete)

As a final exercise, we will demonstrate that Pearson's goodness-of-fit statistic

   X² = Σ_{i=1}^{k} (Xi − n pi)² / (n pi)

is a 2nd order Taylor approximation to the −2 log(Λ) statistic for multinomial goodness-of-fit. In large samples, the values of the two statistics are very close. (Thus, Pearson's test is an approximate generalized likelihood ratio test.) Recall that a 2nd order Taylor approximation to f(x) at a is given by

   f(x) ≈ f(a) + f′(a)(x − a) + (f″(a)/2)(x − a)².

Exercise 25. To demonstrate that Pearson's statistic is a 2nd order Taylor approximation to the −2 log(Λ) statistic for multinomial goodness-of-fit:

(1) Let a be a positive constant. Use calculus to demonstrate that

   2x ln(x/a) ≈ 2(x − a) + (x − a)²/a.

(2) Use the result of step (1) to demonstrate that −2 log(Λ) ≈ X².
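The closeness of the two statistics (the point of Exercise 25) is easy to see numerically. A sketch on made-up multinomial data (the counts and cell probabilities below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical: n = 500 observations over 4 categories, hypothesized cell
# probabilities p, observed counts x (which sum to n).
p = np.array([0.4, 0.3, 0.2, 0.1])
x = np.array([215, 140, 101, 44])
n = x.sum()

pearson = ((x - n*p)**2 / (n*p)).sum()     # Pearson's X^2
lrt = (2 * x * np.log(x / (n*p))).sum()    # -2 log(Lambda) for goodness of fit

print(pearson, lrt)   # about 2.52 and 2.54: the two statistics agree closely
```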