Chi-Square Tests and Goodness of Fit Tests in Multinomial Distributions - Prof. Jun Shao, Study notes of Mathematical Statistics

A lecture note from the University of Wisconsin–Madison's Stat 710: Mathematical Statistics course, covering chi-square tests and goodness-of-fit tests in multinomial distributions. It discusses the multinomial distribution, the chi-square statistic for testing hypotheses about cell probabilities, and the asymptotic distribution of the chi-square statistic under the null hypothesis.

Stat 710: Mathematical Statistics
Lecture 29: Chi-square tests and goodness of fit tests
Jun Shao, Department of Statistics, University of Wisconsin, Madison, WI 53706, USA
April 8, 2009

Testing in multinomial distributions

Consider $n$ independent trials with $k$ possible outcomes for each trial. Let $p_j > 0$ be the probability that the $j$th outcome occurs in a given trial and $X_j$ be the number of occurrences of the $j$th outcome in $n$ trials. Then $X = (X_1, \ldots, X_k)$ has the multinomial distribution (Example 2.7) with parameter $p = (p_1, \ldots, p_k)$.

Let $\xi_i = (0, \ldots, 0, 1, 0, \ldots, 0)$, where the single nonzero component 1 is located in the $j$th position if the $i$th trial yields the $j$th outcome. Then $\xi_1, \ldots, \xi_n$ are i.i.d. and $X/n = \bar\xi = \sum_{i=1}^n \xi_i / n$.

$X/n$ is an unbiased estimator of $p$ and, by the CLT,
    $$Z_n(p) = \sqrt{n}\left(\tfrac{X}{n} - p\right) = \sqrt{n}(\bar\xi - p) \to_d N_k(0, \Sigma),$$
where $\Sigma = \mathrm{Var}(X/\sqrt{n})$ is a symmetric $k \times k$ matrix whose $i$th diagonal element is $p_i(1-p_i)$ and $(i,j)$th off-diagonal element is $-p_i p_j$.

We first consider the problem of testing
    $$H_0: p = p_0 \quad \text{versus} \quad H_1: p \ne p_0,$$
where $p_0 = (p_{01}, \ldots, p_{0k})$ is a known vector of cell probabilities.

Theorem 6.8

Let $\phi = (\sqrt{p_1}, \ldots, \sqrt{p_k})$, let $D(p)$ denote the $k \times k$ diagonal matrix whose $j$th diagonal element is $1/\sqrt{p_j}$, and let $\Lambda$ be a $k \times k$ projection matrix.
(i) If $\Lambda\phi = a\phi$, then
    $$[Z_n(p)]^\tau D(p)\,\Lambda\,D(p)\,Z_n(p) \to_d \chi^2_r,$$
where $\chi^2_r$ denotes the chi-square distribution with $r = \mathrm{tr}(\Lambda) - a$ degrees of freedom.
(ii) The same result holds if $D(p)$ in (i) is replaced by $D(X/n)$.

Remark

The $\chi^2$-statistic and the modified $\chi^2$-statistic are special cases of the statistics in Theorem 6.8(i) and (ii), respectively, with $\Lambda = I_k$ satisfying $\Lambda\phi = \phi$. (With $\Lambda = I_k$ these are $\chi^2 = \sum_{j=1}^k (X_j - np_{0j})^2/(np_{0j})$ and $\tilde\chi^2 = \sum_{j=1}^k (X_j - np_{0j})^2/X_j$, the generalized versions of which appear at the end of this lecture.)
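As a quick numerical aside (ours, not part of the lecture), the sketch below checks that the quadratic form in Theorem 6.8 with $\Lambda = I_k$ reproduces the classical Pearson $\chi^2$-statistic (using $D(p_0)$) and the modified $\chi^2$-statistic (using $D(X/n)$), and verifies the covariance identity $D(p)\Sigma D(p) = I_k - \phi\phi^\tau$. The null cell probabilities `p0` are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical null cell probabilities (any k-vector summing to 1):
n, p0 = 1000, np.array([0.2, 0.3, 0.5])
k = len(p0)
X = rng.multinomial(n, p0)                 # data generated under H0

Zn = np.sqrt(n) * (X/n - p0)               # Z_n(p0) = sqrt(n)(X/n - p0)
Dp = np.diag(1/np.sqrt(p0))                # D(p0)
Dx = np.diag(1/np.sqrt(X/n))               # D(X/n)
Lam = np.eye(k)                            # Lambda = I_k, so a = 1, r = k - 1

# Theorem 6.8 quadratic forms reproduce the classical statistics:
chi2 = np.sum((X - n*p0)**2 / (n*p0))      # Pearson chi-square
chi2_mod = np.sum((X - n*p0)**2 / X)       # modified chi-square
assert np.isclose(Zn @ Dp @ Lam @ Dp @ Zn, chi2)
assert np.isclose(Zn @ Dx @ Lam @ Dx @ Zn, chi2_mod)

# Covariance of X/sqrt(n): diag p_i(1-p_i), off-diagonal -p_i p_j,
# and the key identity D(p) Sigma D(p) = I_k - phi phi^tau with phi = sqrt(p):
Sigma = np.diag(p0) - np.outer(p0, p0)
phi = np.sqrt(p0)
assert np.allclose(Dp @ Sigma @ Dp, np.eye(k) - np.outer(phi, phi))
```

Both statistics are then compared with the $\chi^2_{k-1}$ distribution, since $r = \mathrm{tr}(I_k) - 1 = k - 1$.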
Proof

The result in (ii) follows from the result in (i) and $X/n \to_p p$. To prove (i), let $D = D(p)$, $Z_n = Z_n(p)$, and $Z \sim N_k(0, I_k)$. From the asymptotic normality of $Z_n$ and Theorem 1.10,
    $$Z_n^\tau D \Lambda D Z_n \to_d Z^\tau A Z \quad \text{with} \quad A = \Sigma^{1/2} D \Lambda D \Sigma^{1/2}.$$
From Exercise 51 in §1.6, the result in (i) follows if we can show that $A^2 = A$ (i.e., $A$ is a projection matrix) and $\mathrm{tr}(A) = \mathrm{tr}(\Lambda) - a$.

Since $\Lambda$ is a projection matrix and $\Lambda\phi = a\phi$, $a$ must be either 0 or 1. Note that $D\Sigma D = I_k - \phi\phi^\tau$, so that $\Lambda(D\Sigma D) = \Lambda - a\phi\phi^\tau$. Then
    $$\begin{aligned}
    A^3 &= \Sigma^{1/2} D \Lambda (D\Sigma D) \Lambda (D\Sigma D) \Lambda D \Sigma^{1/2} \\
        &= \Sigma^{1/2} D (\Lambda - a\phi\phi^\tau)(\Lambda - a\phi\phi^\tau) \Lambda D \Sigma^{1/2} \\
        &= \Sigma^{1/2} D (\Lambda - 2a\phi\phi^\tau + a^2\phi\phi^\tau) \Lambda D \Sigma^{1/2} \\
        &= \Sigma^{1/2} D (\Lambda - a\phi\phi^\tau) \Lambda D \Sigma^{1/2} \\
        &= \Sigma^{1/2} D \Lambda (D\Sigma D) \Lambda D \Sigma^{1/2} = A^2,
    \end{aligned}$$
which implies that the eigenvalues of $A$ must be 0 or 1. Therefore $A^2 = A$. Also,
    $$\mathrm{tr}(A) = \mathrm{tr}[\Lambda(D\Sigma D)] = \mathrm{tr}(\Lambda - a\phi\phi^\tau) = \mathrm{tr}(\Lambda) - a.$$

Goodness of fit tests

If we still try to test $H_0: p = p_0$ with $p_j = P_{F_\theta}(A_j)$, $j = 1, \ldots, k$, the result in Example 6.23 is not applicable since $p$ is unknown under $H_0$. A generalized $\chi^2$-test can be obtained using the following result.

Let $p(\theta) = (p_1(\theta), \ldots, p_k(\theta))$ be a $k$-vector of known functions of $\theta \in \Theta \subset R^s$, where $s < k$. Consider the testing problem
    $$H_0: p = p(\theta) \quad \text{versus} \quad H_1: p \ne p(\theta).$$
Note that $H_0: p = p_0$ is the special case of $H_0: p = p(\theta)$ with $s = 0$. Let $\hat\theta$ be an MLE of $\theta$ under $H_0$.
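As a concrete illustration of computing $\hat\theta$ under $H_0$ (an example of ours, not from the lecture), consider the Hardy–Weinberg model with $k = 3$ cells and $s = 1$: $p(\theta) = (\theta^2,\, 2\theta(1-\theta),\, (1-\theta)^2)$. For this model the MLE under $H_0$ has the closed form $\hat\theta = (2X_1 + X_2)/(2n)$, which the sketch below checks against direct numerical maximization of the multinomial log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

# Hypothetical parametric model (Hardy-Weinberg), s = 1:
def p_theta(theta):
    return np.array([theta**2, 2*theta*(1-theta), (1-theta)**2])

n, theta_true = 1000, 0.4
X = rng.multinomial(n, p_theta(theta_true))     # data generated under H0

# Closed-form MLE under H0 for this model:
theta_hat = (2*X[0] + X[1]) / (2*n)

# Check against numerical maximization of sum_j X_j log p_j(theta):
negloglik = lambda t: -np.sum(X * np.log(p_theta(t)))
res = minimize_scalar(negloglik, bounds=(1e-6, 1 - 1e-6), method='bounded')
assert abs(res.x - theta_hat) < 1e-4
print(theta_hat)
```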
By Theorem 6.5, the LR test that rejects $H_0$ when $-2\log\lambda_n > \chi^2_{k-s-1,\alpha}$ has asymptotic significance level $\alpha$, where $\chi^2_{k-s-1,\alpha}$ is the $(1-\alpha)$th quantile of $\chi^2_{k-s-1}$ and
    $$\lambda_n = \prod_{j=1}^k \frac{[p_j(\hat\theta)]^{X_j}}{(X_j/n)^{X_j}}.$$

Using the fact that $p_j(\hat\theta)/(X_j/n) \to_p 1$ under $H_0$ and $\log(1+x) = x - x^2/2 + o(|x|^2)$ as $|x| \to 0$, we obtain that
    $$\begin{aligned}
    -2\log\lambda_n &= -2\sum_{j=1}^k X_j \log\left(1 + \frac{p_j(\hat\theta)}{X_j/n} - 1\right) \\
    &= -2\sum_{j=1}^k X_j \left(\frac{p_j(\hat\theta)}{X_j/n} - 1\right) + \sum_{j=1}^k X_j \left(\frac{p_j(\hat\theta)}{X_j/n} - 1\right)^2 + o_p(1) \\
    &= \sum_{j=1}^k \frac{[X_j - np_j(\hat\theta)]^2}{X_j} + o_p(1) \\
    &= \sum_{j=1}^k \frac{[X_j - np_j(\hat\theta)]^2}{np_j(\hat\theta)} + o_p(1),
    \end{aligned}$$
where the third equality follows from $\sum_{j=1}^k p_j(\hat\theta) = \sum_{j=1}^k X_j/n = 1$.

Generalized $\chi^2$-statistics

The generalized $\chi^2$-statistics $\chi^2$ and $\tilde\chi^2$ are defined to be the previously defined $\chi^2$-statistics with the $p_{0j}$'s replaced by the $p_j(\hat\theta)$'s.
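As a numerical illustration of this asymptotic equivalence (our example, not from the lecture), take the Hardy–Weinberg model $p(\theta) = (\theta^2,\, 2\theta(1-\theta),\, (1-\theta)^2)$ with $k = 3$ and $s = 1$, whose MLE under $H_0$ is $\hat\theta = (2X_1 + X_2)/(2n)$. The sketch computes $-2\log\lambda_n$ and the generalized $\chi^2$ and $\tilde\chi^2$ statistics, all asymptotically $\chi^2$ with $k - s - 1 = 1$ degree of freedom under $H_0$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hardy-Weinberg cell probabilities (illustrative choice, s = 1):
def p_theta(theta):
    return np.array([theta**2, 2*theta*(1-theta), (1-theta)**2])

n, theta_true = 2000, 0.3
X = rng.multinomial(n, p_theta(theta_true))       # data generated under H0

theta_hat = (2*X[0] + X[1]) / (2*n)               # MLE of theta under H0
p_hat = p_theta(theta_hat)

# LR statistic: -2 log lambda_n = 2 sum_j X_j log((X_j/n) / p_j(theta_hat))
lr = 2 * np.sum(X * np.log((X/n) / p_hat))

# Generalized chi-square statistics: p_{0j} replaced by p_j(theta_hat)
chi2_gen = np.sum((X - n*p_hat)**2 / (n*p_hat))   # generalized chi^2
chi2_mod = np.sum((X - n*p_hat)**2 / X)           # generalized modified chi^2

# Compare all three with the chi-square quantile with k - s - 1 = 1 d.f.
crit = stats.chi2.ppf(0.95, df=1)
print(lr, chi2_gen, chi2_mod, crit)
```

By the derivation above, the three statistics differ only by $o_p(1)$ terms under $H_0$, so for large $n$ they lead to the same test decision against $\chi^2_{1,0.05} \approx 3.84$.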