STAT 9220 Lecture 7
Statistical Inference–Hypothesis Testing
Greg Rempala
Department of Biostatistics
Medical College of Georgia
Feb 24, 2009

7.1 Hypothesis tests

To test the hypotheses $H_0: P \in \mathcal{P}_0$ versus $H_1: P \in \mathcal{P}_1$, there are two types of statistical errors we may commit: rejecting $H_0$ when $H_0$ is true (called the type I error) and accepting $H_0$ when $H_0$ is wrong (called the type II error).

Let $T$ be a test, that is, a statistic from the sample space of $X$ to $\{0, 1\}$, where $T(X) = 1$ means that $H_0$ is rejected. The probabilities of making the two types of errors are

$$\alpha_T(P) = P(T(X) = 1), \quad P \in \mathcal{P}_0 \quad \text{(type I)}$$

and

$$1 - \alpha_T(P) = P(T(X) = 0), \quad P \in \mathcal{P}_1 \quad \text{(type II)}.$$

An optimal decision rule, one that minimizes both error probabilities at once, does not exist. Therefore we assign a small bound $\alpha$ to $\alpha_T(P)$ and minimize $1 - \alpha_T(P)$ for $P \in \mathcal{P}_1$, subject to $\alpha_T(P) \le \alpha$ for all $P \in \mathcal{P}_0$. The bound $\alpha$ is called the level of significance. When the bound is attained,

$$\alpha = \sup_{P \in \mathcal{P}_0} \alpha_T(P),$$

and this supremum is called the size of the test $T$.

The choice of a level of significance $\alpha$ is usually somewhat subjective. In most applications there is no precise limit to the size of $T$ that can be tolerated. Standard values, such as 0.10, 0.05, or 0.01, are often used for convenience. Often we only require that the size not exceed the significance level:

$$\sup_{P \in \mathcal{P}_0} \alpha_T(P) \le \alpha. \tag{7.1}$$

In general a small $\alpha$ leads to a small rejection region.

Example 7.2.2. Consider the problem in Example 7.1.1. Let us calculate the P-value for $T_{c_\alpha}$. Note that

$$\alpha = 1 - \Phi\left(\frac{\sqrt{n}(c_\alpha - \mu_0)}{\sigma}\right) > 1 - \Phi\left(\frac{\sqrt{n}(\bar{x} - \mu_0)}{\sigma}\right)$$

if and only if $\bar{x} > c_\alpha$ (or $T_{c_\alpha}(x) = 1$). Hence

$$1 - \Phi\left(\frac{\sqrt{n}(\bar{x} - \mu_0)}{\sigma}\right) = \inf\{\alpha \in (0, 1) : T_{c_\alpha}(x) = 1\} = \hat{\alpha}(x)$$

is the P-value for $T_{c_\alpha}$. Thus the decision in the testing problem may be written more concisely as

$$T_{c_\alpha}(x) = I_{(0,\alpha)}(\hat{\alpha}(x)).$$

Because P-values carry this additional information, using them is typically more appropriate than using fixed-level tests in a scientific problem.
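The relation between the fixed-level test and the P-value in Example 7.2.2 can be checked numerically. The sketch below is illustrative only. Since Example 7.1.1 is not reproduced here, it assumes the one-sided test with known $\sigma$ that rejects $H_0$ when $\bar{x} > c_\alpha$, with $c_\alpha = \mu_0 + \sigma z_{1-\alpha}/\sqrt{n}$, and it takes the P-value to be $1 - \Phi(\sqrt{n}(\bar{x} - \mu_0)/\sigma)$. The function names are hypothetical and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; .cdf plays the role of Phi


def critical_value(alpha, mu0, sigma, n):
    """c_alpha solving alpha = 1 - Phi(sqrt(n) * (c_alpha - mu0) / sigma)."""
    return mu0 + sigma * STD_NORMAL.inv_cdf(1.0 - alpha) / sqrt(n)


def fixed_level_test(xbar, alpha, mu0, sigma, n):
    """T_{c_alpha}(x): 1 (reject H0) if xbar > c_alpha, else 0."""
    return int(xbar > critical_value(alpha, mu0, sigma, n))


def p_value(xbar, mu0, sigma, n):
    """alpha_hat(x) = 1 - Phi(sqrt(n) * (xbar - mu0) / sigma)."""
    return 1.0 - STD_NORMAL.cdf(sqrt(n) * (xbar - mu0) / sigma)


if __name__ == "__main__":
    # Assumed illustrative data: n = 25, sigma = 1, mu0 = 0, observed mean 0.5.
    xbar, mu0, sigma, n = 0.5, 0.0, 1.0, 25
    p = p_value(xbar, mu0, sigma, n)
    for alpha in (0.10, 0.05, 0.01, 0.001):
        decision = fixed_level_test(xbar, alpha, mu0, sigma, n)
        # The test decision should agree with the indicator I(p < alpha).
        print(alpha, decision, int(p < alpha))
```

For each level $\alpha$ in the loop, the two printed decisions should coincide, which is the identity $T_{c_\alpha}(x) = I_{(0,\alpha)}(\hat{\alpha}(x))$. The single number `p` lets a reader apply any level afterwards, whereas the fixed-level decision has to be recomputed for each $\alpha$.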
A fixed level of significance is nevertheless unavoidable when acceptance or rejection of $H_0$ implies an imminent concrete decision.

7.3 Randomized tests

In Example 7.1.1, the desired significance level $\alpha$ in (7.1) can always be achieved by a suitable choice of $c$. This is, however, not true in general, and thus we need to consider randomized tests.

Recall that a randomized decision rule is a probability measure $\delta(x, \cdot)$ on the action space for any fixed $x$. Since in the testing problem the action space contains only two points, 0 and 1, any randomized test $\delta(X, A)$ is equivalent to a statistic $T(X) \in [0, 1]$ with $T(x) = \delta(x, \{1\})$ and $1 - T(x) = \delta(x, \{0\})$. A nonrandomized test is a special case in which $T(x)$ does not take any value in $(0, 1)$.

For any randomized test $T(X)$, we define the type I error probability to be $\alpha_T(P) = E[T(X)]$, $P \in \mathcal{P}_0$, and the type II error probability to be $1 - \alpha_T(P) = E[1 - T(X)]$, $P \in \mathcal{P}_1$. For a class of randomized tests, we would like to minimize $1 - \alpha_T(P)$ subject to (7.1).

Example 7.3.1. Assume that the sample $X$ has the binomial distribution $Bi(\theta, n)$ with an unknown $\theta \in (0, 1)$ and a fixed integer $n > 1$. Consider the hypotheses $H_0: \theta \in (0, \theta_0]$ versus $H_1: \theta \in (\theta_0, 1)$, where $\theta_0 \in (0, 1)$ is a fixed value. Consider the following class of randomized tests:

$$T_{j,q}(X) = \begin{cases} 1 & X > j \\ q & X = j \\ 0 & X < j, \end{cases}$$

where $j = 0, 1, \ldots, n - 1$ and $q \in [0, 1]$. Then

$$\alpha_{T_{j,q}}(\theta) = P(X > j) + qP(X = j), \quad 0 < \theta \le \theta_0,$$

and

$$1 - \alpha_{T_{j,q}}(\theta) = P(X < j) + (1 - q)P(X = j), \quad \theta_0 < \theta < 1.$$

It can be shown that for any $\alpha \in (0, 1)$, there exist an integer $j$ and $q \in (0, 1)$ such that the size of $T_{j,q}$ is $\alpha$.

Example 7.4.2. Let $X_1, \ldots, X_n$ be i.i.d. from the $N(\mu, \sigma^2)$ distribution with both $\mu \in \mathbb{R}$ and $\sigma^2 > 0$ unknown. Let $\theta = (\mu, \sigma^2)$ and let $\alpha \in (0, 1)$ be fixed. Let $\bar{X}$ be the sample mean and $S^2$ be the sample variance.
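Since $(\bar{X}, S^2)$ is sufficient for $\theta$, we focus on confidence sets $C(X)$ that are functions of $(\bar{X}, S^2)$. $\bar{X}$ and $S^2$ are independent and $(n - 1)S^2/\sigma^2 \sim \chi^2_{n-1}$. Since $\sqrt{n}(\bar{X} - \mu)/\sigma \sim N(0, 1)$, we can find $\tilde{c}_\alpha$ such that

$$P\left(\left|\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right| \le \tilde{c}_\alpha\right) = \sqrt{1 - \alpha},$$

and, using the $\chi^2_{n-1}$ distribution, constants $c_{1\alpha}$ and $c_{2\alpha}$ such that

$$P\left(c_{1\alpha} \le \frac{(n - 1)S^2}{\sigma^2} \le c_{2\alpha}\right) = \sqrt{1 - \alpha}.$$

Hence, by independence,

$$P\left(-\tilde{c}_\alpha \le \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} \le \tilde{c}_\alpha, \; c_{1\alpha} \le \frac{(n - 1)S^2}{\sigma^2} \le c_{2\alpha}\right) = 1 - \alpha,$$

which can be rewritten as

$$P\left(\frac{n(\bar{X} - \mu)^2}{\tilde{c}_\alpha^2} \le \sigma^2, \; \frac{(n - 1)S^2}{c_{2\alpha}} \le \sigma^2 \le \frac{(n - 1)S^2}{c_{1\alpha}}\right) = 1 - \alpha.$$

The set of all $\theta = (\mu, \sigma^2)$ satisfying the inequalities inside the last probability is therefore a confidence set for $\theta$ with confidence coefficient $1 - \alpha$.

Remark 7.4.2 (Some final remarks). For a general confidence interval $[\underline{\vartheta}(X), \overline{\vartheta}(X)]$, its length is $\overline{\vartheta}(X) - \underline{\vartheta}(X)$, which may be random. We may consider the expected (or average) length $E[\overline{\vartheta}(X) - \underline{\vartheta}(X)]$. The confidence coefficient and the expected length are a pair of good measures of performance of confidence intervals. Like the two types of error probabilities of a test in hypothesis testing, however, we cannot maximize the confidence coefficient and minimize the length (or expected length) simultaneously. A common approach is to minimize the length (or expected length) subject to (7.2). For an unbounded confidence interval, the length is $\infty$.

The idea of confidence pictures has recently become much more popular due to relatively easy access to computationally intensive graphical tools (e.g., density estimators or level plots).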
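The confidence set of Example 7.4.2 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the lecture's own procedure. The lecture only says that $c_{1\alpha}$ and $c_{2\alpha}$ are found from the $\chi^2_{n-1}$ distribution, so the sketch assumes equal tail probabilities. To stay within the Python standard library, it computes $\chi^2$ quantiles with a hand-written series and bisection. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_cdf(x, k):
    """CDF of chi-square with k degrees of freedom, via the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = total = 1.0
    m = 0
    while term > 1e-16 * total:
        m += 1
        term *= z / (a + m)
        total += term
    return math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total


def chi2_quantile(p, k):
    """Quantile of chi-square(k), found by bisection on chi2_cdf."""
    lo, hi = 0.0, float(k)
    while chi2_cdf(hi, k) < p:  # widen the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def constants(n, alpha):
    """Return (c_tilde, c1, c2); each event has probability sqrt(1 - alpha)."""
    g = math.sqrt(1.0 - alpha)
    c_tilde = NormalDist().inv_cdf((1.0 + g) / 2.0)
    # Assumption: equal tail probabilities for the chi-square interval.
    c1 = chi2_quantile((1.0 - g) / 2.0, n - 1)
    c2 = chi2_quantile((1.0 + g) / 2.0, n - 1)
    return c_tilde, c1, c2


def in_confidence_set(mu, sigma2, xbar, s2, n, alpha):
    """True if (mu, sigma2) satisfies both inequalities of Example 7.4.2."""
    c_tilde, c1, c2 = constants(n, alpha)
    mean_ok = n * (xbar - mu) ** 2 / c_tilde ** 2 <= sigma2
    var_ok = (n - 1) * s2 / c2 <= sigma2 <= (n - 1) * s2 / c1
    return mean_ok and var_ok


if __name__ == "__main__":
    # Assumed illustrative summary statistics: n = 20, xbar = 0.1, s2 = 1.2.
    print(in_confidence_set(mu=0.0, sigma2=1.0, xbar=0.1, s2=1.2, n=20, alpha=0.05))
```

In the $(\mu, \sigma^2)$ plane the set is bounded below by a parabola in $\mu$ and cut by two horizontal lines in $\sigma^2$. Evaluating `in_confidence_set` over a grid of $(\mu, \sigma^2)$ values gives a picture of the region. If SciPy is available, its $\chi^2$ quantile function could replace `chi2_quantile`.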