Maximum Likelihood Estimation and Hypothesis Testing for Logistic Regression in BIOST 515, Study notes of Biostatistics

A lecture note from BIOST 515 focusing on maximum likelihood estimation and hypothesis testing for logistic regression. It covers maximum likelihood estimation, the calculation of maximum likelihood estimates and the Fisher information, and hypothesis testing via likelihood ratio, Wald, and score tests.

Lecture 13: Estimation and hypothesis testing for logistic regression
BIOST 515, February 19, 2004

Outline

• Review of maximum likelihood estimation
• Maximum likelihood estimation for logistic regression
• Testing in logistic regression

Review of maximum likelihood estimation

For n independent Bernoulli(p) outcomes Y₁, …, Yₙ, the log-likelihood is

    l = log L = ΣᵢYᵢ log p + (n − ΣᵢYᵢ) log(1 − p),

where Σᵢ denotes summation over i = 1, …, n. The first derivative of l with respect to p,

    U(p) = ∂l/∂p = ΣᵢYᵢ/p − (n − ΣᵢYᵢ)/(1 − p),

is referred to as the score function. To calculate the MLE of p, we set the score function U(p) equal to 0 and solve for p. In this case, the MLE is

    p̂ = ΣᵢYᵢ/n.

Information

Another important function that can be derived from the likelihood is the Fisher information about the unknown parameter(s). The information function is the negative of the curvature of l = log L. For the likelihood considered previously, the information is

    I(p) = E[−∂²l/∂p²] = E[ΣᵢYᵢ/p² + (n − ΣᵢYᵢ)/(1 − p)²] = n/[p(1 − p)].

We can estimate the information by substituting the MLE of p into I(p), yielding

    I(p̂) = n/[p̂(1 − p̂)].

Our next interest may be in making inference about the parameter p. We can use the inverse of the information evaluated at the MLE to estimate the variance of p̂ as

    v̂ar(p̂) = I(p̂)⁻¹ = p̂(1 − p̂)/n.

For large n, p̂ is approximately normally distributed with mean p and variance p(1 − p)/n. Therefore, we can construct a 100 × (1 − α)% confidence interval for p as

    p̂ ± Z_{1−α/2} [p̂(1 − p̂)/n]^{1/2}.

Likelihood ratio test

For the binary outcome discussed above, if the hypothesis is H₀: p = p₀ vs. Hₐ: p ≠ p₀, then

    l(H₀) = ΣᵢYᵢ log(p₀) + (n − ΣᵢYᵢ) log(1 − p₀),

    l(MLE) = ΣᵢYᵢ log(p̂) + (n − ΣᵢYᵢ) log(1 − p̂),

and the LRT statistic is

    LR = −2 [ΣᵢYᵢ log(p₀/p̂) + (n − ΣᵢYᵢ) log{(1 − p₀)/(1 − p̂)}],

where LR ∼ χ²₁ under H₀.
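The binomial calculations above (the MLE p̂ = ΣᵢYᵢ/n, the estimated information, the large-sample confidence interval, and the LR statistic) can be sketched numerically. Below is a minimal Python illustration; the lecture itself works in R, and the sample size n = 100 and success count of 30 are invented purely for illustration:

```python
import math

# Assumed data for illustration: n Bernoulli trials, s successes.
n, s = 100, 30

p_hat = s / n                       # MLE: sum(Y_i) / n
info = n / (p_hat * (1 - p_hat))    # estimated Fisher information I(p_hat)
var_hat = 1 / info                  # var(p_hat) = I(p_hat)^{-1} = p_hat(1 - p_hat)/n

# 95% confidence interval: p_hat +/- Z_{0.975} * sqrt(var(p_hat))
z = 1.96
ci = (p_hat - z * math.sqrt(var_hat), p_hat + z * math.sqrt(var_hat))

# Likelihood ratio statistic for H0: p = p0
p0 = 0.5
lr = -2 * (s * math.log(p0 / p_hat) + (n - s) * math.log((1 - p0) / (1 - p_hat)))

print(p_hat, ci, lr)
```

Here lr ≈ 16.5, well above 3.84 (the 0.95 quantile of χ²₁), so H₀: p = 0.5 would be rejected at the 5% level.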
Wald test

The Wald test statistic is a function of the difference between the MLE and the hypothesized value, normalized by an estimate of the standard deviation of the MLE. In our binary outcome example,

    W = (p̂ − p₀)² / [p̂(1 − p̂)/n].

For large n, W ∼ χ² with 1 degree of freedom. In R, you will see √W ∼ N(0, 1) reported.

Score test

If the MLE equaled the hypothesized value p₀, then p₀ would maximize the likelihood and U(p₀) = 0. The score statistic measures how far from zero the score function is when evaluated at the null hypothesis. The test statistic for the binary outcome example is

    S = U(p₀)²/I(p₀),

and S ∼ χ² with 1 degree of freedom.

Maximum likelihood estimation for logistic regression

Writing πᵢ = exp(xᵢ′β)/(1 + exp(xᵢ′β)) for the success probability of observation i, the likelihood for n observations is then

    L = ∏ᵢ πᵢ^{Yᵢ} (1 − πᵢ)^{1−Yᵢ},

and the log-likelihood is

    l = Σᵢ [Yᵢ log πᵢ + (1 − Yᵢ) log(1 − πᵢ)].

The p + 1 score equations for β in the logistic regression model cannot be solved analytically. It is common to use a numerical algorithm, such as the Newton-Raphson algorithm, to obtain the MLEs. The information in this case is a (p + 1) × (p + 1) matrix of the negative expected second partial derivatives of l with respect to the parameters β. The inverse of this information matrix, evaluated at the MLE, is the estimated covariance matrix of β̂.

Testing a single logistic regression coefficient in R

To test a single logistic regression coefficient, we will use the Wald test,

    (β̂ⱼ − βⱼ₀) / ŝe(β̂ⱼ) ∼ N(0, 1),

where ŝe(β̂ⱼ) is the square root of the jth diagonal element of the inverse of the estimated information matrix. This value is given to you in the R output for βⱼ₀ = 0. As in linear regression, this test is conditional on all other coefficients being in the model.

Example

    logit(πᵢ) = β₀ + β₁cad.durᵢ + β₂genderᵢ

                 Estimate  Std. Error  z value  Pr(>|z|)
    (Intercept)   −0.3203      0.0579    −5.53    0.0000
    cad.dur        0.0074      0.0008     9.30    0.0000
    sex           −0.3913      0.1078    −3.63    0.0003

The 95% CI for the odds ratio associated with a one-unit change in cad.dur is

    [exp(0.0074 − 1.96 × 0.0008), exp(0.0074 + 1.96 × 0.0008)] = [e^0.0058, e^0.0090] = [1.006, 1.009].

How can we construct a similar confidence interval for males vs. females?

Testing a single logistic regression coefficient using the LRT

    logit(πᵢ) = β₀ + β₁x₁ᵢ + β₂x₂ᵢ

We want to test H₀: β₂ = 0 vs. Hₐ: β₂ ≠ 0. Our model under the null hypothesis is

    logit(πᵢ) = β₀ + β₁x₁ᵢ.

What is our LRT statistic?

    LR = −2 [l(β̂ | H₀) − l(β̂ | Hₐ)]

To get both l(β̂ | H₀) and l(β̂ | Hₐ), we need to fit two models: the full model and the model under H₀. Then l(β̂ | H₀) is the log-likelihood from the model under H₀, and l(β̂ | Hₐ) is the log-likelihood from the full model. If we are testing just one coefficient, LR ∼ χ²₁.

Testing groups of variables using the LRT

Suppose that instead of testing just one variable, we wanted to test a group of variables. This follows naturally from the likelihood ratio test. Let's look at it by example. Again, suppose our full model is

    logit(πᵢ) = β₀ + β₁cad.durᵢ + β₂genderᵢ,

and we test H₀: β₁ = β₂ = 0 vs. Hₐ: β₁ ≠ 0 or β₂ ≠ 0. The −2 log L from the full model is 3117.9; for the reduced (intercept-only) model, −2 log L = 3230.5. Therefore,

    LR = 3230.5 − 3117.9 = 112.6 > 5.99,

where 5.99 is the 0.95 quantile of χ²₂. Why 2 degrees of freedom?

Analysis of deviance table

We can get this same information from the analysis of deviance table, which we obtain in R by sending a glm object to the anova() function. For the model logit(πᵢ) = β₀ + β₁cad.durᵢ + β₂genderᵢ, the (edited) analysis of deviance table is:

    Terms added sequentially (first to last)

             Df  Deviance  Resid. Df  Resid. Dev
    NULL                        2331      3230.5
    cad.dur   1      99.2       2330      3131.3
    sex       1      13.4       2329      3117.9

You read the analysis of deviance table similarly to the ANOVA table.
• The 1st row is the intercept-only model, and the 5th column ("Resid. Dev") is the −2 log L for the intercept-only model.
• In the jth row, j = 2, …, p + 1, the 5th column, labeled "Resid. Dev", is the −2 log L for the model containing the variables labeling rows 2, …, j.
  – So, in the 2nd row of the table above, the −2 log L for the model logit(πᵢ) = β₀ + β₁cad.durᵢ is 3131.3.
  – In the 3rd row, the −2 log L for the model logit(πᵢ) = β₀ + β₁cad.durᵢ + β₂sexᵢ is 3117.9.
• The 2nd column, labeled "Deviance", lists the LRT statistic for the model in the jth row compared with the reduced model in the (j − 1)th row.
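The full pipeline described above (Newton-Raphson maximization, the inverse information as the covariance matrix of β̂, Wald z values, and the LRT between nested models) can be sketched in a few lines of NumPy. This is a Python sketch on simulated data, not the lecture's R analysis of the cad.dur data, so every number it produces is illustrative only:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson MLE for logistic regression.
    Returns (beta_hat, cov), where cov is the inverse information at the MLE."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-(X @ beta)))           # fitted probabilities pi_i
        score = X.T @ (y - p)                       # score vector U(beta)
        info = X.T @ (X * (p * (1 - p))[:, None])   # (p+1) x (p+1) information matrix
        beta = beta + np.linalg.solve(info, score)  # Newton-Raphson update
    p = 1 / (1 + np.exp(-(X @ beta)))
    cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    return beta, cov

def neg2_loglik(X, y, beta):
    p = 1 / (1 + np.exp(-(X @ beta)))
    return -2 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Simulated data: intercept plus two covariates; the second has no true effect.
rng = np.random.default_rng(0)
n = 500
X_full = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0, 0.0])
y = (rng.random(n) < 1 / (1 + np.exp(-(X_full @ true_beta)))).astype(float)

beta_hat, cov = fit_logistic(X_full, y)
z = beta_hat / np.sqrt(np.diag(cov))   # Wald z values, analogous to R's glm summary

# LRT for H0: beta_2 = 0 -- fit the reduced model without the last column.
beta_red, _ = fit_logistic(X_full[:, :2], y)
LR = neg2_loglik(X_full[:, :2], y, beta_red) - neg2_loglik(X_full, y, beta_hat)
print(beta_hat, z, LR)
```

Comparing LR with the 0.95 quantile of χ²₁ (3.84) reproduces, on simulated data, the single-coefficient LRT described above; dropping both covariates instead would give a 2-degree-of-freedom test like the cad.dur/gender example.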