Sampling Distribution of OLS Estimators for Hypothesis Testing in Linear Regression

These notes characterize the sampling distribution of OLS estimators in linear regression beyond its mean and variance, which allows us to test hypotheses on the population parameters. They cover testing a single population parameter, a linear combination of parameters, and multiple linear restrictions using t-statistics and F-statistics, and explain how to perform these tests in Stata.

Chapter 4: Multiple regression analysis: Inference
Wooldridge, Introductory Econometrics, 2d ed.

We have discussed the conditions under which OLS estimators are unbiased, and derived the variances of these estimators under the Gauss-Markov assumptions. The Gauss-Markov theorem establishes that OLS estimators have the smallest variance of any linear unbiased estimators of the population parameters. We must now more fully characterize the sampling distribution of the OLS estimators, beyond its mean and variance, so that we may test hypotheses on the population parameters. To make the sampling distribution tractable, we add an assumption on the distribution of the errors:

Proposition 1 (MLR6: Normality). The population error $u$ is independent of the explanatory variables $x_1, \ldots, x_k$ and is normally distributed with zero mean and constant variance: $u \sim \mathrm{Normal}(0, \sigma^2)$.

This is a much stronger assumption than we have previously made on the distribution of the errors. The assumption of normality, as we have stated it, subsumes both the assumption of the error process being independent of the explanatory variables, and that of homoskedasticity. For cross-sectional regression analysis, these six assumptions define the classical linear model. The rationale for normally distributed errors is often phrased in terms of the many factors influencing $y$ being additive, appealing to the Central Limit Theorem to suggest that the sum of a large number of random factors will be normally distributed. Although we might have reason in a particular context to doubt this rationale, we usually use it as a working hypothesis. Various transformations, such as taking the logarithm of the dependent variable, are often motivated in terms of their inducing normality in the resulting errors.

What is the importance of assuming normality for the error process? Under the assumptions of the classical linear model, normally distributed errors give rise to normally distributed OLS estimators:

$$b_j \sim \mathrm{Normal}(\beta_j, \mathrm{Var}(b_j)) \qquad (1)$$

which will then imply that:

$$\frac{b_j - \beta_j}{sd(b_j)} \sim N(0,1) \qquad (2)$$

This follows since each of the $b_j$ can be written as a linear combination of the errors in the sample. Since we assume that the errors are independent, identically distributed normal random variates, any linear combination of those errors is also normally distributed. We may also show that any linear combination of the $b_j$ is also normally distributed, and that a subset of these estimators has a joint normal distribution. These properties will come in handy in formulating tests on the coefficient vector. We may also show that the OLS estimators will be approximately normally distributed (at least in large samples), even if the underlying errors are not normally distributed.

Testing an hypothesis on a single $\beta_j$

To test hypotheses about a single population parameter, we start with the model containing $k$ regressors:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u \qquad (3)$$

[...] hypothesis that can be rejected at the 8% level of confidence, thus quite irrelevant, since we would expect to find a value that large 92% of the time under the null hypothesis. On the other hand, a p-value of 0.08 will reject at the 90% level, but not at the 95% level; only 8% of the time would we expect to find a t-statistic of that magnitude if $H_0$ were true.
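The reported t-statistics and p-values can be reproduced by hand from Stata's saved results. The short sketch below is an added illustration, not part of the original notes; it uses Stata's bundled auto dataset rather than the examples discussed here, and relies only on the standard _b[], _se[], e(df_r) results and the ttail() function:

    * sketch: reproduce the reported t-statistic and p-value for one slope coefficient
    sysuse auto, clear
    regress price weight mpg
    * t-statistic for H0: the coefficient on weight equals zero
    display _b[weight]/_se[weight]
    * two-tailed p-value from the t distribution with e(df_r) degrees of freedom
    display 2*ttail(e(df_r), abs(_b[weight]/_se[weight]))
    * the mass in a single tail (relevant for the one-sided alternatives discussed below)
    display ttail(e(df_r), abs(_b[weight]/_se[weight]))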
What if we have a one-sided alternative? For instance, we may phrase the hypothesis of interest as:

$$H_0: \beta_j \geq 0, \qquad H_A: \beta_j < 0 \qquad (6)$$

Here, we must use the appropriate critical point on the t-distribution to perform this test at the same level of confidence. If the point estimate is positive, then we do not have cause to reject the null. If it is negative, we may have cause to reject the null if it is a sufficiently large negative value. The critical point should be that which isolates 5% of the mass of the distribution in that tail (for a 95% level of confidence). This critical value will be smaller (in absolute value) than that corresponding to a two-tailed test, which isolates only 2.5% of the mass in that tail. The computer program always provides you with a p-value for a two-tailed test; if the p-value is 0.08, for instance, it corresponds to a one-tailed p-value of 0.04 (that being the mass in that tail).

Testing other hypotheses about $\beta_j$

Every regression output includes the information needed to test the two-tailed or one-tailed hypotheses that a population parameter equals zero. What if we want to test a different hypothesis about the value of that parameter? For instance, we would not consider it sensible for the mpc (marginal propensity to consume) for a consumer to be zero, but we might have an hypothesized value (of, say, 0.8) implied by a particular theory of consumption. How might we test this hypothesis? If the null is stated as:

$$H_0: \beta_j = a_j \qquad (7)$$

where $a_j$ is the hypothesized value, then the appropriate test statistic becomes:

$$t = \frac{b_j - a_j}{s_{b_j}} \sim t_{n-k-1} \qquad (8)$$

and we may simply calculate that quantity and compare it to the appropriate point on the t-distribution. Most computer programs provide you with assistance in this effort; for instance, if we believed that the coefficient on bdrms should be equal to $20,000 in a regression of house prices on square footage and bdrms (e.g. using HPRICE1), we would use Stata's test command:

    regress price bdrms sqrft
    test bdrms=20000

where we use the name of the variable as a shorthand for the name of the coefficient on that variable. Stata, in that instance, presents us with:

    ( 1)  bdrms = 20000.0

          F(  1,    85) =    0.26
               Prob > F =    0.6139

making use of an F-statistic, rather than a t-statistic, to perform this test. In this particular case, of an hypothesis involving a single regression coefficient, we may show that this F-statistic is merely the square of the associated t-statistic. The p-value would be the same in either case. The estimated coefficient is 15198.19, with an estimated standard error of 9483.517. Plugging these values into (8) yields the t-statistic:

    . di (_b[bdrms]-20000)/_se[bdrms]
    -.50633208

which, squared, is the F-statistic shown by the test command. Just as with tests against a null hypothesis of zero, the results of the test command may be used for one-tailed tests as well as two-tailed tests; then, the sign and magnitude of the coefficient matter (i.e. the fact that the estimated coefficient is about $15,000 means we would never reject a null that it is less than $20,000), and the p-value must be adjusted for one tail. Any number of test commands may be given after a regress command in Stata, testing different hypotheses about the coefficients.

Confidence intervals

As we discussed in going over Appendix C, we may use the point estimate and its estimated standard error to calculate an hypothesis test on the underlying population parameter, or we may form a confidence interval for that parameter. Stata makes that easy in a regression context by providing the 95% confidence interval for every estimated coefficient.
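To make the F-versus-squared-t relationship concrete, here is a minimal sketch along the same lines. It is an added illustration using the bundled auto data instead of HPRICE1, and the hypothesized value of 4 is arbitrary; the last line also shows the level() option for reporting confidence intervals at a level other than 95%:

    * sketch: test a non-zero hypothesized value and verify that F = t^2
    sysuse auto, clear
    regress price weight mpg
    test weight = 4
    * the same test done by hand: t = (b - a)/s, and the F above is its square
    display (_b[weight] - 4)/_se[weight]
    display ((_b[weight] - 4)/_se[weight])^2
    * confidence intervals at a level other than the default 95%
    regress price weight mpg, level(90)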
If you want to use some other level of significance, you may either use [...]

[...] is 2/3. This implies that the labor coefficient should be twice the capital coefficient, or:

$$H_0: \beta_L = 2\beta_K, \quad \mathrm{or} \quad H_0: \frac{\beta_L}{\beta_K} = 2, \quad \mathrm{or} \quad H_0: \beta_L - 2\beta_K = 0 \qquad (12)$$

Note that the test command does not allow us to test a nonlinear hypothesis on the parameters; but the hypothesis that a ratio of two parameters equals a constant is not really a nonlinear restriction, since it may be rewritten in linear form. In the latter form, we may specify it to Stata's test command as:

    test labor - 2*cap = 0

In fact, Stata will figure out that form if you specify the hypothesis as:

    test labor=2*cap

(rewriting it in the above form), but it is not quite smart enough to handle the ratio form. It is easy to rewrite the ratio form into one of the other forms. Either form will produce an F-statistic and associated p-value for this single linear hypothesis on the parameters, which may be used to make a judgment about the hypothesis of interest.

Testing multiple linear restrictions

When we use the test command, an F-statistic is reported, even when the test involves only one coefficient, because in general hypothesis tests may involve more than one restriction on the population parameters. The hypotheses discussed above, even that of CRTS, involving several coefficients, still only represent one restriction on the parameters. For instance, if CRTS is imposed, the elasticities of the factors of production must sum to one, but they may individually take on any value. But in most applications of multiple linear regression, we concern ourselves with joint tests of restrictions on the parameters.

The simplest joint test is that which every regression reports: the so-called "ANOVA F" test, which has the null hypothesis that each of the slopes is equal to zero. Note that in a multiple regression, specifying that each slope individually equals zero is not the same thing as specifying that their sum equals zero. This "ANOVA" (ANalysis Of VAriance) F-test is of interest since it essentially tests whether the entire regression has any explanatory power. The null hypothesis, in this case, is that the "model" is $y = \beta_0 + u$: that is, none of the explanatory variables assist in explaining the variation in $y$. We cannot test any hypothesis on the $R^2$ of a regression, but we will see that there is an intimate relationship between the $R^2 = SSE/SST$ and the ANOVA F:

$$F = \frac{SSE/k}{SSR/(n-(k+1))} = \frac{R^2/k}{(1-R^2)/(n-(k+1))} \qquad (13)$$

where the ANOVA F, the ratio of mean square explained variation to mean square unexplained variation, is distributed as $F_{k,\,n-(k+1)}$ under the null hypothesis. For a simple regression, this statistic is $F_{1,\,n-2}$, which is identical to $t^2_{n-2}$: that is, the square of the $t$ statistic for the slope coefficient, with precisely the same value as that statistic. In a multiple regression context, we do not often find an insignificant $F$ statistic, since the null hypothesis is a very strong statement: that none of the explanatory variables, taken singly or together, explain any significant fraction of the variation of $y$ about its mean. That can happen, but it is often somewhat unlikely. The ANOVA F tests $k$ exclusion restrictions: that all $k$ slope coefficients are jointly zero.
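The relationship in (13) can be checked directly from the results saved by any regression. The sketch below is an added illustration (using the bundled auto data, not an example from these notes), rebuilding the ANOVA F from the R-squared and the model and residual degrees of freedom:

    * sketch: rebuild the ANOVA F statistic from R-squared
    sysuse auto, clear
    regress price weight mpg
    * e(df_m) is k, e(df_r) is n-(k+1)
    scalar F_r2 = (e(r2)/e(df_m)) / ((1 - e(r2))/e(df_r))
    display F_r2                           // should equal e(F), the F in the regression header
    display e(F)
    display Ftail(e(df_m), e(df_r), F_r2)  // its p-value under the null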
We may also use an F-statistic to test that a subset of the slope coefficients are jointly equal to zero. For instance, consider a regression of 353 major league baseball players' salaries (from MLB1). If we regress lsalary (log of player's salary) on years (number of years in majors), gamesyr (number of games played per year), and several variables indicating the position played (frstbase, scndbase, shrtstop, thrdbase, catcher), we get an $R^2$ of 0.6105, and an ANOVA F (with 7 and 345 d.f.) of 77.24 with a p-value of zero. The overall regression is clearly significant, and the coefficients on years and gamesyr both have the expected positive and significant coefficients. Only one of the five coefficients on the positions played, however, is significantly different from zero at the 5% level: scndbase, with a negative value (-0.034) and a p-value of 0.015. The frstbase and shrtstop coefficients are also negative (but insignificant), while the thrdbase and catcher coefficients are positive and insignificant. Should we just remove all of these variables (except for scndbase)? The F-test for these five exclusion restrictions will provide an answer to that question:

[...] this circumstance, we might then reformulate the model with the restrictions in place, since they do not conflict with the data. In the baseball player salary example, we might drop the four insignificant variables and reestimate the more parsimonious model.

Testing general linear restrictions

The apparatus described above is far more powerful than it might appear. We have considered individual tests involving a linear combination of the parameters (e.g. CRTS) and joint tests involving exclusion restrictions (as in the baseball players' salary example). But the "subset F" test defined in (14) is capable of being applied to any set of linear restrictions on the parameter vector: for instance, that $\beta_1 = 0$, $\beta_2 + \beta_3 + \beta_4 = 1$, and $\beta_5 = -1$. What would this set of restrictions imply about a regression of $y$ on $X_1, X_2, X_3, X_4, X_5$? That regression, in its unrestricted form, would have $k = 5$, with 5 estimated slope coefficients and an intercept. The joint hypotheses expressed above would state that a restricted form of this equation would have three fewer parameters, since $\beta_1$ would be constrained to zero, $\beta_5$ to -1, and one of the coefficients $\{\beta_2, \beta_3, \beta_4\}$ expressed in terms of the other two. In the terminology of (14), $q = 3$.

How would we test the hypothesis? We can readily calculate $SSR_{ur}$, but what about $SSR_r$? One approach would be to algebraically substitute the restrictions in the model, estimate that restricted model, and record its $SSR_r$. This can be done with any computer program that estimates a multiple regression, but it requires that you do the algebra and transform the variables accordingly. (For instance, constraining $\beta_5$ to -1 implies that you should form a new dependent variable, $(y + X_5)$.) Alternatively, if you are using a computer program that can test linear restrictions, you may use its features. Stata will test general linear restrictions of this sort with the test command:

    regress y x1 x2 x3 x4 x5
    test x1
    test x2+x3+x4=1, accum
    test x5=-1, accum

The final test command in this sequence will print an F-statistic for the set of three linear restrictions on the regression: for instance,

    ( 1)  years = 0.0
    ( 2)  frstbase + scndbase + shrtstop = 1.0
    ( 3)  sbases = -1.0

          F(  3,   347) =   38.54
               Prob > F =    0.0000

The "accum" option on the test command indicates that these tests are not to be performed separately, but rather jointly. The final F-test will have three numerator degrees of freedom, because you have specified three linear hypotheses to be jointly applied to the coefficient vector.
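As a usage note (an addition, not part of the original notes): Stata's test command also accepts several restrictions at once, each wrapped in parentheses, which should be equivalent to the accumulated sequence above:

    * sketch: the three accumulated restrictions written as one joint test
    * (assumes the same illustrative regression of y on x1-x5 as above)
    regress y x1 x2 x3 x4 x5
    test (x1) (x2+x3+x4=1) (x5=-1)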
The test command's syntax may be used to construct any set of linear restrictions on the coefficient vector, and to perform the joint test for the validity of those restrictions. The test will reject the null hypothesis (that the restrictions are consistent with the data) if the statistic's value is large relative to the underlying F-distribution.
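Finally, the subset F statistic itself can be built from the restricted and unrestricted sums of squared residuals and compared with what the test command reports. The sketch below is an added illustration using the bundled auto data with q = 2 exclusion restrictions (not the MLB1 example), relying on the e(rss) and e(df_r) results saved by regress:

    * sketch: subset F test for q = 2 exclusion restrictions, done two ways
    sysuse auto, clear
    regress price weight mpg foreign      // unrestricted model
    scalar ssr_ur = e(rss)
    scalar df_ur  = e(df_r)
    test mpg foreign                      // joint exclusion test via the test command
    regress price weight                  // restricted model (mpg, foreign excluded)
    scalar ssr_r = e(rss)
    scalar q = 2
    scalar F = ((ssr_r - ssr_ur)/q) / (ssr_ur/df_ur)
    display F                             // should match the F reported by test above
    display Ftail(q, df_ur, F)            // and its p-value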