sociology 362
dummy variables 4: categorical and quantitative regressors

Additive vs interactive models and test of overall homogeneity

CPS data on 1985 wages.

edyrs:  years of schooling
hrwage: hourly wage in dollars
fem:    1 = female; 0 = male

Consider first the simple regressions of hourly wage on schooling and sex:

1. regress hrwage edyrs

  Source |       SS       df       MS                  Number of obs =     515
---------+------------------------------               F(  1,   513) =   97.72
   Model |  1980.06338     1  1980.06338               Prob > F      =  0.0000
Residual |  10394.8996   513  20.2629622               R-squared     =  0.1600
---------+------------------------------               Adj R-squared =  0.1584
   Total |   12374.963   514  24.0758035               Root MSE      =  4.5014

------------------------------------------------------------------------------
  hrwage |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   edyrs |    .823475   .0833033      9.885   0.000       .6598174    .9871327
   _cons |  -1.774601   1.116715     -1.589   0.113      -3.968497    .4192961
------------------------------------------------------------------------------

2. regress hrwage fem

  Source |       SS       df       MS                  Number of obs =     515
---------+------------------------------               F(  1,   513) =   32.16
   Model |   730.04049     1   730.04049               Prob > F      =  0.0000
Residual |  11644.9225   513   22.699654               R-squared     =  0.0590
---------+------------------------------               Adj R-squared =  0.0572
   Total |   12374.963   514  24.0758035               Root MSE      =  4.7644

------------------------------------------------------------------------------
  hrwage |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     fem |  -2.388079   .4210996     -5.671   0.000      -3.215371   -1.560787
   _cons |   10.19249    .286266     35.605   0.000       9.630093    10.75489
------------------------------------------------------------------------------

Now I’ll fit an additive multiple regression model. I don’t expect much change in the coefficients because sex and schooling are not highly correlated.

3. regress hrwage edyrs fem

  Source |       SS       df       MS                  Number of obs =     515
---------+------------------------------               F(  2,   512) =   69.02
   Model |  2627.88062     2  1313.94031               Prob > F      =  0.0000
Residual |  9747.08239   512  19.0372703               R-squared     =  0.2124
---------+------------------------------               Adj R-squared =  0.2093
   Total |   12374.963   514  24.0758035               Root MSE      =  4.3632

------------------------------------------------------------------------------
  hrwage |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   edyrs |   .8067068   .0807957      9.985   0.000       .6479749    .9654387
     fem |  -2.251005   .3858803     -5.833   0.000      -3.009109   -1.492902
   _cons |  -.5131201   1.103804     -0.465   0.642      -2.681662    1.655422
------------------------------------------------------------------------------

Now I’ll fit an interactive model to take account of possible sex differences in the return to years of schooling:

create interaction term

4. gen sx_ed=fem*edyrs

fit interactive model

5. regress hrwage edyrs fem sx_ed

  Source |       SS       df       MS                  Number of obs =     515
---------+------------------------------               F(  3,   511) =   46.07
   Model |  2634.62784     3   878.20928               Prob > F      =  0.0000
Residual |  9740.33517   511  19.0613213               R-squared     =  0.2129
---------+------------------------------               Adj R-squared =  0.2083
   Total |   12374.963   514  24.0758035               Root MSE      =  4.3659

------------------------------------------------------------------------------
  hrwage |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   edyrs |     .76382   .1083156      7.052   0.000       .5510214    .9766186
     fem |  -3.526907   2.179008     -1.619   0.106      -7.807823    .7540102
   sx_ed |   .0968346   .1627587      0.595   0.552       -.222924    .4165931
   _cons |     .05602    1.46117      0.038   0.969      -2.814619    2.926659
------------------------------------------------------------------------------

The interaction coefficient is .097, indicating a slight advantage for women in the returns to schooling: about 10 cents more per hour for each additional year. But we cannot reject the hypothesis that the interaction parameter is equal to zero (t = 0.595, p = 0.552). At this point we might conclude that there are intercept, but not slope, differences in the male and female wage equations.

An alternative way to fit the interactive model is to fit the regression of hourly wage on schooling separately for males and females.

The Stata “by” command that I use for the regressions first requires that the data be sorted:

. sort fem
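
Before running the separate regressions, it helps to see what the interactive model in step 5 already implies for each sex. The sketch below is a side calculation in Python, not part of the original Stata session, and all variable names are my own. It plugs fem = 0 and fem = 1 into the step 5 estimates to get the implied male and female wage equations, then compares the incremental F statistic for adding sx_ed to the additive model (step 3) with the squared t statistic on sx_ed. All numbers are copied from the output above.

```python
# Side calculation: assumed variable names, values copied from steps 3 and 5.

# Coefficients from the interactive model (step 5).
b_cons = 0.05602
b_edyrs = 0.76382
b_fem = -3.526907
b_sx_ed = 0.0968346
se_sx_ed = 0.1627587

# Implied wage equations: set fem = 0 for men, fem = 1 for women.
male_intercept = b_cons
male_slope = b_edyrs
female_intercept = b_cons + b_fem
female_slope = b_edyrs + b_sx_ed

# Incremental F test for the single interaction term:
# (drop in residual SS) / (residual mean square of the larger model).
rss_additive = 9747.08239      # step 3, residual SS
rss_interactive = 9740.33517   # step 5, residual SS
df_interactive = 511           # step 5, residual df
f_interaction = (rss_additive - rss_interactive) / (rss_interactive / df_interactive)

# With one added term, F should equal the square of that term's t statistic.
t_interaction = b_sx_ed / se_sx_ed

print(f"men:   hrwage = {male_intercept:.3f} + {male_slope:.3f} * edyrs")
print(f"women: hrwage = {female_intercept:.3f} + {female_slope:.3f} * edyrs")
print(f"incremental F = {f_interaction:.3f}, t squared = {t_interaction ** 2:.3f}")
```

The implied female equation has an intercept of about -3.47 and a slope of about .86, against .06 and .76 for men. Because the interactive model lets both intercept and slope differ by sex, the separate regressions should reproduce these coefficients, although their standard errors will differ because each group gets its own residual variance.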