Introductory Econometrics Problem set 2

Jan Zouhar
Department of Econometrics, University of Economics, Prague, zouharj@vse.cz
Due date: 24 April
Problem 2.1. Use the data in wage2.gdt for both this problem and the remaining problems below.
a ) Estimate the equation
$$\log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{exper} + \beta_2\,\mathit{exper}^2 + \beta_3\,\mathit{educ} + \beta_4\,\mathit{female} + \beta_5\,\mathit{nonwhite} + u \tag{1}$$
and report the results using the usual format.
b ) Based on your results from part a, find the 99% confidence interval for $\beta_5$. Is the (partial) effect of
race statistically significant at the 1% level in your equation?
c ) Use White’s test and the Breusch-Pagan test (Tests → Heteroskedasticity → White’s test / Breusch-Pagan) to check whether Assumption MLR.5 holds. What do you conclude? (Report the value of the
test statistics and the resulting p-values along with your conclusions.) What do the tests tell you about
the results you obtained from the regression?
d ) Using the approximation
$$\%\Delta \mathit{wage} \approx 100\,(\beta_1 + 2\beta_2\,\mathit{exper})\,\Delta \mathit{exper},$$
find the approximate return to the fifth year of experience. What is the approximate return to the
twentieth year of experience?
e ) At what value of exper does additional experience actually begin to lower predicted $\log(\mathit{wage})$? (Or,
what is the turning point in the effect of experience?) How many people have more experience in this
sample? (Hint: Sorting the data using Data → Sort data → exper might help you out with the last
question.)
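The calculations in parts d and e both come from differentiating (1) with respect to exper: the slope is $\beta_1 + 2\beta_2\,\mathit{exper}$, and the turning point is where it equals zero. The following is a minimal Python sketch of that arithmetic, not a solution: the coefficient values and the list of experience values are made-up placeholders rather than estimates from wage2.gdt, and the function names are hypothetical. Which value of exper represents a given "year" of experience is left to the caller.

```python
def approx_return_pct(beta1, beta2, exper, delta_exper=1.0):
    """Approximate % change in wage: 100 * (beta1 + 2*beta2*exper) * delta_exper."""
    return 100.0 * (beta1 + 2.0 * beta2 * exper) * delta_exper


def turning_point(beta1, beta2):
    """Value of exper at which the slope beta1 + 2*beta2*exper equals zero."""
    if beta2 == 0:
        raise ValueError("no turning point when beta2 is zero")
    return -beta1 / (2.0 * beta2)


def count_above(values, threshold):
    """Number of observations strictly greater than the threshold."""
    return sum(1 for v in values if v > threshold)


# Placeholder coefficients (NOT estimates from wage2.gdt).
b1, b2 = 0.04, -0.0007
print(approx_return_pct(b1, b2, exper=4))
print(turning_point(b1, b2))

# Placeholder experience values standing in for the exper column.
print(count_above([1, 10, 29, 35], turning_point(b1, b2)))
```

With the estimated coefficients from (1) substituted for `b1` and `b2`, and the actual exper column passed to `count_above`, the same three calls give the quantities asked for in parts d and e.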
Problem 2.2. Based on (1), you want to predict the salary of a white male with 5 years of work experience and 18 years of education. This prediction is made difficult by the presence of logarithms; read
Wooldridge’s section ‘Predicting y when log(y) is the dependent variable’.
a ) Find the prediction, assuming that u is normally distributed (conditional on all independent variables),
i.e. that assumptions MLR.1 through MLR.6 hold.
b ) Save the residuals from (1) to a new variable uhat, and test for normality (Variable → Normality test;
the null is that uhat is normally distributed). What do you conclude?
c ) Find the prediction once again, this time using Duan’s (1983) smearing estimate, described in the same
section of Wooldridge’s book. (Hint: you will need to create a new variable, calculated as exp(uhat),
and find its mean, e.g. by displaying summary statistics.)
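The two predictions in parts a and c differ only in the factor that scales $\exp(\widehat{\log(\mathit{wage})})$: under normality it is $\exp(\hat\sigma^2/2)$, where $\hat\sigma$ is the standard error of the regression, while Duan's smearing estimate uses the sample mean of $\exp(\hat u_i)$. A minimal Python sketch is given below; the coefficients, $\hat\sigma$ and residuals are made-up placeholders (not results from wage2.gdt), and the function names are hypothetical.

```python
import math


def log_prediction(coefs, exper, educ, female, nonwhite):
    """Fitted log(wage) from equation (1) for given regressor values."""
    b0, b1, b2, b3, b4, b5 = coefs
    return (b0 + b1 * exper + b2 * exper ** 2
            + b3 * educ + b4 * female + b5 * nonwhite)


def predict_normal(log_pred, sigma_hat):
    """Level prediction assuming normal errors: exp(sigma_hat^2 / 2) * exp(log_pred)."""
    return math.exp(sigma_hat ** 2 / 2.0) * math.exp(log_pred)


def predict_smearing(log_pred, residuals):
    """Level prediction using Duan's smearing factor: mean of exp(residuals)."""
    alpha0 = sum(math.exp(u) for u in residuals) / len(residuals)
    return alpha0 * math.exp(log_pred)


# Placeholder inputs (NOT estimates from wage2.gdt).
coefs = (1.0, 0.1, -0.01, 0.05, -0.3, -0.1)
uhat = [-0.2, 0.1, 0.0, 0.3, -0.2]
sigma_hat = 0.4

# White male, 5 years of experience, 18 years of education.
lp = log_prediction(coefs, exper=5, educ=18, female=0, nonwhite=0)
print(predict_normal(lp, sigma_hat))
print(predict_smearing(lp, uhat))
```

The smearing version does not rely on normality of u, which is why the outcome of the test in part b matters when choosing between the two.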
Problem 2.3.
a ) Estimate a modified version of (1) with the level, rather than log, of wage as the dependent variable:
$$\mathit{wage} = \beta_0 + \beta_1\,\mathit{exper} + \beta_2\,\mathit{exper}^2 + \beta_3\,\mathit{educ} + \beta_4\,\mathit{female} + \beta_5\,\mathit{nonwhite} + u \tag{2}$$
b ) Save the residuals ($\hat{u}$) from (2) and find the sample correlation coefficients between $\hat{u}$ and all the
explanatory variables (i.e., 5 correlation coefficients). Explain the results.
c ) Save the fitted values $\widehat{\mathit{wage}}$ from (2) and find the sample correlation coefficient between $\mathit{wage}$ and
$\widehat{\mathit{wage}}$. Is there any relationship between this correlation coefficient and the $R^2$ from the regression
model? (Hint: See Wooldridge, look for the origin of the term ‘R-squared’.)
d ) Based on (1), calculate the predicted wage for all people in the sample (wage2), using Duan’s estimate
as in Problem 2.2. Find the squared correlation between wage and wage2, and use the result to compare
the goodness of fit of (1) and (2). (See Wooldridge, same section as in Problem 2.2, for a comparison
of goodness of fit for models that combine dependent variables in the level and the log form.)
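Parts c and d both rest on the squared sample correlation between observed and predicted wages. A minimal Python sketch of that calculation follows; the data are made-up placeholders and the function names are hypothetical. For part d, the predicted series wage2 is assumed to be the smearing factor times exp of the fitted values from (1), as in Problem 2.2.

```python
import math


def correlation(x, y):
    """Sample (Pearson) correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def squared_correlation(x, y):
    """Squared sample correlation, comparable across level and log models."""
    return correlation(x, y) ** 2


# Placeholder data (NOT taken from wage2.gdt).
wage = [1.0, 2.0, 3.0, 4.0]
wage_hat = [1.0, 3.0, 2.0, 4.0]
print(squared_correlation(wage, wage_hat))
```

Applying `squared_correlation` once to wage and the fitted values from (2), and once to wage and wage2, gives two numbers on the same scale that can be compared directly.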