Economics 637 Midterm Answers
Prof. Bryan Caplan
Spring, 1998
Part 1: True, False, and Explain
(10 points each - 3 for the right answer, and 7 for the explanation)
State whether each of the following nine propositions is true or false. Using 2-3 sentences AND/OR equations, explain your answer.
1. True, False, and Explain: If you regress Y on X, then the correlation between Y and the error terms has to equal 0.
FALSE. The correlation between the error terms and the predicted values of Y is 0; the correlation between the error terms and Y itself is positive whenever the fit is less than perfect. Intuitively, imagine that you regress Y on X, getting coefficient vector b and residuals e. Then if you regress Y on X and e, you will necessarily get a perfect fit, Y=Xb+e*1, so e gets a coefficient of exactly 1.
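A quick numerical check of this answer, as a sketch in plain Python. The five (X, Y) pairs are the ones used for questions 3-5; the helper names `corr` and `ols_simple` are illustrative and not part of the course materials.

```python
import math

def mean(v):
    return sum(v) / len(v)

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def ols_simple(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

X = [3, 4, 2, 1, 0]
Y = [4, 4, 3, 2, 1.5]

a, b = ols_simple(X, Y)
fitted = [a + b * xi for xi in X]
resid = [yi - fi for yi, fi in zip(Y, fitted)]

corr_resid_fitted = corr(resid, fitted)  # zero, up to rounding error
corr_resid_y = corr(resid, Y)            # strictly positive
r2 = corr(Y, fitted) ** 2

print(corr_resid_fitted, corr_resid_y, math.sqrt(1 - r2))
```

With a constant in the regression, the correlation between the residuals and Y works out to the square root of 1-R2, which is why it can only be zero when the fit is perfect.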
2. Suppose you gather data on 1000 first-year students' GPA and average daily alcohol consumption in 1996, and repeat your study in 1997 on a new first-year class of 1000 students.
True, False, and Explain: If the correlation between GPA and alcohol consumption in both years examined separately is negative, then the correlation in both years combined will either be negative or zero.
FALSE. As the homework example showed, and as discussed in class, the two samples combined could easily have a positive correlation. Imagine plotting a graph of the murder rate vs. per-capita police expenditure for both NYC and Hays, Kansas. Individually, they would probably have a negative correlation, but in combination (NYC spends a lot more on police) they would have a positive correlation.
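The same reversal can be reproduced with a handful of made-up numbers. In the sketch below, the two groups are hypothetical (they are not the homework data): within each group the relationship is perfectly negative, but one group sits far above the other on both axes.

```python
import math

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

# Hypothetical group A: low values of both variables, negative slope
x_a, y_a = [0, 1, 2], [2, 1, 0]
# Hypothetical group B: high values of both variables, negative slope
x_b, y_b = [10, 11, 12], [12, 11, 10]

corr_a = corr(x_a, y_a)
corr_b = corr(x_b, y_b)
corr_pooled = corr(x_a + x_b, y_a + y_b)

print(corr_a, corr_b, corr_pooled)
```

The pooled correlation is dominated by the gap between the two group means, which is the same mechanism as in the NYC versus Hays comparison.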
#3, #4, and #5 refer to the following regression:
| Y | X |
|---|---|
| 4 | 3 |
| 4 | 4 |
| 3 | 2 |
| 2 | 1 |
| 1.5 | 0 |

Yi = 1.500 + .700*Xi    R2=.942
(.100) (.245)
3. True, False, and Explain: The estimated OLS coefficient vector is correct.
TRUE. Just calculate $b=(X'X)^{-1}X'Y$; see the attached computer output for specifics.
4. True, False, and Explain: The estimated standard error for the constant is correct.
FALSE. The SEs for the constant and X were switched: the SE for the constant is .245 and the SE for X is .100. Just use $\text{var}(b)=s^2(X'X)^{-1}$ to double-check. See attached computer output for specifics.
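The attached computer output is not reproduced here, but the calculation for questions 3 and 4 can be sketched in plain Python from the five data points in the table. The variable names are illustrative; the 2x2 inverse is written out by hand to avoid any library dependency.

```python
import math

X = [3, 4, 2, 1, 0]
Y = [4, 4, 3, 2, 1.5]
n = len(X)

# Elements of X'X and X'Y for a regression on a constant and X
sx = sum(X)
sxx = sum(x * x for x in X)
sy = sum(Y)
sxy = sum(x * y for x, y in zip(X, Y))

# Inverse of the 2x2 matrix X'X = [[n, sx], [sx, sxx]]
det = n * sxx - sx * sx
inv = [[sxx / det, -sx / det],
       [-sx / det, n / det]]

# b = (X'X)^{-1} X'Y
b0 = inv[0][0] * sy + inv[0][1] * sxy
b1 = inv[1][0] * sy + inv[1][1] * sxy

# s^2 = RSS / (n - K) and var(b) = s^2 (X'X)^{-1}
rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))
s2 = rss / (n - 2)
se_b0 = math.sqrt(s2 * inv[0][0])
se_b1 = math.sqrt(s2 * inv[1][1])

# R^2 = 1 - RSS / TSS
tss = sum((y - sy / n) ** 2 for y in Y)
r2 = 1 - rss / tss

print(round(b0, 3), round(b1, 3), round(se_b0, 3), round(se_b1, 3), round(r2, 3))
```

The coefficients and R2 match the reported regression, while the two standard errors come out in the opposite order from the one printed under the equation.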
5. Suppose you regress Y on a constant, X, and one additional variable. The R2 rises to .982.
True, False, and Explain: You can reject (at the 1% level) the hypothesis that the added variable is insignificant.
FALSE. Use the change-in-R2 test:
$$F = \frac{(R^2_U - R^2_R)/J}{(1 - R^2_U)/(N-K)}$$
and note that the second regression with three variables is the "vanilla" (unrestricted) one and the first one with two variables is the restricted one. Thus:
$$F = \frac{(.982 - .942)/1}{(1 - .982)/(5-3)} = 4.44$$
Since the 1% critical value for F(1,2) is 98.49, you cannot reject the hypothesis that the added variable is insignificant (i.e., you can accept the hypothesis that the added variable is insignificant).
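The arithmetic of the change-in-R2 test can be sketched as a small Python function. The function name `delta_r2_f` and its parameter names are illustrative; the critical value is simply the one quoted in the answer, not computed.

```python
def delta_r2_f(r2_unrestricted, r2_restricted, n_obs, k_unrestricted, n_restrictions):
    """F statistic for the change in R^2 between nested regressions."""
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1.0 - r2_unrestricted) / (n_obs - k_unrestricted)
    return numerator / denominator

# 5 observations; the unrestricted model has 3 coefficients; 1 restriction
f_stat = delta_r2_f(0.982, 0.942, n_obs=5, k_unrestricted=3, n_restrictions=1)

critical_1pct = 98.49  # 1% critical value for F(1, 2), as quoted in the answer
reject = f_stat > critical_1pct

print(round(f_stat, 2), reject)
```

With only two degrees of freedom left in the denominator, even a sizable jump in R2 produces an F statistic far below the critical value.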
6. True, False, and Explain: If you run the regression
$\text{Wage}_i = a + b_1\,\text{Experience}_i + b_2\,(\text{Experience}_i)^2$, you would expect b1 to be positive and b2 to be negative.
TRUE. You would expect wages to increase with experience at a decreasing rate. If b1 is positive and b2 is negative, then the regression equation will exhibit this property.
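To see why these signs produce that shape, differentiate the fitted equation with respect to experience (a sketch that treats Experience as continuous):

$$\frac{\partial\,\text{Wage}}{\partial\,\text{Experience}} = b_1 + 2 b_2\,\text{Experience}, \qquad \frac{\partial^2\,\text{Wage}}{\partial\,\text{Experience}^2} = 2 b_2$$

With b1 positive and b2 negative, the slope starts out positive and falls as experience accumulates; the fitted wage peaks at Experience = -b1/(2*b2).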
7. Suppose you regress 1000 people's incomes on their experience, education, number of kids, and WOMAN (which =1 if the person is female, and 0 otherwise). Your results (in dollars) and SEs are as follows:
Income = 300 + 1200 *experience + 600 *education - 2000 *kids + 200 *WOMAN
(200) (100) (100) (300) (300)
True, False, and Explain: These results show that on average, the income of women is not different from that of men in a statistically significant way.
FALSE. These results show that, controlling for other factors, the incomes of women and men do not differ in a statistically significant way (the t-statistic on WOMAN is 200/300 = 0.67). The above equation is nonetheless consistent with enormous income inequality between men and women, for example if women on average have less experience or education. The answer would have been true if the equation were e.g.:
Income = 300 + 200 *WOMAN
(200) (300)
8. Suppose that you have "double vision," but perfectly accurate hearing. You gather data on 20 orchestras' decibel levels and number of performers, recording the correct decibel level for each orchestra, but twice as many performers as each orchestra really has. You then estimate:
Decibels=b1+ b2*Performers
True, False, and Explain: Your measurement error changes b2 but not b1.
TRUE. As per the rules of linear transformations, the coefficients change to keep the predictions the same. The recorded number of performers is twice the true number, so b1 stays the same, but b2 is cut in half.
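A small numerical sketch shows how the intercept and slope respond when every value of the regressor is doubled. The orchestra sizes and decibel readings below are hypothetical, and `ols_simple` is an illustrative helper.

```python
def ols_simple(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: true orchestra sizes and correctly measured decibels
performers_true = [40, 60, 80, 100]
decibels = [70.0, 75.0, 79.0, 86.0]

# "Double vision": every orchestra is recorded as twice its true size
performers_seen = [2 * p for p in performers_true]

b1_true, b2_true = ols_simple(performers_true, decibels)
b1_seen, b2_seen = ols_simple(performers_seen, decibels)

print(b1_true, b2_true)
print(b1_seen, b2_seen)
```

The intercept is identical in both regressions, and the slope on the doubled regressor is exactly half the slope on the true one, so the fitted decibel levels are unchanged.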
9. True, False, and Explain: If labor supply were perfectly stable (i.e., quantity supplied has no random element), then regressing Q on P could give you a consistent estimate of the price sensitivity of demand.
FALSE. This regression would however give you a consistent estimate of the price sensitivity of supply. Recall the formula for the plim of the "mongrel regression":
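$$\text{plim}\; b = \frac{\alpha\,\sigma_v^2 + \beta\,\sigma_u^2}{\sigma_u^2 + \sigma_v^2}$$
where $\alpha$ and $\beta$ are the price coefficients in the demand and supply equations, and $\sigma_u^2$ and $\sigma_v^2$ are the variances of the (uncorrelated) demand and supply disturbances. If supply is perfectly stable, $\sigma_v^2=0$ and the plim is $\beta$, the supply coefficient.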
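The claim can also be checked by simulation. The sketch below assumes linear demand Q = alpha*P + u and linear supply Q = beta*P + v with independent normal shocks; the slopes, shock sizes, sample size, and seed are all hypothetical choices made for illustration.

```python
import random

def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def mongrel_slope(demand_slope, supply_slope, sd_demand, sd_supply, n=5000, seed=0):
    """Simulate market equilibria and regress Q on P."""
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        u = rng.gauss(0.0, sd_demand)  # demand shock
        v = rng.gauss(0.0, sd_supply)  # supply shock (always 0 if sd_supply == 0)
        # Equilibrium: demand_slope * P + u = supply_slope * P + v
        p = (v - u) / (demand_slope - supply_slope)
        q = supply_slope * p + v
        prices.append(p)
        quantities.append(q)
    return ols_slope(prices, quantities)

# Perfectly stable supply: only demand shifts, so the data trace out the supply curve
stable_supply = mongrel_slope(-1.0, 1.0, sd_demand=1.0, sd_supply=0.0)

# Both curves shift: the estimate is a mix of the two slopes
both_shift = mongrel_slope(-1.0, 1.0, sd_demand=1.0, sd_supply=1.0)

print(stable_supply, both_shift)
```

When only demand shifts, every observed point lies on the supply curve, so the regression recovers the supply slope; when both curves shift, the estimate lands between the two slopes and identifies neither.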
Part 2: Short Answer
(20 points each)
In 4-6 sentences AND/OR equations, answer all three of the following questions.
1. Suppose that a simple regression shows that st= -.1 + 1.05*st*, where st is the actual rate of return on the S&P 500 Index in a year, and st* is the experts' predictions of st in a given year. You have 1000 observations (50 experts over 20 years). RSS=500;
;
.
Test the hypothesis of rational expectations on this data set.
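This is just like the homework problem: note that RE imposes TWO restrictions: the constant=0 and the coefficient on s*=1. So just set up the F-test:
$$F = \frac{(Rb-q)'\,[R(X'X)^{-1}R']^{-1}\,(Rb-q)/J}{RSS/(N-K)}$$
with J=2, R equal to the 2x2 identity matrix, q=(0, 1)', and b=(-.1, 1.05)'. Finally, note that
$$X'X = \begin{pmatrix} N & \sum s_t^* \\ \sum s_t^* & \sum (s_t^*)^2 \end{pmatrix}$$
and plug in the sums given in the question. Consulting the tables, the critical value for the 5% F(2,998) test=3.85, so you can accept the null of RE.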
2. Suppose you are trying to estimate the following model on demeaned data:
(S) Q=b1*P+b2*W
(D) Q=c1*P
You are given the following matrix of second moments:
|   | Q | P | W |
|---|---|---|---|
| Q | 10 | 0 | 5 |
| P |   | 5 | -5 |
| W |   |   | 5 |
Use 2SLS to estimate c1. Hint: $P^*=W(W'W)^{-1}W'P$.
The first step is already done, so you can simply regress Q on P* to get c1:
$c_1=(P^{*\prime}P^*)^{-1}P^{*\prime}Q=[P'W(W'W)^{-1}W'W(W'W)^{-1}W'P]^{-1}P'W(W'W)^{-1}W'Q$
Cancel the (W'W) and its inverse:
$c_1=[P'W(W'W)^{-1}W'P]^{-1}P'W(W'W)^{-1}W'Q$
Now just plug in from the table:
$[(-5)(5)^{-1}(-5)]^{-1}(-5)(5)^{-1}(5)=(5)^{-1}(-5)=-1$. So c1=-1.
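The plug-in step can be sketched in Python directly from the matrix of second moments. The dictionary layout and the `moment` helper are illustrative; because every quantity here is a scalar, the matrix inverses reduce to ordinary division.

```python
# Second moments from the table (upper triangle only)
moments = {
    ("Q", "Q"): 10, ("Q", "P"): 0, ("Q", "W"): 5,
    ("P", "P"): 5, ("P", "W"): -5,
    ("W", "W"): 5,
}

def moment(a, b):
    """Look up a'b, using the symmetry of the moment matrix."""
    return moments[(a, b)] if (a, b) in moments else moments[(b, a)]

pw = moment("P", "W")
ww = moment("W", "W")
wq = moment("W", "Q")

# P*'P* = P'W (W'W)^{-1} W'P
pstar_pstar = pw * pw / ww
# P*'Q = P'W (W'W)^{-1} W'Q
pstar_q = pw * wq / ww

c1 = pstar_q / pstar_pstar

# With one instrument and one endogenous regressor, 2SLS collapses to W'Q / W'P
c1_iv = wq / pw

print(c1, c1_iv)
```

Both routes give the same estimate, which is expected in the exactly identified case where W is the only instrument for P.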
3. Caplan ("Has Leviathan Been Bound?") uses single-equation estimation to look at the impact of political composition on government spending. Could his results suffer from simultaneity bias? Briefly describe one possible sort of simultaneity bias his model might suffer from, and suggest one possible instrumental variable you could use to handle the simultaneity bias.
My regressions are supposed to show that a change in political composition changes fiscal policy measured in real, per-capita terms.
F=b1*Dempercent + b2*Distance
The most obvious kind of simultaneity bias to suggest, therefore, is just that fiscal policy could change a state's political composition. Perhaps big government makes anti-government voters flee the state, leading to larger Democratic majorities. E.g.:
Dempercent=c1*F
An easy suggested instrument - too easy to get full credit - is just to use lagged political composition. Anything more imaginative typically got full credit: federal grants (although note that federal grants were one of the control variables used); ADA ratings of politicians' ideology; perhaps also religious or ethnic or demographic factors.