Fit instrumental-variable regression by two-stage least squares. This is equivalent to direct instrumental-variables estimation when the number of instruments is equal to the number of predictors.
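The equivalence stated above can be checked numerically. The following is a minimal base-R sketch on simulated data, not the code used by ivreg itself; all variable names (x, z, u, X, Z) and the simulated coefficients are illustrative assumptions. With one endogenous regressor and one instrument (plus an intercept in both matrices), the direct IV estimator and the two-stage estimator should agree up to floating-point error.

```r
# Simulated, just-identified setting (assumed for illustration only):
# one endogenous regressor x, one instrument z, confounder u.
set.seed(42)
n <- 1000
z <- rnorm(n)
u <- rnorm(n)                      # unobserved confounder
x <- 0.7 * z + u + rnorm(n)        # x is correlated with u, hence endogenous
y <- 0.5 + 1.5 * x + u + rnorm(n)

X <- cbind(1, x)                   # regressors, including the intercept
Z <- cbind(1, z)                   # instruments, same number of columns as X

# Direct IV estimator: (Z'X)^{-1} Z'y
beta_iv <- solve(crossprod(Z, X), crossprod(Z, y))

# 2SLS: stage 1 projects X onto the column space of Z,
# stage 2 regresses y on the projected regressors.
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))
beta_2sls <- solve(crossprod(X_hat, X), crossprod(X_hat, y))

# Largest absolute difference between the two coefficient vectors
max_diff <- max(abs(beta_iv - beta_2sls))
print(cbind(beta_iv, beta_2sls))
```

When there are more instruments than regressors, Z'X is no longer square and only the two-stage form applies.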
Argument | Description |
---|---|
formula | specification(s) of the regression relationship |
instruments | the instruments. Either |
model, x, y | logicals. If |
... | further arguments passed to methods. See also |
ivreg is from Christian Kleiber and Achim Zeileis (2008). Applied Econometrics with R. New York: Springer-Verlag. ISBN 978-0-387-77316-2. URL https://CRAN.R-project.org/package=AER
Regressors and instruments for ivreg are most easily specified in a formula with two parts on the right-hand side, e.g., y ~ x1 + x2 | z1 + z2 + z3, where x1 and x2 are the regressors and z1, z2, and z3 are the instruments. Note that exogenous regressors have to be included as instruments for themselves. For example, if there is one exogenous regressor ex and one endogenous regressor en with instrument in, the appropriate formula would be y ~ ex + en | ex + in. Equivalently, this can be specified as y ~ ex + en | . - en + in, i.e., by providing an update formula with a . in the second part of the formula. The latter is typically more convenient, if there is a large number of exogenous regressors.
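To make the role of the exogenous regressor concrete, here is a hedged sketch on simulated data. The data-generating coefficients and the names dat, inst, and en_hat are assumptions for illustration; the instrument is called inst because in is a reserved word in R and cannot be used unquoted in a formula. The manual two-stage fit uses only base R and shows ex appearing in both stages, which is what listing it after the | achieves. The optional part runs only if the AER package is installed and compares the two equivalent formula specifications.

```r
# Simulated data (illustrative assumption): ex is exogenous, en is endogenous,
# inst is the instrument for en, u is an unobserved confounder.
set.seed(1)
n    <- 2000
ex   <- rnorm(n)
inst <- rnorm(n)
u    <- rnorm(n)
en   <- 0.8 * inst + 0.5 * ex + u + rnorm(n)
y    <- 1 + 2 * ex - 1 * en + u + rnorm(n)
dat  <- data.frame(y, ex, en, inst)

# Stage 1: regress the endogenous regressor on ALL instruments,
# which includes the exogenous regressor ex itself.
stage1 <- lm(en ~ ex + inst, data = dat)
dat$en_hat <- fitted(stage1)

# Stage 2: replace en by its first-stage fitted values.
# (Point estimates match 2SLS; the lm standard errors here do not.)
stage2 <- lm(y ~ ex + en_hat, data = dat)
coef_manual <- coef(stage2)
print(coef_manual)

# Optional comparison of the two formula styles, if AER is available.
if (requireNamespace("AER", quietly = TRUE)) {
  fit_full <- AER::ivreg(y ~ ex + en | ex + inst, data = dat)
  fit_dot  <- AER::ivreg(y ~ ex + en | . - en + inst, data = dat)
  print(all.equal(coef(fit_full), coef(fit_dot)))
  print(all.equal(unname(coef(fit_full)), unname(coef_manual)))
}
```

The manual second stage is shown only for its point estimates; for inference, a dedicated 2SLS routine such as ivreg is needed because the naive second-stage standard errors ignore the first-stage estimation.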
zelig(formula, data, model = NULL, ..., weights = NULL, by,
bootstrap = FALSE)
The zelig function estimates a variety of statistical models.
zelig
Greene, W. H. (1993) Econometric Analysis, 2nd ed., Macmillan.
library(AER)   # for sandwich vcov
library(dplyr) # for the pipe operator %>%

# load and transform data
data("CigarettesSW")
CigarettesSW$rprice <- with(CigarettesSW, price/cpi)
CigarettesSW$rincome <- with(CigarettesSW, income/population/cpi)
CigarettesSW$tdiff <- with(CigarettesSW, (taxs - tax)/cpi)

# log second stage independent variables, as logging internally for ivreg is
# not currently supported
CigarettesSW$log_rprice <- log(CigarettesSW$rprice)
CigarettesSW$log_rincome <- log(CigarettesSW$rincome)

# estimate model
z.out1 <- zelig(log(packs) ~ log_rprice + log_rincome | log_rincome + tdiff + I(tax/cpi),
                data = CigarettesSW, subset = year == "1995", model = "ivreg")
#> Error in eval(substitute(subset), data, env): object 'year' not found
summary(z.out1)
#> Model:
#>
#> Call:
#> z5$zelig(formula = log(packs) ~ log_rprice + log_rincome | log_rincome +
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)   9.8950     1.0586   9.348 4.12e-12
#> log_rprice   -1.2774     0.2632  -4.853 1.50e-05
#> log_rincome   0.2804     0.2386   1.175    0.246
#>
#> Residual standard error: 0.1879 on 45 degrees of freedom
#> Multiple R-Squared: 0.4294, Adjusted R-squared: 0.4041
#> Wald test: 13.28 on 2 and 45 DF, p-value: 2.931e-05
#>
#> Next step: Use 'setx' method

#>
#> Call:
#> `z5$zelig`(formula = log(packs) ~ log_rprice + log_rincome |
#>     log_rincome + tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)   9.8950     0.9288  10.654  < 2e-16 ***
#> log_rprice   -1.2774     0.2417  -5.286 1.25e-07 ***
#> log_rincome   0.2804     0.2458   1.141    0.254
#>
#> Diagnostic tests:
#>                  df1 df2 statistic p-value
#> Weak instruments   2  44   228.738  <2e-16 ***
#> Wu-Hausman         1  44     3.823  0.0569 .
#> Sargan             1  NA     0.333  0.5641
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.1879 on Inf degrees of freedom
#> Multiple R-Squared: 0.4294, Adjusted R-squared: 0.4041
#> Wald test: 34.51 on 2 DF, p-value: 3.214e-08
#>

# ANOVA
z.out2 <- zelig(log(packs) ~ log_rprice | tdiff, data = CigarettesSW,
                subset = year == "1995", model = "ivreg")
#> Error in eval(substitute(subset), data, env): object 'year' not found
#> Analysis of Variance Table
#>
#> Model 1: log(packs) ~ log_rprice + log_rincome | log_rincome + tdiff +
#>     I(tax/cpi)
#> Model 2: log(packs) ~ log_rprice | tdiff
#>   Res.Df    RSS Df Sum of Sq      F Pr(>F)
#> 1     45 1.5880
#> 2     46 1.6668 -1 -0.078748 1.3815  0.246