Calculate two-stage difference-in-differences following Gardner (2021)
did2s.RdCalculate two-stage difference-in-differences following Gardner (2021)
Usage
did2s(
data,
yname,
first_stage,
second_stage,
treatment,
cluster_var,
weights = NULL,
bootstrap = FALSE,
n_bootstraps = 250,
return_bootstrap = FALSE,
verbose = TRUE
)Arguments
- data
The dataframe containing all the variables
- yname
Outcome variable
- first_stage
Fixed effects and other covariates you want to residualize with in first stage. Formula following
fixest::feols. Fixed effects specified after "|".- second_stage
Second stage, these should be the treatment indicator(s) (e.g. treatment variable or event-study leads/lags). Formula following
fixest::feols. Usei()for factor variables, seefixest::i.- treatment
A variable that = 1 if treated, = 0 otherwise
- cluster_var
What variable to cluster standard errors. This can be IDs or a higher aggregate level (state for example)
- weights
Optional. Variable name for regression weights.
- bootstrap
Optional. Should standard errors be calculated using bootstrap? Default is
FALSE.- n_bootstraps
Optional. How many bootstraps to run. Default is
250.- return_bootstrap
Optional. Logical. Will return each bootstrap second-stage estimate to allow for manual use, e.g. percentile standard errors and empirical confidence intervals.
- verbose
Optional. Logical. Should information about the two-stage procedure be printed back to the user? Default is
TRUE.
Value
fixest object with adjusted standard errors
(either by formula or by bootstrap). All the methods from fixest package
will work, including fixest::esttable and
fixest::coefplot
Examples
Load example dataset which has two treatment groups and homogeneous treatment effects
# Load Example Dataset
data("df_hom")Static TWFE
You can run a static TWFE fixed effect model for a simple treatment indicator
static <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(treat, ref=FALSE))
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | unit + year`
#> - second stage formula `~ i(treat, ref = FALSE)`
#> - The indicator variable that denotes when treatment is on is `treat`
#> - Standard errors will be clustered by `state`
fixest::esttable(static)
#> static
#> Dependent Var.: dep_var
#>
#> treat = TRUE 2.005*** (0.0202)
#> _______________ _________________
#> S.E. type Custom
#> Observations 46,500
#> R2 0.47520
#> Adj. R2 0.47520
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1Event Study
Or you can use relative-treatment indicators to estimate an event study estimate
es <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(rel_year, ref=c(-1, Inf)))
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | unit + year`
#> - second stage formula `~ i(rel_year, ref = c(-1, Inf))`
#> - The indicator variable that denotes when treatment is on is `treat`
#> - Standard errors will be clustered by `state`
fixest::esttable(es)
#> es
#> Dependent Var.: dep_var
#>
#> rel_year = -20 0.0043 (0.0322)
#> rel_year = -19 0.0222 (0.0296)
#> rel_year = -18 -0.0358 (0.0308)
#> rel_year = -17 0.0043 (0.0337)
#> rel_year = -16 -0.0186 (0.0353)
#> rel_year = -15 -0.0045 (0.0346)
#> rel_year = -14 -0.0393 (0.0384)
#> rel_year = -13 0.0453 (0.0323)
#> rel_year = -12 0.0324 (0.0309)
#> rel_year = -11 -0.0245 (0.0349)
#> rel_year = -10 -0.0017 (0.0241)
#> rel_year = -9 0.0155 (0.0242)
#> rel_year = -8 -0.0073 (0.0210)
#> rel_year = -7 -0.0513* (0.0202)
#> rel_year = -6 0.0269 (0.0237)
#> rel_year = -5 0.0136 (0.0237)
#> rel_year = -4 0.0381. (0.0223)
#> rel_year = -3 -0.0228 (0.0284)
#> rel_year = -2 0.0041 (0.0228)
#> rel_year = 0 1.971*** (0.0470)
#> rel_year = 1 2.050*** (0.0466)
#> rel_year = 2 2.033*** (0.0441)
#> rel_year = 3 1.966*** (0.0400)
#> rel_year = 4 1.965*** (0.0430)
#> rel_year = 5 2.030*** (0.0456)
#> rel_year = 6 2.040*** (0.0447)
#> rel_year = 7 1.995*** (0.0370)
#> rel_year = 8 2.019*** (0.0485)
#> rel_year = 9 1.955*** (0.0468)
#> rel_year = 10 1.950*** (0.0455)
#> rel_year = 11 2.117*** (0.0664)
#> rel_year = 12 2.132*** (0.0741)
#> rel_year = 13 2.019*** (0.0640)
#> rel_year = 14 2.013*** (0.0522)
#> rel_year = 15 1.961*** (0.0605)
#> rel_year = 16 1.916*** (0.0584)
#> rel_year = 17 1.938*** (0.0607)
#> rel_year = 18 2.070*** (0.0666)
#> rel_year = 19 2.066*** (0.0609)
#> rel_year = 20 1.964*** (0.0612)
#> _______________ _________________
#> S.E. type Custom
#> Observations 46,500
#> R2 0.47577
#> Adj. R2 0.47533
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1# plot rel_year coefficients and standard errors
fixest::coefplot(es, keep = "rel_year::(.*)")Example from Cheng and Hoekstra (2013)
Here's an example using data from Cheng and Hoekstra (2013)
# Castle Data
castle <- haven::read_dta("https://github.com/scunning1975/mixtape/raw/master/castle.dta")
did2s(
data = castle,
yname = "l_homicide",
first_stage = ~ 0 | sid + year,
second_stage = ~ i(post, ref=0),
treatment = "post",
cluster_var = "state", weights = "popwt"
)
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | sid + year`
#> - second stage formula `~ i(post, ref = 0)`
#> - The indicator variable that denotes when treatment is on is `post`
#> - Standard errors will be clustered by `state`
#> OLS estimation, Dep. Var.: l_homicide
#> Observations: 550
#> Weights: weights_vector
#> Standard-errors: Custom
#> Estimate Std. Error t value Pr(>|t|)
#> post::1 0.075142 0.03538 2.12387 0.034127 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 263.4 Adj. R2: 0.052465