# Calculate two-stage difference-in-differences following Gardner (2021)

`did2s.Rd`

Calculate two-stage difference-in-differences following Gardner (2021)

## Usage

```
did2s(
data,
yname,
first_stage,
second_stage,
treatment,
cluster_var,
weights = NULL,
bootstrap = FALSE,
n_bootstraps = 250,
return_bootstrap = FALSE,
verbose = TRUE
)
```

## Arguments

- data
The dataframe containing all the variables

- yname
Outcome variable

- first_stage
Fixed effects and other covariates you want to residualize with in first stage. Formula following

`fixest::feols`

. Fixed effects specified after "`|`

".- second_stage
Second stage, these should be the treatment indicator(s) (e.g. treatment variable or event-study leads/lags). Formula following

`fixest::feols`

. Use`i()`

for factor variables, see`fixest::i`

.- treatment
A variable that = 1 if treated, = 0 otherwise

- cluster_var
What variable to cluster standard errors. This can be IDs or a higher aggregate level (state for example)

- weights
Optional. Variable name for regression weights.

- bootstrap
Optional. Should standard errors be calculated using bootstrap? Default is

`FALSE`

.- n_bootstraps
Optional. How many bootstraps to run. Default is

`250`

.- return_bootstrap
Optional. Logical. Will return each bootstrap second-stage estimate to allow for manual use, e.g. percentile standard errors and empirical confidence intervals.

- verbose
Optional. Logical. Should information about the two-stage procedure be printed back to the user? Default is

`TRUE`

.

## Value

`fixest`

object with adjusted standard errors
(either by formula or by bootstrap). All the methods from `fixest`

package
will work, including `fixest::esttable`

and
`fixest::coefplot`

## Examples

Load example dataset which has two treatment groups and homogeneous treatment effects

```
# Load Example Dataset
data("df_hom")
```

### Static TWFE

You can run a static TWFE fixed effect model for a simple treatment indicator

```
static <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(treat, ref=FALSE))
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | unit + year`
#> - second stage formula `~ i(treat, ref = FALSE)`
#> - The indicator variable that denotes when treatment is on is `treat`
#> - Standard errors will be clustered by `state`
fixest::esttable(static)
#> static
#> Dependent Var.: dep_var
#>
#> treat = TRUE 2.005*** (0.0202)
#> _______________ _________________
#> S.E. type Custom
#> Observations 46,500
#> R2 0.47520
#> Adj. R2 0.47520
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

### Event Study

Or you can use relative-treatment indicators to estimate an event study estimate

```
es <- did2s(df_hom,
yname = "dep_var", treatment = "treat", cluster_var = "state",
first_stage = ~ 0 | unit + year,
second_stage = ~ i(rel_year, ref=c(-1, Inf)))
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | unit + year`
#> - second stage formula `~ i(rel_year, ref = c(-1, Inf))`
#> - The indicator variable that denotes when treatment is on is `treat`
#> - Standard errors will be clustered by `state`
fixest::esttable(es)
#> es
#> Dependent Var.: dep_var
#>
#> rel_year = -20 0.0043 (0.0322)
#> rel_year = -19 0.0222 (0.0296)
#> rel_year = -18 -0.0358 (0.0308)
#> rel_year = -17 0.0043 (0.0337)
#> rel_year = -16 -0.0186 (0.0353)
#> rel_year = -15 -0.0045 (0.0346)
#> rel_year = -14 -0.0393 (0.0384)
#> rel_year = -13 0.0453 (0.0323)
#> rel_year = -12 0.0324 (0.0309)
#> rel_year = -11 -0.0245 (0.0349)
#> rel_year = -10 -0.0017 (0.0241)
#> rel_year = -9 0.0155 (0.0242)
#> rel_year = -8 -0.0073 (0.0210)
#> rel_year = -7 -0.0513* (0.0202)
#> rel_year = -6 0.0269 (0.0237)
#> rel_year = -5 0.0136 (0.0237)
#> rel_year = -4 0.0381. (0.0223)
#> rel_year = -3 -0.0228 (0.0284)
#> rel_year = -2 0.0041 (0.0228)
#> rel_year = 0 1.971*** (0.0470)
#> rel_year = 1 2.050*** (0.0466)
#> rel_year = 2 2.033*** (0.0441)
#> rel_year = 3 1.966*** (0.0400)
#> rel_year = 4 1.965*** (0.0430)
#> rel_year = 5 2.030*** (0.0456)
#> rel_year = 6 2.040*** (0.0447)
#> rel_year = 7 1.995*** (0.0370)
#> rel_year = 8 2.019*** (0.0485)
#> rel_year = 9 1.955*** (0.0468)
#> rel_year = 10 1.950*** (0.0455)
#> rel_year = 11 2.117*** (0.0664)
#> rel_year = 12 2.132*** (0.0741)
#> rel_year = 13 2.019*** (0.0640)
#> rel_year = 14 2.013*** (0.0522)
#> rel_year = 15 1.961*** (0.0605)
#> rel_year = 16 1.916*** (0.0584)
#> rel_year = 17 1.938*** (0.0607)
#> rel_year = 18 2.070*** (0.0666)
#> rel_year = 19 2.066*** (0.0609)
#> rel_year = 20 1.964*** (0.0612)
#> _______________ _________________
#> S.E. type Custom
#> Observations 46,500
#> R2 0.47577
#> Adj. R2 0.47533
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
# plot rel_year coefficients and standard errors
fixest::coefplot(es, keep = "rel_year::(.*)")
```

### Example from Cheng and Hoekstra (2013)

Here's an example using data from Cheng and Hoekstra (2013)

```
# Castle Data
castle <- haven::read_dta("https://github.com/scunning1975/mixtape/raw/master/castle.dta")
did2s(
data = castle,
yname = "l_homicide",
first_stage = ~ 0 | sid + year,
second_stage = ~ i(post, ref=0),
treatment = "post",
cluster_var = "state", weights = "popwt"
)
#> Running Two-stage Difference-in-Differences
#> - first stage formula `~ 0 | sid + year`
#> - second stage formula `~ i(post, ref = 0)`
#> - The indicator variable that denotes when treatment is on is `post`
#> - Standard errors will be clustered by `state`
#> OLS estimation, Dep. Var.: l_homicide
#> Observations: 550
#> Weights: weights_vector
#> Standard-errors: Custom
#> Estimate Std. Error t value Pr(>|t|)
#> post::1 0.075142 0.03538 2.12387 0.034127 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 263.4 Adj. R2: 0.052465
```