Chapter 6 Panel data
We assume data in which \(\{Y_{it},D_{it},X_{it}\}\) are observed
- \(i\): respondent, \(t\): survey period
6.1 Packages
library(tidyverse)
library(estimatr)
library(AER)
library(did) # weighted two-way fixed effect
6.2 Data
We use the panel data set PSID7682, which ships with the AER package
- A 7-period panel of 595 respondents, covering 1976 to 1982
data("PSID7682")
data <- PSID7682 |>
  group_by(id) |>
  mutate(period = as.numeric(year), # convert year (a factor) to a numeric period index
         treatment.time = if_else(married == "yes",
                                  period, 999),
         treatment.time = min(treatment.time) # first period observed as married (never married = 999)
  ) |>
  ungroup()
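To make the construction of treatment.time concrete, here is a minimal sketch on a hypothetical three-person toy panel (the object names toy and toy_timing and all values are made up for illustration and are not part of PSID7682). It applies the same group_by / if_else / min logic as above.

```r
library(dplyr)

# Hypothetical toy panel: 3 respondents observed for 3 periods
toy <- tibble(
  id      = rep(1:3, each = 3),
  period  = rep(c(1, 2, 3), times = 3),
  married = c("no",  "yes", "yes",   # id 1: first married in period 2
              "no",  "no",  "no",    # id 2: never married
              "yes", "yes", "yes")   # id 3: already married in period 1
)

# First period in which each respondent is observed as married (999 = never)
toy_timing <- toy |>
  group_by(id) |>
  mutate(treatment.time = min(if_else(married == "yes", period, 999))) |>
  ungroup()

distinct(toy_timing, id, treatment.time)
```

Respondents who are already married in period 1 receive treatment.time = 1, so no pre-treatment period is observed for them.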
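6.3 Identification: Parallel trend in the two-by-two case
Two-period, two-group data
Treatment group: receives the intervention in period 2
Control group: receives no intervention in either period
Parallel trend assumption: \(E[Y_{i2}(0)-Y_{i1}(0)|i\in Treatment]-E[Y_{i2}(0)-Y_{i1}(0)|i\in Control]=0\)
Under this assumption, the difference-in-differences identifies the period-2 effect on the treatment group: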
\[E[Y_{i2}|i\in Treatment]-E[Y_{i1}|i \in Treatment]\]
\[-(E[Y_{i2}|i\in Control]-E[Y_{i1}|i \in Control])\]
\[= E[Y_{i2}(1) - Y_{i2}(0)|i \in Treatment]\]
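The final equality relies on the parallel trend assumption. A sketch of the intermediate steps, writing \(T\) and \(C\) as shorthand (introduced here) for \(i\in Treatment\) and \(i\in Control\), and using the fact that every observed outcome equals \(Y_{it}(0)\) except for the treatment group in period 2:

\[\begin{aligned}
&E[Y_{i2}|T]-E[Y_{i1}|T]-(E[Y_{i2}|C]-E[Y_{i1}|C])\\
&= E[Y_{i2}(1)-Y_{i1}(0)|T]-E[Y_{i2}(0)-Y_{i1}(0)|C]\\
&= E[Y_{i2}(1)-Y_{i2}(0)|T]+\left(E[Y_{i2}(0)-Y_{i1}(0)|T]-E[Y_{i2}(0)-Y_{i1}(0)|C]\right)\\
&= E[Y_{i2}(1)-Y_{i2}(0)|T]
\end{aligned}\]

The second line adds and subtracts \(E[Y_{i2}(0)|T]\); the term in parentheses is zero under parallel trends.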
6.4 Estimation: Two-way fixed effect model
- Two-way fixed effect model
\[E[Y_{it}|D_{it}=d,f_{i},f_{t}]=\beta_\tau\times d + f_i + f_t\]
With two-by-two data, this is equivalent to the difference-in-differences estimate
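The equivalence can be checked numerically. The sketch below uses a hypothetical balanced toy panel of four units (all values are made up for illustration) and base R only: it computes the difference-in-differences from the four group-by-period means and compares it with the coefficient on D from a regression with unit and period dummies.

```r
# Hypothetical balanced 2x2 panel: ids 1-2 are treated in period 2, ids 3-4 never
toy <- data.frame(
  id     = rep(1:4, each = 2),
  period = rep(1:2, times = 4),
  treat  = rep(c(1, 1, 0, 0), each = 2),
  y      = c(10, 15, 12, 16, 8, 9, 11, 13)
)
toy$D <- toy$treat * (toy$period == 2)  # post-treatment dummy

# Difference-in-differences from group-by-period means
m <- tapply(toy$y, list(toy$treat, toy$period), mean)
did_hat <- (m["1", "2"] - m["1", "1"]) - (m["0", "2"] - m["0", "1"])

# Two-way fixed effect regression with explicit unit and period dummies
twfe_hat <- coef(lm(y ~ D + factor(id) + factor(period), data = toy))[["D"]]

all.equal(did_hat, twfe_hat)
```

The two numbers coincide here because the panel is balanced and has exactly two periods and two groups; with staggered treatment timing they generally differ.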
Preparing the two-by-two data
df <- data |>
  filter(period <= 2) |> # keep periods 1 and 2
  filter(treatment.time == 999 |
           treatment.time == 2) |> # control (999) / treatment (married in period 2) groups
  mutate(D = if_else(period >= treatment.time,
                     1,
                     0) # post-treatment dummy
  )
- Estimating the two-way fixed effect model
lm_robust(weeks ~ D + factor(period),
          data = df,
          clusters = id,      # cluster standard errors by respondent
          fixed_effects = id) # respondent fixed effects
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper
## D -1.066667 1.2271177 -0.8692456 0.4713004 -6.0381262 3.904793
## factor(period)2 1.400000 0.8532526 1.6407803 0.1043726 -0.2953947 3.095395
## DF
## D 2.135502
## factor(period)2 89.000000
6.5 Estimation: Weighted two-way fixed effect model
A method for estimating causal effects under the parallel trend assumption in data with two or more periods
Here we introduce Callaway and Sant’Anna (2020)
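For reference, the building block of this approach is the group-time average treatment effect. The following is a sketch of the simplest version (no covariates, never-treated comparison group); \(G_i\), the period in which unit \(i\) is first treated, and \(C_i\), an indicator for never-treated units, are symbols introduced here and do not appear elsewhere in this chapter. In the data, \(G_i\) corresponds to the variable treatment.time.

\[ATT(g,t)=E[Y_{it}(g)-Y_{it}(0)|G_i=g]\]

Under no anticipation and parallel trends relative to the never-treated group, for \(t\geq g\):

\[ATT(g,t)=E[Y_{it}-Y_{i,g-1}|G_i=g]-E[Y_{it}-Y_{i,g-1}|C_i=1]\]

Each \(ATT(g,t)\) is therefore a two-by-two difference-in-differences that compares group \(g\) with the comparison group between period \(g-1\) and period \(t\).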
Data preparation
df <- data |>
  filter(treatment.time != 1) |> # drop respondents already married in period 1
  mutate(id = as.numeric(id),
         treatment.time = if_else(treatment.time == 999,
                                  0, # recode never-married respondents to 0
                                  treatment.time) )
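- Estimation
fit <- att_gt(yname = "weeks",
              tname = "period",
              idname = "id",
              gname = "treatment.time",
              data = df,
              # NOTE: did documents "nevertreated" (default) or "notyettreated" for this
              # argument; 999 is kept so that the call matches the printed output
              control_group = 999)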
fit
##
## Call:
## att_gt(yname = "weeks", tname = "period", idname = "id", gname = "treatment.time",
## data = df, control_group = 999)
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
##
## Group-Time Average Treatment Effects:
## Group Time ATT(g,t) Std. Error [95% Simult. Conf. Band]
## 2 2 -0.9412 1.1090 -3.8502 1.9678
## 2 3 -2.5455 2.4102 -8.8679 3.7769
## 2 4 -8.8526 7.6182 -28.8366 11.1314
## 2 5 -8.2151 9.1243 -32.1498 15.7197
## 2 6 -1.5055 1.2416 -4.7623 1.7514
## 2 7 -2.2556 1.9195 -7.2908 2.7797
## 3 2 -2.3434 0.9547 -4.8477 0.1608
## 3 3 1.7980 0.7098 -0.0639 3.6598
## 3 4 0.7228 1.3343 -2.7773 4.2229
## 3 5 1.0538 0.8148 -1.0835 3.1910
## 3 6 0.2125 1.2731 -3.1271 3.5520
## 3 7 1.8111 1.7658 -2.8208 6.4430
## 4 2 2.5765 5.1357 -10.8955 16.0486
## 4 3 0.6579 2.0509 -4.7220 6.0378
## 4 4 -2.7684 1.2410 -6.0238 0.4869
## 4 5 -1.0860 1.8611 -5.9680 3.7960
## 4 6 -6.8489 8.2446 -28.4761 14.7783
## 4 7 0.5833 1.0872 -2.2687 3.4354
## 5 2 -1.3000 0.7710 -3.3224 0.7224
## 5 3 -0.8866 0.9598 -3.4043 1.6311
## 5 4 -0.2742 0.6540 -1.9898 1.4414
## 5 5 -0.3118 0.8957 -2.6613 2.0377
## 5 6 -3.4286 3.4714 -12.5348 5.6777
## 5 7 0.5222 0.6690 -1.2327 2.2771
## 6 2 -5.3800 2.6992 -12.4606 1.7006
## 6 3 -7.5206 8.0760 -28.7055 13.6643
## 6 4 4.8333 5.2688 -8.9878 18.6545
## 6 5 6.3242 5.4700 -8.0247 20.6731
## 6 6 -6.2527 2.7701 -13.5191 1.0136
## 6 7 -2.3222 1.0107 -4.9736 0.3291
## 7 2 -1.2871 0.7614 -3.2845 0.7103
## 7 3 0.1327 0.6607 -1.6005 1.8658
## 7 4 -1.7872 0.5631 -3.2644 -0.3101 *
## 7 5 1.7065 0.4957 0.4062 3.0069 *
## 7 6 -2.2778 0.6351 -3.9439 -0.6117 *
## 7 7 0.9556 0.5356 -0.4493 2.3605
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## P-value for pre-test of parallel trends assumption: 0
## Control Group: , Anticipation Periods: 0
## Estimation Method: Doubly Robust
- Simple average effect
fit |>
  aggte(type = "simple") |>
  summary()
##
## Call:
## aggte(MP = fit, type = "simple")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
##
##
## ATT Std. Error [ 95% Conf. Int.]
## -1.9877 1.1326 -4.2077 0.2322
##
##
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: , Anticipation Periods: 0
## Estimation Method: Doubly Robust
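As a sketch of what this single number summarizes: the simple aggregation averages the post-treatment \(ATT(g,t)\) with weights proportional to group size. In the notation of Callaway and Sant’Anna, with \(G_i\) the first treatment period of unit \(i\) and \(T\) the last period (symbols introduced here for illustration):

\[\theta^{simple}=\frac{1}{\kappa}\sum_{g}\sum_{t}1\{t\geq g\}\,ATT(g,t)\,P(G_i=g|G_i\leq T)\]

\[\kappa=\sum_{g}\sum_{t}1\{t\geq g\}\,P(G_i=g|G_i\leq T)\]

Groups treated early contribute more post-treatment periods, so they receive more total weight in this average.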
- Dynamic effects
fit |>
  aggte(type = "dynamic") |>
  ggdid()
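The plot is organized by event time \(e=t-g\), the number of periods since first treatment. A sketch of the aggregation for \(e\geq 0\), again using \(G_i\) for the first treatment period and \(T\) for the last period (symbols introduced here for illustration):

\[\theta^{dynamic}(e)=\sum_{g}1\{g+e\leq T\}\,P(G_i=g|G_i+e\leq T)\,ATT(g,g+e)\]

Estimates at negative event times come from pre-treatment periods and serve as a visual check of the parallel trend assumption.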
References
Callaway, Brantly, and Pedro HC Sant’Anna. 2020. “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics.