edits to draft documents

2026-02-18 19:17:33 -08:00
parent 1342b06871
commit d77c351faa
4 changed files with 88 additions and 55 deletions
--- a/analysis/draft.md
+++ b/analysis/draft.md
@@ -4,7 +4,9 @@

 How does transparency alter regulatory enforcement in high-capacity but locally discretionary bureaucracies? We study the January 2019 Texas Railroad Commission (RRC) disclosure change that made well-level violation information publicly searchable. The policy constitutes a statewide transparency shock, but implementation and enforcement remain district-administered. This setting allows us to test both system-wide effects and district-level heterogeneity in policy response.

-Our core empirical finding is a two-part pattern. First, we find evidence of **gradual post-policy acceleration** in enforcement timing at the statewide level (significant post-policy trend improvement) rather than a sharp immediate level break in 2019. Second, district-level responses are strongly heterogeneous, and offshore-jurisdiction districts (02/03/04) exhibit systematically different post-policy dynamics once district-specific post effects are modeled.
+Targeted transparency is increasingly used as a regulatory tool to improve accountability, but its impact is mediated by the discretion of local field offices. Because policy implementation often lags, we use an interrupted time-series design to capture gradual enforcement acceleration, and we explicitly model the structural, spatial, and demographic factors that may drive street-level bureaucratic heterogeneity.
+
+Our core empirical finding is a two-part pattern. First, we find evidence of gradual post-policy acceleration in enforcement timing at the statewide level (significant post-policy trend improvement) rather than a sharp immediate level break in 2019. Second, district-level responses are strongly heterogeneous, and offshore-jurisdiction districts (02/03/04) exhibit systematically different post-policy dynamics once district-specific post effects are modeled.

 ## Theory and Hypotheses

@@ -22,42 +24,49 @@ We test:

 ## Data and Measures

-We construct a district-year panel (2015-2025, 13 RRC districts) from administrative inspection and violation records. Well-level integration uses `api_norm` as the normalized identifier across sources.
+We construct a district-year panel (2015-2025, 13 RRC districts) from administrative inspection and violation records. Well-level records are linked across sources prior to district-year aggregation.

 Primary outcomes:

- `log_days_to_enf`: log mean days from violation discovery to enforcement action.
- `resolution_rate`: percent compliant on re-inspection.
- `compliance_rate`: percent compliant at inspection.
- `violations_per_inspection`.
+- Enforcement delay: the logged district-year mean number of days from violation discovery to enforcement action.
+- Resolution on re-inspection: the district-year share of violations marked compliant at re-inspection.
+- Inspection compliance rate: the district-year share of inspections marked compliant.
+- Violations per inspection: total violations divided by total inspections in each district-year.
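
The four outcome definitions above can be sketched as a district-year aggregation in `pandas`. This is a minimal illustration on toy records, not the project's pipeline: the table layouts and the column names `days_to_enforcement`, `compliant_at_reinspection`, and `compliant` are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy well-level records (hypothetical columns and values).
violations = pd.DataFrame({
    "district": [1, 1, 1, 2],
    "year": [2019, 2019, 2020, 2019],
    "days_to_enforcement": [10.0, 30.0, 50.0, 20.0],
    "compliant_at_reinspection": [True, False, True, True],
})
inspections = pd.DataFrame({
    "district": [1, 1, 1, 1, 1, 2, 2],
    "year": [2019, 2019, 2019, 2019, 2020, 2019, 2019],
    "compliant": [True, True, False, False, False, True, False],
})

keys = ["district", "year"]

# Violation-side aggregates per district-year.
v = violations.groupby(keys).agg(
    mean_days=("days_to_enforcement", "mean"),
    resolution_rate=("compliant_at_reinspection", "mean"),
    n_violations=("days_to_enforcement", "count"),
)

# Inspection-side aggregates per district-year.
i = inspections.groupby(keys).agg(
    compliance_rate=("compliant", "mean"),
    n_inspections=("compliant", "count"),
)

panel = v.join(i, how="outer")

# Enforcement delay is the log of the district-year mean, not the mean of logs.
panel["log_days_to_enf"] = np.log(panel["mean_days"])
panel["violations_per_inspection"] = panel["n_violations"] / panel["n_inspections"]
```

Taking the log after averaging matches the "logged district-year mean" wording; averaging logged days instead would define a different outcome.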

 ## Empirical Strategy

-We estimate policy effects in three layers.
+To evaluate the January 2019 transparency reform, we pair an all-district interrupted panel design with district-specific heterogeneity models and a spatial dependence diagnostic. This sequence matches the hypotheses: H1 tests system-wide timing change, H2 tests district divergence, H5 tests offshore moderation, and H4 tests whether estimated district effects are spatially clustered. Administrative records are extracted from a PostgreSQL backend, linked across inspection and violation files at the well level, and aggregated to district-year panels in Python (`pandas`, `numpy`). Estimation is conducted with `statsmodels` (Seabold & Perktold, 2010), with `scipy` for auxiliary tests; figures are produced with `matplotlib`/`seaborn` and district map joins use `geopandas`. The H4 spatial test uses a permutation-based global Moran's I computed from district contiguity weights.

 ### Model 1: All-district policy-year shift (H1)

-\[
-Y_{dt}=\alpha_d + \beta_1 \text{YearNum}_t + \beta_2 \text{Post2019}_t + \beta_3 \text{PostTrend}_t + \varepsilon_{dt}
-\]
+$$
+Y_{dt} = \alpha_d + \beta_1 \mathrm{YearNum}_t + \beta_2 \mathrm{Post2019}_t + \beta_3 \mathrm{PostTrend}_t + \varepsilon_{dt}
+$$

-Where \(\text{PostTrend}_t = \max(0, t-2018)\). This distinguishes an immediate post-2019 level shift (\(\beta_2\)) from post-policy slope change (\(\beta_3\)).
+Where $\mathrm{PostTrend}_t = \max(0, t-2018)$. This distinguishes an immediate post-2019 level shift ($\beta_2$) from a post-policy slope change ($\beta_3$).
+This follows interrupted time-series logic for a common policy shock, separating immediate and gradual responses (Biglan, Ary, & Wagenaar, 2000; Bernal, Cummins, & Gasparrini, 2017; Linden, 2015).
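
A minimal sketch of how Model 1 could be fit with `statsmodels`, using a synthetic 13-district, 2015-2025 panel. The data-generating values (a level shift of 0.1 and a slope shift of -0.3) are arbitrary illustration values, not the paper's estimates, and the integer district codes and the name `panel` are assumptions for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

# Balanced synthetic panel: 13 hypothetical district codes x 11 years.
panel = pd.DataFrame(
    [(d, t) for d in range(1, 14) for t in range(2015, 2026)],
    columns=["district", "year"],
)

# Interrupted time-series regressors.
panel["year_num"] = panel["year"] - 2015                  # underlying trend
panel["post_2019"] = (panel["year"] >= 2019).astype(int)  # level shift
panel["post_trend"] = np.maximum(0, panel["year"] - 2018) # slope shift

# Synthetic outcome with known (arbitrary) effects.
district_effect = rng.normal(0, 0.5, size=13)
panel["y"] = (
    district_effect[panel["district"].to_numpy() - 1]
    + 0.05 * panel["year_num"]
    + 0.10 * panel["post_2019"]
    - 0.30 * panel["post_trend"]
    + rng.normal(0, 0.05, size=len(panel))
)

# District fixed effects via C(district); standard errors clustered by district.
fit = smf.ols(
    "y ~ year_num + post_2019 + post_trend + C(district)", data=panel
).fit(cov_type="cluster", cov_kwds={"groups": panel["district"]})

level_shift = fit.params["post_2019"]
slope_shift = fit.params["post_trend"]
```

Because `post_trend` equals 1 in 2019, the fitted jump in the first post-policy year is the sum of the level and slope coefficients; the level coefficient alone is the break extrapolated back to 2018.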

 ### Model 2: District heterogeneity (H2)

-\[
-Y_{dt}=\alpha_d + \gamma_t + \sum_d \theta_d (\text{District}_d\times \text{Post2019}_t) + \varepsilon_{dt}
-\]
+$$
+Y_{dt} = \alpha_d + \gamma_t + \sum_{d} \theta_d \bigl(\mathrm{District}_d \times \mathrm{Post2019}_t\bigr) + \varepsilon_{dt}
+$$

 This yields district-specific post-policy effects and a joint heterogeneity test.
+Because all districts are exposed in the same year, this is not a staggered-adoption DiD problem. Still, recent DiD work highlights that pooled average effects can mask meaningful treatment-effect heterogeneity, so we estimate district-specific post effects directly rather than rely on a single pooled interaction (de Chaisemartin & D'Haultfœuille, 2020; Goodman-Bacon, 2021; Sun & Abraham, 2021).
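
One implementation detail is worth making explicit. With year fixed effects in the model, the full set of 13 district-by-post interactions sums to $\mathrm{Post2019}_t$, which the year effects already absorb, so the district-specific effects are identified only relative to a reference district (or under some other normalization). The sketch below checks this on synthetic data with plain `numpy`; choosing the first district as the reference is an assumption for illustration, not necessarily the normalization used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_d, years = 13, np.arange(2015, 2026)
n_t = len(years)

# Long-format indices for a balanced district-year panel.
d_idx, t_idx = np.meshgrid(np.arange(n_d), np.arange(n_t), indexing="ij")
d_idx, t_idx = d_idx.ravel(), t_idx.ravel()
post = (years[t_idx] >= 2019).astype(float)

# Synthetic outcome with known district-specific post effects.
true_theta = rng.normal(0.0, 0.3, size=n_d)
y = (
    rng.normal(size=n_d)[d_idx]      # district fixed effects
    + rng.normal(size=n_t)[t_idx]    # year fixed effects
    + true_theta[d_idx] * post       # heterogeneous post effects
    + rng.normal(0, 0.01, size=d_idx.size)
)

D = np.eye(n_d)[d_idx]               # district dummies
T = np.eye(n_t)[t_idx][:, 1:]        # year dummies, 2015 dropped
DP = D * post[:, None]               # district x post interactions

# All 13 interactions: one exact linear dependency with the year dummies.
X_full = np.hstack([D, T, DP])
rank_full = np.linalg.matrix_rank(X_full)

# Drop the first district's interaction to use it as the reference.
X_ref = np.hstack([D, T, DP[:, 1:]])
rank_ref = np.linalg.matrix_rank(X_ref)

beta, *_ = np.linalg.lstsq(X_ref, y, rcond=None)
theta_rel = beta[-(n_d - 1):]        # effects relative to the reference district
```

Here `X_full` has 36 columns but rank 35, while `X_ref` is full rank, and `theta_rel` recovers each district's post effect minus the reference district's.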

 ### Model 3: Offshore moderation (H5)

-\[
-Y_{dt}=\alpha_d + \gamma_t + \sum_d \theta_d (\text{District}_d\times \text{Post2019}_t) + \phi(\text{Post2019}_t\times \text{Offshore}_d) + \varepsilon_{dt}
-\]
+$$
+Y_{dt} = \alpha_d + \gamma_t + \sum_{d} \theta_d \bigl(\mathrm{District}_d \times \mathrm{Post2019}_t\bigr) + \phi\bigl(\mathrm{Post2019}_t \times \mathrm{Offshore}_d\bigr) + \varepsilon_{dt}
+$$

 Where `Offshore_d = 1` for districts 02/03/04.
+This specification tests whether offshore-regulating districts differ systematically from other districts after controlling for district-specific post-policy shifts.
+
+### Spatial diagnostic (H4)
+
+After estimating district treatment effects, we test for global spatial autocorrelation using permutation-based Moran's I (Anselin, 1995). This assesses whether high- and low-response districts are geographically clustered in ways consistent with diffusion or regional administrative spillovers.
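
The permutation logic behind the global statistic can be sketched in a few lines of `numpy`. The chain-shaped contiguity matrix and the smoothly increasing effect vector below are toy assumptions chosen to produce obvious clustering; they are not the RRC district adjacency or the estimated effects. The two-sided pseudo p-value is one common convention, and other implementations may report a one-sided value.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and spatial weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

def moran_permutation_test(x, W, n_perm=999, seed=0):
    """Observed Moran's I and a two-sided permutation pseudo p-value."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    # Reference distribution: shuffle values across locations, keep W fixed.
    sims = np.array([morans_i(rng.permutation(x), W) for _ in range(n_perm)])
    center = sims.mean()
    extreme = np.sum(np.abs(sims - center) >= np.abs(observed - center))
    return observed, (extreme + 1) / (n_perm + 1)

# Toy contiguity: 13 units arranged in a chain, neighbors share a border.
n = 13
W = np.zeros((n, n))
for k in range(n - 1):
    W[k, k + 1] = W[k + 1, k] = 1.0

# Toy effects that rise smoothly along the chain (strong positive clustering).
effects = np.arange(n, dtype=float)

observed, p_value = moran_permutation_test(effects, W)
```

With real district effects, `W` would come from the district polygons (for example, a shared-border rule), and a value of I near its permutation mean would indicate no global clustering.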

 All regression models (Models 1-3) use district-clustered standard errors.

@@ -76,18 +85,18 @@ Figure 1 visualizes these system-level changes across the regulatory pipeline. T

 **Model 1 (timing outcome):**

- `post_2019` level shift: **0.1514**, p=0.3294.
- `post_trend` slope shift: **-0.3603**, p=0.0010.
+- Immediate post-2019 level shift: **0.1514**, p=0.3294.
+- Post-2019 slope shift: **-0.3603**, p=0.0010.

 Interpretation: no statistically significant immediate level break in 2019, but a significant post-policy acceleration trend in enforcement timing.

 **Table 1. Core policy-year and moderator estimates**

-| Model | Parameter | Coefficient | P-value | Interpretation |
+| Model | Effect term | Coefficient | P-value | Interpretation |
 | :--- | :--- | ---: | ---: | :--- |
-| Model 1 (All districts, interrupted panel) | `post_2019` | 0.1514 | 0.3294 | No immediate level break |
-| Model 1 (All districts, interrupted panel) | `post_trend` | -0.3603 | 0.0010 | Significant post-policy acceleration trend |
-| Model 3 (District heterogeneity + offshore) | `post_2019:offshore_jurisdiction` | 0.3819 | <0.001 | Offshore districts relatively slower post-policy timing |
+| Model 1 (All districts, interrupted panel) | Immediate post-2019 level shift | 0.1514 | 0.3294 | No immediate level break |
+| Model 1 (All districts, interrupted panel) | Post-2019 annual trend shift | -0.3603 | 0.0010 | Significant post-policy acceleration trend |
+| Model 3 (District heterogeneity + offshore) | Offshore-by-post-policy differential | 0.3819 | <0.001 | Offshore districts relatively slower post-policy timing |

 Table 1 provides the baseline inferential results for the article’s identification strategy. The table shows that the main all-district effect appears in the post-policy slope term rather than a one-time post-2019 level break, and it also shows that offshore jurisdiction remains a statistically important differential once district heterogeneity is modeled.

@@ -136,7 +145,7 @@ The map indicates that large positive and negative effects coexist across region

 In the conditional heterogeneity model (Model 3):

- `post_2019:offshore_jurisdiction = 0.3819`, p<0.001.
+- Offshore-by-post-policy differential = **0.3819**, p<0.001.

 This indicates that, net of district-specific post effects, offshore-jurisdiction districts experience relatively slower post-policy enforcement timing.

@@ -159,14 +168,14 @@ Overall, H3 receives limited support except partial geology effects.

 **Table 3. Structural moderator tests**

-| Hypothesis | Term | Coefficient | P-value | Result |
+| Hypothesis | Moderator term | Coefficient | P-value | Result |
 | :--- | :--- | ---: | ---: | :--- |
-| H3a Capacity | `post_2019:high_capacity` | -0.0188 | 0.9415 | Not supported |
-| H3b Baseline performance | `post_2019:low_baseline_compliance` | -0.0884 | 0.7144 | Not supported |
-| H3c EJ context | `post_2019:high_eji` | 0.1818 | 0.4866 | Not supported |
-| H3e Border proximity | `post_2019:border_competition` | -0.3626 | 0.1669 | Not supported |
-| H3f Rurality | `post_2019:high_rural` | 0.2213 | 0.4649 | Not supported |
-| H3d Geology | `C(primary_basin):post_2019` | Mixed | Mixed | Partial support |
+| H3a Capacity | High-capacity district x post-policy | -0.0188 | 0.9415 | Not supported |
+| H3b Baseline performance | Low-baseline-compliance district x post-policy | -0.0884 | 0.7144 | Not supported |
+| H3c EJ context | High-EJ district x post-policy | 0.1818 | 0.4866 | Not supported |
+| H3e Border proximity | Border-proximity district x post-policy | -0.3626 | 0.1669 | Not supported |
+| H3f Rurality | High-rurality district x post-policy | 0.2213 | 0.4649 | Not supported |
+| H3d Geology | Basin category x post-policy interactions | Mixed | Mixed | Partial support |

 Table 3 summarizes why structural accounts are only partially successful in this run: most moderators are imprecisely estimated, while geology shows selective basin-specific effects. Figure 5 and Figure 6 then provide visual context for these moderator patterns.

@@ -218,7 +227,7 @@ Across variants, the post-policy **slope** result is more stable than the immedi

 **Table 4. Robustness summary (interrupted panel framework)**

-| Check | `post_2019` (p) | `post_trend` (p) | Read |
+| Check | Immediate post-policy level effect (p) | Post-policy trend effect (p) | Read |
 | :--- | :--- | :--- | :--- |
 | Full sample | 0.1514 (0.3294) | -0.3603 (0.0010) | Slope effect robust; level break weak |
 | Exclude extreme districts | 0.1917 (0.1930) | -0.2972 (0.0133) | Slope remains significant |
@@ -234,3 +243,21 @@ Table 4 consolidates robustness evidence in one place: level-shift estimates are
 The transparency reform is associated with a gradual statewide acceleration in enforcement timing rather than a single immediate break at implementation. At the same time, district responses diverge sharply, confirming bureaucratic heterogeneity. Offshore jurisdiction explains a meaningful share of that heterogeneity once district-specific post effects are included, while most other structural moderators are weak or inconsistent in this run. Spatial diffusion across neighboring districts is not supported by global autocorrelation tests.

 These findings suggest that transparency reforms in decentralized regulatory systems should be evaluated as dynamic, district-conditioned processes, not monolithic statewide shocks.
+
+### References
+
+Anselin, L. (1995). Local Indicators of Spatial Association—LISA. *Geographical Analysis*, 27(2), 93-115.
+
+Bernal, J. L., Cummins, S., & Gasparrini, A. (2017). Interrupted time series regression for the evaluation of public health interventions: A tutorial. *International Journal of Epidemiology*, 46(1), 348-355.
+
+Biglan, A., Ary, D., & Wagenaar, A. C. (2000). The Value of Interrupted Time-Series Experiments for Community Intervention Research. *Prevention Science*, 1(1), 31-49.
+
+de Chaisemartin, C., & D'Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. *American Economic Review*, 110(9), 2964-2996.
+
+Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. *Journal of Econometrics*, 225(2), 254-277.
+
+Linden, A. (2015). Conducting interrupted time-series analysis for single- and multiple-group comparisons. *The Stata Journal*, 15(2), 480-500.
+
+Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. *Proceedings of the 9th Python in Science Conference*, 57-61.
+
+Sun, L., & Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. *Journal of Econometrics*, 225(2), 175-199.