# Appendix: Heterogeneous Enforcement of Transparency

## Evidence from the Texas Railroad Commission

## Appendix A. Data Construction and Variables

### A1. Data integration

The analysis combines inspection and violation administrative records (2015-2025) into a district-year panel. The estimation sample is a balanced panel of 143 district-year observations: 13 districts observed over 11 years, with 52 pre-policy observations (2015-2018) and 91 post-policy observations (2019-2025). Well-level records are linked across sources before district-year aggregation.

### A1b. Pipeline volume and sample flow

| Stage | Count |
| :--- | ---: |
| Well records loaded (well universe table) | 1,010,432 |
| Inspection records loaded (all available years) | 1,878,764 |
| Violation records loaded (all available years) | 193,338 |
| Inspection records retained (2015-2025) | 1,867,859 |
| Violation records retained (2015-2025) | 191,762 |
| District-year panel observations | 143 |
| Districts represented | 13 |

These counts show that district-year inference rests on very large underlying administrative record streams. The 2015-2025 window restriction removes 10,905 inspection records (about 0.6%) and 1,576 violation records (about 0.8%).

### A2. Core variables

| Measure | Definition |
| :--- | :--- |
| Enforcement delay (logged) | Log of district-year mean days from violation discovery to enforcement action |
| Resolution on re-inspection | Share of violations compliant on re-inspection |
| Inspection compliance rate | Share of inspections marked compliant |
| Violations per inspection | Total violations divided by inspections |
| Post-policy period indicator | Indicator for years >= 2019 |
| Post-policy trend term | Piecewise linear trend after policy (`max(year-2018,0)`) |
| Offshore jurisdiction indicator | Indicator for districts 02/03/04 |
| High-capacity indicator | District above median pre-policy inspection volume |
| Low-baseline-compliance indicator | District below median pre-policy compliance |
| High-EJ indicator | District above median EJ score |
| High-rurality indicator | District above median RUCA |
| Border-proximity indicator | Operationalized border-proximity indicator |
| Dominant basin category | Dominant basin category |
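
The timing and policy terms in the table can be sketched in a few lines. This is a minimal illustration only: the tuple layout of the violation rows, the `inspection_counts` mapping, and the names `post_terms`, `district_year_panel`, and `post_trend` are assumptions made for the example, not the project's actual pipeline.

```python
import math
from collections import defaultdict

POLICY_YEAR = 2019  # first post-policy year, per the table above


def post_terms(year, policy_year=POLICY_YEAR):
    """Return (post indicator, piecewise post-policy trend) for a calendar year."""
    post = 1 if year >= policy_year else 0
    post_trend = max(year - (policy_year - 1), 0)  # max(year-2018, 0) for 2019
    return post, post_trend


def district_year_panel(violations, inspection_counts):
    """Aggregate violation-level rows into district-year measures.

    violations: iterable of (district, year, delay_days) tuples (hypothetical layout).
    inspection_counts: dict mapping (district, year) -> number of inspections.
    Only district-years with at least one violation row appear in the result.
    """
    delays = defaultdict(list)
    for district, year, delay_days in violations:
        delays[(district, year)].append(delay_days)

    panel = {}
    for key, values in delays.items():
        _, year = key
        post, post_trend = post_terms(year)
        mean_delay = sum(values) / len(values)
        panel[key] = {
            # Log of the district-year mean (not the mean of logs); needs mean > 0.
            "log_delay": math.log(mean_delay),
            "violations_per_inspection": len(values) / inspection_counts[key],
            "post_2019": post,
            "post_trend": post_trend,
        }
    return panel
```

Taking the log of the district-year mean, rather than averaging logged delays, follows the definition in the first row of the table; the two are not interchangeable when delays are skewed.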

## Appendix B. Econometric Specifications

The specification sequence follows the main text: a common-shock interrupted panel for H1, district-specific post-policy heterogeneity for H2, an offshore moderator for H5, and a global spatial autocorrelation diagnostic for H4. Because all districts are exposed in the same policy year, heterogeneity is modeled through district-by-post interactions rather than staggered-adoption treatment-timing estimators.

### B1. Interrupted panel (all districts; H1)

$$
Y_{dt}=\alpha_d + \beta_1 \text{YearNum}_t + \beta_2 \text{Post2019}_t + \beta_3 \text{PostTrend}_t + \varepsilon_{dt}
$$

### B2. District heterogeneity (H2)

$$
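Y_{dt}=\alpha_d + \gamma_t + \sum_k \theta_k (\mathbf{1}[d=k]\times \text{Post2019}_t) + \varepsilon_{dt}
$$

Here the sum runs over districts $k$, and $\mathbf{1}[d=k]$ equals 1 when observation $d$ belongs to district $k$, so the summation index is distinct from the observation index.

### B3. Offshore moderation (H5)

$$
Y_{dt}=\alpha_d + \gamma_t + \sum_k \theta_k (\mathbf{1}[d=k]\times \text{Post2019}_t) + \phi(\text{Post2019}_t\times \text{Offshore}_d) + \varepsilon_{dt}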
$$

All models report district-clustered standard errors.
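
A short worked identity helps when reading the B2 and B3 coefficients. Writing $\mathbf{1}[d=k]$ for the indicator that observation $d$ belongs to district $k$, and taking the offshore districts to be 02, 03, and 04 as defined in A2, the interaction terms satisfy:

$$
\sum_{k} \mathbf{1}[d=k]\,\text{Post2019}_t = \text{Post2019}_t,
\qquad
\text{Post2019}_t \times \text{Offshore}_d = \sum_{k \in \{02,03,04\}} \mathbf{1}[d=k]\,\text{Post2019}_t
$$

The first identity means a full set of district-by-post terms sums to a pure function of the year, which the year effects $\gamma_t$ already absorb, so at least one normalization (for example, a reference district) is needed. The second means the offshore-by-post term lies in the span of the district-by-post terms, so $\phi$ is separately identified only under a further restriction on the offshore districts' $\theta_k$. This appendix does not state which normalization the estimation run applies, so the district and offshore coefficients are best read relative to whatever reference the software selected.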

### B4. Spatial diagnostic (H4)

H4 is tested using permutation-based global Moran's I on estimated district treatment effects, using a manually specified district contiguity matrix and 5,000 random permutations for inference.
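
The permutation procedure can be sketched as follows. This is a minimal standard-library illustration, not the project's code: the 4-district chain and the effect values are made up, the function names are hypothetical, and the two-sided pseudo p-value convention is an assumption, since the appendix does not say which convention the run used.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


def permutation_test(values, weights, n_perm=5000, seed=0):
    """Return (observed I, two-sided pseudo p-value) from random relabeling."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    expected = -1.0 / (len(values) - 1)  # E[I] under no spatial autocorrelation
    shuffled = list(values)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(morans_i(shuffled, weights) - expected) >= abs(observed - expected):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical 4-district chain (0-1-2-3); NOT the Railroad Commission contiguity matrix.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
effects = [1.0, 2.0, 3.0, 4.0]  # made-up values rising smoothly along the chain
i_obs, p_value = permutation_test(effects, chain, n_perm=999)
```

Shuffling the effect values across districts while holding the contiguity matrix fixed is what makes the reference distribution reflect "no spatial structure" for these particular estimates.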

## Appendix C. Main Run Outputs

### C1. H1 (all-district timing outcome)

| Effect term | Coefficient | P-value |
| :--- | ---: | ---: |
| Immediate post-2019 level shift | 0.1514 | 0.3294 |
| Post-2019 annual trend shift | -0.3603 | 0.0010 |

Interpretation: the immediate level shift is not significant (p = 0.3294), while the post-policy trend shift is negative and significant (p = 0.0010), so enforcement delay falls faster after 2019 than the pre-policy trend implies.
Substantively, this table supports the main-text conclusion that the policy effect is best characterized as gradual acceleration through the enforcement pipeline rather than a single break at policy adoption.
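
To see what the two C1 terms imply jointly, the fitted departure from the extrapolated pre-policy trend in a post-policy year is the level shift plus the trend shift times `max(year-2018,0)`. The sketch below evaluates that expression with the C1 point estimates; the function name is hypothetical, and the result ignores sampling uncertainty (the level term is not significant on its own).

```python
LEVEL_SHIFT = 0.1514   # immediate post-2019 level shift from the C1 table
TREND_SHIFT = -0.3603  # post-2019 annual trend shift from the C1 table


def implied_shift(year, policy_year=2019):
    """Fitted departure from the pre-policy trend line, in units of the outcome."""
    if year < policy_year:
        return 0.0
    years_since = year - (policy_year - 1)  # PostTrend = max(year-2018, 0)
    return LEVEL_SHIFT + TREND_SHIFT * years_since


for year in range(2019, 2026):
    print(f"{year}: {implied_shift(year):+.4f}")
```

Because the positive level term is outweighed by one year of the trend term, the implied departure is already negative in 2019 and grows linearly afterward.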

### C2. Event-study decomposition (all districts; ref=2018)

| Year | Coefficient | P-value |
| :--- | ---: | ---: |
| 2015 | -0.4592 | 0.0658 |
| 2016 | -0.3359 | 0.1615 |
| 2017 | -0.0385 | 0.7502 |
| 2019 | -0.1149 | 0.2843 |
| 2020 | -0.1666 | 0.3878 |
| 2021 | -0.4192 | 0.1072 |
| 2022 | -0.5853 | 0.0333 |
| 2023 | -0.4899 | 0.1160 |
| 2024 | -0.7829 | 0.0057 |
| 2025 | -1.4800 | <0.001 |

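Pre-policy years are jointly non-significant in this decomposition, although the 2015 coefficient is individually marginal (p = 0.0658).
Because all districts are exposed in the same year, this pattern is a check on pre-policy dynamics relative to 2018 rather than a parallel-trends test against an untreated group. Post-policy coefficients grow in magnitude in later years, consistent with delayed organizational adaptation.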
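
If the event-study outcome is the logged enforcement delay from A2 (an assumption here, since C2 does not restate the outcome), each coefficient can be converted from log points to an approximate percent difference relative to 2018. The helper name below is hypothetical, and the conversion applies to point estimates only.

```python
import math

# Event-study coefficients from the C2 table (reference year 2018).
EVENT_STUDY = {
    2015: -0.4592, 2016: -0.3359, 2017: -0.0385,
    2019: -0.1149, 2020: -0.1666, 2021: -0.4192, 2022: -0.5853,
    2023: -0.4899, 2024: -0.7829, 2025: -1.4800,
}


def log_points_to_percent(coef):
    """Convert a log-point difference to a percent difference."""
    return 100.0 * (math.exp(coef) - 1.0)


percent_vs_2018 = {year: log_points_to_percent(c) for year, c in EVENT_STUDY.items()}

for year, pct in sorted(percent_vs_2018.items()):
    print(f"{year}: {pct:+.1f}% relative to 2018")
```

The exponential conversion matters most for the large later-year coefficients, where reading log points directly as percentages would overstate the decline.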

### C2b. H2 omnibus heterogeneity test

- Wald chi-square (all district-by-post terms = 0): 0.670
- P-value: 0.4130

This omnibus test is not statistically significant in the current run, so district heterogeneity is interpreted primarily from the dispersion of district-specific estimates and mapped effect magnitudes.
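
One way to sanity-check a reported Wald statistic is to recompute its p-value under candidate degrees of freedom. The sketch below uses closed-form chi-square tail probabilities from the standard library; the function names are hypothetical, and the choice of 12 degrees of freedom (13 districts minus one reference) is purely illustrative. The check does not establish which restriction the run actually tested.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_even_df(x, df):
    """Upper-tail probability for an even number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    partial_sum = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * partial_sum


wald = 0.670  # statistic reported above
p_df1 = chi2_sf_df1(wald)           # close to the reported 0.4130
p_df12 = chi2_sf_even_df(wald, 12)  # close to 1 under a 12-restriction test
print(f"df=1: {p_df1:.4f}, df=12: {p_df12:.4f}")
```

The reported statistic and p-value line up with a single degree of freedom, so it is worth confirming in the run output how many district-by-post restrictions the omnibus test imposed.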

### C3. Offshore differential annual effects (ref=2018)

| Year | Offshore differential coef | P-value |
| :--- | ---: | ---: |
| 2019 | 0.3479 | 0.1581 |
| 2020 | 0.1796 | 0.7089 |
| 2021 | 0.9121 | 0.1095 |
| 2022 | 0.7532 | 0.0652 |
| 2023 | 0.9166 | 0.0325 |
| 2024 | 1.0693 | 0.0280 |
| 2025 | 0.7233 | 0.2091 |

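These estimates indicate that offshore jurisdictions diverge from non-offshore districts in specific post years rather than uniformly across the entire post period. Only the 2023 and 2024 differentials are significant at the 5% level (p = 0.0325 and p = 0.0280); the 2022 differential is marginal (p = 0.0652), and the remaining years are not significant.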

### C4. H5 offshore moderator (conditional model)

| Effect term | Coefficient | P-value |
| :--- | ---: | ---: |
| Offshore-by-post-policy differential | 0.3819 | <0.001 |

See Figure 4 in the main text (`district_treatment_effects_map_psj.png`) for the geographic distribution of district treatment effects.
Read alongside C3, this pooled interaction should be interpreted as an average offshore differential in the post period after district heterogeneity is already modeled, not as a claim that offshore status is the dominant driver of all district variation.

### C5. H3 moderator tests

Main block:

- H3a Capacity: -0.0188 (p=0.9415)
- H3b Baseline performance: -0.0884 (p=0.7144)
- H3e Border proximity: -0.2768 (p=0.3082)
- H5 (same block estimate): 0.6317 (p=0.1055)

Deep-dive block:

- H3c EJ: 0.1818 (p=0.4866)
- H3f Rurality: 0.2213 (p=0.4649)
- H3e Border proximity: -0.3626 (p=0.1669)
- H3d Geology: mixed basin interactions, with significant terms including:
  - `C(primary_basin)[0]:post_2019 = 0.5322` (p<0.001)
  - `C(primary_basin)[3]:post_2019 = -0.5707` (p<0.001)

Taken together, these moderator results imply that broad structural covariates provide limited explanatory leverage in this run, while basin composition remains the clearest structural correlate of differential policy response.

## Appendix D. Spatial Test (H4)

Moran's I on district treatment effects:

- Moran’s I = -0.0493
- Permutation p-value = 0.8550

Conclusion: no significant global spatial autocorrelation.
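Moran’s I is slightly negative and close to zero, and it sits near its expected value under no spatial autocorrelation, -1/(n-1), which is about -0.083 for 13 districts. There is therefore no evidence that high- or low-response districts are systematically clustered in ways consistent with regional diffusion.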

## Appendix E. Robustness Tables

### E1. Placebo policy years (all-district interrupted model)

| Placebo year | Estimated level shift | P-value |
| :--- | ---: | ---: |
| 2017 | 0.6565 | 0.0020 |
| 2021 | -0.0245 | 0.9191 |

The significant 2017 placebo estimate suggests that single-cut timing designs can produce spurious break effects, which is why the main analysis emphasizes trend-change evidence and event-study diagnostics instead of level shifts alone.

### E2. Alternative outcomes (all-district interrupted model)

| Outcome | Immediate post-policy level effect (p) | Post-policy trend effect (p) |
| :--- | :--- | :--- |
| Resolution rate | 4.3721 (0.2104) | -2.9371 (0.1424) |
| Compliance rate | -0.1311 (0.9316) | -0.5562 (0.1870) |
| Violations per inspection | -0.0082 (0.6690) | 0.0106 (0.0600) |

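None of the level or trend terms for these alternative outcomes is significant at the 5% level; the closest is the trend term for violations per inspection (p = 0.0600). Timing acceleration therefore does not mechanically translate into improvements across compliance-oriented outcomes in the same period, which points to outcome-specific channels of policy response.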

### E3. Sample restrictions (all-district interrupted model)

| Restriction | Immediate post-policy level effect (p) | Post-policy trend effect (p) |
| :--- | :--- | :--- |
| Full sample | 0.1514 (0.3294) | -0.3603 (0.0010) |
| Exclude extreme districts | 0.1917 (0.1930) | -0.2972 (0.0133) |
| Exclude 2015-2016 | 0.1942 (0.1958) | -0.2313 (0.0950) |
| Exclude 2020-2021 | 0.1516 (0.2959) | -0.3599 (0.0016) |

Across restrictions, the post-trend estimate remains negative and is significant at the 5% level in three of the four samples; excluding 2015-2016 weakens it to p = 0.0950. The post level term is not significant in any sample. This stability is central to the article's interpretation of gradual policy-induced acceleration.

### E4. Specification sensitivity

| Specification | Immediate post-policy level effect | Post-policy trend effect |
| :--- | :--- | :--- |
| Linear interrupted | -41.9298 (p=0.3104) | -67.0420 (p=0.0100) |
| Winsorized interrupted | 0.2137 (p=0.1021) | -0.3147 (p=0.0016) |
| Year FE + district post terms | 13 interaction terms | N/A |

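Specification checks point to the same empirical ordering: the trend term is negative and significant in both the linear and winsorized specifications, while the level term is not significant in either. The linear row is reported on a different scale from the other interrupted-model estimates, so only sign and significance are comparable across rows. The year fixed-effects specification carries district heterogeneity through 13 district-specific post terms rather than a single pooled trend term.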

## Appendix F. Interpretation Notes

1. The strongest system-wide evidence in this run is a post-policy slope change, not a one-time 2019 level shift.
2. District heterogeneity appears in the dispersion of district-specific estimates, but the omnibus test in C2b does not reject equal district effects (p = 0.4130).
3. The offshore differential is significant in the conditional model (C4) but not in the H3 moderator block (C5, p = 0.1055), and the significant 2017 placebo estimate (E1) indicates caution in purely timing-based causal claims.
4. Spatial diffusion is not supported by global autocorrelation tests.