# Heterogeneous Enforcement of Transparency: Evidence from the Texas Railroad Commission
## Introduction
How does transparency alter regulatory enforcement in high-capacity but locally discretionary bureaucracies? We study the January 2019 Texas Railroad Commission (RRC) disclosure change that made well-level violation information publicly searchable. The policy constitutes a statewide transparency shock, but implementation and enforcement remain district-administered. This setting allows us to test both system-wide effects and district-level heterogeneity in policy response.
While targeted transparency is increasingly utilized as a regulatory tool to improve accountability, its actual impact is mediated by the bureaucratic discretion of local field offices. Because policy implementation often experiences a lag, we utilize an Interrupted Time Series design to capture gradual enforcement acceleration, while explicitly modeling the structural, spatial, and demographic factors that drive street-level bureaucratic heterogeneity.
Our core empirical finding is a two-part pattern. First, we find evidence of gradual post-policy acceleration in enforcement timing at the statewide level (significant post-policy trend improvement) rather than a sharp immediate level break in 2019. Second, district-level responses are strongly heterogeneous, and offshore-jurisdiction districts (02/03/04) exhibit systematically different post-policy dynamics once district-specific post effects are modeled.
## Theory and Hypotheses
Transparency may alter enforcement through reputational, political, and managerial channels. Public disclosure can increase the salience of noncompliance and create incentives for agencies to accelerate case movement through the regulatory pipeline. But local implementation discretion can mediate this effect, producing district-level divergence.
We test:
- **H1 (Regulatory Pipeline Acceleration)**
  - **H1a:** Disclosure reduces time from violation discovery to enforcement action.
  - **H1b:** Disclosure improves compliance verification (resolution on re-inspection).
- **H2 (Bureaucratic Heterogeneity):** Post-policy effects vary across districts.
- **H3 (Structural Moderators):** Capacity, baseline performance, EJ context, geology, border proximity, and rurality explain variation.
- **H4 (Spatial Dynamics):** District treatment effects are spatially autocorrelated.
- **H5 (Offshore Jurisdiction Moderator):** Districts 02/03/04 exhibit differential post-2019 response.
## Data and Measures
We construct a district-year panel (2015-2025, 13 RRC districts) from administrative inspection and violation records. Well-level records are linked across sources prior to district-year aggregation.
Primary outcomes:
- Enforcement delay: the logged district-year mean number of days from violation discovery to enforcement action.
- Resolution on re-inspection: the district-year share of violations marked compliant at re-inspection.
- Inspection compliance rate: the district-year share of inspections marked compliant.
- Violations per inspection: total violations divided by total inspections in each district-year.
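As a rough illustration of how the first two outcomes could be aggregated from linked violation records, the sketch below uses `pandas`. The column names (`discovery_date`, `enforcement_date`, `compliant_on_reinspection`) and the toy rows are hypothetical stand-ins, not the actual RRC schema or data.

```python
import numpy as np
import pandas as pd

# Hypothetical violation-level records; column names and values are illustrative only.
violations = pd.DataFrame({
    "district": ["01", "01", "01", "02", "02"],
    "year": [2018, 2018, 2019, 2018, 2018],
    "discovery_date": pd.to_datetime(
        ["2018-01-10", "2018-03-01", "2019-02-01", "2018-05-01", "2018-06-15"]),
    "enforcement_date": pd.to_datetime(
        ["2018-04-20", "2018-09-17", "2019-04-02", "2018-08-09", "2018-12-12"]),
    "compliant_on_reinspection": [True, False, True, True, False],
})

def build_panel(df):
    """Aggregate violation records to a district-year panel."""
    df = df.copy()
    # Days from violation discovery to enforcement action.
    df["delay_days"] = (df["enforcement_date"] - df["discovery_date"]).dt.days
    panel = (
        df.groupby(["district", "year"])
          .agg(mean_delay_days=("delay_days", "mean"),
               resolution_rate=("compliant_on_reinspection", "mean"),
               n_violations=("delay_days", "size"))
          .reset_index()
    )
    # Enforcement delay outcome: log of the district-year mean delay.
    panel["log_delay"] = np.log(panel["mean_delay_days"])
    return panel

panel = build_panel(violations)
```

The inspection-based outcomes (compliance rate and violations per inspection) would need the inspection file as well and are omitted from this sketch.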
## Empirical Strategy
To evaluate the January 2019 transparency reform, we pair an all-district interrupted panel design with district-specific heterogeneity models and a spatial dependence diagnostic. This sequence matches the hypotheses: H1 tests system-wide timing change, H2 tests district divergence, H5 tests offshore moderation, and H4 tests whether estimated district effects are spatially clustered. Administrative records are extracted from a PostgreSQL backend, linked across inspection and violation files at the well level, and aggregated to district-year panels in Python (`pandas`, `numpy`). Estimation is conducted with `statsmodels` (with `scipy` for auxiliary tests); figures are produced with `matplotlib`/`seaborn` and district map joins use `geopandas`. The H4 spatial test uses a permutation-based global Moran's I computed from district contiguity weights.
### Model 1: All-district policy-year shift (H1)
$$
Y_{dt} = \alpha_d + \beta_1 \mathrm{YearNum}_t + \beta_2 \mathrm{Post2019}_t + \beta_3 \mathrm{PostTrend}_t + \varepsilon_{dt}
$$
where $\mathrm{PostTrend}_t = \max(0, t - 2018)$. This distinguishes an immediate post-2019 level shift ($\beta_2$) from a post-policy slope change ($\beta_3$).
This follows interrupted time-series logic for a common policy shock, separating immediate and gradual responses (Biglan, Ary, & Wagenaar, 2000; Bernal, Cummins, & Gasparrini, 2017; Linden, 2015).
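A minimal sketch of the Model 1 design terms, assuming `Post2019` equals 1 for years 2019 and later and that district fixed effects enter as dummies. It fits plain least squares with `numpy` on synthetic data with made-up coefficients, so it illustrates the specification only; it is not the `statsmodels` estimation reported in the paper and does not reproduce any estimate.

```python
import numpy as np

def its_design(years, districts, break_year=2019):
    """Return the Model 1 design matrix and its column names."""
    years = np.asarray(years)
    districts = np.asarray(districts)
    labels = np.unique(districts)
    # District fixed effects alpha_d: one dummy per district, no separate intercept.
    fixed_effects = (districts[:, None] == labels[None, :]).astype(float)
    year_num = (years - years.min()).astype(float)                       # YearNum_t
    post = (years >= break_year).astype(float)                           # Post2019_t
    post_trend = np.maximum(0, years - (break_year - 1)).astype(float)   # max(0, t - 2018)
    X = np.column_stack([fixed_effects, year_num, post, post_trend])
    names = [f"district_{d}" for d in labels] + ["year_num", "post", "post_trend"]
    return X, names

# Synthetic 13-district, 2015-2025 panel; all coefficient values are invented.
rng = np.random.default_rng(0)
district_ids = np.repeat(np.arange(1, 14), 11)
panel_years = np.tile(np.arange(2015, 2026), 13)
X, names = its_design(panel_years, district_ids)

true_beta = {"year_num": 0.05, "post": 0.10, "post_trend": -0.30}
alpha = rng.normal(5.0, 0.3, size=13)
y = (alpha[district_ids - 1]
     + sum(true_beta[k] * X[:, names.index(k)] for k in true_beta)
     + rng.normal(0.0, 0.01, size=len(panel_years)))

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
estimates = dict(zip(names, coef))
```

Changing `break_year` gives the same design with a different cut point, which is one way a placebo policy year could be set up.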
### Model 2: District heterogeneity (H2)
$$
Y_{dt} = \alpha_d + \gamma_t + \sum_{d} \theta_d \bigl(\mathrm{District}_d \times \mathrm{Post2019}_t\bigr) + \varepsilon_{dt}
$$
This yields district-specific post-policy effects and a joint heterogeneity test.
Because all districts are exposed in the same year, this is not a staggered-adoption DiD problem. Still, recent DiD work highlights that pooled average effects can mask meaningful treatment-effect heterogeneity, so we estimate district-specific post effects directly rather than rely on a single pooled interaction (de Chaisemartin & D'Haultfœuille, 2020; Goodman-Bacon, 2021; Sun & Abraham, 2021).
### Model 3: Offshore moderation (H5)
$$
Y_{dt} = \alpha_d + \gamma_t + \sum_{d} \theta_d \bigl(\mathrm{District}_d \times \mathrm{Post2019}_t\bigr) + \phi\bigl(\mathrm{Post2019}_t \times \mathrm{Offshore}_d\bigr) + \varepsilon_{dt}
$$
where $\mathrm{Offshore}_d = 1$ for districts 02/03/04 and 0 otherwise.
This specification tests whether offshore-regulating districts differ systematically from other districts after controlling for district-specific post-policy shifts.
### Spatial diagnostic (H4)
After estimating district treatment effects, we test for global spatial autocorrelation using permutation-based Moran's I (Anselin, 1995). This assesses whether high- and low-response districts are geographically clustered in ways consistent with diffusion or regional administrative spillovers.
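The permutation logic can be sketched as follows. This version assumes a binary, symmetric contiguity matrix without row-standardization and a two-sided pseudo p-value measured against the null expectation $-1/(n-1)$; the weighting scheme and p-value convention behind the reported test may differ. The six-unit chain is a toy layout, not the RRC district map.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one value per unit and an n x n weights matrix."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(weights, dtype=float)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

def morans_i_permutation(values, weights, n_perm=999, seed=0):
    """Observed Moran's I and a two-sided permutation pseudo p-value."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = morans_i(values, weights)
    # Shuffle values across units to break any spatial arrangement.
    simulated = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(n_perm)]
    )
    expected = -1.0 / (len(values) - 1)
    extreme = np.sum(np.abs(simulated - expected) >= abs(observed - expected))
    return observed, (extreme + 1) / (n_perm + 1)

# Toy layout: six units on a line, each contiguous with its immediate neighbors.
n = 6
chain = np.zeros((n, n))
for i in range(n - 1):
    chain[i, i + 1] = chain[i + 1, i] = 1.0

effects = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # smoothly increasing values
i_stat, p_value = morans_i_permutation(effects, chain)
```

For this toy gradient the statistic is 0.6, indicating positive autocorrelation; values near $-1/(n-1)$ indicate no global clustering.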
All models use district-clustered standard errors.
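For readers unfamiliar with clustered standard errors, the sketch below shows the basic sandwich form, summing score contributions within each district before taking outer products. It omits the small-sample adjustments that statistical packages typically apply, so its output would not match packaged results exactly; the data are synthetic.

```python
import numpy as np

def cluster_robust_se(X, resid, groups):
    """Cluster-robust (sandwich) standard errors without small-sample correction."""
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    groups = np.asarray(groups)
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        in_group = groups == g
        # Sum of x_i * u_i within the cluster, then its outer product.
        score = X[in_group].T @ resid[in_group]
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return np.sqrt(np.diag(cov))

# Synthetic panel: 13 clusters of 11 observations with a shared cluster shock.
rng = np.random.default_rng(1)
n_clusters, n_per_cluster = 13, 11
groups = np.repeat(np.arange(n_clusters), n_per_cluster)
x = rng.normal(size=n_clusters * n_per_cluster)
cluster_shock = rng.normal(size=n_clusters)[groups]
y = 1.0 + 0.5 * x + cluster_shock + rng.normal(size=x.size)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
se_cluster = cluster_robust_se(X, resid, groups)
```

When every observation is its own cluster, this reduces to the usual heteroskedasticity-robust estimator.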
## Results
### Descriptive pipeline trends
Pre/post means indicate lower average enforcement delay after the policy (174.3 to 112.3 days), but also less frequent inspections (more days between inspections).
Figure 1 visualizes these system-level changes across the regulatory pipeline. The key descriptive pattern is that timeliness improves over the post-policy period even as inspection cadence shifts, motivating a design that separates immediate policy breaks from post-policy trend effects.
![Regulatory Pipeline Trends](pipeline_trends_over_time.png)
**Figure 1.** Regulatory pipeline trends, 2015-2025.
### H1: Policy-year effects (all districts)
**Model 1 (timing outcome):**
- Immediate post-2019 level shift: **0.1514**, p=0.3294.
- Post-2019 slope shift: **-0.3603**, p=0.0010.
Interpretation: no statistically significant immediate level break in 2019, but a significant post-policy acceleration trend in enforcement timing.
**Table 1. Core policy-year and moderator estimates**
| Model | Effect term | Coefficient | P-value | Interpretation |
| :--- | :--- | ---: | ---: | :--- |
| Model 1 (All districts, interrupted panel) | Immediate post-2019 level shift | 0.1514 | 0.3294 | No immediate level break |
| Model 1 (All districts, interrupted panel) | Post-2019 annual trend shift | -0.3603 | 0.0010 | Significant post-policy acceleration trend |
| Model 3 (District heterogeneity + offshore) | Offshore-by-post-policy differential | 0.3819 | <0.001 | Offshore districts relatively slower post-policy timing |
Table 1 provides the baseline inferential results for the article's identification strategy. The table shows that the main all-district effect appears in the post-policy slope term rather than a one-time post-2019 level break, and it also shows that offshore jurisdiction remains a statistically important differential once district heterogeneity is modeled.
Event-study decomposition (relative to 2018) corroborates this dynamic pattern:
- No significant pre-policy years (2015-2017).
- Significant negative deviations in 2022, 2024, and 2025.
**Table 2. Event-study coefficients (all districts, reference year = 2018)**
| Year | Coefficient | P-value | Significant (p<0.05) |
| :--- | ---: | ---: | :--- |
| 2022 | -0.5853 | 0.0333 | Yes |
| 2024 | -0.7829 | 0.0057 | Yes |
| 2025 | -1.4800 | <0.001 | Yes |
Table 2 highlights the years where post-policy deviations are most pronounced. Substantively, these estimates indicate that the policy response intensifies over time instead of materializing immediately in 2019.
![Event Study](event_study_plot.png)
**Figure 2.** All-district event-study decomposition and offshore differential annual effects.
Figure 2 complements Table 2 by displaying the full time path (including pre-policy years), making the absence of pre-trend significance and the later post-policy acceleration visually transparent.
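The event-study decomposition can be sketched as a regression on district dummies plus one dummy per calendar year, omitting the 2018 reference year so that each year coefficient is a deviation from 2018. The example below is a simplified `numpy` version on noise-free synthetic data with invented year effects; it illustrates the dummy construction only and does not reproduce Table 2.

```python
import numpy as np

def event_study_design(years, districts, reference_year=2018):
    """District dummies plus year dummies, with the reference year omitted."""
    years = np.asarray(years)
    districts = np.asarray(districts)
    labels = np.unique(districts)
    fixed_effects = (districts[:, None] == labels[None, :]).astype(float)
    event_years = [int(y) for y in np.unique(years) if y != reference_year]
    year_dummies = np.column_stack(
        [(years == y).astype(float) for y in event_years]
    )
    return np.column_stack([fixed_effects, year_dummies]), event_years, len(labels)

# Synthetic 13-district, 2015-2025 panel.
district_ids = np.repeat(np.arange(1, 14), 11)
panel_years = np.tile(np.arange(2015, 2026), 13)
X, event_years, n_districts = event_study_design(panel_years, district_ids)

# Invented year effects: flat through 2018, then declining by 0.1 per year.
true_effect = {year: -0.1 * max(0, year - 2018) for year in range(2015, 2026)}
alpha = np.linspace(4.5, 5.5, 13)
y = alpha[district_ids - 1] + np.array([true_effect[int(t)] for t in panel_years])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
event_coefs = dict(zip(event_years, coef[n_districts:]))
```

In this setup, coefficients for 2015-2017 close to zero correspond to the absence of pre-trends, and increasingly negative post-2018 coefficients correspond to gradual acceleration.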
### H2: District heterogeneity
District-level post-policy responses are strongly heterogeneous and jointly significant. Estimated effects range from substantial acceleration (e.g., District 09) to substantial slowdown (e.g., Districts 03 and 04).
From district effect summaries used in mapping:
- Best improvement: District 09 (about -52.6%).
- Largest deterioration: District 04 (about +138.5%).
Figure 3 presents the estimated district-specific effects directly, while Figure 4 maps those effects geographically. Together they demonstrate that heterogeneity is not a minor perturbation around a common effect but a core empirical feature of the policy response.
![District Effects](district_treatment_effects_psj.png)
**Figure 3.** District-specific post-2019 treatment effects.
To show where these effects are concentrated geographically, Figure 4 maps district-level percent changes in enforcement timing.
![District Treatment Effects Map](district_treatment_effects_map_psj.png)
**Figure 4.** Geographic distribution of district treatment effects (percent change in days to enforcement).
**Figure note.** Districts are shaded by the estimated percent change in days to enforcement after 2019 (negative values indicate faster enforcement; positive values indicate slower enforcement), using a diverging scale centered at zero so improvements and slowdowns are visually comparable. Estimates come from the district-by-post model on logged enforcement delay and are converted to percent changes; district labels indicate RRC district codes. Magnitudes should be interpreted with the coefficient uncertainty reported in the corresponding model tables.
The map indicates that large positive and negative effects coexist across regions, reinforcing the need to model district-level discretion explicitly rather than assuming uniform policy implementation.
### H5: Offshore moderation
In the conditional heterogeneity model (Model 3):
- Offshore-by-post-policy differential = **0.3819**, p<0.001.
This estimand is the average post-2019 offshore differential conditional on district-specific post-policy effects. Because the outcome is logged enforcement delay, the coefficient implies an approximate $e^{0.3819} - 1 \approx 46.5\%$ relative increase in time-to-enforcement for offshore-regulating districts, holding the rest of the specification constant. Read jointly with the annual offshore differential results (Table C3 in the appendix), this pooled estimate should be interpreted as an average over uneven yearly effects, with the strongest divergences concentrated in 2023-2024. Substantively, H5 is supported as a structured heterogeneity result within the all-district analysis, but it should not be interpreted as isolating a single offshore causal mechanism, given that offshore jurisdiction is concentrated in districts 02/03/04.
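The conversion between a log-outcome coefficient and a percent change used above is simple arithmetic, sketched here with hypothetical helper names:

```python
import math

def log_coef_to_pct(beta):
    """Percent change in the outcome implied by a coefficient on a logged outcome."""
    return 100.0 * (math.exp(beta) - 1.0)

def pct_to_log_coef(pct):
    """Inverse conversion: log-scale coefficient implied by a percent change."""
    return math.log(1.0 + pct / 100.0)

offshore_pct = log_coef_to_pct(0.3819)  # approximately 46.5
```

Because the transformation is nonlinear, a log coefficient of a given size implies a larger percent change when positive than when negative, which matters when comparing district slowdowns with district improvements.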
### H3: Structural moderators
Main moderator block:
- H3a Capacity: coef -0.0188, p=0.9415.
- H3b Baseline performance: coef -0.0884, p=0.7144.
- H3e Border proximity: coef -0.2768, p=0.3082.
Deep-dive TWFE block:
- H3c EJ context: coef 0.1818, p=0.4866.
- H3f Rurality: coef 0.2213, p=0.4649.
- H3e Border proximity: coef -0.3626, p=0.1669.
- H3d Geology: mixed basin interactions; some terms significant (p<0.001).
Overall, H3 receives limited support except partial geology effects.
**Table 3. Structural moderator tests**
| Hypothesis | Moderator term | Coefficient | P-value | Result |
| :--- | :--- | ---: | ---: | :--- |
| H3a Capacity | High-capacity district x post-policy | -0.0188 | 0.9415 | Not supported |
| H3b Baseline performance | Low-baseline-compliance district x post-policy | -0.0884 | 0.7144 | Not supported |
| H3c EJ context | High-EJ district x post-policy | 0.1818 | 0.4866 | Not supported |
| H3e Border proximity | Border-proximity district x post-policy | -0.3626 | 0.1669 | Not supported |
| H3f Rurality | High-rurality district x post-policy | 0.2213 | 0.4649 | Not supported |
| H3d Geology | Basin category x post-policy interactions | Mixed | Mixed | Partial support |
Table 3 summarizes why structural accounts are only partially successful in this run: most moderators are imprecisely estimated, while geology shows selective basin-specific effects. Figure 5 and Figure 6 then provide visual context for these moderator patterns.
![Moderators](heterogeneous_effects.png)
**Figure 5.** Moderator interaction estimates.
![Demographics and Geography](district_demographics_geography.png)
**Figure 6.** Demographic/geographic correlates of district effects.
### H4: Spatial dynamics
Moran's I on district effects:
- $I = -0.0493$, permutation p=0.8550.
No evidence of statistically significant global spatial autocorrelation.
Figure 7 visually corroborates the spatial test by showing no systematic clustering pattern consistent with strong spillovers.
![Spatial Spillovers](spatial_spillovers.png)
**Figure 7.** Spatial spillover diagnostics.
## Robustness
### Placebo policy years (all-district interrupted model)
- 2017 placebo: coef 0.6565, p=0.0020.
- 2021 placebo: coef -0.0245, p=0.9191.
### Alternative outcomes (all-district interrupted model)
- Resolution rate: post 4.3721 (p=0.2104), post-trend -2.9371 (p=0.1424).
- Compliance rate: post -0.1311 (p=0.9316), post-trend -0.5562 (p=0.1870).
- Violations/inspection: post -0.0082 (p=0.6690), post-trend 0.0106 (p=0.0600).
### Sample restrictions
- Full sample: post 0.1514 (p=0.3294), post-trend -0.3603 (p=0.0010).
- Excluding extreme districts: post-trend remains negative/significant.
- Excluding 2015-2016: post-trend remains negative (weaker significance).
- Excluding 2020-2021: post-trend remains negative/significant.
### Specification sensitivity
- Linear interrupted model: post-trend -67.0420 days (p=0.0100).
- Winsorized interrupted model: post-trend -0.3147 (p=0.0016).
Across variants, the post-policy **slope** result is more stable than the immediate **level** effect.
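As an illustration of the winsorized variant, the sketch below caps extreme values at percentile cutoffs before modeling. The 5th/95th percentile cutoffs and the toy delays are assumptions made for the example; the thresholds used in the winsorized model are not stated here.

```python
import numpy as np

def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Clip values to the given percentile range (cutoffs are illustrative)."""
    values = np.asarray(values, dtype=float)
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)

# Toy enforcement delays in days with one extreme district-year.
delays = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 2000], dtype=float)
winsorized_delays = winsorize(delays)
```

Interior values are unchanged; only the tails are pulled in, which limits the influence of a single extreme district-year on the slope estimate.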
**Table 4. Robustness summary (interrupted panel framework)**
| Check | Immediate post-policy level effect (p) | Post-policy trend effect (p) | Read |
| :--- | :--- | :--- | :--- |
| Full sample | 0.1514 (0.3294) | -0.3603 (0.0010) | Slope effect robust; level break weak |
| Exclude extreme districts | 0.1917 (0.1930) | -0.2972 (0.0133) | Slope remains significant |
| Exclude 2015-2016 | 0.1942 (0.1958) | -0.2313 (0.0950) | Slope negative, marginal |
| Exclude 2020-2021 | 0.1516 (0.2959) | -0.3599 (0.0016) | Slope remains significant |
| Linear interrupted | -41.9298 (0.3104) | -67.0420 (0.0100) | Same directional pattern |
| Winsorized interrupted | 0.2137 (0.1021) | -0.3147 (0.0016) | Slope remains significant |
Table 4 consolidates robustness evidence in one place: level-shift estimates are sensitive, but the negative post-policy slope remains comparatively stable across sample restrictions and alternative functional forms.
## Discussion
The transparency reform is associated with a gradual statewide acceleration in enforcement timing rather than a single immediate break at implementation. At the same time, district responses diverge sharply, confirming bureaucratic heterogeneity. Offshore jurisdiction explains a meaningful share of that heterogeneity once district-specific post effects are included, while most other structural moderators are weak or inconsistent in this run. Spatial diffusion across neighboring districts is not supported by global autocorrelation tests.
These findings suggest that transparency reforms in decentralized regulatory systems should be evaluated as dynamic, district-conditioned processes, not monolithic statewide shocks.
## References
Anselin, L. (1995). Local Indicators of Spatial Association—LISA. *Geographical Analysis*, 27(2), 93-115.
Biglan, A., Ary, D., & Wagenaar, A. C. (2000). The Value of Interrupted Time-Series Experiments for Community Intervention Research. *Prevention Science*, 1(1), 31-49.
Bernal, J. L., Cummins, S., & Gasparrini, A. (2017). Interrupted time series regression for the evaluation of public health interventions: A tutorial. *International Journal of Epidemiology*, 46(1), 348-355.
de Chaisemartin, C., & D'Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. *American Economic Review*, 110(9), 2964-2996.
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. *Journal of Econometrics*, 225(2), 254-277.
Linden, A. (2015). Conducting interrupted time-series analysis for single- and multiple-group comparisons. *The Stata Journal*, 15(2), 480-500.
Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. *Proceedings of the 9th Python in Science Conference*, 57-61.
Sun, L., & Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. *Journal of Econometrics*, 225(2), 175-199.