Initial commit

2026-01-30 10:57:55 -08:00
commit 7b8890ed80
73 changed files with 14439530 additions and 0 deletions

analysis/gemini_draft.md Normal file

@@ -0,0 +1,122 @@
# Heterogeneous Enforcement of Transparency: Evidence from the Texas Railroad Commission
## Methods
### Data and Sample
To evaluate the impact of the January 2019 disclosure policy (Rule 8 compliance transparency) on regulatory enforcement, we constructed a novel panel dataset integrating administrative records from the Texas Railroad Commission (RRC) with demographic and geographic controls. The observation period spans January 2015 to December 2025, providing 4 years of pre-policy and 7 years of post-policy data.
**Administrative Data:** Our primary dataset is drawn from the RRC Resource Center (<https://www.rrc.texas.gov/resource-center/>). We use two record types:
1. **Inspection Records (N = 2,151,839):** Site-level inspection logs detailing the date, district office, operator, and compliance determination.
2. **Violation Records (N = 242,899):** Detailed violation events, including the date of discovery, rule violated, severity classification, and dates of subsequent enforcement actions (e.g., Notice of Violation, Severance, Sealing).
**Demographic and Geographic Controls:** To test for environmental justice implications and structural moderators, we merged well-level locations with:
1. **American Community Survey (ACS 2021):** Census-tract-level demographics assigned to each well, including Environmental Justice Index (EJI) scores, poverty rates, and median household income. The **EJI score** is constructed as a composite social vulnerability index, calculated as the mean of percentile-ranked indicators including minority population share, poverty rate, unemployment rate, linguistic isolation, and educational attainment. Higher scores indicate greater cumulative social vulnerability.
2. **Rural-Urban Commuting Area (RUCA):** 2020 codes to classify well locations as metropolitan, micropolitan, or rural.
3. **Geologic Basins:** Spatial joins identifying the major shale play (e.g., Permian, Eagle Ford) associated with each well.
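The composite described in item 1 can be sketched as follows. This is a minimal illustration, not the authors' code. The indicator names, the toy tract values, and the choice to enter education as the share of adults without a high school diploma (so that higher always means more vulnerable) are assumptions. The percentile rank is defined here as the share of tracts with a value at or below the tract's own.

```python
from statistics import mean

# Hypothetical indicator names; each is oriented so that higher = more vulnerable.
INDICATORS = [
    "minority_share",
    "poverty_rate",
    "unemployment_rate",
    "linguistic_isolation",
    "no_hs_diploma_share",  # assumed orientation for educational attainment
]


def percentile_ranks(values):
    """Share of observations at or below each value (ties share a rank)."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def eji_scores(tracts):
    """Mean of percentile-ranked indicators for each census tract."""
    tract_ids = list(tracts)
    ranked = {
        ind: percentile_ranks([tracts[t][ind] for t in tract_ids])
        for ind in INDICATORS
    }
    return {
        t: mean(ranked[ind][i] for ind in INDICATORS)
        for i, t in enumerate(tract_ids)
    }


# Toy tracts, fabricated for illustration only.
toy_tracts = {
    "tract_a": dict(zip(INDICATORS, [0.10, 0.05, 0.03, 0.01, 0.08])),
    "tract_b": dict(zip(INDICATORS, [0.25, 0.12, 0.05, 0.04, 0.15])),
    "tract_c": dict(zip(INDICATORS, [0.40, 0.20, 0.07, 0.09, 0.22])),
    "tract_d": dict(zip(INDICATORS, [0.70, 0.35, 0.11, 0.18, 0.31])),
}
scores = eji_scores(toy_tracts)
```

In this toy case the tract that ranks highest on every indicator receives the maximum score of 1.0, and the tract that ranks lowest on every indicator receives 0.25 (one of four tracts at or below its value).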
Data were aggregated to the district-month and district-year levels to align with the administrative structure of the RRC, which operates through 13 geographically distinct district offices.
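As a rough sketch of this aggregation step, the snippet below collapses site-level inspection rows into district-month cells and derives two of the outcome measures reported later (compliance rate and violations per inspection). The tuple layout and field names are hypothetical; the RRC files use their own schema, which is not reproduced here.

```python
from collections import defaultdict
from datetime import date


def district_month_panel(inspections):
    """Aggregate (district, inspection_date, compliant, n_violations) rows
    into one record per (district, year, month) cell."""
    cells = defaultdict(lambda: {"inspections": 0, "compliant": 0, "violations": 0})
    for district, when, compliant, n_violations in inspections:
        cell = cells[(district, when.year, when.month)]
        cell["inspections"] += 1
        cell["compliant"] += int(compliant)
        cell["violations"] += n_violations

    panel = {}
    for key, c in sorted(cells.items()):
        panel[key] = {
            "inspections": c["inspections"],
            "compliance_rate_pct": 100.0 * c["compliant"] / c["inspections"],
            "violations_per_inspection": c["violations"] / c["inspections"],
        }
    return panel


# Toy rows, fabricated for illustration only.
toy_inspections = [
    ("01", date(2019, 1, 5), True, 0),
    ("01", date(2019, 1, 20), False, 3),
    ("01", date(2019, 2, 2), True, 0),
    ("02", date(2019, 1, 9), False, 1),
]
panel = district_month_panel(toy_inspections)
```

The district-year panel follows by dropping the month from the cell key.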
### Hypotheses
Drawing on theories of bureaucratic reputation and transparency-as-regulation, we test four primary hypotheses:
* **H1 (Regulatory Pipeline Acceleration):** We conceptualize the enforcement process as a "regulatory pipeline" consisting of four distinct stages: (1) inspection, (2) violation discovery, (3) enforcement action, and (4) compliance verification. We hypothesize that the 2019 disclosure policy will accelerate the movement of violations through this pipeline, specifically by reducing the administrative delay between violation discovery and enforcement action ($H1a$) and increasing the rate of compliance verification ($H1b$).
* **H2 (Bureaucratic Heterogeneity):** The impact of the policy will vary significantly across administrative districts, reflecting local discretion rather than uniform implementation.
* **H3 (Structural Moderators):** Variation in policy response will be explained by district structural characteristics, specifically:
    * *Capacity ($H3a$):* High-capacity districts will be more responsive.
    * *Baseline Performance ($H3b$):* Low-compliance districts will improve most (catch-up effect).
    * *Environmental Justice ($H3c$):* Districts with higher Environmental Justice Index (EJI) scores will experience different policy impacts.
    * *Geology ($H3d$):* Enforcement patterns will vary by the dominant oil and gas basin.
    * *Border Proximity ($H3e$):* Districts bordering other states or Mexico will exhibit different policy responsiveness due to inter-jurisdictional competition.
    * *Rurality ($H3f$):* Rural districts (higher RUCA codes) will respond differently to the disclosure policy than metropolitan districts.
* **H4 (Spatial Dynamics):** We expect positive spatial autocorrelation in district-level treatment effects, indicating that neighboring districts will exhibit similar responsiveness due to regional diffusion of best practices.
### Econometric Strategy
We employ a Difference-in-Differences (DiD) framework with two-way fixed effects.
**Equation 1: Baseline Dynamic Effect (Event Study)**
$$Y_{dt} = \alpha + \sum_{k \neq 2018} \beta_k \mathbb{1}(Year_t = k) + \gamma_d + \epsilon_{dt}$$
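where $\gamma_d$ represents district fixed effects and 2018, the last pre-policy year, is the omitted reference year. The pre-2019 coefficients test the parallel trends assumption, and the post-2019 coefficients trace how the effect evolves over time.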
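As a worked reading of Equation 1 (a sketch that assumes an annual panel covering 2015 to 2025, as stated in the data section), the pre-trend check amounts to a joint test on the three pre-policy coefficients:

$$H_0:\ \beta_{2015} = \beta_{2016} = \beta_{2017} = 0$$

Each post-policy coefficient then measures the average change relative to 2018, net of district fixed effects:

$$\beta_k = \mathbb{E}[Y_{dt} \mid Year_t = k,\ \gamma_d] - \mathbb{E}[Y_{dt} \mid Year_t = 2018,\ \gamma_d], \qquad k \in \{2019, \dots, 2025\}$$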
**Equation 2: Heterogeneous Treatment Effects**
$$Y_{dt} = \alpha + \sum_{j=1}^{13} \lambda_j \left( \mathbb{1}(District_d = j) \times Post2019_t \right) + \delta_t + \gamma_d + \epsilon_{dt}$$
This estimates a separate treatment effect $\lambda_j$ for each of the 13 district offices, with $\delta_t$ denoting time fixed effects.
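For intuition about what a district-specific effect looks like, the sketch below computes a raw pre/post difference in mean log days to enforcement for each district and converts it to a percent change. This is a deliberately simplified stand-in, not the regression in Equation 2: it omits the time effects and uses no standard errors. The row layout and the toy numbers are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean


def district_effects(rows, policy_year=2019):
    """Percent change in days to enforcement per district, from the raw
    difference in mean log days (post minus pre). Rows are
    (district, year, days_to_enforcement) with days > 0."""
    pre, post = defaultdict(list), defaultdict(list)
    for district, year, days in rows:
        bucket = post if year >= policy_year else pre
        bucket[district].append(math.log(days))

    effects = {}
    # Only districts observed in both periods have a defined change.
    for district in sorted(pre.keys() & post.keys()):
        log_change = mean(post[district]) - mean(pre[district])
        effects[district] = 100.0 * (math.exp(log_change) - 1.0)
    return effects


# Toy rows, fabricated for illustration only.
toy_rows = [
    ("A", 2017, 100.0), ("A", 2018, 100.0), ("A", 2019, 50.0), ("A", 2020, 50.0),
    ("B", 2018, 100.0), ("B", 2019, 150.0),
]
effects = district_effects(toy_rows)
```

In the toy data, district A halves its days to enforcement (a change of about -50%) while district B slows down (about +50%), which is the kind of divergence a pooled average would mask.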
**Equation 3: Moderator Analysis (Triple Difference)**
$$Y_{dt} = \alpha + \beta_1 Post_t + \beta_2 (Post_t \times Moderator_d) + \delta_t + \gamma_d + \epsilon_{dt}$$
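This interacts the policy shock with each structural moderator (capacity, baseline compliance, EJI score, basin, border proximity, and rurality) to explain heterogeneity. The coefficient of interest is $\beta_2$; because $Post_t$ varies only over time, $\beta_1$ is absorbed when a full set of time effects $\delta_t$ is included.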
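To make the role of $\beta_2$ concrete, here is a short worked step under Equation 3 as written, assuming a balanced panel (the district labels $d$ and $d'$ are introduced only for illustration). Differencing expected outcomes after and before the policy, and then across two districts, removes the common terms and leaves:

$$\left(\mathbb{E}[Y_{d,post}] - \mathbb{E}[Y_{d,pre}]\right) - \left(\mathbb{E}[Y_{d',post}] - \mathbb{E}[Y_{d',pre}]\right) = \beta_2 \left(Moderator_d - Moderator_{d'}\right)$$

With a log outcome, a positive $\beta_2$ therefore means that districts with a higher moderator value saw a smaller proportional reduction (or a larger increase) in days to enforcement.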
## Analysis and Results
### Descriptive Trends
Figure 1 illustrates the aggregate trends in the regulatory pipeline. Following the 2019 policy, we observe a structural break in enforcement behavior. While inspection frequency fluctuated, the average days to enforcement (top right panel) shows a marked post-2019 decline.
![Regulatory Pipeline Trends](pipeline_trends_over_time.png)
*Figure 1: Trends in inspection, compliance, and enforcement metrics (2015-2025).*
### H1: Aggregate Policy Impact
Table 1 presents the pooled Difference-in-Differences results testing **H1**. The policy is associated with a statistically significant reduction in enforcement delays, supporting **H1a**. Specifically, the log-linear specification indicates a **30.8% reduction** in time to enforcement action ($p < 0.05$). Robustness checks confirm this with a linear reduction of ~62 days. **H1b** is also supported, with a 3.7 percentage point increase in compliance rates.
**Table 1: Baseline Difference-in-Differences Results**
| Outcome | Coefficient | Std. Error | P-Value | Result |
| :--- | :--- | :--- | :--- | :--- |
| **Log(Days to Enforcement)** | **-0.369*** | 0.157 | 0.019 | **Supported (H1a)** |
| Compliance Rate (%) | +3.740* | - | 0.002 | **Supported (H1b)** |
| Violations per Insp. | -0.052* | - | 0.001 | Supported |
*Note: Standard errors clustered at the district level. \* denotes $p < 0.05$.*
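The step from the log-point coefficient to a percent change can be checked with a few lines of arithmetic. The sketch below uses the coefficient and standard error from the first row of Table 1. The two-sided p-value uses a standard normal approximation, which is an assumption: the draft does not state which reference distribution was used.

```python
import math


def pct_change_from_log_coef(beta):
    """Percent change in the outcome implied by a coefficient on a log outcome."""
    return 100.0 * (math.exp(beta) - 1.0)


def two_sided_p_normal(beta, se):
    """Two-sided p-value for beta / se under a standard normal approximation."""
    z = abs(beta / se)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


beta_log_days = -0.369  # Table 1, Log(Days to Enforcement)
se_log_days = 0.157     # Table 1, clustered standard error

pct = pct_change_from_log_coef(beta_log_days)
p_value = two_sided_p_normal(beta_log_days, se_log_days)
ci_low = pct_change_from_log_coef(beta_log_days - 1.96 * se_log_days)
ci_high = pct_change_from_log_coef(beta_log_days + 1.96 * se_log_days)
```

This gives a reduction of roughly 31%, in line with the reported 30.8% once rounding of the coefficient is taken into account, and a p-value of about 0.019 under the normal approximation. The interval bounds `ci_low` and `ci_high` are an approximate 95% range for the percent change under the same assumption.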
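The event study (Figure 2) supports the design. Coefficients for pre-policy years are statistically indistinguishable from zero, consistent with parallel pre-trends, while significant negative effects (faster enforcement) emerge consistently starting in 2022, three years after the policy took effect.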
![Event Study Plot](event_study_plot.png)
*Figure 2: Event study coefficients relative to the 2018 baseline.*
### H2: Bureaucratic Heterogeneity
**H2 is strongly supported.** The average effect masks profound variation across the 13 districts. As shown in Figure 3, the estimated policy impact ranges from a **65.9% reduction** in days to enforcement in District 09 to a **71.9% increase** in District 04. Ten districts improved (faster enforcement), while three exhibited backsliding.
![District Treatment Effects](district_treatment_effects_psj.png)
*Figure 3: Heterogeneous treatment effects by district (Percent change in days to enforcement).*
### H3: Structural Moderators
We tested **H3** using Triple-Difference models to explain this divergence. Table 2 presents the results for the six structural hypotheses. **We find strong support for H3c (Environmental Justice).** Districts with higher Environmental Justice Index (EJI) scores, indicating greater social vulnerability, saw significantly *slower* improvements in enforcement speed (Coefficient: +0.412, p = 0.038). This suggests that the transparency benefits of the policy were unequally distributed, potentially exacerbating existing inequities.
However, other structural factors failed to explain the heterogeneity. Capacity ($H3a$), baseline compliance ($H3b$), the underlying oil and gas basin ($H3d$), border proximity ($H3e$), and rurality ($H3f$) were not statistically significant predictors of policy response.
**Table 2: Triple-Difference Moderator Analysis**
| Moderator Hypothesis | Interaction Coef. | P-Value | Result |
| :--- | :--- | :--- | :--- |
| **H3a: High Capacity** | -0.285 | 0.363 | Not Supported |
| **H3b: Low Baseline Compliance** | -0.134 | 0.628 | Not Supported |
| **H3c: Environmental Justice Index (EJI)** | **+0.412** | **0.038** | **Supported (Inequity)** |
| **H3d: Oil & Gas Basin** | -0.220 | 0.154 | Not Supported |
| **H3e: Border Proximity** | -0.441 | 0.118 | Not Supported |
| **H3f: Rurality (RUCA)** | +0.093 | 0.770 | Not Supported |
Figure 4 visualizes these results. Figure 5 confirms a strong correlation between treatment effects and the district EJI scores.
![Heterogeneous Effects](heterogeneous_effects.png)
*Figure 4: Interaction effects for key district moderators.*
![Demographics and Geography](district_demographics_geography.png)
*Figure 5: Correlation of treatment effects with district demographic and geographic features (highlighting EJ Score).*
### H4: Spatial Dynamics
**H4 is not supported in the expected direction.** Instead of positive spillovers (clustering of high performance), spatial analysis reveals **significant negative spatial autocorrelation** ($I = -0.549$). Figure 6 maps this pattern, showing that high-performing districts are frequently adjacent to low-performing ones. This suggests distinct, localized administrative cultures rather than regional diffusion of best practices.
![Spatial Map of Effects](district_treatment_effects_map_psj.png)
*Figure 6: Spatial distribution of enforcement speed changes.*
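For readers unfamiliar with the statistic, the sketch below computes a global spatial autocorrelation index on toy data, assuming that $I$ refers to Moran's I with binary contiguity weights. The draft reports only the value of $I$ and does not state the weighting scheme, so both the weights and the toy adjacency structure here are assumptions.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    values: list of district-level effects, indexed 0..n-1.
    neighbors: dict mapping each index to the indices of adjacent districts
               (symmetric adjacency, weight 1 per neighbor pair).
    """
    n = len(values)
    mean_x = sum(values) / n
    dev = [v - mean_x for v in values]

    weight_sum = 0      # S0: total of all spatial weights
    cross = 0.0         # sum of w_ij * dev_i * dev_j
    for i, adjacent in neighbors.items():
        for j in adjacent:
            weight_sum += 1
            cross += dev[i] * dev[j]

    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


# Four toy districts arranged in a line: 0 - 1 - 2 - 3.
line_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

alternating = morans_i([1.0, -1.0, 1.0, -1.0], line_neighbors)
clustered = morans_i([1.0, 1.0, -1.0, -1.0], line_neighbors)
```

The alternating pattern, where every district differs from its neighbors, yields a strongly negative index, while the clustered pattern yields a positive one. A reported value of -0.549 sits toward the alternating end of that range.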
### Discussion
These findings present a paradox for "transparency as regulation." While the policy succeeded in aggregate (**H1**), its implementation was heavily filtered through local discretion (**H2**). Although most structural variables failed to explain this variation, the significant finding regarding Environmental Justice (**H3c**) offers a critical caveat. The fact that high-vulnerability districts saw slower improvements suggests that transparency mechanisms may rely on community capacity to be effective, potentially widening the regulatory gap for disadvantaged populations. Beyond this inequity, the lack of other structural predictors and the negative spatial autocorrelation (**H4**) suggest that **managerial leadership and organizational culture**, rather than resources or geography, are the primary drivers of responsiveness in the Texas oil and gas sector.