new appendix and writeup docs
180
intro_thoery_methods_analysis_results_discussion.md
Normal file
@@ -0,0 +1,180 @@
# Introduction

Regulatory enforcement in borderlands jurisdictions is often expected to differ from enforcement in interior jurisdictions because of administrative constraints, multi-jurisdictional exposure, and monitoring frictions. This manuscript analyzes Texas Railroad Commission district-year outcomes (2015-2025) to assess whether border-exposed districts show systematic enforcement gaps and whether those gaps changed after the 2019 disclosure reform.

The empirical design centers on two research questions from the notebook:

1. RQ1 (Border gaps): Do border-exposed Texas districts differ from non-border districts in enforcement intensity and pipeline outcomes?
2. RQ2 (Disclosure heterogeneity): Did the 2019 disclosure reform change enforcement outcomes differently in border districts versus non-border districts (level shift and post-policy trend differential)?

# Theory

We use a borderlands governance framing with two linked mechanisms: capacity asymmetry and transparency-throughput effects. The corresponding hypotheses are:

1. H1 (Border inspection gap): Border districts have lower inspection intensity than non-border districts.
2. H2 (Border pipeline disadvantage): Border districts show weaker enforcement pipeline outcomes (higher violations per inspection and/or slower timing and/or lower resolution rates).
3. H3 (Disclosure heterogeneity in levels): Post-2019 level shifts differ between border and non-border districts (`post_2019:border`).
4. H4 (Disclosure heterogeneity in trends): Post-2019 trend shifts differ between border and non-border districts (`post_trend:border`).

This yields a core empirical claim: post-2019 border effects should be strongest in enforcement timing rather than in inspection coverage or resolution outcomes.

# Methods

## Data and Unit of Analysis

- Unit: district-year.
- Coverage: 13 Texas RRC districts, 2015-2025.
- Source tables: `well_shape_tract`, `inspections`, `violations`.
- Sample in current run: 1,010,432 wells; 1,867,859 inspections; 191,762 violations; 143 district-year observations.

## Border Measurement: District Coding and Well Proximity

We use two complementary border constructions.

1. District-level baseline treatment (`border_district`): districts in the predefined border-adjacent set (`01`, `02`, `06`, `08`, `8A`, `09`, `10`) are coded 1; others are coded 0.
2. Well-level proximity treatment: each well is classified by spatial proximity to border segments, then rolled up to district-year exposure shares.

Well-level proximity was constructed from latitude/longitude and shapefiles as follows:

1. Texas-Mexico distance/flags from `WellAnalyzer` (`within_25km_texmex`, `within_50km_texmex`).
2. Additional state-border segments (TX-NM, TX-OK, TX-LA) built from Texas county boundary geometry and seed lines.
3. Distances computed in projected CRS (EPSG:5070), then threshold flags generated at 25 km and 50 km.
4. Composite exposure indicators created:
   - `within_50km_state_border_any`
   - `well_border_exposed` (1 if within 50 km of TX-MX or any TX-state border segment).
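
As a rough illustration of step 4, the composite flags can be derived from per-border minimum distances. This is a minimal sketch, not the notebook's code: the distance field names (`dist_km_texmex`, `dist_km_tx_nm`, `dist_km_tx_ok`, `dist_km_tx_la`), the `exposure_flags` helper, the example distances, and the choice of an inclusive `<=` comparison are all assumptions. Only the flag names follow the text above.

```python
# Hypothetical per-well minimum distances (km) to each border segment,
# assumed to have been computed beforehand in a projected CRS.
STATE_BORDER_KEYS = ("dist_km_tx_nm", "dist_km_tx_ok", "dist_km_tx_la")


def exposure_flags(well, cutoff_km=50):
    """Return composite border-exposure flags for one well record."""
    tag = f"within_{cutoff_km}km"
    # Assumption: "within" is treated as inclusive of the cutoff distance.
    near_texmex = well["dist_km_texmex"] <= cutoff_km
    near_state = any(well[key] <= cutoff_km for key in STATE_BORDER_KEYS)
    return {
        f"{tag}_texmex": int(near_texmex),
        f"{tag}_state_border_any": int(near_state),
        "well_border_exposed": int(near_texmex or near_state),
    }


# Made-up example wells; the distances are illustrative, not real data.
wells = [
    {"well_id": "A", "dist_km_texmex": 12.0, "dist_km_tx_nm": 300.0,
     "dist_km_tx_ok": 500.0, "dist_km_tx_la": 700.0},
    {"well_id": "B", "dist_km_texmex": 400.0, "dist_km_tx_nm": 60.0,
     "dist_km_tx_ok": 45.0, "dist_km_tx_la": 350.0},
    {"well_id": "C", "dist_km_texmex": 250.0, "dist_km_tx_nm": 260.0,
     "dist_km_tx_ok": 270.0, "dist_km_tx_la": 280.0},
]
flags = {w["well_id"]: exposure_flags(w) for w in wells}
```

Passing a different `cutoff_km` (for example 25) reuses the same logic for narrower or wider proximity definitions.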

District-year well-proximity exposure is measured as:

$$
ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
$$

and an alternative district treatment is defined as `border_exposure_district = 1` when $ShareBorder_{dt} \ge 0.25$.
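
The share and the 0.25 threshold rule can be sketched as follows. This is an illustrative sketch under assumptions: the record layout (one dictionary per inspection with `district`, `year`, and a 0/1 `well_border_exposed` field) and the helper name `district_year_exposure` are hypothetical, and the toy records are made up.

```python
from collections import defaultdict


def district_year_exposure(inspections, threshold=0.25):
    """Aggregate inspection records to district-year exposure shares."""
    totals = defaultdict(int)
    exposed = defaultdict(int)
    for rec in inspections:
        key = (rec["district"], rec["year"])
        totals[key] += 1
        exposed[key] += rec["well_border_exposed"]
    result = {}
    for key, n_inspections in totals.items():
        share = exposed[key] / n_inspections
        result[key] = {
            "share_border": share,
            # Treated when the exposed share meets or exceeds the threshold.
            "border_exposure_district": int(share >= threshold),
        }
    return result


# Toy records (illustrative only): district 08 has 1 of 4 inspections at
# border-exposed wells; district 03 has none.
toy = (
    [{"district": "08", "year": 2019, "well_border_exposed": e} for e in (1, 0, 0, 0)]
    + [{"district": "03", "year": 2019, "well_border_exposed": 0} for _ in range(3)]
)
exposure = district_year_exposure(toy)
```

In the toy data, district `08` sits exactly at the threshold and is therefore coded as treated, which matches the weak inequality in the definition.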

## Outcomes

Four district-year outcomes are constructed:

$$
InspectionIntensity_{dt} = \frac{Inspections_{dt}}{UniqueWells_{dt}}
$$

$$
ViolPerInsp_{dt} = \frac{Violations_{dt}}{Inspections_{dt}}
$$

$$
DaysToEnf_{dt} = \frac{1}{N_{dt}} \sum_{i=1}^{N_{dt}} (EnforcementDate_i - ViolationDiscoveryDate_i)
$$

$$
ResolutionRate_{dt} = \frac{CompliantOnReinspection_{dt}}{Violations_{dt}}
$$
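
To make the four definitions concrete, here is a minimal sketch that computes them from district-year counts. The function name, argument names, and the toy numbers are assumptions for illustration. Returning `None` for an undefined ratio is also a choice made here, not something the notebook is known to do.

```python
def district_year_outcomes(inspections, unique_wells, violations,
                           compliant_on_reinspection, enforcement_lags_days):
    """Compute the four outcomes for one district-year.

    enforcement_lags_days holds one value per enforced violation:
    enforcement date minus violation discovery date, in days.
    """
    def ratio(numerator, denominator):
        # Undefined ratios are returned as None rather than raising.
        return numerator / denominator if denominator else None

    n_lags = len(enforcement_lags_days)
    return {
        "inspection_intensity": ratio(inspections, unique_wells),
        "violations_per_inspection": ratio(violations, inspections),
        "avg_days_to_enforcement": ratio(sum(enforcement_lags_days), n_lags),
        "resolution_rate": ratio(compliant_on_reinspection, violations),
    }


# Toy counts, not taken from the data.
example = district_year_outcomes(
    inspections=300,
    unique_wells=200,
    violations=30,
    compliant_on_reinspection=18,
    enforcement_lags_days=[100, 140],
)
```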

## Exposure Definitions

- Baseline treatment: `border_district` (binary district border status).
- Additional robustness exposures:

1. Border-type indicators (`TX-MX`, `TX-NM`, `TX-OK`, `TX-LA`)
2. Continuous exposure share:

$$
ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
$$

3. Cutoff sensitivity with 25/75/100 km thresholds.

## Estimating Equations

RQ1 levels:

$$
Y_{dt} = \alpha + \beta_1 Border_d + \beta_2 \log(UniqueWells_{dt}) + \gamma_t + \varepsilon_{dt}
$$

RQ2 FE interaction:

$$
Y_{dt} = \alpha_d + \gamma_t + \theta_1(Post2019_t \times Border_d) + \theta_2(PostTrend_t \times Border_d) + \varepsilon_{dt}
$$

$$
Post2019_t = \mathbb{1}[t \ge 2019], \quad PostTrend_t = \max(0, t-2019)
$$
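
The policy terms and the border differential they imply can be sketched in a few lines. Under the RQ2 equation, the border-specific differential in year $t$ is $\theta_1 Post2019_t + \theta_2 PostTrend_t$. The helper names below are hypothetical, and the coefficient values in the example are placeholders rather than estimates from this study.

```python
def policy_terms(year, border, reform_year=2019):
    """Build the post-reform regressors for one district-year."""
    post_2019 = int(year >= reform_year)
    post_trend = max(0, year - reform_year)
    return {
        "post_2019": post_2019,
        "post_trend": post_trend,
        "post_2019:border": post_2019 * border,
        "post_trend:border": post_trend * border,
    }


def border_differential(year, theta1, theta2, reform_year=2019):
    """Implied border-minus-non-border shift in a given year."""
    terms = policy_terms(year, border=1, reform_year=reform_year)
    return theta1 * terms["post_2019:border"] + theta2 * terms["post_trend:border"]


# Placeholder coefficients chosen only to show the arithmetic.
differentials = {
    year: border_differential(year, theta1=-10.0, theta2=-2.0)
    for year in (2018, 2019, 2022)
}
```

Because `post_trend` equals zero in 2019 itself, the level term alone determines the differential in the reform year, and the trend term accumulates only from 2020 onward.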

Inference uses standard errors clustered by district (13 clusters). Given the small number of clusters, interpretation emphasizes effect sizes and consistency across specifications.

## Tests Run in Notebook

The notebook estimated the following test families:

1. Descriptive border-gap tests:
   - Border vs non-border means for inspection intensity, violations per inspection, days to enforcement, and resolution rate.
2. RQ1 levels regressions (border gaps):
   - Outcomes: `inspection_intensity`, `violations_per_inspection`.
   - Specification: `border_district + log_unique_wells + C(year)`.
3. RQ2 FE interaction regressions (post-2019 heterogeneity):
   - Outcomes: `inspection_intensity`, `violations_per_inspection`, `avg_days_to_enforcement`, `resolution_rate`.
   - Specification: `C(district) + C(year) + post_2019:border_district + post_trend:border_district`.
4. Border-type robustness tests:
   - District profiles for `TX-MX`, `TX-NM`, `TX-OK`, `TX-LA` exposure.
   - RQ1-style levels with `has_tx_*` indicators.
   - RQ2-style FE interactions with `post_2019:has_tx_*` and `post_trend:has_tx_*`.
5. Continuous-exposure robustness tests:
   - Replace binary border indicator with `share_border_exposed_insp` in both RQ1-style and RQ2-style specifications.
6. Cutoff-sensitivity tests:
   - Recompute proximity exposure from minimum distance to any border at 25 km, 75 km, and 100 km.
   - Estimate RQ1-style models for inspection intensity and RQ2-style timing interaction models.
7. Visualization and reporting tests:
   - Border/non-border trend plots.
   - Main timing figure with district-year group means and 95% confidence intervals.
8. Competition/reaction-function scaffolding (not estimated as causal model):
   - District-to-competitor jurisdiction link table and template generated for future interstate stringency integration.
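
The specifications in items 2, 3, and 5 differ only in the outcome and the exposure variable, so they can be generated from two templates. This is a sketch: the right-hand sides are copied from the list above, but the helper names are hypothetical, and it is an assumption that the notebook passes formula strings of this shape to a formula-based estimator.

```python
RQ1_OUTCOMES = ["inspection_intensity", "violations_per_inspection"]
RQ2_OUTCOMES = RQ1_OUTCOMES + ["avg_days_to_enforcement", "resolution_rate"]


def rq1_formula(outcome, exposure="border_district"):
    """Levels specification with year effects and a well-count control."""
    return f"{outcome} ~ {exposure} + log_unique_wells + C(year)"


def rq2_formula(outcome, exposure="border_district"):
    """Two-way fixed-effects specification with post-2019 interactions."""
    return (
        f"{outcome} ~ C(district) + C(year) "
        f"+ post_2019:{exposure} + post_trend:{exposure}"
    )


baseline = {y: rq2_formula(y) for y in RQ2_OUTCOMES}
# Continuous-exposure robustness: swap in the exposure share.
continuous = {y: rq2_formula(y, "share_border_exposed_insp") for y in RQ2_OUTCOMES}
```

Keeping the exposure as a parameter means the border-type and continuous-exposure variants reuse the same template and cannot drift apart from the baseline specification.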

# Analysis

## Descriptive Border Gaps

| Outcome | Non-border | Border |
|---|---:|---:|
| Inspection intensity | 1.515 | 1.329 |
| Violations per inspection | 0.098 | 0.130 |
| Mean days to enforcement | 122.8 | 145.2 |
| Mean resolution rate | 0.596 | 0.543 |

Descriptively, border districts show weaker enforcement conditions across coverage, detection conditional on inspection, timing, and follow-through.
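
The size of each descriptive gap follows directly from the table. The sketch below uses only the reported means; the dictionary layout and the `gap` helper are illustrative.

```python
# (non-border mean, border mean), copied from the descriptive table.
descriptives = {
    "inspection_intensity": (1.515, 1.329),
    "violations_per_inspection": (0.098, 0.130),
    "avg_days_to_enforcement": (122.8, 145.2),
    "resolution_rate": (0.596, 0.543),
}


def gap(non_border, border):
    """Return the border-minus-non-border gap and its percent of the non-border mean."""
    diff = border - non_border
    return round(diff, 3), round(100 * diff / non_border, 1)


gaps = {name: gap(*means) for name, means in descriptives.items()}
for name, (diff, pct) in gaps.items():
    print(f"{name}: {diff:+.3f} ({pct:+.1f}%)")
```

In raw terms, border districts average 0.186 fewer inspections per well, 0.032 more violations per inspection, 22.4 more days to enforcement, and a resolution rate 0.053 lower.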

## Main Regression Evidence

| Model | Coefficient | p-value | N |
|---|---:|---:|---:|
| RQ1: `border_district` on `inspection_intensity` | -0.1755 | 0.0999 | 143 |
| RQ1: `border_district` on `violations_per_inspection` | 0.0434 | 0.0949 | 143 |
| RQ2: `post_2019:border` on `inspection_intensity` | -0.1191 | 0.0753 | 143 |
| RQ2: `post_2019:border` on `violations_per_inspection` | 0.0040 | 0.8881 | 143 |
| RQ2: `post_2019:border` on `avg_days_to_enforcement` | -74.5893 | 0.0156 | 143 |
| RQ2: `post_2019:border` on `resolution_rate` | 0.0404 | 0.4520 | 143 |

The most stable differential post-2019 effect is a border-specific improvement in enforcement timing: the `post_2019:border` coefficient implies roughly 75 fewer days to enforcement (p = 0.0156), the only interaction term significant at the 5% level.

# Results

## Hypothesis Tests

| Hypothesis | Test evidence | Decision (current run) |
|---|---|---|
| H1: Border districts have lower inspection intensity | RQ1: `border_district -> inspection_intensity` = -0.1755, p = 0.0999; descriptives 1.329 (border) vs 1.515 (non-border) | Partial support |
| H2: Border districts have weaker pipeline outcomes | Descriptives: 0.130 vs 0.098 violations/inspection, 145.2 vs 122.8 days, 0.543 vs 0.596 resolution; RQ1 `border_district -> violations_per_inspection` = 0.0434, p = 0.0949 | Supported descriptively, mixed regression support |
| H3: Border-specific post-2019 level shift | RQ2 `post_2019:border -> avg_days_to_enforcement` = -74.5893, p = 0.0156; other outcomes not significant at the 5% level (inspection intensity marginal at p = 0.0753) | Supported for timing only |
| H4: Border-specific post-2019 trend shift | RQ2 `post_trend:border` terms: inspection p = 0.8181, violations p = 0.8350, timing p = 0.9252, resolution p = 0.3404 | Not supported in baseline model |

The hypothesis tests indicate that the clearest inferential signal is a border-specific post-2019 level shift in timing, consistent with "faster pipeline, not wider pipeline."

## Figure Callouts

- Figure 1 (group trends): `analysis/output_borderlands/border_vs_nonborder_trends.png`
- Figure 2 (main timing figure with CI): `analysis/output_borderlands/money_plot_timing_border_prepost2019.png`

Figure 2 uses district-year means with equal district weighting, where $n_{gt}$ is the number of districts in group $g$ in year $t$ and $s_{gt}$ is the standard deviation of $Y_{dt}$ across those districts:

$$
\bar{Y}_{gt} = \frac{1}{n_{gt}} \sum_{d \in g} Y_{dt}, \quad
CI_{95\%} = \bar{Y}_{gt} \pm 1.96 \cdot \frac{s_{gt}}{\sqrt{n_{gt}}}
$$
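
A minimal sketch of this group mean and normal-approximation interval follows. It assumes $s_{gt}$ is the sample standard deviation (divisor $n_{gt} - 1$), which the text does not state. The function name and the example values are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci95(values, z=1.96):
    """Equal-weighted group mean with a normal-approximation 95% interval."""
    n = len(values)
    centre = mean(values)
    if n < 2:
        # A standard deviation is undefined for a single district.
        return centre, None, None
    half_width = z * stdev(values) / sqrt(n)
    return centre, centre - half_width, centre + half_width


# Made-up days-to-enforcement values for four districts in one group-year.
centre, lower, upper = mean_ci95([100.0, 120.0, 140.0, 160.0])
```

With only a handful of districts per group-year, the 1.96 multiplier is an approximation, so the plotted bands are best read as descriptive rather than as exact coverage intervals.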

# Discussion

The findings are consistent with a transparency-throughput mechanism: disclosure-era pressure appears to accelerate processing where baseline constraints are stronger, but this does not map cleanly to expansion of enforcement reach or follow-through. The strongest claim supported by this design is "faster pipeline, not wider pipeline."

The contribution is a boundary-condition argument: transparency reforms can produce uneven administrative effects across territorial governance contexts, with timing responsiveness exceeding capacity expansion.

The design does not identify interstate strategic competition. A full Neal Woods-style test would require district-year competitor stringency series and explicit enforcement-gap dynamics. That extension is left for future work; the current analysis provides a necessary first step by documenting border-specific enforcement gaps and their heterogeneous response to the disclosure reform.