new appendix and writeup docs

2026-02-21 09:10:29 -08:00
parent 373fff2867
commit 6cced896fa
4 changed files with 307 additions and 0 deletions
--- a/intro_thoery_methods_analysis_results_discussion.md
+++ b/intro_thoery_methods_analysis_results_discussion.md
@@ -0,0 +1,180 @@
+# Introduction
+
+Regulatory enforcement in borderlands jurisdictions is often expected to differ from enforcement in interior jurisdictions because of administrative constraints, multi-jurisdictional exposure, and monitoring frictions. This manuscript analyzes Texas Railroad Commission district-year outcomes (2015-2025) to assess whether border-exposed districts show systematic enforcement gaps and whether those gaps changed after the 2019 disclosure reform.
+
+The empirical design centers on two research questions from the notebook:
+
+1. RQ1 (Border gaps): Do border-exposed Texas districts differ from non-border districts in enforcement intensity and pipeline outcomes?
+2. RQ2 (Disclosure heterogeneity): Did the 2019 disclosure reform change enforcement outcomes differently in border districts versus non-border districts (level shift and post-policy trend differential)?
+
+# Theory
+
+We use a borderlands governance framing with two linked mechanisms: capacity asymmetry and transparency-throughput effects. The corresponding hypotheses are:
+
+1. H1 (Border inspection gap): Border districts have lower inspection intensity than non-border districts.
+2. H2 (Border pipeline disadvantage): Border districts show weaker enforcement pipeline outcomes (higher violations per inspection and/or slower timing and/or lower resolution rates).
+3. H3 (Disclosure heterogeneity in levels): Post-2019 level shifts differ between border and non-border districts (`post_2019:border`).
+4. H4 (Disclosure heterogeneity in trends): Post-2019 trend shifts differ between border and non-border districts (`post_trend:border`).
+
+This yields a core empirical claim: post-2019 border effects should be strongest in enforcement timing rather than in inspection coverage or resolution outcomes.
+
+# Methods
+
+## Data and Unit of Analysis
+
+- Unit: district-year.
+- Coverage: 13 Texas RRC districts, 2015-2025.
+- Source tables: `well_shape_tract`, `inspections`, `violations`.
+- Sample in current run: 1,010,432 wells; 1,867,859 inspections; 191,762 violations; 143 district-year observations.
+
+## Border Measurement: District Coding and Well Proximity
+
+We use two complementary border constructions.
+
+1. District-level baseline treatment (`border_district`): districts in the predefined border-adjacent set (`01`, `02`, `06`, `08`, `8A`, `09`, `10`) are coded 1; others are coded 0.
+2. Well-level proximity treatment: each well is classified by spatial proximity to border segments, then rolled up to district-year exposure shares.
+
+Well-level proximity was constructed from latitude/longitude and shapefiles as follows:
+
+1. Texas-Mexico distance/flags from `WellAnalyzer` (`within_25km_texmex`, `within_50km_texmex`).
+2. Additional state-border segments (TX-NM, TX-OK, TX-LA) built from Texas county boundary geometry and seed lines.
+3. Distances computed in projected CRS (EPSG:5070), then threshold flags generated at 25 km and 50 km.
+4. Composite exposure indicators created:
+   - `within_50km_state_border_any`
+   - `well_border_exposed` (1 if within 50 km of TX-MX or any TX-state border segment).
+
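The distance-and-threshold step can be sketched in plain Python. This is a minimal illustration, not the `WellAnalyzer` implementation: it assumes well and border coordinates are already projected to a metric CRS such as EPSG:5070, and the function names and the toy border line are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to segment a-b (projected meters)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_border(p, border_line):
    """Minimum distance from p to a border given as a list of vertices."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(border_line, border_line[1:])
    )


def proximity_flags(p, border_line, cutoffs_km=(25, 50)):
    """Return 0/1 flags for each distance cutoff (hypothetical flag names)."""
    dist_km = distance_to_border(p, border_line) / 1000.0
    return {f"within_{c}km": int(dist_km <= c) for c in cutoffs_km}


# Toy border running 100 km along the x-axis; the well sits 30 km north of it.
border = [(0.0, 0.0), (100_000.0, 0.0)]
flags = proximity_flags((50_000.0, 30_000.0), border)
```

Passing a different `cutoffs_km` tuple would produce the flags for other distance thresholds without changing the distance computation.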
+District-year well-proximity exposure is measured as:
+$$
+ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
+$$
+and an alternative district treatment sets `border_exposure_district = 1` when $ShareBorder_{dt} \ge 0.25$.
+
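As a hedged sketch of the exposure share and the 0.25 cutoff, the helpers below work from district-year inspection counts. The row layout and function names are illustrative assumptions rather than the notebook's code.

```python
def share_border(border_exposed_inspections, inspections):
    """ShareBorder_dt: border-exposed inspections over all inspections."""
    if inspections == 0:
        return None  # share is undefined with no inspections
    return border_exposed_inspections / inspections


def border_exposure_district(share, cutoff=0.25):
    """1 when the exposure share meets the cutoff, else 0."""
    return int(share is not None and share >= cutoff)


# Hypothetical district-year rows (counts are made up for illustration).
rows = [
    {"district": "A", "year": 2019, "border_insp": 30, "insp": 100},
    {"district": "B", "year": 2019, "border_insp": 5, "insp": 100},
]
for row in rows:
    row["share_border"] = share_border(row["border_insp"], row["insp"])
    row["border_exposure_district"] = border_exposure_district(row["share_border"])
```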
+## Outcomes
+
+$$
+InspectionIntensity_{dt} = \frac{Inspections_{dt}}{UniqueWells_{dt}}
+$$
+$$
+ViolPerInsp_{dt} = \frac{Violations_{dt}}{Inspections_{dt}}
+$$
+$$
+DaysToEnf_{dt} = \frac{1}{N_{dt}} \sum_{i=1}^{N_{dt}} (EnforcementDate_i - ViolationDiscoveryDate_i)
+$$
+$$
+ResolutionRate_{dt} = \frac{CompliantOnReinspection_{dt}}{Violations_{dt}}
+$$
+
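The four outcome definitions can be illustrated with a small function over district-year counts. This is a sketch under assumed inputs: the argument names and the example numbers are hypothetical, and enforcement timing is taken as whole days between discovery and enforcement dates.

```python
from datetime import date


def district_year_outcomes(inspections, unique_wells, violations,
                           compliant_on_reinspection, enforcement_cases):
    """Compute the four district-year outcomes from raw counts.

    enforcement_cases is a list of (discovery_date, enforcement_date) pairs.
    """
    days = [(enforced - discovered).days
            for discovered, enforced in enforcement_cases]
    return {
        "inspection_intensity": inspections / unique_wells,
        "violations_per_inspection": violations / inspections,
        "avg_days_to_enforcement": sum(days) / len(days) if days else None,
        "resolution_rate": (compliant_on_reinspection / violations
                            if violations else None),
    }


# Made-up district-year: 150 inspections of 100 wells, 15 violations,
# 9 compliant on reinspection, two enforcement cases (30 and 90 days).
example = district_year_outcomes(
    inspections=150,
    unique_wells=100,
    violations=15,
    compliant_on_reinspection=9,
    enforcement_cases=[
        (date(2020, 1, 1), date(2020, 1, 31)),
        (date(2020, 3, 1), date(2020, 5, 30)),
    ],
)
```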
+## Exposure Definitions
+
+- Baseline treatment: `border_district` (binary district border status).
+- Additional robustness exposures:
+
+1. Border-type indicators (`TX-MX`, `TX-NM`, `TX-OK`, `TX-LA`)
+2. Continuous exposure share $ShareBorder_{dt}$, as defined under Border Measurement.
+3. Cutoff sensitivity with 25/75/100 km thresholds.
+
+## Estimating Equations
+
+RQ1 levels:
+$$
+Y_{dt} = \alpha + \beta_1 Border_d + \beta_2 \log(UniqueWells_{dt}) + \gamma_t + \varepsilon_{dt}
+$$
+
+RQ2 FE interaction:
+$$
+Y_{dt} = \alpha_d + \gamma_t + \theta_1(Post2019_t \times Border_d) + \theta_2(PostTrend_t \times Border_d) + \varepsilon_{dt}
+$$
+$$
+Post2019_t = \mathbb{1}[t \ge 2019], \quad PostTrend_t = \max(0, t-2019)
+$$
+
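To make the policy terms concrete, the sketch below builds $Post2019_t$, $PostTrend_t$, and their interactions with the border indicator for a single district-year. The function and key names are illustrative assumptions; in the FE specification the non-interacted terms are absorbed by the year effects.

```python
def policy_terms(year, border, reform_year=2019):
    """Build the level-shift and trend-shift regressors for one district-year."""
    post_2019 = int(year >= reform_year)     # 1[t >= 2019]
    post_trend = max(0, year - reform_year)  # max(0, t - 2019)
    return {
        "post_2019": post_2019,
        "post_trend": post_trend,
        "post_2019_x_border": post_2019 * border,
        "post_trend_x_border": post_trend * border,
    }


pre_reform = policy_terms(2018, border=1)
reform_year_terms = policy_terms(2019, border=1)
later_border = policy_terms(2022, border=1)
later_interior = policy_terms(2022, border=0)
```

In the reform year itself the level interaction switches on while the trend interaction is still zero, which is what lets $\theta_1$ and $\theta_2$ be estimated separately.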
+Inference uses district-clustered standard errors (13 clusters), with emphasis on effect size and consistency across specifications.
+
+## Tests Run in Notebook
+
+The notebook estimated the following test families:
+
+1. Descriptive border-gap tests:
+   - Border vs non-border means for inspection intensity, violations per inspection, days to enforcement, and resolution rate.
+2. RQ1 levels regressions (border gaps):
+   - Outcomes: `inspection_intensity`, `violations_per_inspection`.
+   - Specification: `border_district + log_unique_wells + C(year)`.
+3. RQ2 FE interaction regressions (post-2019 heterogeneity):
+   - Outcomes: `inspection_intensity`, `violations_per_inspection`, `avg_days_to_enforcement`, `resolution_rate`.
+   - Specification: `C(district) + C(year) + post_2019:border_district + post_trend:border_district`.
+4. Border-type robustness tests:
+   - District profiles for `TX-MX`, `TX-NM`, `TX-OK`, `TX-LA` exposure.
+   - RQ1-style levels with `has_tx_*` indicators.
+   - RQ2-style FE interactions with `post_2019:has_tx_*` and `post_trend:has_tx_*`.
+5. Continuous-exposure robustness tests:
+   - Replace binary border indicator with `share_border_exposed_insp` in both RQ1-style and RQ2-style specifications.
+6. Cutoff-sensitivity tests:
+   - Recompute proximity exposure from minimum distance to any border at 25 km, 75 km, and 100 km.
+   - Estimate RQ1-style models for inspection intensity and RQ2-style timing interaction models.
+7. Visualization and reporting tests:
+   - Border/non-border trend plots.
+   - Main timing figure with district-year group means and 95% confidence intervals.
+8. Competition/reaction-function scaffolding (not estimated as causal model):
+   - District-to-competitor jurisdiction link table and template generated for future interstate stringency integration.
+
+# Analysis
+
+## Descriptive Border Gaps
+
+| Outcome | Non-border | Border |
+|---|---:|---:|
+| Inspection intensity | 1.515 | 1.329 |
+| Violations per inspection | 0.098 | 0.130 |
+| Mean days to enforcement | 122.8 | 145.2 |
+| Mean resolution rate | 0.596 | 0.543 |
+
+Descriptively, border districts fare worse on all four measures: fewer inspections per well (1.329 vs 1.515), more violations per inspection (0.130 vs 0.098), slower enforcement (145.2 vs 122.8 days), and a lower resolution rate (0.543 vs 0.596).
+
+## Main Regression Evidence
+
+| Model | Coefficient | p-value | N |
+|---|---:|---:|---:|
+| RQ1: `border_district` on `inspection_intensity` | -0.1755 | 0.0999 | 143 |
+| RQ1: `border_district` on `violations_per_inspection` | 0.0434 | 0.0949 | 143 |
+| RQ2: `post_2019:border` on `inspection_intensity` | -0.1191 | 0.0753 | 143 |
+| RQ2: `post_2019:border` on `violations_per_inspection` | 0.0040 | 0.8881 | 143 |
+| RQ2: `post_2019:border` on `avg_days_to_enforcement` | -74.5893 | 0.0156 | 143 |
+| RQ2: `post_2019:border` on `resolution_rate` | 0.0404 | 0.4520 | 143 |
+
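+The only post-2019 interaction significant at the 5% level is on enforcement timing: days to enforcement decline by about 75 days in border districts relative to non-border districts (p = 0.0156). The inspection-intensity interaction is marginal (p = 0.0753), and the violations and resolution interactions are not distinguishable from zero.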
+
+# Results
+
+## Hypothesis Tests
+
+| Hypothesis | Test evidence | Decision (current run) |
+|---|---|---|
+| H1: Border districts have lower inspection intensity | RQ1: `border_district -> inspection_intensity` = -0.1755, p = 0.0999; descriptives 1.329 (border) vs 1.515 (non-border) | Partial support |
+| H2: Border districts have weaker pipeline outcomes | Descriptives: 0.130 vs 0.098 violations/inspection, 145.2 vs 122.8 days, 0.543 vs 0.596 resolution; RQ1 `border_district -> violations_per_inspection` = 0.0434, p = 0.0949 | Supported descriptively, mixed regression support |
+| H3: Border-specific post-2019 level shift | RQ2 `post_2019:border -> avg_days_to_enforcement` = -74.5893, p = 0.0156; inspection intensity marginal (-0.1191, p = 0.0753); violations and resolution null | Supported for timing only |
+| H4: Border-specific post-2019 trend shift | RQ2 `post_trend:border` terms: inspection p = 0.8181, violations p = 0.8350, timing p = 0.9252, resolution p = 0.3404 | Not supported in baseline model |
+
+The hypothesis tests indicate the clearest inferential signal is a border-specific post-2019 timing level shift, consistent with "faster pipeline, not wider pipeline."
+
+## Figure Callouts
+
+Figure 1 (group trends): `analysis/output_borderlands/border_vs_nonborder_trends.png`  
+Figure 2 (main timing figure with CI): `analysis/output_borderlands/money_plot_timing_border_prepost2019.png`
+
+Figure 2 uses district-year means with equal district weighting:
+$$
+\bar{Y}_{gt} = \frac{1}{n_{gt}} \sum_{d \in g} Y_{dt}, \quad
+CI_{95\%} = \bar{Y}_{gt} \pm 1.96 \cdot \frac{s_{gt}}{\sqrt{n_{gt}}}
+$$
+
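The group mean and normal-approximation interval above can be sketched as follows. This assumes $s_{gt}$ is the sample standard deviation across districts in group $g$ and year $t$; the function name and example values are hypothetical.

```python
import math
import statistics


def group_mean_ci(values, z=1.96):
    """Equal-weighted group mean with a normal-approximation 95% interval."""
    n = len(values)
    mean = statistics.fmean(values)
    if n < 2:
        # A standard deviation needs at least two districts.
        return mean, None, None
    se = statistics.stdev(values) / math.sqrt(n)  # sample SD over sqrt(n)
    return mean, mean - z * se, mean + z * se


# Made-up days-to-enforcement values for four districts in one group-year.
mean, lower, upper = group_mean_ci([100.0, 120.0, 140.0, 160.0])
```

With only a handful of districts per group-year, the 1.96 multiplier is an approximation; a t-based multiplier would give a wider interval.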
+# Discussion
+
+The findings are consistent with a transparency-throughput mechanism: disclosure-era pressure appears to accelerate processing where baseline constraints are stronger, but this does not map cleanly to expansion of enforcement reach or follow-through. The strongest claim supported by this design is "faster pipeline, not wider pipeline."
+
+The contribution is a boundary condition argument: transparency reforms can produce uneven administrative effects across territorial governance contexts, with timing responsiveness exceeding capacity expansion.
+
+The design does not identify interstate strategic competition. A full test in the style of Neil Woods would require district-year competitor stringency series and explicit enforcement-gap dynamics. That extension is the next step in the research agenda; the current analysis lays the groundwork by documenting border-specific enforcement gaps and their heterogeneous response to the disclosure reform.