texas-borderlands/appendix.md
2026-02-21 09:32:01 -08:00


# Appendix
## A1. Data Construction and Scope
- Analysis window: 2015-2025.
- District-year panel size: 143 rows (13 districts x 11 years).
- Primary identifiers: `api_norm`, `district`, `year`.
- Border geometry covers proximity to the TX-MX boundary and to the TX-NM, TX-OK, and TX-LA state boundaries.
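
The panel dimensions above can be checked with a short sketch. It assumes a balanced panel (every district observed in every year) and takes the district codes from the A7 profile table; `build_panel_skeleton` is a hypothetical helper name, not a function from the notebook.

```python
from itertools import product

# District codes as listed in the A7 profile table; years span the analysis window.
DISTRICTS = ["01", "02", "03", "04", "05", "06", "08", "09", "10", "6E", "7B", "7C", "8A"]
YEARS = range(2015, 2026)  # 2015 through 2025 inclusive

def build_panel_skeleton(districts, years):
    """Return one dict per district-year cell of a balanced panel."""
    return [{"district": d, "year": y} for d, y in product(districts, years)]

panel = build_panel_skeleton(DISTRICTS, YEARS)
print(len(panel))  # 13 districts x 11 years
```

Keying on `district` and `year` mirrors the primary identifiers listed above; outcome columns would be merged onto this skeleton.
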
### Build validation counts
| Metric | Value |
|---|---:|
| Wells loaded | 1,010,432 |
| Inspections (2015-2025) | 1,867,859 |
| Violations (2015-2025) | 191,762 |
| Border-exposed wells (within 50 km of any border) | 169,520 |
| Panel observations | 143 |
| Districts | 13 |
### Border subtype counts (50 km)
| Border subtype | Wells |
|---|---:|
| TX-MX | 40,339 |
| TX-NM | 81,567 |
| TX-OK | 19,643 |
| TX-LA | 29,675 |
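
The four subtype counts sum to more than the 169,520 border-exposed wells reported in the build validation table. A minimal arithmetic check follows, assuming that figure is the union of the four 50 km subtype flags, so that a well near two border types is counted once in the union but once under each subtype.

```python
# Counts copied from the A1 tables.
subtype_wells = {"TX-MX": 40_339, "TX-NM": 81_567, "TX-OK": 19_643, "TX-LA": 29_675}
any_border_wells = 169_520

subtype_total = sum(subtype_wells.values())

# Inclusion-exclusion: a well flagged under k subtypes contributes k - 1 excess flags.
excess_flags = subtype_total - any_border_wells

print(subtype_total)  # 171224
print(excess_flags)   # 1704
```

Under that assumption, the 1,704 excess flags are accounted for by wells lying within 50 km of more than one border type.
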
## A2. Supplementary Equations
### Border-type FE interaction model
$$
Y_{dt} = \alpha_d + \gamma_t + \sum_k \lambda_k(Post2019_t \times Type_{kd}) + \sum_k \phi_k(PostTrend_t \times Type_{kd}) + \varepsilon_{dt}
$$
where $k \in \{\text{TX-MX}, \text{TX-NM}, \text{TX-OK}, \text{TX-LA}\}$ and $Type_{kd}$ corresponds to the district-level `has_tx_*` indicators.
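
As an illustration, the right-hand side of this model can be written as a formula string in the same style as the specifications listed in A10. The column names follow the terms reported in A4; `border_type_fe_formula` is a hypothetical helper, and the outcome passed in is only an example.

```python
# Indicator columns named as in the A4 table terms.
BORDER_TYPES = ["has_tx_mex", "has_tx_nm", "has_tx_ok", "has_tx_la"]

def border_type_fe_formula(outcome, type_columns=BORDER_TYPES):
    """Build a formula string for the border-type FE interaction model."""
    step_terms = [f"post_2019:{col}" for col in type_columns]    # lambda_k terms
    trend_terms = [f"post_trend:{col}" for col in type_columns]  # phi_k terms
    rhs = ["C(district)", "C(year)"] + step_terms + trend_terms
    return f"{outcome} ~ " + " + ".join(rhs)

formula = border_type_fe_formula("avg_days_to_enforcement")
print(formula)
```

A string of this shape could be passed to a formula-based estimator such as `statsmodels.formula.api.ols`; the estimator and standard-error choices actually used are not stated in this appendix.
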
### Continuous exposure FE interaction model
$$
Y_{dt} = \alpha_d + \gamma_t + \eta_1(Post2019_t \times ShareBorder_{dt}) + \eta_2(PostTrend_t \times ShareBorder_{dt}) + \varepsilon_{dt}
$$
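
A minimal sketch of the two interaction regressors for a single district-year cell. The appendix does not define $Post2019_t$ or $PostTrend_t$, so the definitions below (a step that turns on in 2019 and a count of years elapsed since 2019) are assumptions made for illustration only; `timing_terms` is a hypothetical name.

```python
def timing_terms(year, share_border, policy_year=2019):
    """Return the two interaction regressors for one district-year cell.

    ASSUMPTION: post_2019 is 1 from policy_year onward, and post_trend counts
    years elapsed since policy_year (0 before it). Neither definition is
    given in the appendix.
    """
    post_2019 = 1 if year >= policy_year else 0
    post_trend = max(0, year - policy_year)
    return {
        "post_2019_x_share": post_2019 * share_border,    # multiplies eta_1
        "post_trend_x_share": post_trend * share_border,  # multiplies eta_2
    }

print(timing_terms(2017, 0.4))
print(timing_terms(2021, 0.5))
```

If the notebook defines the post period or the trend origin differently, only the two assignment lines need to change.
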
### Cutoff-specific exposure
$$
ShareBorder^{(c)}_{dt}, \quad c \in \{25,75,100\}
$$
Each cutoff-specific share replaces $ShareBorder_{dt}$ in the continuous exposure FE interaction model above.
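
One way to compute the cutoff-specific shares is sketched below. It assumes each inspection record carries the inspected well's minimum distance to any border in kilometers and that "within" means less than or equal to the cutoff; the tuple layout and function name are hypothetical, while the output column names follow the terms in A6.

```python
from collections import defaultdict

CUTOFFS_KM = (25, 75, 100)

def share_border_by_cutoff(inspections, cutoffs_km=CUTOFFS_KM):
    """Share of inspections within each cutoff, per district-year.

    `inspections` is an iterable of (district, year, min_border_km) tuples
    (hypothetical input layout).
    """
    totals = defaultdict(int)
    exposed = defaultdict(lambda: {c: 0 for c in cutoffs_km})
    for district, year, min_border_km in inspections:
        key = (district, year)
        totals[key] += 1
        for c in cutoffs_km:
            if min_border_km <= c:  # ASSUMPTION: boundary distance counts as exposed
                exposed[key][c] += 1
    return {
        key: {f"share_border_{c}km": exposed[key][c] / n for c in cutoffs_km}
        for key, n in totals.items()
    }

# Toy records for one district-year, not real data.
toy = [("04", 2020, 10.0), ("04", 2020, 60.0), ("04", 2020, 90.0), ("04", 2020, 150.0)]
shares = share_border_by_cutoff(toy)
print(shares[("04", 2020)])
```

Because the cutoffs are nested, the shares are non-decreasing in the cutoff distance for any district-year.
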
## A3. Main FE Interaction Table (RQ2)
| Outcome | post_2019 x border (coef.) | p-value | post_trend x border (coef.) | p-value | N |
|---|---:|---:|---:|---:|---:|
| inspection_intensity | -0.1191 | 0.0753 | -0.0052 | 0.8181 | 143 |
| violations_per_inspection | 0.0040 | 0.8881 | -0.0012 | 0.8350 | 143 |
| avg_days_to_enforcement | -74.5893 | 0.0156 | -1.1587 | 0.9252 | 143 |
| resolution_rate | 0.0404 | 0.4520 | -0.0186 | 0.3404 | 143 |
## A4. Border-Type Timing Interactions (Money Plot Companion)
| Term | Coefficient | p-value |
|---|---:|---:|
| post_2019:has_tx_mex | 4.0900 | 0.9062 |
| post_2019:has_tx_nm | -18.7442 | 0.6013 |
| post_2019:has_tx_ok | -14.2446 | 0.8134 |
| post_2019:has_tx_la | -43.6598 | 0.6415 |
| post_trend:has_tx_mex | -0.0148 | 0.9991 |
| post_trend:has_tx_nm | 22.9067 | 0.0189 |
| post_trend:has_tx_ok | -16.7188 | 0.0794 |
| post_trend:has_tx_la | 0.6415 | 0.9551 |
## A5. Continuous Exposure Results
| Family | Outcome | Term | Coef | p-value | N |
|---|---|---|---:|---:|---:|
| RQ1 levels continuous | inspection_intensity | share_border_exposed_insp | 0.2095 | 0.4757 | 143 |
| RQ1 levels continuous | avg_days_to_enforcement | share_border_exposed_insp | 103.4683 | 0.5710 | 143 |
| RQ1 levels continuous | violations_per_inspection | share_border_exposed_insp | -0.1585 | 0.0144 | 143 |
| RQ1 levels continuous | resolution_rate | share_border_exposed_insp | -0.0420 | 0.8619 | 143 |
| RQ2 FE continuous | avg_days_to_enforcement | post_2019:share_border_exposed_insp | -109.4067 | 0.4449 | 143 |
| RQ2 FE continuous | avg_days_to_enforcement | post_trend:share_border_exposed_insp | 13.9623 | 0.7415 | 143 |
| RQ2 FE continuous | resolution_rate | post_2019:share_border_exposed_insp | -0.0322 | 0.8163 | 143 |
| RQ2 FE continuous | resolution_rate | post_trend:share_border_exposed_insp | -0.0979 | 0.0423 | 143 |
## A6. Cutoff Sensitivity (Timing-Focused Terms)
| Cutoff km | Term | Coef | p-value | N |
|---:|---|---:|---:|---:|
| 25 | post_2019:share_border_25km | -101.9283 | 0.7010 | 143 |
| 75 | post_2019:share_border_75km | -75.6591 | 0.5116 | 143 |
| 100 | post_2019:share_border_100km | -4.4795 | 0.9474 | 143 |
## A7. District Border-Type Profile
| District | Wells | Dominant type | TX-MX share | TX-NM share | TX-OK share | TX-LA share |
|---|---:|---|---:|---:|---:|---:|
| 01 | 31,898 | TX-MX | 0.2313 | 0.0000 | 0.0000 | 0.0000 |
| 02 | 17,099 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 03 | 16,700 | TX-LA | 0.0000 | 0.0000 | 0.0000 | 0.1166 |
| 04 | 20,973 | TX-MX | 0.6384 | 0.0000 | 0.0000 | 0.0000 |
| 05 | 9,938 | TX-OK | 0.0000 | 0.0000 | 0.0022 | 0.0000 |
| 06 | 24,422 | TX-LA | 0.0000 | 0.0000 | 0.0293 | 0.5235 |
| 08 | 105,931 | TX-NM | 0.0001 | 0.1905 | 0.0000 | 0.0000 |
| 09 | 46,485 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 10 | 29,621 | TX-OK | 0.0000 | 0.0009 | 0.3020 | 0.0000 |
| 6E | 6,235 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 7B | 21,230 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 7C | 43,061 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 8A | 42,005 | TX-NM | 0.0000 | 0.4182 | 0.0000 | 0.0000 |
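
The "Dominant type" column is consistent with taking the border type with the largest share and reporting `NONE` when every share is zero. The sketch below encodes that rule as an assumption inferred from the table, not as the notebook's confirmed logic; `dominant_border_type` is a hypothetical name.

```python
def dominant_border_type(shares):
    """Return the border type with the largest share, or "NONE" if all are zero."""
    best_type = max(shares, key=shares.get)
    return best_type if shares[best_type] > 0 else "NONE"

# Rows copied from the A7 table.
district_06 = {"TX-MX": 0.0, "TX-NM": 0.0, "TX-OK": 0.0293, "TX-LA": 0.5235}
district_08 = {"TX-MX": 0.0001, "TX-NM": 0.1905, "TX-OK": 0.0, "TX-LA": 0.0}
district_02 = {"TX-MX": 0.0, "TX-NM": 0.0, "TX-OK": 0.0, "TX-LA": 0.0}

print(dominant_border_type(district_06))  # TX-LA
print(dominant_border_type(district_08))  # TX-NM
print(dominant_border_type(district_02))  # NONE
```

Districts 06, 08, and 10 each have small nonzero shares for a second border type, so the dominant label does not imply exclusive exposure to one border.
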
## A8. Figures and Output Artifacts
- Border exposure map: `analysis/output_borderlands/well_border_exposure_map.png`
- Border vs non-border trends: `analysis/output_borderlands/border_vs_nonborder_trends.png`
- Main timing figure: `analysis/output_borderlands/money_plot_timing_border_prepost2019.png`
- Timing CI table: `analysis/output_borderlands/money_plot_timing_ci_by_year.csv`
## A9. Optional Prior Artifact (Not Estimated in Current Causal Scope)
`analysis/output_borderlands/competition_asymmetry_results.csv` contains:
- `gap_pos` = 0.5837 (p < 0.001)
- `gap_neg` = 0.4993 (p = 0.0027)

These estimates fall outside the identified model scope of the current notebook and should not be read as a completed reaction-function test in this manuscript version.
## A10. Estimation Workflow and Test Inventory
### Border treatment construction
Two border definitions were used:
1. District-level baseline treatment (`border_district`), which codes districts `01`, `02`, `06`, `08`, `8A`, `09`, and `10` as border-adjacent.
2. Well-level proximity treatment, rolled up to district-year exposure.

Well-level proximity workflow:
1. Import TX-MX proximity flags from `WellAnalyzer` (`within_25km_texmex`, `within_50km_texmex`).
2. Build additional TX-NM, TX-OK, TX-LA border segments from county boundary geometry plus seed lines.
3. Compute distances in EPSG:5070 and generate threshold flags.
4. Build composite indicators:
   - `within_50km_state_border_any`
   - `well_border_exposed` (within 50 km of TX-MX or any TX-state border segment).
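
Steps 3 and 4 can be sketched in plain Python. EPSG:5070 is a projected CRS with coordinates in meters, so planar distances are meaningful; the sketch assumes well and border-segment coordinates are already projected. The function names and the `texmex`/`nm`/`ok`/`la` dictionary keys are hypothetical, and treating a distance of exactly 50 km as exposed is an assumption.

```python
import math

CUTOFF_KM = 50.0

def point_segment_distance_m(px, py, ax, ay, bx, by):
    """Planar distance in meters from point P to segment AB (projected coordinates)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def border_flags(dist_km_by_type):
    """Build the threshold flags and composite indicators for one well."""
    within_texmex = dist_km_by_type["texmex"] <= CUTOFF_KM
    within_state_any = any(dist_km_by_type[k] <= CUTOFF_KM for k in ("nm", "ok", "la"))
    return {
        "within_50km_texmex": within_texmex,
        "within_50km_state_border_any": within_state_any,
        "well_border_exposed": within_texmex or within_state_any,
    }

# Toy geometry: a well 30 km north of the midpoint of a 20 km border segment.
d_m = point_segment_distance_m(0.0, 30_000.0, -10_000.0, 0.0, 10_000.0, 0.0)
flags = border_flags({"texmex": d_m / 1000.0, "nm": 400.0, "ok": 600.0, "la": 700.0})
print(d_m, flags)
```

In practice the distance to a border type would be the minimum over all of that border's segments.
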

District-year exposure share:
$$
ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
$$
Alternative district coding:
$$
border\_exposure\_district_{dt} = \mathbb{1}[ShareBorder_{dt} \ge 0.25]
$$
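
The exposure share and the alternative district coding reduce to a ratio and a threshold test. A minimal sketch follows; the function names are hypothetical, and mapping a district-year with zero inspections to an indicator of 0 is an assumption, since the appendix does not say how empty cells are handled.

```python
def share_border(border_exposed_inspections, inspections):
    """ShareBorder_dt; returns None when the cell has no inspections."""
    if inspections == 0:
        return None
    return border_exposed_inspections / inspections

def border_exposure_district(share, threshold=0.25):
    """Indicator 1[ShareBorder_dt >= threshold].

    ASSUMPTION: an undefined share (no inspections) maps to 0.
    """
    return int(share is not None and share >= threshold)

# Toy counts, not real data.
s_high = share_border(30, 120)  # exactly at the 0.25 threshold
s_low = share_border(10, 120)
print(s_high, border_exposure_district(s_high))
print(s_low, border_exposure_district(s_low))
```

The inequality is weak, so a district-year with a share of exactly 0.25 is coded as exposed.
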
### Tests run
1. Descriptive border-gap comparisons by group means.
2. RQ1 levels regressions:
   - Outcomes: `inspection_intensity`, `violations_per_inspection`.
   - Model: `border_district + log_unique_wells + C(year)`.
3. RQ2 FE interaction regressions:
   - Outcomes: `inspection_intensity`, `violations_per_inspection`, `avg_days_to_enforcement`, `resolution_rate`.
   - Model: `C(district) + C(year) + post_2019:border_district + post_trend:border_district`.
4. Border-type robustness:
   - District border-type profile (`TX-MX`, `TX-NM`, `TX-OK`, `TX-LA`).
   - RQ1-style levels with `has_tx_*`.
   - RQ2-style FE interactions with `post_2019:has_tx_*` and `post_trend:has_tx_*`.
5. Continuous exposure robustness:
   - Replace binary border term with `share_border_exposed_insp` in RQ1-style and RQ2-style specifications.
6. Cutoff sensitivity robustness:
   - Recompute exposure using minimum distance to any border at 25/75/100 km.
   - Estimate RQ1-style inspection-intensity models and RQ2-style timing interaction models.
7. Figures/reporting:
   - Border vs non-border trend plots.
   - Main timing figure with district-year means and 95% CIs.
8. Reaction-function scaffolding (not estimated):
   - Create district-competitor link table and district-year competitor template for future data integration.