new appendix and writeup docs
BIN
appendix.docx
Normal file
Binary file not shown.
127
appendix.md
Normal file
@@ -0,0 +1,127 @@
# Appendix

## A1. Data Construction and Scope

- Analysis window: 2015-2025.
- District-year panel size: 143 rows (13 districts x 11 years).
- Primary identifiers: `api_norm`, `district`, `year`.
- Border geometry includes TX-MX plus TX-NM, TX-OK, TX-LA boundary proximity.
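
The panel dimensions above can be checked with a minimal sketch. It assumes the district codes are the 13 listed in the district profile table (A7) and that the panel is balanced; the function name `build_panel_index` is hypothetical and not taken from the notebook.

```python
from itertools import product

# Assumption: district codes copied from the A7 district profile table.
DISTRICTS = ["01", "02", "03", "04", "05", "06", "08",
             "09", "10", "6E", "7B", "7C", "8A"]
YEARS = range(2015, 2026)  # 2015-2025 inclusive


def build_panel_index(districts, years):
    """Return every (district, year) pair of a balanced panel."""
    return [(d, y) for d, y in product(districts, years)]


panel = build_panel_index(DISTRICTS, YEARS)
print(len(DISTRICTS), len(YEARS), len(panel))
```

If the real panel has missing district-years, the row count would fall below the 13 x 11 product, so this check is only a sanity test for the balanced case.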

### Build validation counts

| Metric | Value |
|---|---:|
| Wells loaded | 1,010,432 |
| Inspections (2015-2025) | 1,867,859 |
| Violations (2015-2025) | 191,762 |
| Border-exposed wells (within 50 km of any border) | 169,520 |
| Panel observations | 143 |
| Districts | 13 |

### Border subtype counts (50 km)

| Border subtype | Wells |
|---|---:|
| TX-MX | 40,339 |
| TX-NM | 81,567 |
| TX-OK | 19,643 |
| TX-LA | 29,675 |

## A2. Supplementary Equations

### Border-type FE interaction model

$$
Y_{dt} = \alpha_d + \gamma_t + \sum_k \lambda_k(Post2019_t \times Type_{kd}) + \sum_k \phi_k(PostTrend_t \times Type_{kd}) + \varepsilon_{dt}
$$
where $k \in \{\text{TX-MX}, \text{TX-NM}, \text{TX-OK}, \text{TX-LA}\}$.

### Continuous exposure FE interaction model

$$
Y_{dt} = \alpha_d + \gamma_t + \eta_1(Post2019_t \times ShareBorder_{dt}) + \eta_2(PostTrend_t \times ShareBorder_{dt}) + \varepsilon_{dt}
$$

### Cutoff-specific exposure

$$
ShareBorder^{(c)}_{dt}, \quad c \in \{25,75,100\}
$$
substituted into the same FE interaction framework.
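
As an illustration of the cutoff-specific share, the sketch below computes $ShareBorder^{(c)}$ for one district-year from a list of minimum border distances. The input layout (one distance in km per inspection) and the function name `share_within_cutoff` are assumptions made for the example, not the notebook's actual code.

```python
CUTOFFS_KM = (25, 75, 100)


def share_within_cutoff(min_distances_km, cutoff_km):
    """Share of inspections whose well lies within cutoff_km of any border.

    min_distances_km: hypothetical per-inspection minimum distance (km)
    to the nearest border segment for a single district-year.
    """
    if not min_distances_km:
        return None  # share is undefined when there are no inspections
    exposed = sum(d <= cutoff_km for d in min_distances_km)
    return exposed / len(min_distances_km)


# Made-up distances for one district-year, purely for illustration.
distances = [10.0, 30.0, 60.0, 90.0, 150.0]
shares = {c: share_within_cutoff(distances, c) for c in CUTOFFS_KM}
print(shares)
```

Because the cutoffs are nested, the share can only stay flat or rise as the cutoff widens, which is a quick consistency check on any recomputed exposure column.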

## A3. Main FE Interaction Table (RQ2)

| Outcome | post_2019 x border | p-value | post_trend x border | p-value | N |
|---|---:|---:|---:|---:|---:|
| inspection_intensity | -0.1191 | 0.0753 | -0.0052 | 0.8181 | 143 |
| violations_per_inspection | 0.0040 | 0.8881 | -0.0012 | 0.8350 | 143 |
| avg_days_to_enforcement | -74.5893 | 0.0156 | -1.1587 | 0.9252 | 143 |
| resolution_rate | 0.0404 | 0.4520 | -0.0186 | 0.3404 | 143 |

## A4. Border-Type Timing Interactions (Money Plot Companion)

| Term | Coefficient | p-value |
|---|---:|---:|
| post_2019:has_tx_mex | 4.0900 | 0.9062 |
| post_2019:has_tx_nm | -18.7442 | 0.6013 |
| post_2019:has_tx_ok | -14.2446 | 0.8134 |
| post_2019:has_tx_la | -43.6598 | 0.6415 |
| post_trend:has_tx_mex | -0.0148 | 0.9991 |
| post_trend:has_tx_nm | 22.9067 | 0.0189 |
| post_trend:has_tx_ok | -16.7188 | 0.0794 |
| post_trend:has_tx_la | 0.6415 | 0.9551 |

## A5. Continuous Exposure Results

| Family | Outcome | Term | Coef | p-value | N |
|---|---|---|---:|---:|---:|
| RQ1 levels continuous | inspection_intensity | share_border_exposed_insp | 0.2095 | 0.4757 | 143 |
| RQ1 levels continuous | avg_days_to_enforcement | share_border_exposed_insp | 103.4683 | 0.5710 | 143 |
| RQ1 levels continuous | violations_per_inspection | share_border_exposed_insp | -0.1585 | 0.0144 | 143 |
| RQ1 levels continuous | resolution_rate | share_border_exposed_insp | -0.0420 | 0.8619 | 143 |
| RQ2 FE continuous | avg_days_to_enforcement | post_2019:share_border_exposed_insp | -109.4067 | 0.4449 | 143 |
| RQ2 FE continuous | avg_days_to_enforcement | post_trend:share_border_exposed_insp | 13.9623 | 0.7415 | 143 |
| RQ2 FE continuous | resolution_rate | post_2019:share_border_exposed_insp | -0.0322 | 0.8163 | 143 |
| RQ2 FE continuous | resolution_rate | post_trend:share_border_exposed_insp | -0.0979 | 0.0423 | 143 |

## A6. Cutoff Sensitivity (Timing-Focused Terms)

| Cutoff (km) | Term | Coef | p-value | N |
|---:|---|---:|---:|---:|
| 25 | post_2019:share_border_25km | -101.9283 | 0.7010 | 143 |
| 75 | post_2019:share_border_75km | -75.6591 | 0.5116 | 143 |
| 100 | post_2019:share_border_100km | -4.4795 | 0.9474 | 143 |

## A7. District Border-Type Profile

| District | Wells | Dominant type | TX-MX share | TX-NM share | TX-OK share | TX-LA share |
|---|---:|---|---:|---:|---:|---:|
| 01 | 31,898 | TX-MX | 0.2313 | 0.0000 | 0.0000 | 0.0000 |
| 02 | 17,099 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 03 | 16,700 | TX-LA | 0.0000 | 0.0000 | 0.0000 | 0.1166 |
| 04 | 20,973 | TX-MX | 0.6384 | 0.0000 | 0.0000 | 0.0000 |
| 05 | 9,938 | TX-OK | 0.0000 | 0.0000 | 0.0022 | 0.0000 |
| 06 | 24,422 | TX-LA | 0.0000 | 0.0000 | 0.0293 | 0.5235 |
| 08 | 105,931 | TX-NM | 0.0001 | 0.1905 | 0.0000 | 0.0000 |
| 09 | 46,485 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 10 | 29,621 | TX-OK | 0.0000 | 0.0009 | 0.3020 | 0.0000 |
| 6E | 6,235 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 7B | 21,230 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 7C | 43,061 | NONE | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 8A | 42,005 | TX-NM | 0.0000 | 0.4182 | 0.0000 | 0.0000 |
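
The "Dominant type" column is consistent with a simple rule: take the border subtype with the largest share, and report `NONE` when every share is zero. The sketch below implements that rule as an assumption inferred from the table; the notebook's actual assignment logic is not shown in this appendix, and the function name `dominant_type` is hypothetical.

```python
def dominant_type(shares):
    """Return the border subtype with the largest share, or 'NONE'.

    shares: dict mapping subtype label -> share of wells within 50 km.
    Assumed rule (inferred from the table, not confirmed by the notebook):
    largest share wins; all-zero rows are labeled 'NONE'.
    """
    label, value = max(shares.items(), key=lambda kv: kv[1])
    return label if value > 0 else "NONE"


# Rows copied from the table above.
district_06 = {"TX-MX": 0.0, "TX-NM": 0.0, "TX-OK": 0.0293, "TX-LA": 0.5235}
district_02 = {"TX-MX": 0.0, "TX-NM": 0.0, "TX-OK": 0.0, "TX-LA": 0.0}

print(dominant_type(district_06), dominant_type(district_02))
```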

## A8. Figures and Output Artifacts

- Border exposure map: `analysis/output_borderlands/well_border_exposure_map.png`
- Border vs non-border trends: `analysis/output_borderlands/border_vs_nonborder_trends.png`
- Main timing figure: `analysis/output_borderlands/money_plot_timing_border_prepost2019.png`
- Timing CI table: `analysis/output_borderlands/money_plot_timing_ci_by_year.csv`

## A9. Optional Prior Artifact (Not Estimated in Current Causal Scope)

`analysis/output_borderlands/competition_asymmetry_results.csv` contains:

- `gap_pos` = 0.5837 (p < 0.001)
- `gap_neg` = 0.4993 (p = 0.0027)

These estimates are not part of the current notebook's identified model scope and should not be interpreted as a completed reaction-function test in this manuscript version.
BIN
intro_thoery_methods_analysis_results_discussion.docx
Normal file
Binary file not shown.
180
intro_thoery_methods_analysis_results_discussion.md
Normal file
@@ -0,0 +1,180 @@
# Introduction

Regulatory enforcement in borderlands jurisdictions is often expected to differ from interior jurisdictions due to administrative constraints, multi-jurisdictional exposure, and monitoring frictions. This manuscript analyzes Texas Railroad Commission district-year outcomes (2015-2025) to assess whether border-exposed districts show systematic enforcement gaps and whether those gaps changed after the 2019 disclosure reform.

The empirical design centers on two research questions from the notebook:

1. RQ1 (Border gaps): Do border-exposed Texas districts differ from non-border districts in enforcement intensity and pipeline outcomes?
2. RQ2 (Disclosure heterogeneity): Did the 2019 disclosure reform change enforcement outcomes differently in border districts versus non-border districts (level shift and post-policy trend differential)?

# Theory

We use a borderlands governance framing with two linked mechanisms: capacity asymmetry and transparency-throughput effects. The corresponding hypotheses are:

1. H1 (Border inspection gap): Border districts have lower inspection intensity than non-border districts.
2. H2 (Border pipeline disadvantage): Border districts show weaker enforcement pipeline outcomes (higher violations per inspection and/or slower timing and/or lower resolution rates).
3. H3 (Disclosure heterogeneity in levels): Post-2019 level shifts differ between border and non-border districts (`post_2019:border`).
4. H4 (Disclosure heterogeneity in trends): Post-2019 trend shifts differ between border and non-border districts (`post_trend:border`).

This yields a core empirical claim: post-2019 border effects should be strongest in enforcement timing rather than in inspection coverage or resolution outcomes.

# Methods

## Data and Unit of Analysis

- Unit: district-year.
- Coverage: 13 Texas RRC districts, 2015-2025.
- Source tables: `well_shape_tract`, `inspections`, `violations`.
- Sample in current run: 1,010,432 wells; 1,867,859 inspections; 191,762 violations; 143 district-year observations.

## Border Measurement: District Coding and Well Proximity

We use two complementary border constructions.

1. District-level baseline treatment (`border_district`): districts in the predefined border-adjacent set (`01`, `02`, `06`, `08`, `8A`, `09`, `10`) are coded 1; others are coded 0.
2. Well-level proximity treatment: each well is classified by spatial proximity to border segments, then rolled up to district-year exposure shares.

Well-level proximity was constructed from latitude/longitude and shapefiles as follows:

1. Texas-Mexico distance/flags from `WellAnalyzer` (`within_25km_texmex`, `within_50km_texmex`).
2. Additional state-border segments (TX-NM, TX-OK, TX-LA) built from Texas county boundary geometry and seed lines.
3. Distances computed in projected CRS (EPSG:5070), then threshold flags generated at 25 km and 50 km.
4. Composite exposure indicators created:
   - `within_50km_state_border_any`
   - `well_border_exposed` (1 if within 50 km of TX-MX or any TX-state border segment).
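
The distance-and-threshold step can be sketched in plain Python, assuming well and border coordinates are already projected to a metric CRS such as EPSG:5070 (so units are meters). This is an illustration only: the helper names, the flag keys, and the example coordinates are hypothetical, and the notebook presumably relies on a geospatial library rather than hand-written geometry.

```python
import math


def point_segment_distance_m(px, py, ax, ay, bx, by):
    """Planar distance (m) from point P to segment AB in projected coordinates."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment: A == B
    # Projection parameter of P onto the line AB, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def proximity_flags(distance_m, cutoffs_km=(25, 50)):
    """Return 0/1 flags for each cutoff; key names are hypothetical."""
    return {f"within_{c}km": int(distance_m <= c * 1000) for c in cutoffs_km}


# Hypothetical well 30 km north of an east-west border segment.
dist = point_segment_distance_m(0.0, 30_000.0, -100_000.0, 0.0, 100_000.0, 0.0)
flags = proximity_flags(dist)
print(dist, flags)
```

For a multi-segment border, the minimum of this distance over all segments would give the per-well distance that the 25 km and 50 km flags are based on.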

District-year well-proximity exposure is measured as:
$$
ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
$$
and an alternative district treatment is defined as `border_exposure_district = 1` when $ShareBorder_{dt} \ge 0.25$.
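
A minimal sketch of the share and the 0.25 threshold rule follows. The function names and the handling of district-years with zero inspections (returning `None`, which then maps to 0) are assumptions for illustration; the notebook may treat that edge case differently.

```python
def share_border(border_exposed_inspections, inspections):
    """ShareBorder_dt: exposed inspections divided by all inspections."""
    if inspections == 0:
        return None  # assumption: share is undefined with no inspections
    return border_exposed_inspections / inspections


def border_exposure_district(share, threshold=0.25):
    """Alternative binary treatment: 1 when the share is at least the threshold."""
    return int(share is not None and share >= threshold)


# Made-up counts for three district-years.
for exposed, total in [(30, 100), (25, 100), (10, 100)]:
    s = share_border(exposed, total)
    print(s, border_exposure_district(s))
```

Note that the threshold is inclusive, matching the $\ge$ in the definition, so a share of exactly 0.25 is coded as treated.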

## Outcomes

$$
InspectionIntensity_{dt} = \frac{Inspections_{dt}}{UniqueWells_{dt}}
$$
$$
ViolPerInsp_{dt} = \frac{Violations_{dt}}{Inspections_{dt}}
$$
$$
DaysToEnf_{dt} = \frac{1}{N_{dt}} \sum_{i=1}^{N_{dt}} (EnforcementDate_i - ViolationDiscoveryDate_i)
$$
$$
ResolutionRate_{dt} = \frac{CompliantOnReinspection_{dt}}{Violations_{dt}}
$$
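
The four outcome definitions can be illustrated for a single district-year. The sketch assumes counts are already aggregated and that enforcement timing comes as (discovery date, enforcement date) pairs; the function name, argument names, and example values are hypothetical.

```python
from datetime import date


def district_year_outcomes(inspections, unique_wells, violations,
                           compliant_on_reinspection, enforcement_pairs):
    """Compute the four outcomes for one district-year.

    enforcement_pairs: list of (discovery_date, enforcement_date) tuples,
    one per violation that reached enforcement (assumed input layout).
    """
    days = [(enforced - discovered).days
            for discovered, enforced in enforcement_pairs]
    return {
        "inspection_intensity": inspections / unique_wells,
        "violations_per_inspection": violations / inspections,
        "avg_days_to_enforcement": sum(days) / len(days) if days else None,
        "resolution_rate": (compliant_on_reinspection / violations
                            if violations else None),
    }


# Made-up inputs for illustration only.
outcomes = district_year_outcomes(
    inspections=150,
    unique_wells=100,
    violations=15,
    compliant_on_reinspection=9,
    enforcement_pairs=[
        (date(2020, 1, 1), date(2020, 4, 10)),   # 100 days
        (date(2020, 6, 1), date(2020, 6, 21)),   # 20 days
    ],
)
print(outcomes)
```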

## Exposure Definitions

- Baseline treatment: `border_district` (binary district border status).
- Additional robustness exposures:

1. Border-type indicators (`TX-MX`, `TX-NM`, `TX-OK`, `TX-LA`)
2. Continuous exposure share:

$$
ShareBorder_{dt} = \frac{BorderExposedInspections_{dt}}{Inspections_{dt}}
$$
3. Cutoff sensitivity with 25/75/100 km thresholds.

## Estimating Equations

RQ1 levels:
$$
Y_{dt} = \alpha + \beta_1 Border_d + \beta_2 \log(UniqueWells_{dt}) + \gamma_t + \varepsilon_{dt}
$$

RQ2 FE interaction:
$$
Y_{dt} = \alpha_d + \gamma_t + \theta_1(Post2019_t \times Border_d) + \theta_2(PostTrend_t \times Border_d) + \varepsilon_{dt}
$$
$$
Post2019_t = \mathbb{1}[t \ge 2019], \quad PostTrend_t = \max(0, t-2019)
$$

Inference uses district-clustered standard errors (13 clusters), with emphasis on effect size and consistency across specifications.
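
To make the timing terms concrete, the sketch below builds $Post2019_t$ and $PostTrend_t$ and evaluates the implied border differential $\theta_1 Post2019_t + \theta_2 PostTrend_t$ for a given year. The coefficient values are the reported `avg_days_to_enforcement` point estimates; the function names are hypothetical, and the calculation ignores sampling uncertainty (the trend term in particular is far from conventional significance, p = 0.9252).

```python
def post_2019(year):
    """Indicator for the post-reform period: 1 if year >= 2019, else 0."""
    return int(year >= 2019)


def post_trend(year):
    """Years elapsed since 2019, floored at zero before the reform."""
    return max(0, year - 2019)


def implied_border_differential(year, theta1, theta2):
    """Point estimate of the border-specific shift implied by the RQ2 model."""
    return theta1 * post_2019(year) + theta2 * post_trend(year)


# Reported point estimates for avg_days_to_enforcement.
THETA1, THETA2 = -74.5893, -1.1587

for year in (2018, 2019, 2021):
    print(year, round(implied_border_differential(year, THETA1, THETA2), 4))
```

The differential is zero before 2019 by construction, equals $\theta_1$ in 2019, and then moves by $\theta_2$ per additional year.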

## Tests Run in Notebook

The notebook estimated the following test families:

1. Descriptive border-gap tests:
   - Border vs non-border means for inspection intensity, violations per inspection, days to enforcement, and resolution rate.
2. RQ1 levels regressions (border gaps):
   - Outcomes: `inspection_intensity`, `violations_per_inspection`.
   - Specification: `border_district + log_unique_wells + C(year)`.
3. RQ2 FE interaction regressions (post-2019 heterogeneity):
   - Outcomes: `inspection_intensity`, `violations_per_inspection`, `avg_days_to_enforcement`, `resolution_rate`.
   - Specification: `C(district) + C(year) + post_2019:border_district + post_trend:border_district`.
4. Border-type robustness tests:
   - District profiles for `TX-MX`, `TX-NM`, `TX-OK`, `TX-LA` exposure.
   - RQ1-style levels with `has_tx_*` indicators.
   - RQ2-style FE interactions with `post_2019:has_tx_*` and `post_trend:has_tx_*`.
5. Continuous-exposure robustness tests:
   - Replace binary border indicator with `share_border_exposed_insp` in both RQ1-style and RQ2-style specifications.
6. Cutoff-sensitivity tests:
   - Recompute proximity exposure from minimum distance to any border at 25 km, 75 km, and 100 km.
   - Estimate RQ1-style models for inspection intensity and RQ2-style timing interaction models.
7. Visualization and reporting tests:
   - Border/non-border trend plots.
   - Main timing figure with district-year group means and 95% confidence intervals.
8. Competition/reaction-function scaffolding (not estimated as causal model):
   - District-to-competitor jurisdiction link table and template generated for future interstate stringency integration.

# Analysis

## Descriptive Border Gaps

| Outcome | Non-border | Border |
|---|---:|---:|
| Inspection intensity | 1.515 | 1.329 |
| Violations per inspection | 0.098 | 0.130 |
| Mean days to enforcement | 122.8 | 145.2 |
| Mean resolution rate | 0.596 | 0.543 |

Descriptively, border districts show weaker enforcement conditions on all four measures: lower inspection intensity, more violations per inspection, slower enforcement, and lower resolution rates.

## Main Regression Evidence

| Model | Coefficient | p-value | N |
|---|---:|---:|---:|
| RQ1: `border_district` on `inspection_intensity` | -0.1755 | 0.0999 | 143 |
| RQ1: `border_district` on `violations_per_inspection` | 0.0434 | 0.0949 | 143 |
| RQ2: `post_2019:border` on `inspection_intensity` | -0.1191 | 0.0753 | 143 |
| RQ2: `post_2019:border` on `violations_per_inspection` | 0.0040 | 0.8881 | 143 |
| RQ2: `post_2019:border` on `avg_days_to_enforcement` | -74.5893 | 0.0156 | 143 |
| RQ2: `post_2019:border` on `resolution_rate` | 0.0404 | 0.4520 | 143 |

The most stable differential post-2019 effect is a border-specific improvement in enforcement timing.

# Results

## Hypothesis Tests

| Hypothesis | Test evidence | Decision (current run) |
|---|---|---|
| H1: Border districts have lower inspection intensity | RQ1: `border_district -> inspection_intensity` = -0.1755, p = 0.0999; descriptives 1.329 (border) vs 1.515 (non-border) | Partial support |
| H2: Border districts have weaker pipeline outcomes | Descriptives: 0.130 vs 0.098 violations/inspection, 145.2 vs 122.8 days, 0.543 vs 0.596 resolution; RQ1 `border_district -> violations_per_inspection` = 0.0434, p = 0.0949 | Supported descriptively, mixed regression support |
| H3: Border-specific post-2019 level shift | RQ2 `post_2019:border -> avg_days_to_enforcement` = -74.5893, p = 0.0156; other outcomes null | Supported for timing only |
| H4: Border-specific post-2019 trend shift | RQ2 `post_trend:border` terms: inspection p = 0.8181, violations p = 0.8350, timing p = 0.9252, resolution p = 0.3404 | Not supported in baseline model |

The hypothesis tests indicate the clearest inferential signal is a border-specific post-2019 timing level shift, consistent with "faster pipeline, not wider pipeline."

## Figure Callouts

- Figure 1 (group trends): `analysis/output_borderlands/border_vs_nonborder_trends.png`
- Figure 2 (main timing figure with CI): `analysis/output_borderlands/money_plot_timing_border_prepost2019.png`

Figure 2 uses district-year means with equal district weighting:
$$
\bar{Y}_{gt} = \frac{1}{n_{gt}} \sum_{d \in g} Y_{dt}, \quad
CI_{95\%} = \bar{Y}_{gt} \pm 1.96 \cdot \frac{s_{gt}}{\sqrt{n_{gt}}}
$$
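
The group mean and interval can be reproduced with the standard library. The sketch assumes $s_{gt}$ is the sample standard deviation across the districts in group $g$ in year $t$ (an assumption, since the text does not say which estimator is used); the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def group_mean_ci(values, z=1.96):
    """Equal-weighted group mean with a normal-approximation 95% interval.

    values: one outcome value per district in the group for a given year.
    Assumes s_gt is the sample (n - 1) standard deviation.
    """
    n = len(values)
    m = mean(values)
    if n < 2:
        return m, None, None  # no interval from a single district
    half_width = z * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width


# Made-up days-to-enforcement values for four districts in one year.
center, lower, upper = group_mean_ci([100, 120, 140, 160])
print(center, round(lower, 2), round(upper, 2))
```

With only a handful of districts per group, the 1.96 multiplier is a rough approximation, so intervals of this kind are best read as descriptive bands rather than formal tests.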

# Discussion

The findings are consistent with a transparency-throughput mechanism: disclosure-era pressure appears to accelerate processing where baseline constraints are stronger, but this does not map cleanly to expansion of enforcement reach or follow-through. The strongest claim supported by this design is "faster pipeline, not wider pipeline."

The contribution is a boundary condition argument: transparency reforms can produce uneven administrative effects across territorial governance contexts, with timing responsiveness exceeding capacity expansion.

The design does not identify interstate strategic competition. A full Neal Woods-style test requires district-year competitor stringency series and explicit enforcement-gap dynamics. That test is left for future work; the current analysis lays the groundwork by documenting border-specific enforcement gaps and their heterogeneous response to the disclosure reform.