# Texas Borderlands: Regulatory Enforcement Disparities

An empirical research project examining whether oil and gas regulatory enforcement in Texas differs systematically between border-proximate and interior districts, and whether a 2019 disclosure reform produced heterogeneous effects across those regions.

## Research Questions

- **RQ1:** Do border-exposed Texas Railroad Commission (RRC) districts differ from non-border districts in inspection intensity, violation detection, enforcement timing, and resolution rates?
- **RQ2:** Did the 2019 disclosure reform change enforcement outcomes differently in border districts versus non-border districts?

## Key Findings

| Outcome | Border Districts | Non-Border Districts |
|--------|-----------------|---------------------|
| Inspections per well | 1.329 | 1.515 |
| Violations per inspection | 0.130 | 0.098 |
| Days to enforcement | 145.2 | 122.8 |
| Resolution rate | 0.543 | 0.596 |

**Post-2019 reform effect:** Enforcement processing time in border districts shortened by about **75 days** (p = 0.016) relative to non-border districts, but inspection reach and resolution rates did not converge. Conclusion: *faster pipeline, not wider pipeline*.

## Project Structure

```
texas-borderlands/
├── analysis/
│   ├── borderlands.ipynb               # Main analysis notebook
│   ├── well_analyzer.py                # WellAnalyzer class (data loading + metrics)
│   └── output_borderlands/
│       ├── rq1_results.csv             # RQ1 regression results
│       ├── rq2_results.csv             # RQ2 FE interaction results
│       ├── district_year_panel_borderlands.csv
│       ├── border_vs_nonborder_trends.png
│       ├── money_plot_timing_border_prepost2019.png
│       ├── well_border_exposure_map.png
│       ├── continuous_exposure_results.csv
│       ├── cutoff_sensitivity_results.csv
│       └── border_type_split_results.csv
│
├── data/
│   ├── oil_gas_basin_shape/            # EIA TX shale basin boundaries
│   ├── shale_play_shape/               # EIA TX shale play delineations
│   ├── texas_county_shape/             # US Census TX county subdivisions (2025)
│   ├── texmex_shape/                   # US Census TX-MX international boundary (2023)
│   ├── competition_panel.csv
│   └── district_competitor_links.csv
│
├── intro_thoery_methods_analysis_results_discussion.md   # Full paper draft
├── appendix.md                         # Supplementary tables and robustness checks
└── requirements.txt
```

## Tech Stack

- **Python 3**
- **pandas / numpy**: data manipulation and panel construction
- **sqlalchemy / psycopg2**: PostgreSQL database access
- **geopandas / shapely**: geospatial analysis and border proximity measurement
- **scipy / statsmodels**: regression models (OLS, fixed effects)
- **libpysal / esda**: spatial econometrics
- **matplotlib / seaborn**: visualization
- **python-dotenv**: environment configuration

## Setup

### 1. Install dependencies

```bash
pip install -r requirements.txt
```

### 2. Configure the database connection

Create a `.env` file in the project root (or set environment variables directly):

```env
PGHOST=localhost
PGPORT=5432
PGUSER=postgres
PGPASSWORD=your_password
PGDATABASE=texas_data
```

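As a rough illustration of how these variables could be turned into a connection string, the sketch below assembles a SQLAlchemy-style PostgreSQL URL using only the standard library. The helper name `build_pg_url` and its fallback defaults are assumptions made for this example; the project's own loading logic in `well_analyzer.py` may differ.

```python
import os
from urllib.parse import quote_plus


def build_pg_url(env=None):
    """Assemble a PostgreSQL URL from the PG* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters such as "@" or ":" stay valid.
    user = quote_plus(env.get("PGUSER", "postgres"))
    password = quote_plus(env.get("PGPASSWORD", ""))
    # The fallback values below are assumptions that mirror the sample .env.
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "texas_data")
    # "postgresql+psycopg2" is SQLAlchemy's dialect+driver URL prefix.
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"


example_env = {
    "PGHOST": "localhost",
    "PGPORT": "5432",
    "PGUSER": "postgres",
    "PGPASSWORD": "p@ss",
    "PGDATABASE": "texas_data",
}
print(build_pg_url(example_env))
```

If `python-dotenv` is installed, calling `load_dotenv()` before `build_pg_url()` would populate `os.environ` from the `.env` file, and the resulting string could then be passed to `sqlalchemy.create_engine`.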
The database should have PostGIS enabled and contain the following tables:

| Table | Description |
|-------|-------------|
| `well_shape_tract` (or similar) | Wells with location and demographic enrichment |
| `inspections` | Inspection records with dates and district info |
| `violations` | Violation records with enforcement and resolution dates |

The `WellAnalyzer` class auto-detects the wells table name from a set of known aliases.

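The alias lookup could work roughly as sketched below. This is not the `WellAnalyzer` implementation: the function name `detect_wells_table` and every alias other than `well_shape_tract` are hypothetical, chosen only to show the idea of matching a priority-ordered list against the tables that actually exist.

```python
def detect_wells_table(existing_tables,
                       aliases=("well_shape_tract", "wells", "well_shape")):
    """Return the first alias present among existing_tables.

    Only "well_shape_tract" comes from the README; the other aliases
    are placeholders for illustration.
    """
    existing = {name.lower() for name in existing_tables}
    for alias in aliases:  # earlier aliases take priority
        if alias.lower() in existing:
            return alias
    raise LookupError(f"No wells table found; tried {list(aliases)}")


# Table names as they might be listed from the database schema.
print(detect_wells_table(["inspections", "violations", "well_shape_tract"]))
```

In practice the list of existing tables would come from the database itself, for example through SQLAlchemy's `inspect(engine).get_table_names()`.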
### 3. Run the analysis

Open and run the Jupyter notebook:

```bash
jupyter notebook analysis/borderlands.ipynb
```

Or use the `WellAnalyzer` class directly:

```python
from analysis.well_analyzer import WellAnalyzer

analyzer = WellAnalyzer()
analyzer.print_analysis()
analyzer.export_analysis("output.json")
```

## Data

### Primary Data (PostgreSQL)

- **~1.01M wells** with geospatial coordinates and demographic/census tract enrichment
- **~1.87M inspections** (2015–2025)
- **~191.7K violations** (2015–2025)
- **District-year panel:** 143 observations (13 RRC districts × 11 years)

### Shapefiles

| File | Source | Purpose |
|------|--------|---------|
| `texmex_shape/` | US Census Bureau (2023) | TX-MX border geometry for proximity calculations |
| `texas_county_shape/` | US Census Bureau (2025) | State and county boundaries |
| `oil_gas_basin_shape/` | US EIA | Texas shale basin delineations |
| `shale_play_shape/` | US EIA | Texas shale play delineations |

### Border Exposure Definitions

- **District-level:** Binary flag: district centroid or wells within 50 km of any state or international border
- **Well-level:** Binary flags at 25 km and 50 km buffers from the TX-Mexico border
- **Border subtypes:** TX-MX, TX-NM, TX-OK, TX-LA

Border-exposed wells (50 km buffer): **169,520** of 1,010,432 total (about 16.8%).

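The well-level flags above reduce to a threshold test once each well's distance to the border is known. The sketch below assumes those distances (in km) have already been computed, for instance with geopandas in a projected CRS. The function names and the inclusive `<=` boundary are assumptions for illustration, and the sample distances are made up.

```python
def exposure_flags(distance_km, cutoffs_km=(25, 50)):
    """Return {cutoff: bool} buffer flags for one well (hypothetical helper)."""
    # A well exactly on the buffer edge counts as exposed (assumed convention).
    return {cutoff: distance_km <= cutoff for cutoff in cutoffs_km}


def exposure_share(distances_km, cutoff_km):
    """Share of wells whose border distance is within cutoff_km."""
    flagged = sum(1 for d in distances_km if d <= cutoff_km)
    return flagged / len(distances_km)


# Made-up distances for four wells, in km.
distances = [3.2, 24.9, 40.0, 120.5]
flags = [exposure_flags(d) for d in distances]
share_50 = exposure_share(distances, 50)

# Share implied by the well counts reported in this README.
reported_share = 169_520 / 1_010_432

print(flags)
print(share_50, round(reported_share, 3))
```

Passing a different `cutoffs_km` tuple, such as `(25, 75, 100)`, gives the flags needed for a cutoff sensitivity check without changing the function.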
## Empirical Design

**Unit of analysis:** Texas RRC district × year (2015–2025)

**Outcome variables:**
- Inspection intensity (inspections per well)
- Violation rate (violations per inspection)
- Days to enforcement action
- Resolution rate (compliance on reinspection)

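A minimal sketch of how the four outcomes could be computed for a single district-year cell follows. The class, field, and function names are hypothetical. The choice of violations as the denominator for the resolution rate, and of violation date to enforcement date as the timing gap, are assumptions rather than the project's documented definitions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DistrictYearCounts:
    """Raw counts for one district-year cell (hypothetical structure)."""
    wells: int
    inspections: int
    violations: int
    resolved: int  # violations found compliant on reinspection


def outcome_rates(c):
    """Ratio outcomes; None when a denominator is zero."""
    return {
        "inspections_per_well": c.inspections / c.wells if c.wells else None,
        "violations_per_inspection": (
            c.violations / c.inspections if c.inspections else None
        ),
        # Assumption: resolution rate = resolved violations / all violations.
        "resolution_rate": c.resolved / c.violations if c.violations else None,
    }


def mean_days_to_enforcement(date_pairs):
    """Mean gap in days over (violation_date, enforcement_date) pairs."""
    gaps = [(enforced - found).days
            for found, enforced in date_pairs if enforced is not None]
    return sum(gaps) / len(gaps) if gaps else None


cell = DistrictYearCounts(wells=1000, inspections=1500, violations=150, resolved=90)
rates = outcome_rates(cell)
mean_gap = mean_days_to_enforcement([
    (date(2020, 1, 1), date(2020, 1, 31)),
    (date(2020, 3, 1), date(2020, 3, 11)),
    (date(2020, 5, 1), None),  # no enforcement action yet, so skipped
])
print(rates, mean_gap)
```

Returning `None` for empty denominators keeps sparse district-years from raising errors and lets them be dropped or flagged when the panel is assembled.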
**RQ1 (levels model):**
```
Y_{dt} = α + β·Border_d + γ·X_{dt} + ε_{dt}
```

**RQ2 (fixed effects interaction model):**
```
Y_{dt} = α_d + δ_t + β·(Post2019_t × Border_d) + γ·X_{dt} + ε_{dt}
```

**Robustness checks:** Border-type splitting, continuous exposure shares, and cutoff sensitivity (25/75/100 km thresholds).

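To make the RQ2 specification concrete, the sketch below estimates β with the two-way within transformation: each variable is demeaned by district and by year, and the transformed outcome is then regressed on the transformed interaction. It reads α_d and δ_t as district and year fixed effects, omits the controls X_{dt} and any standard errors, and is valid in this simple form only for a balanced panel. The function name, the synthetic data, and the `year >= 2019` coding of the post period are all assumptions. The project itself presumably estimates the model with statsmodels.

```python
from collections import defaultdict


def twfe_interaction(rows):
    """Estimate beta in Y = a_d + g_t + beta * treat + e on a balanced panel.

    rows: (district, year, y, treat) tuples, where treat = Post2019_t * Border_d.
    """
    rows = list(rows)

    def double_demean(index):
        # Subtract district and year means, then add back the grand mean.
        by_d, by_t = defaultdict(list), defaultdict(list)
        for r in rows:
            by_d[r[0]].append(r[index])
            by_t[r[1]].append(r[index])
        mean_d = {d: sum(v) / len(v) for d, v in by_d.items()}
        mean_t = {t: sum(v) / len(v) for t, v in by_t.items()}
        grand = sum(r[index] for r in rows) / len(rows)
        return [r[index] - mean_d[r[0]] - mean_t[r[1]] + grand for r in rows]

    y_tilde = double_demean(2)
    d_tilde = double_demean(3)
    # OLS slope of the demeaned outcome on the demeaned interaction term.
    return (sum(y * d for y, d in zip(y_tilde, d_tilde))
            / sum(d * d for d in d_tilde))


# Synthetic, noise-free panel: NOT project data. Effect size set to -75 days.
border = {"A": 1, "B": 1, "C": 0, "D": 0}
district_effect = {"A": 150.0, "B": 140.0, "C": 120.0, "D": 125.0}
panel = []
for district, is_border in border.items():
    for year in range(2015, 2026):
        treat = is_border * (1 if year >= 2019 else 0)  # assumed post coding
        y = district_effect[district] + 0.5 * (year - 2015) - 75.0 * treat
        panel.append((district, year, y, treat))

beta_hat = twfe_interaction(panel)
print(beta_hat)
```

Because the synthetic outcome contains no noise, the estimate recovers the built-in effect of -75 up to floating-point error, which serves as a quick check that the demeaning is implemented correctly.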
## Documentation

- `intro_thoery_methods_analysis_results_discussion.md`: full paper draft covering theory, methods, results, and discussion
- `appendix.md`: supplementary regression tables, robustness checks, and district profiles

## License

This project is for academic research purposes. Underlying data sources are public records from the Texas Railroad Commission and US federal agencies.