12 KiB
Database Schema Reference
Database: orphaned_wells
Engine: PostgreSQL 18 with PostGIS
Connection: psql -U postgres -h localhost -d orphaned_wells
Last updated: March 2026
Tables
wells
Primary data table. 117,672 documented unplugged orphaned wells across 27 U.S. states. Source: USGS DOW Dataset (Grove & Merrill 2022). Geometry loaded from shapefile, reprojected to EPSG:4326.
| Column | Type | Description |
|---|---|---|
gid |
integer | Primary key (auto) |
api_number |
varchar | 14-digit API well number (stripped of API: prefix) |
state |
varchar | Full state name (from USGS source data) |
county |
varchar | County name (from USGS source data) |
well_name |
varchar | Well name — typically operator + lease name |
well_number |
varchar | Order within lease/operator permit sequence |
type |
varchar | Raw well type as reported by state agency (119 distinct values) |
well_type_normalized |
text | Canonical type: Oil, Gas, Oil & Gas, Injection/Disposal, Dry/Exploratory, Water/Brine, Coalbed Methane, Enhanced Recovery, Gas Storage, Observation/Monitor, Other/Administrative, Unknown/Unspecified |
status |
varchar | Well status as reported by state agency |
latitude |
numeric | State-provided latitude (decimal degrees, NAD83/WGS84) |
longitude |
numeric | State-provided longitude (decimal degrees, NAD83/WGS84) |
principal_meridian |
varchar | PLSS principal meridian |
township |
numeric | PLSS township number |
t_dir |
varchar | Township direction (N/S) |
range |
numeric | PLSS range number |
r_dir |
varchar | Range direction (E/W) |
section |
numeric | PLSS section (1–36; some LA wells non-standard) |
qtr |
varchar | PLSS quarter section (¼ sq mi) |
qtr_qtr |
varchar | PLSS quarter-quarter section (1/16 sq mi) |
qtr_qtr_qtr |
varchar | PLSS quarter-quarter-quarter (1/64 sq mi) |
source |
varchar | State agency that provided the data |
data_file_date |
date | Date data was last updated by source agency |
well_info_notes |
varchar | Additional well information from source |
location_notes |
varchar | Location/coordinate methodology notes |
other_notes |
varchar | Other notes (often includes status date) |
geom |
geometry(Point, 4326) | PostGIS point in WGS84 |
state_fips |
char(2) | 2-digit Census FIPS state code |
tract_geoid |
char(11) | 11-digit Census tract FIPS (join key to ACS) |
tract_name |
text | Census tract label (e.g., "Census Tract 2048") |
county_fips |
char(3) | 3-digit Census county FIPS |
county_name |
text | County name from 2021 TIGER/Line |
state_usps |
char(2) | 2-letter USPS state abbreviation (from spatial join) |
tract_aland_m2 |
bigint | Tract land area in square meters |
tract_awater_m2 |
bigint | Tract water area in square meters |
Indexes: api_number, state, state_fips, state_usps, county_fips, tract_geoid, well_type_normalized, geom (GIST)
Note on well_type_normalized: 59.3% of wells have Unknown/Unspecified type because Ohio (20,557), Pennsylvania (19,160), Kentucky (12,695), Kansas (5,477), and several other states submitted data without type classification. See data-sources.md for details.
census_tracts
2021 TIGER/Line cartographic boundary file (1:500k). 85,230 tracts covering all 50 states, DC, and territories.
| Column | Type | Description |
|---|---|---|
gid |
integer | Primary key |
statefp |
varchar(2) | 2-digit state FIPS |
countyfp |
varchar(3) | 3-digit county FIPS |
tractce |
varchar(6) | 6-digit tract code |
affgeoid |
varchar(20) | Affinity GEOID |
geoid |
varchar(11) | 11-digit FIPS (state + county + tract) — ACS join key |
name |
varchar(100) | Tract number |
namelsad |
varchar(100) | Full tract name (e.g., "Census Tract 1042.01") |
stusps |
varchar(2) | 2-letter state postal abbreviation |
namelsadco |
varchar(100) | County name |
state_name |
varchar(100) | Full state name |
lsad |
varchar(2) | Legal/statistical area description code |
aland |
numeric | Land area in square meters |
awater |
numeric | Water area in square meters |
geom |
geometry(MultiPolygon, 4326) | PostGIS polygon in WGS84 |
Indexes: geoid, statefp, stusps, geom (GIST)
state_transition_offices
State-level just transition offices and programs. One row per office/program.
Coded by RA from Climate Policy Dashboard.
Populates the v_state_governance and v_ch4_state_analysis views.
| Column | Type | Description |
|---|---|---|
id |
serial | Primary key |
state |
char(2) | USPS state abbreviation |
state_name |
text | Full state name |
office_name |
text | Exact name of office/program; NA if none |
year_established |
integer | Year established; NULL = not stated |
target_text |
text | Verbatim description of who the office serves |
code_fossil |
smallint | 1 = explicitly mentions oil/gas/coal/mining/power plant workers; 0 = absent |
code_equity |
smallint | 1 = explicitly mentions equity/justice/disadvantaged/low-income/EJ; 0 = absent |
source_url |
text | URL to Climate Policy Dashboard entry |
date_collected |
date | Date RA retrieved data |
collected_by |
text | RA name |
notes |
text | Ambiguity, edge cases, interpretation notes |
Coding rules:
code_fossil = 1ONLY with explicit language about fossil fuel workers, not merely "economic transition."code_equity = 1ONLY with explicit EJ/disadvantaged/low-income language, not merely "communities."- States not on the dashboard: do not create a row unless instructed.
- States listed but with no program: one row with
office_name = NA, codes = 0.
state_prioritization
State orphaned well plugging prioritization schemes. One row per state. Coded by RA from IOGCC Prioritization Report (July 2023).
| Column | Type | Description |
|---|---|---|
id |
serial | Primary key |
state |
char(2) | USPS state abbreviation (unique) |
state_name |
text | Full state name |
system_type |
text | Brief description of scoring system (e.g., 1–100 score); NA if not stated |
tech_factors |
text | Semicolon-separated list of technical/physical factors |
code_rural_urban |
smallint | 1 = explicitly uses population density or urban/rural classification as scoring factor |
code_vuln |
smallint | 1 = explicitly uses EJ/disadvantaged/low-income/DAC as prioritization factor |
code_surface |
smallint | 1 = explicitly uses surface land use as prioritization factor |
pdf_page |
text | Page(s) in IOGCC report; NA if uncertain |
source_quote |
text | 1–2 key sentences justifying coding decisions |
source_url |
text | IOGCC PDF URL |
date_collected |
date | Date RA retrieved data |
collected_by |
text | RA name |
notes |
text | Missing-state issues, interpretation risks |
Critical coding distinctions:
code_rural_urban: "distance to nearest building" is NOT sufficient — requires explicit urban/rural or density language.code_vuln: "near homes/schools" is NOT sufficient — requires explicit EJ/disadvantaged/DAC language about community status.- States not in IOGCC report: one row,
system_type = NA, all codes = 0, explain in notes.
state_liability
State-level financial liability estimates. Auto-populated from wells table; IIJA funding entered manually from DOI/OSMRE announcements.
| Column | Type | Description |
|---|---|---|
state |
char(2) | USPS abbreviation (unique) |
state_name |
text | Full state name |
well_count_dow |
integer | Wells in USGS DOW dataset |
iija_phase1_formula_usd |
bigint | IIJA/BIL Phase 1 formula grant (Nov 2022) |
iija_phase2_perf_usd |
bigint | Phase 2 performance grant (to be populated) |
iija_source |
text | Citation for funding figures |
est_liability_low_usd |
numeric | Generated: well_count × $9,000 (Raimi et al. low) |
est_liability_mid_usd |
numeric | Generated: well_count × $76,000 (Raimi et al. median) |
est_liability_high_usd |
numeric | Generated: well_count × $280,000 (Raimi et al. high) |
bonding_required |
boolean | Does state require bonds for active wells? |
bonding_adequacy |
text | adequate / inadequate / unknown / NA |
bonding_notes |
text | Bonding context |
cost_estimate_source |
text | Most applicable plugging_cost_references entry |
notes |
text | State-specific context |
National totals (Raimi et al. mid-range): $8.94 billion estimated liability; $310 million IIJA Phase 1 coverage = 3.47% funded.
plugging_cost_references
Five key published cost estimate sources for footnoting and methodology.
| Source | Year | Low/Well | Mid/Well | High/Well | Scope |
|---|---|---|---|---|---|
| EPA OLEM | 2018 | $5,000 | $25,000 | $85,000 | National |
| Raimi et al. (RFF) | 2021 | $9,000 | $76,000 | $280,000 | National |
| Carbon Tracker | 2020 | $20,000 | $82,000 | $300,000 | National |
| IOGCC | 2023 | $5,000 | $33,000 | $150,000 | State-reported |
| PA DEP | 2022 | $10,000 | $68,000 | $220,000 | Pennsylvania |
data_sources
The 27 state agencies and 3 ancillary sources that contributed to the USGS DOW dataset.
| Column | Type | Description |
|---|---|---|
source_name |
text | Full agency name |
source_type |
text | state_agency / federal / ngo / software |
state |
text | State served |
description |
text | Data type provided |
dataset_metadata
Full ScienceBase JSON and FGDC XML provenance for the primary USGS dataset. One row.
Key fields: doi, citation, summary, purpose, publication_date, data_start_date, data_end_date, bounding_box, sciencebase_url, related_report_doi, mapping_app_url, file_csv_md5, file_zip_md5.
dataset_contacts, dataset_tags, processing_steps
Supporting metadata tables from ScienceBase and FGDC XML. See data-sources.md for details.
Views
| View | Chapter | Description |
|---|---|---|
v_wells_by_state |
4 | Well counts, type breakdown, avg coordinates, date range per state |
v_wells_by_type |
4–5 | Raw type distribution (119 values) across states |
v_wells_by_status |
4 | Status classification distribution |
v_state_type_summary |
4 | State × normalized type cross-tabulation |
v_wells_by_tract |
4–5 | Well counts, type breakdown, density (wells/km²) per census tract |
v_wells_by_county |
4–5 | County-level rollup with 5-digit GEOID |
v_highest_density_tracts |
4 | Tracts ranked by well density (≥1 km² only) |
v_data_completeness |
— | Non-null/non-empty counts per column |
v_state_governance |
4 | Combines transition offices + prioritization → framework_type classification |
v_ch4_state_analysis |
4 | Governance framework × spatial × financial per state |
v_ch5_liability_summary |
5 | State-level liability estimates vs. IIJA funding |
v_ch5_national_totals |
5 | National aggregate: total liability, IIJA coverage percentage |
Key Joins
-- Wells → ACS (via tract_geoid)
SELECT w.*, acs.*
FROM wells w
JOIN acs_2021_b19013 acs ON w.tract_geoid = acs.geoid;
-- Wells → state governance framework
SELECT w.api_number, w.state, sg.framework_type
FROM wells w
JOIN v_state_governance sg ON w.state_usps = sg.state;
-- County liability summary
SELECT v.*, sl.est_liability_mid_usd
FROM v_wells_by_county v
JOIN state_liability sl ON v.state_usps = sl.state;