diff --git a/requirements.txt b/requirements.txt index fefedf4..089a408 100644 --- a/requirements.txt +++ b/requirements.txt @@ -8,3 +8,5 @@ python-dotenv openpyxl statsmodels scipy +matplotlib +seaborn \ No newline at end of file diff --git a/texas_inspection_expenses.ipynb b/texas_inspection_expenses.ipynb index 643a7df..5dde850 100644 --- a/texas_inspection_expenses.ipynb +++ b/texas_inspection_expenses.ipynb @@ -10,20 +10,20 @@ "**Research question:** Does organizational capacity (budget, staffing) predict better regulatory outputs (inspections, compliance, enforcement), and how is that relationship moderated by goal ambiguity, district-level heterogeneity, and spatial/geographic factors?\n", "\n", "## Hypotheses\n", - "- **H1 — Capacity → Outputs:** Higher OGI budget and FTE predict more inspections, higher compliance rates, and faster violation resolution.\n", - "- **H2 — Goal Ambiguity:** When a larger share of RRC budget goes to the more ambiguous \"Energy Resource Development\" goal, the capacity → output relationship weakens.\n", - "- **H3 — Multilevel / District Effects:** The capacity → output relationship varies across RRC districts (budget slope heterogeneity).\n", - "- **H4 — Spatial & Geographic:** Offshore-jurisdiction and border districts moderate the capacity → output relationship; spatial autocorrelation in residuals is tested via Moran's I.\n", + "- **H1 \u2014 Capacity \u2192 Outputs:** Higher OGI budget and FTE predict more inspections, higher compliance rates, and faster violation resolution.\n", + "- **H2 \u2014 Goal Ambiguity:** When a larger share of RRC budget goes to the more ambiguous \"Energy Resource Development\" goal, the capacity \u2192 output relationship weakens.\n", + "- **H3 \u2014 Multilevel / District Effects:** The capacity \u2192 output relationship varies across RRC districts (budget slope heterogeneity).\n", + "- **H4 \u2014 Spatial & Geographic:** Offshore-jurisdiction and border districts moderate the capacity \u2192 
output relationship; spatial autocorrelation in residuals is tested via Moran's I.\n", "\n", "**Data:**\n", "- PostgreSQL warehouse (`texas_data`): `inspections`, `violations`, `well_shape_tract`\n", - "- `RRC Budget Data.xlsx`: statewide RRC budget by strategy, 2016–2024\n", - "- Analysis period: 2016–2023 (2024 is budget estimate, excluded from regressions)\n" + "- `RRC Budget Data.xlsx`: statewide RRC budget by strategy, 2016\u20132024\n", + "- Analysis period: 2016\u20132023 (2024 is budget estimate, excluded from regressions)\n" ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 31, "id": "3ed415f0", "metadata": {}, "outputs": [ @@ -43,116 +43,116 @@ "Requirement already satisfied: scipy in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 10)) (1.13.1)\n", "Requirement already satisfied: matplotlib in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 11)) (3.9.4)\n", "Requirement already satisfied: seaborn in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 12)) (0.13.2)\n", - "Requirement already satisfied: ipywidgets in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (8.1.8)\n", - "Requirement already satisfied: notebook in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (7.5.4)\n", - "Requirement already satisfied: nbconvert in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (7.17.0)\n", "Requirement already satisfied: jupyter-console in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (6.6.3)\n", "Requirement already satisfied: jupyterlab in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (4.5.5)\n", - "Requirement already satisfied: debugpy>=1.6.5 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.8.20)\n", - "Requirement already satisfied: matplotlib-inline>=0.1 in 
./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.2.1)\n", - "Requirement already satisfied: pyzmq>=25 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (27.1.0)\n", - "Requirement already satisfied: tornado>=6.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (6.5.4)\n", - "Requirement already satisfied: traitlets>=5.4.0 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.14.3)\n", - "Requirement already satisfied: ipython>=7.23.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (8.18.1)\n", - "Requirement already satisfied: packaging>=22 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (26.0)\n", - "Requirement already satisfied: appnope>=0.1.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.1.4)\n", - "Requirement already satisfied: nest-asyncio>=1.4 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.6.0)\n", - "Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.8.1)\n", + "Requirement already satisfied: notebook in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (7.5.4)\n", + "Requirement already satisfied: nbconvert in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (7.17.0)\n", + "Requirement already satisfied: ipywidgets in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (8.1.8)\n", "Requirement already satisfied: psutil>=5.7 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (7.2.2)\n", + "Requirement already satisfied: appnope>=0.1.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.1.4)\n", + 
"Requirement already satisfied: debugpy>=1.6.5 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.8.20)\n", + "Requirement already satisfied: nest-asyncio>=1.4 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.6.0)\n", + "Requirement already satisfied: ipython>=7.23.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (8.18.1)\n", "Requirement already satisfied: jupyter-client>=8.0.0 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (8.6.3)\n", "Requirement already satisfied: comm>=0.1.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.2.3)\n", + "Requirement already satisfied: tornado>=6.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (6.5.4)\n", + "Requirement already satisfied: matplotlib-inline>=0.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.2.1)\n", + "Requirement already satisfied: pyzmq>=25 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (27.1.0)\n", + "Requirement already satisfied: packaging>=22 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (26.0)\n", + "Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.8.1)\n", + "Requirement already satisfied: traitlets>=5.4.0 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.14.3)\n", "Requirement already satisfied: tzdata>=2022.7 in ./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2025.3)\n", "Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2.9.0.post0)\n", "Requirement already satisfied: pytz>=2020.1 in 
./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2025.2)\n", "Requirement already satisfied: typing-extensions>=4.6.0 in ./.venv/lib/python3.9/site-packages (from sqlalchemy->-r requirements.txt (line 5)) (4.15.0)\n", "Requirement already satisfied: et-xmlfile in ./.venv/lib/python3.9/site-packages (from openpyxl->-r requirements.txt (line 8)) (2.0.0)\n", "Requirement already satisfied: patsy>=0.5.6 in ./.venv/lib/python3.9/site-packages (from statsmodels->-r requirements.txt (line 9)) (1.0.2)\n", - "Requirement already satisfied: cycler>=0.10 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (0.12.1)\n", "Requirement already satisfied: pillow>=8 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (11.3.0)\n", + "Requirement already satisfied: pyparsing>=2.3.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (3.3.2)\n", + "Requirement already satisfied: contourpy>=1.0.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (1.3.0)\n", "Requirement already satisfied: kiwisolver>=1.3.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (1.4.7)\n", "Requirement already satisfied: fonttools>=4.22.0 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (4.60.2)\n", - "Requirement already satisfied: contourpy>=1.0.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (1.3.0)\n", "Requirement already satisfied: importlib-resources>=3.2.0 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (6.5.2)\n", - "Requirement already satisfied: pyparsing>=2.3.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (3.3.2)\n", + "Requirement already satisfied: cycler>=0.10 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r 
requirements.txt (line 11)) (0.12.1)\n", "Requirement already satisfied: zipp>=3.1.0 in ./.venv/lib/python3.9/site-packages (from importlib-resources>=3.2.0->matplotlib->-r requirements.txt (line 11)) (3.23.0)\n", - "Requirement already satisfied: exceptiongroup in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (1.3.1)\n", - "Requirement already satisfied: decorator in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (5.2.1)\n", - "Requirement already satisfied: pygments>=2.4.0 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (2.19.2)\n", - "Requirement already satisfied: stack-data in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.6.3)\n", - "Requirement already satisfied: jedi>=0.16 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.19.2)\n", - "Requirement already satisfied: pexpect>4.3 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (4.9.0)\n", "Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (3.0.52)\n", + "Requirement already satisfied: decorator in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (5.2.1)\n", + "Requirement already satisfied: pexpect>4.3 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (4.9.0)\n", + "Requirement already satisfied: exceptiongroup in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (1.3.1)\n", + "Requirement already satisfied: stack-data in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) 
(0.6.3)\n", + "Requirement already satisfied: pygments>=2.4.0 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (2.19.2)\n", + "Requirement already satisfied: jedi>=0.16 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.19.2)\n", "Requirement already satisfied: parso<0.9.0,>=0.8.4 in ./.venv/lib/python3.9/site-packages (from jedi>=0.16->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.8.6)\n", "Requirement already satisfied: importlib-metadata>=4.8.3 in ./.venv/lib/python3.9/site-packages (from jupyter-client>=8.0.0->ipykernel->-r requirements.txt (line 2)) (8.7.1)\n", "Requirement already satisfied: platformdirs>=2.5 in ./.venv/lib/python3.9/site-packages (from jupyter-core!=5.0.*,>=4.12->ipykernel->-r requirements.txt (line 2)) (4.4.0)\n", "Requirement already satisfied: ptyprocess>=0.5 in ./.venv/lib/python3.9/site-packages (from pexpect>4.3->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.7.0)\n", "Requirement already satisfied: wcwidth in ./.venv/lib/python3.9/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.6.0)\n", "Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.9/site-packages (from python-dateutil>=2.8.2->pandas->-r requirements.txt (line 3)) (1.17.0)\n", - "Requirement already satisfied: widgetsnbextension~=4.0.14 in ./.venv/lib/python3.9/site-packages (from ipywidgets->jupyter->-r requirements.txt (line 1)) (4.0.15)\n", "Requirement already satisfied: jupyterlab_widgets~=3.0.15 in ./.venv/lib/python3.9/site-packages (from ipywidgets->jupyter->-r requirements.txt (line 1)) (3.0.16)\n", + "Requirement already satisfied: widgetsnbextension~=4.0.14 in ./.venv/lib/python3.9/site-packages (from ipywidgets->jupyter->-r requirements.txt (line 1)) (4.0.15)\n", + "Requirement already satisfied: jupyter-server<3,>=2.4.0 in 
./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.17.0)\n", + "Requirement already satisfied: setuptools>=41.1.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (58.0.4)\n", "Requirement already satisfied: httpx<1,>=0.25.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (0.28.1)\n", + "Requirement already satisfied: jinja2>=3.0.3 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (3.1.6)\n", + "Requirement already satisfied: tomli>=1.2.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.4.0)\n", + "Requirement already satisfied: notebook-shim>=0.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (0.2.4)\n", + "Requirement already satisfied: jupyter-lsp>=2.0.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.3.0)\n", "Requirement already satisfied: async-lru>=1.0.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.0.5)\n", "Requirement already satisfied: jupyterlab-server<3,>=2.28.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.28.0)\n", - "Requirement already satisfied: jupyter-lsp>=2.0.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.3.0)\n", - "Requirement already satisfied: setuptools>=41.1.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (58.0.4)\n", - "Requirement already satisfied: tomli>=1.2.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.4.0)\n", - "Requirement already satisfied: jupyter-server<3,>=2.4.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r 
requirements.txt (line 1)) (2.17.0)\n", - "Requirement already satisfied: jinja2>=3.0.3 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (3.1.6)\n", - "Requirement already satisfied: notebook-shim>=0.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (0.2.4)\n", - "Requirement already satisfied: certifi in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2026.2.25)\n", "Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.0.9)\n", - "Requirement already satisfied: anyio in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.12.1)\n", + "Requirement already satisfied: certifi in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2026.2.25)\n", "Requirement already satisfied: idna in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.11)\n", + "Requirement already satisfied: anyio in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.12.1)\n", "Requirement already satisfied: h11>=0.16 in ./.venv/lib/python3.9/site-packages (from httpcore==1.*->httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.16.0)\n", "Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.9/site-packages (from jinja2>=3.0.3->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.0.3)\n", - "Requirement already satisfied: jupyter-server-terminals>=0.4.4 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.5.4)\n", - "Requirement already satisfied: send2trash>=1.8.2 in 
./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.1.0)\n", - "Requirement already satisfied: prometheus-client>=0.9 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.24.1)\n", - "Requirement already satisfied: jupyter-events>=0.11.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.12.0)\n", "Requirement already satisfied: overrides>=5.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (7.7.0)\n", + "Requirement already satisfied: send2trash>=1.8.2 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.1.0)\n", "Requirement already satisfied: argon2-cffi>=21.1 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.1.0)\n", - "Requirement already satisfied: terminado>=0.8.3 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.18.1)\n", + "Requirement already satisfied: jupyter-server-terminals>=0.4.4 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.5.4)\n", + "Requirement already satisfied: prometheus-client>=0.9 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.24.1)\n", "Requirement already satisfied: websocket-client>=1.7 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.9.0)\n", "Requirement already satisfied: nbformat>=5.3.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 
1)) (5.10.4)\n", + "Requirement already satisfied: terminado>=0.8.3 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.18.1)\n", + "Requirement already satisfied: jupyter-events>=0.11.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.12.0)\n", "Requirement already satisfied: argon2-cffi-bindings in ./.venv/lib/python3.9/site-packages (from argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.1.0)\n", + "Requirement already satisfied: python-json-logger>=2.0.4 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.0.0)\n", "Requirement already satisfied: rfc3339-validator in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.1.4)\n", "Requirement already satisfied: jsonschema[format-nongpl]>=4.18.0 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.25.1)\n", - "Requirement already satisfied: rfc3986-validator>=0.1.1 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.1.1)\n", "Requirement already satisfied: referencing in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.36.2)\n", - "Requirement already satisfied: python-json-logger>=2.0.4 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.0.0)\n", + "Requirement already satisfied: rfc3986-validator>=0.1.1 in ./.venv/lib/python3.9/site-packages (from 
jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.1.1)\n", "Requirement already satisfied: pyyaml>=5.3 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (6.0.3)\n", - "Requirement already satisfied: rpds-py>=0.7.1 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.27.1)\n", - "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2025.9.1)\n", "Requirement already satisfied: attrs>=22.2.0 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.4.0)\n", + "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2025.9.1)\n", + "Requirement already satisfied: rpds-py>=0.7.1 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.27.1)\n", + "Requirement already satisfied: uri-template in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.3.0)\n", + "Requirement already satisfied: rfc3987-syntax>=1.1.0 in ./.venv/lib/python3.9/site-packages (from 
jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.1.0)\n", "Requirement already satisfied: isoduration in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (20.11.0)\n", "Requirement already satisfied: webcolors>=24.6.0 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (24.11.1)\n", "Requirement already satisfied: jsonpointer>1.13 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.0.0)\n", "Requirement already satisfied: fqdn in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.5.1)\n", - "Requirement already satisfied: rfc3987-syntax>=1.1.0 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.1.0)\n", - "Requirement already satisfied: uri-template in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.3.0)\n", "Requirement already satisfied: requests>=2.31 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.32.5)\n", "Requirement already satisfied: babel>=2.10 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.18.0)\n", "Requirement already satisfied: 
json5>=0.9.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.13.0)\n", + "Requirement already satisfied: defusedxml in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.7.1)\n", + "Requirement already satisfied: bleach[css]!=5.0.0 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (6.2.0)\n", "Requirement already satisfied: jupyterlab-pygments in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.3.0)\n", "Requirement already satisfied: beautifulsoup4 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (4.14.3)\n", - "Requirement already satisfied: bleach[css]!=5.0.0 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (6.2.0)\n", "Requirement already satisfied: nbclient>=0.5.0 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.10.2)\n", - "Requirement already satisfied: mistune<4,>=2.0.3 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (3.2.0)\n", "Requirement already satisfied: pandocfilters>=1.4.1 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (1.5.1)\n", - "Requirement already satisfied: defusedxml in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.7.1)\n", + "Requirement already satisfied: mistune<4,>=2.0.3 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (3.2.0)\n", "Requirement already satisfied: webencodings in ./.venv/lib/python3.9/site-packages (from bleach[css]!=5.0.0->nbconvert->jupyter->-r requirements.txt (line 1)) (0.5.1)\n", "Requirement already satisfied: tinycss2<1.5,>=1.1.0 in ./.venv/lib/python3.9/site-packages (from 
bleach[css]!=5.0.0->nbconvert->jupyter->-r requirements.txt (line 1)) (1.4.0)\n", "Requirement already satisfied: fastjsonschema>=2.15 in ./.venv/lib/python3.9/site-packages (from nbformat>=5.3.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.21.2)\n", - "Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.9/site-packages (from requests>=2.31->jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.4.4)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.9/site-packages (from requests>=2.31->jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.6.3)\n", + "Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.9/site-packages (from requests>=2.31->jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.4.4)\n", "Requirement already satisfied: lark>=1.2.2 in ./.venv/lib/python3.9/site-packages (from rfc3987-syntax>=1.1.0->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.3.1)\n", "Requirement already satisfied: cffi>=1.0.1 in ./.venv/lib/python3.9/site-packages (from argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.0.0)\n", "Requirement already satisfied: pycparser in ./.venv/lib/python3.9/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.23)\n", "Requirement already satisfied: soupsieve>=1.6.1 in ./.venv/lib/python3.9/site-packages (from beautifulsoup4->nbconvert->jupyter->-r requirements.txt (line 1)) (2.8.3)\n", "Requirement already satisfied: arrow>=0.15.0 in ./.venv/lib/python3.9/site-packages (from 
isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.4.0)\n", - "Requirement already satisfied: asttokens>=2.1.0 in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (3.0.1)\n", "Requirement already satisfied: executing>=1.2.0 in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (2.2.1)\n", "Requirement already satisfied: pure-eval in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.2.3)\n", + "Requirement already satisfied: asttokens>=2.1.0 in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (3.0.1)\n", "\u001b[33mWARNING: You are using pip version 21.2.4; however, version 26.0.1 is available.\n", "You should consider upgrading via the '/Users/dpadams/Repos/texas-inspection-expenses/.venv/bin/python -m pip install --upgrade pip' command.\u001b[0m\n", "Note: you may need to restart the kernel to use updated packages.\n" @@ -165,7 +165,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 32, "id": "49de2b5c", "metadata": {}, "outputs": [], @@ -190,7 +190,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 33, "id": "08420da3", "metadata": {}, "outputs": [ @@ -198,7 +198,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Connected → texas_data on localhost:5433\n" + "Connected \u2192 texas_data on localhost:5433\n" ] } ], @@ -214,7 +214,7 @@ "engine = create_engine(\n", " f\"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}\"\n", ")\n", - "print(f\"Connected → {database} on {host}:{port}\")\n" + "print(f\"Connected \u2192 {database} on {host}:{port}\")\n" ] }, { @@ -227,7 +227,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 
34, "id": "43886f13", "metadata": {}, "outputs": [ @@ -278,7 +278,7 @@ "type": "float" } ], - "ref": "018af8de-4a03-4f77-94bb-e3d164dfee05", + "ref": "0436dff9-664a-46ac-95c2-865a2d24d0e6", "rows": [ [ "0", @@ -424,7 +424,7 @@ "4 122.90 " ] }, - "execution_count": 4, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } @@ -468,7 +468,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 35, "id": "3841e2f5", "metadata": {}, "outputs": [ @@ -529,7 +529,7 @@ "type": "float" } ], - "ref": "c28c564a-8a11-443b-8610-96058bf68506", + "ref": "e87db138-c28f-4061-a904-1284759db0d2", "rows": [ [ "0", @@ -704,7 +704,7 @@ "4 402.90 " ] }, - "execution_count": 5, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -742,7 +742,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 36, "id": "9e196cac", "metadata": {}, "outputs": [ @@ -750,7 +750,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Budget long: 18 rows (2 strategies × 9 years)\n" + "Budget long: 18 rows (2 strategies \u00d7 9 years)\n" ] }, { @@ -813,7 +813,7 @@ "type": "float" } ], - "ref": "87cecfc2-5957-4b97-b5ca-7a825ac11bdc", + "ref": "199877e2-8343-4812-85c6-50c066a92f26", "rows": [ [ "0", @@ -1387,7 +1387,7 @@ "17 2,659,208.00 280.80 " ] }, - "execution_count": 6, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } @@ -1398,7 +1398,7 @@ "YEARS = [2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]\n", "COLS = slice(1, 10) # spreadsheet columns 1-9 map to years 2016-2024\n", "\n", - "# ── Section 1: Energy Resource Development (rows 7-18) ──────────────────────\n", + "# \u2500\u2500 Section 1: Energy Resource Development (rows 7-18) \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "erd = pd.DataFrame({\n", " \"year\": YEARS,\n", " \"strategy\": \"Energy Resource Development\",\n", @@ -1412,7 +1412,7 
@@ " \"fte\": raw.iloc[18, COLS].values.astype(float),\n", "})\n", "\n", - "# ── Section 2: Oil/Gas Monitoring & Inspections (rows 20-31) ────────────────\n", + "# \u2500\u2500 Section 2: Oil/Gas Monitoring & Inspections (rows 20-31) \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "ogi = pd.DataFrame({\n", " \"year\": YEARS,\n", " \"strategy\": \"Oil/Gas Monitoring & Inspections\",\n", @@ -1427,13 +1427,13 @@ "})\n", "\n", "budget_long = pd.concat([erd, ogi], ignore_index=True)\n", - "print(f\"Budget long: {len(budget_long)} rows (2 strategies × {len(YEARS)} years)\")\n", + "print(f\"Budget long: {len(budget_long)} rows (2 strategies \u00d7 {len(YEARS)} years)\")\n", "budget_long\n" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 37, "id": "896d152b", "metadata": {}, "outputs": [ @@ -1624,7 +1624,7 @@ "type": "float" } ], - "ref": "140254a9-540d-44af-8d28-9506ffd1cf1c", + "ref": "12bd4b8a-6b40-4525-b255-c31938fae03d", "rows": [ [ "0", @@ -1982,7 +1982,7 @@ " \n", " \n", "\n", - "
5 rows × 34 columns
\n", + "5 rows \u00d7 34 columns
\n", "" ], "text/plain": [ @@ -2031,13 +2031,13 @@ "[5 rows x 34 columns]" ] }, - "execution_count": 7, + "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "# ── Wide budget: one row per year with ogi_ / erd_ prefixed columns ──────────\n", + "# \u2500\u2500 Wide budget: one row per year with ogi_ / erd_ prefixed columns \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "ogi_wide = ogi.drop(columns=\"strategy\").add_prefix(\"ogi_\")\n", "erd_wide = erd.drop(columns=\"strategy\").add_prefix(\"erd_\")\n", "\n", @@ -2048,19 +2048,19 @@ " .drop(columns=\"erd_year\")\n", ")\n", "\n", - "# ── Merge inspections + violations, then join statewide budget on year ────────\n", + "# \u2500\u2500 Merge inspections + violations, then join statewide budget on year \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "panel = (\n", " insp\n", " .merge(viol, on=[\"district\", \"year\"], how=\"left\")\n", " .merge(budget_wide, on=\"year\", how=\"left\")\n", ")\n", "\n", - "# ── Derived columns ───────────────────────────────────────────────────────────\n", + "# \u2500\u2500 Derived columns \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "panel[\"violations_per_inspection\"] = panel[\"total_violations\"] / panel[\"total_inspections\"]\n", - "panel[\"ogi_budget_m\"] = panel[\"ogi_total_budget\"] / 1_000_000 # dollars → millions\n", + "panel[\"ogi_budget_m\"] = panel[\"ogi_total_budget\"] / 1_000_000 # dollars \u2192 millions\n", "panel[\"erd_budget_m\"] = panel[\"erd_total_budget\"] / 1_000_000\n", "panel[\"post_2019\"] = (panel[\"year\"] >= 2019).astype(int)\n", - "# 2024 = budget estimate; 2025 = no budget data — exclude 
both from regressions\n", + "# 2024 = budget estimate; 2025 = no budget data \u2014 exclude both from regressions\n", "panel[\"is_budget_year\"] = (panel[\"year\"] >= 2024).astype(int)\n", "\n", "# Goal ambiguity: share of combined budget going to the inspection mission.\n", @@ -2093,7 +2093,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 38, "id": "92921756", "metadata": {}, "outputs": [ @@ -2166,7 +2166,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 39, "id": "5d2671c9", "metadata": {}, "outputs": [ @@ -2199,7 +2199,7 @@ " for j in range(len(corr_cols)):\n", " ax.text(j, i, corr.iloc[i, j], ha=\"center\", va=\"center\", fontsize=8)\n", "plt.colorbar(im, ax=ax)\n", - "ax.set_title(\"Correlation Matrix — Key Variables\")\n", + "ax.set_title(\"Correlation Matrix \u2014 Key Variables\")\n", "plt.tight_layout()\n", "plt.show()\n" ] @@ -2208,125 +2208,7 @@ "cell_type": "markdown", "id": "2084d5fe", "metadata": {}, - "source": [ - "## Data and Methods\n", - "\n", - "### Data Sources\n", - "\n", - "This study draws on two primary data sources. The first is the Texas Railroad Commission\n", - "(RRC) Oil and Gas Division administrative database, accessed via a PostGIS spatial data\n", - "warehouse. Inspection records span fiscal years 2016–2025 and encompass approximately\n", - "1.9 million inspection events distributed across 13 RRC administrative districts;\n", - "violation records include approximately 193,000 enforcement actions. From the inspections\n", - "table, district-year aggregates are constructed for three regulatory output measures:\n", - "(1) *compliance rate* — the share of annual inspections in a district that did not result\n", - "in a compliance failure; (2) *total inspections* — the count of field inspection events;\n", - "and (3) average days between successive inspections of the same well, computed via a\n", - "SQL window function (`LAG`) over ordered inspection timestamps. 
From the violations table,\n", - "district-year aggregates include the *violation resolution rate* (share of violations\n", - "for which the operator was found compliant on re-inspection), enforcement rate, and average\n", - "days from violation discovery to enforcement action.\n", - "\n", - "The second source is RRC budget data drawn from Legislative Appropriations Requests,\n", - "covering fiscal years 2016–2024. Budget appropriations are reported at the statewide level\n", - "disaggregated by goal and strategy. Two strategies are central to this analysis:\n", - "(1) *Oil and Gas Monitoring and Inspections* (OGI), which directly funds field inspection\n", - "operations; and (2) *Energy Resource Development* (ERD), encompassing the broader mandate\n", - "to promote oil and gas resource opportunities. For each strategy, the data include total\n", - "appropriations, salaries, professional fees, travel, other operating expenditures, capital\n", - "outlays, and authorized full-time equivalent (FTE) positions. Fiscal year 2024 represents\n", - "a budget estimate rather than expenditure actuals and is excluded from all regression\n", - "models.\n", - "\n", - "### Sample and Panel Construction\n", - "\n", - "The unit of analysis is the **district-year**. The analytic panel contains\n", - "**N = 130 observations** (13 districts × 10 years, 2016–2025), of which\n", - "**104 observations** (2016–2023) constitute the regression sample. Fiscal years\n", - "2024 (budget estimate) and 2025 (no budget data available) are retained in\n", - "descriptive analyses but excluded from all regression models. Because inspection\n", - "and enforcement activity in 2025 represents a partial year as of the data\n", - "extract, enforcement-timing metrics for that year are subject to right-censoring:\n", - "violations discovered in late 2024 and 2025 may not yet have received a recorded\n", - "enforcement action, compressing observed days-to-enforcement.. 
Because RRC budget\n", - "appropriations are reported at the statewide level, budget and FTE variables enter the\n", - "panel as year-varying but district-invariant covariates. Identification of budget effects\n", - "therefore relies on year-to-year variation in statewide appropriations rather than\n", - "cross-district budget contrasts.\n", - "\n", - "### Measures\n", - "\n", - "**Dependent variables.** Three measures capture distinct dimensions of regulatory output:\n", - "*total inspections* (inspection volume), *compliance rate* (%), and *violation resolution\n", - "rate* (%). Compliance rate and resolution rate capture quality of enforcement rather than\n", - "quantity and represent different points in the regulatory pipeline: compliance is measured\n", - "at the point of inspection while resolution is measured after a violation has been\n", - "discovered and acted upon.\n", - "\n", - "**Organizational capacity.** The primary capacity measure is OGI total appropriations in\n", - "millions of dollars ($\\text{Budget}_t$), reflecting the statewide resource envelope\n", - "available for inspection activities in year $t$. An auxiliary measure — OGI authorized\n", - "FTE positions — is included in descriptive analyses.\n", - "\n", - "**Goal ambiguity.** Following Chun and Rainey (2005), goal ambiguity is operationalized\n", - "via the relative concentration of resources across missions. 
The *inspection budget share*\n", - "($\\text{Share}_t$) captures the fraction of combined OGI and ERD appropriations directed\n", - "toward the inspection mandate:\n", - "\n", - "$$\\text{Share}_t = \\frac{\\text{OGI Budget}_t}{\\text{OGI Budget}_t + \\text{ERD Budget}_t}$$\n", - "\n", - "Higher values indicate greater mission clarity (resources more concentrated on inspections);\n", - "lower values indicate greater goal ambiguity (resources spread across competing mandates).\n", - "Over the study period $\\text{Share}_t$ ranged from 0.59 (2022) to 0.67 (2018), reflecting\n", - "meaningful year-to-year variation in budgetary prioritization.\n", - "\n", - "**Geographic moderators.** Two binary district-level indicators capture geographic\n", - "context: $\\text{Offshore}_d = 1$ for districts 02, 03, and 04, which hold dual onshore\n", - "and offshore oversight jurisdiction, and $\\text{Border}_d = 1$ for districts 01–04,\n", - "which are proximate to the Texas Gulf Coast and the US–Mexico border corridor.\n", - "\n", - "### Estimation Strategy\n", - "\n", - "All models are estimated via ordinary least squares (OLS) with standard errors clustered\n", - "at the district level ($G = 13$) to account for within-district serial correlation.\n", - "District fixed effects absorb time-invariant heterogeneity across offices — including\n", - "differences in geographic complexity, historical enforcement culture, and staffing\n", - "composition — and ensure that budget effects are identified from within-district,\n", - "year-to-year variation.\n", - "\n", - "**H1 — Baseline capacity model:**\n", - "\n", - "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d} \\gamma_d \\, \\mathbf{1}[\\text{district} = d] + \\varepsilon_{dt}$$\n", - "\n", - "where $Y_{dt}$ is the regulatory output for district $d$ in year $t$, $\\gamma_d$ are\n", - "district fixed effects, and $\\varepsilon_{dt}$ is the idiosyncratic error.\n", - "\n", - "**H2 — Goal ambiguity moderation:**\n", - 
"\n", - "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Share}_t + \\beta_3 \\left( \\text{Budget}_t \\times \\text{Share}_t \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", - "\n", - "The coefficient $\\beta_3$ tests whether goal clarity conditions the capacity–output\n", - "relationship. A positive $\\hat{\\beta}_3$ would indicate that clearer mission focus\n", - "amplifies budget effects; a negative value would suggest diminishing returns or\n", - "cross-strategy resource substitution.\n", - "\n", - "**H3 — District slope heterogeneity:**\n", - "\n", - "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d=2}^{D} \\delta_d \\left( \\text{Budget}_t \\times \\mathbf{1}[d] \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", - "\n", - "District-specific budget slopes are recovered as $\\hat{\\beta}_1 + \\hat{\\delta}_d$.\n", - "Because budget varies only along the time dimension and district fixed effects are\n", - "included, interaction term standard errors are inflated by near-perfect multicollinearity;\n", - "these estimates are treated as descriptive indicators of heterogeneity only.\n", - "\n", - "**H4 — Geographic moderation and spatial autocorrelation:**\n", - "\n", - "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Offshore}_d + \\beta_3 \\, \\text{Border}_d + \\beta_4 \\left( \\text{Budget}_t \\times \\text{Offshore}_d \\right) + \\beta_5 \\left( \\text{Budget}_t \\times \\text{Border}_d \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", - "\n", - "Spatial autocorrelation in H1 model residuals is assessed via Moran's $I$ computed on a\n", - "row-normalized inverse-distance spatial weights matrix constructed from district centroids\n", - "derived by averaging well-level geographic coordinates within each district.\n" - ] + "source": "## Data and Methods\n\n### Data Sources\n\nThis study draws on two primary data sources. 
The first is the Texas Railroad Commission\n(RRC) Oil and Gas Division administrative database. Inspection records span fiscal years 2016\u20132025 and encompass approximately\n1.9 million inspection events distributed across 13 RRC administrative districts;\nviolation records include approximately 193,000 enforcement actions. From the inspections\ntable, district-year aggregates are constructed for three regulatory output measures:\n(1) *compliance rate* \u2014 the share of annual inspections in a district that did not result\nin a compliance failure; (2) *total inspections* \u2014 the count of field inspection events;\nand (3) average days between successive inspections of the same well, computed via a\nSQL window function (`LAG`) over ordered inspection timestamps. From the violations table,\ndistrict-year aggregates include the *violation resolution rate* (share of violations\nfor which the operator was found compliant on re-inspection), enforcement rate, and average\ndays from violation discovery to enforcement action.\n\nThe second source is RRC budget data drawn from Legislative Appropriations Requests,\ncovering fiscal years 2016\u20132024. Budget appropriations are reported at the statewide level\ndisaggregated by goal and strategy. Two strategies are central to this analysis:\n(1) *Oil and Gas Monitoring and Inspections* (OGI), which directly funds field inspection\noperations; and (2) *Energy Resource Development* (ERD), encompassing the broader mandate\nto promote oil and gas resource opportunities. For each strategy, the data include total\nappropriations, salaries, professional fees, travel, other operating expenditures, capital\noutlays, and authorized full-time equivalent (FTE) positions. Fiscal year 2024 represents\na budget estimate rather than expenditure actuals and is excluded from all regression\nmodels.\n\n### Sample and Panel Construction\n\nThe unit of analysis is the **district-year**. 
The analytic panel contains\n**N = 130 observations** (13 districts \u00d7 10 years, 2016\u20132025), of which\n**104 observations** (2016\u20132023) constitute the regression sample. Fiscal years\n2024 (budget estimate) and 2025 (no budget data available) are retained in\ndescriptive analyses but excluded from all regression models. Because inspection\nand enforcement activity in 2025 represents a partial year as of the data\nextract, enforcement-timing metrics for that year are subject to right-censoring:\nviolations discovered in late 2024 and 2025 may not yet have received a recorded\nenforcement action, compressing observed days-to-enforcement. Because RRC budget\nappropriations are reported at the statewide level, budget and FTE variables enter the\npanel as year-varying but district-invariant covariates. Identification of budget effects\ntherefore relies on year-to-year variation in statewide appropriations rather than\ncross-district budget contrasts.\n\n### Measures\n\n**Dependent variables.** Three measures capture distinct dimensions of regulatory output:\n*total inspections* (inspection volume), *compliance rate* (%), and *violation resolution\nrate* (%). Compliance rate and resolution rate capture quality of enforcement rather than\nquantity and represent different points in the regulatory pipeline: compliance is measured\nat the point of inspection while resolution is measured after a violation has been\ndiscovered and acted upon.\n\n**Organizational capacity.** The primary capacity measure is OGI total appropriations in\nmillions of dollars ($\\text{Budget}_t$), reflecting the statewide resource envelope\navailable for inspection activities in year $t$. An auxiliary measure \u2014 OGI authorized\nFTE positions \u2014 is included in descriptive analyses.\n\n**Goal ambiguity.** Following Chun and Rainey (2005), goal ambiguity is operationalized\nvia the relative concentration of resources across missions. 
The *inspection budget share*\n($\\text{Share}_t$) captures the fraction of combined OGI and ERD appropriations directed\ntoward the inspection mandate:\n\n$$\\text{Share}_t = \\frac{\\text{OGI Budget}_t}{\\text{OGI Budget}_t + \\text{ERD Budget}_t}$$\n\nHigher values indicate greater mission clarity (resources more concentrated on inspections);\nlower values indicate greater goal ambiguity (resources spread across competing mandates).\nOver the study period $\\text{Share}_t$ ranged from 0.59 (2022) to 0.67 (2018), reflecting\nmeaningful year-to-year variation in budgetary prioritization.\n\n**Geographic moderators.** Two binary district-level indicators capture geographic\ncontext: $\\text{Offshore}_d = 1$ for districts 02, 03, and 04, which hold dual onshore\nand offshore oversight jurisdiction, and $\\text{Border}_d = 1$ for districts 01\u201304,\nwhich are proximate to the Texas Gulf Coast and the US\u2013Mexico border corridor.\n\n### Estimation Strategy\n\nAll models are estimated via ordinary least squares (OLS) with standard errors clustered\nat the district level ($G = 13$) to account for within-district serial correlation.\nDistrict fixed effects absorb time-invariant heterogeneity across offices \u2014 including\ndifferences in geographic complexity, historical enforcement culture, and staffing\ncomposition \u2014 and ensure that budget effects are identified from within-district,\nyear-to-year variation.\n\n**H1 \u2014 Baseline capacity model:**\n\n$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d} \\gamma_d \\, \\mathbf{1}[\\text{district} = d] + \\varepsilon_{dt}$$\n\nwhere $Y_{dt}$ is the regulatory output for district $d$ in year $t$, $\\gamma_d$ are\ndistrict fixed effects, and $\\varepsilon_{dt}$ is the idiosyncratic error.\n\n**H2 \u2014 Goal ambiguity moderation:**\n\n$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Share}_t + \\beta_3 \\left( \\text{Budget}_t \\times \\text{Share}_t \\right) + \\sum_{d} 
\\gamma_d + \\varepsilon_{dt}$$\n\nThe coefficient $\\beta_3$ tests whether goal clarity conditions the capacity\u2013output\nrelationship. A positive $\\hat{\\beta}_3$ would indicate that clearer mission focus\namplifies budget effects; a negative value would suggest diminishing returns or\ncross-strategy resource substitution.\n\n**H3 \u2014 District slope heterogeneity:**\n\n$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d=2}^{D} \\delta_d \\left( \\text{Budget}_t \\times \\mathbf{1}[d] \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n\nDistrict-specific budget slopes are recovered as $\\hat{\\beta}_1 + \\hat{\\delta}_d$.\nBecause budget varies only along the time dimension and district fixed effects are\nincluded, interaction term standard errors are inflated by near-perfect multicollinearity;\nthese estimates are treated as descriptive indicators of heterogeneity only.\n\n**H4 \u2014 Geographic moderation and spatial autocorrelation:**\n\n$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Offshore}_d + \\beta_3 \\, \\text{Border}_d + \\beta_4 \\left( \\text{Budget}_t \\times \\text{Offshore}_d \\right) + \\beta_5 \\left( \\text{Budget}_t \\times \\text{Border}_d \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n\n**Robustness checks.** Two supplementary tests address limitations of the\nbaseline models. First, wild cluster bootstrap inference (Rademacher weights,\n$B = 999$ draws; Cameron, Gelbach & Miller 2008) is used to re-test H1\ncoefficients, providing valid p-values with the small number of clusters\n($G = 13$). 
Second, a distributed lag specification replaces the\ncontemporaneous budget measure with its one-year lag ($\\text{Budget}_{t-1}$),\nand also estimates a model including both, to test whether budget effects\noperate with a delay consistent with a hiring-and-deployment mechanism.\nThe distributed lag regression sample covers 2017\u20132023 ($N = 91$).\n\nSpatial autocorrelation in H1 model residuals is assessed via Moran's $I$ computed on a\nrow-normalized inverse-distance spatial weights matrix constructed from district centroids\nderived by averaging well-level geographic coordinates within each district." }, { "cell_type": "markdown", @@ -2336,28 +2218,28 @@ "## Analysis\n", "\n", "This study employs a fixed-effects panel regression framework to examine whether\n", - "year-to-year changes in RRC organizational capacity — as measured by statewide budget\n", - "appropriations — translate into improvements in regulatory outputs across Texas oil and\n", + "year-to-year changes in RRC organizational capacity \u2014 as measured by statewide budget\n", + "appropriations \u2014 translate into improvements in regulatory outputs across Texas oil and\n", "gas inspection districts. The analytic panel spans 13 RRC districts over eight fiscal\n", - "years (2016–2023), yielding 104 district-year observations. The identification strategy\n", + "years (2016\u20132023), yielding 104 district-year observations. The identification strategy\n", "leverages within-district variation in outcomes as a function of year-to-year shifts in\n", "statewide OGI appropriations, net of persistent inter-district differences absorbed by\n", "district fixed effects.\n", "\n", "The choice of a district-year panel rather than a well-level panel is motivated by the\n", "structure of the budget data, which is available only at the statewide level. 
Because the\n", - "key independent variable — OGI appropriations — varies along the time dimension only, it\n", + "key independent variable \u2014 OGI appropriations \u2014 varies along the time dimension only, it\n", "functions as a common, year-specific exposure applied uniformly to all districts. District\n", "fixed effects then absorb unobservable office-level characteristics that remain stable over\n", "the study period, such as geographic complexity, historical enforcement intensity, and\n", "local administrative capacity. Causal identification is thus predicated on the assumption\n", "that, absent changes in budget, within-district outcome trajectories would have followed\n", - "parallel trends across years — an assumption that cannot be directly tested but is\n", + "parallel trends across years \u2014 an assumption that cannot be directly tested but is\n", "partially supported by the pre-period stability visible in the descriptive trends.\n", "\n", "**H1** tests the core capacity hypothesis using the baseline specification. Each of the\n", - "three dependent variables — total inspections, compliance rate, and violation resolution\n", - "rate — is regressed separately on OGI budget (in millions of dollars) and district fixed\n", + "three dependent variables \u2014 total inspections, compliance rate, and violation resolution\n", + "rate \u2014 is regressed separately on OGI budget (in millions of dollars) and district fixed\n", "effects. Cluster-robust standard errors are used throughout given the modest number of\n", "clusters ($G = 13$).\n", "\n", @@ -2365,9 +2247,9 @@ "operationalizing goal ambiguity as the degree to which RRC appropriations are concentrated\n", "on the inspection mandate versus the broader energy development mission. 
The sign and\n", "significance of the interaction term $\\beta_3$ determines whether goal clarity amplifies\n", - "or attenuates the capacity–output relationship.\n", + "or attenuates the capacity\u2013output relationship.\n", "\n", - "**H3** tests for heterogeneity in budget–outcome slopes across districts by including\n", + "**H3** tests for heterogeneity in budget\u2013outcome slopes across districts by including\n", "budget $\\times$ district interaction terms. Given only eight years of data per district,\n", "the saturated interaction model is estimated with approximately zero residual degrees of\n", "freedom for the fixed-effects component; as a result, interaction-term standard errors\n", @@ -2375,16 +2257,16 @@ "variation rather than inferential tests. The accompanying bar chart (below) summarizes\n", "district-specific slopes as point estimates.\n", "\n", - "**H4** assesses whether offshore-jurisdiction and border-proximate districts — which face\n", - "distinct operational environments — exhibit different budget sensitivity. The model adds\n", + "**H4** assesses whether offshore-jurisdiction and border-proximate districts \u2014 which face\n", + "distinct operational environments \u2014 exhibit different budget sensitivity. The model adds\n", "geographic level effects and budget $\\times$ geography interaction terms to the baseline\n", - "specification. A complementary spatial diagnostic — Moran's $I$ applied to the residuals\n", - "from the H1 compliance model — tests for geographic clustering of unexplained outcome\n", + "specification. 
A complementary spatial diagnostic \u2014 Moran's $I$ applied to the residuals\n", + "from the H1 compliance model \u2014 tests for geographic clustering of unexplained outcome\n", "variation that could indicate omitted spatial processes or spillovers across district\n", "boundaries.\n", "\n", "All regressions exclude fiscal years 2024 (budget estimate) and 2025 (no\n", - "budget data), retaining 2016–2023 as the regression sample (N = 104). The\n", + "budget data), retaining 2016\u20132023 as the regression sample (N = 104). The\n", "extended panel through 2025 is used for descriptive trend analysis only.\n", "Enforcement-timing metrics for 2025 should be interpreted cautiously: because\n", "the data extract covers a partial year, violations discovered in late 2024 and\n", @@ -2400,16 +2282,16 @@ "id": "3ca1410a", "metadata": {}, "source": [ - "## H1: Organizational Capacity → Policy Outputs\n", + "## H1: Organizational Capacity \u2192 Policy Outputs\n", "\n", "**Prediction:** Higher OGI budget and FTE predict more inspections, higher compliance rates, and faster violation resolution.\n", "\n", - "**Model:** OLS with district fixed effects and year 2016–2023 (excluding 2024 budget estimate). Budget varies only over time, so it identifies via year-to-year changes in statewide RRC allocations; district FE absorbs persistent cross-district differences.\n" + "**Model:** OLS with district fixed effects and year 2016\u20132023 (excluding 2024 budget estimate). 
Budget varies only over time, so it identifies via year-to-year changes in statewide RRC allocations; district FE absorbs persistent cross-district differences.\n" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 40, "id": "463387d3", "metadata": {}, "outputs": [ @@ -2430,11 +2312,11 @@ "fig, axes = plt.subplots(1, 3, figsize=(15, 4))\n", "\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"total_inspections\",\n", - " alpha=0.4, ax=axes[0], title=\"Budget → Inspections\")\n", + " alpha=0.4, ax=axes[0], title=\"Budget \u2192 Inspections\")\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"compliance_rate\",\n", - " alpha=0.4, ax=axes[1], title=\"Budget → Compliance Rate (%)\")\n", + " alpha=0.4, ax=axes[1], title=\"Budget \u2192 Compliance Rate (%)\")\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"resolution_rate\",\n", - " alpha=0.4, ax=axes[2], title=\"Budget → Resolution Rate (%)\")\n", + " alpha=0.4, ax=axes[2], title=\"Budget \u2192 Resolution Rate (%)\")\n", "\n", "for ax in axes:\n", " ax.set_xlabel(\"OGI Budget ($M)\")\n", @@ -2445,7 +2327,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 41, "id": "535fc4eb", "metadata": {}, "outputs": [ @@ -2453,20 +2335,20 @@ "name": "stdout", "output_type": "stream", "text": [ - "H1a — OGI Budget ($M) → Total Inspections\n", + "H1a \u2014 OGI Budget ($M) \u2192 Total Inspections\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 666.30 212.98 3.13 0.00\n", - " R² = 0.769 Adj. R² = 0.736\n", + " R\u00b2 = 0.769 Adj. R\u00b2 = 0.736\n", "\n", - "H1b — OGI Budget ($M) → Compliance Rate (%)\n", + "H1b \u2014 OGI Budget ($M) \u2192 Compliance Rate (%)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.26 0.11 2.31 0.02\n", - " R² = 0.538 Adj. R² = 0.471\n", + " R\u00b2 = 0.538 Adj. R\u00b2 = 0.471\n", "\n", - "H1c — OGI Budget ($M) → Resolution Rate (%)\n", + "H1c \u2014 OGI Budget ($M) \u2192 Resolution Rate (%)\n", " Coef. Std.Err. 
z P>|z|\n", "ogi_budget_m 1.05 0.32 3.28 0.00\n", - " R² = 0.624 Adj. R² = 0.569\n" + " R\u00b2 = 0.624 Adj. R\u00b2 = 0.569\n" ] } ], @@ -2486,23 +2368,23 @@ " data=actuals,\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", - "# Detect actual column names — statsmodels uses z/P>|z| with robust SEs in some versions\n", + "# Detect actual column names \u2014 statsmodels uses z/P>|z| with robust SEs in some versions\n", "_tbl = m_inspections.summary2().tables[1]\n", "_t = \"t\" if \"t\" in _tbl.columns else \"z\"\n", "_p = \"P>|t|\" if \"P>|t|\" in _tbl.columns else \"P>|z|\"\n", "display_cols = [\"Coef.\", \"Std.Err.\", _t, _p]\n", "\n", - "print(\"H1a — OGI Budget ($M) → Total Inspections\")\n", + "print(\"H1a \u2014 OGI Budget ($M) \u2192 Total Inspections\")\n", "print(m_inspections.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R² = {m_inspections.rsquared:.3f} Adj. R² = {m_inspections.rsquared_adj:.3f}\\n\")\n", + "print(f\" R\u00b2 = {m_inspections.rsquared:.3f} Adj. R\u00b2 = {m_inspections.rsquared_adj:.3f}\\n\")\n", "\n", - "print(\"H1b — OGI Budget ($M) → Compliance Rate (%)\")\n", + "print(\"H1b \u2014 OGI Budget ($M) \u2192 Compliance Rate (%)\")\n", "print(m_compliance.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R² = {m_compliance.rsquared:.3f} Adj. R² = {m_compliance.rsquared_adj:.3f}\\n\")\n", + "print(f\" R\u00b2 = {m_compliance.rsquared:.3f} Adj. R\u00b2 = {m_compliance.rsquared_adj:.3f}\\n\")\n", "\n", - "print(\"H1c — OGI Budget ($M) → Resolution Rate (%)\")\n", + "print(\"H1c \u2014 OGI Budget ($M) \u2192 Resolution Rate (%)\")\n", "print(m_resolution.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R² = {m_resolution.rsquared:.3f} Adj. R² = {m_resolution.rsquared_adj:.3f}\")\n" + "print(f\" R\u00b2 = {m_resolution.rsquared:.3f} Adj. 
R\u00b2 = {m_resolution.rsquared_adj:.3f}\")\n" ] }, { @@ -2512,17 +2394,17 @@ "source": [ "## H2: Goal Ambiguity Moderates Capacity Effects\n", "\n", - "**Prediction:** When a larger share of the combined RRC budget flows to the broader \"Energy Resource Development\" goal (lower `inspection_budget_share`), the capacity → output link weakens.\n", + "**Prediction:** When a larger share of the combined RRC budget flows to the broader \"Energy Resource Development\" goal (lower `inspection_budget_share`), the capacity \u2192 output link weakens.\n", "\n", "**Operationalization:**\n", "`inspection_budget_share = ogi_budget / (ogi_budget + erd_budget)`\n", "\n", - "A negative interaction coefficient `ogi_budget_m × inspection_budget_share` would be unexpected (higher share → weaker effect). A positive coefficient supports H2 — clearer mission focus amplifies the budget → compliance relationship.\n" + "A negative interaction coefficient `ogi_budget_m \u00d7 inspection_budget_share` would be unexpected (higher share \u2192 weaker effect). A positive coefficient supports H2 \u2014 clearer mission focus amplifies the budget \u2192 compliance relationship.\n" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 42, "id": "24187ce8", "metadata": {}, "outputs": [ @@ -2530,21 +2412,21 @@ "name": "stdout", "output_type": "stream", "text": [ - "H2 — Goal Ambiguity Moderation (DV: compliance_rate)\n", + "H2 \u2014 Goal Ambiguity Moderation (DV: compliance_rate)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 4.20 1.09 3.86 0.00\n", "inspection_budget_share 170.18 44.79 3.80 0.00\n", "ogi_budget_m:inspection_budget_share -6.53 1.84 -3.55 0.00\n", "\n", - "R² = 0.567 Adj. R² = 0.493\n", + "R\u00b2 = 0.567 Adj. R\u00b2 = 0.493\n", "\n", - "H2 — Goal Ambiguity Moderation (DV: resolution_rate)\n", + "H2 \u2014 Goal Ambiguity Moderation (DV: resolution_rate)\n", " Coef. Std.Err. 
z P>|z|\n", "ogi_budget_m 6.68 4.67 1.43 0.15\n", "inspection_budget_share 230.67 204.30 1.13 0.26\n", "ogi_budget_m:inspection_budget_share -9.42 7.99 -1.18 0.24\n", "\n", - "R² = 0.629 Adj. R² = 0.566\n" + "R\u00b2 = 0.629 Adj. R\u00b2 = 0.566\n" ] } ], @@ -2555,19 +2437,19 @@ ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", "key_rows = [\"ogi_budget_m\", \"inspection_budget_share\", \"ogi_budget_m:inspection_budget_share\"]\n", - "print(\"H2 — Goal Ambiguity Moderation (DV: compliance_rate)\")\n", + "print(\"H2 \u2014 Goal Ambiguity Moderation (DV: compliance_rate)\")\n", "print(m_h2.summary2().tables[1][display_cols].loc[key_rows])\n", - "print(f\"\\nR² = {m_h2.rsquared:.3f} Adj. R² = {m_h2.rsquared_adj:.3f}\")\n", + "print(f\"\\nR\u00b2 = {m_h2.rsquared:.3f} Adj. R\u00b2 = {m_h2.rsquared_adj:.3f}\")\n", "\n", - "# ── Same model with resolution rate as DV ────────────────────────────────────\n", + "# \u2500\u2500 Same model with resolution rate as DV \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "m_h2_res = smf.ols(\n", " \"resolution_rate ~ ogi_budget_m * inspection_budget_share + C(district)\",\n", " data=actuals,\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", - "print(\"\\nH2 — Goal Ambiguity Moderation (DV: resolution_rate)\")\n", + "print(\"\\nH2 \u2014 Goal Ambiguity Moderation (DV: resolution_rate)\")\n", "print(m_h2_res.summary2().tables[1][display_cols].loc[key_rows])\n", - "print(f\"\\nR² = {m_h2_res.rsquared:.3f} Adj. R² = {m_h2_res.rsquared_adj:.3f}\")\n" + "print(f\"\\nR\u00b2 = {m_h2_res.rsquared:.3f} Adj. 
R\u00b2 = {m_h2_res.rsquared_adj:.3f}\")\n" ] }, { @@ -2577,14 +2459,14 @@ "source": [ "## H3: District Multilevel Effects\n", "\n", - "**Prediction:** The budget → output slope varies across RRC districts — some districts translate budget increases into better outputs more effectively than others.\n", + "**Prediction:** The budget \u2192 output slope varies across RRC districts \u2014 some districts translate budget increases into better outputs more effectively than others.\n", "\n", - "**Model:** Interaction `ogi_budget_m × C(district)` — the reference district captures the baseline budget slope; interaction terms show how each other district's slope differs.\n" + "**Model:** Interaction `ogi_budget_m \u00d7 C(district)` \u2014 the reference district captures the baseline budget slope; interaction terms show how each other district's slope differs.\n" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 43, "id": "151faefd", "metadata": {}, "outputs": [ @@ -2592,7 +2474,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "H3 — District-Heterogeneous Budget Effect (DV: compliance_rate)\n", + "H3 \u2014 District-Heterogeneous Budget Effect (DV: compliance_rate)\n", "Baseline (reference district) budget slope:\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.09 0.00 56,876,193,472,228.37 0.00\n", @@ -2612,7 +2494,7 @@ "ogi_budget_m:C(district)[T.7C] 0.31 0.00 24,243,474,173,332.52 0.00\n", "ogi_budget_m:C(district)[T.8A] 0.10 0.00 59,702,739,775,453.20 0.00\n", "\n", - "R² = 0.662 Adj. R² = 0.554\n" + "R\u00b2 = 0.662 Adj. 
R\u00b2 = 0.554\n" ] }, { @@ -2636,7 +2518,7 @@ "\n", "# Baseline budget slope (reference district)\n", "baseline_row = coef_table.loc[[\"ogi_budget_m\"]]\n", - "print(\"H3 — District-Heterogeneous Budget Effect (DV: compliance_rate)\")\n", + "print(\"H3 \u2014 District-Heterogeneous Budget Effect (DV: compliance_rate)\")\n", "print(f\"Baseline (reference district) budget slope:\")\n", "print(baseline_row[display_cols])\n", "\n", @@ -2644,9 +2526,9 @@ "interaction_rows = coef_table[coef_table.index.str.contains(\"ogi_budget_m:C\")]\n", "print(\"\\nDistrict interaction terms (deviation from reference slope):\")\n", "print(interaction_rows[display_cols].round(4))\n", - "print(f\"\\nR² = {m_h3.rsquared:.3f} Adj. R² = {m_h3.rsquared_adj:.3f}\")\n", + "print(f\"\\nR\u00b2 = {m_h3.rsquared:.3f} Adj. R\u00b2 = {m_h3.rsquared_adj:.3f}\")\n", "\n", - "# ── Plot district-specific budget slopes ─────────────────────────────────────\n", + "# \u2500\u2500 Plot district-specific budget slopes \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "districts = actuals[\"district\"].unique()\n", "slopes = {}\n", "for d in districts:\n", @@ -2660,7 +2542,7 @@ "slope_df.plot.barh(ax=ax, color=[\"#d62728\" if v < 0 else \"#1f77b4\" for v in slope_df])\n", "ax.axvline(0, color=\"black\", linewidth=0.8)\n", "ax.set_xlabel(\"Budget slope (compliance rate pp per $M)\")\n", - "ax.set_title(\"H3 — District-Specific Budget → Compliance Slopes\")\n", + "ax.set_title(\"H3 \u2014 District-Specific Budget \u2192 Compliance Slopes\")\n", "plt.tight_layout()\n", "plt.show()\n" ] @@ -2673,14 +2555,14 @@ "## H4: Spatial and Geographic Factors\n", "\n", "**Predictions:**\n", - "- Offshore-jurisdiction districts (02, 03, 04) show a different budget → output relationship due to dual onshore/offshore oversight burden.\n", + "- 
Offshore-jurisdiction districts (02, 03, 04) show a different budget \u2192 output relationship due to dual onshore/offshore oversight burden.\n", "- Border-proximate districts show a different relationship due to cross-jurisdiction enforcement complexity.\n", "- Spatial autocorrelation in H1 residuals (Moran's I) would indicate unmodeled geographic spillovers.\n" ] }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 44, "id": "d6e56f00", "metadata": {}, "outputs": [ @@ -2727,7 +2609,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 45, "id": "74686bfe", "metadata": {}, "outputs": [ @@ -2735,7 +2617,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "H4 — Spatial Moderators (DV: compliance_rate)\n", + "H4 \u2014 Spatial Moderators (DV: compliance_rate)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.35 0.15 2.39 0.02\n", "offshore 7.61 3.29 2.31 0.02\n", @@ -2743,12 +2625,12 @@ "ogi_budget_m:offshore -0.03 0.18 -0.16 0.87\n", "ogi_budget_m:border -0.25 0.15 -1.74 0.08\n", "\n", - "R² = 0.553 Adj. R² = 0.476\n", + "R\u00b2 = 0.553 Adj. 
R\u00b2 = 0.476\n", "\n", "Moran's I on H1 compliance residuals = -0.0512\n", - " > 0 → residuals cluster spatially (similar neighbours)\n", - " ≈ 0 → no spatial pattern\n", - " < 0 → spatial dispersion (dissimilar neighbours)\n", + " > 0 \u2192 residuals cluster spatially (similar neighbours)\n", + " \u2248 0 \u2192 no spatial pattern\n", + " < 0 \u2192 spatial dispersion (dissimilar neighbours)\n", "\n", "District centroids used:\n", "district lat lon\n", @@ -2769,7 +2651,7 @@ } ], "source": [ - "# ── Spatial regression: offshore and border interactions ─────────────────────\n", + "# \u2500\u2500 Spatial regression: offshore and border interactions \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "m_h4 = smf.ols(\n", " \"compliance_rate ~ ogi_budget_m + offshore + border \"\n", " \"+ ogi_budget_m:offshore + ogi_budget_m:border + C(district)\",\n", @@ -2781,11 +2663,11 @@ " \"ogi_budget_m:offshore\", \"ogi_budget_m:border\",\n", "]\n", "available = [r for r in spatial_rows if r in m_h4.params.index]\n", - "print(\"H4 — Spatial Moderators (DV: compliance_rate)\")\n", + "print(\"H4 \u2014 Spatial Moderators (DV: compliance_rate)\")\n", "print(m_h4.summary2().tables[1][display_cols].loc[available])\n", - "print(f\"\\nR² = {m_h4.rsquared:.3f} Adj. R² = {m_h4.rsquared_adj:.3f}\")\n", + "print(f\"\\nR\u00b2 = {m_h4.rsquared:.3f} Adj. 
R\u00b2 = {m_h4.rsquared_adj:.3f}\")\n", "\n", - "# ── Moran's I on H1 residuals ─────────────────────────────────────────────────\n", + "# \u2500\u2500 Moran's I on H1 residuals \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", "# Compute district centroids from well lat/lon joined via inspections\n", "centroids_sql = \"\"\"\n", "SELECT\n", @@ -2823,9 +2705,9 @@ " morans_i = (n / W.sum()) * (z @ W @ z) / (z @ z)\n", "\n", " print(f\"\\nMoran's I on H1 compliance residuals = {morans_i:.4f}\")\n", - " print(\" > 0 → residuals cluster spatially (similar neighbours)\")\n", - " print(\" ≈ 0 → no spatial pattern\")\n", - " print(\" < 0 → spatial dispersion (dissimilar neighbours)\")\n", + " print(\" > 0 \u2192 residuals cluster spatially (similar neighbours)\")\n", + " print(\" \u2248 0 \u2192 no spatial pattern\")\n", + " print(\" < 0 \u2192 spatial dispersion (dissimilar neighbours)\")\n", "\n", " print(\"\\nDistrict centroids used:\")\n", " print(centroids[[\"district\", \"lat\", \"lon\"]].round(2).to_string(index=False))\n", @@ -2843,9 +2725,9 @@ "\n", "### Descriptive Trends\n", "\n", - "Table 1 summarizes year-level means for the key variables across 2016–2025, with\n", - "regression analyses restricted to 2016–2023. OGI appropriations grew from $18.47 million\n", - "in 2016 to $34.33 million in 2023 — an 86 percent nominal increase — with the FY2024\n", + "Table 1 summarizes year-level means for the key variables across 2016\u20132025, with\n", + "regression analyses restricted to 2016\u20132023. OGI appropriations grew from $18.47 million\n", + "in 2016 to $34.33 million in 2023 \u2014 an 86 percent nominal increase \u2014 with the FY2024\n", "budget estimate reaching $38.51 million. 
Authorized FTE positions rose modestly from\n", "256.7 to 271.2 over the same period. Inspection volume per district increased from a\n", "mean of 18,278 in 2016 to a peak of 36,553 in 2024, with a partial-year figure of 34,082\n", @@ -2861,7 +2743,7 @@ "hypothesis, though they are also consistent with secular improvements in industry\n", "compliance independent of budget growth.\n", "\n", - "**Table 1. Year-Level Panel Means, 2016–2025**\n", + "**Table 1. Year-Level Panel Means, 2016\u20132025**\n", "\n", "| Year | OGI Budget ($M) | OGI FTE | Inspections/District | Compliance Rate (%) | Resolution Rate (%) | Days to Enforcement |\n", "|:----:|:---------------:|:-------:|:--------------------:|:-------------------:|:-------------------:|:-------------------:|\n", @@ -2873,15 +2755,15 @@ "| 2021 | 28.76 | 277.8 | 24,116 | 88.8 | 66.2 | 118.8 |\n", "| 2022 | 25.91 | 264.0 | 32,024 | 89.8 | 67.9 | 91.5 |\n", "| 2023 | 34.33 | 271.2 | 33,806 | 91.6 | 69.7 | 105.2 |\n", - "| 2024† | 38.51 | 280.8 | 36,553 | 92.6 | 65.1 | 76.9 |\n", - "| 2025‡ | — | — | 34,082 | 90.5 | 52.1 | 36.6‡ |\n", + "| 2024\u2020 | 38.51 | 280.8 | 36,553 | 92.6 | 65.1 | 76.9 |\n", + "| 2025\u2021 | \u2014 | \u2014 | 34,082 | 90.5 | 52.1 | 36.6\u2021 |\n", "\n", "*Note: Budget figures are nominal. FTE = authorized full-time equivalent positions.\n", "Inspections/District = mean district-level annual inspection count.*\n", - "*† 2024 budget is an appropriations estimate, not expenditure actuals; excluded from\n", + "*\u2020 2024 budget is an appropriations estimate, not expenditure actuals; excluded from\n", "regression models.*\n", - "*‡ 2025 data is partial-year as of the data extract. Resolution rate and days-to-enforcement\n", - "are right-censored: violations discovered in late 2024–2025 may not yet have a recorded\n", + "*\u2021 2025 data is partial-year as of the data extract. 
Resolution rate and days-to-enforcement\n", + "are right-censored: violations discovered in late 2024\u20132025 may not yet have a recorded\n", "enforcement action, compressing these metrics.*\n", "\n", "---\n", @@ -2900,7 +2782,7 @@ "changes and outcome changes rather than cross-sectional differences between\n", "better- and worse-funded districts.\n", "\n", - "**Table 2. H1 Regression Results: OGI Budget → Regulatory Outputs**\n", + "**Table 2. H1 Regression Results: OGI Budget \u2192 Regulatory Outputs**\n", "\n", "| Dependent Variable | $\\hat{\\beta}$ (Budget \\$M) | SE | $z$ | $p$ | $R^2$ | Adj. $R^2$ |\n", "|---|:---:|:---:|:---:|:---:|:---:|:---:|\n", @@ -2918,12 +2800,12 @@ "The goal ambiguity moderation model for compliance rate (Table 3) yields a statistically\n", "significant and negative interaction between OGI budget and inspection budget share\n", "($\\hat{\\beta}_3 = -6.53$, SE = 1.84, $z = -3.55$, $p < .01$). The negative sign is\n", - "substantively noteworthy: rather than amplifying the budget–compliance relationship,\n", + "substantively noteworthy: rather than amplifying the budget\u2013compliance relationship,\n", "higher concentration of resources on the inspection mandate is associated with diminishing\n", "marginal returns to additional appropriations. Evaluated at the mean inspection budget\n", "share ($\\bar{s} \\approx 0.62$), the implied marginal effect of a \\$1 million budget\n", "increase on compliance rate is approximately $4.20 - 6.53(0.62) \\approx 0.15$ percentage\n", - "points — consistent with, though slightly smaller than, the H1 estimate. This pattern\n", + "points \u2014 consistent with, though slightly smaller than, the H1 estimate. 
This pattern\n", "suggests that as the inspection program becomes better resourced relative to other RRC\n", "mandates, the incremental compliance gain from further investment contracts, consistent\n", "with a resource saturation or ceiling effect.\n", @@ -2939,7 +2821,7 @@ "|---|:---:|:---:|:---:|:---:|\n", "| Budget (\\$M) | 4.20 | 1.09 | 3.86 | <.01 |\n", "| Inspection budget share | 170.18 | 44.79 | 3.80 | <.01 |\n", - "| Budget × Share | −6.53 | 1.84 | −3.55 | <.01 |\n", + "| Budget \u00d7 Share | \u22126.53 | 1.84 | \u22123.55 | <.01 |\n", "\n", "*Note: District fixed effects included. SE clustered at district. $R^2 = .567$,\n", "Adj. $R^2 = .493$. $N = 104$.*\n", @@ -2960,13 +2842,13 @@ "saturated model (see Data and Methods); point estimates are presented as descriptive\n", "indicators only.\n", "\n", - "**Table 4. H3 District-Specific Budget → Compliance Slopes (pp per \\$1M)**\n", + "**Table 4. H3 District-Specific Budget \u2192 Compliance Slopes (pp per \\$1M)**\n", "\n", "| District | Estimated Slope |\n", "|:---:|:---:|\n", "| 01 (San Antonio) | 0.09 |\n", "| 02 (Corpus Christi) | 0.24 |\n", - "| 03 (Houston) | −0.34 |\n", + "| 03 (Houston) | \u22120.34 |\n", "| 04 (Laredo) | 0.28 |\n", "| 05 (Midland/Abilene) | 0.05 |\n", "| 06 (Kilgore) | 0.43 |\n", @@ -2989,11 +2871,11 @@ "non-offshore districts on average, net of budget ($\\hat{\\beta} = 7.61$, SE = 3.29,\n", "$z = 2.31$, $p = .02$). Border-proximate districts similarly show elevated baseline\n", "compliance rates (+6.03 pp, SE = 2.84, $z = 2.12$, $p = .03$). 
These level effects may\n", - "reflect the heightened external scrutiny — from federal regulators, environmental\n", - "organizations, and media — that offshore and border districts attract, which could\n", + "reflect the heightened external scrutiny \u2014 from federal regulators, environmental\n", + "organizations, and media \u2014 that offshore and border districts attract, which could\n", "independently drive compliance investments by operators regardless of RRC budget levels.\n", "\n", - "The budget–compliance slope, however, does not differ significantly between offshore\n", + "The budget\u2013compliance slope, however, does not differ significantly between offshore\n", "and non-offshore districts ($\\hat{\\beta}_4 = -0.03$, $p = .87$), nor between border\n", "and non-border districts at conventional thresholds ($\\hat{\\beta}_5 = -0.25$, $p = .08$),\n", "suggesting that geographic classification affects the *level* of compliance performance\n", @@ -3012,11 +2894,11 @@ "| Budget (\\$M) | 0.35 | 0.15 | 2.39 | .02 |\n", "| Offshore (= 1) | 7.61 | 3.29 | 2.31 | .02 |\n", "| Border (= 1) | 6.03 | 2.84 | 2.12 | .03 |\n", - "| Budget × Offshore | −0.03 | 0.18 | −0.16 | .87 |\n", - "| Budget × Border | −0.25 | 0.15 | −1.74 | .08 |\n", + "| Budget \u00d7 Offshore | \u22120.03 | 0.18 | \u22120.16 | .87 |\n", + "| Budget \u00d7 Border | \u22120.25 | 0.15 | \u22121.74 | .08 |\n", "\n", "*Note: District fixed effects included. SE clustered at district. $R^2 = .553$,\n", - "Adj. $R^2 = .476$. $N = 104$. Moran's $I$ on H1 compliance residuals = −0.051 (no\n", + "Adj. $R^2 = .476$. $N = 104$. Moran's $I$ on H1 compliance residuals = \u22120.051 (no\n", "significant spatial autocorrelation).*\n", "\n", "---\n", @@ -3025,17 +2907,173 @@ "\n", "Taken together, the results offer moderate support for a resource-capacity model of\n", "regulatory performance. 
Higher OGI appropriations are reliably associated with greater\n", - "inspection volume, higher compliance rates, and faster violation resolution — though\n", + "inspection volume, higher compliance rates, and faster violation resolution \u2014 though\n", "identification rests on temporal variation in statewide appropriations rather than\n", "quasi-experimental assignment, and the modest panel length limits statistical precision.\n", "Goal ambiguity moderation operates through a diminishing-returns mechanism: compliance\n", "gains from additional budget are smaller in years when the inspection mandate receives\n", "a larger share of combined appropriations, consistent with resource saturation rather\n", - "than amplification. District heterogeneity in budget–outcome slopes is substantial in\n", + "than amplification. District heterogeneity in budget\u2013outcome slopes is substantial in\n", "descriptive terms but cannot be precisely estimated with the available data. Finally,\n", - "geographic context — offshore jurisdiction and border proximity — predicts compliance\n", + "geographic context \u2014 offshore jurisdiction and border proximity \u2014 predicts compliance\n", "levels but not budget sensitivity, and spatial autocorrelation diagnostics provide no\n", - "evidence of unmodeled geographic spillover processes.\n" + "evidence of unmodeled geographic spillover processes.\n", + "\n", + "### Robustness Checks\n", + "\n", + "Wild cluster bootstrap p-values and distributed lag results are reported in the\n", + "Robustness Checks section below. Conclusions regarding H1 significance should\n", + "be read alongside those results once the bootstrap has been run; if bootstrap\n", + "p-values align with asymptotic results, the H1 findings are robust to the\n", + "small-cluster concern. 
The distributed lag models assess whether the budget\n", + "effect is more consistent with an instantaneous or a lagged implementation\n", + "mechanism, with implications for causal interpretation.\n" + ] + }, + { + "cell_type": "markdown", + "id": "360e76f4", + "metadata": {}, + "source": [ + "## Robustness Checks\n", + "\n", + "Two checks address limitations of the baseline H1 models.\n", + "\n", + "**Wild cluster bootstrap** re-tests H1 with bootstrap inference rather than\n", + "asymptotic cluster-robust standard errors. With only G = 13 clusters,\n", + "asymptotic results can be unreliable; Rademacher wild cluster bootstrap\n", + "(Cameron, Gelbach & Miller 2008) gives more reliable p-values when\n", + "clusters are few.\n", + "\n", + "**Distributed lag model** relaxes the assumption that budget effects are\n", + "instantaneous. A one-year lag reflects a plausible implementation timeline:\n", + "appropriations are enacted, hiring and training occur, then inspection activity\n", + "increases. 
If the lagged coefficient is significant while the contemporaneous\n", + "one weakens, that pattern is harder to explain with a simple confounding time\n", + "trend and strengthens the causal interpretation.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "12b7ded8", + "metadata": {}, + "outputs": [], + "source": [ + "# Wild cluster bootstrap (Rademacher weights, B=999)\n", + "# For each draw: multiply each cluster's residuals by \u00b11, re-fit, record t-stat.\n", + "# p-value = share of |t*| >= |t_observed|.\n", + "\n", + "def wild_cluster_bootstrap(model, data, dv, cluster_col=\"district\",\n", + " coef=\"ogi_budget_m\", B=999, seed=42):\n", + " rng = np.random.default_rng(seed)\n", + " groups = data[cluster_col].values\n", + " unique_groups = np.unique(groups)\n", + " t_obs = model.tvalues[coef]\n", + " yhat = model.fittedvalues.values\n", + " ehat = model.resid.values\n", + "\n", + " t_boot = np.empty(B)\n", + " for b in range(B):\n", + " # One Rademacher weight per cluster, broadcast to observations\n", + " cw = {g: rng.choice([-1.0, 1.0]) for g in unique_groups}\n", + " w = np.array([cw[g] for g in groups])\n", + " df_b = data.copy()\n", + " df_b[dv] = yhat + ehat * w\n", + " m_b = smf.ols(\n", + " f\"{dv} ~ {coef} + C({cluster_col})\", data=df_b\n", + " ).fit(cov_type=\"cluster\", cov_kwds={\"groups\": df_b[cluster_col]})\n", + " t_boot[b] = m_b.tvalues.get(coef, np.nan)\n", + "\n", + " p_boot = float((np.abs(t_boot) >= np.abs(t_obs)).mean())\n", + " return t_obs, float(model.pvalues[coef]), p_boot\n", + "\n", + "print(\"Wild Cluster Bootstrap \u2014 H1 Models (B = 999 Rademacher draws)\")\n", + "print(f\"{'Outcome':<28} {'t-stat':>7} {'p asymptotic':>13} {'p bootstrap':>12}\")\n", + "print(\"\u2500\" * 65)\n", + "\n", + "for dv, model in [\n", + " (\"total_inspections\", m_inspections),\n", + " (\"compliance_rate\", m_compliance),\n", + " (\"resolution_rate\", m_resolution),\n", + "]:\n", + " t, p_a, p_b = 
wild_cluster_bootstrap(model, actuals, dv)\n", + " sig_a = \"*\" * ((p_a < .10) + (p_a < .05) + (p_a < .01))\n", + " sig_b = \"*\" * ((p_b < .10) + (p_b < .05) + (p_b < .01))\n", + " print(f\"{dv:<28} {t:>7.3f} {p_a:>12.3f}{sig_a:<3} {p_b:>10.3f}{sig_b:<3}\")\n", + "\n", + "print(\"\\n* p<.10 ** p<.05 *** p<.01\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1add0c69", + "metadata": {}, + "outputs": [], + "source": [ + "# Distributed lag: 1-year lag of OGI budget (shift within district).\n", + "# Lag is NaN for 2016 (no 2015 data), so regression sample is 2017-2023 (N=91).\n", + "\n", + "panel_lag = panel.copy()\n", + "panel_lag[\"ogi_budget_m_lag1\"] = (\n", + " panel_lag.sort_values(\"year\")\n", + " .groupby(\"district\")[\"ogi_budget_m\"]\n", + " .shift(1)\n", + ")\n", + "\n", + "lag_actuals = panel_lag[\n", + " (panel_lag[\"is_budget_year\"] == 0) &\n", + " (panel_lag[\"ogi_budget_m_lag1\"].notna())\n", + "].copy()\n", + "\n", + "print(f\"Distributed lag sample: {len(lag_actuals)} obs | \"\n", + " f\"years {lag_actuals['year'].min()}\u2013{lag_actuals['year'].max()}\")\n", + "\n", + "# \u2500\u2500 Model A: lagged budget only \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "m_lag_only = smf.ols(\n", + " \"compliance_rate ~ ogi_budget_m_lag1 + C(district)\", data=lag_actuals\n", + ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", + "\n", + "# \u2500\u2500 Model B: contemporaneous + 1-year lag \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "m_lag_both = smf.ols(\n", + " \"compliance_rate ~ ogi_budget_m + 
ogi_budget_m_lag1 + C(district)\",\n", + " data=lag_actuals\n", + ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", + "\n", + "# \u2500\u2500 Also run for resolution rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "m_lag_res_only = smf.ols(\n", + " \"resolution_rate ~ ogi_budget_m_lag1 + C(district)\", data=lag_actuals\n", + ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", + "\n", + "m_lag_res_both = smf.ols(\n", + " \"resolution_rate ~ ogi_budget_m + ogi_budget_m_lag1 + C(district)\",\n", + " data=lag_actuals\n", + ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", + "\n", + "print(\"\\n\u2500\u2500 Compliance Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\")\n", + "\n", + "print(\"\\nModel A \u2014 Lagged budget only (t\u22121):\")\n", + "print(m_lag_only.summary2().tables[1][display_cols].loc[[\"ogi_budget_m_lag1\"]])\n", + "print(f\" R\u00b2 = {m_lag_only.rsquared:.3f} Adj. R\u00b2 = {m_lag_only.rsquared_adj:.3f}\")\n", + "\n", + "print(\"\\nModel B \u2014 Contemporaneous + 1-year lag:\")\n", + "print(m_lag_both.summary2().tables[1][display_cols].loc[\n", + " [\"ogi_budget_m\", \"ogi_budget_m_lag1\"]\n", + "])\n", + "print(f\" R\u00b2 = {m_lag_both.rsquared:.3f} Adj. 
R\u00b2 = {m_lag_both.rsquared_adj:.3f}\")\n", + "\n", + "print(\"\\n\u2500\u2500 Resolution Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\")\n", + "\n", + "print(\"\\nModel A \u2014 Lagged budget only (t\u22121):\")\n", + "print(m_lag_res_only.summary2().tables[1][display_cols].loc[[\"ogi_budget_m_lag1\"]])\n", + "print(f\" R\u00b2 = {m_lag_res_only.rsquared:.3f} Adj. R\u00b2 = {m_lag_res_only.rsquared_adj:.3f}\")\n", + "\n", + "print(\"\\nModel B \u2014 Contemporaneous + 1-year lag:\")\n", + "print(m_lag_res_both.summary2().tables[1][display_cols].loc[\n", + " [\"ogi_budget_m\", \"ogi_budget_m_lag1\"]\n", + "])\n", + "print(f\" R\u00b2 = {m_lag_res_both.rsquared:.3f} Adj. R\u00b2 = {m_lag_res_both.rsquared_adj:.3f}\")\n" ] }, { @@ -3049,22 +3087,22 @@ "\n", "| # | Hypothesis | Prediction | Key Result | Support |\n", "|:---:|---|---|---|:---:|\n", - "| **H1a** | Capacity → Inspection volume | Higher OGI budget predicts more inspections per district | β = 666.3 inspections per $1M (z = 3.13, p < .01); R² = .769 | ✓ |\n", - "| **H1b** | Capacity → Compliance | Higher OGI budget predicts higher district compliance rate | β = 0.26 pp per $1M (z = 2.31, p = .02); R² = .538 | ✓ |\n", - "| **H1c** | Capacity → Resolution | Higher OGI budget predicts higher violation resolution rate | β = 1.05 pp per $1M (z = 3.28, p < .01); R² = .624 | ✓ |\n", - "| **H2a** | Goal ambiguity moderates capacity → compliance | Clearer inspection focus amplifies budget effect | Interaction significant but **negative** (β = −6.53, z = −3.55, p < .01): higher inspection share produces diminishing, not amplified, returns | Partial† |\n", - "| **H2b** | Goal ambiguity moderates capacity → resolution | Clearer inspection focus amplifies budget effect | Interaction not significant (p 
= .24) | ✗ |\n", - "| **H3** | District heterogeneity in budget slopes | Budget → compliance slope varies across districts | Point estimates range from −0.34 pp/$1M (D03) to +1.36 pp/$1M (D6E); inference unreliable due to multicollinearity | Descriptive only‡ |\n", - "| **H4a** | Offshore jurisdiction moderates budget effect | Offshore districts show different budget → compliance slope | Level effect significant (+7.6 pp, p = .02); slope interaction not significant (β = −0.03, p = .87) | Partial§ |\n", - "| **H4b** | Border proximity moderates budget effect | Border districts show different budget → compliance slope | Level effect significant (+6.0 pp, p = .03); slope interaction marginal (β = −0.25, p = .08) | Partial§ |\n", - "| **H4c** | Spatial autocorrelation in residuals | Geographic spillovers produce clustered residuals | Moran's I = −0.051; no significant spatial autocorrelation | ✗ |\n", + "| **H1a** | Capacity \u2192 Inspection volume | Higher OGI budget predicts more inspections per district | \u03b2 = 666.3 inspections per $1M (z = 3.13, p < .01); R\u00b2 = .769 | \u2713 |\n", + "| **H1b** | Capacity \u2192 Compliance | Higher OGI budget predicts higher district compliance rate | \u03b2 = 0.26 pp per $1M (z = 2.31, p = .02); R\u00b2 = .538 | \u2713 |\n", + "| **H1c** | Capacity \u2192 Resolution | Higher OGI budget predicts higher violation resolution rate | \u03b2 = 1.05 pp per $1M (z = 3.28, p < .01); R\u00b2 = .624 | \u2713 |\n", + "| **H2a** | Goal ambiguity moderates capacity \u2192 compliance | Clearer inspection focus amplifies budget effect | Interaction significant but **negative** (\u03b2 = \u22126.53, z = \u22123.55, p < .01): higher inspection share produces diminishing, not amplified, returns | Partial\u2020 |\n", + "| **H2b** | Goal ambiguity moderates capacity \u2192 resolution | Clearer inspection focus amplifies budget effect | Interaction not significant (p = .24) | \u2717 |\n", + "| **H3** | District heterogeneity in budget slopes | 
Budget \u2192 compliance slope varies across districts | Point estimates range from \u22120.34 pp/$1M (D03) to +1.36 pp/$1M (D6E); inference unreliable due to multicollinearity | Descriptive only\u2021 |\n", + "| **H4a** | Offshore jurisdiction moderates budget effect | Offshore districts show different budget \u2192 compliance slope | Level effect significant (+7.6 pp, p = .02); slope interaction not significant (\u03b2 = \u22120.03, p = .87) | Partial\u00a7 |\n", + "| **H4b** | Border proximity moderates budget effect | Border districts show different budget \u2192 compliance slope | Level effect significant (+6.0 pp, p = .03); slope interaction marginal (\u03b2 = \u22120.25, p = .08) | Partial\u00a7 |\n", + "| **H4c** | Spatial autocorrelation in residuals | Geographic spillovers produce clustered residuals | Moran's I = \u22120.051; no significant spatial autocorrelation | \u2717 |\n", "\n", "*Notes:*\n", - "*† H2 moderation operates through a diminishing-returns mechanism rather than amplification. At mean inspection budget share (≈ 0.62), the implied marginal budget effect on compliance is approximately 0.15 pp per $1M.*\n", - "*‡ H3 interaction standard errors are unreliable (near-perfect multicollinearity in the saturated model); budget slopes are reported as descriptive point estimates only.*\n", - "*§ Geographic classification predicts compliance **levels** but not budget sensitivity. Offshore and border districts exhibit systematically higher compliance regardless of annual budget variation.*\n", + "*\u2020 H2 moderation operates through a diminishing-returns mechanism rather than amplification. 
At mean inspection budget share (\u2248 0.62), the implied marginal budget effect on compliance is approximately 0.15 pp per $1M.*\n", + "*\u2021 H3 interaction standard errors are unreliable (near-perfect multicollinearity in the saturated model); budget slopes are reported as descriptive point estimates only.*\n", + "*\u00a7 Geographic classification predicts compliance **levels** but not budget sensitivity. Offshore and border districts exhibit systematically higher compliance regardless of annual budget variation.*\n", "\n", - "**Regression sample:** N = 104 (13 districts × 8 years, 2016–2023). All models include district fixed effects; standard errors clustered at the district level.\n" + "**Regression sample:** N = 104 (13 districts \u00d7 8 years, 2016\u20132023). All models include district fixed effects; standard errors clustered at the district level.\n" ] } ], @@ -3089,4 +3127,4 @@ }, "nbformat": 4, "nbformat_minor": 5 -} +} \ No newline at end of file