From 0b4d7f88416a8dfb3a4c59a158f484b0923ecf14 Mon Sep 17 00:00:00 2001 From: dadams Date: Wed, 25 Feb 2026 13:32:52 -0800 Subject: [PATCH] we have some analysis --- .gitignore | 2 + claude-resume.txt | 0 texas_inspection_expenses.html | 10452 ++++++++++++++++++++++++++++++ texas_inspection_expenses.ipynb | 847 ++- 4 files changed, 11046 insertions(+), 255 deletions(-) create mode 100644 claude-resume.txt create mode 100644 texas_inspection_expenses.html diff --git a/.gitignore b/.gitignore index 44dcd9f..1ad38ea 100644 --- a/.gitignore +++ b/.gitignore @@ -2,3 +2,5 @@ __pycache__/ *.pyc .ipynb_checkpoints/ +# macOS specific files +.DS_Store \ No newline at end of file diff --git a/claude-resume.txt b/claude-resume.txt new file mode 100644 index 0000000..e69de29 diff --git a/texas_inspection_expenses.html b/texas_inspection_expenses.html new file mode 100644 index 0000000..cef9a2a --- /dev/null +++ b/texas_inspection_expenses.html @@ -0,0 +1,10452 @@ + + + + + +texas_inspection_expenses + + + + + + + + + + + + +
+ + + + + + + + + + + + +
+ + diff --git a/texas_inspection_expenses.ipynb b/texas_inspection_expenses.ipynb index 8077ddd..3832f75 100644 --- a/texas_inspection_expenses.ipynb +++ b/texas_inspection_expenses.ipynb @@ -4,153 +4,26 @@ "cell_type": "markdown", "id": "dc6818b8", "metadata": {}, - "source": "# Texas RRC Inspection Expenses Analysis\n\n**Research question:** Does organizational capacity (budget, staffing) predict better regulatory outputs (inspections, compliance, enforcement), and how is that relationship moderated by goal ambiguity, district-level heterogeneity, and spatial/geographic factors?\n\n## Hypotheses\n- **H1 \u2014 Capacity \u2192 Outputs:** Higher OGI budget and FTE predict more inspections, higher compliance rates, and faster violation resolution.\n- **H2 \u2014 Goal Ambiguity:** When a larger share of RRC budget goes to the more ambiguous \"Energy Resource Development\" goal, the capacity \u2192 output relationship weakens.\n- **H3 \u2014 Multilevel / District Effects:** The capacity \u2192 output relationship varies across RRC districts (budget slope heterogeneity).\n- **H4 \u2014 Spatial & Geographic:** Offshore-jurisdiction and border districts moderate the capacity \u2192 output relationship; spatial autocorrelation in residuals is tested via Moran's I.\n\n**Data:**\n- PostgreSQL warehouse (`texas_data`): `inspections`, `violations`, `well_shape_tract`\n- `RRC Budget Data.xlsx`: statewide RRC budget by strategy, 2016\u20132024\n- Analysis panel: 2016\u20132025 (N = 130 district-years); regression sample: 2016\u20132023 (N = 104)\n" - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "3ed415f0", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: jupyter in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 1)) (1.1.1)\n", - "Requirement already satisfied: ipykernel in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 2)) (6.31.0)\n", - 
"Requirement already satisfied: pandas in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 3)) (2.3.3)\n", - "Requirement already satisfied: numpy in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 4)) (2.0.2)\n", - "Requirement already satisfied: sqlalchemy in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 5)) (2.0.47)\n", - "Requirement already satisfied: psycopg2-binary in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 6)) (2.9.11)\n", - "Requirement already satisfied: python-dotenv in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 7)) (1.2.1)\n", - "Requirement already satisfied: openpyxl in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 8)) (3.1.5)\n", - "Requirement already satisfied: statsmodels in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 9)) (0.14.6)\n", - "Requirement already satisfied: scipy in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 10)) (1.13.1)\n", - "Requirement already satisfied: matplotlib in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 11)) (3.9.4)\n", - "Requirement already satisfied: seaborn in ./.venv/lib/python3.9/site-packages (from -r requirements.txt (line 12)) (0.13.2)\n", - "Requirement already satisfied: jupyterlab in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (4.5.5)\n", - "Requirement already satisfied: jupyter-console in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (6.6.3)\n", - "Requirement already satisfied: nbconvert in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (7.17.0)\n", - "Requirement already satisfied: ipywidgets in ./.venv/lib/python3.9/site-packages (from jupyter->-r requirements.txt (line 1)) (8.1.8)\n", - "Requirement already satisfied: notebook in ./.venv/lib/python3.9/site-packages (from jupyter->-r 
requirements.txt (line 1)) (7.5.4)\n", - "Requirement already satisfied: jupyter-client>=8.0.0 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (8.6.3)\n", - "Requirement already satisfied: psutil>=5.7 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (7.2.2)\n", - "Requirement already satisfied: matplotlib-inline>=0.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.2.1)\n", - "Requirement already satisfied: tornado>=6.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (6.5.4)\n", - "Requirement already satisfied: ipython>=7.23.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (8.18.1)\n", - "Requirement already satisfied: debugpy>=1.6.5 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.8.20)\n", - "Requirement already satisfied: appnope>=0.1.2 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.1.4)\n", - "Requirement already satisfied: nest-asyncio>=1.4 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (1.6.0)\n", - "Requirement already satisfied: packaging>=22 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (26.0)\n", - "Requirement already satisfied: pyzmq>=25 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (27.1.0)\n", - "Requirement already satisfied: comm>=0.1.1 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (0.2.3)\n", - "Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.8.1)\n", - "Requirement already satisfied: traitlets>=5.4.0 in ./.venv/lib/python3.9/site-packages (from ipykernel->-r requirements.txt (line 2)) (5.14.3)\n", - "Requirement already 
satisfied: tzdata>=2022.7 in ./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2025.3)\n", - "Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2.9.0.post0)\n", - "Requirement already satisfied: pytz>=2020.1 in ./.venv/lib/python3.9/site-packages (from pandas->-r requirements.txt (line 3)) (2025.2)\n", - "Requirement already satisfied: typing-extensions>=4.6.0 in ./.venv/lib/python3.9/site-packages (from sqlalchemy->-r requirements.txt (line 5)) (4.15.0)\n", - "Requirement already satisfied: et-xmlfile in ./.venv/lib/python3.9/site-packages (from openpyxl->-r requirements.txt (line 8)) (2.0.0)\n", - "Requirement already satisfied: patsy>=0.5.6 in ./.venv/lib/python3.9/site-packages (from statsmodels->-r requirements.txt (line 9)) (1.0.2)\n", - "Requirement already satisfied: contourpy>=1.0.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (1.3.0)\n", - "Requirement already satisfied: cycler>=0.10 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (0.12.1)\n", - "Requirement already satisfied: fonttools>=4.22.0 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (4.60.2)\n", - "Requirement already satisfied: pillow>=8 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (11.3.0)\n", - "Requirement already satisfied: pyparsing>=2.3.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (3.3.2)\n", - "Requirement already satisfied: kiwisolver>=1.3.1 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (1.4.7)\n", - "Requirement already satisfied: importlib-resources>=3.2.0 in ./.venv/lib/python3.9/site-packages (from matplotlib->-r requirements.txt (line 11)) (6.5.2)\n", - "Requirement already satisfied: zipp>=3.1.0 in 
./.venv/lib/python3.9/site-packages (from importlib-resources>=3.2.0->matplotlib->-r requirements.txt (line 11)) (3.23.0)\n", - "Requirement already satisfied: decorator in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (5.2.1)\n", - "Requirement already satisfied: pexpect>4.3 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (4.9.0)\n", - "Requirement already satisfied: exceptiongroup in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (1.3.1)\n", - "Requirement already satisfied: jedi>=0.16 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.19.2)\n", - "Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (3.0.52)\n", - "Requirement already satisfied: pygments>=2.4.0 in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (2.19.2)\n", - "Requirement already satisfied: stack-data in ./.venv/lib/python3.9/site-packages (from ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.6.3)\n", - "Requirement already satisfied: parso<0.9.0,>=0.8.4 in ./.venv/lib/python3.9/site-packages (from jedi>=0.16->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.8.6)\n", - "Requirement already satisfied: importlib-metadata>=4.8.3 in ./.venv/lib/python3.9/site-packages (from jupyter-client>=8.0.0->ipykernel->-r requirements.txt (line 2)) (8.7.1)\n", - "Requirement already satisfied: platformdirs>=2.5 in ./.venv/lib/python3.9/site-packages (from jupyter-core!=5.0.*,>=4.12->ipykernel->-r requirements.txt (line 2)) (4.4.0)\n", - "Requirement already satisfied: ptyprocess>=0.5 in ./.venv/lib/python3.9/site-packages (from pexpect>4.3->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.7.0)\n", - 
"Requirement already satisfied: wcwidth in ./.venv/lib/python3.9/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.6.0)\n", - "Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.9/site-packages (from python-dateutil>=2.8.2->pandas->-r requirements.txt (line 3)) (1.17.0)\n", - "Requirement already satisfied: jupyterlab_widgets~=3.0.15 in ./.venv/lib/python3.9/site-packages (from ipywidgets->jupyter->-r requirements.txt (line 1)) (3.0.16)\n", - "Requirement already satisfied: widgetsnbextension~=4.0.14 in ./.venv/lib/python3.9/site-packages (from ipywidgets->jupyter->-r requirements.txt (line 1)) (4.0.15)\n", - "Requirement already satisfied: jupyterlab-server<3,>=2.28.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.28.0)\n", - "Requirement already satisfied: tomli>=1.2.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.4.0)\n", - "Requirement already satisfied: jupyter-lsp>=2.0.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.3.0)\n", - "Requirement already satisfied: jupyter-server<3,>=2.4.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.17.0)\n", - "Requirement already satisfied: notebook-shim>=0.2 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (0.2.4)\n", - "Requirement already satisfied: setuptools>=41.1.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (58.0.4)\n", - "Requirement already satisfied: async-lru>=1.0.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (2.0.5)\n", - "Requirement already satisfied: httpx<1,>=0.25.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (0.28.1)\n", - 
"Requirement already satisfied: jinja2>=3.0.3 in ./.venv/lib/python3.9/site-packages (from jupyterlab->jupyter->-r requirements.txt (line 1)) (3.1.6)\n", - "Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.0.9)\n", - "Requirement already satisfied: idna in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.11)\n", - "Requirement already satisfied: anyio in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.12.1)\n", - "Requirement already satisfied: certifi in ./.venv/lib/python3.9/site-packages (from httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2026.2.25)\n", - "Requirement already satisfied: h11>=0.16 in ./.venv/lib/python3.9/site-packages (from httpcore==1.*->httpx<1,>=0.25.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.16.0)\n", - "Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.9/site-packages (from jinja2>=3.0.3->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.0.3)\n", - "Requirement already satisfied: jupyter-server-terminals>=0.4.4 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.5.4)\n", - "Requirement already satisfied: websocket-client>=1.7 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.9.0)\n", - "Requirement already satisfied: overrides>=5.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (7.7.0)\n", - "Requirement already satisfied: prometheus-client>=0.9 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.24.1)\n", - "Requirement already satisfied: 
argon2-cffi>=21.1 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.1.0)\n", - "Requirement already satisfied: nbformat>=5.3.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (5.10.4)\n", - "Requirement already satisfied: send2trash>=1.8.2 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.1.0)\n", - "Requirement already satisfied: terminado>=0.8.3 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.18.1)\n", - "Requirement already satisfied: jupyter-events>=0.11.0 in ./.venv/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.12.0)\n", - "Requirement already satisfied: argon2-cffi-bindings in ./.venv/lib/python3.9/site-packages (from argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.1.0)\n", - "Requirement already satisfied: rfc3339-validator in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.1.4)\n", - "Requirement already satisfied: referencing in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.36.2)\n", - "Requirement already satisfied: pyyaml>=5.3 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (6.0.3)\n", - "Requirement already satisfied: python-json-logger>=2.0.4 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.0.0)\n", - "Requirement already satisfied: 
jsonschema[format-nongpl]>=4.18.0 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (4.25.1)\n", - "Requirement already satisfied: rfc3986-validator>=0.1.1 in ./.venv/lib/python3.9/site-packages (from jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.1.1)\n", - "Requirement already satisfied: attrs>=22.2.0 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (25.4.0)\n", - "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2025.9.1)\n", - "Requirement already satisfied: rpds-py>=0.7.1 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.27.1)\n", - "Requirement already satisfied: fqdn in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.5.1)\n", - "Requirement already satisfied: rfc3987-syntax>=1.1.0 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.1.0)\n", - "Requirement already satisfied: uri-template in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.3.0)\n", - "Requirement already satisfied: webcolors>=24.6.0 in ./.venv/lib/python3.9/site-packages (from 
jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (24.11.1)\n", - "Requirement already satisfied: isoduration in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (20.11.0)\n", - "Requirement already satisfied: jsonpointer>1.13 in ./.venv/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.0.0)\n", - "Requirement already satisfied: requests>=2.31 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.32.5)\n", - "Requirement already satisfied: babel>=2.10 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.18.0)\n", - "Requirement already satisfied: json5>=0.9.0 in ./.venv/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (0.13.0)\n", - "Requirement already satisfied: defusedxml in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.7.1)\n", - "Requirement already satisfied: nbclient>=0.5.0 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.10.2)\n", - "Requirement already satisfied: bleach[css]!=5.0.0 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (6.2.0)\n", - "Requirement already satisfied: beautifulsoup4 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (4.14.3)\n", - "Requirement already satisfied: pandocfilters>=1.4.1 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (1.5.1)\n", - "Requirement 
already satisfied: mistune<4,>=2.0.3 in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (3.2.0)\n", - "Requirement already satisfied: jupyterlab-pygments in ./.venv/lib/python3.9/site-packages (from nbconvert->jupyter->-r requirements.txt (line 1)) (0.3.0)\n", - "Requirement already satisfied: webencodings in ./.venv/lib/python3.9/site-packages (from bleach[css]!=5.0.0->nbconvert->jupyter->-r requirements.txt (line 1)) (0.5.1)\n", - "Requirement already satisfied: tinycss2<1.5,>=1.1.0 in ./.venv/lib/python3.9/site-packages (from bleach[css]!=5.0.0->nbconvert->jupyter->-r requirements.txt (line 1)) (1.4.0)\n", - "Requirement already satisfied: fastjsonschema>=2.15 in ./.venv/lib/python3.9/site-packages (from nbformat>=5.3.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.21.2)\n", - "Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.9/site-packages (from requests>=2.31->jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.6.3)\n", - "Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.9/site-packages (from requests>=2.31->jupyterlab-server<3,>=2.28.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (3.4.4)\n", - "Requirement already satisfied: lark>=1.2.2 in ./.venv/lib/python3.9/site-packages (from rfc3987-syntax>=1.1.0->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.3.1)\n", - "Requirement already satisfied: cffi>=1.0.1 in ./.venv/lib/python3.9/site-packages (from argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (2.0.0)\n", - "Requirement already satisfied: pycparser in ./.venv/lib/python3.9/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) 
(2.23)\n", - "Requirement already satisfied: soupsieve>=1.6.1 in ./.venv/lib/python3.9/site-packages (from beautifulsoup4->nbconvert->jupyter->-r requirements.txt (line 1)) (2.8.3)\n", - "Requirement already satisfied: arrow>=0.15.0 in ./.venv/lib/python3.9/site-packages (from isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.11.0->jupyter-server<3,>=2.4.0->jupyterlab->jupyter->-r requirements.txt (line 1)) (1.4.0)\n", - "Requirement already satisfied: executing>=1.2.0 in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (2.2.1)\n", - "Requirement already satisfied: asttokens>=2.1.0 in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (3.0.1)\n", - "Requirement already satisfied: pure-eval in ./.venv/lib/python3.9/site-packages (from stack-data->ipython>=7.23.1->ipykernel->-r requirements.txt (line 2)) (0.2.3)\n", - "\u001b[33mWARNING: You are using pip version 21.2.4; however, version 26.0.1 is available.\n", - "You should consider upgrading via the '/Users/dpadams/Repos/texas-inspection-expenses/.venv/bin/python -m pip install --upgrade pip' command.\u001b[0m\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], "source": [ - "%pip install -r requirements.txt" + "# Texas RRC Inspection Expenses Analysis\n", + "\n", + "**Research question:** Does organizational capacity (budget, staffing) predict better regulatory outputs (inspections, compliance, enforcement), and how is that relationship moderated by goal ambiguity, district-level heterogeneity, and spatial/geographic factors?\n", + "\n", + "## Hypotheses\n", + "- **H1 — Capacity → Outputs:** Higher OGI budget and FTE predict more inspections, higher compliance rates, and faster violation resolution.\n", + "- **H2 — Goal Ambiguity:** When a larger share of RRC budget goes to the more ambiguous \"Energy Resource Development\" goal, the 
capacity → output relationship weakens.\n", + "- **H3 — Multilevel / District Effects:** The capacity → output relationship varies across RRC districts (budget slope heterogeneity).\n", + "- **H4 — Spatial & Geographic:** Offshore-jurisdiction and border districts moderate the capacity → output relationship; spatial autocorrelation in residuals is tested via Moran's I.\n", + "\n", + "**Data:**\n", + "- PostgreSQL warehouse (`texas_data`): `inspections`, `violations`, `well_shape_tract`\n", + "- `RRC Budget Data.xlsx`: statewide RRC budget by strategy, 2016–2024\n", + "- Analysis panel: 2016–2025 (N = 130 district-years); regression sample: 2016–2023 (N = 104)\n" ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 18, "id": "49de2b5c", "metadata": {}, "outputs": [], @@ -183,7 +56,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Connected \u2192 texas_data on localhost:5433\n" + "Connected → texas_data on localhost:5433\n" ] } ], @@ -199,7 +72,7 @@ "engine = create_engine(\n", " f\"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}\"\n", ")\n", - "print(f\"Connected \u2192 {database} on {host}:{port}\")\n" + "print(f\"Connected → {database} on {host}:{port}\")\n" ] }, { @@ -735,7 +608,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Budget long: 18 rows (2 strategies \u00d7 9 years)\n" + "Budget long: 18 rows (2 strategies × 9 years)\n" ] }, { @@ -1383,7 +1256,7 @@ "YEARS = [2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]\n", "COLS = slice(1, 10) # spreadsheet columns 1-9 map to years 2016-2024\n", "\n", - "# \u2500\u2500 Section 1: Energy Resource Development (rows 7-18) \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Section 1: Energy Resource Development (rows 7-18) ──────────────────────\n", "erd = pd.DataFrame({\n", " \"year\": YEARS,\n", " \"strategy\": \"Energy Resource Development\",\n", @@ -1397,7 
+1270,7 @@ " \"fte\": raw.iloc[18, COLS].values.astype(float),\n", "})\n", "\n", - "# \u2500\u2500 Section 2: Oil/Gas Monitoring & Inspections (rows 20-31) \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Section 2: Oil/Gas Monitoring & Inspections (rows 20-31) ────────────────\n", "ogi = pd.DataFrame({\n", " \"year\": YEARS,\n", " \"strategy\": \"Oil/Gas Monitoring & Inspections\",\n", @@ -1412,7 +1285,7 @@ "})\n", "\n", "budget_long = pd.concat([erd, ogi], ignore_index=True)\n", - "print(f\"Budget long: {len(budget_long)} rows (2 strategies \u00d7 {len(YEARS)} years)\")\n", + "print(f\"Budget long: {len(budget_long)} rows (2 strategies × {len(YEARS)} years)\")\n", "budget_long\n" ] }, @@ -1967,7 +1840,7 @@ " \n", " \n", "\n", - "

5 rows \u00d7 34 columns

\n", + "

5 rows × 34 columns

\n", "" ], "text/plain": [ @@ -2022,7 +1895,7 @@ } ], "source": [ - "# \u2500\u2500 Wide budget: one row per year with ogi_ / erd_ prefixed columns \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Wide budget: one row per year with ogi_ / erd_ prefixed columns ──────────\n", "ogi_wide = ogi.drop(columns=\"strategy\").add_prefix(\"ogi_\")\n", "erd_wide = erd.drop(columns=\"strategy\").add_prefix(\"erd_\")\n", "\n", @@ -2033,19 +1906,19 @@ " .drop(columns=\"erd_year\")\n", ")\n", "\n", - "# \u2500\u2500 Merge inspections + violations, then join statewide budget on year \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Merge inspections + violations, then join statewide budget on year ────────\n", "panel = (\n", " insp\n", " .merge(viol, on=[\"district\", \"year\"], how=\"left\")\n", " .merge(budget_wide, on=\"year\", how=\"left\")\n", ")\n", "\n", - "# \u2500\u2500 Derived columns \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Derived columns ───────────────────────────────────────────────────────────\n", "panel[\"violations_per_inspection\"] = panel[\"total_violations\"] / panel[\"total_inspections\"]\n", - "panel[\"ogi_budget_m\"] = panel[\"ogi_total_budget\"] / 1_000_000 # dollars \u2192 millions\n", + "panel[\"ogi_budget_m\"] = panel[\"ogi_total_budget\"] / 1_000_000 # dollars → millions\n", "panel[\"erd_budget_m\"] = panel[\"erd_total_budget\"] / 1_000_000\n", "panel[\"post_2019\"] = (panel[\"year\"] >= 2019).astype(int)\n", - "# 2024 = budget estimate; 2025 = no budget data \u2014 exclude both from regressions\n", + "# 2024 = budget estimate; 2025 = no budget data — exclude both from regressions\n", 
"panel[\"is_budget_year\"] = (panel[\"year\"] >= 2024).astype(int)\n", "\n", "# Goal ambiguity: share of combined budget going to the inspection mission.\n", @@ -2184,7 +2057,7 @@ " for j in range(len(corr_cols)):\n", " ax.text(j, i, corr.iloc[i, j], ha=\"center\", va=\"center\", fontsize=8)\n", "plt.colorbar(im, ax=ax)\n", - "ax.set_title(\"Correlation Matrix \u2014 Key Variables\")\n", + "ax.set_title(\"Correlation Matrix — Key Variables\")\n", "plt.tight_layout()\n", "plt.show()\n" ] @@ -2199,12 +2072,12 @@ "### Data Sources\n", "\n", "This study draws on two primary data sources. The first is the Texas Railroad Commission\n", - "(RRC) Oil and Gas Division administrative database. Inspection records span fiscal years 2016\u20132025 and encompass approximately\n", + "(RRC) Oil and Gas Division administrative database. Inspection records span fiscal years 2016–2025 and encompass approximately\n", "1.9 million inspection events distributed across 13 RRC administrative districts;\n", "violation records include approximately 193,000 enforcement actions. From the inspections\n", "table, district-year aggregates are constructed for three regulatory output measures:\n", - "(1) *compliance rate* \u2014 the share of annual inspections in a district that did not result\n", - "in a compliance failure; (2) *total inspections* \u2014 the count of field inspection events;\n", + "(1) *compliance rate* — the share of annual inspections in a district that did not result\n", + "in a compliance failure; (2) *total inspections* — the count of field inspection events;\n", "and (3) average days between successive inspections of the same well, computed via a\n", "SQL window function (`LAG`) over ordered inspection timestamps. 
From the violations table,\n", "district-year aggregates include the *violation resolution rate* (share of violations\n", @@ -2212,7 +2085,7 @@ "days from violation discovery to enforcement action.\n", "\n", "The second source is RRC budget data drawn from Legislative Appropriations Requests,\n", - "covering fiscal years 2016\u20132024. Budget appropriations are reported at the statewide level\n", + "covering fiscal years 2016–2024. Budget appropriations are reported at the statewide level\n", "disaggregated by goal and strategy. Two strategies are central to this analysis:\n", "(1) *Oil and Gas Monitoring and Inspections* (OGI), which directly funds field inspection\n", "operations; and (2) *Energy Resource Development* (ERD), encompassing the broader mandate\n", @@ -2225,8 +2098,8 @@ "### Sample and Panel Construction\n", "\n", "The unit of analysis is the **district-year**. The analytic panel contains\n", - "**N = 130 observations** (13 districts \u00d7 10 years, 2016\u20132025), of which\n", - "**104 observations** (2016\u20132023) constitute the regression sample. Fiscal years\n", + "**N = 130 observations** (13 districts × 10 years, 2016–2025), of which\n", + "**104 observations** (2016–2023) constitute the regression sample. Fiscal years\n", "2024 (budget estimate) and 2025 (no budget data available) are retained in\n", "descriptive analyses but excluded from all regression models. Because inspection\n", "and enforcement activity in 2025 represents a partial year as of the data\n", @@ -2249,8 +2122,8 @@ "\n", "**Organizational capacity.** The primary capacity measure is OGI total appropriations in\n", "millions of dollars ($\\text{Budget}_t$), reflecting the statewide resource envelope\n", - "available for inspection activities in year $t$. An auxiliary measure \u2014 OGI authorized\n", - "FTE positions \u2014 is included in descriptive analyses.\n", + "available for inspection activities in year $t$. 
An auxiliary measure — OGI authorized\n", + "FTE positions — is included in descriptive analyses.\n", "\n", "**Goal ambiguity.** Following Chun and Rainey (2005), goal ambiguity is operationalized\n", "via the relative concentration of resources across missions. The *inspection budget share*\n", @@ -2266,35 +2139,35 @@ "\n", "**Geographic moderators.** Two binary district-level indicators capture geographic\n", "context: $\\text{Offshore}_d = 1$ for districts 02, 03, and 04, which hold dual onshore\n", - "and offshore oversight jurisdiction, and $\\text{Border}_d = 1$ for districts 01\u201304,\n", - "which are proximate to the Texas Gulf Coast and the US\u2013Mexico border corridor.\n", + "and offshore oversight jurisdiction, and $\\text{Border}_d = 1$ for districts 01–04,\n", + "which are proximate to the Texas Gulf Coast and the US–Mexico border corridor.\n", "\n", "### Estimation Strategy\n", "\n", "All models are estimated via ordinary least squares (OLS) with standard errors clustered\n", "at the district level ($G = 13$) to account for within-district serial correlation.\n", - "District fixed effects absorb time-invariant heterogeneity across offices \u2014 including\n", + "District fixed effects absorb time-invariant heterogeneity across offices — including\n", "differences in geographic complexity, historical enforcement culture, and staffing\n", - "composition \u2014 and ensure that budget effects are identified from within-district,\n", + "composition — and ensure that budget effects are identified from within-district,\n", "year-to-year variation.\n", "\n", - "**H1 \u2014 Baseline capacity model:**\n", + "**H1 — Baseline capacity model:**\n", "\n", "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d} \\gamma_d \\, \\mathbf{1}[\\text{district} = d] + \\varepsilon_{dt}$$\n", "\n", "where $Y_{dt}$ is the regulatory output for district $d$ in year $t$, $\\gamma_d$ are\n", "district fixed effects, and $\\varepsilon_{dt}$ is the idiosyncratic 
error.\n", "\n", - "**H2 \u2014 Goal ambiguity moderation:**\n", + "**H2 — Goal ambiguity moderation:**\n", "\n", "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Share}_t + \\beta_3 \\left( \\text{Budget}_t \\times \\text{Share}_t \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", "\n", - "The coefficient $\\beta_3$ tests whether goal clarity conditions the capacity\u2013output\n", + "The coefficient $\\beta_3$ tests whether goal clarity conditions the capacity–output\n", "relationship. A positive $\\hat{\\beta}_3$ would indicate that clearer mission focus\n", "amplifies budget effects; a negative value would suggest diminishing returns or\n", "cross-strategy resource substitution.\n", "\n", - "**H3 \u2014 District slope heterogeneity:**\n", + "**H3 — District slope heterogeneity:**\n", "\n", "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\sum_{d=2}^{D} \\delta_d \\left( \\text{Budget}_t \\times \\mathbf{1}[d] \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", "\n", @@ -2303,7 +2176,7 @@ "included, interaction term standard errors are inflated by near-perfect multicollinearity;\n", "these estimates are treated as descriptive indicators of heterogeneity only.\n", "\n", - "**H4 \u2014 Geographic moderation and spatial autocorrelation:**\n", + "**H4 — Geographic moderation and spatial autocorrelation:**\n", "\n", "$$Y_{dt} = \\alpha + \\beta_1 \\, \\text{Budget}_t + \\beta_2 \\, \\text{Offshore}_d + \\beta_3 \\, \\text{Border}_d + \\beta_4 \\left( \\text{Budget}_t \\times \\text{Offshore}_d \\right) + \\beta_5 \\left( \\text{Budget}_t \\times \\text{Border}_d \\right) + \\sum_{d} \\gamma_d + \\varepsilon_{dt}$$\n", "\n", @@ -2315,7 +2188,7 @@ "contemporaneous budget measure with its one-year lag ($\\text{Budget}_{t-1}$),\n", "and also estimates a model including both, to test whether budget effects\n", "operate with a delay consistent with a hiring-and-deployment mechanism.\n", - "The distributed lag regression sample 
covers 2017\u20132023 ($N = 91$).\n", + "The distributed lag regression sample covers 2017–2023 ($N = 91$).\n", "\n", "Spatial autocorrelation in H1 model residuals is assessed via Moran's $I$ computed on a\n", "row-normalized inverse-distance spatial weights matrix constructed from district centroids\n", @@ -2326,13 +2199,94 @@ "cell_type": "markdown", "id": "95f794b2", "metadata": {}, - "source": "## Analysis\n\nThis study employs a fixed-effects panel regression framework to examine whether\nyear-to-year changes in RRC organizational capacity \u2014 as measured by statewide budget\nappropriations \u2014 translate into improvements in regulatory outputs across Texas oil and\ngas inspection districts. The analytic panel spans 13 RRC districts over ten fiscal\nyears (2016\u20132025), yielding 130 district-year observations. Regression analyses are\nrestricted to 2016\u20132023 (N = 104), excluding FY2024 (budget estimate only) and FY2025\n(no budget data available). The identification strategy leverages within-district\nvariation in outcomes as a function of year-to-year shifts in statewide OGI\nappropriations, net of persistent inter-district differences absorbed by district fixed\neffects.\n\nThe choice of a district-year panel rather than a well-level panel is motivated by the\nstructure of the budget data, which is available only at the statewide level. Because the\nkey independent variable \u2014 OGI appropriations \u2014 varies along the time dimension only, it\nfunctions as a common, year-specific exposure applied uniformly to all districts. District\nfixed effects then absorb unobservable office-level characteristics that remain stable over\nthe study period, such as geographic complexity, historical enforcement intensity, and\nlocal administrative capacity. 
Causal identification is thus predicated on the assumption\nthat, absent changes in budget, within-district outcome trajectories would have followed\nparallel trends across years \u2014 an assumption that cannot be directly tested but is\npartially supported by the pre-period stability visible in the descriptive trends.\n\n**H1** tests the core capacity hypothesis using the baseline specification. Each of the\nthree dependent variables \u2014 total inspections, compliance rate, and violation resolution\nrate \u2014 is regressed separately on OGI budget (in millions of dollars) and district fixed\neffects. Cluster-robust standard errors are used throughout given the modest number of\nclusters ($G = 13$).\n\n**H2** extends the baseline by interacting OGI budget with the inspection budget share,\noperationalizing goal ambiguity as the degree to which RRC appropriations are concentrated\non the inspection mandate versus the broader energy development mission. The sign and\nsignificance of the interaction term $\\beta_3$ determines whether goal clarity amplifies\nor attenuates the capacity\u2013output relationship.\n\n**H3** tests for heterogeneity in budget\u2013outcome slopes across districts by including\nbudget $\\times$ district interaction terms. Given only eight years of data per district,\nthe saturated interaction model is estimated with approximately zero residual degrees of\nfreedom for the fixed-effects component; as a result, interaction-term standard errors\nare unreliable and these estimates are reported as exploratory indicators of cross-district\nvariation rather than inferential tests. The accompanying bar chart (below) summarizes\ndistrict-specific slopes as point estimates.\n\n**H4** assesses whether offshore-jurisdiction and border-proximate districts \u2014 which face\ndistinct operational environments \u2014 exhibit different budget sensitivity. 
The model adds\ngeographic level effects and budget $\\times$ geography interaction terms to the baseline\nspecification. A complementary spatial diagnostic \u2014 Moran's $I$ applied to the residuals\nfrom the H1 compliance model \u2014 tests for geographic clustering of unexplained outcome\nvariation that could indicate omitted spatial processes or spillovers across district\nboundaries.\n\nAll regressions exclude fiscal years 2024 (budget estimate) and 2025 (no budget data),\nretaining 2016\u20132023 as the regression sample (N = 104). The extended panel through 2025\nis used for descriptive trend analysis only. Enforcement-timing metrics for 2025 should\nbe interpreted cautiously: because the data extract covers a partial year, violations\ndiscovered in late 2024 and 2025 may not yet have a recorded enforcement action,\nartificially compressing observed days-to-enforcement and resolution rates for that year.\n\nTwo supplementary robustness checks address key inferential limitations. First, wild\ncluster bootstrap inference (Rademacher, B = 999) re-tests H1 with valid small-sample\np-values given G = 13 clusters. Second, a distributed lag specification tests whether\nbudget effects operate with a one-year delay, consistent with a hiring-and-deployment\nimplementation timeline. Results from both checks are reported following the main\nhypothesis tests.\n" + "source": [ + "## Analysis\n", + "\n", + "This study employs a fixed-effects panel regression framework to examine whether\n", + "year-to-year changes in RRC organizational capacity — as measured by statewide budget\n", + "appropriations — translate into improvements in regulatory outputs across Texas oil and\n", + "gas inspection districts. The analytic panel spans 13 RRC districts over ten fiscal\n", + "years (2016–2025), yielding 130 district-year observations. 
Regression analyses are\n", + "restricted to 2016–2023 (N = 104), excluding FY2024 (budget estimate only) and FY2025\n", + "(no budget data available). The identification strategy leverages within-district\n", + "variation in outcomes as a function of year-to-year shifts in statewide OGI\n", + "appropriations, net of persistent inter-district differences absorbed by district fixed\n", + "effects.\n", + "\n", + "The choice of a district-year panel rather than a well-level panel is motivated by the\n", + "structure of the budget data, which is available only at the statewide level. Because the\n", + "key independent variable — OGI appropriations — varies along the time dimension only, it\n", + "functions as a common, year-specific exposure applied uniformly to all districts. District\n", + "fixed effects then absorb unobservable office-level characteristics that remain stable over\n", + "the study period, such as geographic complexity, historical enforcement intensity, and\n", + "local administrative capacity. Causal identification is thus predicated on the assumption\n", + "that, absent changes in budget, within-district outcome trajectories would have followed\n", + "parallel trends across years — an assumption that cannot be directly tested but is\n", + "partially supported by the pre-period stability visible in the descriptive trends.\n", + "\n", + "**H1** tests the core capacity hypothesis using the baseline specification. Each of the\n", + "three dependent variables — total inspections, compliance rate, and violation resolution\n", + "rate — is regressed separately on OGI budget (in millions of dollars) and district fixed\n", + "effects. 
Cluster-robust standard errors are used throughout, although the modest number of\n", + "clusters ($G = 13$) limits the reliability of asymptotic inference.\n", + "\n", + "**H2** extends the baseline by interacting OGI budget with the inspection budget share,\n", + "operationalizing goal ambiguity as the degree to which RRC appropriations are concentrated\n", + "on the inspection mandate versus the broader energy development mission. The sign and\n", + "significance of the interaction term $\\beta_3$ determine whether goal clarity amplifies\n", + "or attenuates the capacity–output relationship.\n", + "\n", + "**H3** tests for heterogeneity in budget–outcome slopes across districts by including\n", + "budget $\\times$ district interaction terms. Given only eight years of data per district,\n", + "the saturated interaction model is estimated with approximately zero residual degrees of\n", + "freedom for the fixed-effects component; as a result, interaction-term standard errors\n", + "are unreliable and these estimates are reported as exploratory indicators of cross-district\n", + "variation rather than inferential tests. The accompanying bar chart (below) summarizes\n", + "district-specific slopes as point estimates.\n", + "\n", + "**H4** assesses whether offshore-jurisdiction and border-proximate districts — which face\n", + "distinct operational environments — exhibit different budget sensitivity. The model adds\n", + "geographic level effects and budget $\\times$ geography interaction terms to the baseline\n", + "specification. A complementary spatial diagnostic — Moran's $I$ applied to the residuals\n", + "from the H1 compliance model — tests for geographic clustering of unexplained outcome\n", + "variation that could indicate omitted spatial processes or spillovers across district\n", + "boundaries.\n", + "\n", + "All regressions exclude fiscal years 2024 (budget estimate) and 2025 (no budget data),\n", + "retaining 2016–2023 as the regression sample (N = 104). 
The extended panel through 2025\n", + "is used for descriptive trend analysis only. Enforcement-timing metrics for 2025 should\n", + "be interpreted cautiously: because the data extract covers a partial year, violations\n", + "discovered in late 2024 and 2025 may not yet have a recorded enforcement action,\n", + "artificially compressing observed days-to-enforcement and resolution rates for that year.\n", + "\n", + "Two supplementary robustness checks address key inferential limitations. First, wild\n", + "cluster bootstrap inference (Rademacher, B = 999) re-tests H1 with valid small-sample\n", + "p-values given G = 13 clusters. Second, a distributed lag specification tests whether\n", + "budget effects operate with a one-year delay, consistent with a hiring-and-deployment\n", + "implementation timeline. Results from both checks are reported following the main\n", + "hypothesis tests.\n" + ] }, { "cell_type": "markdown", "id": "3ca1410a", "metadata": {}, - "source": "## H1: Organizational Capacity \u2192 Policy Outputs\n\n**Prediction:** Higher OGI budget predicts more inspections, higher compliance rates,\nand faster violation resolution.\n\n**Model:** OLS with district fixed effects, 2016\u20132023 (N = 104). Budget varies only over\ntime, identifying effects via year-to-year changes in statewide OGI appropriations;\ndistrict fixed effects absorb persistent cross-district differences. Standard errors\nclustered at the district level (G = 13).\n\n**Finding (preview):** All three outcomes show positive, statistically significant budget\ncoefficients under asymptotic inference. 
Wild cluster bootstrap results (reported in the\nRobustness Checks section) indicate these asymptotic p-values overstate precision; results\nshould be interpreted as suggestive rather than definitive.\n" + "source": [ + "## H1: Organizational Capacity → Policy Outputs\n", + "\n", + "**Prediction:** Higher OGI budget predicts more inspections, higher compliance rates,\n", + "and faster violation resolution.\n", + "\n", + "**Model:** OLS with district fixed effects, 2016–2023 (N = 104). Budget varies only over\n", + "time, identifying effects via year-to-year changes in statewide OGI appropriations;\n", + "district fixed effects absorb persistent cross-district differences. Standard errors\n", + "clustered at the district level (G = 13).\n", + "\n", + "**Finding (preview):** All three outcomes show positive, statistically significant budget\n", + "coefficients under asymptotic inference. Wild cluster bootstrap results (reported in the\n", + "Robustness Checks section) indicate these asymptotic p-values overstate precision; results\n", + "should be interpreted as suggestive rather than definitive.\n" + ] }, { "cell_type": "code", @@ -2357,11 +2311,11 @@ "fig, axes = plt.subplots(1, 3, figsize=(15, 4))\n", "\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"total_inspections\",\n", - " alpha=0.4, ax=axes[0], title=\"Budget \u2192 Inspections\")\n", + " alpha=0.4, ax=axes[0], title=\"Budget → Inspections\")\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"compliance_rate\",\n", - " alpha=0.4, ax=axes[1], title=\"Budget \u2192 Compliance Rate (%)\")\n", + " alpha=0.4, ax=axes[1], title=\"Budget → Compliance Rate (%)\")\n", "actuals.plot.scatter(x=\"ogi_budget_m\", y=\"resolution_rate\",\n", - " alpha=0.4, ax=axes[2], title=\"Budget \u2192 Resolution Rate (%)\")\n", + " alpha=0.4, ax=axes[2], title=\"Budget → Resolution Rate (%)\")\n", "\n", "for ax in axes:\n", " ax.set_xlabel(\"OGI Budget ($M)\")\n", @@ -2380,20 +2334,20 @@ "name": "stdout", "output_type": "stream", 
"text": [ - "H1a \u2014 OGI Budget ($M) \u2192 Total Inspections\n", + "H1a — OGI Budget ($M) → Total Inspections\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 666.30 212.98 3.13 0.00\n", - " R\u00b2 = 0.769 Adj. R\u00b2 = 0.736\n", + " R² = 0.769 Adj. R² = 0.736\n", "\n", - "H1b \u2014 OGI Budget ($M) \u2192 Compliance Rate (%)\n", + "H1b — OGI Budget ($M) → Compliance Rate (%)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.26 0.11 2.31 0.02\n", - " R\u00b2 = 0.538 Adj. R\u00b2 = 0.471\n", + " R² = 0.538 Adj. R² = 0.471\n", "\n", - "H1c \u2014 OGI Budget ($M) \u2192 Resolution Rate (%)\n", + "H1c — OGI Budget ($M) → Resolution Rate (%)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 1.05 0.32 3.28 0.00\n", - " R\u00b2 = 0.624 Adj. R\u00b2 = 0.569\n" + " R² = 0.624 Adj. R² = 0.569\n" ] } ], @@ -2413,30 +2367,49 @@ " data=actuals,\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", - "# Detect actual column names \u2014 statsmodels uses z/P>|z| with robust SEs in some versions\n", + "# Detect actual column names — statsmodels uses z/P>|z| with robust SEs in some versions\n", "_tbl = m_inspections.summary2().tables[1]\n", "_t = \"t\" if \"t\" in _tbl.columns else \"z\"\n", "_p = \"P>|t|\" if \"P>|t|\" in _tbl.columns else \"P>|z|\"\n", "display_cols = [\"Coef.\", \"Std.Err.\", _t, _p]\n", "\n", - "print(\"H1a \u2014 OGI Budget ($M) \u2192 Total Inspections\")\n", + "print(\"H1a — OGI Budget ($M) → Total Inspections\")\n", "print(m_inspections.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R\u00b2 = {m_inspections.rsquared:.3f} Adj. R\u00b2 = {m_inspections.rsquared_adj:.3f}\\n\")\n", + "print(f\" R² = {m_inspections.rsquared:.3f} Adj. 
R² = {m_inspections.rsquared_adj:.3f}\\n\")\n", "\n", - "print(\"H1b \u2014 OGI Budget ($M) \u2192 Compliance Rate (%)\")\n", + "print(\"H1b — OGI Budget ($M) → Compliance Rate (%)\")\n", "print(m_compliance.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R\u00b2 = {m_compliance.rsquared:.3f} Adj. R\u00b2 = {m_compliance.rsquared_adj:.3f}\\n\")\n", + "print(f\" R² = {m_compliance.rsquared:.3f} Adj. R² = {m_compliance.rsquared_adj:.3f}\\n\")\n", "\n", - "print(\"H1c \u2014 OGI Budget ($M) \u2192 Resolution Rate (%)\")\n", + "print(\"H1c — OGI Budget ($M) → Resolution Rate (%)\")\n", "print(m_resolution.summary2().tables[1][display_cols].loc[[\"ogi_budget_m\"]])\n", - "print(f\" R\u00b2 = {m_resolution.rsquared:.3f} Adj. R\u00b2 = {m_resolution.rsquared_adj:.3f}\")\n" + "print(f\" R² = {m_resolution.rsquared:.3f} Adj. R² = {m_resolution.rsquared_adj:.3f}\")\n" ] }, { "cell_type": "markdown", "id": "56add68a", "metadata": {}, - "source": "## H2: Goal Ambiguity Moderates Capacity Effects\n\n**Prediction:** When a larger share of combined RRC budget flows to the broader\n\"Energy Resource Development\" goal (lower `inspection_budget_share`), the capacity \u2192\noutput link weakens. A positive interaction coefficient would support H2.\n\n**Operationalization:**\n`inspection_budget_share = ogi_budget / (ogi_budget + erd_budget)`\n\n**Identification note:** Like the budget measure itself, `inspection_budget_share`\nvaries only over time, not across districts. The interaction term therefore exploits\nthe same narrow temporal variation as the main effect \u2014 budget share ranged from 0.59\nto 0.67 over 2016\u20132023, a span of 8 percentage points across 8 years. 
This limits\nthe strength of inference that can be drawn from the moderation test.\n\n**Finding (preview):** The interaction is significant and negative ($\\hat{\\beta}_3 = -6.53$,\n$p < .01$), but interpretation is constrained by the identification limitations above.\nResults are discussed in the Results section.\n" + "source": [ + "## H2: Goal Ambiguity Moderates Capacity Effects\n", + "\n", + "**Prediction:** When a larger share of combined RRC budget flows to the broader\n", + "\"Energy Resource Development\" goal (lower `inspection_budget_share`), the capacity →\n", + "output link weakens. A positive interaction coefficient would support H2.\n", + "\n", + "**Operationalization:**\n", + "`inspection_budget_share = ogi_budget / (ogi_budget + erd_budget)`\n", + "\n", + "**Identification note:** Like the budget measure itself, `inspection_budget_share`\n", + "varies only over time, not across districts. The interaction term therefore exploits\n", + "the same narrow temporal variation as the main effect — budget share ranged from 0.59\n", + "to 0.67 over 2016–2023, a span of 8 percentage points across 8 years. This limits\n", + "the strength of inference that can be drawn from the moderation test.\n", + "\n", + "**Finding (preview):** The interaction is significant and negative ($\\hat{\\beta}_3 = -6.53$,\n", + "$p < .01$), but interpretation is constrained by the identification limitations above.\n", + "Results are discussed in the Results section.\n" + ] }, { "cell_type": "code", @@ -2448,21 +2421,21 @@ "name": "stdout", "output_type": "stream", "text": [ - "H2 \u2014 Goal Ambiguity Moderation (DV: compliance_rate)\n", + "H2 — Goal Ambiguity Moderation (DV: compliance_rate)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 4.20 1.09 3.86 0.00\n", "inspection_budget_share 170.18 44.79 3.80 0.00\n", "ogi_budget_m:inspection_budget_share -6.53 1.84 -3.55 0.00\n", "\n", - "R\u00b2 = 0.567 Adj. R\u00b2 = 0.493\n", + "R² = 0.567 Adj. 
R² = 0.493\n", "\n", - "H2 \u2014 Goal Ambiguity Moderation (DV: resolution_rate)\n", + "H2 — Goal Ambiguity Moderation (DV: resolution_rate)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 6.68 4.67 1.43 0.15\n", "inspection_budget_share 230.67 204.30 1.13 0.26\n", "ogi_budget_m:inspection_budget_share -9.42 7.99 -1.18 0.24\n", "\n", - "R\u00b2 = 0.629 Adj. R\u00b2 = 0.566\n" + "R² = 0.629 Adj. R² = 0.566\n" ] } ], @@ -2473,26 +2446,42 @@ ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", "key_rows = [\"ogi_budget_m\", \"inspection_budget_share\", \"ogi_budget_m:inspection_budget_share\"]\n", - "print(\"H2 \u2014 Goal Ambiguity Moderation (DV: compliance_rate)\")\n", + "print(\"H2 — Goal Ambiguity Moderation (DV: compliance_rate)\")\n", "print(m_h2.summary2().tables[1][display_cols].loc[key_rows])\n", - "print(f\"\\nR\u00b2 = {m_h2.rsquared:.3f} Adj. R\u00b2 = {m_h2.rsquared_adj:.3f}\")\n", + "print(f\"\\nR² = {m_h2.rsquared:.3f} Adj. R² = {m_h2.rsquared_adj:.3f}\")\n", "\n", - "# \u2500\u2500 Same model with resolution rate as DV \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Same model with resolution rate as DV ────────────────────────────────────\n", "m_h2_res = smf.ols(\n", " \"resolution_rate ~ ogi_budget_m * inspection_budget_share + C(district)\",\n", " data=actuals,\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": actuals[\"district\"]})\n", "\n", - "print(\"\\nH2 \u2014 Goal Ambiguity Moderation (DV: resolution_rate)\")\n", + "print(\"\\nH2 — Goal Ambiguity Moderation (DV: resolution_rate)\")\n", "print(m_h2_res.summary2().tables[1][display_cols].loc[key_rows])\n", - "print(f\"\\nR\u00b2 = {m_h2_res.rsquared:.3f} Adj. R\u00b2 = {m_h2_res.rsquared_adj:.3f}\")\n" + "print(f\"\\nR² = {m_h2_res.rsquared:.3f} Adj. 
R² = {m_h2_res.rsquared_adj:.3f}\")\n" ] }, { "cell_type": "markdown", "id": "b6583857", "metadata": {}, - "source": "## H3: District Multilevel Effects\n\n**Prediction:** The budget \u2192 output slope varies across RRC districts \u2014 some districts\ntranslate budget increases into better outputs more effectively than others.\n\n**Model:** Interaction `ogi_budget_m \u00d7 C(district)` \u2014 the reference district captures\nthe baseline budget slope; interaction terms show how each other district's slope\ndiffers. Standard errors are unreliable due to near-perfect multicollinearity in the\nsaturated model (budget varies only over time while district FE absorb cross-sectional\nvariation); results are treated as descriptive point estimates only.\n\n**Finding (preview):** District slopes for compliance rate range from \u22120.34 pp per \\$1M\n(District 03, Houston/Coastal) to +1.36 pp per \\$1M (District 6E, East Texas Piney\nWoods), with most districts showing small positive slopes. The bar chart below plots\ndistrict-specific slope estimates.\n" + "source": [ + "## H3: District Multilevel Effects\n", + "\n", + "**Prediction:** The budget → output slope varies across RRC districts — some districts\n", + "translate budget increases into better outputs more effectively than others.\n", + "\n", + "**Model:** Interaction `ogi_budget_m × C(district)` — the reference district captures\n", + "the baseline budget slope; interaction terms show how each other district's slope\n", + "differs. 
Standard errors are unreliable due to near-perfect multicollinearity in the\n", + "saturated model (budget varies only over time while district FE absorb cross-sectional\n", + "variation); results are treated as descriptive point estimates only.\n", + "\n", + "**Finding (preview):** District slopes for compliance rate range from −0.34 pp per \\$1M\n", + "(District 03, Houston/Coastal) to +1.36 pp per \\$1M (District 6E, East Texas Piney\n", + "Woods), with most districts showing small positive slopes. The bar chart below plots\n", + "district-specific slope estimates.\n" + ] }, { "cell_type": "code", @@ -2504,7 +2493,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "H3 \u2014 District-Heterogeneous Budget Effect (DV: compliance_rate)\n", + "H3 — District-Heterogeneous Budget Effect (DV: compliance_rate)\n", "Baseline (reference district) budget slope:\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.09 0.00 56,876,193,472,228.37 0.00\n", @@ -2524,7 +2513,7 @@ "ogi_budget_m:C(district)[T.7C] 0.31 0.00 24,243,474,173,332.52 0.00\n", "ogi_budget_m:C(district)[T.8A] 0.10 0.00 59,702,739,775,453.20 0.00\n", "\n", - "R\u00b2 = 0.662 Adj. R\u00b2 = 0.554\n" + "R² = 0.662 Adj. R² = 0.554\n" ] }, { @@ -2548,7 +2537,7 @@ "\n", "# Baseline budget slope (reference district)\n", "baseline_row = coef_table.loc[[\"ogi_budget_m\"]]\n", - "print(\"H3 \u2014 District-Heterogeneous Budget Effect (DV: compliance_rate)\")\n", + "print(\"H3 — District-Heterogeneous Budget Effect (DV: compliance_rate)\")\n", "print(f\"Baseline (reference district) budget slope:\")\n", "print(baseline_row[display_cols])\n", "\n", @@ -2556,9 +2545,9 @@ "interaction_rows = coef_table[coef_table.index.str.contains(\"ogi_budget_m:C\")]\n", "print(\"\\nDistrict interaction terms (deviation from reference slope):\")\n", "print(interaction_rows[display_cols].round(4))\n", - "print(f\"\\nR\u00b2 = {m_h3.rsquared:.3f} Adj. 
R\u00b2 = {m_h3.rsquared_adj:.3f}\")\n", + "print(f\"\\nR² = {m_h3.rsquared:.3f} Adj. R² = {m_h3.rsquared_adj:.3f}\")\n", "\n", - "# \u2500\u2500 Plot district-specific budget slopes \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Plot district-specific budget slopes ─────────────────────────────────────\n", "districts = actuals[\"district\"].unique()\n", "slopes = {}\n", "for d in districts:\n", @@ -2572,7 +2561,7 @@ "slope_df.plot.barh(ax=ax, color=[\"#d62728\" if v < 0 else \"#1f77b4\" for v in slope_df])\n", "ax.axvline(0, color=\"black\", linewidth=0.8)\n", "ax.set_xlabel(\"Budget slope (compliance rate pp per $M)\")\n", - "ax.set_title(\"H3 \u2014 District-Specific Budget \u2192 Compliance Slopes\")\n", + "ax.set_title(\"H3 — District-Specific Budget → Compliance Slopes\")\n", "plt.tight_layout()\n", "plt.show()\n" ] @@ -2581,7 +2570,23 @@ "cell_type": "markdown", "id": "5bb27b3e", "metadata": {}, - "source": "## H4: Spatial and Geographic Factors\n\n**Predictions:**\n- Offshore-jurisdiction districts (02, 03, 04) show a different budget \u2192 output\n relationship due to dual onshore/offshore oversight burden.\n- Border-proximate districts show a different relationship due to cross-jurisdiction\n enforcement complexity.\n- Spatial autocorrelation in H1 residuals (Moran's I) would indicate unmodeled\n geographic spillovers.\n\n**Finding (preview):** Offshore and border districts show significantly *higher* baseline\ncompliance rates (+7.6 pp and +6.0 pp respectively, both $p < .05$) but not different\nbudget sensitivity. Moran's $I = -0.051$ indicates slight spatial dispersion and no\nsignificant geographic clustering of residuals. 
Results are discussed in the Results\nsection.\n" + "source": [ + "## H4: Spatial and Geographic Factors\n", + "\n", + "**Predictions:**\n", + "- Offshore-jurisdiction districts (02, 03, 04) show a different budget → output\n", + " relationship due to dual onshore/offshore oversight burden.\n", + "- Border-proximate districts show a different relationship due to cross-jurisdiction\n", + " enforcement complexity.\n", + "- Spatial autocorrelation in H1 residuals (Moran's I) would indicate unmodeled\n", + " geographic spillovers.\n", + "\n", + "**Finding (preview):** Offshore and border districts show significantly *higher* baseline\n", + "compliance rates (+7.6 pp and +6.0 pp respectively, both $p < .05$) but not different\n", + "budget sensitivity. Moran's $I = -0.051$ indicates slight spatial dispersion and no\n", + "significant geographic clustering of residuals. Results are discussed in the Results\n", + "section.\n" + ] }, { "cell_type": "code", @@ -2640,7 +2645,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "H4 \u2014 Spatial Moderators (DV: compliance_rate)\n", + "H4 — Spatial Moderators (DV: compliance_rate)\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.35 0.15 2.39 0.02\n", "offshore 7.61 3.29 2.31 0.02\n", @@ -2648,12 +2653,12 @@ "ogi_budget_m:offshore -0.03 0.18 -0.16 0.87\n", "ogi_budget_m:border -0.25 0.15 -1.74 0.08\n", "\n", - "R\u00b2 = 0.553 Adj. R\u00b2 = 0.476\n", + "R² = 0.553 Adj. 
R² = 0.476\n", "\n", "Moran's I on H1 compliance residuals = -0.0512\n", - " > 0 \u2192 residuals cluster spatially (similar neighbours)\n", - " \u2248 0 \u2192 no spatial pattern\n", - " < 0 \u2192 spatial dispersion (dissimilar neighbours)\n", + " > 0 → residuals cluster spatially (similar neighbours)\n", + " ≈ 0 → no spatial pattern\n", + " < 0 → spatial dispersion (dissimilar neighbours)\n", "\n", "District centroids used:\n", "district lat lon\n", @@ -2674,7 +2679,7 @@ } ], "source": [ - "# \u2500\u2500 Spatial regression: offshore and border interactions \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Spatial regression: offshore and border interactions ─────────────────────\n", "m_h4 = smf.ols(\n", " \"compliance_rate ~ ogi_budget_m + offshore + border \"\n", " \"+ ogi_budget_m:offshore + ogi_budget_m:border + C(district)\",\n", @@ -2686,11 +2691,11 @@ " \"ogi_budget_m:offshore\", \"ogi_budget_m:border\",\n", "]\n", "available = [r for r in spatial_rows if r in m_h4.params.index]\n", - "print(\"H4 \u2014 Spatial Moderators (DV: compliance_rate)\")\n", + "print(\"H4 — Spatial Moderators (DV: compliance_rate)\")\n", "print(m_h4.summary2().tables[1][display_cols].loc[available])\n", - "print(f\"\\nR\u00b2 = {m_h4.rsquared:.3f} Adj. R\u00b2 = {m_h4.rsquared_adj:.3f}\")\n", + "print(f\"\\nR² = {m_h4.rsquared:.3f} Adj. 
R² = {m_h4.rsquared_adj:.3f}\")\n", "\n", - "# \u2500\u2500 Moran's I on H1 residuals \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Moran's I on H1 residuals ─────────────────────────────────────────────────\n", "# Compute district centroids from well lat/lon joined via inspections\n", "centroids_sql = \"\"\"\n", "SELECT\n", @@ -2728,9 +2733,9 @@ " morans_i = (n / W.sum()) * (z @ W @ z) / (z @ z)\n", "\n", " print(f\"\\nMoran's I on H1 compliance residuals = {morans_i:.4f}\")\n", - " print(\" > 0 \u2192 residuals cluster spatially (similar neighbours)\")\n", - " print(\" \u2248 0 \u2192 no spatial pattern\")\n", - " print(\" < 0 \u2192 spatial dispersion (dissimilar neighbours)\")\n", + " print(\" > 0 → residuals cluster spatially (similar neighbours)\")\n", + " print(\" ≈ 0 → no spatial pattern\")\n", + " print(\" < 0 → spatial dispersion (dissimilar neighbours)\")\n", "\n", " print(\"\\nDistrict centroids used:\")\n", " print(centroids[[\"district\", \"lat\", \"lon\"]].round(2).to_string(index=False))\n", @@ -2743,13 +2748,299 @@ "cell_type": "markdown", "id": "02c42877", "metadata": {}, - "source": "## Results\n\n### Descriptive Trends\n\nTable 1 summarizes year-level means for the key variables across 2016\u20132025, with\nregression analyses restricted to 2016\u20132023. OGI appropriations grew from $18.47 million\nin 2016 to $34.33 million in 2023 \u2014 an 86 percent nominal increase \u2014 with the FY2024\nbudget estimate reaching $38.51 million. Authorized FTE positions rose modestly from\n256.7 to 271.2 over the same period. Inspection volume per district increased from a\nmean of 18,278 in 2016 to a peak of 36,553 in 2024, with a partial-year figure of 34,082\nrecorded for 2025. 
Mean district compliance rate improved from 83.1 percent in 2016 to\na peak of 92.6 percent in 2024, with a slight moderation to 90.5 percent in the 2025\npartial-year extract. Violation resolution rate rose from 36.8 percent in 2016 to 69.7\npercent in 2023 before declining to 52.1 percent in 2025; this decline almost certainly\nreflects right-censoring rather than a genuine deterioration in enforcement outcomes, as\nrecently discovered violations will not yet have received a recorded resolution on\nre-inspection. Similarly, the 2025 days-to-enforcement figure of 36.6 days should be\ninterpreted as a lower bound on the true enforcement timeline for that cohort of\nviolations. These trends are broadly consistent with the organizational capacity\nhypothesis, though they are also consistent with secular improvements in industry\ncompliance independent of budget growth.\n\n**Table 1. Year-Level Panel Means, 2016\u20132025**\n\n| Year | OGI Budget ($M) | OGI FTE | Inspections/District | Compliance Rate (%) | Resolution Rate (%) | Days to Enforcement |\n|:----:|:---------------:|:-------:|:--------------------:|:-------------------:|:-------------------:|:-------------------:|\n| 2016 | 18.47 | 256.7 | 18,278 | 83.1 | 36.8 | 131.9 |\n| 2017 | 17.20 | 249.5 | 20,139 | 86.5 | 59.0 | 185.0 |\n| 2018 | 17.56 | 229.9 | 25,704 | 90.2 | 59.5 | 207.3 |\n| 2019 | 21.95 | 255.6 | 25,058 | 89.9 | 61.4 | 170.4 |\n| 2020 | 26.06 | 284.0 | 27,669 | 89.6 | 56.8 | 154.7 |\n| 2021 | 28.76 | 277.8 | 24,116 | 88.8 | 66.2 | 118.8 |\n| 2022 | 25.91 | 264.0 | 32,024 | 89.8 | 67.9 | 91.5 |\n| 2023 | 34.33 | 271.2 | 33,806 | 91.6 | 69.7 | 105.2 |\n| 2024\u2020 | 38.51 | 280.8 | 36,553 | 92.6 | 65.1 | 76.9 |\n| 2025\u2021 | \u2014 | \u2014 | 34,082 | 90.5 | 52.1 | 36.6\u2021 |\n\n*Note: Budget figures are nominal. 
FTE = authorized full-time equivalent positions.\nInspections/District = mean district-level annual inspection count.*\n*\u2020 2024 budget is an appropriations estimate, not expenditure actuals; excluded from\nregression models.*\n*\u2021 2025 data is partial-year as of the data extract. Resolution rate and days-to-enforcement\nare right-censored: violations discovered in late 2024\u20132025 may not yet have a recorded\nenforcement action, compressing these metrics.*\n\n---\n\n### H1: Organizational Capacity and Regulatory Outputs\n\nThe baseline fixed-effects models provide consistent support for H1 across all three\ndependent variables (Table 2). Each additional million dollars in OGI appropriations is\nassociated with approximately **666 additional district-level inspections** per year\n($\\hat{\\beta} = 666.30$, SE = 212.98, $z = 3.13$, $p < .01$; $R^2 = .769$). The budget\ncoefficient is also positive and significant for compliance rate ($\\hat{\\beta} = 0.26$\npercentage points per \\$1M, SE = 0.11, $z = 2.31$, $p = .02$; $R^2 = .538$) and\nviolation resolution rate ($\\hat{\\beta} = 1.05$ percentage points per \\$1M, SE = 0.32,\n$z = 3.28$, $p < .01$; $R^2 = .624$). These associations are estimated net of district\nfixed effects and therefore reflect within-district covariation between annual budget\nchanges and outcome changes rather than cross-sectional differences between\nbetter- and worse-funded districts.\n\n**Table 2. H1 Regression Results: OGI Budget \u2192 Regulatory Outputs**\n\n| Dependent Variable | $\\hat{\\beta}$ (Budget \\$M) | SE | $z$ | $p$ | $R^2$ | Adj. $R^2$ |\n|---|:---:|:---:|:---:|:---:|:---:|:---:|\n| Total inspections | 666.30 | 212.98 | 3.13 | <.01 | .769 | .736 |\n| Compliance rate (%) | 0.26 | 0.11 | 2.31 | .02 | .538 | .471 |\n| Resolution rate (%) | 1.05 | 0.32 | 3.28 | <.01 | .624 | .569 |\n\n*Note: All models include district fixed effects ($D = 13$). Standard errors clustered\nat the district level. 
$N = 104$.*\n\n---\n\n### H2: Goal Ambiguity as a Moderator\n\nThe goal ambiguity moderation model for compliance rate (Table 3) yields a statistically\nsignificant and negative interaction between OGI budget and inspection budget share\n($\\hat{\\beta}_3 = -6.53$, SE = 1.84, $z = -3.55$, $p < .01$). However, this result\nrequires careful qualification before any mechanism is claimed.\n\nThe key issue is that `inspection_budget_share` \u2014 like the budget measure itself \u2014\nvaries only over time, not across districts. All 13 districts experience the same\nbudget share in any given year, ranging from 0.59 (FY2022) to 0.67 (FY2018) across\nthe study period \u2014 a span of 8 percentage points over 8 observations. The interaction\nterm is therefore identified from the same narrow temporal variation as the main budget\neffect, not from cross-district differences in mission structure. This makes it\ndifficult to distinguish a genuine moderation relationship from a spurious correlation\nwith year-specific factors that independently affected both budget share and compliance\noutcomes in the same years.\n\nThe negative sign is consistent with at least two interpretations. Under a\n*resource saturation* story, compliance gains from additional OGI investment contract\nas the inspection mandate becomes better resourced relative to other RRC goals \u2014\na plausible ceiling effect if districts are already operating near full compliance\nin high-share years. Alternatively, the result may simply reflect that FY2018 \u2014 the\nhighest-share year \u2014 saw particularly large compliance gains for reasons unrelated to\nbudget concentration (e.g., post-2016 industry recovery, early implementation of\nregulatory changes). 
Evaluated at mean budget share ($\\bar{s} \\approx 0.62$), the\nimplied marginal budget effect on compliance is $4.20 - 6.53(0.62) \\approx 0.15$\npp per \\$1M \u2014 directionally consistent with H1 but smaller.\n\nFor violation resolution rate, no terms reach conventional significance (all $p > .15$).\nGiven the identification constraints, the H2 compliance finding is best treated as an\nexploratory pattern consistent with goal ambiguity theory \u2014 one that motivates future\nresearch with district-level budget variation \u2014 rather than a robust confirmatory test.\n\n**Table 3. H2 Regression Results: Goal Ambiguity Moderation (DV: Compliance Rate)**\n\n| Term | $\\hat{\\beta}$ | SE | $z$ | $p$ |\n|---|:---:|:---:|:---:|:---:|\n| Budget (\\$M) | 4.20 | 1.09 | 3.86 | <.01 |\n| Inspection budget share | 170.18 | 44.79 | 3.80 | <.01 |\n| Budget \u00d7 Share | \u22126.53 | 1.84 | \u22123.55 | <.01 |\n\n*Note: District fixed effects included. SE clustered at district. $R^2 = .567$,\nAdj. $R^2 = .493$. $N = 104$.*\n\n---\n\n### H3: District-Level Heterogeneity\n\nDistrict-specific budget slopes for compliance rate range from $-0.34$ percentage points\nper \\$1 million (District 03, Coastal/Greater Houston) to $+1.36$ percentage points\n(District 6E, East Texas Piney Woods), with most districts showing small positive slopes\n(Table 4). The reference district (District 01, San Antonio) slope is 0.09 pp per \\$1M.\nPositive slopes are most pronounced in District 6E (+1.36), District 06 (+0.43), and\nDistrict 7C (+0.40); District 03 is the only district with a substantially negative slope.\nThe model $R^2$ of .662 modestly exceeds the baseline H1 value (.538), consistent with\nmeaningful cross-district slope heterogeneity. Standard errors for the interaction terms\nare not reported, as they are unreliable due to near-perfect multicollinearity in the\nsaturated model (see Data and Methods); point estimates are presented as descriptive\nindicators only.\n\n**Table 4. 
H3 District-Specific Budget \u2192 Compliance Slopes (pp per \\$1M)**\n\n| District | Estimated Slope |\n|:---:|:---:|\n| 01 (San Antonio) | 0.09 |\n| 02 (Corpus Christi) | 0.24 |\n| 03 (Houston) | \u22120.34 |\n| 04 (Laredo) | 0.28 |\n| 05 (Midland/Abilene) | 0.05 |\n| 06 (Kilgore) | 0.43 |\n| 08 (Midland) | 0.28 |\n| 09 (Wichita Falls) | 0.00 |\n| 10 (Amarillo) | 0.13 |\n| 6E (Kilgore East) | 1.36 |\n| 7B (Abilene) | 0.27 |\n| 7C (Big Spring) | 0.40 |\n| 8A (Lubbock) | 0.19 |\n\n*Note: Slopes are $\\hat{\\beta}_1 + \\hat{\\delta}_d$ from the H3 interaction model.*\n\n---\n\n### H4: Spatial and Geographic Factors\n\nThe geographic moderation model (Table 5) reveals that offshore-jurisdiction districts\n(02, 03, 04) exhibit compliance rates approximately **7.6 percentage points higher** than\nnon-offshore districts on average, net of budget ($\\hat{\\beta} = 7.61$, SE = 3.29,\n$z = 2.31$, $p = .02$). Border-proximate districts similarly show elevated baseline\ncompliance rates (+6.03 pp, SE = 2.84, $z = 2.12$, $p = .03$). These level effects may\nreflect the heightened external scrutiny \u2014 from federal regulators, environmental\norganizations, and media \u2014 that offshore and border districts attract, which could\nindependently drive compliance investments by operators regardless of RRC budget levels.\n\nThe budget\u2013compliance slope, however, does not differ significantly between offshore\nand non-offshore districts ($\\hat{\\beta}_4 = -0.03$, $p = .87$), nor between border\nand non-border districts at conventional thresholds ($\\hat{\\beta}_5 = -0.25$, $p = .08$),\nsuggesting that geographic classification affects the *level* of compliance performance\nbut not the degree to which additional budget translates into compliance gains.\n\nMoran's $I$ computed on district-level residuals from the H1 compliance model is\n$I = -0.051$, indicating slight spatial dispersion but no statistically significant\nspatial autocorrelation. 
This finding is consistent with prior district-level analysis\nof this regulatory system and suggests that unmodeled geographic spillovers are not a\nmaterial source of omitted variable bias in the panel models.\n\n**Table 5. H4 Regression Results: Geographic Moderation (DV: Compliance Rate)**\n\n| Term | $\\hat{\\beta}$ | SE | $z$ | $p$ |\n|---|:---:|:---:|:---:|:---:|\n| Budget (\\$M) | 0.35 | 0.15 | 2.39 | .02 |\n| Offshore (= 1) | 7.61 | 3.29 | 2.31 | .02 |\n| Border (= 1) | 6.03 | 2.84 | 2.12 | .03 |\n| Budget \u00d7 Offshore | \u22120.03 | 0.18 | \u22120.16 | .87 |\n| Budget \u00d7 Border | \u22120.25 | 0.15 | \u22121.74 | .08 |\n\n*Note: District fixed effects included. SE clustered at district. $R^2 = .553$,\nAdj. $R^2 = .476$. $N = 104$. Moran's $I$ on H1 compliance residuals = \u22120.051 (no\nsignificant spatial autocorrelation).*\n\n---\n\n### Summary\n\nTaken together, the results offer moderate support for a resource-capacity model of\nregulatory performance. Higher OGI appropriations are reliably associated with greater\ninspection volume, higher compliance rates, and faster violation resolution \u2014 though\nidentification rests on temporal variation in statewide appropriations rather than\nquasi-experimental assignment, and the modest panel length limits statistical precision.\nGoal ambiguity moderation operates through a diminishing-returns mechanism: compliance\ngains from additional budget are smaller in years when the inspection mandate receives\na larger share of combined appropriations, consistent with resource saturation rather\nthan amplification. District heterogeneity in budget\u2013outcome slopes is substantial in\ndescriptive terms but cannot be precisely estimated with the available data. 
Finally,\ngeographic context \u2014 offshore jurisdiction and border proximity \u2014 predicts compliance\nlevels but not budget sensitivity, and spatial autocorrelation diagnostics provide no\nevidence of unmodeled geographic spillover processes.\n\n### Robustness Checks\n\n**Wild cluster bootstrap.** With only $G = 13$ district clusters, asymptotic\ncluster-robust standard errors may substantially understate true uncertainty.\nWild cluster bootstrap inference (Rademacher weights, $B = 999$ draws; Cameron,\nGelbach & Miller 2008) yields bootstrap p-values near 0.49\u20130.51 for all three\nH1 outcomes: total inspections ($p_{boot} = 0.494$), compliance rate\n($p_{boot} = 0.473$), and resolution rate ($p_{boot} = 0.509$). These are far\nfrom any conventional significance threshold, in stark contrast to the asymptotic\np-values of 0.002, 0.021, and 0.001. The divergence indicates that with $G = 13$\nclusters, asymptotic inference significantly overstates precision. The H1 point\nestimates remain positive and directionally consistent, but the results do not\nsurvive bootstrap-based inference. This is the principal inferential limitation\nof the study.\n\n**Table 7. Wild Cluster Bootstrap vs. 
Asymptotic p-values (H1 Models, B = 999)**\n\n| Outcome | $t$-statistic | $p$ (asymptotic) | $p$ (bootstrap) |\n|---|:---:|:---:|:---:|\n| Total inspections | 3.13 | .002 | .494 |\n| Compliance rate | 2.31 | .021 | .473 |\n| Resolution rate | 3.28 | .001 | .509 |\n\n*Note: Bootstrap p-values based on 999 Rademacher wild cluster bootstrap draws.*\n*Small number of clusters (G = 13) renders asymptotic inference unreliable.*\n\n**Distributed lag model.** The distributed lag models test whether budget effects\noperate with a one-year delay consistent with a hiring-and-deployment mechanism.\nFor compliance rate, the lagged budget alone is not significant\n($\\hat{\\beta}_{t-1} = 0.10$, $p = .44$; Model A, N = 91), and in the combined\nmodel the contemporaneous term remains marginally significant\n($\\hat{\\beta}_t = 0.24$, $p = .04$) while the lagged term is negative and\nnon-significant ($\\hat{\\beta}_{t-1} = -0.14$, $p = .12$; Model B). For violation\nresolution rate, the lagged budget is marginally significant when estimated alone\n($\\hat{\\beta}_{t-1} = 0.83$, $p = .09$; Model A), but neither term reaches\nconventional significance in the combined model ($p = .22$ and $p = .14$).\n\nThese findings provide little support for a delayed implementation mechanism.\nThe persistence of contemporaneous effects alongside non-significant lagged terms\nis more consistent with an immediate budget\u2013output relationship. However, the\nN = 91 sample offers limited power to disentangle contemporaneous and lagged\neffects that are highly collinear over an eight-year window.\n\n**Table 8. 
Distributed Lag Results (2017\u20132023, N = 91)**\n\n| Model | DV | $\\hat{\\beta}_t$ | $p$ | $\\hat{\\beta}_{t-1}$ | $p$ | $R^2$ |\n|---|---|:---:|:---:|:---:|:---:|:---:|\n| A \u2014 Lag only | Compliance rate | \u2014 | \u2014 | 0.10 | .44 | .543 |\n| B \u2014 Both | Compliance rate | 0.24 | .04 | \u22120.14 | .12 | .569 |\n| A \u2014 Lag only | Resolution rate | \u2014 | \u2014 | 0.83 | .09 | .696 |\n| B \u2014 Both | Resolution rate | 0.24 | .22 | 0.59 | .14 | .698 |\n\n*Note: District fixed effects included; SE clustered at district.*\n" + "source": [ + "## Results\n", + "\n", + "### Descriptive Trends\n", + "\n", + "Table 1 summarizes year-level means for the key variables across 2016–2025, with\n", + "regression analyses restricted to 2016–2023. OGI appropriations grew from $18.47 million\n", + "in 2016 to $34.33 million in 2023 — an 86 percent nominal increase — with the FY2024\n", + "budget estimate reaching $38.51 million. Authorized FTE positions rose modestly from\n", + "256.7 to 271.2 over the same period. Inspection volume per district increased from a\n", + "mean of 18,278 in 2016 to a peak of 36,553 in 2024, with a partial-year figure of 34,082\n", + "recorded for 2025. Mean district compliance rate improved from 83.1 percent in 2016 to\n", + "a peak of 92.6 percent in 2024, with a slight moderation to 90.5 percent in the 2025\n", + "partial-year extract. Violation resolution rate rose from 36.8 percent in 2016 to 69.7\n", + "percent in 2023 before declining to 52.1 percent in 2025; this decline almost certainly\n", + "reflects right-censoring rather than a genuine deterioration in enforcement outcomes, as\n", + "recently discovered violations will not yet have received a recorded resolution on\n", + "re-inspection. Similarly, the 2025 days-to-enforcement figure of 36.6 days should be\n", + "interpreted as a lower bound on the true enforcement timeline for that cohort of\n", + "violations. 
These trends are broadly consistent with the organizational capacity\n", + "hypothesis, though they are also consistent with secular improvements in industry\n", + "compliance independent of budget growth.\n", + "\n", + "**Table 1. Year-Level Panel Means, 2016–2025**\n", + "\n", + "| Year | OGI Budget ($M) | OGI FTE | Inspections/District | Compliance Rate (%) | Resolution Rate (%) | Days to Enforcement |\n", + "|:----:|:---------------:|:-------:|:--------------------:|:-------------------:|:-------------------:|:-------------------:|\n", + "| 2016 | 18.47 | 256.7 | 18,278 | 83.1 | 36.8 | 131.9 |\n", + "| 2017 | 17.20 | 249.5 | 20,139 | 86.5 | 59.0 | 185.0 |\n", + "| 2018 | 17.56 | 229.9 | 25,704 | 90.2 | 59.5 | 207.3 |\n", + "| 2019 | 21.95 | 255.6 | 25,058 | 89.9 | 61.4 | 170.4 |\n", + "| 2020 | 26.06 | 284.0 | 27,669 | 89.6 | 56.8 | 154.7 |\n", + "| 2021 | 28.76 | 277.8 | 24,116 | 88.8 | 66.2 | 118.8 |\n", + "| 2022 | 25.91 | 264.0 | 32,024 | 89.8 | 67.9 | 91.5 |\n", + "| 2023 | 34.33 | 271.2 | 33,806 | 91.6 | 69.7 | 105.2 |\n", + "| 2024† | 38.51 | 280.8 | 36,553 | 92.6 | 65.1 | 76.9 |\n", + "| 2025‡ | — | — | 34,082 | 90.5 | 52.1 | 36.6‡ |\n", + "\n", + "*Note: Budget figures are nominal. FTE = authorized full-time equivalent positions.\n", + "Inspections/District = mean district-level annual inspection count.*\n", + "*† 2024 budget is an appropriations estimate, not expenditure actuals; excluded from\n", + "regression models.*\n", + "*‡ 2025 data is partial-year as of the data extract. Resolution rate and days-to-enforcement\n", + "are right-censored: violations discovered in late 2024–2025 may not yet have a recorded\n", + "enforcement action, compressing these metrics.*\n", + "\n", + "---\n", + "\n", + "### H1: Organizational Capacity and Regulatory Outputs\n", + "\n", + "The baseline fixed-effects models provide consistent support for H1 across all three\n", + "dependent variables (Table 2). 
Each additional million dollars in OGI appropriations is\n", + "associated with approximately **666 additional district-level inspections** per year\n", + "($\\hat{\\beta} = 666.30$, SE = 212.98, $z = 3.13$, $p < .01$; $R^2 = .769$). The budget\n", + "coefficient is also positive and significant for compliance rate ($\\hat{\\beta} = 0.26$\n", + "percentage points per \\$1M, SE = 0.11, $z = 2.31$, $p = .02$; $R^2 = .538$) and\n", + "violation resolution rate ($\\hat{\\beta} = 1.05$ percentage points per \\$1M, SE = 0.32,\n", + "$z = 3.28$, $p < .01$; $R^2 = .624$). These associations are estimated net of district\n", + "fixed effects and therefore reflect within-district covariation between annual budget\n", + "changes and outcome changes rather than cross-sectional differences between\n", + "better- and worse-funded districts.\n", + "\n", + "**Table 2. H1 Regression Results: OGI Budget → Regulatory Outputs**\n", + "\n", + "| Dependent Variable | $\\hat{\\beta}$ (Budget \\$M) | SE | $z$ | $p$ | $R^2$ | Adj. $R^2$ |\n", + "|---|:---:|:---:|:---:|:---:|:---:|:---:|\n", + "| Total inspections | 666.30 | 212.98 | 3.13 | <.01 | .769 | .736 |\n", + "| Compliance rate (%) | 0.26 | 0.11 | 2.31 | .02 | .538 | .471 |\n", + "| Resolution rate (%) | 1.05 | 0.32 | 3.28 | <.01 | .624 | .569 |\n", + "\n", + "*Note: All models include district fixed effects ($D = 13$). Standard errors clustered\n", + "at the district level. $N = 104$.*\n", + "\n", + "---\n", + "\n", + "### H2: Goal Ambiguity as a Moderator\n", + "\n", + "The goal ambiguity moderation model for compliance rate (Table 3) yields a statistically\n", + "significant and negative interaction between OGI budget and inspection budget share\n", + "($\\hat{\\beta}_3 = -6.53$, SE = 1.84, $z = -3.55$, $p < .01$). 
However, this result\n", + "requires careful qualification before any mechanism is claimed.\n", + "\n", + "The key issue is that `inspection_budget_share` — like the budget measure itself —\n", + "varies only over time, not across districts. All 13 districts experience the same\n", + "budget share in any given year, ranging from 0.59 (FY2022) to 0.67 (FY2018) across\n", + "the study period — a span of 8 percentage points over 8 observations. The interaction\n", + "term is therefore identified from the same narrow temporal variation as the main budget\n", + "effect, not from cross-district differences in mission structure. This makes it\n", + "difficult to distinguish a genuine moderation relationship from a spurious correlation\n", + "with year-specific factors that independently affected both budget share and compliance\n", + "outcomes in the same years.\n", + "\n", + "The negative sign is consistent with at least two interpretations. Under a\n", + "*resource saturation* story, compliance gains from additional OGI investment contract\n", + "as the inspection mandate becomes better resourced relative to other RRC goals —\n", + "a plausible ceiling effect if districts are already operating near full compliance\n", + "in high-share years. Alternatively, the result may simply reflect that FY2018 — the\n", + "highest-share year — saw particularly large compliance gains for reasons unrelated to\n", + "budget concentration (e.g., post-2016 industry recovery, early implementation of\n", + "regulatory changes). 
Evaluated at mean budget share ($\\bar{s} \\approx 0.62$), the\n", + "implied marginal budget effect on compliance is $4.20 - 6.53(0.62) \\approx 0.15$\n", + "pp per \\$1M — directionally consistent with H1 but smaller.\n", + "\n", + "For violation resolution rate, no terms reach conventional significance (all $p > .15$).\n", + "Given the identification constraints, the H2 compliance finding is best treated as an\n", + "exploratory pattern consistent with goal ambiguity theory — one that motivates future\n", + "research with district-level budget variation — rather than a robust confirmatory test.\n", + "\n", + "**Table 3. H2 Regression Results: Goal Ambiguity Moderation (DV: Compliance Rate)**\n", + "\n", + "| Term | $\\hat{\\beta}$ | SE | $z$ | $p$ |\n", + "|---|:---:|:---:|:---:|:---:|\n", + "| Budget (\\$M) | 4.20 | 1.09 | 3.86 | <.01 |\n", + "| Inspection budget share | 170.18 | 44.79 | 3.80 | <.01 |\n", + "| Budget × Share | −6.53 | 1.84 | −3.55 | <.01 |\n", + "\n", + "*Note: District fixed effects included. SE clustered at district. $R^2 = .567$,\n", + "Adj. $R^2 = .493$. $N = 104$.*\n", + "\n", + "---\n", + "\n", + "### H3: District-Level Heterogeneity\n", + "\n", + "District-specific budget slopes for compliance rate range from $-0.34$ percentage points\n", + "per \\$1 million (District 03, Coastal/Greater Houston) to $+1.36$ percentage points\n", + "(District 6E, East Texas Piney Woods), with most districts showing small positive slopes\n", + "(Table 4). The reference district (District 01, San Antonio) slope is 0.09 pp per \\$1M.\n", + "Positive slopes are most pronounced in District 6E (+1.36), District 06 (+0.43), and\n", + "District 7C (+0.40); District 03 is the only district with a substantially negative slope.\n", + "The model $R^2$ of .662 modestly exceeds the baseline H1 value (.538), consistent with\n", + "meaningful cross-district slope heterogeneity. 
Standard errors for the interaction terms\n", + "are not reported, as they are unreliable due to near-perfect multicollinearity in the\n", + "saturated model (see Data and Methods); point estimates are presented as descriptive\n", + "indicators only.\n", + "\n", + "**Table 4. H3 District-Specific Budget → Compliance Slopes (pp per \\$1M)**\n", + "\n", + "| District | Estimated Slope |\n", + "|:---:|:---:|\n", + "| 01 (San Antonio) | 0.09 |\n", + "| 02 (Corpus Christi) | 0.24 |\n", + "| 03 (Houston) | −0.34 |\n", + "| 04 (Laredo) | 0.28 |\n", + "| 05 (Midland/Abilene) | 0.05 |\n", + "| 06 (Kilgore) | 0.43 |\n", + "| 08 (Midland) | 0.28 |\n", + "| 09 (Wichita Falls) | 0.00 |\n", + "| 10 (Amarillo) | 0.13 |\n", + "| 6E (Kilgore East) | 1.36 |\n", + "| 7B (Abilene) | 0.27 |\n", + "| 7C (Big Spring) | 0.40 |\n", + "| 8A (Lubbock) | 0.19 |\n", + "\n", + "*Note: Slopes are $\\hat{\\beta}_1 + \\hat{\\delta}_d$ from the H3 interaction model.*\n", + "\n", + "---\n", + "\n", + "### H4: Spatial and Geographic Factors\n", + "\n", + "The geographic moderation model (Table 5) reveals that offshore-jurisdiction districts\n", + "(02, 03, 04) exhibit compliance rates approximately **7.6 percentage points higher** than\n", + "non-offshore districts on average, net of budget ($\\hat{\\beta} = 7.61$, SE = 3.29,\n", + "$z = 2.31$, $p = .02$). Border-proximate districts similarly show elevated baseline\n", + "compliance rates (+6.03 pp, SE = 2.84, $z = 2.12$, $p = .03$). 
These level effects may\n", + "reflect the heightened external scrutiny — from federal regulators, environmental\n", + "organizations, and media — that offshore and border districts attract, which could\n", + "independently drive compliance investments by operators regardless of RRC budget levels.\n", + "\n", + "The budget–compliance slope, however, does not differ significantly between offshore\n", + "and non-offshore districts ($\\hat{\\beta}_4 = -0.03$, $p = .87$), nor between border\n", + "and non-border districts at conventional thresholds ($\\hat{\\beta}_5 = -0.25$, $p = .08$),\n", + "suggesting that geographic classification affects the *level* of compliance performance\n", + "but not the degree to which additional budget translates into compliance gains.\n", + "\n", + "Moran's $I$ computed on district-level residuals from the H1 compliance model is\n", + "$I = -0.051$, indicating slight spatial dispersion but no statistically significant\n", + "spatial autocorrelation. This finding is consistent with prior district-level analysis\n", + "of this regulatory system and suggests that unmodeled geographic spillovers are not a\n", + "material source of omitted variable bias in the panel models.\n", + "\n", + "**Table 5. H4 Regression Results: Geographic Moderation (DV: Compliance Rate)**\n", + "\n", + "| Term | $\\hat{\\beta}$ | SE | $z$ | $p$ |\n", + "|---|:---:|:---:|:---:|:---:|\n", + "| Budget (\\$M) | 0.35 | 0.15 | 2.39 | .02 |\n", + "| Offshore (= 1) | 7.61 | 3.29 | 2.31 | .02 |\n", + "| Border (= 1) | 6.03 | 2.84 | 2.12 | .03 |\n", + "| Budget × Offshore | −0.03 | 0.18 | −0.16 | .87 |\n", + "| Budget × Border | −0.25 | 0.15 | −1.74 | .08 |\n", + "\n", + "*Note: District fixed effects included. SE clustered at district. $R^2 = .553$,\n", + "Adj. $R^2 = .476$. $N = 104$. 
Moran's $I$ on H1 compliance residuals = −0.051 (no\n", + "significant spatial autocorrelation).*\n", + "\n", + "---\n", + "\n", + "### Summary\n", + "\n", + "Taken together, the results offer tentative support for a resource-capacity model of\n", + "regulatory performance. Under asymptotic inference, higher OGI appropriations are\n", + "associated with greater inspection volume, higher compliance rates, and higher violation\n", + "resolution rates, but these associations do not survive wild cluster bootstrap inference\n", + "(see Robustness Checks), and identification rests on temporal variation in statewide\n", + "appropriations rather than quasi-experimental assignment. The budget × share interaction\n", + "for compliance is negative: gains from additional budget are smaller in years when the\n", + "inspection mandate receives a larger share of combined appropriations. This pattern is\n", + "consistent with resource saturation but cannot be separated from year-specific confounds.\n", + "District heterogeneity in budget–outcome slopes is substantial in descriptive terms but\n", + "cannot be precisely estimated with the available data. Finally, geographic context\n", + "(offshore jurisdiction and border proximity) predicts compliance levels but not budget\n", + "sensitivity, and Moran's $I$ gives no indication of unmodeled geographic spillovers.\n", + "\n", + "### Robustness Checks\n", + "\n", + "**Wild cluster bootstrap.** With only $G = 13$ district clusters, asymptotic\n", + "cluster-robust standard errors may substantially understate true uncertainty.\n", + "Wild cluster bootstrap inference (Rademacher weights, $B = 999$ draws; Cameron,\n", + "Gelbach & Miller 2008) yields bootstrap p-values between 0.47 and 0.51 for all three\n", + "H1 outcomes: total inspections ($p_{boot} = 0.494$), compliance rate\n", + "($p_{boot} = 0.473$), and resolution rate ($p_{boot} = 0.509$). 
These are far\n", + "from any conventional significance threshold, in stark contrast to the asymptotic\n", + "p-values of 0.002, 0.021, and 0.001. The divergence indicates that with $G = 13$\n", + "clusters, asymptotic inference substantially overstates precision. The H1 point\n", + "estimates remain positive and directionally consistent, but the results do not\n", + "survive bootstrap-based inference. This is the principal inferential limitation\n", + "of the study.\n", + "\n", + "**Table 7. Wild Cluster Bootstrap vs. Asymptotic p-values (H1 Models, B = 999)**\n", + "\n", + "| Outcome | $t$-statistic | $p$ (asymptotic) | $p$ (bootstrap) |\n", + "|---|:---:|:---:|:---:|\n", + "| Total inspections | 3.13 | .002 | .494 |\n", + "| Compliance rate | 2.31 | .021 | .473 |\n", + "| Resolution rate | 3.28 | .001 | .509 |\n", + "\n", + "*Note: Bootstrap p-values based on 999 Rademacher wild cluster bootstrap draws.*\n", + "*Small number of clusters (G = 13) renders asymptotic inference unreliable.*\n", + "\n", + "**Distributed lag model.** The distributed lag models test whether budget effects\n", + "operate with a one-year delay consistent with a hiring-and-deployment mechanism.\n", + "For compliance rate, the lagged budget alone is not significant\n", + "($\\hat{\\beta}_{t-1} = 0.10$, $p = .44$; Model A, N = 91), and in the combined\n", + "model the contemporaneous term remains significant at the .05 level\n", + "($\\hat{\\beta}_t = 0.24$, $p = .04$) while the lagged term is negative and\n", + "non-significant ($\\hat{\\beta}_{t-1} = -0.14$, $p = .12$; Model B). 
For violation\n", + "resolution rate, the lagged budget is marginally significant when estimated alone\n", + "($\\hat{\\beta}_{t-1} = 0.83$, $p = .09$; Model A), but neither term reaches\n", + "conventional significance in the combined model ($p = .22$ and $p = .14$).\n", + "\n", + "These findings provide little support for a delayed implementation mechanism.\n", + "The persistence of contemporaneous effects alongside non-significant lagged terms\n", + "is more consistent with an immediate budget–output relationship. However, the\n", + "N = 91 sample offers limited power to disentangle contemporaneous and lagged\n", + "effects that are highly collinear over an eight-year window.\n", + "\n", + "**Table 8. Distributed Lag Results (2017–2023, N = 91)**\n", + "\n", + "| Model | DV | $\\hat{\\beta}_t$ | $p$ | $\\hat{\\beta}_{t-1}$ | $p$ | $R^2$ |\n", + "|---|---|:---:|:---:|:---:|:---:|:---:|\n", + "| A — Lag only | Compliance rate | — | — | 0.10 | .44 | .543 |\n", + "| B — Both | Compliance rate | 0.24 | .04 | −0.14 | .12 | .569 |\n", + "| A — Lag only | Resolution rate | — | — | 0.83 | .09 | .696 |\n", + "| B — Both | Resolution rate | 0.24 | .22 | 0.59 | .14 | .698 |\n", + "\n", + "*Note: District fixed effects included; SE clustered at district.*\n" + ] }, { "cell_type": "markdown", "id": "360e76f4", "metadata": {}, - "source": "## Robustness Checks\n\nTwo checks address limitations of the baseline H1 models.\n\n**Wild cluster bootstrap** re-tests H1 with valid small-sample inference rather than\nasymptotic cluster-robust standard errors. With $G = 13$ clusters, asymptotic results\ncan overstate precision. Rademacher wild cluster bootstrap ($B = 999$ draws; Cameron,\nGelbach & Miller 2008) yields p-values near 0.49\u20130.51 for all three H1 outcomes \u2014\nfar from any conventional threshold \u2014 indicating that the asymptotic H1 results do\nnot survive this correction. 
Point estimates remain positive and substantively consistent\nin direction, but the study lacks the cluster count required to establish significance\nthrough bootstrap inference.\n\n**Distributed lag model** relaxes the assumption that budget effects are instantaneous.\nA one-year lag of OGI budget is estimated alone (Model A) and jointly with the\ncontemporaneous term (Model B), over the 2017\u20132023 sample (N = 91). The lagged budget\nis not independently significant for compliance rate ($p = .44$) and only marginally so\nfor resolution rate ($p = .09$). In the combined models, contemporaneous effects persist\nwhile lagged terms do not attain significance \u2014 providing little evidence that a delayed\nmechanism dominates an immediate one.\n" + "source": [ + "## Robustness Checks\n", + "\n", + "Two checks address limitations of the baseline H1 models.\n", + "\n", + "**Wild cluster bootstrap** re-tests H1 with valid small-sample inference rather than\n", + "asymptotic cluster-robust standard errors. With $G = 13$ clusters, asymptotic results\n", + "can overstate precision. Rademacher wild cluster bootstrap ($B = 999$ draws; Cameron,\n", + "Gelbach & Miller 2008) yields p-values near 0.49–0.51 for all three H1 outcomes —\n", + "far from any conventional threshold — indicating that the asymptotic H1 results do\n", + "not survive this correction. Point estimates remain positive and substantively consistent\n", + "in direction, but the study lacks the cluster count required to establish significance\n", + "through bootstrap inference.\n", + "\n", + "**Distributed lag model** relaxes the assumption that budget effects are instantaneous.\n", + "A one-year lag of OGI budget is estimated alone (Model A) and jointly with the\n", + "contemporaneous term (Model B), over the 2017–2023 sample (N = 91). The lagged budget\n", + "is not independently significant for compliance rate ($p = .44$) and only marginally so\n", + "for resolution rate ($p = .09$). 
In the combined models, contemporaneous effects persist\n", + "while lagged terms do not attain significance — providing little evidence that a delayed\n", + "mechanism dominates an immediate one.\n" + ] }, { "cell_type": "code", @@ -2761,9 +3052,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "Wild Cluster Bootstrap \u2014 H1 Models (B = 999 Rademacher draws)\n", + "Wild Cluster Bootstrap — H1 Models (B = 999 Rademacher draws)\n", "Outcome t-stat p asymptotic p bootstrap\n", - "\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "─────────────────────────────────────────────────────────────────\n", "total_inspections 3.128 0.002*** 0.494* \n", "compliance_rate 2.307 0.021** 0.473* \n", "resolution_rate 3.277 0.001*** 0.509* \n", @@ -2774,7 +3065,7 @@ ], "source": [ "# Wild cluster bootstrap (Rademacher weights, B=999)\n", - "# For each draw: multiply each cluster's residuals by \u00b11, re-fit, record t-stat.\n", + "# For each draw: multiply each cluster's residuals by ±1, re-fit, record t-stat.\n", "# p-value = share of |t*| >= |t_observed|.\n", "\n", "def wild_cluster_bootstrap(model, data, dv, cluster_col=\"district\",\n", @@ -2801,9 +3092,9 @@ " p_boot = float((np.abs(t_boot) >= np.abs(t_obs)).mean())\n", " return t_obs, float(model.pvalues[coef]), p_boot\n", "\n", - "print(\"Wild Cluster Bootstrap \u2014 H1 Models (B = 999 Rademacher draws)\")\n", + "print(\"Wild Cluster Bootstrap — H1 Models (B = 999 Rademacher draws)\")\n", "print(f\"{'Outcome':<28} {'t-stat':>7} {'p asymptotic':>13} {'p bootstrap':>12}\")\n", - "print(\"\u2500\" * 65)\n", + "print(\"─\" * 65)\n", "\n", "for dv, model in [\n", " 
(\"total_inspections\", m_inspections),\n", @@ -2828,33 +3119,33 @@ "name": "stdout", "output_type": "stream", "text": [ - "Distributed lag sample: 91 obs | years 2017\u20132023\n", + "Distributed lag sample: 91 obs | years 2017–2023\n", "\n", - "\u2500\u2500 Compliance Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "── Compliance Rate ───────────────────────────────────────────\n", "\n", - "Model A \u2014 Lagged budget only (t\u22121):\n", + "Model A — Lagged budget only (t−1):\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m_lag1 0.10 0.13 0.77 0.44\n", - " R\u00b2 = 0.543 Adj. R\u00b2 = 0.466\n", + " R² = 0.543 Adj. R² = 0.466\n", "\n", - "Model B \u2014 Contemporaneous + 1-year lag:\n", + "Model B — Contemporaneous + 1-year lag:\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m 0.24 0.11 2.08 0.04\n", "ogi_budget_m_lag1 -0.14 0.09 -1.55 0.12\n", - " R\u00b2 = 0.569 Adj. R\u00b2 = 0.490\n", + " R² = 0.569 Adj. R² = 0.490\n", "\n", - "\u2500\u2500 Resolution Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "── Resolution Rate ───────────────────────────────────────────\n", "\n", - "Model A \u2014 Lagged budget only (t\u22121):\n", + "Model A — Lagged budget only (t−1):\n", " Coef. Std.Err. z P>|z|\n", "ogi_budget_m_lag1 0.83 0.49 1.69 0.09\n", - " R\u00b2 = 0.696 Adj. R\u00b2 = 0.644\n", + " R² = 0.696 Adj. R² = 0.644\n", "\n", - "Model B \u2014 Contemporaneous + 1-year lag:\n", + "Model B — Contemporaneous + 1-year lag:\n", " Coef. Std.Err. 
z P>|z|\n", "ogi_budget_m 0.24 0.19 1.22 0.22\n", "ogi_budget_m_lag1 0.59 0.40 1.46 0.14\n", - " R\u00b2 = 0.698 Adj. R\u00b2 = 0.642\n" + " R² = 0.698 Adj. R² = 0.642\n" ] } ], @@ -2875,20 +3166,20 @@ "].copy()\n", "\n", "print(f\"Distributed lag sample: {len(lag_actuals)} obs | \"\n", - " f\"years {lag_actuals['year'].min()}\u2013{lag_actuals['year'].max()}\")\n", + " f\"years {lag_actuals['year'].min()}–{lag_actuals['year'].max()}\")\n", "\n", - "# \u2500\u2500 Model A: lagged budget only \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Model A: lagged budget only ───────────────────────────────────────────────\n", "m_lag_only = smf.ols(\n", " \"compliance_rate ~ ogi_budget_m_lag1 + C(district)\", data=lag_actuals\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", "\n", - "# \u2500\u2500 Model B: contemporaneous + 1-year lag \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Model B: contemporaneous + 1-year lag ────────────────────────────────────\n", "m_lag_both = smf.ols(\n", " \"compliance_rate ~ ogi_budget_m + ogi_budget_m_lag1 + C(district)\",\n", " data=lag_actuals\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", "\n", - "# \u2500\u2500 Also run for resolution rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n", + "# ── Also run for 
resolution rate ──────────────────────────────────────────────\n", "m_lag_res_only = smf.ols(\n", " \"resolution_rate ~ ogi_budget_m_lag1 + C(district)\", data=lag_actuals\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", @@ -2898,36 +3189,82 @@ " data=lag_actuals\n", ").fit(cov_type=\"cluster\", cov_kwds={\"groups\": lag_actuals[\"district\"]})\n", "\n", - "print(\"\\n\u2500\u2500 Compliance Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\")\n", + "print(\"\\n── Compliance Rate ───────────────────────────────────────────\")\n", "\n", - "print(\"\\nModel A \u2014 Lagged budget only (t\u22121):\")\n", + "print(\"\\nModel A — Lagged budget only (t−1):\")\n", "print(m_lag_only.summary2().tables[1][display_cols].loc[[\"ogi_budget_m_lag1\"]])\n", - "print(f\" R\u00b2 = {m_lag_only.rsquared:.3f} Adj. R\u00b2 = {m_lag_only.rsquared_adj:.3f}\")\n", + "print(f\" R² = {m_lag_only.rsquared:.3f} Adj. R² = {m_lag_only.rsquared_adj:.3f}\")\n", "\n", - "print(\"\\nModel B \u2014 Contemporaneous + 1-year lag:\")\n", + "print(\"\\nModel B — Contemporaneous + 1-year lag:\")\n", "print(m_lag_both.summary2().tables[1][display_cols].loc[\n", " [\"ogi_budget_m\", \"ogi_budget_m_lag1\"]\n", "])\n", - "print(f\" R\u00b2 = {m_lag_both.rsquared:.3f} Adj. R\u00b2 = {m_lag_both.rsquared_adj:.3f}\")\n", + "print(f\" R² = {m_lag_both.rsquared:.3f} Adj. 
R² = {m_lag_both.rsquared_adj:.3f}\")\n", "\n", - "print(\"\\n\u2500\u2500 Resolution Rate \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\")\n", + "print(\"\\n── Resolution Rate ───────────────────────────────────────────\")\n", "\n", - "print(\"\\nModel A \u2014 Lagged budget only (t\u22121):\")\n", + "print(\"\\nModel A — Lagged budget only (t−1):\")\n", "print(m_lag_res_only.summary2().tables[1][display_cols].loc[[\"ogi_budget_m_lag1\"]])\n", - "print(f\" R\u00b2 = {m_lag_res_only.rsquared:.3f} Adj. R\u00b2 = {m_lag_res_only.rsquared_adj:.3f}\")\n", + "print(f\" R² = {m_lag_res_only.rsquared:.3f} Adj. R² = {m_lag_res_only.rsquared_adj:.3f}\")\n", "\n", - "print(\"\\nModel B \u2014 Contemporaneous + 1-year lag:\")\n", + "print(\"\\nModel B — Contemporaneous + 1-year lag:\")\n", "print(m_lag_res_both.summary2().tables[1][display_cols].loc[\n", " [\"ogi_budget_m\", \"ogi_budget_m_lag1\"]\n", "])\n", - "print(f\" R\u00b2 = {m_lag_res_both.rsquared:.3f} Adj. R\u00b2 = {m_lag_res_both.rsquared_adj:.3f}\")\n" + "print(f\" R² = {m_lag_res_both.rsquared:.3f} Adj. R² = {m_lag_res_both.rsquared_adj:.3f}\")\n" ] }, { "cell_type": "markdown", "id": "90c60ad1", "metadata": {}, - "source": "## Hypotheses Summary\n\n**Table 6. 
Summary of Hypotheses, Predictions, Findings, and Empirical Support**\n\n| # | Hypothesis | Prediction | Key Result | Support |\n|:---:|---|---|---|:---:|\n| **H1a** | Capacity \u2192 Inspection volume | Higher OGI budget predicts more inspections per district | \u03b2 = 666.3/\\$1M (z = 3.13, p < .01); bootstrap p = .494 | \u2713\u2020 |\n| **H1b** | Capacity \u2192 Compliance | Higher OGI budget predicts higher district compliance rate | \u03b2 = 0.26 pp/\\$1M (z = 2.31, p = .02); bootstrap p = .473 | \u2713\u2020 |\n| **H1c** | Capacity \u2192 Resolution | Higher OGI budget predicts higher violation resolution rate | \u03b2 = 1.05 pp/\\$1M (z = 3.28, p < .01); bootstrap p = .509 | \u2713\u2020 |\n| **H2a** | Goal ambiguity moderates capacity \u2192 compliance | Clearer inspection focus amplifies budget effect | Significant but **negative** (\u03b2 = \u22126.53, z = \u22123.55, p < .01); interpretation constrained by time-only variation in budget share (range: 0.59\u20130.67) | Exploratory\u2021 |\n| **H2b** | Goal ambiguity moderates capacity \u2192 resolution | Clearer inspection focus amplifies budget effect | Interaction not significant (p = .24) | \u2717 |\n| **H3** | District heterogeneity in budget slopes | Budget \u2192 compliance slope varies across districts | Slopes from \u22120.34 pp/\\$1M (D03) to +1.36 pp/\\$1M (D6E); inference unreliable | Descriptive\u00a7 |\n| **H4a** | Offshore jurisdiction moderates budget effect | Offshore districts show different budget \u2192 compliance slope | Level effect: +7.6 pp (p = .02); slope interaction not significant (p = .87) | Partial\u00b6 |\n| **H4b** | Border proximity moderates budget effect | Border districts show different budget \u2192 compliance slope | Level effect: +6.0 pp (p = .03); slope interaction marginal (p = .08) | Partial\u00b6 |\n| **H4c** | Spatial autocorrelation in residuals | Geographic spillovers produce clustered residuals | Moran's I = \u22120.051; no significant spatial autocorrelation 
| \u2717 |\n\n*Notes:*\n\n*\u2020 H1 point estimates are positive and directionally consistent across all three outcomes,\nsupporting the capacity hypothesis substantively. However, wild cluster bootstrap\ninference (B = 999 Rademacher draws) yields p-values near 0.49\u20130.51 for all outcomes,\nindicating that asymptotic cluster-robust standard errors substantially overstate precision\nwith G = 13 clusters. H1 findings should be interpreted as suggestive rather than\nstatistically definitive. Distributed lag models (2017\u20132023, N = 91) show contemporaneous\neffects persist while lagged terms do not reach significance, providing no clear evidence\nfor a delayed implementation mechanism.*\n\n*\u2021 H2a is statistically significant but the identification is weak: inspection\nbudget share varies only over time (like the budget itself), with a range of just\n0.59\u20130.67 across 8 years. The negative interaction is consistent with a resource\nsaturation effect but cannot be distinguished from year-specific confounders.\nAt mean share (\u2248 0.62), the implied marginal budget effect is \u2248 0.15 pp per \\$1M.\nH2b not significant for resolution rate. Both H2 findings are best treated as\nexploratory patterns for future research.*\n\n*\u00a7 H3 interaction standard errors are unreliable (near-perfect multicollinearity in\nthe saturated model); budget slopes are reported as descriptive point estimates only.*\n\n*\u00b6 Geographic classification predicts compliance **levels** but not budget sensitivity.\nOffshore and border districts exhibit systematically higher compliance regardless of\nannual budget variation.*\n\n**Regression sample:** N = 104 (13 districts \u00d7 8 years, 2016\u20132023). All models include\ndistrict fixed effects; standard errors clustered at the district level (G = 13).\nRobustness sample: N = 91 (2017\u20132023, distributed lag models).\n" + "source": [ + "## Hypotheses Summary\n", + "\n", + "**Table 6. 
Summary of Hypotheses, Predictions, Findings, and Empirical Support**\n", + "\n", + "| # | Hypothesis | Prediction | Key Result | Support |\n", + "|:---:|---|---|---|:---:|\n", + "| **H1a** | Capacity → Inspection volume | Higher OGI budget predicts more inspections per district | β = 666.3/\\$1M (z = 3.13, p < .01); bootstrap p = .494 | ✓† |\n", + "| **H1b** | Capacity → Compliance | Higher OGI budget predicts higher district compliance rate | β = 0.26 pp/\\$1M (z = 2.31, p = .02); bootstrap p = .473 | ✓† |\n", + "| **H1c** | Capacity → Resolution | Higher OGI budget predicts higher violation resolution rate | β = 1.05 pp/\\$1M (z = 3.28, p < .01); bootstrap p = .509 | ✓† |\n", + "| **H2a** | Goal ambiguity moderates capacity → compliance | Clearer inspection focus amplifies budget effect | Significant but **negative** (β = −6.53, z = −3.55, p < .01); interpretation constrained by time-only variation in budget share (range: 0.59–0.67) | Exploratory‡ |\n", + "| **H2b** | Goal ambiguity moderates capacity → resolution | Clearer inspection focus amplifies budget effect | Interaction not significant (p = .24) | ✗ |\n", + "| **H3** | District heterogeneity in budget slopes | Budget → compliance slope varies across districts | Slopes from −0.34 pp/\\$1M (D03) to +1.36 pp/\\$1M (D6E); inference unreliable | Descriptive§ |\n", + "| **H4a** | Offshore jurisdiction moderates budget effect | Offshore districts show different budget → compliance slope | Level effect: +7.6 pp (p = .02); slope interaction not significant (p = .87) | Partial¶ |\n", + "| **H4b** | Border proximity moderates budget effect | Border districts show different budget → compliance slope | Level effect: +6.0 pp (p = .03); slope interaction marginal (p = .08) | Partial¶ |\n", + "| **H4c** | Spatial autocorrelation in residuals | Geographic spillovers produce clustered residuals | Moran's I = −0.051; no significant spatial autocorrelation | ✗ |\n", + "\n", + "*Notes:*\n", + "\n", + "*† H1 point estimates 
are positive and directionally consistent across all three outcomes,\n", + "supporting the capacity hypothesis substantively. However, wild cluster bootstrap\n", + "inference (B = 999 Rademacher draws) yields p-values near 0.49–0.51 for all outcomes,\n", + "indicating that asymptotic cluster-robust standard errors substantially overstate precision\n", + "with G = 13 clusters. H1 findings should be interpreted as suggestive rather than\n", + "statistically definitive. In the distributed lag models (2017–2023, N = 91), the contemporaneous\n", + "effect persists for compliance rate only and no lagged term reaches significance, providing no clear evidence\n", + "for a delayed implementation mechanism.*\n", + "\n", + "*‡ H2a is statistically significant but the identification is weak: inspection\n", + "budget share varies only over time (like the budget itself), with a range of just\n", + "0.59–0.67 across 8 years. The negative interaction is consistent with a resource\n", + "saturation effect but cannot be distinguished from year-specific confounders.\n", + "At mean share (≈ 0.62), the implied marginal budget effect is ≈ 0.15 pp per \\$1M.\n", + "H2b is not significant for resolution rate. Both H2 findings are best treated as\n", + "exploratory patterns for future research.*\n", + "\n", + "*§ H3 interaction standard errors are unreliable (near-perfect multicollinearity in\n", + "the saturated model); budget slopes are reported as descriptive point estimates only.*\n", + "\n", + "*¶ Geographic classification predicts compliance **levels** but not budget sensitivity.\n", + "Offshore and border districts exhibit systematically higher compliance regardless of\n", + "annual budget variation.*\n", + "\n", + "**Regression sample:** N = 104 (13 districts × 8 years, 2016–2023). 
All models include\n", + "district fixed effects; standard errors clustered at the district level (G = 13).\n", + "Robustness sample: N = 91 (2017–2023, distributed lag models).\n" + ] } ], "metadata": { @@ -2951,4 +3288,4 @@ }, "nbformat": 4, "nbformat_minor": 5 -} \ No newline at end of file +}