diff --git a/analysis/new analysis Aug 2025/analayis11_2020_nooutliers.ipynb b/analysis/new analysis Aug 2025/analayis11_2020_nooutliers.ipynb
index 2d5cadd..56580ce 100644
--- a/analysis/new analysis Aug 2025/analayis11_2020_nooutliers.ipynb
+++ b/analysis/new analysis Aug 2025/analayis11_2020_nooutliers.ipynb
@@ -1791,25 +1791,55 @@
    "id": "1f786261",
    "metadata": {},
    "source": [
-    "## narrative\n",
+    "## Narrative Summary of Reporting Delay Analysis\n",
     "\n",
-    "In the analytic sample (N = 11,376; after IQR outlier trimming), predicted reporting delays declined after 2020 across spill types and rurality. Using a Poisson GLM with a full C(spill_type) × C(Period) × C(rurality) interaction and parametric bootstrap (B = 2,000) for uncertainty, the estimated reduction in predicted delay (Before 2020 − 2020 and After) ranged from about 0.22–0.46 days (≈5–11 hours). For example, Historical spills in urban areas show a median reduction of 0.460 days (95% bootstrap CI 0.404–0.517, p < 0.001), and Recent spills in urban areas show a median reduction of 0.299 days (95% CI 0.222–0.375, p < 0.001).\n",
+    "### Analytic Sample and Model\n",
     "\n",
-    "Comparing spill types, Recent spills tend to have longer predicted delays than Historical spills, with the largest and most robust gap in urban settings. After 2020, the Recent − Historical difference in Urban areas is 0.291 days (95% CI 0.244–0.341, p < 0.001), equivalent to roughly 7.0 hours. Differences by spill type are smaller and often not statistically distinguishable in some Suburban and Rural cells before 2020 (e.g., Before 2020 Suburban: 0.020 days, 95% CI −0.222–0.244, p ≈ 0.85).\n",
+    "- **Sample size:** N = 11,376 (after IQR outlier trimming)\n",
+    "- **Model:** Poisson GLM with full interaction: `C(spill_type) × C(Period) × C(rurality)`\n",
+    "- **Uncertainty:** Parametric bootstrap (B = 2,000)\n",
     "\n",
-    "Limitations: the primary inference is model‑dependent (Poisson fit showed underdispersion, Pearson χ2/df ≈ 0.69), so results rely on the bootstrap procedures used to quantify uncertainty; a nonparametric case bootstrap (B = 1,000) produced broadly similar group CIs. Small sample sizes in some SpillType×Period×rurality cells limit precision for those specific comparisons, and reported p‑values are unadjusted for multiple contrasts; therefore emphasize effect sizes and bootstrap CIs in policy interpretation.\n",
+    "### Main Findings\n",
     "\n",
-    "Figure caption Figure X. Predicted reporting delays (days) for each Spill Type × Period × rurality cell with parametric bootstrap (B = 2,000) 95% confidence intervals. Points show bootstrap medians of the model‑predicted group means; vertical bars are 2.5th–97.5th percentiles from the bootstrap distribution."
+    "- **Reporting delays declined after 2020** across spill types and rurality.\n",
+    "- **Estimated reduction (Before 2020 − 2020 and After):** \n",
+    " - Range: **0.22–0.46 days** (≈5–11 hours)\n",
+    " - **Historical spills, Urban:** \n",
+    " - Median reduction: **0.460 days** \n",
+    " - 95% bootstrap CI: **0.404–0.517** \n",
+    " - p < 0.001\n",
+    " - **Recent spills, Urban:** \n",
+    " - Median reduction: **0.299 days** \n",
+    " - 95% CI: **0.222–0.375** \n",
+    " - p < 0.001\n",
+    "\n",
+    "- **Spill Type Comparison:** \n",
+    " - Recent spills generally have longer predicted delays than Historical spills.\n",
+    " - **Largest and most robust gap:** Urban settings after 2020\n",
+    " - **Recent − Historical, Urban, 2020 and After:** \n",
+    " - Difference: **0.291 days** (≈7.0 hours) \n",
+    " - 95% CI: **0.244–0.341** \n",
+    " - p < 0.001\n",
+    " - **Suburban and Rural cells before 2020:** \n",
+    " - Differences smaller and often not statistically significant \n",
+    " - Example: Suburban, Before 2020: **0.020 days** (95% CI −0.222–0.244, p ≈ 0.85)\n",
+    "\n",
+    "### Limitations\n",
+    "\n",
+    "- **Model dependence:** Poisson fit showed underdispersion (Pearson χ²/df ≈ 0.69)\n",
+    "- **Bootstrap reliance:** Results rely on bootstrap procedures for uncertainty quantification\n",
+    "- **Sensitivity:** Nonparametric case bootstrap (B = 1,000) produced similar group CIs\n",
+    "- **Small sample sizes:** Some SpillType × Period × rurality cells have limited precision\n",
+    "- **Multiple comparisons:** p-values unadjusted; emphasize effect sizes and bootstrap CIs for policy\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### Figure Caption\n",
+    "\n",
+    "**Figure X.** Predicted reporting delays (days) for each Spill Type × Period × rurality cell with parametric bootstrap (B = 2,000) 95% confidence intervals. \n",
+    "Points show bootstrap medians of the model-predicted group means; vertical bars are 2.5th–97.5th percentiles from the bootstrap distribution."
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "eeed5308",
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
   {
    "cell_type": "code",
    "execution_count": 49,