chore(ops): add RLS preflight and registry certbot repair kit

2026-05-12 18:25:53 +08:00
parent a18e2f9c3f
commit 0bc1878778
6 changed files with 752 additions and 0 deletions
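
Reviewer note: the `--json` mode of `awooop_rls_preflight.py` prints one object with a `checks` list (`name`, `status`, `detail`) and an `evidence` map. The snippet below is a minimal sketch of how a CI step or handover script might turn that output into the same `PASS/WARN/BLOCKED` summary and exit code. The function name `summarize_preflight` and the `SAMPLE` payload are hypothetical; the sample only mirrors the check names and statuses recorded in the runbook, with shortened details and empty evidence.

```python
import json
from collections import Counter

# Hypothetical sample shaped like the --json output; statuses mirror the
# 2026-05-12 runbook result, details are abbreviated for illustration.
SAMPLE = json.dumps({
    "checks": [
        {"name": "current_role_rls_enforced", "status": "PASS", "detail": "current_user=awoooi is subject to RLS"},
        {"name": "project_context_set_config", "status": "PASS", "detail": "set_config app.project_id works"},
        {"name": "required_roles", "status": "BLOCKED", "detail": "missing roles: awooop_app, awooop_platform_admin, awooop_migration"},
        {"name": "project_id_columns", "status": "PASS", "detail": "all existing target tables have project_id"},
        {"name": "rls_enabled_forced_policy", "status": "BLOCKED", "detail": "RLS not fully enabled/forced/policied"},
        {"name": "fail_open_policies", "status": "PASS", "detail": "no fail-open policy expressions detected"},
        {"name": "project_id_backfill", "status": "PASS", "detail": "no NULL project_id rows in counted tables"},
    ],
    "evidence": {},
})


def summarize_preflight(payload: str) -> tuple[int, list[str]]:
    """Return (exit_code, summary_lines) for one --json preflight document."""
    report = json.loads(payload)
    checks = report.get("checks", [])
    # Counter returns 0 for statuses that never occur, so the format below is safe.
    counts = Counter(check["status"] for check in checks)
    lines = [f"PASS={counts['PASS']} WARN={counts['WARN']} BLOCKED={counts['BLOCKED']}"]
    # List only the blockers; they are what the approver needs to act on.
    lines += [f"{c['name']}: {c['detail']}" for c in checks if c["status"] == "BLOCKED"]
    # Same convention as the probe: 2 means the gate is blocked, 0 means clear.
    return (2 if counts["BLOCKED"] else 0), lines


exit_code, summary = summarize_preflight(SAMPLE)
print("\n".join(summary))
print(f"exit_code={exit_code}")
```

In practice the payload would come from `bash scripts/ops/awooop-rls-preflight.sh --json` rather than an inline sample. Keeping the exit-code convention identical to the probe means a wrapper like this can replace the human-readable mode without changing gate semantics.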
--- a/docs/LOGBOOK.md
+++ b/docs/LOGBOOK.md
@@ -1,3 +1,56 @@
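+## 2026-05-12 | RLS Preflight and 188 Registry Certbot Repair Kit
+
+**Background**: Wave 1 confirmed that production RLS is a P0, but it cannot be hot-enabled directly. Certbot renewal for `registry.wooo.work` on 188 is also confirmed broken, but the `ollama` SSH account currently has no passwordless sudo. This round turns both red flags into remediation prep packages that can be re-run, handed over, and approved.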
+
+**New RLS preflight**:
+- `scripts/ops/awooop_rls_preflight.py`:
+  - Designed to run inside the production API pod using the pod-local `DATABASE_URL`; it does not print the DB URL or password.
+  - Read-only checks of the DB role, `set_config('app.project_id')`, the `project_id` column on target tables, RLS enabled/forced/policy state, and fail-open policy expressions.
+  - Runs the exact `COUNT(*)` / `NULL project_id` scan only when `--exact-counts` is given.
+- `scripts/ops/awooop-rls-preflight.sh`:
+  - By default runs `sudo kubectl -n awoooi-prod exec -i deployment/awoooi-api -c api -- python -` through `wooo@192.168.0.120`.
+  - Supports `--local`, `--ssh USER@HOST`, `--json`, and `--exact-counts`.
+  - Exit `2` means the RLS gate is blocked and RLS must not be enabled.
+- `docs/runbooks/AWOOOP-RLS-PREFLIGHT.md`:
+  - Records the 2026-05-12 production preflight result and the remediation order.
+
+**RLS live preflight result**:
+- `bash scripts/ops/awooop-rls-preflight.sh --exact-counts` → exit `2`, matching the blocked gate.
+- `PASS=5 WARN=0 BLOCKED=2`.
+- PASS:
+  - The current DB user `awoooi` is neither superuser nor bypassrls.
+  - `set_config('app.project_id', 'awoooi', TRUE)` works.
+  - All existing target tables have `project_id`.
+  - The production DB currently has no fail-open policy expressions.
+  - Exact counts show `NULL project_id = 0` for the existing target tables.
+- BLOCKED:
+  - The `awooop_app`, `awooop_platform_admin`, and `awooop_migration` roles do not exist.
+  - Target tables are not yet RLS enabled / forced / policied.
+- Interpretation: the next step is not a data backfill but role bootstrap + DB access path audit + staged policy enablement. The production app user is currently `awoooi`, so the policy design must first decide whether to grant it `awooop_app` membership or to switch the connection role.
+
+**New 188 registry certbot repair kit**:
+- `scripts/ops/188-registry-certbot-fix.sh`:
+  - Root-only helper; dry-run by default, and it changes 188 only when run with `--apply`.
+  - Creates `/var/www/certbot`.
+  - Installs `/etc/nginx/conf.d/registry-acme-http.conf` so that `registry.wooo.work` HTTP-01 no longer falls through to the `aiops.wooo.work` default vhost.
+  - Reloads Nginx after `nginx -t`.
+  - Renews with `/snap/bin/certbot renew --cert-name registry.wooo.work`.
+  - When snap certbot is present, disables the broken apt `certbot.timer` and resets the failed apt certbot service.
+- `docs/runbooks/REGISTRY-CERTBOT-188.md`:
+  - Records the expired cert, the wrong route, the apt/snap certbot owner split, and the post-fix verification commands.
+
+**Verification**:
+- `python3 -m py_compile scripts/ops/awooop_rls_preflight.py` → passed.
+- `bash -n scripts/ops/awooop-rls-preflight.sh scripts/ops/188-registry-certbot-fix.sh` → passed.
+- `scripts/ops/188-registry-certbot-fix.sh` dry-run → printed the expected actions without modifying the local host or 188.
+- The RLS preflight ran end to end against the production API pod; the blocked result is as expected and the DB was not changed.
+- Synced the helper to 188 at `/home/ollama/awoooi-ops/188-registry-certbot-fix.sh`.
+- Remote `bash -n` on 188 passed; the remote dry-run printed the expected root actions without changing Nginx / certbot.
+
+**Next steps**:
+- An operator with sudo rights runs `sudo /home/ollama/awoooi-ops/188-registry-certbot-fix.sh --apply` on 188.
+- For RLS, review the role bootstrap design first, then produce batch migrations; do not apply the existing RLS migration directly.
+
 ## 2026-05-12 | Wave 1 Claude P0 Red-Flag Verification and GitHub CD Lockdown

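 **Background**: The Claude Code inventory can only serve as a candidate list; each item must be verified against the production DB, host state, provider logs, and repo artifacts. This round first handles the high-risk red flags that can be confirmed quickly.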
--- a/docs/runbooks/AWOOOP-RLS-PREFLIGHT.md
+++ b/docs/runbooks/AWOOOP-RLS-PREFLIGHT.md
@@ -0,0 +1,88 @@
+# AwoooP RLS Preflight Runbook
+
+> Purpose: verify whether production is ready for PostgreSQL Row-Level Security
+> without enabling RLS or changing data.
+
+## Command
+
+Default path runs the probe inside the production API pod through the 120
+control-plane host. `DATABASE_URL` stays inside Kubernetes and is not printed.
+
+```bash
+bash scripts/ops/awooop-rls-preflight.sh
+```
+
+Before enabling RLS, run exact backfill counts:
+
+```bash
+bash scripts/ops/awooop-rls-preflight.sh --exact-counts
+```
+
+Useful variants:
+
+```bash
+bash scripts/ops/awooop-rls-preflight.sh --json
+bash scripts/ops/awooop-rls-preflight.sh --local
+AWOOOP_RLS_SSH_TARGET=wooo@192.168.0.120 bash scripts/ops/awooop-rls-preflight.sh
+```
+
+Exit code `2` means the gate is blocked and RLS must not be enabled yet.
+
+## 2026-05-12 Production Result
+
+`--exact-counts` returned:
+
+- `PASS current_role_rls_enforced`: current DB user is `awoooi`, not superuser and not `BYPASSRLS`.
+- `PASS project_context_set_config`: `set_config('app.project_id', 'awoooi', TRUE)` works in the API pod.
+- `BLOCKED required_roles`: `awooop_app`, `awooop_platform_admin`, and `awooop_migration` do not exist.
+- `PASS project_id_columns`: every existing target table has `project_id`.
+- `BLOCKED rls_enabled_forced_policy`: existing target tables are not yet RLS enabled, forced, or policied.
+- `PASS fail_open_policies`: production DB currently has no fail-open policy expressions.
+- `PASS project_id_backfill`: exact counts found zero `NULL project_id` rows in counted target tables.
+
+Current blocker summary:
+
+```text
+PASS=5 WARN=0 BLOCKED=2
+```
+
+Important exact counts from the same run:
+
+| Table | Rows | NULL project_id |
+| --- | ---: | ---: |
+| `audit_logs` | 686 | 0 |
+| `awooop_mcp_tool_registry` | 4 | 0 |
+| `awooop_outbound_message` | 228 | 0 |
+| `awooop_projects` | 2 | 0 |
+| `awooop_run_state` | 106 | 0 |
+| `incidents` | 1518 | 0 |
+| `knowledge_entries` | 2099 | 0 |
+| `playbooks` | 220 | 0 |
+
+## Remediation Order
+
+1. Create or reconcile RLS roles.
+   - Current production app user is `awoooi`; policy design must either grant it
+     membership in `awooop_app` or update the application connection role before
+     policies are enforced.
+   - Do not create passworded LOGIN roles in a migration unless the K8s Secret
+     rotation path is ready.
+2. Verify all DB access paths use `get_db()` / `get_db_context()` or otherwise set
+   `app.project_id` before queries.
+3. Apply policies first in staging or a canary DB.
+4. In production, enable one batch at a time.
+5. After each batch, run:
+
+```bash
+bash scripts/ops/awooop-rls-preflight.sh --exact-counts
+```
+
+6. Validate AwoooP Runs, Approvals, Monitoring, Tickets, Cost, alert ingestion,
+   background workers, and TelegramGateway mirror paths.
+
+## Do Not
+
+- Do not enable all policies in production before the role path is decided.
+- Do not rely on fail-open `IS NULL` or empty-string policies as the target state.
+- Do not run destructive rollback SQL unless the incident commander explicitly
+  approves it.
--- a/docs/runbooks/REGISTRY-CERTBOT-188.md
+++ b/docs/runbooks/REGISTRY-CERTBOT-188.md
@@ -0,0 +1,62 @@
+# 188 Registry Certbot Recovery
+
+> Scope: `registry.wooo.work` on host `192.168.0.188`.
+
+## Verified State On 2026-05-12
+
+- `registry.wooo.work` certificate expired at `May 8 04:16:08 2026 GMT`.
+- HTTP-01 route check:
+
+```text
+http://registry.wooo.work/.well-known/acme-challenge/codex-route-check
+-> 301 https://aiops.wooo.work/.well-known/acme-challenge/codex-route-check
+-> 404
+```
+
+- `/usr/bin/certbot` is broken by Python/OpenSSL mismatch.
+- `/snap/bin/certbot` exists and should be the renewal owner.
+- Both apt `certbot.timer` and snap `snap.certbot.renew.timer` were enabled.
+- The `ollama` SSH user is in sudo group but has no passwordless sudo in this
+  session, so Codex could not apply the root-level fix directly.
+
+## Fix Script
+
+The repo includes a root-only helper. It is dry-run by default:
+
+```bash
+bash scripts/ops/188-registry-certbot-fix.sh
+```
+
+To apply on 188:
+
+```bash
+sudo bash /home/ollama/awoooi-ops/188-registry-certbot-fix.sh --apply
+```
+
+The script:
+
+- creates `/var/www/certbot`;
+- installs `/etc/nginx/conf.d/registry-acme-http.conf`;
+- routes `registry.wooo.work` HTTP-01 to `/var/www/certbot`;
+- reloads Nginx after `nginx -t`;
+- renews `registry.wooo.work` via `/snap/bin/certbot`;
+- disables the broken apt `certbot.timer` when snap certbot is present;
+- prints the renewed certificate dates.
+
+## Post-Fix Verification
+
+Run from any host with network access:
+
+```bash
+curl -sI --max-redirs 0 http://registry.wooo.work/.well-known/acme-challenge/codex-route-check
+openssl s_client -servername registry.wooo.work -connect registry.wooo.work:443 </dev/null 2>/dev/null \
+  | openssl x509 -noout -subject -issuer -dates
+```
+
+Expected:
+
+- HTTP challenge path returns `404` from the `registry.wooo.work` vhost, not a
+  redirect to `aiops.wooo.work`.
+- `notAfter` is renewed to a future date.
+- On 188 itself, `systemctl --failed` no longer lists apt `certbot.service`
+  after the failed state is reset.
--- a/scripts/ops/188-registry-certbot-fix.sh
+++ b/scripts/ops/188-registry-certbot-fix.sh
@@ -0,0 +1,117 @@
+#!/usr/bin/env bash
+# Repair helper for 188 registry.wooo.work HTTP-01 renewal.
+# Default is dry-run. Use --apply on 188 as root after reviewing the plan.
+set -euo pipefail
+
+APPLY=0
+DOMAIN="${REGISTRY_CERTBOT_DOMAIN:-registry.wooo.work}"
+WEBROOT="${REGISTRY_CERTBOT_WEBROOT:-/var/www/certbot}"
+NGINX_SNIPPET="${REGISTRY_CERTBOT_NGINX_SNIPPET:-/etc/nginx/conf.d/registry-acme-http.conf}"
+CERTBOT_BIN="${REGISTRY_CERTBOT_BIN:-/snap/bin/certbot}"
+
+usage() {
+  cat <<'USAGE'
+Usage: sudo bash scripts/ops/188-registry-certbot-fix.sh [--apply]
+
+Fixes the known 188 drift where registry.wooo.work HTTP-01 traffic falls through
+to the aiops.wooo.work default server and certbot cannot renew the registry cert.
+
+Default mode is dry-run and prints the exact actions. --apply requires root.
+
+Environment:
+  REGISTRY_CERTBOT_DOMAIN        Default: registry.wooo.work
+  REGISTRY_CERTBOT_WEBROOT       Default: /var/www/certbot
+  REGISTRY_CERTBOT_NGINX_SNIPPET Default: /etc/nginx/conf.d/registry-acme-http.conf
+  REGISTRY_CERTBOT_BIN           Default: /snap/bin/certbot
+USAGE
+}
+
+while [ "$#" -gt 0 ]; do
+  case "$1" in
+    --apply)
+      APPLY=1
+      ;;
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    *)
+      echo "Unknown argument: $1" >&2
+      usage >&2
+      exit 64
+      ;;
+  esac
+  shift
+done
+
+run() {
+  if [ "$APPLY" -eq 1 ]; then
+    "$@"
+  else
+    printf 'DRY-RUN:'
+    printf ' %q' "$@"
+    printf '\n'
+  fi
+}
+
+write_snippet() {
+  local tmp
+  tmp="$(mktemp)"
+  cat > "$tmp" <<EOF
+# Managed by AWOOOI registry certbot repair.
+# LetsEncrypt HTTP-01 must not fall through to aiops.wooo.work.
+server {
+    listen 80;
+    server_name ${DOMAIN};
+
+    location /.well-known/acme-challenge/ {
+        root ${WEBROOT};
+        default_type "text/plain";
+    }
+
+    location / {
+        return 301 https://\$host\$request_uri;
+    }
+}
+EOF
+  run install -m 0644 "$tmp" "$NGINX_SNIPPET"
+  rm -f "$tmp"
+}
+
+if [ "$APPLY" -eq 1 ] && [ "$(id -u)" -ne 0 ]; then
+  echo "--apply must be run as root on 188" >&2
+  exit 77
+fi
+
+if [ "$APPLY" -eq 1 ] && [ ! -x "$CERTBOT_BIN" ]; then
+  echo "certbot binary not executable: $CERTBOT_BIN" >&2
+  exit 69
+fi
+
+echo "Plan: repair HTTP-01 route for ${DOMAIN}, renew via ${CERTBOT_BIN}, reload nginx."
+run install -d -m 0755 "$WEBROOT"
+write_snippet
+run nginx -t
+run systemctl reload nginx
+
+if [ "$APPLY" -eq 1 ]; then
+  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 8 "http://${DOMAIN}/.well-known/acme-challenge/codex-route-check" || true)"
+  if [ "$code" != "404" ]; then
+    echo "Unexpected ACME route status after nginx reload: ${code}; expected 404 from ${DOMAIN}, not redirect/default vhost" >&2
+    exit 1
+  fi
+fi
+
+run "$CERTBOT_BIN" renew --cert-name "$DOMAIN" --deploy-hook "systemctl reload nginx"
+
+if [ -x /snap/bin/certbot ]; then
+  run systemctl disable --now certbot.timer
+  run systemctl reset-failed certbot.service
+fi
+
+if [ "$APPLY" -eq 1 ]; then
+  openssl x509 -noout -subject -issuer -dates -in "/etc/letsencrypt/live/${DOMAIN}/fullchain.pem"
+  systemctl status snap.certbot.renew.timer --no-pager -l | sed -n '1,25p' || true
+else
+  echo "Dry-run only. Re-run with --apply on 188 as root to execute."
+fi
--- a/scripts/ops/awooop-rls-preflight.sh
+++ b/scripts/ops/awooop-rls-preflight.sh
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+# Read-only AwoooP RLS preflight runner.
+#
+# Default path runs inside the production API pod through the 120 control-plane
+# host, so DATABASE_URL stays inside Kubernetes and is never printed locally.
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PY_SCRIPT="${SCRIPT_DIR}/awooop_rls_preflight.py"
+
+NAMESPACE="${AWOOOP_RLS_NAMESPACE:-awoooi-prod}"
+DEPLOYMENT="${AWOOOP_RLS_DEPLOYMENT:-deployment/awoooi-api}"
+CONTAINER="${AWOOOP_RLS_CONTAINER:-api}"
+SSH_TARGET="${AWOOOP_RLS_SSH_TARGET:-wooo@192.168.0.120}"
+REMOTE_KUBECTL="${AWOOOP_RLS_REMOTE_KUBECTL:-sudo kubectl}"
+KUBECTL="${AWOOOP_RLS_KUBECTL:-kubectl}"
+USE_SSH=1
+PY_ARGS=()
+SSH_OPTS=(-o BatchMode=yes -o ConnectTimeout=8)
+
+usage() {
+  cat <<'USAGE'
+Usage: bash scripts/ops/awooop-rls-preflight.sh [options]
+
+Read-only checks for AwoooP PostgreSQL RLS readiness. The script runs the Python
+probe inside the API pod and exits 2 when RLS is not ready to enable.
+
+Options:
+  --exact-counts        Run exact COUNT(*) project_id backfill checks.
+  --json                Print JSON output from the pod.
+  --local               Use local kubectl instead of SSH to 120.
+  --ssh USER@HOST       Override SSH target. Default: wooo@192.168.0.120.
+  -h, --help            Show this help.
+
+Environment:
+  AWOOOP_RLS_NAMESPACE       Default: awoooi-prod
+  AWOOOP_RLS_DEPLOYMENT      Default: deployment/awoooi-api
+  AWOOOP_RLS_CONTAINER       Default: api
+  AWOOOP_RLS_REMOTE_KUBECTL  Default: sudo kubectl
+  AWOOOP_RLS_KUBECTL         Default: kubectl
+USAGE
+}
+
+while [ "$#" -gt 0 ]; do
+  case "$1" in
+    --exact-counts)
+      PY_ARGS+=(--exact-counts)
+      ;;
+    --json)
+      PY_ARGS+=(--json)
+      ;;
+    --local)
+      USE_SSH=0
+      ;;
+    --ssh)
+      shift
+      SSH_TARGET="${1:-}"
+      if [ -z "$SSH_TARGET" ]; then
+        echo "--ssh requires USER@HOST" >&2
+        exit 64
+      fi
+      USE_SSH=1
+      ;;
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    *)
+      echo "Unknown argument: $1" >&2
+      usage >&2
+      exit 64
+      ;;
+  esac
+  shift
+done
+
+if [ ! -f "$PY_SCRIPT" ]; then
+  echo "Missing Python probe: $PY_SCRIPT" >&2
+  exit 66
+fi
+
+if [ "$USE_SSH" -eq 1 ]; then
+  printf -v namespace_q "%q" "$NAMESPACE"
+  printf -v deployment_q "%q" "$DEPLOYMENT"
+  printf -v container_q "%q" "$CONTAINER"
+  remote_cmd="${REMOTE_KUBECTL} -n ${namespace_q} exec -i ${deployment_q} -c ${container_q} -- python -"
+  if [ "${#PY_ARGS[@]}" -gt 0 ]; then
+    for arg in "${PY_ARGS[@]}"; do
+      printf -v arg_q "%q" "$arg"
+      remote_cmd="${remote_cmd} ${arg_q}"
+    done
+  fi
+  ssh "${SSH_OPTS[@]}" "$SSH_TARGET" "$remote_cmd" < "$PY_SCRIPT"
+else
+  if [ "${#PY_ARGS[@]}" -gt 0 ]; then
+    "$KUBECTL" -n "$NAMESPACE" exec -i "$DEPLOYMENT" -c "$CONTAINER" -- python - "${PY_ARGS[@]}" < "$PY_SCRIPT"
+  else
+    "$KUBECTL" -n "$NAMESPACE" exec -i "$DEPLOYMENT" -c "$CONTAINER" -- python - < "$PY_SCRIPT"
+  fi
+fi
--- a/scripts/ops/awooop_rls_preflight.py
+++ b/scripts/ops/awooop_rls_preflight.py
@@ -0,0 +1,332 @@
+#!/usr/bin/env python3
+"""
+Read-only AwoooP RLS preflight.
+
+This script is designed to run inside the production API pod. It uses the
+pod-local DATABASE_URL and never prints the URL or credentials.
+"""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+import json
+import os
+import sys
+from dataclasses import asdict, dataclass
+from typing import Any
+
+from sqlalchemy import text
+from sqlalchemy.ext.asyncio import create_async_engine
+
+
+TARGET_TABLES = [
+    "incidents",
+    "knowledge_entries",
+    "playbooks",
+    "audit_logs",
+    "budget_ledger",
+    "awooop_projects",
+    "awooop_contracts",
+    "awooop_contract_revisions",
+    "awooop_published_contracts",
+    "awooop_run_state",
+    "awooop_run_event",
+    "awooop_cost_ledger",
+    "awooop_mcp_tool_registry",
+    "awooop_mcp_grants",
+    "awooop_mcp_credential_refs",
+    "awooop_mcp_gateway_audit",
+    "awooop_conversation_event",
+    "awooop_outbound_message",
+]
+
+REQUIRED_ROLES = [
+    "awooop_app",
+    "awooop_platform_admin",
+    "awooop_migration",
+]
+
+
+@dataclass
+class Check:
+    name: str
+    status: str
+    detail: str
+
+
+def add(checks: list[Check], name: str, status: str, detail: str) -> None:
+    checks.append(Check(name=name, status=status, detail=detail))
+
+
+async def scalar(conn: Any, sql: str, params: dict[str, Any] | None = None) -> Any:
+    return await conn.scalar(text(sql), params or {})
+
+
+async def rows(conn: Any, sql: str, params: dict[str, Any] | None = None) -> list[dict[str, Any]]:
+    result = await conn.execute(text(sql), params or {})
+    return [dict(row._mapping) for row in result.fetchall()]
+
+
+async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
+    database_url = os.environ.get("DATABASE_URL")
+    if not database_url:
+        return [Check("database_url", "BLOCKED", "DATABASE_URL is not set in this environment")], {}
+
+    engine = create_async_engine(database_url, pool_pre_ping=True)
+    checks: list[Check] = []
+    evidence: dict[str, Any] = {}
+
+    async with engine.connect() as conn:
+        current_role = await rows(
+            conn,
+            """
+            SELECT
+                current_user AS current_user,
+                session_user AS session_user,
+                r.rolsuper AS current_user_superuser,
+                r.rolbypassrls AS current_user_bypassrls
+            FROM pg_roles r
+            WHERE r.rolname = current_user
+            """,
+        )
+        evidence["current_role"] = current_role[0] if current_role else {}
+        role = evidence["current_role"]
+        if role.get("current_user_superuser") or role.get("current_user_bypassrls"):
+            add(
+                checks,
+                "current_role_rls_enforced",
+                "BLOCKED",
+                f"current_user={role.get('current_user')} can bypass RLS",
+            )
+        else:
+            add(
+                checks,
+                "current_role_rls_enforced",
+                "PASS",
+                f"current_user={role.get('current_user')} is subject to RLS",
+            )
+
+        before = await scalar(conn, "SELECT current_setting('app.project_id', TRUE)")
+        await scalar(conn, "SELECT set_config('app.project_id', :pid, TRUE)", {"pid": "awoooi"})
+        after = await scalar(conn, "SELECT current_setting('app.project_id', TRUE)")
+        evidence["project_context_probe"] = {"before": before, "after": after}
+        if after == "awoooi":
+            add(checks, "project_context_set_config", "PASS", "set_config app.project_id works")
+        else:
+            add(checks, "project_context_set_config", "BLOCKED", f"expected awoooi, got {after!r}")
+
+        roles = await rows(
+            conn,
+            """
+            WITH required_roles(rolname) AS (
+                SELECT jsonb_array_elements_text(CAST(:roles_json AS jsonb))
+            )
+            SELECT
+                rr.rolname,
+                r.rolsuper,
+                r.rolbypassrls,
+                r.oid IS NOT NULL AS exists
+            FROM required_roles rr
+            LEFT JOIN pg_roles r ON r.rolname = rr.rolname
+            ORDER BY rr.rolname
+            """,
+            {"roles_json": json.dumps(REQUIRED_ROLES)},
+        )
+        evidence["required_roles"] = roles
+        present_roles = {row["rolname"] for row in roles if row["exists"]}
+        missing_roles = [role_name for role_name in REQUIRED_ROLES if role_name not in present_roles]
+        if missing_roles:
+            add(checks, "required_roles", "BLOCKED", f"missing roles: {', '.join(missing_roles)}")
+        else:
+            add(checks, "required_roles", "PASS", "all required RLS roles exist")
+
+        table_rows = await rows(
+            conn,
+            """
+            WITH target(relname) AS (
+                SELECT jsonb_array_elements_text(CAST(:tables_json AS jsonb))
+            ),
+            rels AS (
+                SELECT
+                    t.relname,
+                    c.oid,
+                    c.relrowsecurity,
+                    c.relforcerowsecurity,
+                    COALESCE(c.reltuples, 0)::bigint AS estimated_rows
+                FROM target t
+                LEFT JOIN pg_class c
+                    ON c.relname = t.relname
+                   AND c.relkind IN ('r', 'p')
+                   AND c.relnamespace = 'public'::regnamespace
+            ),
+            project_columns AS (
+                SELECT table_name, TRUE AS has_project_id
+                FROM information_schema.columns
+                WHERE table_schema = 'public'
+                  AND column_name = 'project_id'
+                  AND table_name IN (SELECT relname FROM target)
+            ),
+            policy_stats AS (
+                SELECT
+                    p.polrelid,
+                    COUNT(*) AS policy_count,
+                    BOOL_OR(  -- inner % tolerates the text cast pg_get_expr adds after the literal
+                        COALESCE(pg_get_expr(p.polqual, p.polrelid), '') ILIKE '%current_setting(''app.project_id''%, true) IS NULL%'
+                        OR COALESCE(pg_get_expr(p.polwithcheck, p.polrelid), '') ILIKE '%current_setting(''app.project_id''%, true) IS NULL%'
+                    ) AS has_null_fail_open_policy,
+                    BOOL_OR(
+                        COALESCE(pg_get_expr(p.polqual, p.polrelid), '') ILIKE '%current_setting(''app.project_id''%, true) = ''''%'
+                        OR COALESCE(pg_get_expr(p.polwithcheck, p.polrelid), '') ILIKE '%current_setting(''app.project_id''%, true) = ''''%'
+                    ) AS has_empty_string_fail_open_policy
+                FROM pg_policy p
+                GROUP BY p.polrelid
+            )
+            SELECT
+                r.relname AS table_name,
+                r.oid IS NOT NULL AS exists,
+                COALESCE(pc.has_project_id, FALSE) AS has_project_id,
+                COALESCE(r.relrowsecurity, FALSE) AS rls_enabled,
+                COALESCE(r.relforcerowsecurity, FALSE) AS rls_forced,
+                COALESCE(ps.policy_count, 0) AS policy_count,
+                COALESCE(ps.has_null_fail_open_policy, FALSE) AS has_null_fail_open_policy,
+                COALESCE(ps.has_empty_string_fail_open_policy, FALSE) AS has_empty_string_fail_open_policy,
+                r.estimated_rows
+            FROM rels r
+            LEFT JOIN project_columns pc ON pc.table_name = r.relname
+            LEFT JOIN policy_stats ps ON ps.polrelid = r.oid
+            ORDER BY r.relname
+            """,
+            {"tables_json": json.dumps(TARGET_TABLES)},
+        )
+        evidence["tables"] = table_rows
+
+        existing = [row for row in table_rows if row["exists"]]
+        missing_project_id = [row["table_name"] for row in existing if not row["has_project_id"]]
+        if missing_project_id:
+            add(checks, "project_id_columns", "BLOCKED", f"missing project_id: {', '.join(missing_project_id)}")
+        else:
+            add(checks, "project_id_columns", "PASS", "all existing target tables have project_id")
+
+        rls_missing = [
+            row["table_name"]
+            for row in existing
+            if not row["rls_enabled"] or not row["rls_forced"] or row["policy_count"] == 0
+        ]
+        if rls_missing:
+            add(
+                checks,
+                "rls_enabled_forced_policy",
+                "BLOCKED",
+                f"RLS not fully enabled/forced/policied: {', '.join(rls_missing)}",
+            )
+        else:
+            add(checks, "rls_enabled_forced_policy", "PASS", "all existing target tables have forced RLS policy")
+
+        fail_open = [
+            row["table_name"]
+            for row in existing
+            if row["has_null_fail_open_policy"] or row["has_empty_string_fail_open_policy"]
+        ]
+        if fail_open:
+            add(checks, "fail_open_policies", "BLOCKED", f"fail-open policy expressions: {', '.join(fail_open)}")
+        else:
+            add(checks, "fail_open_policies", "PASS", "no fail-open policy expressions detected")
+
+        if exact_counts:
+            exact_rows: list[dict[str, Any]] = []
+            for row in existing:
+                if not row["has_project_id"]:
+                    continue
+                quoted = '"' + row["table_name"].replace('"', '""') + '"'
+                count_row = await rows(
+                    conn,
+                    f"SELECT :table_name AS table_name, COUNT(*) AS total_rows, COUNT(*) FILTER (WHERE project_id IS NULL) AS null_project_id_rows FROM {quoted}",
+                    {"table_name": row["table_name"]},
+                )
+                exact_rows.extend(count_row)
+            evidence["exact_counts"] = exact_rows
+            null_tables = [row["table_name"] for row in exact_rows if int(row["null_project_id_rows"]) > 0]
+            if null_tables:
+                add(checks, "project_id_backfill", "BLOCKED", f"NULL project_id remains: {', '.join(null_tables)}")
+            else:
+                add(checks, "project_id_backfill", "PASS", "no NULL project_id rows in counted tables")
+        else:
+            add(checks, "project_id_backfill", "WARN", "exact counts skipped; rerun with --exact-counts before enabling RLS")
+
+    await engine.dispose()
+    return checks, evidence
+
+
+def print_human(checks: list[Check], evidence: dict[str, Any]) -> None:
+    blocked = sum(1 for check in checks if check.status == "BLOCKED")
+    warn = sum(1 for check in checks if check.status == "WARN")
+    passed = sum(1 for check in checks if check.status == "PASS")
+    print(f"AwoooP RLS preflight: PASS={passed} WARN={warn} BLOCKED={blocked}")
+    for check in checks:
+        print(f"{check.status:<7} {check.name}: {check.detail}")
+
+    role = evidence.get("current_role") or {}
+    if role:
+        print(
+            "role "
+            f"current_user={role.get('current_user')} "
+            f"session_user={role.get('session_user')} "
+            f"superuser={role.get('current_user_superuser')} "
+            f"bypassrls={role.get('current_user_bypassrls')}"
+        )
+
+    for row in evidence.get("tables", []):
+        print(
+            "table "
+            f"{row['table_name']} "
+            f"exists={row['exists']} "
+            f"project_id={row['has_project_id']} "
+            f"rls={row['rls_enabled']} "
+            f"force={row['rls_forced']} "
+            f"policies={row['policy_count']} "
+            f"fail_open_null={row['has_null_fail_open_policy']} "
+            f"fail_open_empty={row['has_empty_string_fail_open_policy']} "
+            f"estimated_rows={row['estimated_rows']}"
+        )
+
+    for row in evidence.get("exact_counts", []):
+        print(
+            "count "
+            f"{row['table_name']} "
+            f"total_rows={row['total_rows']} "
+            f"null_project_id_rows={row['null_project_id_rows']}"
+        )
+
+
+async def main() -> int:
+    parser = argparse.ArgumentParser(description="Run read-only AwoooP RLS preflight checks.")
+    parser.add_argument("--exact-counts", action="store_true", help="Run exact COUNT(*) checks for project_id backfill.")
+    parser.add_argument("--json", action="store_true", help="Print JSON instead of human-readable output.")
+    args = parser.parse_args()
+
+    checks, evidence = await collect(exact_counts=args.exact_counts)
+    blocked = any(check.status == "BLOCKED" for check in checks)
+
+    if args.json:
+        print(
+            json.dumps(
+                {"checks": [asdict(check) for check in checks], "evidence": evidence},
+                ensure_ascii=False,
+                default=str,
+            )
+        )
+    else:
+        print_human(checks, evidence)
+
+    return 2 if blocked else 0
+
+
+if __name__ == "__main__":
+    try:
+        raise SystemExit(asyncio.run(main()))
+    except KeyboardInterrupt:
+        raise SystemExit(130)
+    except Exception as exc:
+        print(f"BLOCKED preflight_exception: {exc}", file=sys.stderr)
+        raise SystemExit(2)